[libvirt] Redesigning Libvirt: Adopting use of a safe language

Martin Kletzander mkletzan at redhat.com
Mon Nov 20 16:36:24 UTC 2017


On Mon, Nov 20, 2017 at 03:25:33PM +0000, Daniel P. Berrange wrote:
>On Mon, Nov 20, 2017 at 12:24:22AM +0100, Martin Kletzander wrote:
>> On Tue, Nov 14, 2017 at 05:27:01PM +0000, Daniel P. Berrange wrote:
>>
>> [...]
>>
>> > I don't have direct experience in Rust, but it has the same kind of benefits over
>> > C as Go does, again without the downsides of languages like Python or Java. There
>> > are some interesting unique features to Rust that can be important to some apps.
>> > In particular it does not use garbage collection, instead the user must still do
>> > manual memory management as you would with C/C++. This allows Rust to be used in
>> > performance critical cases where it is unacceptable to have a garbage collector
>> > run. Despite a requirement for manual allocation/deallocation, Rust still
>> > provides a safe memory model. This approach of avoiding abstractions which will
>> > introduce performance overhead is a theme of Rust. The cost of such an approach
>> > is that development has a higher learning curve and ongoing cost in Rust, as
>> > compared to Go.
>> >
>> > I don't believe that the unique features of Rust, over Go, are important to the
>> > needs of libvirt. eg while for QEMU it would be critical to not have a GC
>> > doing asynchronous memory deallocation, this is not at all important to libvirt.
>> > In fact precisely the opposite, libvirt would benefit much more from having GC
>> > take care of deallocation, letting developers focus attention on other areas. In
>> > general, aside from having a memory safe language, what libvirt would most benefit
>> > from is productivity gains & ease of contribution. This is the core competency
>> > of Go, and why it is the right choice for usage in libvirt.
>> >
>
>> So the first thing I disagreed on is that in Rust you do manual
>> allocations.  In fact, you don't.  Or, depending on the point of view,
>> you do less or the same amount of manual allocation than in Go.  What is
>> the clear win for Rust is the concept of ownership, and it's related to
>> the allocation mentioned before.
>
>I shouldn't have used the word "allocation" in my paragraph above. As
>you say, both languages have similar needs around allocation. The difference
>I meant is around deallocation policy - in Rust, object lifetime is a more
>explicit decision under control of the programmer, as opposed to Go's garbage
>collection.  From what I've read Rust's approach to deallocation is much
>closer to the C++ concept of "smart pointers", eg this
>
>  http://pcwalton.github.io/blog/2013/03/18/an-overview-of-memory-management-in-rust/
>

This is kind of old; that code wouldn't run with newer Rust.  I guess
that is from long ago when it was not stabilized at all.  It is a bit
smarter now.  The fact that you have control over when the value is
getting freed is true, however you rarely have to think about that.
What's more important is that the compiler prevents you from accessing
a value from multiple places or not knowing who "owns" (think of it as
"who should take care of freeing it") the variable.  If you give the
ownership to someone you can't access it.  The difference I see is that
if you access it after some other part of the code is responsible for
that variable, in Rust the compiler will cut you off unless you clearly
specify how the memory space related to the variable is supposed to be
handled.  In Go it will just work (with a potential bug) but it will not
crash because the GC will not clean it up while someone can still access it.
Granted, this is usually a problem with concurrent threads/coroutines (and
I don't know how Go handles concurrent access there).  Also, for
example, Rust doesn't allow you to have a value accessible from multiple
threads unless it is guarded by a thread-safe reference counter and does
not allow you to modify it unless it is guarded by a Mutex, an RwLock or
one of the Atomic types.  Again, I don't know Go that much, I have
yet to delve into the deep unknowns of it, but I haven't heard about it
providing such safety.
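
To make the two points above a bit more concrete, here is a minimal
sketch.  The helper names consume() and count_in_threads() are made up
for illustration (only the std types are real), and this is just one way
to show it, not anything taken from libvirt.  The first helper takes
ownership of a value, the second shares a counter between threads via
Arc (the thread-safe reference counter) and Mutex:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: takes ownership of the vector and returns its length.
fn consume(names: Vec<String>) -> usize {
    names.len()
    // `names` is freed here, when its new owner goes out of scope.
}

// Hypothetical helper: several threads increment one shared counter.
fn count_in_threads(workers: usize) -> usize {
    // Arc allows shared ownership across threads, Mutex guards modification.
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    let domains = vec![String::from("guest1"), String::from("guest2")];
    let n = consume(domains); // ownership moves into consume()
    // println!("{:?}", domains); // rejected by the compiler: use of moved value
    println!("{} names, {} increments", n, count_in_threads(4));
}
```

Uncommenting the println!() of `domains` makes the build fail instead of
leaving a use-after-free to be found at runtime, and dropping the Mutex
from the counter would be rejected in the same way.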

>> I am standing strongly behind the opinion that the learning curve of
>> Rust is definitely worth it.  And coming from the C world, it is easy to
>> understand.  To me, it is very easy to explain that concept to great
>> detail to someone who has background in libvirt.  And the big benefit
>> (and still a huge opportunity for improvement WRT optimizations) is that
>> the compiler must know about it and so it is resolved compile-time.
>> Deallocation and destructors are run at the end of their scope,
>> automatically.  You can nicely see that when realizing that Rust doesn't
>> need any `defer` as Go has.
>
>Nb the 'defer' concept isn't really about memory management per-se, rather
>it is focused on cleanup of related resources - eg deciding when to close
>an open file handle, or when to release another resource. Everything that's
>just Go object memory is handled by the GC.
>
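
Fair point.  In Rust that kind of resource cleanup goes through the same
scope-based mechanism, the Drop trait.  A minimal sketch, assuming a
made-up Handle type that stands in for something like an open file
handle and records its cleanup in a shared log (both are hypothetical,
purely for illustration):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-in for a resource such as an open file handle.
struct Handle {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Handle {
    // Called automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("close {}", self.name));
    }
}

// Returns the recorded events in the order they happened.
fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Handle { name: "a", log: Rc::clone(&log) };
        let _b = Handle { name: "b", log: Rc::clone(&log) };
        log.borrow_mut().push(String::from("work"));
        // End of scope: _b is dropped first, then _a (reverse declaration order).
    }
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run() {
        println!("{}", event);
    }
}
```

There is no explicit cleanup call on any exit path, which is roughly
what `defer` gives you in Go, except that here it is tied to the value
rather than to the function.
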
>> One of the things is that the kinks we have can be ironed out in C as
>> well.  It might be easier in other languages, but it is harder when you
>> have to switch to one.  We have a bunch of code dealing with backwards
>> compatibility.  And I argue that this is something that causes issues on
>> its own.  What's even worse, IMHO, is that we are so much feature-driven
>> that there is no time for any ironing.  I see too much potential for
>> refactoring in various parts of libvirt that will never see the light
>> of day because we need X to be implemented.  And contributors sending
>> feature requests that they fail to maintain later don't help much with
>> that.  Maybe we could fix this by saying the next Y releases will just
>> be bugfix releases.  Maybe we could help bring in new contributors by
>> devoting some of our time to do an actual change that will make them
>> want to help us more.  I know some of you will be sick and tired hearing
>> about Rust once more, but have you heard about how much their community
>> is inclusion-oriented?  I guess what I'm trying to say is that there are
>> other (and maybe less disruptive) ways to handle the current problems we
>> are facing.
>
>I'm not going to debate that there's plenty of problems we could be
>tackling, and changing language is not a magic bullet for all of
>them. My primary motivation is to start to get out of the world where
>we have random crashes & security bugs due to overflowing buffers,
>double frees, use after free, and so on. Problems that have been
>solved by every language invented since C in the last 30-40 years.
>
>On the choice of language, I will say C is a turn off to many people
>as it is (not unreasonably) viewed as archaic, hard to learn and
>difficult to write good code in.
>

I would say this highly depends on what area you are coming from.  In
the cloud world it would be viewed way differently than in the
dark^Wlow-level side of things.  I don't want to compare libvirt to the
kernel for example, but I haven't heard about C being a turn-off there.

>When I worked in OpenStack it was a constant battle to get people to
>consider enhancements to libvirt instead of reinventing it in Python.
>It was a hard sell because most Python devs just didn't want to use C
>at all because it has a high learning curve for contributors, even if libvirt
>as a community is welcoming. As a result OpenStack pretty much reinvented

I'm sorry, but I think only a handful of us (yeah, I think and hope I
could count myself amongst that group) are welcoming.  But it's
actually where I see one of the big turn-offs.

>its own hypervisor agnostic API for esx, hyperv, xenapi and KVM instead
>of enhancing libvirt's support for esx, hyperv or xenapi. They only end
>up using libvirt for KVM and libxl really. I hear similar comments from
>people working in virt related projects in Go. So use of C really does
>have an impact on our pool of potential contributors. This is disappointing

It's hard to guess how that would turn out if C wasn't a turn-off for
them.  Maybe we would get a bunch of patchsets submitted that would be in
better shape language-wise, but would that help with understanding
various internal structures and behaviours of libvirt?  Not considering
the fact that some language would make it more readable.  Or would we
just get a bunch more drive-by patches that we would have to fix/maintain?
I don't think we can answer that.

>as there are huge numbers of people working on virt related projects, they
>just don't ever go near C - virt-viewer & virsh are probably the only
>"apps" using libvirt from C - all the others use a higher level language
>via one of our bindings.
>
>
>> And then there are the "issues" with Go (and unfortunately some with
>> Rust as well :'( ).
>
>Yep, no choice is ever perfect.
>
>> A lot of the code for libraries is written with permissive licences, but
>> if there is some that is LGPL-incompatible we can't use them.  And in
>> ecosystems such as Rust and Go there are fewer alternatives, so we might
>> not find one that we'll be able to use.  If that happens, there goes
>> a bunch of our time.  Like nothing.
>
>I've not looked at the Rust ecosystem, but from what I've seen in Go most
>devs tend to go for even more permissive licensing, ie BSD/Apache/MIT.
>Disappointingly few people pick GPL variants :-(  So aside from the
>complication with our virtualbox code being GPLv2-only, I don't think
>there's a license problem to worry about with Go, any more than we
>have to worry about with C today.
>

OK, and ...

>> How do we deal with problems/bugs in dependency libraries?  I know, all
>> the projects are pretty new, so they might be nicer to contributors.  If
>> they are not, we might need to fork or rewrite the code.  Bam, another
>> chance of losing workforce.
>
>Many C libs we depend on have been around a long time so are more
>mature, but are also more conservative in accepting changes,
>especially if they touch API. On balance I don't think there would
>be a big difference either way in this area.
>

... OK, I hope so and if you say so I will blindly believe that ;)

>
>> How does Go handle updates in dependency libs?  Does it automatically
>> pull newest version from public repositories where some unknown person
>> can push whatever they want?  Or can they be hash- or version-bound?
>
>Originally it was quite informal (and awful) - your 3rd party deps
>just had to be present in $GOPATH, and there was no tracking of
>versions. No significant sized project works this way anymore
>because it is insane.
>
>Instead Go introduced the concept they call "vendoring" where
>you have a top level dir called vendor/ where all your deps live.
>Your app provides a metadata file (in JSON typically) where you
>list the deps and the preferred versions you need. The tool then
>populates vendor/ with the right code to build against. Think
>of vendor/ as being sort of like GIT submodules, but not using
>GIT submodules, and you'll be thinking along the right lines.
>Go libraries are being strongly encouraged to adopt semver for
>their versioning.
>

Oh, good, I was hoping for this.

>> The build process is that all the binaries are static, right?  Or all
>> the go code is static and it only has dynamic dependencies on C
>> libraries? In such a project as libvirt, wouldn't that mean that the
>> processes we run will be pretty heavy-weight?
>
>Yes, all the Go code is statically linked - only C libs are dyn
>loaded. This doesn't have any impact on runtime, because even if
>the binary was 100's of MB in size, the kernel is only ever going
>to page in sections of that file which are actually executed.
>

But the code that is used by each binary will be present as many times
as that binary is running since it is not dynamically loaded, right?

>The main impact is that if a dependency gets an update (eg for
>a security fix) all downstream apps need rebuilding.
>

Well, that sucks, but it's not a deal-breaker.

>> How is it with rebuilding after a small change?  I know Go is good when
>> it comes to compilation times.  That might be something that people
>> might like a lot.  Especially those who are trying to shave off every
>> second of compilation.  However if you cannot use ccache and you always
>> need to rebuild everything, it might increase the build-time quite a
>> lot, even though indirectly.
>
>Compile times are great - the compiler is very fast, and it does
>caching on a per-package basis. NB in Go a "package" is an individual
>directory in your source tree, so a typical app would have 10's
>or 100's of packages, each corresponding to a separate subdir.
>Source dirs are probably more fine grained than we use in libvirt.
>eg what's in src/qemu in libvirt currently would likely end up
>being spread across as many as 5 packages if it were idiomatic
>Go.
>
>The biggest win though comes from not needing autoconf, automake.
>Of course libvirt wouldn't see that benefit as long as any of our
>code were still C, so I won't claim that's a win in this particular
>case.
>
>> You can't have /tmp mounted with noexec option unless you have TMPDIR
>> set to some other directory.  And sometimes not even in that case.  I
>> guess it's a non-issue for some distros, but a bunch of people deal with that and
>> it seems like something that could be taken care of in Go itself and it
>> is just not.
>
>I've not heard of that one before, and so obviously not hit it.
>From Google it seems this applies if you use 'go run' or 'go test'
>commands. The former is not something you typically use, but the
>latter is. It can be avoided by having the make/shell run that
>invokes 'go test' set a local TMPDIR - no need to set it globally
>in your bash profile. I'm not sure what distros have /tmp with
>noexec - Fedora doesn't at least.
>

Usually if you don't have SELinux and want to be guarded a tiny bit
more.  Like me.  Maybe I'll change that someday.  I managed to fix that
by having entry and exit scripts that mangle the TMPDIR env var for me.
And it works in some cases.

>> You can't just clone a repo, cd into it and build it.  You have to get
>> the dependencies in a special manner, for that you have to have GOPATH
>> set and based on that you have to have your directories setup similarly
>> to what Go expects, then you need GOBIN if you want to build something
>> and other stuff that's just not nice and doesn't make much sense.  At
>> least for newcomers.  Simply the fact that it seems to me like Go is
>> trying to go against the philosophy of "Do one thing and do it well".  I
>> know everyone is about having the build described in the same language
>> as the project, but what comes out of it is not something I prefer.
>
>This is related to the thing I mention above where historically
>everything was just splattered into $GOPATH. Most apps have gone
>towards the vendoring concept where every dep is self-contained
>in your local checkout.
>

I have to read up on the basics on how to do a proper first-time setup
for Go.  And maybe everything will be sunshine and rainbows from that
point forward.  It's just that I don't like the fact that it's
non-intuitive.  Is it possible somehow to just build some code without
actually dealing with a bunch of dependencies and all the setup?  To give
an example, is there a way to differentiate the compiler from the
dependency handling and build processes?  Like gcc and
autoconf/automake/make or rustc and cargo?

>> I'm not against using another language to make some stuff better.  I
>> guess it is kind of visible from the mail that I like Rust, but I'm not
>> against other languages as well.  I just want this to be a full-on
>> discussion and I want my opinion to be expressed.  Thanks for
>> _listening_ ;)
>
>Either Rust or Go would be a step forward over staying exclusively
>with C IMHO, so at least we agree that there are potential benefits
>to either :-)
>

Yeah.  Good luck with evaluating the responses.  I hope we won't need to
change our Code of Conduct or even resort to voting...

>Regards,
>Daniel
>-- 
>|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
>|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
>|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|