[Libguestfs] translating CD-ROM device paths from i440fx to Q35 in virt-v2v

Laszlo Ersek lersek at redhat.com
Fri Oct 1 11:29:39 UTC 2021


On 10/01/21 13:03, Richard W.M. Jones wrote:
> On Fri, Oct 01, 2021 at 12:35:50PM +0200, Laszlo Ersek wrote:
>> Hi Rich,
>>
>> (dropping libvirt-devel)
>>
>> On 09/30/21 13:53, Richard W.M. Jones wrote:
>>
>>> Also we don't currently try to find or rewrite /dev/disk/ paths in
>>> guest configuration files.  The only rewriting that happens is for
>>> /dev/[hs]d* block device filenames and a few others.  The actual code
>>> that does this is convert/convert_linux.ml:remap_block_devices
>>>
>>> So I wouldn't over-think this.  It's likely fine to identify such
>>> devices and rewrite them as "/dev/cdrom", assuming that (I didn't
>>> check) udev creates that symlink for any reasonably modern Linux.  And
>>> if there's more than one attached CD to the source, only convert the
>>> first one and warn about but drop the others.
>>
>> So I've started reading "convert/convert_linux.ml" in parallel with the
>> OCaml manual. (I dabbled for a few weeks in Haskell when everyone else
>> did a few years ago, so it's not 100% unfamiliar.)
>>
>> Question:
>>
>>   let family =
>>     match inspect.i_distro with
>>     | "fedora"
>>     | "rhel" | "centos" | "scientificlinux" | "redhat-based"
>>     | "oraclelinux" -> `RHEL_family
>>     | "altlinux" -> `ALT_family
>>     | "sles" | "suse-based" | "opensuse" -> `SUSE_family
>>     | "debian" | "ubuntu" | "linuxmint" | "kalilinux" -> `Debian_family
>>
>> Here, RHEL_family, ALT_family, SUSE_family, Debian_family are not plain
>> constructors ("fixed" variants) [1] but polymorphic ones [2].
>>
>> Why?
>>
>> Was this done *only* in order so that an explicit type definition such
>> as
>>
>>   type os_family = RHEL_family | ALT_family | SUSE_family | Debian_family
>>
>> could be avoided?
> 
> There are two types of variants in OCaml.  The first is the normal one
> and it requires a type definition.  eg:
> 
>   type t = A | B | C of int | D of string * int
> 
> with values like “A” or “C 3” or “D ("hello", 1)”.
> 
> These are ones we'd normally use and have the (arguable?) advantage
> that you have to write (and therefore, document) the full type somewhere.
> 
> The second type are polymorphic variants.  These are actually way more
> complicated than I'm going to describe below because they have a load
> of set theory behind them (if you really want to know see the docs
> that you've already linked to).  They are written with backtick before
> the name of the variant.  The brief idea is that you let the OCaml
> compiler work out the type for you.
> 
> For example:
> 
>   let f = function 1 -> `A | 2 -> `B | i -> `C i
> 
> creates a function f which returns the type:
> 
>   type pv = [ `A | `B | `C of int ]
> 
> Actually if you type this into the toplevel it says:
> 
>   $ rlwrap ocaml
>         OCaml version 4.12.0
> 
>   # let f = function 1 -> `A | 2 -> `B | i -> `C i  ;;
>   val f : int -> [> `A | `B | `C of int ] = <fun>
> 
> As you observe below there's an extra ">" character.  That means that
> the type is "at least" that one, and there might be more cases, but
> the function that I wrote only contains the 3 cases.
> 
> Anyway the short of this is that these open-ended, automatically
> resolved types don't require a type definition so they're less
> documented but more flexible.
> 
> (And another thing: It's possible to reuse the constructors in
> unrelated types.  eg. `A might be part of another type.  But don't do
> this unless it's for a very good reason).
> 
> In all cases the OCaml compiler enforces type safety, so although
> these types are less explicit, they're still safe.  If I extend my
> example above and forget about one of the cases, the compiler will
> warn (in the v2v code, we've set up the flags to give a hard error):
> 
>   # let g j = match f j with `A -> 1 | `B -> 2 ;;
>   Warning 8 [partial-match]: this pattern-matching is not exhaustive.
>   Here is an example of a case that is not matched:
>   `C _
>   val g : int -> int = <fun>
> 
>> Based on my (incomplete understanding) of the ocaml docs, this looks
>> like a bad idea. We need no polymorphism here, and using a static type
>> (a fixed variant) is generally beneficial [3].
>>
>> The following (minimally modified) definition at the OCaml REPL:
>>
>>   let family1 distro =
>>     match distro with
>>     | "fedora"
>>     | "rhel" | "centos" | "scientificlinux" | "redhat-based"
>>     | "oraclelinux" -> `RHEL_family
>>     | "altlinux" -> `ALT_family
>>     | "sles" | "suse-based" | "opensuse" -> `SUSE_family
>>     | "debian" | "ubuntu" | "linuxmint" | "kalilinux" -> `Debian_family
>>     | _ -> assert false;;
>>
>> deduces the following type:
>>
>>   string -> [> `ALT_family | `Debian_family | `RHEL_family | `SUSE_family ] = <fun>
>>
>> with the nasty "[>" mark at the start (allowing for further type
>> refinement [2], which I think we don't need here). Conversely.
>>
>>   type os_family = RHEL_family | ALT_family | SUSE_family | Debian_family;;
>>
>>   let family2 distro =
>>     match distro with
>>     | "fedora"
>>     | "rhel" | "centos" | "scientificlinux" | "redhat-based"
>>     | "oraclelinux" -> RHEL_family
>>     | "altlinux" -> ALT_family
>>     | "sles" | "suse-based" | "opensuse" -> SUSE_family
>>     | "debian" | "ubuntu" | "linuxmint" | "kalilinux" -> Debian_family
>>     | _ -> assert false;;
>>
>> (note the explicit os_family type definition, and the removal of the
>> backticks from the constructor names) comes back with the type:
>>
>>   val family2 : string -> os_family = <fun>
> 
> Yes, this is all fair comment.
> 
>> Basically converting a string to an enum constant.
>>
>> So, what's the reason for the polymorphic variant?
> 
> I guess you could look at git history, but most likely it was "just
> because".  It's a bit quicker and adding a new cases involves less
> code churn.
> 
> I'm not opposed if you want to make the change above if you think it
> improves readability.

Thanks for the help! I don't want to leave a trail of such patches
marking my learning of OCaml :) Just wanted to understand the reason.

Laszlo

> In general I try to avoid tricksy features of
> OCaml, like functors, and this could (arguably) be one of those cases.
> 
> BTW as an aside the implementation of the two types of constructors is
> completely different, and polymorphic variants have a slower
> implementation.  For ordinary constructors they generally map to
> integers, which makes them very efficient (eg. they can be stored in
> registers etc).  Polymorphic variants use a hash of the name instead
> making them slower overall.
> 
> (see table in https://rwmj.wordpress.com/2009/08/05/ocaml-internals-part-2-strings-and-other-types/)
> 
> But a fun thing is you can cause hash collisions (although it is
> picked up by the compiler - it's still type safe!):
> 
> # type collision = [`Eric_Cooper | `azdwbie];;
> Error: Variant tags `azdwbie and `Eric_Cooper have the same hash value.
>        Change one of them.
> 
> (https://groups.google.com/g/fa.caml/c/68oTntkRLyU)
> 
> There's even a link-time pass to find hash collisions across the whole
> program.
> 
> Rich.
> 
>> [1] https://ocaml.org/manual/coreexamples.html#s%3Atut-recvariants
>> [2] https://ocaml.org/manual/polyvariant.html#sec48
>> [3] https://ocaml.org/manual/polyvariant.html#s%3Apolyvariant-weaknesses
>>
>> Thanks!
>> Laszlo
> 




More information about the Libguestfs mailing list