[libvirt] Valid characters in domain names?

Richard W.M. Jones rjones at redhat.com
Mon Oct 4 10:35:47 UTC 2010


On Mon, Oct 04, 2010 at 08:38:57AM +0200, Daniel Veillard wrote:
> On Sun, Oct 03, 2010 at 11:51:12PM +1100, Justin Clift wrote:
> > On 10/03/2010 08:33 PM, Richard W.M. Jones wrote:
> > <snip>
> > >Indeed.  I'm sure we need a whitelist, not a blacklist as suggested by
> > >the other comment.  All domains I'd ever want to create would match
> > >the regexp
> > >
> > >^[[:alpha:]][-_[:alnum:]]*$
> > >
> > >This might break existing users however.
> > 
> > Wonder if there are characters supported by some hypervisors, but not
> > others?
> 
>   I remember we had trouble with Xen, a long time ago, yes.
> So unfortunately this is really hypervisor specific... Maybe we could
> have a generic checking routine but only provide a warning when
> the name isn't a simple name the XML way. One of the problems with
> checking is that most of the hypervisor APIs don't say a word about
> encoding, so you're not manipulating characters but 0-terminated byte
> strings. From there even your simple regexp goes haywire, because
> deciding what is an alphanumeric character requires character analysis,
> and you need the encoding for this. At least at the libvirt API things
> are rather clear: in XML data there is no ambiguity possible, and
> outside we expect strings to be UTF-8.
>   Actually I think that for ESX since all exchanges with the hypervisor
> are XML based there isn't that ambiguity about encoding at least.
> 
> >   ie maybe Xen supports '/', '*', '+' in guest names, but ESX doesn't
> > 
> > That could lead to some interesting guest import problems. :(
> 
>   It goes beyond that: someone using any non-ASCII name will hit
> hypervisor-specific behaviour, ISO-Latin, Asian languages ... and we
> have no control over this except for some checking and the possibility
> of a warning.

I think any reasonable analysis of this should start with where the
names come from:

- virDomainDefineXML (e.g. virsh define, virt-install, V2V import etc.)

- a list of existing domains from a hypervisor API
  (e.g. /etc/xen files, Xen hypercall, ESX XMLRPC call)

- already defined in an older version of libvirt which didn't do checking

- [any others?]

For the virDomainDefineXML route, we (a) know the names are UTF-8, and
(b) know that these domains are being created for the first time.  So
I think for this route we should add a regexp-like restriction.  (Note
that when I wrote [:alnum:] before, that ought to cover all Unicode
characters in the alphanumeric classes, so it doesn't exclude non-US
characters.)
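
To make the proposed restriction concrete, here is a rough sketch of a
check in the spirit of ^[[:alpha:]][-_[:alnum:]]*$, applied to a name
that has already been decoded from UTF-8.  It is written in Python only
for illustration; the function name is made up and nothing here is
existing libvirt code.  It also assumes that Python's str.isalpha() and
str.isalnum() are an acceptable stand-in for the Unicode-aware
[:alpha:] and [:alnum:] classes, which is close but not necessarily
identical to what a C regex library would match.

```python
def is_simple_domain_name(name: str) -> bool:
    """Approximate ^[[:alpha:]][-_[:alnum:]]*$ on a decoded string.

    Hypothetical helper for illustration only; not part of libvirt.
    """
    # Reject empty names and names that do not start with a letter.
    if not name or not name[0].isalpha():
        return False
    # Every remaining character must be alphanumeric (in the Unicode
    # sense used by Python), a hyphen, or an underscore.
    return all(ch.isalnum() or ch in "-_" for ch in name[1:])


if __name__ == "__main__":
    for candidate in ["fedora-13_x86", "café", "1guest", "a/b", ""]:
        print(repr(candidate), is_simple_domain_name(candidate))
```

A check like this would only apply on the define path; names that
arrive from a hypervisor's own list of domains would not go through it.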

There are further points that may need to be fixed within the drivers.
The drivers are probably just passing the UTF-8 strings through to
everything, but may need to do conversion.  E.g. if I've learned
anything about Microsoft developers, then a hypothetical Hyper-V
driver would almost certainly need to convert between UTF-8 and
UTF-16LE.
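
As an illustration of the kind of conversion a driver might have to do
at its boundary, here is a minimal sketch, again in Python.  Both
function names are hypothetical and this is not taken from any existing
driver; the only assumption is that the driver receives the name as
UTF-8 bytes and the hypervisor side wants UTF-16LE without a byte order
mark.

```python
def utf8_name_to_utf16le(raw: bytes) -> bytes:
    """Convert a UTF-8 encoded name to UTF-16LE (hypothetical helper)."""
    # Strict decoding: invalid UTF-8 raises UnicodeDecodeError instead
    # of being passed through as an opaque byte string.
    text = raw.decode("utf-8")
    # The "utf-16-le" codec writes little-endian code units, no BOM.
    return text.encode("utf-16-le")


def utf16le_name_to_utf8(raw: bytes) -> bytes:
    """Convert a UTF-16LE encoded name back to UTF-8 (hypothetical helper)."""
    return raw.decode("utf-16-le").encode("utf-8")


if __name__ == "__main__":
    original = "gäst-01".encode("utf-8")
    converted = utf8_name_to_utf16le(original)
    # The round trip should give back the bytes we started with.
    print(utf16le_name_to_utf8(converted) == original)
```

Decoding strictly at the boundary also gives one place to catch names
that are not valid UTF-8 in the first place.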

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into Xen guests.
http://et.redhat.com/~rjones/virt-p2v



