Format/style of UI message

Wed Jul 22 13:48:57 UTC 2020

On a Friday in 2020, Daniel P. Berrangé wrote:
>On Fri, Jul 17, 2020 at 05:01:47PM +0200, Pino Toscano wrote:
>> Hi,
>>
>> I recently took a look at the UI/user visible messages from libvirt,
>> which are translated using gettext. They are extracted in a single
>> libvirt.pot catalog, which includes messages from libvirt.so itself
>> (mostly, if not all, errors), the separate daemons, the helper tools,
>> and from virsh.
>>
>> I noticed there is plenty of room for improvement: what strikes me is
>> the lack of consistency among the messages. Let me state first: I
>> understand that not everyone is a native English speaker
>> (I am not), so I'm not picking on anyone.
>
>Yes, the lack of consistency is pretty bad and makes more work for
>our translators.
>

Also, I'm sure a portion of our translatable strings are in unreachable
error paths (i.e. we are looking up some data that we just successfully
put there a few lines above) and, by Murphy's law, there are code paths
missing an error completely or having an undescriptive message.
Hopefully aborting on OOM will help us erase more messages.

>> Some examples:
>>
>> a) different capitalization:
>> - "cannot open %s"
>> - "Cannot open %s"
>>

I vote for the capitalized version, see below.

>> b) different quoting for files/identifiers/etc:
>> - "Cannot open %s"
>> - "Cannot open '%s'"
>>

Yes, sometimes the error is worded in a way that prevents this,
e.g.
    current vcpus count must be an integer
for
    <vcpu current='x'>

We could even pass the hardcoded identifiers via %s, e.g.
   _("Invalid value of '%s': %s"), "cpuset", tmp
instead of:
   _("Invalid value of 'cpuset': %s"), tmp
to prevent the identifier from being translated.
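
To find candidates for that change, a rough check could flag messages
that embed a literal quoted identifier. This is only a sketch, assuming
the msgids are already available as a list of strings;
find_hardcoded_identifiers is a hypothetical helper, not something that
exists in libvirt:

```python
import re

# A single-quoted bare word such as 'cpuset'. The placeholder '%s' does
# not match because '%' is not a word character.
QUOTED_IDENTIFIER_RE = re.compile(r"'([A-Za-z_][A-Za-z0-9_]*)'")


def find_hardcoded_identifiers(msgid):
    """Return the bare identifiers that appear quoted inside msgid."""
    return QUOTED_IDENTIFIER_RE.findall(msgid)


if __name__ == "__main__":
    for msgid in ("Invalid value of 'cpuset': %s",
                  "Invalid value of '%s': %s"):
        print(msgid, "->", find_hardcoded_identifiers(msgid))
```

Every hit would still need a human look, since some quoted words are
ordinary English and not identifiers.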

>> c) different verbs for failed actions:
>> - "Cannot frobnicate ..."
>> - "Could not frobnicate ..."
>> - "Did not frobnicate ..."
>> - "Failed to frobnicate ..."

"Failed to" seems most factual here

>> - "Unable to frobnicate ..."
>> depending on the message, also "frobbing failed"

Frobbing failed takes one extra character compared to that.

>>
>> d) sometimes contractions ("couldn't", "don't", etc), sometimes not
>> ("could not", "do not", etc)
>>
>> e) what QEMU/etc supports:
>> - "... by this QEMU binary"
>> - "... for this QEMU binary"
>> - "... in this QEMU binary"
>> - "... with this QEMU binary"
>> - "... by this QEMU"
>> - "... for this QEMU"
>> - "... with this QEMU"
>> - "... with this binary" [in a QEMU file]
>> - "... [supported] by qemu"

There are possibly subtle nuances there:
   "by this QEMU binary" -> the particular QEMU does not support it at all - it was not
      implemented yet or it was compiled out
   "with this QEMU binary" -> it might but libvirt does not bother to do the legacy part
   "by qemu" -> not in QEMU at the moment of writing this error message
   "for this QEMU binary" just sounds wrong to me, maybe a native speaker
   can correct me on that?

   (but I bet most of the uses did not care about those and just copied
   and pasted it from somewhere)

Also, does 'QEMU binary' vs. 'QEMU' bring any extra clarity?

>> there is also "qemu does not support ...", which I think can stay

Most of these are guarded by QEMU_CAPS so they fall into one of the
first two categories above. I think I found only 'accel2d' that was
never intended to be supported by QEMU.

>> for now; also both "available [by/for/etc]" and "supported [by/for/etc]"
>> are used

That should be 'supported for <functionality>', not 'supported for
QEMU'.

>>
>> I can give it a try in fixing the messages to be more consistent all
>> around; before I start the mass editing, I need to know which style to
>> follow:

If you put the style in writing first, other people might help too.

>>
>> a) it seems like the virError fields @message, @str1, @str2 and @str3
>> are joined together in reporting/log strings like "error: <text>";
>> hence, should they be not capitalized? It may look OK in English, but
>> less nice and hard to fix in translations.
>> Obviously, sentences as shown in tools (e.g. virsh) definitely need to
>> be properly capitalized.
>
>I think there is no correct answer here, because even with the error
>messages, the <text> is not always used in the "error: <text>" scenario.
>eg an application like virt-manager will merely display "<text>" in a
>dialog box.
>
>On the one hand I'd suggest lowercase text for error messages, but if
>the message is multiple sentences that would involve a capital. We probably
>don't have many of the latter though, so standardizing on lowercase is
>likely fine.

Starting with a lowercase letter feels more UNIX-like and helps if the
message starts with a lowercase identifier, but if some apps use the
text on their own, starting with uppercase would be more consistent.

>
>> b) should identifiers such as filenames, paths, XML tags, JSON fields,
>> etc be always quoted?
>
>Generally user data that may go missing should be quoted because it makes
>it more obvious when there is an accidentally empty string provided. I've
>gone back to add quotes every time I've debugged a problem where the empty
>string was involved.  To make it easier as a policy, it is fine to expand
>that to all filenames/path, regardless of whether they come from the user
>data or not. For XML / JSON field names, if it is just a bare word, then
>I'd probably suggest quoting too, as some field names could accidentally
>lead to grammatically correct but misleading error messages if unquoted.
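
The empty-string case is easy to demonstrate. A tiny illustration in
Python (the two format_* helpers are made up for this example and only
mimic the printf-style formatting the C code does):

```python
def format_unquoted(path):
    """Format the message without quotes around the user data."""
    return "Cannot open %s" % (path,)


def format_quoted(path):
    """Format the message with the user data in single quotes."""
    return "Cannot open '%s'" % (path,)


if __name__ == "__main__":
    empty = ""
    # Without quotes the missing value is invisible in the output.
    print(repr(format_unquoted(empty)))
    # With quotes the empty value is still visible as ''.
    print(repr(format_quoted(empty)))
```

With an empty path the unquoted message ends in a trailing space that
nobody will notice in a log, while the quoted one still shows ''.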
>
>
>> c) which verb to use when something failed? "could not" is a subjective
>> thing, not a past action; "failed" seems to imply that something was
>> attempted; "did not" seems to imply that it was not done, but says nothing
>> about whether it was attempted; the rest sort of indicate the ability to do
>> something.
>

This one seems like a more complicated question than the others and should
not keep us from e.g. quoting the identifiers first.

Jano

>I don't especially care which we use, as long as we're pretty
>consistent. Perhaps the thing to do is just see which is the most
>popular usage today, so we invalidate the fewest translations
>when changing.
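
Such a popularity count could be as simple as tallying how each message
starts. A minimal sketch, assuming the msgids are already in a list; the
prefix list is taken from the examples in this thread and
count_failure_prefixes is a hypothetical name:

```python
from collections import Counter

# Candidate wordings for a failed action, as listed earlier in the thread.
FAILURE_PREFIXES = ("cannot", "could not", "did not", "failed to", "unable to")


def count_failure_prefixes(messages):
    """Count how many messages start with each failure wording."""
    counts = Counter()
    for message in messages:
        lowered = message.lower()
        for prefix in FAILURE_PREFIXES:
            # Require a following space so "cannot" does not match "cannotX".
            if lowered.startswith(prefix + " "):
                counts[prefix] += 1
                break
    return counts


if __name__ == "__main__":
    sample = [
        "Cannot open %s",
        "cannot open '%s'",
        "Failed to parse domain XML",
        "Unable to open '%s'",
    ]
    for prefix, count in count_failure_prefixes(sample).most_common():
        print(count, prefix)
```

This only looks at the start of the message, so wordings like "frobbing
failed" would need a separate pass.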
>
>> d) allow contractions or not? They are generally used in spoken/informal
>> language, and while libvirt is not that formal it should not be that
>> colloquial either IMHO; also, they make the text slightly harder to
>> understand by non-native speakers, and they are lost when translating.
>> A POV on the matter is:
>> https://www.businesswritingblog.com/business_writing/2006/04/dont_use_contra.html
>
>Yeah, I think I've seen enough recommendations about not using
>contractions, that we should apply that rule.
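
If that rule is adopted, the existing offenders could be listed with a
regular expression. A rough sketch, assuming a list of msgids as input;
the pattern deliberately skips "'s" because it cannot be told apart from
a possessive, and find_contractions is a hypothetical helper:

```python
import re

# Matches words such as "don't", "couldn't", "we're", "we've", "it'll".
CONTRACTION_RE = re.compile(r"\b\w+(?:n't|'re|'ve|'ll)\b", re.IGNORECASE)


def find_contractions(msgid):
    """Return the contractions found in msgid, in order of appearance."""
    return CONTRACTION_RE.findall(msgid)


if __name__ == "__main__":
    for msgid in ("Couldn't open '%s'", "Could not open '%s'"):
        print(msgid, "->", find_contractions(msgid))
```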
>
>> e) which message to use to indicate that QEMU does not support
>> something?
>
>I don't have a strong preference. Perhaps again just let a popularity
>contest decide it.
>
>
>
>I wonder if there's any clever python code we can pull in that reports
>on "similar" strings that we could usefully run across the pot file
>to identify candidates for sanitizing.
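
Python's standard difflib module might already be enough for a first
pass. The following is only a sketch under a few assumptions: the pot
parser is simplified (it ignores escape sequences and contexts), the
0.85 threshold is an arbitrary starting point, and both function names
are made up for this example:

```python
import difflib
from itertools import combinations


def extract_msgids(pot_text):
    """Return the non-empty msgid strings from pot_text (simplified)."""
    msgids, parts = [], None
    for raw in pot_text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            # Start of a new entry: keep the text between the quotes.
            parts = [line[6:].strip()[1:-1]]
        elif parts is not None and line.startswith('"'):
            # Continuation line of a multi-line msgid.
            parts.append(line[1:-1])
        elif parts is not None:
            # Any other line (msgstr, msgid_plural, blank) ends the msgid.
            text = "".join(parts)
            if text:
                msgids.append(text)
            parts = None
    if parts is not None and "".join(parts):
        msgids.append("".join(parts))
    return msgids


def similar_pairs(strings, threshold=0.85):
    """Return (a, b, ratio) for case-insensitively similar string pairs."""
    pairs = []
    for a, b in combinations(sorted(set(strings)), 2):
        ratio = difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b, ratio))
    # Most similar pairs first.
    return sorted(pairs, key=lambda pair: -pair[2])


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], encoding="utf-8") as pot_file:
        for a, b, ratio in similar_pairs(extract_msgids(pot_file.read())):
            print("%.2f\t%s\t%s" % (ratio, a, b))
```

Comparing every pair is quadratic, so on the full libvirt.pot it would
probably need some bucketing first (e.g. by first word or by length).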
>
>Also if there are many cases where we use roughly the same string
>message, then that's a candidate for creating a wrapper function
>to standardize on message text.
>
>eg we added a virReportEnumRangeError() so that we got guaranteed
>identical error messages for all enum range problems.
>
>Regards,
>Daniel
>-- 
>|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
>|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
>|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>