[zanata-users] push/pull should not add whitespace

Sean Flanigan sflaniga at redhat.com
Fri Jun 3 04:24:42 UTC 2011


On 06/03/2011 12:58 AM, Bryan Kearney wrote:
> I pushed up po files, and then pulled them back down. The attached file 
> is the diff. It appears to be mostly whitespace issues. I see two bugs here:
> 
> 1) The push/pull should be lossless.
> 2) There should be no extraneous white spaces introduced by the tool.

Hi Bryan,

Zanata doesn't guarantee that translation files (e.g. PO) will come back
bit-for-bit identical, only that the content is preserved: strings,
their translations, and metadata.

Zanata doesn't store PO files as PO files; it stores their content.  It
tries to keep the output files pretty similar when re-generating them,
but it's virtually impossible to get them identical in all cases.

For instance, the line-wrapping rules used by gettext aren't documented
anywhere, so Zanata can't guarantee to follow them.  (If you *really*
want to minimise diffs, you could try a commit-hook which passes your PO
files through msgcat before check-in.)
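
A rough sketch of that commit-hook idea, under some assumptions: the
function name normalise_po and the hook wiring shown in the comment are
hypothetical, and this is not something Zanata provides.  It relies only
on msgcat writing the re-formatted catalog to standard output when given
a single input file.

```shell
# Hypothetical helper: rewrite each PO file in the layout msgcat produces.
# The original is replaced only if msgcat succeeds on it.
normalise_po() {
    for f in "$@"; do
        if msgcat "$f" > "$f.tmp"; then
            mv "$f.tmp" "$f"
        else
            rm -f "$f.tmp"
            echo "msgcat failed on $f" >&2
            return 1
        fi
    done
}

# One possible way to call it from .git/hooks/pre-commit (an assumption,
# adjust to your repository layout):
#   git diff --cached --name-only --diff-filter=ACM -- '*.po' |
#   while read -r f; do
#       normalise_po "$f" && git add "$f"
#   done
```

Normalising before check-in means both the committed file and anything
later run through msgcat share the same wrapping, which should shrink the
whitespace-only part of the diff; it does not make Zanata's output
identical by itself.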

There are also a few cases where Zanata introduces empty comment lines,
which we really shouldn't...

https://bugzilla.redhat.com/show_bug.cgi?id=710321
https://bugzilla.redhat.com/show_bug.cgi?id=710322


Your diff also shows some source references disappearing, e.g.

-#: src/main/java/org/fedoraproject/candlepin/resource/EntitlementResource.java:115
-#: src/main/java/org/fedoraproject/candlepin/resource/ConsumerResource.java:651
#: src/main/java/org/fedoraproject/candlepin/resource/SubscriptionResource.java:98


That would be more concerning, but I think I've worked out what's happened.

I can't work directly with your diff, but I think I found your PO files
in git:
http://git.fedorahosted.org/git/?p=candlepin.git;a=tree;f=proxy/po;h=ae6153354fe4fa02402670ceba44e104547ddf11;hb=HEAD

I think most of the changes you're hitting are caused by the fact that
your PO files are out of date compared to the POT file.  If you run this:
  for f in *.po; do msgmerge --update "$f" keys.pot; done
you'll find that many of the source references change or disappear, the
key-value pairs are re-arranged, and some messages actually get
commented out by msgmerge, like this:

#~ msgid "Owner with UUID '{0}' could not be found"
#~ msgstr "UUID '{0}' থকা অধিকাৰী পোৱা নগল"

These are messages which no longer exist in the POT file, i.e. they are
obsolete.
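
If you'd rather see the effect before touching your files, something
like the following might help.  It is only a sketch: count_obsolete and
preview_merge are hypothetical names, and it assumes keys.pot (or
whatever POT you pass) sits alongside the PO files.  It merges into a
temporary file instead of using --update, then counts the "#~ msgid"
lines in the result.

```shell
# Count obsolete ("#~ msgid ...") entries in one PO file and print the count.
count_obsolete() {
    grep -c '^#~ msgid ' "$1"
}

# Hypothetical helper: merge each PO file against the POT into a temporary
# file and report how many obsolete entries the merged result would contain.
# The original PO files are left untouched.
preview_merge() {
    pot=$1
    shift
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        if msgmerge --quiet --output-file="$tmp" "$f" "$pot"; then
            echo "$f: $(count_obsolete "$tmp") obsolete"
        fi
        rm -f "$tmp"
    done
}

# Example call (commented out so that sourcing this file has no effect):
#   preview_merge keys.pot *.po
```

Note that the count includes entries that were already marked obsolete
in the PO file before the merge, so compare it with count_obsolete on
the original if you want only the newly obsoleted ones.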

So the other major source of differences is inconsistency between the
POT and PO files.  As with msgmerge, the POT file is considered
authoritative for most things, so the order of key-value pairs in the
header (msgid "") will be carried across when generating PO files.  The
same goes for the source references, and even the list of messages and
their order.

Regards

Sean.

-- 
Sean Flanigan

Senior Software Engineer
Engineering - Internationalisation
Red Hat
