[Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

Petr Viktorin pviktori at redhat.com
Fri Jul 20 09:09:50 UTC 2012


On 07/19/2012 10:52 PM, John Dennis wrote:
> On 06/25/2012 07:17 AM, Petr Viktorin wrote:
>> The translation files we currently store in Git are full of redundant
>> information: source strings for untranslated messages, and file
>> locations.
>> The first causes unnecessarily huge files. The second makes diffs
>> unreadable: when code is edited and line numbers change, metadata for
>> all messages shows up as changed. This makes reviewing translation
>> patches, and merging possible conflicts, hard -- it requires specialized
>> tools.
>>
>> This patch changes the Makefile to strip the unneeded data from .po
>> files.
>>
>> Translators using Git must now run msgmerge (or, `make merge-po`) to get
>> .po files they can work with. Transifex users are unaffected, as the
>> source .pot file is not changed.
>>
>> The i18n tests use file locations for producing nice error reports¹.
>> To make this work as before, the .pot is merged in before validation to
>> restore comments.
>> Currently this takes a noticeable amount of time, because polib uses a
>> particularly naïve algorithm for merging. I've sent a patch to polib to
>> resolve this; once that makes it downstream merging will be fast again.
>>
>> Updating the translations with the new Makefile will cause a >5MB patch.
>> I don't want to pollute the mailing list with it, at least until the
>> Makefile patch is reviewed. It's available
>> https://github.com/encukou/freeipa/commit/65e2e4.patch
>>
>>
>> https://fedorahosted.org/freeipa/ticket/2435
>>
>>
>> --
>> ¹ And for divining the programming language messages come from, but that
>> is only done on the .pot file, unaffected by this patch.
>
> Good work and it's very close to getting an ACK.
>
> There is now a discrepancy between what the Makefile thinks is the list
> of po files and the actual list of po files after running strip-po. This
> causes confusing errors.
>
> I think the source of this problem is the Makefile has a list of po
> files in the variable $(po_files)
>
> For starters why is:
>
> strip-po:
>      @for po_file in $$(ls *.po); do \
>
> instead of:
>
> strip-po:
>      @for po_file in $(po_files); do \

Good catch, I'll update it to be consistent with the status quo.
But see below.

> If you run "make validate-po" before running "make strip-po" you get:
>
> 5 errors in 21 files
>
> After stripping the po files "make validate-po" gives you:
>
> 14 errors in 21 files

I left updating the files to a subsequent patch 
(https://github.com/encukou/freeipa/commit/65e2e4.patch); the LINGUAS 
update is part of that.

> The extra 9 errors are due to the fact validate-po is being asked to
> validate a non-existent po file which it considers an error (which I
> believe is a correct check).
>
> "make msg-stats" gets confused for the same reason, it's asked to
> examine files that no longer exist.
>
> "make mo-files" now fails catastrophically for the same reason, it's
> being asked to operate on files that don't exist.
>
> In general large parts of the Makefile will now be confused or generate
> errors because the file list is incorrect.
>
>
> Somehow we have to align the list of po files. That presents all sorts
> of interesting questions:
>
> * does the list come from the LINQUAS file? (current method)
>
> * does the list come from git? Doesn't work if you're not in a git
> development tree. This problem is easily seen when the RPM's are built.
> No file list can be generated because there is no git repo so you end up
> with 0 files being passed to the validation commands. Since validation
> is not critical when building RPM's this hasn't been a show stopper but
> it really needs to be fixed in some way at some point.

I agree that tying ourselves to Git isn't a nice thing to do. I know I 
am never happy when I can't compile some project in Mercurial after 
importing it to Git :)

If we use the ls-files strategy then that should at least write the list 
to a version-controlled file, which we fall back to in case we're not in 
a git tree.

> * does the list come from the current directory contents? What you did
> with strip-po, but that also has a potential for errors. What if someone
> deletes or adds a file in their tree by mistake?

I personally would do this -- the most straightforward way to do it. If 
someone adds or deletes a file by mistake, a `git status` will reveal 
it. We could have a sanity check that refuses to build if there is a 
discrepancy between Git and the working tree (of course outside of a Git 
repo it would just warn).

There's one more reason for going with directory contents: when you're 
pulling from Transifex or otherwise adding/removing the translation 
files, you have to carefully keep LINGUAS in sync with the tree, 
otherwise the tools can either blow up or do too little. Debugging that 
could be frustrating. Having the tools look in the directory itself, and 
only doing sanity checking at a point where everything should be in 
order, should make everything easier.

> * should "make strip-po" edit the LINGUAS file? (maybe the best
> solution). Maybe when it detects an empty file and removes it it should
> run a sed command to delete the line in LINGUAS?

That strikes me as too much automation, the case where all translations 
for a language disappear should be pretty rare.
I did add a message saying you should update LINGUAS.

> It may not be evident from Makefile.in but over the years there has been
> competing strategies for how to get our list of files. Simo added the
> "git ls-files" strategy because he didn't want to have an explict list
> which had to be maintained (a valid concern) that still left us with the
> PY_EXPLICIT_FILES list, so how much did that really accomplish? Maybe
> PY_EXPLICIT_FILES can be removed in favor of a utility that tests the
> file type (or the hashbang interpreter line). But that still ties things
> to a git tree (ugh).

As for PY_EXPLICIT_FILES, I hope it dies. Actual code should be in 
importable modules, and the scripts should be one-liners that call the 
code. I hope IPA goes that way and I'll do what I can to make it happen.

> If you have any great ideas on how to address the file list issue it
> would be good to hear. However in the interim we have to somehow adjust
> the po file list after strip-po runs, once that's done I'm happy to ACK it.

> I wouldn't be surprised if you responded "Well, the file list
> discrepancy only occurs when a maintainer is explicitly stripping po
> files and they should know they have to adjust the LINGUAS file
> therefore these confusing errors won't be seen by someone who would be
> confused by them". Maybe yes, maybe no. I can think of plenty of times I
> debugged some build/configure/make failure and groaned because it was in
> some area that was totally cryptic and unknown to me, took a long time
> to unravel and had a trivial adjustment fix that would have only been
> known to an expert in that part of the code.
>
> But here are some other issues I saw when exercising the patch, not a
> problem with the patch but something that needs fixing.
>
> "make validate-pot" produces 11 errors, it's because the pot file is
> old, we have to a "make update-pot" and commit it to git.

As I said the actual language updates are left for another patch. I 
don't like mixing logic and data changes :)

> "make validate-po" shows 5 errors in es.po. I could have sworn I fixed
> these months ago. Maybe it only got committed to the 2.2. branch? We
> either have to fix the po file and push it to Transfix (but downloading
> a current TX version first!) or we have to go into TX and edit the
> offending msgstr's. Some of these errors are exactly the kind of thing
> which will cause us to throw an exception at run time if the locale is ES.

You'll also notice that there are a few more errors in the new messages 
on Transifex.

> Anyway, in summary the patch is great, the idea is good, we just need to
> fix the file list somehow. And at some point the other two issues above
> will need some attention.
>



-- 
Petr³


-------------- next part --------------
A non-text attachment was scrubbed...
Name: freeipa-pviktori-0066-02-Arrange-stripping-.po-files.patch
Type: text/x-patch
Size: 8081 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/freeipa-devel/attachments/20120720/178a7582/attachment.bin>


More information about the Freeipa-devel mailing list