[Freeipa-devel] [PATCH 70] validate i18n strings when running "make lint"

Thu Apr 19 11:04:08 UTC 2012

On 04/18/2012 09:32 PM, John Dennis wrote:
> On 04/18/2012 07:33 AM, Petr Viktorin wrote:
>> On 04/16/2012 10:32 PM, John Dennis wrote:
>>> On 04/12/2012 09:26 AM, Petr Viktorin wrote:
>>>> On 03/30/2012 03:45 AM, John Dennis wrote:
>>>>> Translatable strings have certain requirements for proper translation
>>>>> and run time behaviour. We should routinely validate those strings. A
>>>>> recent checkin to install/po/test_i18n.py makes it possible to
>>>>> validate
>>>>> the contents of our current pot file by searching for problematic
>>>>> strings.
>>>>>
>>>>> However, we only occasionally generate a new pot file, far less
>>>>> frequently than translatable strings change in the source base. Just
>>>>> like other checkin's to the source which are tested for correctness we
>>>>> should also validate new or modified translation strings when they are
>>>>> introduced and not accumulate problems to fix at the last minute. This
>>>>> would also raise the awareness of developers as to the requirements
>>>>> for
>>>>> proper string translation.
>>>>>
>>>>> The top level Makefile should be enhanced to create a temporary pot
>>>>> files from the current sources and validate it. We need to use a
>>>>> temporary pot file because we do not want to modify the pot file under
>>>>> source code control and exported to Transifex.
>>>>>
>>>>
>>>> NACK
>>>>
>>>> install/po/Makefile is not created early enough when running `make
>>>> rpms`
>>>> from a clean checkout.
>>>>
>>>> # git clean -fx
>>>> ...
>>>> # make rpms
>>>> rm -rf /rpmbuild
>>>> mkdir -p /rpmbuild/BUILD
>>>> mkdir -p /rpmbuild/RPMS
>>>> mkdir -p /rpmbuild/SOURCES
>>>> mkdir -p /rpmbuild/SPECS
>>>> mkdir -p /rpmbuild/SRPMS
>>>> mkdir -p dist/rpms
>>>> mkdir -p dist/srpms
>>>> if [ ! -e RELEASE ]; then echo 0> RELEASE; fi
>>>> sed -e s/__VERSION__/2.99.0GITde16a82/ -e s/__RELEASE__/0/ \
>>>> freeipa.spec.in> freeipa.spec
>>>> sed -e s/__VERSION__/2.99.0GITde16a82/ version.m4.in \
>>>>> version.m4
>>>> sed -e s/__VERSION__/2.99.0GITde16a82/ ipapython/setup.py.in \
>>>>> ipapython/setup.py
>>>> sed -e s/__VERSION__/2.99.0GITde16a82/ ipapython/version.py.in \
>>>>> ipapython/version.py
>>>> perl -pi -e "s:__NUM_VERSION__:2990:" ipapython/version.py
>>>> perl -pi -e "s:__API_VERSION__:2.34:" ipapython/version.py
>>>> sed -e s/__VERSION__/2.99.0GITde16a82/ daemons/ipa-version.h.in \
>>>>> daemons/ipa-version.h
>>>> perl -pi -e "s:__NUM_VERSION__:2990:" daemons/ipa-version.h
>>>> perl -pi -e "s:__DATA_VERSION__:20100614120000:" daemons/ipa-version.h
>>>> sed -e s/__VERSION__/2.99.0GITde16a82/ -e s/__RELEASE__/0/ \
>>>> ipa-client/ipa-client.spec.in> ipa-client/ipa-client.spec
>>>> sed -e s/__VERSION__/2.99.0GITde16a82/ ipa-client/version.m4.in \
>>>>> ipa-client/version.m4
>>>> if [ "redhat" != "" ]; then \
>>>> sed -e s/SUPPORTED_PLATFORM/redhat/ ipapython/services.py.in \
>>>>> ipapython/services.py; \
>>>> fi
>>>> if [ "" != "yes" ]; then \
>>>> ./makeapi --validate; \
>>>> fi
>>>> make -C install/po validate-src-strings
>>>> make[1]: Entering directory `/home/pviktori/freeipa/install/po'
>>>> make[1]: *** No rule to make target `validate-src-strings'. Stop.
>>>> make[1]: Leaving directory `/home/pviktori/freeipa/install/po'
>>>> make: *** [validate-src-strings] Error 2
>>>
>>> Updated patch attached.
>>>
>>> The fundamental problem is that we were trying to run code before
>>> configure had ever been run. We have dependencies on the output and side
>>> effects of configure. Therefore the solution is to add the
>>> bootstrap-autogen target as a dependency.
>>>
>>> FWIW, I tried an approach that did not require having bootstrap-autogen
>>> run first by having the validation occur as part of the normal "make
>>> all". But that had some undesirable properties, first we really only
>>> want to run validation for developers, not during a normal build,
>>> secondly the file generation requires a git repo (see below).
>>>
>>> But there is another reason to require running bootstrap-autogen prior
>>> to any validation (e.g. makeapi, make-lint, etc.) Those validation
>>> utilities need access to generated source files and those generated
>>> source files are supposed to be generated by the configure step. Right
>>> now we're generating them as part of updating version information. I
>>> will post a long email describing the problem on the devel list. So
>>> we've created a situation we we must run configure early on, even as
>>> part of "making the distribution" because part of "making the
>>> distribution" is validating the distribution and that requires the full
>>> content of the distribution be available.
>>>
>>> Also while trying to determine if the i18n validation step executed
>>> correctly I realized the i18n validation only emitted a message if an
>>> error was detected. That made it impossible to distinguish between empty
>>> input (a problem when not running in a development tree) and successful
>>> validation. Therefore I also updated i18n.py to output counts of
>>> messages checked which also caused me to fix some validations that were
>>> missing on plural forms.
>>>
>>>
>>
>> This still doesn't solve the problem: Make doesn't guarantee the order
>> of dependencies, so with the rule:
>> > lint: bootstrap-autogen validate-src-strings
>> > ./make-lint $(LINT_OPTIONS)
>> the validate-src-strings can fire before bootstrap-autogen:
>
> Fixed by having bootstrap-autogen be a dependency of the lint target and
> adding validate-src-strings to the lint rules.
>
>>
>> $ make lint
>> if [ ! -e RELEASE ]; then echo 0> RELEASE; fi
>> /usr/bin/make -C install/po validate-src-strings
>> make[1]: Entering directory `/home/pviktori/freeipa/install/po'
>> make[1]: *** No rule to make target `validate-src-strings'. Stop.
>> make[1]: Leaving directory `/home/pviktori/freeipa/install/po'
>> make: *** [validate-src-strings] Error 2
>> make: *** Waiting for unfinished jobs....
>>
>> When I move the bootstrap-autogen dependency to validate-src-strings, it
>> works. (Quite a lot of needed for a lint now, but discussion about that
>> is elsewhere.)
>
> I'm not sure if you're concern is with having to run bootstrap-autogen
> for lint or adding the string validation to the lint check.
>
> As I tried to point out in my email irrespective of validating the i18n
> strings we should have run bootstrap-autogen prior to lint because that
> is what is supposed to create the generated Python file(s) that we're
> asking pylint to validate.
>
> If the concern is with validating the i18n strings as part of lint then
> I'll make these observations:
>
> 1) validating the source strings is logically part of the lint check.
> Why? Because lint validates source files. The i18n string validation is
> also validating the contents of the same source files, it's just doing a
> job traditional lint can't perform, thus it's a logical extension of a
> lint type validation.
>
> 2) If the concern is with performing extra steps, performance, elapsed
> time to run the lint check etc. then
>
> a) On my laptop i18n validation takes 0.77s elapsed time
>
> b) On my laptop make-lint takes 1m17s or 77s elapsed time.
>
> c) Validating i18n strings is 100x faster than lint'ing the source
> code, or put another way adding i18n validation adds a mere 1% to
> the elapsed time needed to run lint.

Right, your solution is correct here.

>> Now that there are warnings, is pedantic mode necessary?
>
> Great question, I also pondered that as well. My conclusion was there
> was value in separating aggressiveness of error checking from the
> verbosity of the output. Also I didn't think we wanted warnings showing
> in normal checking for things which are suspicious but not known to be
> incorrect. So under the current scheme pedantic mode enables reporting
> of suspicious constructs. You can still get a warning in the normal mode
> for things which aren't fatal but are provably incorrect. An example of
> this would be missing plural translations, it won't cause a run time
> failure and we can be definite about their absence, however they should
> be fixed, but it's not mandatory they be fixed, a warning in this case
> seems appropriate.

If they should be fixed, we should fix them, and treat them as errors 
the same way we treat lint's warnings as errors. If the pedantic mode is 
an obscure option of some test module, I worry that nobody will ever run it.

Separating aggressiveness of checking from verbosity is not a bad idea. 
But since we now have two severity levels, and the checking is cheap, 
I'm not convinced that the aggressiveness should be tunable.
How about always counting the pedantic warnings, but not showing the 
details? Then, if such warnings are found, have the tool say how to run 
it to get a full report. That way people will notice it.

>> The validate_file function still contains some `return 1` lines, you
>> should update them all. Or just raise exceptions in these cases.
>>
>> $ tests/i18n.py --validate-pot nonexisting_file
>> file does not exist "nonexisting_file"
>> Traceback (most recent call last):
>> File "tests/i18n.py", line 817, in<module>
>> sys.exit(main())
>> File "tests/i18n.py", line 761, in main
>> n_entries, n_msgids, n_msgstrs, n_warnings, n_errors =
>> validate_file(f, validation_mode)
>> TypeError: 'int' object is not iterable
>
> Good catch, thank you! (see below)
>
>> I'd also return a namedtuple instead of a plain tuple, and have the
>> caller access the items by name. That way, if validate_file decides to
>> return more items in the future, we won't have to rewrite all calls to
>> it.
>
> Yeah, FWIW I thought about making the return values named tuple when I
> modified the code (are you a mind reader?).

There should, after all, be one obvious way to do it :)

> Not sure why I didn't,
> probably because I wanted to limit the engineering effort. Anyway, it's
> now a named tuple and the return values have been sanitized.
>
> Revised patch attached.

-- 
Petr³