Planning a future L10N infrastructure (including Fedora)

Asgeir Frimannsson asgeirf at redhat.com
Tue Sep 16 06:16:42 UTC 2008


On Mon, Sep 15, 2008 at 10:48 PM, "Sankarshan (সঙ্কর্ষণ)" 
<sankarshan.mukhopadhyay at gmail.com> wrote:
>
> Asgeir Frimannsson wrote:
>
> > Some of the immediate needs that could be addressed within the existing
> > framework (some of which are on the Transifex roadmap) are:
> > - Consolidation of Damned Lies and Transifex, allowing retrieving and
> > submitting translations through the same interface
> > - Allowing retrieving and submitting multiple-files at once (e.g. for
> > translating a publican document with many PO files)
> > - Simple workflow on top of Transifex (porting features from Vertimus)
> > - Better usability and easier user registration process (Fedora specific)
>
> Which ones are not on Tx roadmap ? And, how are those elements proposed
> to be met ?

Background:
http://groups.google.com/group/transifex-devel/browse_thread/thread/a637310e8ff63555
http://transifex.org/wiki/Development/Roadmap
http://transifex.org/roadmap

Dimitris is a better person to answer this, but I believe we already have 
basic statistics support finished in Transifex upstream. Also, I know Stéphane 
Raimbault is involved in integrating the concepts found in Vertimus into 
Transifex. What the future regarding integration of Damned Lies concepts such 
as Teams and Releases is, I am not sure.

> > Looking at the bigger picture, some of the core requirements we have
> > identified for Red Hat and community L10N going forward are:
> > - Customizable Translation Workflows and integration with e.g. Content
> > Authoring Workflows
> > - Infrastructure easily adaptable to support new File formats and project
> > types (e.g. OpenOffice formats, CMS formats, DTP formats, Wiki, Dita, Java
> > formats), rather than relying on 'upstream' projects to fit a certain L10N
> > infrastructure.
> > - Managing the life-cycle of a translation project across releases and
> > iterations
> > - Translation Reuse and Terminology Management across projects and
> > iterations
> > - Job management, scoping, tracking and resourcing
> > - Managing and/or Tracking upstream translation projects, pushing changes
> > back upstream.
>
> Since Tx is gaining traction with other communities as well, is it
> prudent to open the net wider and ask about the requirements from such
> communities ?

Yes. I would however add that this project is not directly linked with Tx at 
this point. Dimitris has done a great job in networking with other 
communities, and has a plan for Tx that goes way beyond Fedora.

> > These requirements require a system where the translation lifecycle would
> > be managed within 'Translation Repositories' (similar to e.g. Pootle or
> > Launchpad Translations), rather than directly through e.g. upstream version
> > control systems. With a repository-based approach, we would be able to
> > track and manage changes to a project on a translation unit level, and
> > manage e.g. translation reuse and terminology within and across projects.
> > We could still retain a link with upstream repositories (like with
> > Transifex/Damned Lies). However, this would not be the 'core datamodel',
> > but on a different layer through plug-ins. This link to external
> > repositories could also go beyond traditional version control systems,
> > communicating with external sources like wikis and CMSs.
>
> Does Transifex allow such a set of 'plug-ins' ? If yes, how would one go
> about integrating them within the plans of Transifex ? If not, how does
> the integration happen ?

The existing Transifex handles very different concepts from what is described 
here, and writing this on top of Transifex would be hard. 

Take for example the 'submission' page. This page is centered around 
submitting a file to a repository (or to Bugzilla or email, as a result of 
Christos' SoC work). In a repository model, the 'submit' action would be more 
about updating the internal state of a project within a repository. To achieve 
the same 'workflow' as Transifex, an external plugin could then 'listen' to 
these changes and transparently submit changes upstream (even by interacting 
with Transifex?). In this sense, the project we are proposing could use the 
submission logic of Tx, but handle that in the background. There will be 
little reuse of actual code from Transifex (most of which is UI-Model 
interaction which is linked with file submission).
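
To make the 'listen and submit upstream' idea concrete, here is a minimal 
sketch. It is written in Python only for brevity, every class and method name 
in it is hypothetical (none of this is Transifex code), and the plug-in merely 
records what it would submit rather than talking to a real upstream:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical listener signature: (project, unit_id, target)
Listener = Callable[[str, str, str], None]


@dataclass
class TranslationRepository:
    """Holds translation state internally; upstream sync is left to plug-ins."""
    units: Dict[Tuple[str, str], str] = field(default_factory=dict)
    listeners: List[Listener] = field(default_factory=list)

    def subscribe(self, listener: Listener) -> None:
        self.listeners.append(listener)

    def submit(self, project: str, unit_id: str, target: str) -> None:
        # 'Submit' only updates the repository's internal state...
        self.units[(project, unit_id)] = target
        # ...and then notifies plug-ins, which may push the change upstream.
        for listener in self.listeners:
            listener(project, unit_id, target)


class UpstreamSubmitPlugin:
    """Stand-in for a plug-in that would commit to a VCS or call Transifex."""

    def __init__(self) -> None:
        self.outbox: List[Tuple[str, str, str]] = []

    def __call__(self, project: str, unit_id: str, target: str) -> None:
        # A real plug-in would submit upstream here; this one only records.
        self.outbox.append((project, unit_id, target))


repo = TranslationRepository()
plugin = UpstreamSubmitPlugin()
repo.subscribe(plugin)
repo.submit("anaconda", "msg-1", "Velkommen")
```

The point of the split is that the repository never needs to know whether a 
change ends up in a version control system, a wiki or a CMS.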

It is also important to note that the internal format of the repository will 
not be PO, but a much richer format more similar to e.g. XLIFF, that 
accommodates features such as change tracking and terminology-annotations 
within translation units. PO would still be supported as an input format, as 
well as an intermediate format that is sent to translators using existing PO 
tools. However, in the long term, we aim to provide translators with richer 
tools that can make use of the additional meta-data that is part of the 
repository. 

Dimitris mentioned on IRC the other day that the concept of a Translation 
Repository similar to e.g. Pootle had been briefly discussed, and could be part 
of Transifex in the future. This is exciting news: making more use of 
Translate Toolkit is something that could take Tx to the next level quickly. 
I think Transifex with its existing Roadmap serves a very useful purpose, and 
we are not trying to 'hijack' that project in any way. In fact, I am also 
pushing towards putting more resources into Transifex development (read 
between the lines whatever you want here). 

What we are about to develop is a new way of doing localisation repositories 
and workflow, more similar to what happens in many commercial tools than what 
we see in open source communities. I feel a bit 'uneasy' about pushing that 
onto the Tx roadmap at this stage, and also uneasy about developing such a 
'workflow system' in TurboGears.

> > We have evaluated a number of existing open source L10N frameworks and
> > systems, but haven't found any (yet) that stands out or satisfies our
> > needs or requirements as a development platform. Technology-wise, we are
> > aiming to develop a Java-based(!) system, using technology such as JBoss
> > Seam, Hibernate, jBPM and RichFaces. A java based platform will enable us
> > to make best use of internal expertise in these technologies, as well as
> > making use of technology we are developing (as open source) through
> > collaboration with partners in the L10N industry.
>
> Can the results of the evaluation be shared ?

So the alternatives would be Pootle (Translate Toolkit), and Transifex 
(Tx+DL+Vertimus), Pootle clearly being the more mature from a 
resource-management perspective. Pootle works with PO, XLIFF and many other 
formats. Still, it is very limited in e.g. workflow support and translation 
memory management. One of the main architectural limitations of Translate 
Toolkit is its inheritance hierarchy, where all resource formats (e.g. PO, 
Properties, XLIFF, TMX) inherit from a base resource class. A 'pivot' format 
(similar to e.g. XLIFF) with converters to and from the native format is what 
we're looking for. Nevertheless, Translate Toolkit (and even Damned Lies) 
embodies a lot of knowledge about how to handle specific project types 
(intltool, gnome-doc-utils, firefox, openoffice). This is reusable across 
solutions.
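
A minimal sketch of the pivot idea, assuming a single neutral unit type and 
one pair of converter functions per native format. All names are 
hypothetical, the sketch is not based on Translate Toolkit's API, and the 
properties handling ignores escapes and line continuations:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PivotUnit:
    """Format-neutral translation unit that every converter targets."""
    unit_id: str
    source: str
    target: str = ""


def properties_to_pivot(text: str) -> List[PivotUnit]:
    """Read simple 'key=value' lines, skipping blanks and '#' comments."""
    units = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        units.append(PivotUnit(unit_id=key.strip(), source=value.strip()))
    return units


def pivot_to_properties(units: List[PivotUnit]) -> str:
    """Write targets back out, falling back to the source if untranslated."""
    return "\n".join(f"{u.unit_id}={u.target or u.source}" for u in units)


# Formats plug in as converter pairs rather than as subclasses of a base class.
CONVERTERS = {"properties": (properties_to_pivot, pivot_to_properties)}

read, write = CONVERTERS["properties"]
units = read("# greeting\nhello=Hello\nbye=Goodbye\n")
units[0].target = "Hei"
result = write(units)
```

Supporting a new format then means registering another converter pair, 
without touching the pivot model or the other formats.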

Feature-wise, it is much more interesting to compare with e.g. Idiom 
WorldServer and Lionbridge Freeway, which are commercial solutions in the L10N 
space. 

cheers,
asgeir




More information about the Fedora-trans-list mailing list