Planning a future L10N infrastructure (including Fedora)
Asgeir Frimannsson
asgeirf at redhat.com
Mon Sep 15 06:09:22 UTC 2008
Hi infrastructure wranglers,
(cc transifex-devel)
Over the last few months, a few of us involved in Red Hat L10N engineering
have discussed how to best ensure we have Localisation Infrastructure and
Tools that can serve the needs of Red Hat, JBoss, Fedora and 'upstream'
communities in years to come. Let me first describe some of the background and
requirements behind this project:
Up until now, we have managed translations through version control systems
such as CVS, SVN and Git. This has ensured that all contributions are pushed
upstream, as we always store translations within the upstream repositories and
projects. 'Damned Lies' further gave us a tool to view language-specific
translation statistics for modules, branches and releases, as well as
convenient information about people, teams and projects. The work of Dimitris
(and others) on Transifex has in addition given the translation community a
way to submit translations upstream without ever touching a developer-centric
version control system. Both tools have been of great help to translators.
Some of the immediate needs that could be addressed within the existing
framework (some of which are on the Transifex roadmap) are:
- Consolidation of Damned Lies and Transifex, so that translations can be
retrieved and submitted through the same interface
- Retrieving and submitting multiple files at once (e.g. for translating a
Publican document with many PO files)
- Simple workflow on top of Transifex (porting features from Vertimus)
- Better usability and easier user registration process (Fedora specific)
Transifex is gaining some traction upstream (e.g. within GNOME), and we hope
development will continue strongly, serving Fedora and potentially other
upstream communities.
Looking at the bigger picture, some of the core requirements we have identified
for Red Hat and community L10N going forward are:
- Customizable Translation Workflows and integration with e.g. Content
Authoring Workflows
- Infrastructure easily adaptable to support new file formats and project
types (e.g. OpenOffice formats, CMS formats, DTP formats, Wiki, DITA, Java
formats), rather than relying on 'upstream' projects to fit a certain L10N
infrastructure
- Managing the life-cycle of a translation project across releases and
iterations
- Translation Reuse and Terminology Management across projects and iterations
- Job management, scoping, tracking and resourcing
- Managing and/or Tracking upstream translation projects, pushing changes back
upstream.
These requirements call for a system where the translation lifecycle would be
managed within 'Translation Repositories' (similar to e.g. Pootle or Launchpad
Translations), rather than directly through e.g. upstream version control
systems. With a repository-based approach, we would be able to track and
manage changes to a project on a translation unit level, and manage e.g.
translation reuse and terminology within and across projects. We could still
retain a link with upstream repositories (as Transifex and Damned Lies do).
However, that link would not be part of the 'core data model'; it would sit on
a separate layer, implemented through plug-ins. Such plug-ins could also go
beyond traditional version control systems, communicating with external
sources like wikis and CMSs.
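To make the repository idea a bit more concrete, here is a rough sketch of how
the layering could look in Java. This is only an illustration under my own
assumptions, not a design we have settled on: the names TranslationUnit,
TranslationRepository and UpstreamConnector are all hypothetical, storage is
an in-memory map, and 'reuse' is reduced to an exact-match lookup across
projects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** One translatable segment, tracked on its own rather than as part of a file. */
final class TranslationUnit {
    final String project;
    final String source;
    final String locale;
    final String target;

    TranslationUnit(String project, String source, String locale, String target) {
        this.project = project;
        this.source = source;
        this.locale = locale;
        this.target = target;
    }
}

/** Hypothetical plug-in layer linking the repository to a VCS, wiki or CMS. */
interface UpstreamConnector {
    List<TranslationUnit> pull(String project);
    void push(String project, List<TranslationUnit> units);
}

/** Core data model: units live here; upstream links are attached via connectors. */
final class TranslationRepository {
    private static final String SEP = "\u0000";
    // Keyed by project + locale + source text.
    private final Map<String, TranslationUnit> units = new HashMap<>();
    // Keyed by locale + source text only, so translations are shared across projects.
    private final Map<String, String> memory = new HashMap<>();

    void store(TranslationUnit u) {
        units.put(u.project + SEP + u.locale + SEP + u.source, u);
        memory.put(u.locale + SEP + u.source, u.target);
    }

    /** Exact-match reuse: returns a translation stored by any project, if one exists. */
    Optional<String> suggest(String locale, String source) {
        return Optional.ofNullable(memory.get(locale + SEP + source));
    }

    void importFrom(UpstreamConnector connector, String project) {
        for (TranslationUnit u : connector.pull(project)) {
            store(u);
        }
    }

    void exportTo(UpstreamConnector connector, String project) {
        List<TranslationUnit> out = new ArrayList<>();
        for (TranslationUnit u : units.values()) {
            if (u.project.equals(project)) {
                out.add(u);
            }
        }
        connector.push(project, out);
    }
}
```

The point of the split is that the repository never needs to know whether a
project lives in Git, a wiki or a CMS; only the connector does. A real system
would obviously need persistence, fuzzy matching, unit history and workflow
state on top of this.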
We have evaluated a number of existing open source L10N frameworks and
systems, but have not (yet) found one that stands out or satisfies our
requirements as a development platform. Technology-wise, we are aiming to
develop a Java-based(!) system, using technology such as JBoss Seam,
Hibernate, jBPM and RichFaces. A Java-based platform will enable us to make
best use of internal expertise in these technologies, and to build on
technology we are developing (as open source) through collaboration with
partners in the L10N industry.
We hope some of these requirements and ideas will excite some of you, and
ultimately lead to something that can be of use to open source communities.
While we have certain requirements and goals for this internally within the
company, there is no need for this to be an 'internal' Red Hat project, and
most of the requirements and needs overlap with those of community projects
like Fedora. In other words, by developing this in collaboration with the
community from a very early stage, we are more likely to develop something
that may be of use to the greater community.
Thoughts and comments, all sorts of comments, are very welcome.
cheers,
asgeir frimannsson
(Senior Software Engineer, I18N Engineering, Red Hat APAC)