[zanata-bugs] [Bug 999729] RFE: Support TM Export as TXT file

bugzilla at redhat.com bugzilla at redhat.com
Tue Sep 17 07:59:34 UTC 2013


https://bugzilla.redhat.com/show_bug.cgi?id=999729



--- Comment #6 from Sean Flanigan <sflaniga at redhat.com> ---
Thanks.

Here's an excerpt:

<Segment>0000013719
<Control>
00011800000001122533351English(U.S.)JAPANESEXXXXXX_.000XXX_.dita
</Control>
<Source>Click <uicontrol outputclass="XXXguicontrol">OK</uicontrol>.</Source>
<Target><uicontrol outputclass="XXXguicontrol">「OK」</uicontrol>をクリックします。</Target>
</Segment>
...

It looks a bit like XML, but it has no top-level element, and the segments
contain arbitrary nested tags (like "uicontrol").  I think its main virtue is
that each Source and Target segment is on a line by itself, which should help
with grep.
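
For illustration, here is a minimal Python sketch of the kind of grep-style
lookup this layout makes easy.  The format is inferred only from the excerpt
above, so the regular expression, the names iter_pairs and search, and the
whitespace handling are all assumptions rather than a real parser for the
format:

```python
import re

# Assumption: every segment holds one <Source> immediately followed by one
# <Target>, as in the excerpt.  The real format may differ.
PAIR_RE = re.compile(
    r"<Source>(?P<source>.*?)</Source>\s*<Target>(?P<target>.*?)</Target>",
    re.DOTALL,
)

SAMPLE = """<Segment>0000013719
<Control>
00011800000001122533351English(U.S.)JAPANESEXXXXXX_.000XXX_.dita
</Control>
<Source>Click <uicontrol outputclass="XXXguicontrol">OK</uicontrol>.</Source>
<Target><uicontrol
outputclass="XXXguicontrol">「OK」</uicontrol>をクリックします。</Target>
</Segment>
"""


def iter_pairs(text):
    """Yield (source, target) pairs, collapsing any whitespace runs
    (including line wraps added in transit) into single spaces."""
    for match in PAIR_RE.finditer(text):
        source = " ".join(match["source"].split())
        target = " ".join(match["target"].split())
        yield source, target


def search(text, needle):
    """Return the pairs whose source or target contains needle."""
    return [(s, t) for s, t in iter_pairs(text) if needle in s or needle in t]


for source, target in search(SAMPLE, "OK"):
    print(source)
    print(target)
```

Collapsing whitespace changes the segment text slightly, so this is only
suitable for searching, not for round-tripping the data.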

We could look at making sure our TMX is exported in a neatly formatted way,
with only one string per line.  It would look something like this:

<tu srclang="en-US" tuid="myproject:1.0:myproject:edbc3dc4ac083b40418f0dee7f552177">
  <tuv xml:lang="en-US">
    <seg>Disk Usage Analyzer</seg>
  </tuv>
  <tuv xml:lang="ja">
    <seg>ディスク使用量の解析</seg>
  </tuv>
</tu>
...

In the meantime, you could run Zanata's exported TMX through an XML pretty
printer.  If you have XMLStarlet installed, you can format TMX like this:

$ xmlstarlet fo zanata-myproject-1.0-allLocales.tmx

As root, you can install XMLStarlet from Fedora or EPEL with:
# yum install xmlstarlet

But I'm sure there are other XML formatters too.
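
One such alternative, as a minimal sketch using only the Python standard
library: it indents the structural elements and leaves the content of each
<seg> untouched, so a segment that starts on one line stays on one line.  The
function names and the trimmed-down sample document are hypothetical, and
nothing here is specific to Zanata's exporter:

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down TMX used only to demonstrate the formatter.
SAMPLE_TMX = (
    '<tmx version="1.4"><body>'
    '<tu srclang="en-US" tuid="example">'
    '<tuv xml:lang="en-US"><seg>Disk Usage Analyzer</seg></tuv>'
    '<tuv xml:lang="ja"><seg>ディスク使用量の解析</seg></tuv>'
    '</tu></body></tmx>'
)


def indent_tmx(elem, level=0, space="  "):
    """Indent structural elements in place; never touch <seg> content."""
    if elem.tag == "seg" or len(elem) == 0:
        return  # keep the segment, inline tags included, exactly as it is
    child_indent = "\n" + space * (level + 1)
    if not (elem.text or "").strip():
        elem.text = child_indent
    for child in elem:
        indent_tmx(child, level + 1, space)
        if not (child.tail or "").strip():
            child.tail = child_indent
    last = elem[-1]
    if not last.tail.strip():
        last.tail = "\n" + space * level  # dedent before the closing tag


def format_tmx(xml_text):
    """Parse a TMX string and return it re-indented."""
    root = ET.fromstring(xml_text)
    indent_tmx(root)
    return ET.tostring(root, encoding="unicode")


print(format_tmx(SAMPLE_TMX))
```

Note that ElementTree drops the XML declaration, any DOCTYPE, and comments
when used this way, and a segment that already contains line breaks is not
joined, so this is a starting point rather than a drop-in replacement for
xmlstarlet.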

Would something like that work?
