Best practices for translating XML documents?

Question
-
What are best practices for translating XML documents with MT Hub?
And how can I export the translated document to validate the output markup?
Right now (translating the documents as plain text), I'm getting:
- spaces inserted into tags (<em> becomes < em >)
- significant reordering of tags vs text
- tag names are getting translated
Examples follow (enu > ptb), each source line followed by its translation:
<pair name="position" type="integer">1</pair>
< nome par = "cargo" type = "inteiro" > 1 < / par >
<li>Go to the <em>Payment</em> section of your <a href="https://www.linkedin.com/secure/settings" target="_blank">Settings</a> page.</li>
< li > vá até a seção de < em > < /em > do pagamento da sua < a href = "https://www.linkedin.com/secure/settings" target = blank"> página de configurações de </a >. </li >
<li>Click <em>Manage Billing</em> Info in the upper right corner.</li>
Clique em < li > < em > Gerenciar cobrança < /em > informações no canto superior direito canto. </li >
<li>Click the <em>Edit</em> link and make your updates.</li>
Clique no link de < em > Editar < /em > de < li > e faça suas atualizações. </li >
- Edited by MikeD_AMTA Wednesday, February 18, 2015 7:17 PM
Wednesday, February 18, 2015 7:10 PM
Answers
-
Made an edit to my above answer. It is important to consider whether each of your XML elements breaks a sentence or sits inside one.
Chris Wendt
Microsoft Translator
- Marked as answer by Microsoft Translator (Microsoft employee) Friday, March 27, 2015 12:44 AM
Friday, February 20, 2015 4:42 PM
All replies
-
- Transform your XML document to XHTML using an XSL transform. Choose appropriate HTML elements for your untranslatable elements, for example <code> or <script> for code. Consider which elements are sentence-internal and which are sentence-ending, and map them to HTML tags that behave the same way. Example: <title> is sentence-ending, <a> is sentence-internal. Look at your HTML document in a browser after the transform: if it doesn't look right, it won't translate right. Avoid at all costs introducing sentence breaks where they don't belong, or gluing words into a sentence where they shouldn't be, for example in a table.
- Make sure the resulting document is not longer than 10000 characters. If it is, split it into smaller but complete elements.
- Translate with content-type="text/html".
- Transform (XSLT) back to your original XML schema.
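The splitting step above can be sketched in Python. This is only an illustration under stated assumptions: the function name `split_into_chunks` is hypothetical, the input is assumed to be well-formed XHTML whose wrapper element contains only child elements (no loose text before the first child), and the 10000-character limit is the one quoted in the list above.

```python
import xml.etree.ElementTree as ET

MAX_CHARS = 10000  # limit quoted in the answer; adjust if your service differs


def split_into_chunks(xhtml, max_chars=MAX_CHARS):
    """Group the children of the wrapper element into chunks of complete elements.

    Assumption: the wrapper holds only child elements; any text directly
    inside the wrapper before its first child is not carried over.
    """
    root = ET.fromstring(xhtml)
    chunks, current, size = [], [], 0
    for child in root:
        # Serialize the whole child so no element is ever cut in half.
        piece = ET.tostring(child, encoding="unicode")
        if len(piece) > max_chars:
            raise ValueError(
                f"<{child.tag}> alone exceeds {max_chars} characters; split it further"
            )
        # Start a new chunk when adding this element would cross the limit.
        if current and size + len(piece) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(piece)
        size += len(piece)
    if current:
        chunks.append("".join(current))
    return chunks
```

Each returned string holds one or more complete elements and can be translated on its own, then concatenated in order before the transform back to the original schema.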
Content-type text/html is designed to preserve HTML formatting and nesting. text/plain will not do that. However, you are pretty much forced into text/plain if you need to translate incomplete XML elements.
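To check whether the markup survived translation, one possible approach is to parse both the source and the translated fragment and compare the element names. The sketch below is an assumption-laden illustration, not part of any Microsoft tool: `tag_sequence` and `markup_preserved` are hypothetical helper names, and the fragments are assumed to be XHTML without named HTML entities such as `&nbsp;`, which a plain XML parser would reject.

```python
import xml.etree.ElementTree as ET


def tag_sequence(fragment):
    """Return element names in document order.

    Raises ET.ParseError if the fragment is not well-formed.
    """
    # Wrap the fragment so several sibling elements parse as one document.
    root = ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    return [el.tag for el in root.iter()][1:]  # drop the wrapper itself


def markup_preserved(source, translated):
    """True if both fragments are well-formed and use the same set of tags.

    Tag order is ignored on purpose: inline elements such as <em> may
    legitimately move when word order changes in the target language.
    """
    try:
        return sorted(tag_sequence(source)) == sorted(tag_sequence(translated))
    except ET.ParseError:
        # Broken tags such as "< em >" or translated tag names end up here.
        return False
```

Run against the last example pair in the question, `markup_preserved` returns False, because `< em >` with inserted spaces is not a well-formed tag.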
Let us know how it goes,
Chris Wendt
Microsoft Translator
- Edited by Chris Wendt (Microsoft employee) Friday, February 20, 2015 4:44 PM
Thursday, February 19, 2015 5:24 PM
-
Thanks once again, Chris!
: )
Best,
Mike
Thursday, February 19, 2015 5:28 PM
-
Thanks. That said, we should not be messing with the individual tags even in plain text. Let me check...
Thursday, February 19, 2015 5:31 PM