Translation Markup Language (TML)

Translation Markup Language (TML) is used to identify the non-translatable or dynamic data within labels. It provides a way to mark data and decoration tokens within the strings that need to be translated. Different types of applications can use TML: web, mobile and desktop. Some use HTML, others use wiki-like syntax for decorating labels. TML abstracts away the decoration mechanisms used by each application and provides its own simple but powerful syntax instead. This allows translations to be shared across multiple applications.

Basics

The TML Client SDK provides functions like tr for translation. The function can be called in any of the following ways:

tr(label, description, tokens, options)

// You can skip description:
tr(label, tokens, options)  

You can also pass parameters as a hash:

tr({label: LABEL, tokens: TOKENS, options: OPTIONS})  

Alternatively, you can use string extensions:

"some text".translate(tokens, options, language)
"some text".translate(description, tokens, options, language)
  • label is the only required parameter.
  • description is an optional parameter, but should always be used if the label by itself is not sufficient to convey the meaning of the phrase.
  • tokens is an optional parameter that contains a hash (or a dictionary) of token values to be substituted in the label.
  • options provides a mechanism for passing additional directives to the translation engine.
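
The call styles above can be reduced to one parameter object before any translation happens. The following is a minimal sketch of that idea, not the SDK's implementation: normalizeTrArgs is a hypothetical helper, and it assumes the description is always a string while tokens and options are always objects.

```javascript
// Hypothetical helper (not part of the TML SDK): maps the call styles
// shown above onto a single {label, description, tokens, options} object.
function normalizeTrArgs(label, description, tokens, options) {
  // Hash form: tr({label: ..., tokens: ..., options: ...})
  if (typeof label === "object" && label !== null) {
    return {
      label: label.label,
      description: label.description || "",
      tokens: label.tokens || {},
      options: label.options || {},
    };
  }
  // Short form: tr(label, tokens, options); the second argument is not a string
  if (typeof description !== "string") {
    return { label: label, description: "", tokens: description || {}, options: tokens || {} };
  }
  // Full form: tr(label, description, tokens, options)
  return { label: label, description: description, tokens: tokens || {}, options: options || {} };
}

console.log(normalizeTrArgs("Hello {user}", { user: "Michael" }));
```

Distinguishing the short form by the type of the second argument is one possible design choice; it works only because a description is never an object.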

Let's start with a sample phrase. We'll be using JavaScript for the examples.

tr("Hello World")  
// Hello World

The description of a phrase is not mandatory, but it should be used whenever the label alone is not sufficient to determine the meaning of the sentence being translated. As a general rule, always provide a description for words, phrases and sentences that are only meaningful within a specific context. TML uses the label and description together to create a unique key for each phrase. The description therefore serves two purposes: it makes the key unique for each use of a label, and it gives translators a hint about the context in which the label is used.

For example, the following two phrases will be registered as two independent entries in the database, because they have the same label but different descriptions. Each one will have to be translated separately, as they will have different translations in other languages.

tr("Invite", "Link to invite your friends to join the site")  
// Invite
tr("Invite", "An invitation you received from your friend")  
// Invite

It is important to provide the best possible description for each phrase from the start. Keep in mind that changing a description in the future, after it has already been translated, will register a new phrase in the database and invalidate all of its translations. On the other hand, labels that are complete sentences may not need a description as they are fully self-contained.
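
To see why changing a description registers a new phrase, it helps to picture the key as a function of both the label and the description. The sketch below is only an illustration under that assumption: phraseKey is a hypothetical function, and the real SDK may hash or format its keys differently.

```javascript
// Hypothetical key function: the real SDK may build its keys differently.
function phraseKey(label, description) {
  // Serializing an array keeps the two parts unambiguous
  return JSON.stringify([label, description || ""]);
}

const linkKey = phraseKey("Invite", "Link to invite your friends to join the site");
const noticeKey = phraseKey("Invite", "An invitation you received from your friend");

// Same label, different descriptions: two independent entries
console.log(linkKey === noticeKey); // false

// A label without a description always maps to the same entry
console.log(phraseKey("Hello World") === phraseKey("Hello World", "")); // true
```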

Decorations

Decoration tokens are used to inject styling into translations. In other SDKs, such as those for iOS or Android, the tokens can be mapped to a native decoration framework.

Decorations can be defined as strings, where {$0} indicates the translated value being processed.

tr("Hello [bold: World]", {bold: "<strong>{$0}</strong>"})

// Hello <strong>World</strong>

The token values can be passed as functions/lambdas.

tr("Hello [bold: World]", {bold: function(value) {
  return "<strong>" + value + "</strong>";
}})

// Hello <strong>World</strong>
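
Both kinds of decoration value can be handled by one substitution pass. The following is a rough sketch of how that might work, not the SDK's code: decorate is a hypothetical function, it only understands the short [name: text] notation, and it ignores nesting.

```javascript
// Hypothetical sketch of short-notation decoration tokens, e.g. "[bold: World]".
// Nested tokens are not handled here.
function decorate(label, tokens) {
  return label.replace(/\[(\w+):\s*([^\]]*)\]/g, function (match, name, value) {
    const known = Object.prototype.hasOwnProperty.call(tokens, name);
    const decorator = known ? tokens[name] : undefined;
    if (typeof decorator === "function") {
      return decorator(value); // function/lambda form
    }
    if (typeof decorator === "string") {
      return decorator.split("{$0}").join(value); // string template form
    }
    return value; // unknown decorator: keep the plain text
  });
}

console.log(decorate("Hello [bold: World]", { bold: "<strong>{$0}</strong>" }));
// Hello <strong>World</strong>
```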

Predefined Decorators

The TML SDK comes with a number of predefined decorators like bold, italic and link. Some predefined decorators can take extra parameters. For instance, the link decorator:

tr("Hello [link: World]", {link: {href: "/world", class: "btn-link"}});

// Hello <a href="/world" class="btn-link">World</a>

Name     Template
strong   <strong>{ $0 }</strong>
bold     <strong>{ $0 }</strong>
b        <strong>{ $0 }</strong>
em       <em>{ $0 }</em>
italic   <em>{ $0 }</em>
i        <i>{ $0 }</i>
link     <link href='{ $href }' class='{ $class }' style='{ $style }' >{ $0 }</link>
br       <br>{ $0 }
strike   <strike>{ $0 }</strike>
div      <div id='{ $id }' class='{ $class }' style='{ $style }'>{ $0 }</div>
span     <span id='{ $id }' class='{ $class }' style='{ $style }'>{ $0 }</span>
h1       <h1>{ $0 }</h1>
h2       <h2>{ $0 }</h2>
h3       <h3>{ $0 }</h3>

View the configuration options for your specific SDK to learn more.
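
A template with extra parameters can be thought of as a string with two kinds of placeholders: $0 for the decorated text and named ones such as $id or $class for attributes. The sketch below shows one way such a template could be filled in. fillTemplate is a hypothetical function, and rendering a missing attribute as an empty string is an assumption rather than documented SDK behavior.

```javascript
// Hypothetical sketch: fills a decorator template with the decorated
// text ($0) and the attributes passed in the token value.
function fillTemplate(template, value, attributes) {
  // Accepts both "{$0}" and "{ $0 }" spellings of a placeholder
  return template.replace(/\{\s*\$(\w+)\s*\}/g, function (match, name) {
    if (name === "0") {
      return value;
    }
    const attribute = attributes[name];
    // Assumption: a missing attribute becomes an empty string
    return attribute === undefined ? "" : String(attribute);
  });
}

const spanTemplate = "<span id='{ $id }' class='{ $class }' style='{ $style }'>{ $0 }</span>";
console.log(fillTemplate(spanTemplate, "World", { id: "greeting", class: "hl" }));
// <span id='greeting' class='hl' style=''>World</span>
```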

Some users may prefer to use the long notation for decorators. The long notation can be easier to understand for translators who are used to working with HTML.

tr("<link>Click here</link> to view the docs", {link: {href: "/docs"}})

// <a href='/docs'>Click here</a> to view the docs

Decorations can also be nested.

tr("<link><bold>Click here</bold> to view <italic>this section</italic> of the document</link>", {  
  link: {href: "/docs"}
})

// <a href="/docs">
//   <strong>Click here</strong> to view <em>this section</em> of the document
// </a>
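
Nesting in the long notation can be handled by decorating the inner content first and then wrapping the result. The following is a simplified sketch under several assumptions: decorateLong and the DECORATORS map are hypothetical, the anchor template for link is inferred from the example output above, and a decorator nested inside another decorator of the same name is not supported.

```javascript
// Hypothetical templates, inferred from the example output above
const DECORATORS = {
  link: "<a href='{$href}'>{$0}</a>",
  bold: "<strong>{$0}</strong>",
  italic: "<em>{$0}</em>",
};

// Recursively expands long-notation decoration tokens such as <bold>...</bold>
function decorateLong(label, tokens) {
  return label.replace(/<(\w+)>([\s\S]*?)<\/\1>/g, function (match, name, inner) {
    if (!Object.prototype.hasOwnProperty.call(DECORATORS, name)) {
      return match; // not a known decorator: leave untouched
    }
    const content = decorateLong(inner, tokens); // decorate nested tokens first
    const attributes = tokens[name] || {};
    return DECORATORS[name].replace(/\{\$(\w+)\}/g, function (placeholder, key) {
      if (key === "0") {
        return content;
      }
      return attributes[key] === undefined ? "" : String(attributes[key]);
    });
  });
}

console.log(decorateLong("<link>Click here</link> to view the docs", { link: { href: "/docs" } }));
// <a href='/docs'>Click here</a> to view the docs
```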

Data Tokens

Many dynamic projects will want to abstract translations so that variables can be interpolated into the translation. This can easily be done with data tokens.

In many cases your variables will be strings that get substituted directly into the translated sentence.

tr("Hello {user}", {user: "Michael"})

// Hello Michael

Translations can be nested inside your data tokens.

tr("Welcome to {city}", {city: tr("Los Angeles")})

// Welcome to Los Angeles

But we need to make sure not to take translations out of context.

tr("Please visit our {registration} to join our site.", {registration: link_to(tr("registration page"), "")})

// Please visit our <link>registration page</link> to join our site

The problem with the above example is that the "registration page" link text is translated on its own, separately from the sentence it belongs to, even though its correct translation depends on that sentence. You must keep the two parts together to make sure the translations are accurate. Decoration tokens solve this problem by keeping the link text inside the label.

You can also get the substitution value from an object by passing an array: the first element is the object, and the second is the name of the attribute or method that returns the value.

tr("Dear {user}", {user: [currentUser, "name"]})  
// Dear Michael

You can use a hash for the token value as well.

tr("Hello {user}", {user: {object: currentUser, attribute: "name"}})  
// Hello Michael
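
The three forms of a data token value (plain value, array and hash) can be resolved by a single function before substitution. The sketch below illustrates this under stated assumptions: resolveToken, readAttribute and substitute are hypothetical names, and a real SDK would also translate the label and handle context rules.

```javascript
// Reads an attribute, calling it if it is a method (hypothetical helper)
function readAttribute(object, attribute) {
  const member = object[attribute];
  return String(typeof member === "function" ? member.call(object) : member);
}

// Resolves the three value forms shown above to a display string
function resolveToken(value) {
  if (Array.isArray(value)) {
    return readAttribute(value[0], value[1]); // [object, "attribute"]
  }
  if (value !== null && typeof value === "object" && "object" in value && "attribute" in value) {
    return readAttribute(value.object, value.attribute); // {object: ..., attribute: ...}
  }
  return String(value); // plain value
}

// Replaces every {name} in the label; unknown tokens are left as-is
function substitute(label, tokens) {
  return label.replace(/\{(\w+)\}/g, function (match, name) {
    return Object.prototype.hasOwnProperty.call(tokens, name) ? resolveToken(tokens[name]) : match;
  });
}

const sampleUser = { name: "Michael" };
console.log(substitute("Dear {user}", { user: [sampleUser, "name"] }));
// Dear Michael
```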

Pluralization

TML includes a powerful engine for handling pluralization.
Data tokens work in conjunction with context rules and allow you to provide the appropriate substitution values.

tr("You have {count || one: message, other: messages}", {count: 1})  
// You have 1 message
// You have 2 messages

In this case, if the count value meets the criteria for the rule "one", the word assigned to that rule is displayed. In all other cases the "other" value is displayed.

Since the sequence of parameters is mapped to the sequence of rules, you can omit naming the parameters.

tr("You have {count || message, messages}", {count: 1})  
// You have 1 message
// You have 2 messages

Double pipe "||" means that the value would be displayed, followed by the word that depends on the value. A single pipe "|" will not display the value.

tr("You have {count | a message, messages}", {count: 1})  
// You have a message
// You have messages
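
The named form, the positional form and the two pipe styles can all be evaluated by the same small routine. The sketch below is an illustration for English only, not the TML rules engine: pluralize and parseForms are hypothetical names, and the rule order "one", "other" and the test count === 1 are assumptions.

```javascript
const RULE_ORDER = ["one", "other"]; // assumed rule order for English

// Turns "one: message, other: messages" or "message, messages" into a map
function parseForms(params) {
  const forms = {};
  params.split(",").forEach(function (part, index) {
    const text = part.trim();
    const colon = text.indexOf(":");
    if (colon === -1) {
      forms[RULE_ORDER[index]] = text; // positional form
    } else {
      forms[text.slice(0, colon).trim()] = text.slice(colon + 1).trim(); // named form
    }
  });
  return forms;
}

function pluralize(label, tokens) {
  return label.replace(/\{(\w+)\s*(\|\|?)\s*([^}]*)\}/g, function (match, name, pipe, params) {
    const count = tokens[name];
    const word = parseForms(params)[count === 1 ? "one" : "other"];
    // "||" shows the value followed by the word, "|" shows the word only
    return pipe === "||" ? count + " " + word : word;
  });
}

console.log(pluralize("You have {count || message, messages}", { count: 2 }));
// You have 2 messages
```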

Some languages, like English, come with default pluralizers, which do not require you to provide the plural form. It will be generated for you automatically.

tr("You have {count || message}", {count: 1})  
// You have 1 message
// You have 2 messages
// You have 10 messages

Genders

The same exact concept applies to other token types and context rules.

tr("{user} updated {user | his, her} profile.", {user: currentUser})  
// Michael updated his profile
// Mary updated her profile

Single pipe "|" means to not display the actual token value, but display the value that follows based on the context rules.

tr("{user | male: He, female: She} likes this movie.", {user: currentUser})  
// He likes this movie.

Similar to the previous examples, you don't have to provide the named parameter values.

tr("{user | He, She} likes this movie.", {user: currentUser})  
// He likes this movie.

Even when a phrase in the base language has no gender-specific dependency, it is good practice to wrap it in an implied token, because the translation may need one.

tr("{user | Born on}: ", {user: currentUser})  
// Born on:

As a general rule, if any of the words in your translation keys depend on a user, use implied tokens. They won't affect default translations, yet they give translators an option to make the translation accurate.
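
Gender rules and implied tokens can be sketched in the same way as pluralization. The code below is a simplified illustration, not the SDK's rules engine: applyGender and genderForm are hypothetical names, the user objects are assumed to carry name and gender properties, positional forms are assumed to map to "male" and "female" in that order, and joining the forms with "/" for any other gender is also an assumption.

```javascript
const GENDER_ORDER = ["male", "female"]; // assumed positional order

// Picks the form for a gender from "his, her" or "male: He, female: She"
function genderForm(params, gender) {
  const parts = params.split(",").map(function (part) { return part.trim(); });
  // Implied token with a single form, e.g. {user | Born on}
  if (parts.length === 1 && parts[0].indexOf(":") === -1) {
    return parts[0];
  }
  const forms = {};
  parts.forEach(function (part, index) {
    const colon = part.indexOf(":");
    if (colon === -1) {
      forms[GENDER_ORDER[index]] = part;
    } else {
      forms[part.slice(0, colon).trim()] = part.slice(colon + 1).trim();
    }
  });
  if (forms[gender] !== undefined) {
    return forms[gender];
  }
  // Assumption: unknown gender falls back to all forms, e.g. "his/her"
  return Object.keys(forms).map(function (key) { return forms[key]; }).join("/");
}

function applyGender(label, tokens) {
  return label
    .replace(/\{(\w+)\s*\|\s*([^}]*)\}/g, function (match, name, params) {
      return genderForm(params, tokens[name].gender);
    })
    .replace(/\{(\w+)\}/g, function (match, name) {
      return tokens[name].name;
    });
}

const mary = { name: "Mary", gender: "female" };
console.log(applyGender("{user} updated {user | his, her} profile.", { user: mary }));
// Mary updated her profile.
```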

Nested Tokens

Decoration tokens can be nested and they may contain data tokens as well.

tr("You have <link>{count||message}</link>", {  
  count: 10,
  link: {href: "/messages"}
})
// You have <a href="/messages">10 messages</a>

Gender

As with the numeric rules, some languages have dependencies on gender.

tr("{user} uploaded {user | his, her} photo", {user: {name: "Michael", gender: "male"}})  
// Michael uploaded his photo

tr("{user} uploaded {user | his, her} photo", {user: {name: "Anna", gender: "female"}})  
// Anna uploaded her photo

Dates

Dates can also be used for contextual evaluation. Consider the following example:

tr("{user} {date| past: celebrated, present: celebrates, future: will celebrate} {user| his, her} birthday {date | on #date#, today, on #date#}", {  
  user: currentUser,
  date: date
});

// date is today
// Michael celebrates his birthday today

// date is in the past
// Michael celebrated his birthday on 2/7/2016

// date is in the future
// Michael will celebrate his birthday on 2/9/2016
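
The date rules above depend on classifying a date as past, present or future. The sketch below shows one way to do that by comparing calendar days. dateContext is a hypothetical function, and treating the whole current day as "present" is an assumption about how the rule is evaluated.

```javascript
// Hypothetical sketch: classifies a date relative to a reference day.
// Assumption: any time on the reference day counts as "present".
function dateContext(date, reference) {
  const startOfDay = function (d) {
    return new Date(d.getFullYear(), d.getMonth(), d.getDate()).getTime();
  };
  if (startOfDay(date) < startOfDay(reference)) {
    return "past";
  }
  if (startOfDay(date) > startOfDay(reference)) {
    return "future";
  }
  return "present";
}

const referenceDay = new Date(2016, 1, 8); // February 8, 2016 (months are zero-based)
console.log(dateContext(new Date(2016, 1, 7), referenceDay)); // past
console.log(dateContext(new Date(2016, 1, 8, 15, 30), referenceDay)); // present
console.log(dateContext(new Date(2016, 1, 9), referenceDay)); // future
```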