Dealing with HTML tags and inline items in Passolo

I’ve been using Passolo to deal with UI strings for nearly four years now. Our relationship hasn’t always been the happiest one, but so far, Passolo has been a good enough tool to deal with the .properties style UI strings that I work with:

service.common.username=Username
service.common.phone=Phone number

All it takes to get underway in Passolo is a simple parser that recognizes the part before the = sign as the ID and the part after it as the string. Easy peasy.
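To make the idea concrete, here's a minimal Python sketch of that kind of parser. It is not Passolo's parser, just an illustration; the function name parse_properties_line is made up, and it deliberately ignores the finer points of the .properties format, such as escapes and continuation lines.

```python
def parse_properties_line(line):
    """Split 'key=value' into (string ID, string text).

    Returns None for blank lines and comment lines.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith(("#", "!")):
        return None
    # Split on the first '=' only, so any '=' inside the text is kept.
    string_id, _, text = stripped.partition("=")
    return string_id.strip(), text.strip()


sample = """service.common.username=Username
service.common.phone=Phone number"""

for line in sample.splitlines():
    print(parse_properties_line(line))
# ('service.common.username', 'Username')
# ('service.common.phone', 'Phone number')
```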

Dealing with parameters

Sometimes, we need to present dynamic content like dates, usernames and document IDs on our UI. In these cases, our developers drop in the necessary values as parameters using curly brackets:

service.common.loggedInAs=You're logged in as {0}
service.common.viewingDocument=Viewing document {0} from {1}

Each parameter takes up just three characters, so you’d think there’s no need to have any special treatment for these, right? I have noticed that sometimes – not very often, though – some of our translators have accidentally deleted either the opening or the closing bracket. When we bake strings like those into the software, stuff tends to break and cause me grey hair. And that’s not nice.

The Inline Patterns tool

To combat broken parameters and grey hair, SDL has added a tool to Passolo that lets you define inline patterns that may appear in your UI strings. You can find it under the Project tab:

[Screenshot: the Inline Patterns menu item on the Project tab]

When you click Inline Patterns, Passolo pops up the Inline Patterns window. This is where all the magic happens.

[Screenshot: the Inline Patterns window]

Adding a new pattern

Adding new patterns is simple. If I wanted to create a simple pattern that marks the {0} parameter as an inline tag, I could do this:

  1. Click Add.
  2. Enter a name for the pattern in the Pattern Name field.
  3. Enter the pattern that I want to mark as an inline tag in the Text to Search field. In this case, I would enter {0} in the field.
  4. Tick the Convert to Inline Tag checkbox.

After this, the pattern {0} is highlighted in the translation view every time it appears. Because it is marked as an inline tag, it cannot be edited at all, so translators cannot accidentally remove a bracket or change the number inside the brackets. They can, however, delete the whole tag. To add extra protection to your tags, you can set a few other options when you define patterns:

  1. Under Checking, select the Must exist in translation if also exists in source radio button.
  2. Check the Number of matches must be the same checkbox.

After this, Passolo makes sure that patterns used in the source text are also used in the translation, and that each pattern appears the same number of times in both.
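If you want to run the same kind of sanity check outside Passolo, for instance on an exported file, the logic can be sketched in a few lines of Python. This is my own approximation, not how Passolo implements its check: the function name placeholder_mismatches and the sample translation are made up, and the sketch compares counts per distinct placeholder.

```python
import re
from collections import Counter

# A single-digit parameter such as {0}; the braces are escaped to be safe.
PLACEHOLDER = re.compile(r"\{[0-9]\}")


def placeholder_mismatches(source, translation):
    """Return {placeholder: (count in source, count in translation)}
    for every placeholder whose counts differ."""
    src = Counter(PLACEHOLDER.findall(source))
    tgt = Counter(PLACEHOLDER.findall(translation))
    return {
        p: (src[p], tgt[p])
        for p in sorted(src.keys() | tgt.keys())
        if src[p] != tgt[p]
    }


source = "Viewing document {0} from {1}"
print(placeholder_mismatches(source, "Dokument {0} von {1}"))  # {}
print(placeholder_mismatches(source, "Dokument {0} von {1"))   # {'{1}': (1, 0)}
```

The second call simulates a translator deleting a closing bracket: the damaged {1 no longer counts as a parameter, so the check reports it as missing.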

Making smarter rules with regular expressions

Adding a new inline pattern rule is simple and relatively quick, but that doesn’t mean that you should create separate rules for {0}, {1}, {2}, and so forth. This is where regular expressions come in handy. Regular expressions let you create powerful search patterns that match multiple similar items. Let’s create an inline pattern rule using regular expressions and see how they work in practice:

  1. Click Add.
  2. Enter a name for the pattern in the Pattern Name field.
  3. Tick the Use Regular Expression checkbox.
  4. Enter {[0-9]} in the Text to Search field.
  5. Tick the Convert to Inline Tag checkbox.

When we use regular expressions, we have a number of characters and patterns at our disposal that serve a special purpose. To clear things up, let’s look at how our search pattern works. First, there’s the { character. This tells Passolo that the pattern we’re looking for starts with this specific character. Next, we have the [0-9] part. Square brackets have a special purpose in regular expressions; inside them you can define a set or range of characters, any one of which can appear in this position. Simply put, it means that any one digit from 0 to 9 can appear at this spot. Finally, there’s the } character that ends our regular expression.

In our new rule, the part between { and } can be any single digit from 0 to 9. This means that the single rule matches the {0} that we started with, but it also matches all the other single-digit parameters. One rule to, erm, rule them all!
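You can try the pattern out in any regex tool before adding it to Passolo. Here's a small Python sketch; Python's re module is only a stand-in, and I'm assuming Passolo's regex engine behaves the same way for a pattern this simple. I've escaped the curly brackets, since in many engines they are also used for repetition counts. The second pattern is a hypothetical variant, not something from my Passolo setup, that would also catch two-digit parameters such as {12}.

```python
import re

single_digit = re.compile(r"\{[0-9]\}")  # same as {[0-9]}, with the braces escaped
any_number = re.compile(r"\{[0-9]+\}")   # hypothetical variant: one or more digits

text = "Viewing document {0} from {1}, page {12}"

print(single_digit.findall(text))  # ['{0}', '{1}']
print(any_number.findall(text))    # ['{0}', '{1}', '{12}']
```

Note that the single-digit rule leaves {12} alone, so if your strings ever use ten or more parameters, the rule needs widening.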

What about HTML?

Recently, we added a fancier-looking design to our UI, and because of this the developers have started to include HTML tags inside the UI strings. This means that nowadays, I often come across strings that look like this:

service.common.loggedInAs=<div>You're logged in as <b>{0}</b></div>
service.common.benefitsList=<ul><li>First item</li><li>Second item</li><li>Third item</li></ul>

Now this is where things start to get really messy. If left untouched, the HTML tags make it difficult to decipher what’s actually being said in the string, and since they take up a lot of space, editing around them gets difficult and time-consuming. Not to mention the fact that it’s fairly easy to accidentally delete a bracket and mess up the HTML syntax. This is where inline patterns and regular expressions come in really handy.

Paragraph tags

The paragraph tag comes in two varieties, the <p> tag that starts the paragraph and the </p> that ends it. These are fairly simple to deal with using a single rule. We start the pattern with a <, since both tags start with it. Next, we add a forward-slash, /, to our pattern. We know that only one of the tags uses the forward-slash character, so we’ll need to make it an optional character. We’ll do this by adding a question mark character after it.

The question mark is a special character in regular expressions. It means that the character that precedes it may appear 0 or 1 times in the pattern. Here, it just means that the < character may or may not be followed by a forward slash. The rest of the pattern is the same in both cases, so we can wrap up our regular expression with p>. The end result is </?p>, which matches both the opening and the closing paragraph tags.
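Here's a quick way to see the optional slash at work, sketched in Python (again only as a stand-in for Passolo's own regex engine). The sample strings are made up.

```python
import re

paragraph_tag = re.compile(r"</?p>")

text = "<p>First paragraph</p><p>Second paragraph</p>"

# Both the opening and the closing tags are found by the one pattern.
print(paragraph_tag.findall(text))  # ['<p>', '</p>', '<p>', '</p>']

# Tags that merely start with the letter p are left alone.
print(paragraph_tag.findall("<pre>code</pre>"))  # []
```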

Formatting tags

Formatting tags are easily dealt with using regular expressions similar to the one we used for the paragraph tags. </?b> deals with the b tag, </?i> takes care of italics, and </?u> covers underlined bits.
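If you'd rather not maintain three nearly identical rules, the square brackets from the parameter rule can fold them into one. This is an optional variant of my own rather than something the three separate rules require; the Python below just shows that a single pattern, </?[biu]>, finds all six tags in a made-up string.

```python
import re

# One rule for bold, italic and underline, opening or closing.
formatting_tag = re.compile(r"</?[biu]>")

text = "Click <b>Save</b> to keep <i>all</i> <u>changes</u>"

print(formatting_tag.findall(text))
# ['<b>', '</b>', '<i>', '</i>', '<u>', '</u>']
```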

Lists

List items use <li> and </li> tags, and can be dealt with using a simple </?li>, following the previous examples. The parent tags, <ul>, </ul>, <ol>, and </ol> need a bit more work, however.

The regular expression starts off like the previous ones, with </?. The next letter can be either an o or a u. To match both, we’ll use the square brackets that we already used with the parameters, this time listing the acceptable letters one after another: [ou]. Don’t separate the letters with a comma; inside square brackets a comma is just another character to match, so [o,u] would also accept a comma in that position. We’ll finish off the regular expression with the part that is the same for both list types, l>. The full expression is </?[ou]l>. Instead of creating two, or even four, separate rules, we can manage with just one. Neato!
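Inside square brackets every character counts, commas included, so it's worth comparing [ou] with [o,u] side by side. The Python sketch below does that (Python's re module is a stand-in here; I'm assuming Passolo's engine treats character sets the same way, as mainstream regex flavours do).

```python
import re

list_tag = re.compile(r"</?[ou]l>")     # o or u
with_comma = re.compile(r"</?[o,u]l>")  # o, u, or a literal comma

text = "<ul><li>First item</li></ul><ol><li>Second item</li></ol>"

print(list_tag.findall(text))      # ['<ul>', '</ul>', '<ol>', '</ol>']

# The comma version also accepts a tag that is clearly not a list.
print(with_comma.findall("<,l>"))  # ['<,l>']
print(list_tag.findall("<,l>"))    # []
```

Both versions find the real list tags, so the extra comma is unlikely to bite in practice, but the comma-free set says exactly what we mean.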

Line breaks

Another type of tag that I often see is the line break tag. It’s a difficult one to deal with, since at least our devs have several ways of typing it, for instance <br>, <br/>, and <br />.

So, we’ll start the regular expression with the part that’s common to all, <br. We know that this may or may not be followed by a whitespace character, so we’ll first add the whitespace using a special notation, \s, and make it optional by following it with a question mark. There may or may not be a forward slash next, so let’s add that using the same approach, /?. After this, we can wrap up the regular expression with a >. Now, we have <br\s?/?> and we can match all the different styles of line break tags, too.
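To see exactly which spellings the rule accepts, here's a small Python check (a stand-in for Passolo's engine, as before). The last two test tags are my own additions to show the limits of the pattern: it allows at most one whitespace character and is case-sensitive.

```python
import re

line_break = re.compile(r"<br\s?/?>")

for tag in ["<br>", "<br/>", "<br />", "<br  />", "<BR>"]:
    print(repr(tag), bool(line_break.fullmatch(tag)))
# '<br>' True
# '<br/>' True
# '<br />' True
# '<br  />' False
# '<BR>' False
```

If your developers are even more creative than mine, swapping \s? for \s* would allow any amount of whitespace before the slash.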

TL;DR

For those with limited attention spans, here’s a list of regular expressions that you can use to convert HTML tags and parameters into nice, manageable inline patterns in Passolo:

PATTERN          WHAT IT MATCHES

</?p>            <p> and </p>
</?li>           <li> and </li>
</?b>            <b> and </b>
{[0-9]}          {0}, {1}, {2} ... {9}
<br\s?/?>        <br>, <br/>, and <br />
</?[ou]l>        <ol>, </ol>, <ul> and </ul>
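To get a feel for what these rules do to a string, here's a rough Python sketch that combines the patterns and splits a UI string into protected tags and translatable text. It only imitates the effect of inline patterns; it is not how Passolo works internally. The function name split_inline is made up, the braces are escaped, and the list rule is written with the comma-free set [ou].

```python
import re

# All the rules from the table, joined into one capturing group.
INLINE = re.compile(r"(</?p>|</?li>|</?b>|\{[0-9]\}|<br\s?/?>|</?[ou]l>)")


def split_inline(text):
    """Split a UI string into (kind, value) pieces, kind being 'tag' or 'text'."""
    pieces = []
    # With one capturing group, re.split alternates text and matches,
    # so the odd-numbered parts are always the inline tags.
    for i, part in enumerate(INLINE.split(text)):
        if part:
            pieces.append(("tag" if i % 2 else "text", part))
    return pieces


for piece in split_inline("You're logged in as <b>{0}</b>"):
    print(piece)
# ('text', "You're logged in as ")
# ('tag', '<b>')
# ('tag', '{0}')
# ('tag', '</b>')
```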
