Localizing dynamic content

Localizing the text content of an application is a common problem when the user base is multilingual. Typically all static content, meaning the texts built into the application, is provided by key-value property files and retrieved from there as needed.

User-generated dynamic content poses a slightly more complicated challenge. Localized values are usually saved to a database as key-value pairs. How those values are retrieved depends on multiple factors, for example the structure and amount of data.

The following solution was developed for a use case where user-generated content is defined by a hierarchical object structure that potentially contains large numbers of different entities and different data. Additionally, the data structure is likely to expand in the future with both new entity types and new localizable text fields in existing entities.

Original content and translator

In some systems the number of supported languages is not expected to change, and all content might be translated to all available languages. In those situations it makes sense to save the original content in the same data structure as the translations. However, this kind of solution can easily make the API between the content and the user interface fields cumbersome, and it might also introduce unnecessary overhead when retrieving content.

Another approach is to save the default-language content in the fields of the original data structure. The user interface can then display the required data directly from the provided data objects without calling a specific API, or even knowing which language is currently in use.

To translate the content into additional languages we then need two things. The first is a translator object, whose task is to translate content retrieved from data storage into a different language before the data is sent to the user interface. The second is a dictionary: a collection of translations from the default language to another language.

Ideally the solution should not be application specific and it should be easily expandable. This introduces some requirements for the translated data objects, as the translator needs to be able to translate the content of a data object without knowing what kind of content it is translating. The solution should also support creating, maintaining and expanding translations without large amounts of application-specific code.

Defining entities for localization solution

The following solution to the problem described above is implemented in Java. A similar pattern can be used in any object-oriented language, within the limitations of that language.

We start by defining the interface Localizable. Each entity that contains localizable data must implement Localizable so that the translator object can translate the content inside the entity.

import java.util.List;

public interface Localizable {
    String getLocalizerKey();
    List<String> getLocalizedFields();
    String getLocalizedFieldValue(String field);
    void setLocalizedFieldValue(String field, String value);
    void localize(Localizer localizer);
}

String getLocalizerKey() returns a unique String that identifies the object.

List<String> getLocalizedFields() returns the names of the fields whose content can be localized.

String getLocalizedFieldValue(String field) returns the value of the field identified by its name.

void setLocalizedFieldValue(String field, String value) sets the value of a localized field.

void localize(Localizer localizer) is the entry point for the actual translation: the entity passes itself to the given Localizer.

The following is an example of a localizable object:

public class Page implements Localizable {
    private static final String NAME = "NAME";
    private static final String DESCRIPTION = "DESCRIPTION";
    private static final List<String> LOCALIZED_FIELDS = Arrays.asList(NAME, DESCRIPTION);
....
    public String getLocalizerKey() {
        return LocalizerKeys.LOCALIZABLE_PREFIX_PAGE.getKey() + getId() + "_";
    }
    public List<String> getLocalizedFields() {
        return LOCALIZED_FIELDS;
    }
    public String getLocalizedFieldValue(String field) {
        if(NAME.equals(field)) {
            return getName();
        } else if(DESCRIPTION.equals(field)) {
            return getDescription();
        } else {
            return "";
        }
    }
    public void setLocalizedFieldValue(String field, String value) {
        if(NAME.equals(field)) {
            setName(value);
        } else if(DESCRIPTION.equals(field)) {
            setDescription(value);
        }
    }
....
}

Note the definition of the localizable fields at the start of the class. The actual fields and their getters and setters are omitted from the example to save space, and the implementation of the localize method is presented a bit later.

Next we define a localized value, which is basically a simple key-value pair connected to a certain Translation object. A Translation object is a dictionary that contains all the translations for one language and one object structure. The definition of Translation is not presented here, but it can be assumed to contain an identifying id, a Collection of LocalizedValue objects and a reference to the parent-level structure that is to be translated.

public class LocalizedValue {
    private long translationId;
    private String localizerKey;
    private String localizedValue;
}
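
Since the definition of Translation is left out above, the following is only a minimal sketch of what it might look like under the stated assumptions. The language and structureId fields, the constructors, the getters and the addValue helper are all hypothetical; LocalizedValue is repeated with a constructor so that the block compiles on its own.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

// Same fields as LocalizedValue above; constructor and getters added for this sketch.
class LocalizedValue {
    private final long translationId;
    private final String localizerKey;
    private final String localizedValue;

    LocalizedValue(long translationId, String localizerKey, String localizedValue) {
        this.translationId = translationId;
        this.localizerKey = localizerKey;
        this.localizedValue = localizedValue;
    }

    long getTranslationId() { return translationId; }
    String getLocalizerKey() { return localizerKey; }
    String getLocalizedValue() { return localizedValue; }
}

// Hypothetical dictionary: all translations for one language and one object structure.
class Translation {
    private final long id;
    private final String language;   // assumption: a language code such as "fi"
    private final long structureId;  // assumption: id of the parent-level structure
    private final Collection<LocalizedValue> values = new ArrayList<>();

    Translation(long id, String language, long structureId) {
        this.id = id;
        this.language = language;
        this.structureId = structureId;
    }

    // Creates a LocalizedValue that is tied to this translation by its id.
    void addValue(String localizerKey, String value) {
        values.add(new LocalizedValue(id, localizerKey, value));
    }

    long getId() { return id; }
    String getLanguage() { return language; }
    long getStructureId() { return structureId; }

    // Read-only view so callers cannot bypass addValue.
    Collection<LocalizedValue> getValues() {
        return Collections.unmodifiableCollection(values);
    }
}
```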

Finally comes the actual translator object, Localizer.

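import java.util.Map;

public class Localizer {
    // Translations keyed by localizer key + field name.
    private final Map<String, String> valueMap;

    // Constructor added so that the final field is initialized.
    public Localizer(Map<String, String> valueMap) {
        this.valueMap = valueMap;
    }

    public void localize(Localizable localizable) {
        for (String field : localizable.getLocalizedFields()) {
            // The lookup key combines the object's key and the field name.
            String localizedValue = valueMap.get(localizable.getLocalizerKey() + field);
            // Keep the default-language value when no translation exists.
            if (localizedValue != null) {
                localizable.setLocalizedFieldValue(field, localizedValue);
            }
        }
    }
}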

Note that Localizer owns a map of key-value pairs. When an entity asks Localizer to localize its content, Localizer first requests the list of all localizable fields in the entity. It then searches the key-value pairs for the key constructed by combining the object's key and the field name. Finally it sets the value of that field in the entity to the localized value; if no translation is found, the default-language value is left in place. All this can be done in a simple loop, because the Localizable interface allows localizing all necessary fields without knowing the actual type of the localized object.

How does the actual localizing process work?

How does all this add up in an actual implementation? Let's say that we want to display a Page object, which in turn contains some localizable child entities, and we want to translate the content before displaying it to the user. First we retrieve the object from data storage. When we notice that the content needs to be localized, we create a Localizer object and populate it with translations. We'll take a closer look at how the translations could be retrieved efficiently a bit later.

The next step is to do the actual localization. To do that we call the localize method of the Page object, passing the populated Localizer as a parameter.
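
Populating the Localizer boils down to turning a collection of LocalizedValue objects into a key-value map. The following is a minimal sketch, assuming that LocalizedValue exposes getters (not shown in the original) and using a hypothetical helper class LocalizerValues.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Same fields as LocalizedValue above; constructor and getters added for this sketch.
class LocalizedValue {
    private final long translationId;
    private final String localizerKey;
    private final String localizedValue;

    LocalizedValue(long translationId, String localizerKey, String localizedValue) {
        this.translationId = translationId;
        this.localizerKey = localizerKey;
        this.localizedValue = localizedValue;
    }

    long getTranslationId() { return translationId; }
    String getLocalizerKey() { return localizerKey; }
    String getLocalizedValue() { return localizedValue; }
}

// Hypothetical helper, not part of the original design.
class LocalizerValues {
    // Builds the lookup map: (object key + field name) -> translated text.
    static Map<String, String> toValueMap(Collection<LocalizedValue> values) {
        Map<String, String> valueMap = new HashMap<>();
        for (LocalizedValue value : values) {
            valueMap.put(value.getLocalizerKey(), value.getLocalizedValue());
        }
        return valueMap;
    }
}
```

The resulting map corresponds to the valueMap field of Localizer. How it is handed over, for example as a constructor argument, is left open in the original.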

public void localize(Localizer localizer) {
    localizer.localize(this);
    for(PageContent content : pageContents) {
        localizer.localize(content);
    }
}

Note that the Page entity knows that it holds a Collection of PageContent entities, so when it is localized, it tells Localizer to translate those child objects too. That way the whole object hierarchy gets localized, once again without Localizer needing to know what it is actually translating.

Whenever content is displayed in something other than the default language, every localizable object is passed through the Localizer object before it is forwarded to the user interface. Needless to say, this data is now read only, since the original values have been overwritten with translations. For editing we need to implement a separate editing interface.

As you can see, this solution is easy to expand as the localizable data structure expands. Making a new field of an existing object localizable only requires adding it to the list of localizable fields and extending getLocalizedFieldValue and setLocalizedFieldValue to cover it. Entirely new localizable objects only need to implement the Localizable interface.
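
To see the pieces work together, here is a reduced, self-contained sketch of the whole flow. The key formats "PAGE_<id>_" and "CONTENT_<id>_", the TEXT field of PageContent, the Localizer constructor and the LocalizeDemo class are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

interface Localizable {
    String getLocalizerKey();
    List<String> getLocalizedFields();
    String getLocalizedFieldValue(String field);
    void setLocalizedFieldValue(String field, String value);
    void localize(Localizer localizer);
}

class Localizer {
    private final Map<String, String> valueMap;

    // Assumption: the translations are passed in through the constructor.
    Localizer(Map<String, String> valueMap) { this.valueMap = valueMap; }

    void localize(Localizable localizable) {
        for (String field : localizable.getLocalizedFields()) {
            String localizedValue = valueMap.get(localizable.getLocalizerKey() + field);
            if (localizedValue != null) {
                localizable.setLocalizedFieldValue(field, localizedValue);
            }
        }
    }
}

// Hypothetical child entity with one localizable field, TEXT.
class PageContent implements Localizable {
    private final long id;
    private String text;

    PageContent(long id, String text) { this.id = id; this.text = text; }

    String getText() { return text; }
    public String getLocalizerKey() { return "CONTENT_" + id + "_"; }
    public List<String> getLocalizedFields() { return Collections.singletonList("TEXT"); }
    public String getLocalizedFieldValue(String field) { return "TEXT".equals(field) ? text : ""; }
    public void setLocalizedFieldValue(String field, String value) {
        if ("TEXT".equals(field)) { text = value; }
    }
    public void localize(Localizer localizer) { localizer.localize(this); }
}

// Reduced Page with one localizable field, NAME, and a list of children.
class Page implements Localizable {
    private final long id;
    private String name;
    private final List<PageContent> pageContents = new ArrayList<>();

    Page(long id, String name) { this.id = id; this.name = name; }

    void addContent(PageContent content) { pageContents.add(content); }
    String getName() { return name; }
    List<PageContent> getPageContents() { return pageContents; }
    public String getLocalizerKey() { return "PAGE_" + id + "_"; }
    public List<String> getLocalizedFields() { return Collections.singletonList("NAME"); }
    public String getLocalizedFieldValue(String field) { return "NAME".equals(field) ? name : ""; }
    public void setLocalizedFieldValue(String field, String value) {
        if ("NAME".equals(field)) { name = value; }
    }
    public void localize(Localizer localizer) {
        localizer.localize(this);
        for (PageContent content : pageContents) {
            localizer.localize(content);
        }
    }
}

class LocalizeDemo {
    public static void main(String[] args) {
        Map<String, String> finnish = new HashMap<>();
        finnish.put("PAGE_7_NAME", "Etusivu");
        finnish.put("CONTENT_1_TEXT", "Tervetuloa");
        // CONTENT_2_TEXT has no translation and keeps its default-language value.

        Page page = new Page(7, "Front page");
        page.addContent(new PageContent(1, "Welcome"));
        page.addContent(new PageContent(2, "Contact us"));

        page.localize(new Localizer(finnish));
        System.out.println(page.getName());
        for (PageContent content : page.getPageContents()) {
            System.out.println(content.getText());
        }
    }
}
```

Because Page calls localizer.localize(content) directly, only the child's own fields are translated. A child that has children of its own would instead need to be localized through its own localize method.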

Performance and editing

A couple of open questions still remain, though. The first is populating Localizer. How do we retrieve the localized values that are needed without causing too much overhead?

The answer depends on the size and structure of the localized data. If the number of fields is not too large, we might get away with retrieving all key-value pairs for one translation and just using them. For larger amounts of data we might create an application-specific extension of LocalizedValue, adding tags and references that can be used to retrieve a smaller subset of values.

In the example described above, with Pages that contain multiple PageContent objects, we might add a field "pageId" to all LocalizedValue instances that belong either to a Page or to a PageContent object in it. This makes it possible to retrieve all the LocalizedValue instances that belong to a Page or its children by pageId, without also retrieving LocalizedValues that belong to other pages. The same solution can be expanded as needed to reflect the actual data structure and the way the data is retrieved.

As for creating, maintaining and editing translations and LocalizedValues, I'm not going to present a complete solution here. Instead I'm providing a couple of hints and leaving the actual implementation as a small exercise for anyone who is interested.
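
The pageId idea can be sketched as follows. PageLocalizedValue and LocalizedValueStore are hypothetical names, and the in-memory list only stands in for the data storage; a real implementation would presumably run a database query filtered by the same two ids.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical application-specific extension of LocalizedValue with a pageId reference.
class PageLocalizedValue {
    final long translationId;
    final long pageId;
    final String localizerKey;
    final String localizedValue;

    PageLocalizedValue(long translationId, long pageId, String localizerKey, String localizedValue) {
        this.translationId = translationId;
        this.pageId = pageId;
        this.localizerKey = localizerKey;
        this.localizedValue = localizedValue;
    }
}

// In-memory stand-in for the data storage.
class LocalizedValueStore {
    private final List<PageLocalizedValue> rows = new ArrayList<>();

    void save(PageLocalizedValue value) {
        rows.add(value);
    }

    // Returns only the values of one translation that belong to one page or its children.
    Map<String, String> findForPage(long translationId, long pageId) {
        Map<String, String> result = new HashMap<>();
        for (PageLocalizedValue row : rows) {
            if (row.translationId == translationId && row.pageId == pageId) {
                result.put(row.localizerKey, row.localizedValue);
            }
        }
        return result;
    }
}
```

The returned map has the same shape as the key-value pairs Localizer works on, so it can be used to populate a Localizer for just that page.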

- The simplest user interface for creating a new translation and maintaining all the LocalizedValues in it could be just a text area with key-value pairs, where the keys are provided automatically.

- For a more sophisticated solution, the user interface for maintaining a single LocalizedValue can and should be implemented with the same code regardless of which object's value or values are being edited.

- The user interface for editing all the LocalizedValues of a single translation can be constructed with minimal application-specific code by using traditional localization files and field names for the LocalizedValue captions. Application-specific code is only needed for organizing the editable LocalizedValues so that they mirror the actual data structure, and for providing captions for the object-class-based grouping.

- If our LocalizedValue class contains references to different objects in our object hierarchy (see the pageId example above), those references should be set to correct values when new LocalizedValue entities are created. One way to do this is to add a method void setEntityRelations(LocalizerEntityRelationSetter relationSetter) to the Localizable interface. The LocalizerEntityRelationSetter implementation has a method for each entity type that sets the correct relations. Once again the application-specific code can be isolated in a single implementing class.
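
The last hint could look roughly like the sketch below. Only setEntityRelations and LocalizerEntityRelationSetter come from the text above; the setRelations overloads, PageLocalizedValue, PageRelationSetter, LocalizedValueFactory and the key formats are assumptions, and Localizable is reduced to the two methods needed here.

```java
// Hypothetical LocalizedValue extension carrying a pageId reference.
class PageLocalizedValue {
    final String localizerKey;
    long pageId;

    PageLocalizedValue(String localizerKey) { this.localizerKey = localizerKey; }
}

// One method per entity type; the method name and signatures are assumptions.
interface LocalizerEntityRelationSetter {
    void setRelations(Page page);
    void setRelations(PageContent content);
}

// Only the part of Localizable that is relevant for this sketch.
interface Localizable {
    String getLocalizerKey();
    void setEntityRelations(LocalizerEntityRelationSetter relationSetter);
}

class Page implements Localizable {
    final long id;

    Page(long id) { this.id = id; }

    public String getLocalizerKey() { return "PAGE_" + id + "_"; }
    public void setEntityRelations(LocalizerEntityRelationSetter relationSetter) {
        relationSetter.setRelations(this); // resolves to the Page overload
    }
}

class PageContent implements Localizable {
    final long id;
    final long pageId; // id of the Page this content belongs to

    PageContent(long id, long pageId) { this.id = id; this.pageId = pageId; }

    public String getLocalizerKey() { return "CONTENT_" + id + "_"; }
    public void setEntityRelations(LocalizerEntityRelationSetter relationSetter) {
        relationSetter.setRelations(this); // resolves to the PageContent overload
    }
}

// The single class that holds the application-specific relation logic.
class PageRelationSetter implements LocalizerEntityRelationSetter {
    private final PageLocalizedValue target;

    PageRelationSetter(PageLocalizedValue target) { this.target = target; }

    public void setRelations(Page page) { target.pageId = page.id; }
    public void setRelations(PageContent content) { target.pageId = content.pageId; }
}

class LocalizedValueFactory {
    // Creates a value for one field of any Localizable and lets the entity fill in its relations.
    static PageLocalizedValue create(Localizable entity, String field) {
        PageLocalizedValue value = new PageLocalizedValue(entity.getLocalizerKey() + field);
        entity.setEntityRelations(new PageRelationSetter(value));
        return value;
    }
}
```

The factory never inspects the entity type. Each entity picks the matching setRelations overload itself, so supporting a new entity type means adding one method to the setter rather than touching the generic code.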

The solution described above is certainly not suitable for all localization needs. However, it has proven in at least one large project to be easily expandable along with the application and its data structure. Separating translations and their management from content creation also mirrors common real-life situations, where translations are often created well after the original content. After all, we are making applications for real-life use, so imitating a real-life workflow usually yields good results.

 

