I thought I'd spend some time this week looking at how to add localization support to web applications, specifically client-side heavy web applications. The original intent for this article was to look at the subject matter in terms of PhoneGap applications, but there's no reason why this can't be applied to desktop sites as well.
The purpose of this blog entry is to introduce and discuss some of the basic concepts, as well as provide a few examples of the concepts in practice. It is not my intent to cover every detail. I hope, though, that this blog entry will give you an idea of what's involved and get you started along the process with your own work.
Before going any further, I wish to thank Paul Hastings for his advice and help. He is an expert on the subject and has offered his support in this article and many times in the past. Any mistakes I make here are my fault entirely (and knowing Paul, he will rip me a new one with corrections ;).
Some Terms
Before we begin, it helps to define a few basic terms so we can ensure we're on the same track. When it comes to localization, there are really two main things going on:
- Internationalization (often abbreviated as i18n) is the process by which you "prepare" your code to be localized. So for example, it isn't the actual display of something in French, but rather preparing your code so that it could be displayed in French, or Japanese, or any language. Internationalization will be the main focus for this blog entry. At least to me, this is the fun part.
- Localization (often abbreviated as l10n) is the actual process by which you make your code available in a language. So if step one made it possible for my application to be usable by the French and the Japanese, this step actually gets it done. This is the not so fun aspect. It involves the creating of language files and other fun translation type services. Do not assume you can just go to Google Translate and be done. This is the part that will take far longer than you expect.
- And finally, just to add one more rhyming word, the combination of internationalization and localization leads to an end result of Globalization (often abbreviated as g11n).
Identifying Internationalization Targets
So what can we internationalize? For web apps, I think this falls into three main categories:
- Static UI components: This includes things like Submit buttons and form fields. Even though I say "Static", some of these items may be displayed dynamically. For example, consider a web application that lists your friends. Next to each friend is a button that lets you delete that friend. While the button is displayed dynamically, the text is static, "Delete". This is something that could be localized into French or Japanese.
- Numbers and Dates. Quick, what day is 2/3/2012? If you said February 3rd, 2012, congrats. You are an American and should be proud. If you said March 2nd, I pity you for not living in the greatest nation on Earth. All kidding aside, numeric dates are very prone to this type of confusion, and while it's mostly non-Americans who get screwed by this, I know it's bitten me on a few web sites as well. Numbers are not necessarily that big of a deal. Some places switch out the comma and period so that 1,209.21 would be 1.209,21. However, I think most folks would recognize either form. That being said, it's a concern. Currency also falls into this as well.
- Dynamic content: Technically this is not part of the web app, instead, it represents the database content your web app is a front end for, or the remote API your web app may be making use of. This is not something that will be covered for this blog entry. However, as I'll be talking about how to detect the user's language, note that you can pass that value with your calls to the server.
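To make the date and number ambiguity concrete, here is a minimal sketch that formats the same values for a few locales. It uses the browser's built-in Intl API rather than the plugins covered later in this entry, and the helper name formatSample and the locale list are just illustrative choices.

```javascript
// Format one date and one number for several locales using the built-in Intl API.
// Note: this is NOT the plugin approach used later in this article.
const sampleDate = new Date(2012, 1, 3); // months are zero-based, so this is February 3rd, 2012
const sampleNumber = 1209.21;

// Hypothetical helper: returns the locale-specific renderings of both values.
function formatSample(locale) {
  return {
    locale: locale,
    date: new Intl.DateTimeFormat(locale).format(sampleDate),
    number: new Intl.NumberFormat(locale).format(sampleNumber)
  };
}

["en-US", "en-GB", "de-DE"].forEach(function (locale) {
  console.log(formatSample(locale));
});
```

Running it shows the day and month swapping places between the US and UK output, and the comma and period swapping roles in the German number.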
Our Demo
So given the targets above, let's talk about a simple web application we can use to help demonstrate these concepts. Our web application is a simple product status checker. It provides a form with a few basic options, a search button, and will hit the server to return matched products as well as their date of availability, the quantity in stock, and a price. You can view the demo here:
http://www.raymondcamden.com/demos/2012/feb/15/v1/ - Old ColdFusion demo removed - sorry!
To begin, simply type "e" and notice how the products are displayed. Feel free to try other search terms of course. The web application is pretty simple and most of the content is dynamic. That makes it an ideal candidate for our purposes. Let's get started!
Localizing Static Strings
The first thing we want to look at is how we could localize some of the static content. Consider the basic layout of the site:
In this screen capture, the "Product Search" is a title and would - most likely - not be something we'd care to translate. But the search and introductory text are good candidates for localization. To begin, I'm going to add a simple drop down to my top header to support selecting a language. For the purposes of this demo I'll support English, French, and Japanese.
So how do I handle the actual changing of the strings into localized versions? For that I'm going to use a jQuery plugin: jquery-i18n-properties. This plugin allows the use of "Resource Bundles", a Java-standard way of creating localization resources. These bundles are simply text files based on a key and a translation. So for example, I may define a key as "search". I can then create an English version like so:
search = Search
and a French version like so
search = Rechercher
A good engine then can handle reading and parsing these files for you. Your code can then simply say, "Hey, for my current language, give me the 'search' key." That's exactly what this plugin does. Even better, you can perform translation at a later stage. If you don't have time to completely translate values for French speakers, the code will automatically fall back to English. As a coder, it means I can globalize the code and translators later can handle the localization.
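To show the key lookup and fallback idea on its own, here is a minimal sketch in plain JavaScript. It is not how the plugin is implemented. The parseBundle and lookup helpers, the inline bundle text, and the bracketed result for a missing key are all assumptions made for illustration.

```javascript
// Parse the text of a simple key = value bundle into a plain object.
// Blank lines and lines starting with # are skipped.
function parseBundle(text) {
  const entries = {};
  text.split(/\r?\n/).forEach(function (line) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.charAt(0) === "#") return;
    const eq = trimmed.indexOf("=");
    if (eq === -1) return;
    entries[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  });
  return entries;
}

// Hypothetical bundle contents. The French bundle is deliberately incomplete.
const bundles = {
  en: parseBundle("search = Search\nintromsg = To display products, use the search form above."),
  fr: parseBundle("search = Rechercher")
};

// Look a key up in the requested language, then fall back to English.
function lookup(key, lang) {
  const localized = bundles[lang] || {};
  if (Object.prototype.hasOwnProperty.call(localized, key)) return localized[key];
  if (Object.prototype.hasOwnProperty.call(bundles.en, key)) return bundles.en[key];
  return "[" + key + "]"; // arbitrary marker for a key missing everywhere
}

console.log(lookup("search", "fr"));   // found in the French bundle
console.log(lookup("intromsg", "fr")); // falls back to English
```

The plugin does this work for you, including fetching the bundle files from the server.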
Here is an example:
$.i18n.properties({
name:'terms',
path:'bundles/',
mode:'map',
callback:function() {
$("#intromsg").text($.i18n.prop("intromsg"));
$("#searchText").attr("placeholder", $.i18n.prop("search"));
}
});
This code calls the plugin and uses the browser's settings to determine the current language. When done, a callback is fired and I can then update my values. My demo is going to let users select a language and will default to English. I began by abstracting out my localization call into a function:
function loadAndDisplayLanguages(lang) {
$.i18n.properties({
name:'terms',
path:'bundles/',
mode:'map',
language:lang,
callback:function() {
$("#intromsg").text($.i18n.prop("intromsg"));
$("#searchText").attr("placeholder", $.i18n.prop("search"));
}
});
}
And within my jQuery document.ready block, I fired off the request defaulting to "en":
loadAndDisplayLanguages('en');
Finally, I added a simple click handler for my drop down menu:
$(".langpick").on("click",function(e) {
loadAndDisplayLanguages($(this).data("lang"));
});
What's the data call there? I used a data attribute to store the language code for my 3 supported languages:
<li><a href="#" class="langpick" data-lang="en">English</a></li>
<li><a href="#" class="langpick" data-lang="fr">French</a></li>
<li><a href="#" class="langpick" data-lang="ja">Japanese</a></li>
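The demo always starts in English, but the starting language could also come from the browser's preferred language list, with the menu still letting the user override it. A minimal sketch, assuming a hypothetical pickLanguage helper and the three language codes used in the menu:

```javascript
// Language codes matching the data-lang values in the menu.
const SUPPORTED = ["en", "fr", "ja"];

// Hypothetical helper: walk the user's preferred languages in order and
// return the first supported one, trying the full tag and then its primary part.
function pickLanguage(preferred, supported, fallback) {
  for (let i = 0; i < preferred.length; i++) {
    const tag = String(preferred[i]).toLowerCase();
    if (supported.indexOf(tag) !== -1) return tag;
    const primary = tag.split("-")[0];
    if (supported.indexOf(primary) !== -1) return primary;
  }
  return fallback;
}

// In a browser the list would come from navigator.languages, for example:
// loadAndDisplayLanguages(pickLanguage(navigator.languages || [navigator.language], SUPPORTED, "en"));
console.log(pickLanguage(["fr-CH", "de"], SUPPORTED, "en"));
```

The helper takes the list as an argument so it can be tried outside a browser.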
Ok, with me so far? We aren't quite done yet. These modifications work to update the search button placeholder text and intro text. But what about our product searches? Every product result has 4 values: the name, the price, the available date, and the quantity:
In order to update these values, we need to ensure we have them in our properties, and we need to ensure jQuery can find them so we can replace them. My English properties file now looks like this:
intromsg = To display products, use the search form above.
search = Search
price = Price
available = Available
quantity = Quantity
I then edited my result handler to wrap the values in spans:
$.post("service.cfc?method=searchproducts", {search:s}, function(res,code) {
var dsp = "";
for(var i=0; i<res.length; i++) {
dsp += "<div class='productResult'><h3>"+res[i].name+"</h3>";
dsp += "<p><span class='pricelabel'>"+PRICE_STR+"</span>: "+res[i].price+"<br/>";
dsp += "<span class='availlabel'>"+AVAILABLE_STR+"</span>: "+res[i].available+"<br/>";
dsp += "<span class='quantlabel'>"+QUANTITY_STR+"</span>: "+res[i].quantity+"<br/>";
dsp += "</p></div>";
}
$("#results").html(dsp);
},"json");
Note the use of 3 variables, PRICE_STR, AVAILABLE_STR, QUANTITY_STR. My JavaScript code now creates 3 global variables for this, and my loadAndDisplayLanguages function can update them:
//store strings for price, available, and quantity
var PRICE_STR = "";
var AVAILABLE_STR = "";
var QUANTITY_STR = "";
function loadAndDisplayLanguages(lang) {
$.i18n.properties({
name:'terms',
path:'bundles/',
mode:'map',
language:lang,
callback:function() {
$("#intromsg").text($.i18n.prop("intromsg"));
$("#searchText").attr("placeholder", $.i18n.prop("search"));
PRICE_STR = $.i18n.prop("price");
AVAILABLE_STR = $.i18n.prop("available");
QUANTITY_STR = $.i18n.prop("quantity");
$(".pricelabel").text(PRICE_STR);
$(".availlabel").text(AVAILABLE_STR);
$(".quantlabel").text(QUANTITY_STR);
}
});
}
Woot! To test this version out, hit this url:
http://www.raymondcamden.com/demos/2012/feb/15/v2 Another old demo removed
Try switching your language to French or Japanese. Note that there is nothing there for Japanese. That's ok. Eventually (well, if this were real) we could create that properties file and everything would just work.
So most of our 'labels' and simple text are updated, but we've got more work we can do. Note that the numeric and date values are not localized. That's our next target.
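As an aside, the label handling in the result handler above can also be written without global variables, by passing the labels into a function that builds the markup. This is only a sketch of an alternative and not what the demo does. The renderProducts and escapeHtml helpers, the sample product, and the French labels are made up for illustration, and the escaping step is an addition of mine.

```javascript
// Escape text before placing it in HTML so product names cannot break the markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Build the same structure as the result handler, with labels passed in.
function renderProducts(products, labels) {
  return products.map(function (p) {
    return "<div class='productResult'><h3>" + escapeHtml(p.name) + "</h3>" +
      "<p><span class='pricelabel'>" + escapeHtml(labels.price) + "</span>: " + escapeHtml(p.price) + "<br/>" +
      "<span class='availlabel'>" + escapeHtml(labels.available) + "</span>: " + escapeHtml(p.available) + "<br/>" +
      "<span class='quantlabel'>" + escapeHtml(labels.quantity) + "</span>: " + escapeHtml(p.quantity) + "<br/>" +
      "</p></div>";
  }).join("");
}

// Made-up data and illustrative French labels.
const sampleProducts = [{ name: "Widget <Pro>", price: 9.99, available: "February, 15 2012 00:00:00", quantity: 3 }];
const frenchLabels = { price: "Prix", available: "Disponible", quantity: "Quantité" };

console.log(renderProducts(sampleProducts, frenchLabels));
```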
Globalizing Numbers and Dates
To work with the numbers and dates, I'm going to use another jQuery Plugin - globalize (https://github.com/jquery/globalize). As you can guess by the name, it handles globalizing/localizing numbers and dates. (It also has similar support to the i18n plugin.) In general, this plugin worked great, but I did run into one issue. In order to support various locales, you have to add additional script tags. So for example, I have to include a script tag for French and Japanese. Unlike the i18n plugin which simply tries to include things dynamically, the globalize plugin requires you to explicitly add support. (And yes, I know I could load those script files dynamically too.) One big thing to watch out for here - when I initially added French support, I did it like so:
<script type="text/javascript" src="js/cultures/globalize.culture.fr.js"></script>
But my results were garbage. Turns out, I forgot about the charset attribute. Adding this cleared this up immediately:
<script type="text/javascript" src="js/cultures/globalize.culture.fr.js" charset="utf-8"></script>
One more small note - the globalize plugin is - in some ways - much more advanced than the i18n plugin. It recognizes a concept of "culture" which is more specific than a simple language code. You can still use language codes, but advanced users will want to read the docs carefully and see if they want to make use of this feature.
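To illustrate how a specific culture relates to a plain language code, here is a small sketch that turns a culture tag such as "fr-CA" into a list of candidates, most specific first, and picks the first one that is available. The cultureChain and resolveCulture helpers are hypothetical, and the globalize plugin's own lookup rules may well differ.

```javascript
// Build a fallback chain for a culture tag: "fr-CA" gives ["fr-CA", "fr", default].
function cultureChain(tag, defaultCulture) {
  const parts = tag.split("-");
  const chain = [];
  while (parts.length > 0) {
    chain.push(parts.join("-"));
    parts.pop(); // drop the most specific part and try again
  }
  if (chain.indexOf(defaultCulture) === -1) chain.push(defaultCulture);
  return chain;
}

// Return the first candidate for which we actually have culture data loaded.
function resolveCulture(tag, available, defaultCulture) {
  const chain = cultureChain(tag, defaultCulture);
  for (let i = 0; i < chain.length; i++) {
    if (available.indexOf(chain[i]) !== -1) return chain[i];
  }
  return defaultCulture;
}

console.log(cultureChain("fr-CA", "en"));
console.log(resolveCulture("fr-CA", ["en", "fr", "ja"], "en"));
```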
So, how about an example? Given our quantity values are numbers, we can use the plugin like so:
Globalize.format(res[i].quantity,"n0")
Dates are a bit trickier. You want to parse your original value first, then format it:
var dateStr = Globalize.parseDate(res[i].available,"MMMM, dd yyyy hh:mm:ss","en");
... Globalize.format(dateStr,"d") ...
Notice I explicitly set the language to English. Since my data is coming in with an English format, this is ok. The format function will use the currently selected language. I haven't covered that yet, but you can set a language (or culture) like so:
Globalize.culture("en");
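For comparison, the same parse-first, format-second idea can be sketched without the plugin, using a small hand-written parser and the built-in Intl API. The sample string "February, 15 2012 00:00:00" is my assumption of what the server sends, inferred from the pattern above, and parseServerDate and formatShortDate are hypothetical helpers.

```javascript
const MONTHS = ["January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December"];

// Parse strings shaped like "February, 15 2012 00:00:00" (assumed server format).
// Returns a Date in local time, or null when the text does not match.
function parseServerDate(text) {
  const match = /^([A-Za-z]+), (\d{1,2}) (\d{4}) (\d{2}):(\d{2}):(\d{2})$/.exec(text);
  if (!match) return null;
  const month = MONTHS.indexOf(match[1]);
  if (month === -1) return null;
  return new Date(Number(match[3]), month, Number(match[2]),
    Number(match[4]), Number(match[5]), Number(match[6]));
}

// Format a Date as a short numeric date for the given locale.
function formatShortDate(date, locale) {
  return new Intl.DateTimeFormat(locale).format(date);
}

const availableDate = parseServerDate("February, 15 2012 00:00:00");
console.log(formatShortDate(availableDate, "en-US"));
console.log(formatShortDate(availableDate, "en-GB"));
```

The point is the same as with the plugin: the incoming string is always read as English, and only the output depends on the selected language.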
Ok... so in general, easy enough to use, right? However, we have two things to consider here. First, we need to use these formatting functions when displaying our search results. Second, we need to ensure we can update them dynamically. But - we are taking original values and converting them into a language specific value. I assumed (and note, I could be wrong!) that once localized, the plugin may have an issue converting it back into something else. So I decided to once again make use of data values. This time I'm going to store my original values so I can fetch em later:
$.post("service.cfc?method=searchproducts", {search:s}, function(res,code) {
var dsp = "";
for(var i=0; i<res.length; i++) {
dsp += "<div class='productResult'><h3>"+res[i].name+"</h3>";
dsp += "<p><span class='pricelabel'>"+PRICE_STR+"</span>: <span class='priceval' data-price='"+res[i].price+"'>"+Globalize.format(res[i].price,"c")+"</span><br/>";
var dateStr = Globalize.parseDate(res[i].available,"MMMM, dd yyyy hh:mm:ss","en");
dsp += "<span class='availlabel'>"+AVAILABLE_STR+"</span>: <span class='dateval' data-date='"+res[i].available+"'>"+Globalize.format(dateStr,"d")+"</span><br/>";
dsp += "<span class='quantlabel'>"+QUANTITY_STR+"</span>: <span class='quantval' data-quant='"+res[i].quantity+"'>"+Globalize.format(res[i].quantity,"n0")+"</span><br/>";
dsp += "</p></div>";
}
$("#results").html(dsp);
},"json");
That handles the display, now let's go back to loadAndDisplayLanguages. I've updated it to handle the new globalization calls. Note that - oddly - I had to be explicit with my language when formatting.
function loadAndDisplayLanguages(lang) {
$.i18n.properties({
name:'terms',
path:'bundles/',
mode:'map',
language:lang,
callback:function() {
$("#intromsg").text($.i18n.prop("intromsg"));
$("#searchText").attr("placeholder", $.i18n.prop("search"));
PRICE_STR = $.i18n.prop("price");
AVAILABLE_STR = $.i18n.prop("available");
QUANTITY_STR = $.i18n.prop("quantity");
$(".pricelabel").text(PRICE_STR);
$(".availlabel").text(AVAILABLE_STR);
$(".quantlabel").text(QUANTITY_STR);
$(".priceval").each(function(i,el) {
var thisPrice = $(this).data("price");
var newPrice = Globalize.format(thisPrice, "c",lang);
$(this).text(newPrice);
});
$(".dateval").each(function(i,el) {
var thisDate = $(this).data("date");
var dateP = Globalize.parseDate(thisDate,"MMMM, dd yyyy hh:mm:ss","en");
var newDate = Globalize.format(dateP, "d",lang);
$(this).text(newDate);
});
$(".quantval").each(function(i,el) {
var thisQuant = $(this).data("quant");
var newQuant = Globalize.format(thisQuant, "n0",lang);
$(this).text(newQuant);
});
}
});
}
You can demo this here...
Another old demo removed...
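The idea of keeping the original values around and formatting them again on every language change can be sketched on its own. This version uses the built-in Intl API instead of the globalize plugin, and the sample row, the formatRow helper, and the explicit "USD" currency code are assumptions for illustration.

```javascript
// Made-up raw values as they might arrive from the server.
const rawResults = [
  { name: "Sample product", price: 1209.21, quantity: 12000 }
];

// Produce display strings for one row without touching the raw values.
// The currency is named explicitly so that switching the display language
// changes only the formatting, not the currency itself.
function formatRow(row, locale, currency) {
  return {
    name: row.name,
    price: new Intl.NumberFormat(locale, { style: "currency", currency: currency }).format(row.price),
    quantity: new Intl.NumberFormat(locale, { maximumFractionDigits: 0 }).format(row.quantity)
  };
}

// Re-run the formatting for each language from the same raw data.
["en-US", "de-DE"].forEach(function (locale) {
  console.log(rawResults.map(function (row) { return formatRow(row, locale, "USD"); }));
});
```

Since a formatted string such as a localized price is hard to turn back into a number reliably, formatting from the stored raw value each time is the safer route.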
And that's that. Obviously there is a lot more to consider here. I cannot stress enough how much more additional work will be necessary for proper localization. I'd love to hear people chime in with corrections, real life examples, or other comments.
p.s. I didn't bother attaching the server side code as it's a simple ColdFusion service using fake data. If anyone wants it just ask.
Archived Comments
What nothing about Word? ah man...
All kidding aside, I remember something like this being in the Joomla CMS. They had a directory of language files for translating the different site elements.
Wow, perfect timing! I have a client coming over in about...29 minutes... to talk about this very subject!
Thank You!
Merci!
Gracias!
If the client is Fortune 500, visit the wishlist.
If the client is Fortune 100, pay off my car note.
Thanks. ;)
Not a code related note...
I benefitted from Internationalization of dates as a non-American. Born and raised in Finland, dates are shown as dd.mm.yyyy. I got my drivers license in the US when I was an exchange student in the 80's. In Finland you have to be 18 to get a drivers license and that's also the age limit to nightclubs. My birthday is October 6 (10/06/xxxx) and my US drivers license showed it as such 10/06/xxxx. Doorman in a Finnish nightclub read it as June 10th and I got in as underaged :) during the summer :D
Of course I will never tell this to my 6yr old twin girls. I'll let them have fond memories of their mommy's native country and Santa (Santa's from Finland ya'll).
Jaana
Great article, Ray! I've been wanting to start implementing Globalization of some sites and this was an excellent introduction. Would love to see more in depth articles in the future! Thanks!
We built something similar for "large global manufacturer" for localizing their sites, but without the jQuery piece. The part that bit us the most was doing single word translations vs. phrase translations, and translations where the english word had differing translations depending on context. We had to abstract out the key to be something other than the english word to call the translation we wanted.
Globalization, it's never as simple as this equals that.
Hi Ray - thanks for this. You want real world examples, so I'll just butt in and say that we use Paul's JavaRB CFC for most of our websites (all in CF) and it works fine.
I'd like to confirm your point that creating the actual language / translation files is definitely the biggest part of this job, especially when concerned with the static part of the interface. It's not just a case of translating words. Often a variable, or variables need to be inserted into a translated string or phrase - and of course with different languages this can present a problem (formatRBstring does it in Paul's code). In some cases you may have to insert a single string, in other cases more than one. In some languages if you insert two or more different strings, they may come in different order. Managing your language file variables is also critical - handling upper and lower case, or fields starting in upper case. All very time consuming.
Good points, Richard. I should point out - the jQuery library I used _does_ support variables in the strings, I just didn't need any of that for this demo.
Ray - just to make it clear, I wasn't criticising the code or examples, just adding my tuppence-worth that producing multi-lingual websites can be pretty tough. Our sites are not exactly completely multilingual - the user/client can select the interface language only. Managing multilingual content is of course another thing altogether, but not really related to your topic of internationalisation. Why don't we all just speak Klingon?
Oh, I didn't take it as a critique at all! I just wanted to make sure folks knew it was possible with the plugin I used since my examples were simple. :)
Ray,
One pet peeve that I have on this topic is how often I run into web applications built that restrict formatting in the name of "validation". To me, a valid phone number is one you can use to call me on the phone. A valid address is one you can use to send a letter to me. And yet I find that very often, US based developers make the assumption that every phone number in the world has 10 digits, or that every zip code in the world has 5 digits, and validation routines are run to make sure the phone number submitted is a 10 digit numeric so that it can be stored as a numeric in the database. You would have no idea how many times it has been impossible for me to enter my valid phone number or address in web applications built like that - including on (gasp!) adobe.com.
Stuff like phone numbers and zip codes are strings, always. There is no way a web developer can ensure a user enters a valid phone number, one that can be used to reach them on the phone, with an algorithm. And generally speaking, all web applications hosted on the internet are available to every country in the world, so this would apply to any site aimed at an international audience, even if it remains only in English.
Email address validation routines drive me nuts. So many of them, especially those with legacy algorithms, invalidate valid email addresses (ones that can be used to email a real person) because of the continuing expansion of top level domains. Don't forget that even if an algo is valid for all top level domains today, it may not be in a year's time.
The other thing that developers are really screwing up these days is to enforce language selection, and I really mean _enforce_, shove it down your throat with no possibility to change it, based on a user's geographical location. I live in Switzerland. German, French, Italian, Romansh and yes, English are the languages generally spoken here depending on the region and the participants in the conversation. And yet companies like Google, Adobe (again!), and a host of others will now geolocate visitors, and in my case, ONLY show their website to me in bloody German with no possibility at all to change the language. I f*****g hate that! Not because I dislike German, but simply because it is so utterly disrespectful to refuse to communicate with someone in the language they are most familiar with _when_ _you_ _can_.
Companies have no idea how many people they offend, or at least alienate, like this. Give users the ability to choose which language they are most comfortable with, always. People are not trees. They are not glued to the region they were born in, and here in Europe, lots of folks wind up in another country (or language speaking region) besides that of their origin. I am genuinely surprised how utterly rare it has become that a company that has globalized their website allows a visitor to decide what language they speak! Imagine how you would feel, Ray, if the ONLY language you could view Adobe.com in was Finnish, living in America, and knowing full well that Adobe was an American company.
Google isn't as bad, but they simply cannot seem to remember my language selection, and it's always in critical parts of the application. Navigation menus suddenly appear in German while other parts of the page remain in English.
To give a flavor of what typically happens outside of America, I'm exposed to a mix of English, Italian, Swiss Italian dialect, French, German, and Swiss German dialect every day. It's normal here, depending on the grouping of people conversing. And it's very rare for someone from the region I'm in to be fluent in high German, the language Adobe and so many other companies, even Apple and Microsoft, shove down our throats. Perhaps they can speak some Swiss German dialect, yes, but read and write high German?
It all boils down to user choice. In my opinion, globalized websites should always, always allow a high degree of choice. You would think I'm addressing a minority of sites and applications here, because it would seem to be such a common sense thing, but my experience is that almost all globalized sites I've come across, and big companies like Apple, Microsoft, Adobe, and a whole slew of other tech companies, simply have not gotten these basics right yet. And without these basics, the effort to globalize seems somewhat pointless.
By the way, would someone please order and purchase a Windows 7 serial number for me? Believe it or not, unless I manage it all in German, it's impossible for me. On second thought, forget it. Windows Activation routine is also geo-located, and it's going to tell me it's invalid for the region I'm in. It's that bad, always.
Wow, that's some feedback Nando. :) Thank you - you make some great points.
I needed to order a new copy of my birth certificate a few years back. There was a dropdown to select the country to send it to, but the zip code had a validation routine on it that insisted on 5 numeric digits. I couldn't submit the form at first. I kept getting a validation message "Please enter a VALID zip code" Well my zip code here is 4 digits long, just like every other zip code in Switzerland. I thought about how the developer might have coded the routine, and then entered my zip code, 6999, then a space after it, submitted the form (with a fake telephone number because it wouldn't accept my real phone number). And somehow it worked. I was relieved I had gotten through another one of these.
A few days later I got an email from FedEx saying they had attempted delivery, but they needed me to confirm my correct street address. They weren't able to find via Cara in some city in Massachusetts with a zip code of 06999. I called FedEx and was on the phone for a very long time with them trying to convince them to send my birth certificate to Switzerland. Sorry, they couldn't, they said. "The zip code is clearly 06999 on the shipping bill, so that's where it was routed. Just give us your correct street address in Massachusetts and we'll deliver it." I kept saying "But I'm in Switzerland - I entered my Swiss address in the form" and the lady kept repeating, stressing each word insistently, "Sir! Just give us your correct street address in Massachusetts and we'll deliver it."
I always seem to wind up on the phone with someone insisting some version of "but that's impossible!" after tangling with a mis-globalized web application. "Look, I speak English, I don't want the software interface for Adobe Creative Suite in German, in fact I couldn't use it if it was in German. I need the English version, but I can't buy the English version because I'm in Switzerland." "Sorry, but that's impossible. You have to have an American or Canadian delivery and credit card billing address, and access the website from either America or Canada, to purchase the English language version."
Anyway, the next mis-globalized website hurdle to get over, thru or under is how to get an activation key for the Windows 7 installation I have on Parallels that will work. I tried for about a half hour on Friday, but I just didn't have the stomach for it. I was locked out of English on all of microsoft.com, purchases seem possible only in German here in Switzerland.
The 01/02 -> 02/01 1st Feb, 2nd Jan date issue is quite possibly the thing I hate most about development. Languages always seem to default to handling it the "US" way, which is a pain, but it's never that hard to work around.
Also Ray, you've underlined something that's not a link... "introduce" in your second paragraph. That really threw me... I had to inspect the source to find out why the link wasn't working. Am I being old fashioned?
Nando you make some good (albeit angry sounding) points, I've had Google offer to translate pages that are perfectly fine in English into all manner of other languages.
It also reminds me a bit of how websites force you into or away from a mobile site without giving you the choice.
Finally, cool post Ray, those plugins look really useful.
Whoops, when I said "Languages default to handling it the US way"... I probably should have been clearer, I meant programming/development languages.
Although that's another point, is the MM/DD way more internationally accepted? Is it just us UK kind that use DD/MM
I believe the US date (mm/dd) is only used in the US, the rest of the world pretty much uses dd/mm order with the date.
I could be wrong, please correct me.
See here about international date formats. The US of A is really the only nation left that would use only MDY and no other orders. Then again, 2012-07-23 would be used by the US Army and should be understood pretty well internationally. Nobody does it like yyyy-dd-mm, always yyyy-mm-dd.
http://en.wikipedia.org/wik...
America and Belize - leading the world!
One thing missing from this is SEO. The demo you provide does not use different URLS for content presented in different languages. I think it should. It makes much more sense in terms of SEO as you have essentially got different versions of the content. If you provided different URLs for the different languages then you would allow search engines to target users according to their language. Also, you would have three times as many pages and thus many more internal links and potential external links.
Would SEO be important for a web "app" though? (Note the emphasis.) I can't imagine, for example, Capital One, needing SEO for their account screens.
Agreed, for an App then I guess it's less important. An App is by its nature often dynamic, hidden behind an authentication layer and very content slim. However, they are also often connected with a website and the delineation can become blurred, especially with apps that produce publicly accessible content (like Facebook). I think it is worth keeping in mind if there is publicly accessible Content within the app you are creating. Indeed, especially if that content is created in multiple languages.
Hi,
If you’re interested to localize web software, PC software, mobile software or any other type of software, I warmly recommend a new l10n tool that my team recently developed and will probably make your work a lot faster and easier:
http://poeditor.com/
POEditor is intuitive and collaborative and has a lot of useful features to help your translations management process, which you can find enlisted on our website.
You can import from multiple localization file formats (like pot, po, xls, xlsx, strings, xml, resx, properties) or just use our REST API.
Feel free to try it out and recommend it to developers and everyone who might find it useful.
Hello everyone! If you are like me, curious to discover new resources about localization with interesting topics, I suggest to have a look here and find various articles on localization industry and everything else related with it: http://aboutlocalization.wo...