Native Language Support
PHP is a great language for developing dynamic web sites. Some use it for fun, others for business. It is true that a great part of the web is in English. However, if you are targeting a worldwide audience, then neither English nor Esperanto alone is an option.
If you need to deliver content in several languages, it is a good idea to explore several alternatives. However, some alternatives may not be suitable for dynamic websites. Added to that, there is the overhead of time spent in maintenance. To further complicate things, your needs may not be totally in line with the resources you have at your disposal. Therefore, it is advisable to choose an alternative that suits you best.
I found myself in positions which required me to deliver content in both English and Spanish, and in one project a third language. Here are the possibilities I explored:
Explicit links for each language
Use Apache's mod_negotiation
Use GNU Gettext support in PHP
Write your own
This article gives a brief introduction to the first three possibilities, and then concentrates on the fourth, which suited my requirements best given the set of constraints. I am assuming that the reader is at least familiar with PHP programming and the use of PHP classes.
Principles of content negotiation
Before we go into exploring the various options, we should understand the basics of content negotiation and how that applies to the development framework. Then, you will be able to develop a web application that can deliver its content in the language of choice of your visitor.
By simply configuring the web browser, the user can ask for his or her preferred language to be used when available. Several languages can be specified as a prioritized list in the browser's preferences or options.
The browser sends this list of preferred languages with every request made to the site. This action is totally transparent to the user, as the information is sent in the Accept-Language header, for example:
Accept-Language: bg, es, en-US, fr
Here our visitor has chosen Bulgarian, Spanish, US English and French, in that order. Notice that you can even specify regional variants. The first two characters are a language code as specified in ISO 639. This language code may be followed by a dash and a region code, as in en-US.
As an example, if the request arrives at a website whose content is entirely in Russian, then none of the listed languages can be served and the visitor will get Russian text whether he or she likes it or not. Now, assume the website has both Spanish and English content (the 2nd and 3rd entries in the list). Then the visitor will receive pages in Spanish. Why? Simply because Spanish has a higher priority than English in the visitor's list.
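The selection just described can be sketched in a few lines of PHP. This is only an illustration, not code from any library: the function name pick_language() and its parameters are my own invention, the syntax assumes a current PHP release (7.0 or later), and the optional q= weights that HTTP allows in this header are honoured even though the example header does not use them.

```php
<?php
/**
 * Pick the best language for a visitor (hypothetical helper).
 *
 * $header    raw Accept-Language value, e.g. "bg, es, en-US, fr"
 * $available lowercase language codes the site offers, e.g. ['en', 'es']
 * $default   language to fall back on when nothing matches
 */
function pick_language(string $header, array $available, string $default): string
{
    $candidates = [];
    foreach (explode(',', $header) as $position => $item) {
        $parts = explode(';', trim($item));
        $tag = strtolower(trim($parts[0]));
        if ($tag === '' || $tag === '*') {
            continue; // skip empty entries and the wildcard
        }
        $q = 1.0; // HTTP default weight when no q= parameter is given
        foreach (array_slice($parts, 1) as $param) {
            $param = trim($param);
            if (strncasecmp($param, 'q=', 2) === 0) {
                $q = (float) substr($param, 2);
            }
        }
        $candidates[] = ['tag' => $tag, 'q' => $q, 'pos' => $position];
    }

    // Highest weight first; equal weights keep the order of the header.
    usort($candidates, function ($a, $b) {
        if ($a['q'] != $b['q']) {
            return $a['q'] < $b['q'] ? 1 : -1;
        }
        return $a['pos'] - $b['pos'];
    });

    foreach ($candidates as $candidate) {
        if ($candidate['q'] <= 0) {
            continue; // q=0 means "not acceptable"
        }
        if (in_array($candidate['tag'], $available, true)) {
            return $candidate['tag'];
        }
        // Fall back from a regional variant (en-us) to its language (en).
        $primary = explode('-', $candidate['tag'])[0];
        if (in_array($primary, $available, true)) {
            return $primary;
        }
    }
    return $default;
}

// The site offers English and Spanish: Spanish wins for our visitor.
echo pick_language('bg, es, en-US, fr', ['en', 'es'], 'en'), "\n"; // es
// A Russian-only site: nothing matches, so the default is used.
echo pick_language('bg, es, en-US, fr', ['ru'], 'ru'), "\n";       // ru
```

In a real script the header would typically be read from $_SERVER['HTTP_ACCEPT_LANGUAGE'], which may be absent, so an empty string is a sensible fallback there.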
Sometimes, the web server itself can manage the content negotiation, if configured to do so. Otherwise, the request for a particular language is ignored. Alternatively, the application that delivers the content takes the decision of which language it is to use. This is exactly what we will do later.
Before going further, I would like to point out that content negotiation does not deal only with human languages. For example, it also negotiates the kind of information the client can accept by means of MIME types, but that is beyond the scope of this article.
Explicit links
Many multi-lingual websites present the content in various languages, and do so by placing a link on the document. There would be one link for each of the supported languages. This is a very simplistic approach and should only be used if you need to have multi-lingual content, but do not have the resources of a scripting language or dynamic content.
If a document is moved, or a new language is added or removed from the repertoire, then the webmaster would have to edit, add or remove links in each of the affected documents. This can be quite tedious.
Apache's content negotiation
The Apache web server can manage language-sensitive content delivery by using the information from the content negotiation headers. For this, the webmaster must provide the static pages for each language and name them properly. For example, if the welcome page is available in Spanish and English, the webmaster would have these two files:
welcome.html.es
welcome.html.en
When the web server is well configured, it will deliver the appropriate web page based on the language code according to the priority list.
This works perfectly for static pages. However, if you have a dynamic website where many of the pages are generated from queries, then this approach will not work. Another disadvantage is that you need to know how to configure it, and you may or may not have access to the configuration files. My experience was that it was a bit tricky and did not offer enough flexibility for my purposes.
An advantage of this method is that the negotiation is between the browser and the Apache server. You need only to provide the static content.
GNU Gettext with PHP
This internationalization tool has been around for some time for C programmers. There is also a variant used on other Un*x systems, such as HP-UX. Both are very good and are easy to use.
This extension has been available in PHP since version 3.0.6 and also in 4.0. The Gettext extension is easy to use, and is good if you are generating your webpages dynamically. The only thing left here would be the PHP code that generates the content and a set of message catalogs. Supporting a new language is as easy as generating a new catalog with the translations and dropping the file in the appropriate directory. Therefore, assuming you have a PHP application named "myphp" and that the appropriate message catalogs exist and are installed, then the application would have something like this:
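<?php
/* Language to use; hard-coded here, normally chosen by content negotiation.
   The exact locale name (e.g. "es" or "es_ES") depends on the server. */
$language = "es";
/* Initialization of GetText in myphp */
putenv("LANG=$language");
setlocale(LC_ALL, $language);
bindtextdomain("myphp", "./locale");
textdomain("myphp");
/* Print some messages in the native language */
echo gettext("Hello new user");
echo _("You have no new messages");
?>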
My provider had recently upgraded from PHP 3.0RC5 to a PHP4.0 Beta 2 installation. While PHP4 does have support for the Gettext extension, my provider did not compile the PHP4 module with Gettext support. However, even if they had, moving to another provider without Gettext would become a major headache.
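One way to soften this dependency is to check at run time whether the extension is loaded and to fall back on a plain PHP array when it is not. The sketch below illustrates the idea only: translate() and the $es array are hypothetical names, not part of PHP or of Gettext, and the syntax assumes PHP 7.0 or later.

```php
<?php
/**
 * Translate a message (hypothetical helper).
 *
 * Uses gettext() when the extension is loaded and returns a translation;
 * otherwise looks the message up in $fallback, a plain array that maps
 * original messages to translated ones. Unknown messages come back unchanged.
 */
function translate(string $message, array $fallback): string
{
    if (function_exists('gettext')) {
        $translated = gettext($message);
        // gettext() returns its argument when no translation is found.
        if ($translated !== $message) {
            return $translated;
        }
    }
    return $fallback[$message] ?? $message;
}

// A tiny Spanish catalog kept in PHP itself (sample data for illustration).
$es = [
    'Hello new user'           => 'Hola, nuevo usuario',
    'You have no new messages' => 'No tiene mensajes nuevos',
];

echo translate('Hello new user', $es), "\n";
echo translate('You have no new messages', $es), "\n";
```

The array is far less capable than a real message catalog, but it travels with the application from one provider to the next.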
CtNls - National Language Support
Having considered various alternatives, I wrote down a set of requirements for the NLS module:
It had to be simple, and written in PHP
Easy to use
Allow the user the freedom of choosing or mixing NLS methods
As a result, I developed a PHP class that would allow the user to set up a multi-lingual website. With this class, the developer can emulate the Apache approach of one file per language, without reconfiguring Apache. Additionally, it is possible to use a PHP script that generates the output in the appropriate language by means of message catalogs.
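To give an idea of what emulating the one-file-per-language approach involves, here is a minimal sketch. It is not the CtNls code or its interface: find_localized_file() is a hypothetical function, the file names follow the welcome.html.es convention shown earlier, and the syntax assumes PHP 7.1 or later.

```php
<?php
/**
 * Return the first existing "<base>.<lang>" file (hypothetical helper).
 *
 * $base      path without the language suffix, e.g. "/var/www/welcome.html"
 * $preferred language codes in the visitor's order of preference
 * $default   language tried last, when none of the preferred ones exists
 *
 * Returns the path of the file to serve, or null when no variant exists.
 */
function find_localized_file(string $base, array $preferred, string $default): ?string
{
    $languages = array_unique(array_merge($preferred, [$default]));
    foreach ($languages as $lang) {
        // Accept only plain codes such as "es" or "en-us"; this keeps
        // visitor-supplied values from being used to build odd paths.
        if (!preg_match('/^[a-z]{2,3}(-[a-z0-9]+)*$/i', $lang)) {
            continue;
        }
        $path = $base . '.' . strtolower($lang);
        if (is_file($path)) {
            return $path;
        }
    }
    return null;
}

// Serve the welcome page in the best available language.
$file = find_localized_file(__DIR__ . '/welcome.html', ['bg', 'es', 'en'], 'en');
if ($file !== null) {
    readfile($file);
} else {
    echo "No translation available\n";
}
```

Because the lookup happens inside the script, nothing needs to change in the Apache configuration.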
A real-life application of this class can be seen on my website at http://www.coralys.com/ where the main page is available in English (en), Spanish (es) and Dutch (nl).
Using CtNls
This class is very easy to use. The application would have something like th
