FutureQuest, Inc.
02-24-2002, 05:47 AM   Postid: 62483
Arthur
Developer
PHP+gettext, internationalization of your website

With the upgrade of PHP to version 4.1.1 a new feature was introduced: gettext. Here's a small write-up on how to use gettext on your site.

So, what is gettext and how do I use it?
PHP's gettext functions provide an interface to the GNU gettext utility, which is a utility for programmers to internationalize their programs. Gettext helps manage the messages output by the software. You can also use it for internationalization (often abbreviated to I18N) of your website using PHP.

The basic usage in your PHP script is fairly simple:
1. set the language to be used
2. echo/print your text as you normally would, but wrap it in a call to 'gettext()' or its alias, the underscore function '_()'. So instead of 'echo "Hello World"', you would write 'echo _("Hello World")' or 'echo gettext("Hello World")'

The translations are stored in compact binary files (.mo files), which you create from the plain-text files (.po files) that the gettext utilities extract from your script.

So, step by step, how to create your script.
Let's assume the script you want to internationalize looks like this:
PHP Code:
<?php
echo _("Hello World");
?>
The first step would be to create the .po file by extracting all the translatable strings. For that you use the 'xgettext' program (xgettext is part of the gettext utilities and is not a PHP function). So, you invoke 'xgettext':
xgettext --default-domain=greetings -k_ hello.php

This results in a file called 'greetings.po' being created. The contents of the .po file look like this:
#: hello.php:2
msgid "Hello World"
msgstr ""

The first line is a comment showing where the string was found, the second line is the string to be translated, and the third line will hold its translation.

Let's translate it into Dutch. Open the 'greetings.po' file in a text editor and put the translation after the 'msgstr' directive. Your 'greetings.po' file should now look something like this:
#: hello.php:2
msgid "Hello World"
msgstr "Hallo Wereld"

Now, let's make the binary .mo file from this. For that you use the utility 'msgfmt' (msgfmt is again part of the gettext utilities):
msgfmt -o greetings.mo greetings.po

So, now you have your translation, but how do you change your script to show the translation?
First of all, create a subdirectory called 'locale'; in that subdirectory make a subdirectory for the language (in this case 'nl_NL'), and in that directory create a directory 'LC_MESSAGES'. Put the 'greetings.mo' file in 'LC_MESSAGES'. You should now have the file 'locale/nl_NL/LC_MESSAGES/greetings.mo'.
Secondly, the original PHP script has to be adapted: the locale needs to be set to the right language, and the so-called text domain has to be set to 'greetings' so that gettext looks for 'greetings.mo'.
Here is what the script could look like, although in a real application you'd of course want to set the language based on $HTTP_ACCEPT_LANGUAGE (that is, $_SERVER['HTTP_ACCEPT_LANGUAGE']) or a setting chosen by the user.
PHP Code:
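<?php
// Select the language (hard-coded here for the example).
putenv("LANG=nl_NL");
setlocale(LC_ALL, "nl_NL");

// Tell gettext where the translations live and which domain (greetings.mo) to use.
bindtextdomain("greetings", "./locale");
textdomain("greetings");

echo _("Hello World");
?>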
Now you should have a working script that outputs "Hallo Wereld". Basically that's all there's to it.
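
As for picking the language from the browser's Accept-Language header, a minimal sketch could look like the one below. The function name pick_locale and the $supported map are assumptions made for this illustration; they are not part of PHP or gettext, and the q-value handling is deliberately simple.

```php
<?php
// Hypothetical helper: choose a locale name from an Accept-Language header.
// $supported maps lowercase language tags to locale directory names (an assumption of this sketch).
function pick_locale($header, $supported, $default)
{
    $candidates = array();
    foreach (explode(',', $header) as $position => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '') {
            continue;
        }
        $q = 1.0; // a missing q-value means full preference
        for ($i = 1; $i < count($pieces); $i++) {
            $param = trim($pieces[$i]);
            if (strncasecmp($param, 'q=', 2) === 0) {
                $q = (float) substr($param, 2);
            }
        }
        if ($q > 0) { // q=0 means "not acceptable"
            $candidates[] = array($q, $position, $tag);
        }
    }
    // Highest q first; equal q-values keep their order in the header.
    usort($candidates, function ($a, $b) {
        if ($a[0] == $b[0]) {
            return $a[1] - $b[1];
        }
        return ($a[0] < $b[0]) ? 1 : -1;
    });
    foreach ($candidates as $candidate) {
        $tag = $candidate[2];
        if (isset($supported[$tag])) {
            return $supported[$tag];
        }
        // Fall back from a regional tag such as "nl-be" to its primary language "nl".
        $parts = explode('-', $tag);
        if (isset($supported[$parts[0]])) {
            return $supported[$parts[0]];
        }
    }
    return $default;
}

$supported = array('nl' => 'nl_NL', 'en' => 'en_US');
$header = isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) ? $_SERVER['HTTP_ACCEPT_LANGUAGE'] : '';
$locale = pick_locale($header, $supported, 'en_US');
```

The resulting $locale could then be passed to putenv() and setlocale() in place of the hard-coded "nl_NL".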

All the gettext utilities you need are installed on the FutureQuest servers, but what if you want to develop the script on your Windows PC?
First of all, enable 'gettext' in your Windows PHP installation. Open your php.ini file, which should be in \winnt or \windows; if it isn't there, copy php.ini-dist from the PHP directory to your main Windows directory and rename it to php.ini. Next, uncomment the line ";extension=php_gettext.dll" by removing the semicolon. Then set the extension_dir directive to wherever the PHP extensions are located (e.g. "f:\php\extensions\"). Finally, restart Apache; if there's no error, look at the output of phpinfo() to check that 'gettext' is now indeed enabled.
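
Besides reading through the phpinfo() output, a quick check from a script is possible too. This is only a sketch; the function name gettext_available is made up for the example, while extension_loaded() and function_exists() are standard PHP functions.

```php
<?php
// Hypothetical helper: report whether the gettext extension is usable in this PHP installation.
function gettext_available()
{
    return extension_loaded('gettext') && function_exists('bindtextdomain');
}

if (gettext_available()) {
    echo "gettext is enabled\n";
} else {
    echo "gettext is NOT enabled; check extension_dir and the extension line in php.ini\n";
}
```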

Secondly, you'll have to get a copy of the gettext utilities compiled for Win32:
http://sourceforge.net/projects/mingwrep/
http://home.a-city.de/franco.bez/get..._win32_en.html

That's it, you should now be ready to develop your PHP scripts with gettext under Windows.

Although the .po files can be edited in any plain text editor, some people have developed special editors for the job:
http://www.gtranslator.org/
http://poedit.sourceforge.net/
http://i18n.kde.org/tools/kbabel/
http://www.geocities.com/bilibao/
http://muli.sourceforge.net/

Most aren't for Windows, but poEdit is available as a Windows application. Vi/Vim (also available for Windows) isn't specifically intended for translating .po files, but does include syntax coloring for it, which can be handy.

So, that was a (hopefully useful) primer on the basics of using gettext with PHP. Here are some more URLs where people can get more information:
http://www.php.net/manual/en/ref.gettext.php
http://www.gnu.org/manual/gettext/ht...ttext_toc.html
http://www.php-er.com/chapters/Gettext_Functions.html

Arthur

02-24-2002, 03:37 PM   Postid: 62496
dank
Site Owner
Thanks for the detailed explanation, Arthur.

Quote:
Basically that's all there's to it.
Uhh... Is that all??? I'm scratching my head trying to figure out how the above method carries any advantages over simply using variables (or arrays) to store translated strings of text based on the selected language. Either way you have to do the translations manually (I thought Paul or Rich originally described gettext as doing the translations for you?), so no work is saved that I can see. Instead, you "gain" all the locale and directory complexities...

The only advantage I can see is not having to declare all the variables typically used for language files, but is there any difference in overhead between that and whatever method gettext uses to look things up?

Dan
- confused, as always
__________________
The artist formerly known as Dluded

02-24-2002, 05:18 PM   Postid: 62502
Arthur
Developer
There are some more advantages. For instance, when you change a text in your script, run xgettext again and merge the result with the previous translation (using msgmerge), the changed text is clearly marked as 'fuzzy'. That makes it easy to see which translations need updating. Also texts that have not yet been translated are easy to spot. And if a text hasn't been translated the original text is shown.
If you have a number of translations it makes it easier to manage them, in my opinion.

Also, it's more or less a standard tool. You have editors that make it easier for translators to translate the site. Translators can get one file that contains all strings that need to be translated.

And the files are compressed and indexed which helps the speed.

Sure, there are other ways, but I think the gettext method provides a good way to manage different translations.
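
To make the 'fuzzy' and untranslated markers concrete, here is a rough sketch that scans the contents of a .po file and lists both kinds of entries. The function name po_status and the sample entries are invented for the illustration, and the parsing is simplified: it assumes no plural forms or message contexts and does not unescape strings.

```php
<?php
// Hypothetical helper: list fuzzy and untranslated entries in .po file contents.
// Simplified sketch: ignores plural forms and msgctxt, and does not unescape strings.
function po_status($poText)
{
    $result = array('fuzzy' => array(), 'untranslated' => array());
    // Entries are separated by blank lines.
    $entries = preg_split('/\n\s*\n/', str_replace("\r\n", "\n", trim($poText)));
    foreach ($entries as $entry) {
        $isFuzzy = false;
        $msgid = '';
        $msgstr = '';
        $current = null; // field that continuation lines belong to
        foreach (explode("\n", $entry) as $line) {
            $line = trim($line);
            if (strncmp($line, '#,', 2) === 0) {
                // Flag comment, e.g. "#, fuzzy".
                if (strpos($line, 'fuzzy') !== false) {
                    $isFuzzy = true;
                }
            } elseif (preg_match('/^msgid\s+"(.*)"$/', $line, $m)) {
                $msgid = $m[1];
                $current = 'msgid';
            } elseif (preg_match('/^msgstr\s+"(.*)"$/', $line, $m)) {
                $msgstr = $m[1];
                $current = 'msgstr';
            } elseif ($current !== null && preg_match('/^"(.*)"$/', $line, $m)) {
                // Continuation line of a multi-line string.
                if ($current === 'msgid') {
                    $msgid .= $m[1];
                } else {
                    $msgstr .= $m[1];
                }
            }
        }
        if ($msgid === '') {
            continue; // header entry or comment-only block
        }
        if ($isFuzzy) {
            $result['fuzzy'][] = $msgid;
        } elseif ($msgstr === '') {
            $result['untranslated'][] = $msgid;
        }
    }
    return $result;
}

// Made-up sample data in the style of the greetings.po file.
$sample = <<<'PO'
#: hello.php:2
msgid "Hello World"
msgstr "Hallo Wereld"

#: hello.php:3
#, fuzzy
msgid "Goodbye, World"
msgstr "Tot ziens"

#: hello.php:4
msgid "Welcome"
msgstr ""
PO;

print_r(po_status($sample));
```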

02-24-2002, 08:10 PM   Postid: 62513
dank
Site Owner
That's a reasonable set of advantages, I suppose. Aside from the compression/indexing one, all can be automated in other ways, though. I store translations in a database that is accessed through a little program I put together, and it keeps track of what has or has not been translated for each language and uses the default text if no translation has been made. It also writes out the actual language file with the click of a button.

Quote:
And if a text hasn't been translated the original text is shown.
If I understand your first post correctly, don't you still have to go into each of the language files and add or modify the msgid/msgstr lines? That would seem to still require a lot of manual legwork, especially for something with a large number of languages. With my method, a variable only needs to be added to the database once. The only repetitive work is outputting each language file and saving it over the old version, which could also be automated if I ever wanted to bother.

Dan

02-25-2002, 03:06 AM   Postid: 62539
Arthur
Developer
Well, there you have another advantage; you don't have to write your own programs to automate and track things
Quote:
don't you still have to go into each of the language files and add or modify the msgid/msgstr lines?
You don't have to add or modify the msgid lines by hand, that's all done with xgettext and msgmerge. After you add or modify a translatable string in your script, you run xgettext again to extract all translatable strings and then you merge the new .po file with the old, already translated .po files. For new strings the msgstr line will be empty; for modified strings the entry will be marked fuzzy.
The merging with the various translations still has to be done by hand, but could be automated using a batch file or shell script.
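
As a rough idea of what such automation might look like, the sketch below only builds the msgmerge and msgfmt command lines for each language. It assumes that each translated .po file is kept next to its .mo file in 'locale/<language>/LC_MESSAGES/' (the steps in the first post only put the .mo file there), and the function name build_merge_commands is made up for the example.

```php
<?php
// Hypothetical helper: build the msgmerge/msgfmt command lines for every language.
// Assumes the layout locale/<language>/LC_MESSAGES/<domain>.po (an assumption of this sketch).
function build_merge_commands($localeDir, $domain, $newPo, $locales)
{
    $commands = array();
    foreach ($locales as $locale) {
        $base = rtrim($localeDir, '/') . '/' . $locale . '/LC_MESSAGES/' . $domain;
        // Merge the freshly extracted strings into the existing translation, in place.
        $commands[] = 'msgmerge --update ' . escapeshellarg($base . '.po')
            . ' ' . escapeshellarg($newPo);
        // Recompile the binary .mo file from the merged .po file.
        $commands[] = 'msgfmt -o ' . escapeshellarg($base . '.mo')
            . ' ' . escapeshellarg($base . '.po');
    }
    return $commands;
}

$commands = build_merge_commands('./locale', 'greetings', 'greetings.po', array('nl_NL', 'de_DE'));
foreach ($commands as $command) {
    echo $command, "\n";
    // To actually run a command: system($command, $exitCode);
}
```

Printing the commands first makes it easy to check them before anything touches the translation files.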

It sounds like your system works similarly to what you can do with gettext, but for people that don't have such a system set up already, I think using gettext is a good alternative.

Btw, the compression isn't really compression; all the text is just tightly packed together in a binary file.
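
To show what "tightly packed" means, here is a small sketch that reads the msgid/msgstr pairs straight out of the contents of a .mo file. It follows the file layout as described in the GNU gettext manual (a magic number, a string count, and two index tables of length/offset pairs); the function name mo_read is made up, and plural forms and contexts are not handled.

```php
<?php
// Hypothetical helper: read all msgid => msgstr pairs from the contents of a .mo file.
// Sketch only: no handling of plural forms or message contexts.
function mo_read($data)
{
    $magic = substr($data, 0, 4);
    if ($magic === "\xde\x12\x04\x95") {
        $u32 = 'V'; // little-endian file
    } elseif ($magic === "\x95\x04\x12\xde") {
        $u32 = 'N'; // big-endian file
    } else {
        return false; // not a .mo file
    }
    // After the magic number: revision, number of strings, offsets of the two index tables.
    $format = $u32 . 'revision/' . $u32 . 'count/' . $u32 . 'origTable/' . $u32 . 'transTable';
    $header = unpack($format, substr($data, 4, 16));
    $messages = array();
    for ($i = 0; $i < $header['count']; $i++) {
        // Each index entry is a (length, offset) pair of 32-bit integers.
        $orig = unpack($u32 . 'len/' . $u32 . 'off', substr($data, $header['origTable'] + 8 * $i, 8));
        $trans = unpack($u32 . 'len/' . $u32 . 'off', substr($data, $header['transTable'] + 8 * $i, 8));
        $messages[substr($data, $orig['off'], $orig['len'])] = substr($data, $trans['off'], $trans['len']);
    }
    return $messages;
}

$path = 'locale/nl_NL/LC_MESSAGES/greetings.mo';
if (is_file($path)) {
    print_r(mo_read(file_get_contents($path)));
}
```

The strings themselves are stored unmodified; only the index tables are binary, so lookups don't require parsing text.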

-- There's more than one way to do it

02-25-2002, 10:52 AM   Postid: 62565
jeep
Site Owner
I can see some of the benefits from automated tools like gettext, but still it seems awfully complicated. All these files and command-line tools.

The way I do it was inspired by the system in Java. Basically I have a file with variables like
PHP Code:
$text_profile_welcome = "welcome to your profile";
where the middle portion of the variable is the file name and the final part is a descriptor.

These are kept in a file called language_en.php or language_fr.php etc etc. Then I do an include at the top of each page
PHP Code:
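// Note: $language should be checked against a list of known languages before it is used in a path.
if (!$language) {
   include($abs_path."/language_en.php");
} else {
   include($abs_path."/language_".strtolower($language).".php");
}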

In my mind not only is that easier but it's going to be faster than reading text files etc etc

Jason
__________________
Swing Digital - online community consultancy

02-25-2002, 11:13 AM   Postid: 62566
dank
Site Owner
Jason, my method is pretty much the same as yours, and it works quite well as far as I can see. What I was describing above is how I maintain the language files. I agree that gettext seems overly complicated, but it's hard to envision a simpler "everything for everyone" method of accomplishing the same thing.

Quote:
Well, there you have another advantage; you don't have to write your own programs to automate and track things
True! I've actually thought about releasing the program I wrote, but I've never gotten around to putting together a demo...

Quote:
You don't have to add or modify the msgid lines by hand, that's all done with xgettext and msgmerge.
Phew, that's better than I was thinking. Still seems like you'd have to be very competent to carry out all the steps without goofing up on a semi-regular basis, but at least that's one major chunk of work it does for you.

Dan

02-25-2002, 12:20 PM   Postid: 62569
Arthur
Developer
Why is;
$text_profile_welcome = "welcome to your profile"
easier? You can end up with a huge number of variables. If you re-use a certain phrase you have to manually look through all the variables to see if it was already used.

If you add or remove a phrase, you'll have to go through all the translations and add or remove a variable with the risk of missing one. For a few translations it'll be okay, but if you want a large number of translations, then it becomes difficult to manage all the files. Then you have to write your own system to manage and track everything and then you basically have the same system that gettext offers...

I'm not saying it's the Holy Grail of i18n, but I like it and if anyone wants to give it a try - see the instructions above to get started.

02-26-2002, 01:19 AM   Postid: 62628
dank
Site Owner
Quote:
Why is;
$text_profile_welcome = "welcome to your profile"
easier?
Not necessarily easier, but definitely more intuitive, which in turn leads to relative ease for most people.

The large number of variables is the main drawback to that method that I mentioned previously. I don't know at what point that becomes a significant resource drain, but I would guess it takes a rather large number of variables.

The rest of your points are all very easily addressed by a script/database like what I use. (The more I think about it, the more I remember why I embarked on writing it myself... Maybe I should market this seemingly overly simple concept! )

Anyway, not to take away from your original intent which was admirable -- to outline in layman's terms how to use the new gettext tools.

Dan

02-26-2002, 07:00 PM   Postid: 62685
janderk
Visitor
Thanks for pointing out and explaining the gettext feature. I might need that in the future.

JD