Tuesday, June 10, 2008

Character Encoding Problem in Anusaaraka Urdu to Devanagari transliteration script

While test-driving the latest Firefox 3, I found that the Anusaaraka Urdu to Devanagari transliteration system was not working as expected. After looking around, I figured out that the page produced by the CGI script was being served with its encoding set to Western (ISO-8859-1) instead of Unicode (UTF-8). The developers should note that setting the correct charset in the HTML/XHTML meta elements is not enough, because a charset declared in the HTTP Content-Type header takes precedence over the meta element. The proper encoding has to be set in the HTTP headers returned by the web server. In this case, I guess the CGI script didn't do that, so the web server sent its default headers.
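
As a rough illustration of the fix, here is a minimal sketch of a CGI script that declares UTF-8 in the HTTP header itself. It is written in Python purely for illustration; I have not seen the Anusaaraka source, so the language, the function name build_cgi_response and the sample page are all assumptions, not the project's actual code.

```python
import sys


def build_cgi_response(html: str) -> bytes:
    """Return a full CGI response (header + body) that declares UTF-8.

    Hypothetical helper for illustration; not taken from Anusaaraka.
    """
    # The charset parameter here is what the browser trusts first.
    header = "Content-Type: text/html; charset=utf-8\r\n\r\n"
    # The body bytes must actually be UTF-8 to match the declaration.
    return header.encode("ascii") + html.encode("utf-8")


def main() -> None:
    # Sample page (assumed content). The meta element is kept as a
    # fallback, e.g. for when the page is saved to disk.
    page = (
        "<!DOCTYPE html>\n<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        "<title>Demo</title></head>"
        "<body><p>اردو = उर्दू</p></body></html>\n"
    )
    # Write raw bytes so the platform's default text encoding cannot
    # silently re-encode the output.
    sys.stdout.buffer.write(build_cgi_response(page))
    sys.stdout.buffer.flush()


if __name__ == "__main__":
    main()
```

The same effect can usually be had from the web server's configuration, but emitting the header from the script keeps the declaration next to the code that produces the bytes.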


Character encoding issues like this are often mistaken for missing fonts or, worse, for a lack of proper support in the web browser.
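
To see why the symptom looks like a font or browser problem, here is a small Python sketch (the function name and the sample word are my own, chosen only for illustration) that decodes UTF-8 bytes as ISO-8859-1, which is what the browser does when the header names the wrong charset.

```python
def misdecode_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as ISO-8859-1.

    Mimics a browser that was told the wrong charset (hypothetical helper).
    """
    raw = text.encode("utf-8")
    # ISO-8859-1 maps every byte to a character, so this never fails;
    # it just yields one wrong character per byte.
    return raw.decode("iso-8859-1")


original = "उर्दू"
garbled = misdecode_as_latin1(original)

# The garbled string is longer and contains no Devanagari at all.
print(len(original), len(garbled))

# Nothing was lost: reversing the wrong decoding recovers the text,
# which is why switching the browser's encoding by hand fixes the page.
recovered = garbled.encode("iso-8859-1").decode("utf-8")
print(recovered == original)
```

The bytes on the wire were fine all along; only the label was wrong, so no font or browser upgrade would have helped.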

[Screenshot: the page after changing the browser's character encoding to Unicode (UTF-8)]
