Sunday, January 17, 2010

Special Characters - notes and comments

The reason for this entry is a rather disturbing observation: you can actually get different results from one and the same script.
The scenario is a script that contains embedded "special" characters (a €, in this case) and is executed using either the CLI version of SQL*Plus (sqlplus.exe) or the Windowed version (sqlplusw.exe).

Code Pages

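It turns out that the two versions use different MS Windows code pages: the Windowed version, sqlplusw.exe, picks up the code page of MS Windows itself (the "ANSI" code page), while the CLI version uses the code page of the console (the "OEM" code page). As a result, I ended up with different binary values for the same character.
By default, the two differ on non-US installations of MS Windows (for example, 1252 for Windows and 850 for the console on a Western European system)!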
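
To make the effect concrete, here is a minimal sketch in Python (not SQL*Plus itself). It assumes a Western European setup with cp1252 as the Windows code page and cp850 as the console code page; your system may use others. The helper names `console_view` and `can_encode` are made up for this illustration.

```python
# Assumption: cp1252 is the Windows ("ANSI") code page and
# cp850 is the console ("OEM") code page, as on many Western European systems.
EURO = "\u20ac"  # the Euro sign

# The byte an editor using cp1252 writes into the script file for the Euro sign.
windows_bytes = EURO.encode("cp1252")


def console_view(raw, console_codepage="cp850"):
    """Interpret raw script bytes the way a client using the console code page would."""
    return raw.decode(console_codepage)


def can_encode(ch, codepage):
    """Return True if the code page has a byte value for this character."""
    try:
        ch.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


# The same byte means a different character under the console code page.
misread = console_view(windows_bytes)

print(windows_bytes.hex())         # byte value of the Euro sign in cp1252
print(ascii(misread))              # the character that byte stands for in cp850
print(can_encode(EURO, "cp850"))   # classic cp850 has no Euro sign at all
print(can_encode(EURO, "cp858"))   # cp858 is the cp850 variant that includes it
```

The point of the sketch: the file on disk never changes, only the code page used to interpret it does, and that is enough to end up with a different character in the database.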

Debugging

In the course of these blog entries, I have mentioned various tell-tale signs, but never collected them in a list. A troubleshooting list, so to say. Here's a start:
  • Remember: there's always at least one way to screw up. There's no recipe, no Rule-of-Thumb, or whatever to make sure you'll never end up with weird glyphs.
    Unless you control the whole, single-user, system, that is...
  • If you see a small diamond (usually with a question mark in it) in a web browser, start hitting your web designer/builder: the encoding the page declares does not match the bytes it actually contains.
    You can change the character encoding in most browsers as a workaround.
  • If you see inverted ("upside-down") question marks, your Oracle infrastructure is to blame: somewhere along the line a character set conversion takes place for which the target character set does not have the character defined (e.g. WE8ISO8859P1, i.e. ISO 8859-1, does not have a Euro sign).
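
Both symptoms can be reproduced outside the browser and outside Oracle. The following is a rough Python sketch that only mimics what you see; it is not how a browser or Oracle actually implements it. The function names `browser_view` and `lossy_convert` are invented for this example, and using "¿" as the replacement is an assumption based on the symptom described above.

```python
def browser_view(raw, declared_encoding):
    """Decode page bytes with the declared encoding; undecodable bytes
    become U+FFFD, the replacement character usually drawn as a diamond."""
    return raw.decode(declared_encoding, errors="replace")


def lossy_convert(text, target_charset, replacement="\u00bf"):
    """Convert text to target_charset, substituting an inverted question
    mark for every character the target character set cannot represent."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(target_charset)
        except UnicodeEncodeError:
            out += replacement.encode(target_charset)
    return bytes(out)


# Symptom 1: page saved as cp1252 but declared (and decoded) as UTF-8.
page_bytes = "10 \u20ac".encode("cp1252")
diamond = browser_view(page_bytes, "utf-8")

# Symptom 2: conversion to a character set without a Euro sign.
converted = lossy_convert("10 \u20ac", "iso8859-1")

print(ascii(diamond))                          # contains \ufffd instead of the Euro sign
print(ascii(converted.decode("iso8859-1")))    # contains \xbf, the inverted question mark
```

Note the difference between the two: the diamond is a display problem and the original bytes are still intact, whereas the inverted question mark means the original character is already lost in the converted data.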


[added Apr 2011] A rather good note on this is Doc ID 158577.1 on Metalink.
