Author: pwillener, Location: Japan, Posted: Mon Oct 22, 2007 8:00 am Post subject: CastleCops site encoding
I sometimes need to use Japanese Windows for certain tasks. Recently, during some idle time, I browsed the CastleCops site from Japanese Internet Explorer. I was a bit shocked to find the entire site sprinkled with garbage characters!
The cause was easy to find: the CC site does not specify an explicit encoding anywhere, so browsers fall back to their default encoding. In a Japanese browser that default is 'Shift_JIS', which causes characters next to special characters (period, comma, apostrophes, etc.) to turn into random garbage characters, or even false Japanese characters.
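To illustrate the effect described above, here is a minimal Python sketch. It is not taken from the site: the helper name `misread` and the sample strings are made up for illustration. It encodes text as iso-8859-1 and then decodes the same bytes as Shift_JIS, which is roughly what a browser does when it has to guess and guesses wrong.

```python
def misread(text, actual="iso-8859-1", assumed="shift_jis"):
    """Encode text with the encoding the page really uses, then decode
    the bytes with the encoding the browser assumed (hypothetical helper)."""
    raw = text.encode(actual)
    # errors="replace" keeps going when a byte sequence is invalid in the
    # assumed encoding, much like a browser showing a placeholder glyph.
    return raw.decode(assumed, errors="replace")


if __name__ == "__main__":
    # Made-up sample strings; the first imitates a navbar separator.
    for sample in ["Forums · Search", "préface", "plain ascii"]:
        print(repr(sample), "->", repr(misread(sample)))
```

Plain ASCII comes through unchanged, while the middle dot (byte 0xB7) turns into a half-width katakana character, and "é" (byte 0xE9) is taken as the first half of a two-byte Shift_JIS character, so it is displayed wrongly together with whatever follows it.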
Suggestion: CastleCops should specify its own encoding so that users with non-European language settings can see the original CC content correctly. Best choice: iso-8859-1 (Western European), or UTF-8 (Unicode).
Author: Paul, Posted: Tue Oct 23, 2007 1:01 am Post subject:
Interesting... this was set in the past, must have disappeared when we went to Squid. Thanks for the heads up, I'll check into this.
Author: Paul, Posted: Tue Oct 23, 2007 1:02 am Post subject:
I should note, it was set directly in the http header, and not in the content data.
Author: pwillener, Location: Japan, Posted: Tue Oct 23, 2007 2:12 am Post subject:
I wouldn't know how to do that; I think the "normal" way is to supply it with a META tag. Thanks for checking into it.
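For readers comparing the two mechanisms mentioned here (HTTP header versus META tag), the following is a rough Python sketch of how a client might pick the declared charset. The function names and the sample inputs are hypothetical, and the regular expression is a simplification, not a full HTML parser. It assumes the usual rule that a charset in the HTTP `Content-Type` header takes precedence over one declared in the page.

```python
import re
from email.message import Message

# Simplified pattern for <meta ... charset=...>; illustrative only.
META_CHARSET = re.compile(r"<meta[^>]+charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)


def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, None if absent


def charset_from_meta(html):
    """Return the charset declared in a META tag, or None."""
    match = META_CHARSET.search(html)
    return match.group(1).lower() if match else None


def effective_charset(content_type, html):
    """Header first, then META; None means the browser has to guess."""
    return charset_from_header(content_type) or charset_from_meta(html)


if __name__ == "__main__":
    page = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
    print(effective_charset("text/html; charset=ISO-8859-1", page))
    print(effective_charset("text/html", page))
    print(effective_charset("text/html", "<html></html>"))
```

The last case, where neither source declares a charset, is the situation described in the first post: the browser falls back to its own default.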
Author: Paul, Posted: Tue Oct 23, 2007 2:24 am Post subject:
Apache just took care of it, but with Squid in the middle, things are apparently different.
Author: ahoier, Location: USA, Posted: Tue Oct 23, 2007 5:10 pm Post subject:
I don't know if it's related or not, but I'm on an English OS (IE7, IE6, or Firefox) and noticed sometimes the "dots" that separate New Forum Posts, Send a PM, Your Topics, etc. up top in the navbar would turn up as Question Marks ...
Didn't see it as a big deal, and well, it was quite random (or probably because I used too many configurations around here; depending on what I'm doing...lol).
But yea, I can't remember off hand where it was displaying like that...whether it was Firefox 2.0.0.7 (It's showing the dots fine right now), Torpark/XeroBank Browser, or IE7, or IE6 (on campus).
Author: pwillener, Location: Japan, Posted: Wed Oct 24, 2007 1:35 am Post subject:
Can you check what your default encoding is (View | Encoding)? Does the question mark go away when you change your encoding to Western European or Unicode (UTF-8)? If yes, then this is related.
Author: Paul, Posted: Wed Oct 31, 2007 11:42 pm Post subject:
OK I've got it in the apache conf now.
Author: Paul, Posted: Wed Oct 31, 2007 11:42 pm Post subject:
Of particular note from the conf file:
#
# Specify a default charset for all pages sent out. This is
# always a good idea and opens the door for future internationalisation
# of your web site, should you ever want it. Specifying it as
# a default does little harm; as the standard dictates that a page
# is in iso-8859-1 (latin1) unless specified otherwise i.e. you
# are merely stating the obvious. There are also some security
# reasons in browsers, related to javascript and URL parsing
# which encourage you to always set a default char set.
#
Author: pwillener, Location: Japan, Posted: Thu Nov 01, 2007 8:03 am Post subject:
I have just checked it under Japanese Windows, and everything looks fine. The browser has received the default encoding iso-8859-1, and it displays perfectly.