Ruby 1.9.2 changes and i18n on Mac OS X
We recently noticed some pretty interesting changes in Ruby 1.9.2. It appears that "require" no longer searches the current directory (it was dropped from the load path), and Ruby is now more unicody. This means that if you are in a directory with two files, "foo.rb" and "bar.rb", you can no longer simply write "require 'foo'" inside bar.rb to use foo. Now you need either "require './foo'", which resolves against the directory you run Ruby from, or "require_relative 'foo'", which resolves against the directory containing bar.rb.
A potentially more difficult change is in how Ruby handles character encodings. For the most part, this isn't a problem inside "normal" code and strings, but things get dicey once you start reading text files off a filesystem. It gets especially dicey if you're on a Mac AND you work with western European data AND it involves money. If you save a file containing a currency symbol on a Mac using MacRoman, then read it on a machine (or with a tool) that assumes a different encoding, you will not see €. A tool that assumes Latin-1 shows a Û, and a tool that assumes UTF-8 sees a byte that isn't valid UTF-8 at all.
To cut to the chase: if you're developing software on a Mac, make sure you set your tools to use UTF-8, NOT MacRoman, or at some point you will be scratching your head. Why? As a quick example, MacRoman stores the Euro symbol as a single byte (0xDB), while UTF-8 stores it as three entirely different bytes, so a tool expecting one encoding misreads a file saved in the other. More importantly for international applications, MacRoman is a single-byte encoding with room for only 256 characters, so non-Latin scripts simply don't exist in it and you won't be able to properly edit files containing Asian or other non-Latin text.