PowerShell and Unicode
After being away from the Windows developer world for a few years, I have been pleased to find some of the nice things that Microsoft has given us. Visual Studio has some really nice refactoring capabilities. The Windows 7 user experience rivals OS X. And as an alternative to the venerable cmd.exe, we now have a much better command-line shell: PowerShell.
What I like most about PowerShell is that it feels more like a UNIX shell. It supports a lot of UNIXy commands (ls, echo, cat). It lets you use either forward slashes or backslashes in paths. This is good for someone like me who can never remember which OS I'm using when I start typing a command.
But of course, Microsoft can't give us something new without throwing in some surprisingly inappropriate behavior. A couple of days ago, I needed to create a patch for a Subversion repository, so I typed the usual command (which works fine in UNIX shells and with cmd.exe):
svn diff > my_patch.diff
I then looked at my patch to verify that it looked good:
cat my_patch.diff | more
Everything looked fine. However, when I later tried to apply the patch to another Subversion workspace:
patch -p0 -i my_patch.diff
I got errors. I opened up my_patch.diff in Vim, and realized it was a UTF-16-encoded file.
Neither svn nor patch knows how to deal with UTF-16-encoded files. How did this happen?
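If you don't have Vim handy, a quick way to confirm this kind of problem is to look at the first few bytes of the file for a byte-order mark. Here is a minimal Python sketch of that check. The function name sniff_bom and the sample bytes are my own illustration, not part of svn or patch, and a file without a BOM could still be in some other encoding.

```python
import codecs


def sniff_bom(data: bytes) -> str:
    """Return a rough encoding guess based on a leading byte-order mark."""
    # Check UTF-32 before UTF-16: the UTF-32 LE BOM begins with the
    # same two bytes as the UTF-16 LE BOM.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return "no BOM (probably ASCII or UTF-8)"


# Hypothetical sample: the first line of a diff, written as
# little-endian UTF-16 with a BOM in front of it.
utf16_sample = codecs.BOM_UTF16_LE + "Index: foo.c\r\n".encode("utf-16-le")
ascii_sample = b"Index: foo.c\n"

print(sniff_bom(utf16_sample))
print(sniff_bom(ascii_sample))
```

To check a real file, read it in binary mode and pass in the first few bytes, e.g. sniff_bom(open("my_patch.diff", "rb").read(4)).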
After wasting an hour trying various svn command-line options and diff utilities, I finally stumbled onto the answer. It turns out that, in PowerShell, svn diff > my_patch.diff is equivalent to this command:
svn diff | out-file my_patch.diff
and (get this) the out-file cmdlet encodes its output as Unicode by default, which in PowerShell's terminology means UTF-16.
This default behavior makes sense for out-file, but it is counter-intuitive that the > redirection operator would take ASCII and convert it to Unicode.
To make PowerShell do the right thing, you have to do this:
svn diff | out-file -encoding ascii my_patch.diff
Grrr.
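If you have already written a patch with plain > and don't want to regenerate it, the file can be converted after the fact. Here is a minimal Python sketch, assuming the file starts with a UTF-16 byte-order mark and that the diff itself contains only ASCII text. The helper names and the output file name my_patch_fixed.diff are hypothetical.

```python
def utf16_to_plain(data: bytes) -> bytes:
    """Re-encode UTF-16 bytes (with a BOM) as UTF-8 without a BOM."""
    # The "utf-16" codec reads the BOM to pick the byte order
    # and drops it from the decoded text.
    text = data.decode("utf-16")
    # For a diff that contains only ASCII characters, UTF-8 output
    # is byte-for-byte identical to ASCII.
    return text.encode("utf-8")


def repair_patch(src_path: str, dst_path: str) -> None:
    """Read a UTF-16 patch file and write a plain copy next to it."""
    # Binary mode on both sides so Python does not touch line endings.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(utf16_to_plain(data))
```

Usage would be something like repair_patch("my_patch.diff", "my_patch_fixed.diff"). Line endings are left exactly as they were in the original file, so if patch complains about carriage returns, that needs to be handled separately.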