Deep in the TextMate bundle is escape.rb, containing this utility:
# URL escape a string but preserve slashes (idea being we have a file system path that we want to use with file://)
def e_url(str)
  str.gsub(/([^a-zA-Z0-9\/_.-]+)/n) do
    '%' + $1.unpack('H2' * $1.size).join('%').upcase
  end
end
The problem is that the "n" modifier on the match requires that this string (str) be ASCII-8BIT encoding. But on the Mac a file system path is UTF-8. Thus it is possible that this method will be sent a UTF-8 string and will choke on it. For example:
e_url("/Users/matt/Desktop/höwdy.txt")
[That's "howdy" with an umlaut over the o, in case it doesn't come across in your email.]
Now, it is legal to make a file:// URL that points to this file. But e_url is failing to create such a URL and we get a warning.
I suggest that this utility may need to be revised for use with Mavericks and Ruby 2.x.
m.
-- matt neuburg, phd = matt@tidbits.com, http://www.apeth.net/matt/ pantes anthropoi tou eidenai oregontai phusei Programming iOS 7! http://shop.oreilly.com/product/0636920031017.do iOS 7 Fundamentals! http://shop.oreilly.com/product/0636920032465.do RubyFrontier! http://www.apeth.com/RubyFrontierDocs/default.html TidBITS, Mac news and reviews since 1990, http://www.tidbits.com
On 10 Jan 2014, at 9:21, Matt Neuburg wrote:
[…] The problem is that the "n" modifier on the match requires that this string (str) be ASCII-8BIT encoding. But on the Mac a file system path is UTF-8. Thus it is possible that this method will be sent a UTF-8 string and will choke on it.
The reason we coerce the 8-bit ASCII encoding onto the source string is to look at the “raw” bytes.
That way, your file name should be encoded as “h%C3%B6wdy.txt”.
I suggest that this utility may need to be revised for use with Mavericks and Ruby 2.x.
If anyone knows of a way to achieve the same with Ruby 2 (that is backwards compatible), I’ll gladly update the code.
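One possible way to keep the "raw bytes" approach working on both old and new Ruby is sketched below. It assumes that checking for String#force_encoding is an acceptable stand-in for a Ruby version check. The name e_url_compat is hypothetical and not part of escape.rb.

```ruby
# Hypothetical variant of e_url; a sketch, not the code shipped in escape.rb.
def e_url_compat(str)
  bytes = str.dup
  # Ruby 1.9+ strings carry an encoding; relabel the copy as binary so the
  # /n regex sees individual bytes. Ruby 1.8 has no force_encoding and its
  # strings are already plain byte sequences.
  bytes.force_encoding('ASCII-8BIT') if bytes.respond_to?(:force_encoding)
  bytes.gsub(/([^a-zA-Z0-9\/_.-]+)/n) do
    # $1 is a run of unsafe bytes; emit each one as %XX.
    '%' + $1.unpack('H2' * $1.size).join('%').upcase
  end
end

puts e_url_compat("/Users/matt/Desktop/höwdy.txt")
```

Working on a copy leaves the caller's string and its encoding untouched.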
On Jan 9, 2014, at 10:35 PM, Allan Odgaard mailinglist@textmate.org wrote:
If anyone knows of a way to achieve the same with Ruby 2 (that is backwards compatible), I’ll gladly update the code.
It doesn't need to be backwards compatible, since your code can always check the Ruby version... :)
On my machine, simply calling URI.escape() does the job correctly. So you could either do what URI.escape() does yourself, by calling each_byte or unpack("C*") and cycling through the "characters" (Ruby is open source, so it is easy to see how it works), or you could just call URI.escape() directly.
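A minimal sketch of the byte-cycling idea, using unpack("C*") and keeping the same safe character set as e_url. The name e_url_bytes is hypothetical, and this is not how URI.escape itself is implemented. Since URI.escape was later deprecated and then removed in Ruby 3.0, a hand-rolled version along these lines may turn out to be the more durable option.

```ruby
# Hypothetical byte-wise variant of e_url; not part of escape.rb.
def e_url_bytes(str)
  str.unpack('C*').map do |byte|
    char = byte.chr
    # Keep ASCII letters, digits, slash, underscore, dot and hyphen as-is;
    # percent-encode every other byte, including each byte of a multi-byte
    # UTF-8 character.
    if byte < 128 && char =~ /[a-zA-Z0-9\/_.-]/
      char
    else
      format('%%%02X', byte)
    end
  end.join
end

puts e_url_bytes("/Users/matt/Desktop/höwdy.txt")
```

Because it never runs a regex over the whole string, this version does not depend on the string's encoding label at all.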
m.
On 11 Jan 2014, at 2:32, Matt Neuburg wrote:
If anyone knows of a way to achieve the same with Ruby 2 (that is backwards compatible), I’ll gladly update the code.
It doesn't need to be backwards compatible, since your code can always check the Ruby version... :)
Yeah, that’s lovely, especially coupled with some of the monkey patching that was done to make TM support stuff work on Ruby 1.9 ;)
On my machine, simply calling URI.escape() does the job correctly. So you could either do what URI.escape() does […]
So I guess we can just change it to be a wrapper for URI.escape. That also means you should be able to call URI.escape directly if you wish to write a TM command that runs on Ruby 2.0.