Trivial AJAX, encoding gotchas: comment previews
Implementing inline comment previews is trickier than it seems, however trivial the AJAX involved. I took a look at a couple of familiar blogs and saw that getting it right is relatively difficult.
Everybody's beloved redhanded (powered by hobix) can't swallow non-ASCII text.

Typo seems to do the Right Thing, but I found at least one browser broken enough to mangle the preview (Konqueror 3.4.3; I don't know whether it's fixed in newer versions).

Problems
Implementing an inline preview pane sounds as easy as it gets. After all, all you have to do is:
- take the contents from a textarea and escape them appropriately
- send them to the server with an XMLHttpRequest and await the response with the formatted version (just some HTML)
- update the innerHTML property of some hidden DIV you had somewhere in the page and show it
Only three things to do, little room for errors, right? Well, it turns out there are also at least three potential problems when you're using GET for the request:
- how to escape properly (client-side, Javascript)
- returning a correct response (server-side, your-lang-of-choice)
- having the preview displayed as it should
Escaping
Hobix uses escape, which doesn't work with non-ASCII text. In my tests, it sent accented letters as escaped ISO-8859-1 bytes (e.g. %E1 for á) and anything outside that range (Kanji, etc.) as %uCODEPOINT, which is not standard percent-encoding.
You want encodeURIComponent, which always yields percent-escaped UTF-8 no matter which charset the page uses, since Javascript strings are Unicode internally.
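To see why this matters on the receiving end, here is a minimal Ruby sketch of what a server gets when it decodes each form. The helper name decode_param is hypothetical, and the sample values are the escaped forms of "á" (U+00E1) and "こ" (U+3053) described above; it assumes a modern Ruby where URI.decode_www_form_component is available.

```ruby
require 'uri'

# Hypothetical helper: decode one percent-escaped query value and return it
# only if the result is well-formed UTF-8; return nil otherwise.
def decode_param(raw)
  text = URI.decode_www_form_component(raw, Encoding::UTF_8)
  text.valid_encoding? ? text : nil
rescue ArgumentError
  # Raised for input that is not valid percent-encoding, such as "%u3053".
  nil
end

# encodeURIComponent output for "áこ": escaped UTF-8, decodes to the text.
p decode_param("%C3%A1%E3%81%93")

# escape() output for "á": a lone ISO-8859-1 byte, not valid UTF-8, so nil.
p decode_param("%E1")

# escape() output for "こ": the %uXXXX form is rejected outright, so nil.
p decode_param("%u3053")
```

Only the first form survives decoding as UTF-8 text; the other two need guesswork or special-casing on the server.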
Server response
The only gotcha is remembering to set the charset in the HTTP response. Here's what my hiki plugin looks like:
def inline_preview
  msg = @cgi.params['msg']
  msg = (msg && msg[0]) ? comment_sanitize(msg[0]) : ""
  parser = @conf.parser::new( @conf )
  tokens = parser.parse(msg)
  formatter = @conf.formatter::new( tokens, @db, @plugin, @conf )
  body = formatter.to_s
  # workaround for older browsers
  body = body.unpack("U*").map{|x| x < 128 ? x.chr : "&##{x};"}.join
  header = Hash::new
  header['type'] = 'text/plain'
  header['charset'] = "UTF-8"
  header['Content-Language'] = @conf.lang
  header['Pragma'] = 'no-cache'
  header['Cache-Control'] = 'no-cache'
  print @cgi.header(header)
  puts body
  nil # Don't move to the 'FrontPage'
end
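For readers not using Ruby's CGI library, the following sketch shows roughly what those header settings amount to on the wire. It is an illustration only: preview_response and its lang parameter are hypothetical names, and the header set simply mirrors the one in the plugin above.

```ruby
# Hypothetical helper: build a raw CGI-style response for the preview.
# The important part is the charset parameter on Content-Type.
def preview_response(body, lang = "en")
  headers = {
    "Content-Type"     => "text/plain; charset=UTF-8",
    "Content-Language" => lang,
    "Pragma"           => "no-cache",
    "Cache-Control"    => "no-cache"
  }
  head = headers.map { |name, value| "#{name}: #{value}\r\n" }.join
  # A blank line separates the headers from the body.
  head + "\r\n" + body
end

print preview_response("<p>preview</p>\n")
```

Without the charset parameter, the browser has to guess how to decode the response bytes.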
Buggy browsers
In theory, you should be able to stuff the text from the response into some DIV and the browser would display it correctly regardless of the charset/encoding used. In practice, it's safer to use entities for non-ASCII chars. This is what
body = body.unpack("U*").map{|x| x < 128 ? x.chr : "&##{x};"}.join
does.
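As a small worked example, the same expression can be wrapped in a method and tried on a mixed string. The method name to_entities is hypothetical; the body is the one-liner above, unchanged apart from variable names.

```ruby
# Hypothetical wrapper: replace every non-ASCII code point in a UTF-8 string
# with a decimal numeric character reference, leaving ASCII untouched.
def to_entities(utf8_text)
  utf8_text.unpack("U*").map { |cp| cp < 128 ? cp.chr : "&##{cp};" }.join
end

puts to_entities("<b>caf\u00E9</b> \u3053")
# prints: <b>caf&#233;</b> &#12371;
```

ASCII markup such as the b tags passes through untouched, so the formatted HTML still works and the response body ends up pure ASCII.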
Typo support - Kevin Ballard (2006-08-09 (Wed) 01:41:37)
I don't have access to Konqueror to test, so if you think the Typo preview issue you saw might actually be a problem in Typo itself, I'd love to hear it. You can email me at <kevin@sb.org>
yep that was it. - _why (2006-08-07 (Mon) 10:29:41)
Okay, thanks for the tip on encodeURIComponent. That snapped right in. For the entity support, I think that probably needs to go in RedCloth. Gotta think about it.
Benjamin 2006-08-16 (Wed) 03:06:40
That non-ASCII text can definitely get you. You're right about that completely. It seems OK in Typo from my checks also. Japanese is looking OK to go. The Rubricks guys are developing in Japanese also. - ben @ http://rubyonrailsblog.com/
testing - mfp (2006-08-06 (Sun) 06:42:48)
- ASCII <>!&
- stuff: áéíóúñàäöüêâ߀
- Japanese これが簡単なテストのはず
OK