[SVN] Language Grammar Using Significant Whitespace, Including Another (sans indentation)

Gavin Kistner gkistner at nvidia.com
Tue Feb 23 00:09:05 UTC 2010


Summary:
Is it possible to write a grammar for Haml that provides proper syntax highlighting on Markdown filters?

Details:
The Haml markup language[1] uses significant whitespace indentation to produce hierarchical X(HT)ML. For example, this Haml markup...

  %html
    %head
      %title Hello World
    %body
      %h1 Hello World
      %div#content
        Oh,
        it's on now!
      %div#footer Copyright 2010.

...produces HTML like the following (modulo my tersified presentation)...
  <html>
  <head><title>Hello World</title></head>
  <body>
    <h1>Hello World</h1>
    <div id="content">Oh, it's on now!</div>
    <div id="footer">Copyright 2010.</div>
  </body></html>

Because Haml is focused on _structure_, but HTML is needed for both structure and _presentation_, Haml lets you include other well-known markup inline. For example:
    %body
      %h1 Hello World
      %div#content
        :markdown
          Oh, it's **on** now!

          * Because I like lists.
          * And so do you.
      %div#footer Copyright 2010.
...will run the content indented under :markdown through a Markdown[2] parser and inject the result as a child of the content element.

If you didn't know, bundles already exist for both Haml (under "Ruby Haml") and Markdown. However, the Haml bundle does not recognize the :markdown filter specifically, nor does it hand the filter's content to the Markdown grammar.

I started to try to do this. Here's a pattern I injected semi-randomly into the Haml grammar (in the JSON notation presented by the Bundle editor for e-texteditor):
      {
         "begin" : "^(\\s*):markdown\\b",
         "end"   : "^(?!\\1)|^\\1\\S",
         "patterns" : [
            {
               "include" : "text.html.markdown"
            }
         ]
      },

This naive first attempt showed two problems.

First (and less important), I couldn't get it to stop when I needed it to. It worked on the following text (caused the 'bar' line not to be treated as Markdown):
    %foo
      :markdown
        Hello
        World

    %bar
...but when I removed the blank line between World and %bar, the %bar line was treated as Markdown. I'm not sure why: that line is indented less than ":markdown", so it does not begin with the captured whitespace and should match the first alternative of the 'end' pattern, ^(?!\1).
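
One way to check the 'end' pattern in isolation is to run both regular expressions outside the editor. The Python below is only a sketch under stated assumptions: Python's re module is not the regex engine the editor uses, and the helper end_regex_for (a hypothetical name) simply substitutes the whitespace captured by 'begin' for the \1 back-reference, which is an assumption about how the editor resolves it.

```python
import re

# The 'begin' pattern from the grammar fragment, unescaped from JSON.
BEGIN = re.compile(r"^(\s*):markdown\b")

def end_regex_for(indent):
    # Hypothetical helper: build the 'end' pattern with the whitespace
    # captured by 'begin' substituted for the \1 back-reference.
    ind = re.escape(indent)
    return re.compile(r"^(?!" + ind + r")|^" + ind + r"\S")

# The failing case: no blank line between "World" and "%bar".
lines = [
    "    %foo",
    "      :markdown",
    "        Hello",
    "        World",
    "    %bar",
]

indent = BEGIN.match(lines[1]).group(1)
end = end_regex_for(indent)

# For each line after ":markdown", does the 'end' pattern match it?
results = [(line, bool(end.search(line))) for line in lines[2:]]
for line, ended in results:
    print(repr(line), "ends block" if ended else "still inside")

# A blank line also satisfies the first alternative, ^(?!\1).
blank_line_ends = bool(end.search(""))
```

Taken by themselves, the regular expressions do end the block on the %bar line, and also on a blank line, so the blank line in the working example may be what actually closed the block. That suggests the failure comes from how the grammar engine applies the patterns rather than from the patterns as written. One possibility, which this sketch cannot confirm, is that a multi-line rule from the included Markdown grammar is still open when the %bar line is reached.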

The second (far less workable) problem is that for those lines that were being treated as Markdown, the full lines--including all the extra leading whitespace required by Haml--were being processed. In Markdown, that much leading whitespace causes all the text to be treated (and syntax-highlighted) as a preformatted code block; clearly not the intent.


So:
a) Can you write a grammar pattern that says "Find everything indented more than this much (as captured by begin) and pass it off to this other grammar", but
b) Strip off that leading whitespace before passing it to the other grammar?

It would be non-ideal but still useful in this case to even perform this whitespace-stripping on a line-by-line basis and pass each stripped line to the other grammar.
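
To make the desired result of (a) and (b) concrete, here is a minimal Python sketch of the behaviour outside any grammar engine. It only illustrates what should reach the Markdown grammar and says nothing about whether a grammar can express it. The function name extract_filter_body is hypothetical, and the sketch assumes indentation uses spaces only.

```python
import re

def extract_filter_body(lines, filter_name="markdown"):
    """Return the lines nested under the first ':<filter_name>' line,
    with the indentation required by Haml stripped off."""
    begin = re.compile(r"^(\s*):" + re.escape(filter_name) + r"\b")
    start = next((i for i, l in enumerate(lines) if begin.match(l)), None)
    if start is None:
        return []
    # Width of the whitespace in front of ':markdown' (the 'begin' capture).
    outer = len(begin.match(lines[start]).group(1))

    body = []
    for line in lines[start + 1:]:
        if line.strip() == "":
            body.append("")            # blank lines stay inside the block
        elif len(line) - len(line.lstrip()) > outer:
            body.append(line)          # indented deeper than ':markdown'
        else:
            break                      # same or shallower indent ends it
    while body and body[-1] == "":
        body.pop()                     # drop trailing blank lines

    # (b) Strip the shared leading whitespace before handing the text on.
    width = min((len(l) - len(l.lstrip()) for l in body if l), default=0)
    return [l[width:] for l in body]

sample = [
    "%body",
    "  %h1 Hello World",
    "  %div#content",
    "    :markdown",
    "      Oh, it's **on** now!",
    "",
    "      * Because I like lists.",
    "      * And so do you.",
    "  %div#footer Copyright 2010.",
]

body = extract_filter_body(sample)
print("\n".join(body))
```

With the leading whitespace removed, the first line is an ordinary paragraph and the starred lines are list items, which is what the Markdown grammar would need to see in order to highlight them as such.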

Someone take up the torch. This yak is too hairy for me. :)



[1] http://haml-lang.com/
[2] http://daringfireball.net/projects/markdown/

http://groups.google.com/group/haml/browse_frm/thread/2a67aa977a1987a4



More information about the textmate-dev mailing list