[TxMt] Language grammar generator for XML

Édouard Gilbert edouard.gilbert at gmail.com
Sat Mar 7 15:01:47 UTC 2009


Hi list,

I've recently been working on an XSLT stylesheet that converts Relax NG
schemas to TM language grammars.  I did it mainly as an exercise, so I
didn’t look very hard for anything similar.  Because Relax NG is XML
and can easily be generated from a DTD or an XML Schema using trang,
it seemed like a good choice.

I’d like to read your comments, especially about the style of the
generated grammars, which needs much improvement.

How it works:
1) put a file.rng (in XML syntax) in the Schema directory
2) from the root directory, execute the shell script
   ./rng2txmt.sh Schema/file.rng
3) the grammar is generated as
   "Generated Language Grammars/file.plist" (along with file.plist.xml)

If this doesn’t work, please read the known issues below; it might be
a namespace problem.

What it tries to do:
* look for and mark invalid tags or attributes under or in a given tag
* avoid creating empty repository entries
* give attributes a tag-aware scope (the aim is to generate
  auto-completion lists alongside the grammar)

What I would like it to do:
* have basic namespace support
* have a current-tag-aware (not any-ancestor-aware) scope for
  auto-completion of tags,
  e.g. a scope which only matches the dots in
  <a>....<b>   <c/>    </b>....<c/>....</a>
  Not so long ago, I would have said it's impossible, but now that I’ve
  slightly improved my TM grammar-fu, I’m pretty sure it is achievable
  and may not even be that hard.  Matching > and /> to open, and
  looking ahead for <b and <c to close, perhaps.
* actually generate the completion lists.  This shouldn’t be too hard.
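
To make the "current-tag-aware" idea concrete, here is a rough sketch
in Python.  It is not part of the stylesheet and not a TM grammar: it
only emulates the nesting with a stack, and it uses a well-formed
variant of the example above.  The names TOKEN and text_runs_by_parent
are made up for the illustration, and comments, CDATA sections and
processing instructions are ignored.

```python
import re

# One token is either a tag (<name ...>, </name>, <name .../>) or a run of text.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>|([^<]+)")


def text_runs_by_parent(xml):
    """Return (parent_tag, text) pairs, one per run of character data."""
    stack = []  # currently open tags, innermost last
    runs = []
    for match in TOKEN.finditer(xml):
        closing, name, self_closing, text = match.groups()
        if text is not None:
            # Text belongs to the innermost open tag only, not to every ancestor.
            runs.append((stack[-1] if stack else None, text))
        elif closing:
            if stack and stack[-1] == name:
                stack.pop()
        elif not self_closing:
            stack.append(name)
    return runs


sample = "<a>....<b>   <c/>    </b>....<c/>....</a>"
in_a = [text for parent, text in text_runs_by_parent(sample) if parent == "a"]
print(in_a)  # ['....', '....', '....']
```

Only the dots are reported for <a>; the spaces inside <b> are attributed
to <b>.  A grammar rule would have to get the same effect with begin/end
patterns rather than an explicit stack.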

Known issues:
* If I’m right, TM grammars work on a "first matching rule is chosen"
  basis, which is incompatible with Relax NG's main advantage:
  non-determinism.  Thus I think some Relax NG schemas may never be
  parsed correctly.  DTD and XML Schema need to be deterministic,
  however, so the issue is not that important.  I think this is the
  problem with the generated relaxng grammar.
* /!\ Because XML namespaces are a mess and I didn’t bother dealing
  with them in my stylesheet, one needs to remove any mention of the
  default namespace from the rng file.  Otherwise the stylesheet won’t
  generate anything.
* It currently doesn’t deal with anyName, exceptions, exclusive
  choice or any other such RNG construct.
* No auto-indentation of the generated plist.  Who cares, anyway, TM  
cleans it up for you.
* A whole lot of useless scopes, mainly there for debugging.
* Whitespace management inside tags is inconsistent.
* The code is ugly.
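
To illustrate the first known issue, here is a toy model in Python of
"first matching rule is chosen" versus trying every alternative.  It
is only a sketch under simplifying assumptions: the content model
(a, b) | (a, c) is hypothetical, and this is not how TextMate's engine
or a Relax NG validator is actually implemented.

```python
def first_match(tokens, alternatives):
    """Commit to the first alternative that starts like the input (no backtracking)."""
    for alt in alternatives:
        if tokens[:1] == alt[:1]:
            return tokens == alt  # committed: later alternatives are never tried
    return False


def with_backtracking(tokens, alternatives):
    """Try every alternative, as a non-deterministic validator effectively does."""
    return any(tokens == alt for alt in alternatives)


# Hypothetical content model: (a, b) | (a, c)
content_model = [["a", "b"], ["a", "c"]]
children = ["a", "c"]

print(first_match(children, content_model))        # False
print(with_backtracking(children, content_model))  # True
```

The sequence a, c is valid, but the first-match strategy commits to
(a, b) as soon as it sees a and rejects it.  A deterministic schema
never has two alternatives starting with the same element, which is
why DTD and XML Schema sources should be unaffected.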

By the way, I’ve used some excerpts from the default XML grammar.  I
hope its author doesn't mind.  Is it Brian Lalor or Allan Odgaard?

Thanks,
Édouard

-------------- next part --------------
A non-text attachment was scrubbed...
Name: rng2txmt.tar.bz2
Type: application/bzip2
Size: 13638 bytes
Desc: not available
URL: <http://lists.macromates.com/textmate/attachments/20090307/a18cc722/attachment.bz2>
-------------- next part --------------



