[SVN] Revision 784 (HTML)

Mats Persson mats at imediatec.co.uk
Mon May 2 10:44:22 UTC 2005

On 2 May 2005, at 09:43, Jeroen van der Ham wrote:
> On 01-05-2005 13:06, Mats Persson wrote:
>> I may have misunderstood the value of the firstLineMatch thing,  
>> but I don't think it will work within the scope of HTML as well as  
>> it will  do in .sh files.
> Yes it will work, perfectly, as long as people write proper HTML...
> A proper HTML page has at least a line like the following as its
> first line:
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
>         "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
> Or even better:
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
>         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

I was unsure whether the regex in firstLineMatch can be sophisticated
enough to handle all the ways an HTML/XHTML file may start, compared
with, for instance, a bash .sh script.
Here are some points to consider:

1.    A .sh file HAS TO (as far as I know) start with a shebang.

2.    Valid HTML should start with <!DOCTYPE...>, but it can be, and
often is, left out.

3.    Valid XHTML should start with an XML declaration, <?xml
version="1.0" encoding="utf-8"?>, but since Win IE has problems with
it, it is left out most of the time.

4.    In my normal work I deal with smaller snippets of (X)HTML that
have no <!DOCTYPE at all, and there firstLineMatch might cause more
problems than it's worth.
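
To make the concern concrete, here is a rough Python sketch of a
first-line check along the lines of points 2 to 4. Python's re module
only stands in for TextMate's own regex engine, and both the pattern
and the function name are assumptions made up for illustration, not
the actual firstLineMatch of the HTML bundle.

```python
import re

# Hypothetical first-line pattern (an assumption, not the bundle's real one):
# accept an XML declaration or an HTML/XHTML DOCTYPE, ignoring case and
# leading whitespace.
FIRST_LINE = re.compile(r'^\s*<(?:\?xml\b|!DOCTYPE\s+html\b)', re.IGNORECASE)


def first_line_matches(text: str) -> bool:
    """Return True if the first line of text looks like an (X)HTML start."""
    first_line = text.split("\n", 1)[0]
    return FIRST_LINE.search(first_line) is not None
```

A pattern like this accepts the two DOCTYPE examples quoted earlier and
an XML declaration, but it says nothing about a page that starts
directly with <html> or about a bare snippet, and the <?xml branch
would also match any other XML file.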

I don't know, and as I said I may have misunderstood the full
meaning and value of firstLineMatch; perhaps someone could set
me right if I'm wrong.

> So there are the ingredients for distinguishing between different  
> types of HTML files on the first lines.
> It might have to be extended to grab the first two, but the first  
> one should suffice already.

I am toying with the idea of a normal regexp-style structure that
checks for specific DOCTYPEs and then includes the "full" syntax
accordingly, much as we do today with PHP, CSS & JS, so that
structurally we would have something like this:

     |-> if DocType === XHTML 1.1
     |        # includes XHTML 1.1 specific syntax in separate file
     |-> if DocType === XHTML 1.0
     |        # includes XHTML 1.0 specific syntax in separate file
     |-> if DocType === HTML 4
     |        # includes HTML 4 specific syntax in separate file
     |-> if DocType === ??? or missing
     |        # includes basic HTML syntax

The above is just an idea that may or may not work or prove useful,
and it is too complicated for me to get my head around right now.
I actually think that (X)HTML is far more complex in structure and
possible variations than, say, CSS, JS or PHP.
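
For what it's worth, the DocType dispatch idea could be sketched
roughly like this in Python. The grammar names are hypothetical
placeholders, and the 512-character window is an arbitrary assumption
chosen so that a DOCTYPE spread over two lines is still seen. Nothing
here reflects how TextMate actually includes syntaxes.

```python
import re

# Hypothetical grammar names; real bundle files may be named differently.
DOCTYPE_GRAMMARS = [
    (re.compile(r'-//W3C//DTD XHTML 1\.1//', re.IGNORECASE), "xhtml-1.1"),
    (re.compile(r'-//W3C//DTD XHTML 1\.0 ', re.IGNORECASE), "xhtml-1.0"),
    (re.compile(r'-//W3C//DTD HTML 4\.', re.IGNORECASE), "html-4"),
]
FALLBACK = "html-basic"  # used when the DOCTYPE is unknown or missing


def pick_grammar(text: str) -> str:
    """Pick a grammar name from the DOCTYPE near the start of text."""
    head = text[:512]  # assumption: the DOCTYPE sits near the top
    if "<!doctype" not in head.lower():
        return FALLBACK
    for pattern, name in DOCTYPE_GRAMMARS:
        if pattern.search(head):
            return name
    return FALLBACK
```

Snippets without any DOCTYPE simply fall through to the basic syntax,
which is the case I run into most in my own work.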

On top of all this, at some point we have to be prepared to easily
enable code completion based on context. There's lots of work to do
on the language over the coming months.

Kind regards,


"TextMate, coding with an incredible sense of joy and ease"
- www.macromates.com -
