[TxMt] Re: Python bundle syntax highlighting bug

Fabry Nicholas F. nick at superb-sublime.com
Wed Aug 15 16:39:00 UTC 2007


On Aug 15, 2007, at 08:00, Alex Ross wrote:

> I agree that prefixing all re's is not ideal.
>
> So, we have five options:
>
> 1. Match all raw strings unambiguously as regular expressions.  We  
> will sometimes have false-positives.
>
> 2. Match raw strings that are arguments to methods from the re  
> module.  We will sometimes not match raw strings that are regular  
> expressions, but can be pretty well guaranteed to never have a  
> false-positive.
>
> 3. Require some prefix to a raw string to "turn on" regular  
> expression matching.  This has an extremely high probability of  
> removing false-positives and false-negatives, but at the cost of  
> additional CRUFT.
>
> 4. A combination of 2. and 3.  Match raw strings that are arguments  
> to re.compile and raw strings prefixed with (?#) as regular  
> expressions, but no others.
>
> 5. Don't match re's at all.
>
> It would seem there is no perfect option.  I propose that we put it  
> to a vote, and perhaps appeal to our BDFL Allan.
>
> –Alex

>
> My vote would be for 4, but I'll add two more options:
>
> 6. Parse r' and r''' but not r" and r""" (or vice-versa) as regexes.
>
> 7. Parse the "r" prefix, but not the "R" prefix, as regexes.
>
> The last option is probably the simplest. I don't think I've ever  
> seen the "R" prefix in use and didn't even know it was an option  
> until I just read the spec moments ago.
>
> j.
>


6. and 7. are really variations of 3 - which 'special' prefix do you
use to turn highlighting on or off, and what is its default state?
I have incorporated them below.

>> The last option is probably the simplest. I don't think I've ever
>> seen the "R" prefix in use and didn't even know it was an option
>> until I just read the spec moments ago.
>>
>
> I've never seen 'R' in use either, but I'm sure somebody, somewhere is
> doing it.  I think 6,7 are going to be too confusing.

I think that using R' as the marker, or having r' turn regex
highlighting ON and R' turn it OFF, is no more confusing than having
r'(?# as the lead prefix.  It also has the benefit of being easier to
read, and since there doesn't seem to be any established common use of
R', making it the 'I'm not a regex raw string' marker is reasonable.
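
For reference, a small sketch of why either marker is invisible to
Python itself: the r and R prefixes yield the same string, and (?#) is
an empty regex comment.  The variable names and sample strings are
made up for illustration.

```python
import re

# To Python, the r and R prefixes produce identical strings.
lower = r'\d+\s*'
upper = R'\d+\s*'
same_value = lower == upper

# (?#...) is a regex comment, so an empty (?#) prefix does not change
# what the pattern matches.
plain = re.compile(r'\d+')
tagged = re.compile(r'(?#)\d+')
plain_match = plain.match('2007-08-15').group()
tagged_match = tagged.match('2007-08-15').group()

# A raw string that is not a regex at all (a Windows path): the kind of
# false positive that an "R means not a regex" convention would avoid.
path = R'C:\temp\notes.txt'
```

Either convention only changes how the highlighter treats the literal;
the running program sees no difference.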

> And adding a no-op "signal" to raw strings that will later be used as
> regexes just to turn on some coloring seems very unPythonic in that:
>
> It is ugly.
> It is implicit.
> It adds complexity.
> It detracts from readability.
> It is not the obvious way to do it.

Partially true - but this is not a language definition, nor Python
code, but something different - a highlighter FOR Python.  The
underlying code that makes up Python is very unpythonic.... and a
marker will certainly make the code MORE readable in TextMate, as then
it will be highlighted correctly!

I vote for 4 as well.  Option 2 will cover the most common uses of
regexes and keep the regex-using folk (like me) happy, without
highlighting non-regex raw strings, which keeps that group happy
too.  Option 3 is more touchy...

Once we start to get to the edge cases, such as feeding in a raw  
string defined in one line into an re.compile in another line, either  
we:


a  -  Never highlight - keeps raw string users happy, annoys regex
users quite a bit,


b  -  Highlight only when a prefix turns it on - obscure and a bit
ugly, but keeps raw string users happy and annoys regex users much
less, though still a little,


c  -  Highlight by default, but let a prefix turn it OFF, e.g.
R instead of r - again, obscure and a bit ugly; annoys raw string
users slightly, keeps regex users happy, or


d  -  Always highlight all raw strings as regexes - annoys raw string
users, keeps regex users happy - which is what we have now.
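
To make rule c concrete, here is a minimal sketch in Python rather
than in TextMate grammar syntax.  The helper classify_raw_strings is
hypothetical and only mirrors the rule "lowercase r means regex,
uppercase R means plain raw string"; it says nothing about how the
bundle itself would be written.

```python
import io
import tokenize


def classify_raw_strings(source):
    """Return (literal, treat_as_regex) pairs for raw string literals.

    Hypothetical rule c: a lowercase r prefix is highlighted as a
    regex, an uppercase R prefix is left as a plain raw string.
    """
    results = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        # The prefix is everything before the first quote character.
        positions = (tok.string.find("'"), tok.string.find('"'))
        quote_pos = min(p for p in positions if p != -1)
        prefix = tok.string[:quote_pos]
        if 'r' in prefix:
            results.append((tok.string, True))
        elif 'R' in prefix:
            results.append((tok.string, False))
    return results
```

Given the three lines a = r'\d+', b = R'C:\temp' and c = 'plain', the
helper reports the first literal as a regex, the second as a plain raw
string, and skips the third.  It does not cover the re.compile
argument case from option 2.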


I use regexes quite a bit, but I could foresee a case where I might  
want a raw string non-highlighted.  If we change it at all, I would  
vote for 4c.



Nick



