[TxMt] Re: Python bundle syntax highlighting bug
Fabry Nicholas F.
nick at superb-sublime.com
Wed Aug 15 16:39:00 UTC 2007
On Aug 15, 2007, at 08:00, Alex Ross wrote:
> I agree that prefixing all re's is not ideal.
>
> So, we have five options:
>
> 1. Match all raw strings unambiguously as regular expressions. We
> will sometimes have false-positives.
>
> 2. Match raw strings that are arguments to methods from the re
> module. We will sometimes not match raw strings that are regular
> expressions, but can be pretty well guaranteed to never have a
> false-positive.
>
> 3. Require some prefix to a raw string to "turn on" regular
> expression matching. This has an extremely high probability of
> removing false-positives and false-negatives, but at the cost of
> additional CRUFT.
>
> 4. A combination of 2. and 3. Match raw strings that are arguments
> to re.compile and raw strings prefixed with (?#) as regular
> expressions, but no others.
>
> 5. Don't match re's at all.
>
> It would seem there is no perfect option. I propose that we put it
> to a vote, and perhaps appeal to our BDFL Allan.
>
> –Alex
>
> My vote would be for 4, but I'll add two more options:
>
> 6. Parse r' and r''' but not r" and r""" (or vice-versa) as regexes.
>
> 7. Parse the "r" prefix, but not the "R" prefix, as regexes.
>
> The last option is probably the simplest. I don't think I've ever
> seen the "R" prefix in use and didn't even know it was an option
> until I just read the spec moments ago.
>
> j.
>
6. and 7. are really variations of 3 - which 'special' prefix do you
use to turn highlighting on or off, and what is its default state?
I have incorporated them below.
>> The last option is probably the simplest. I don't think I've ever
>> seen the "R" prefix in use and didn't even know it was an option
>> until I just read the spec moments ago.
>>
>
> I've never seen 'R' in use either, but I'm sure somebody, somewhere is
> doing it. I think 6,7 are going to be too confusing.
I think that using R' instead, with r' turning regex highlighting ON
and R' disabling it, is no more confusing than having r'(?# as the
lead prefix. It also has the benefit that it's easier to read, and
since there doesn't seem to be any standard, common use of R', making
it the 'I'm not a regex raw string' marker is reasonable.
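For reference, here is a minimal sketch of why either marker is free as far as Python itself is concerned: r and R produce the same string, and an empty (?#) comment group does not change what a pattern matches. The variable names are only illustrative.

```python
import re

lower = r'\d+'        # the usual raw-string prefix
upper = R'\d+'        # same string; only the source text differs
marked = r'(?#)\d+'   # empty regex comment used as a no-op marker

# The prefix case never reaches the runtime value.
same_string = (lower == upper)

# The marked pattern matches exactly what the unmarked one does here.
same_match = (
    re.fullmatch(lower, '2007') is not None
    and re.fullmatch(marked, '2007') is not None
)
```

So a highlighter could use either signal without changing program behaviour; the cost is only in how the source reads.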
> And adding a no-op "signal" to raw strings that will later be used as
> regexes just to turn on some coloring seems very unPythonic in that:
>
> It is ugly.
> It is implicit.
> It adds complexity.
> It detracts from readability.
> It is not the obvious way to do it.
Partially true - but this is not a language definition, nor Python
code, but something different - a highlighter FOR Python. The
underlying code that makes up Python is very unpythonic.... and it
certainly will make it MORE readable in TextMate, as then it will be
highlighted correctly!
I vote for 4 as well. Option 2 will cover the most common uses of
regexes, which keeps the regex-using folk (like me) happy, and it
will not highlight non-regex raw strings, which keeps that group
happy too. Part 3 is more touchy...
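A rough sketch of what option 2 catches and what it misses, assuming a simple line-based check. RE_CALL and source_lines are hypothetical names for illustration; this is not the bundle's actual grammar.

```python
import re

# Hypothetical detector: a raw string written directly as the first
# argument of a call on the re module, e.g. re.compile(r'...').
RE_CALL = re.compile(r"""\bre\.\w+\(\s*[rR](['"]).*?\1""")

source_lines = [
    r"pat = re.compile(r'\d+')",   # direct argument: option 2 matches it
    r"digits = r'\d+'",            # raw string on its own line: missed
    r"pat = re.compile(digits)",   # only a variable here, no literal
]

matched = [bool(RE_CALL.search(line)) for line in source_lines]
```

Only the first line is picked up, which is why the second and third lines need one of the prefix-based rules.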
Once we start to get to the edge cases, such as a raw string defined
on one line and fed into re.compile on another, we can either:
a - Never highlight - keeps raw string users happy, annoys regex
users quite a bit,
b - Highlight only when a prefix turns it on - obscure and a bit
ugly; keeps raw string users happy, annoys regex users much less,
but still a little,
c - Highlight by default, but let a prefix turn it OFF, e.g. R
instead of r - again, obscure and a bit ugly; annoys raw string
users slightly, keeps regex users happy, or
d - Always highlight all raw strings as regexes - annoys raw string
users, keeps regex users happy - which is what we have now.
I use regexes quite a bit, but I could foresee a case where I might
want a raw string non-highlighted. If we change it at all, I would
vote for 4c.
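To make choices a through d concrete, here is a small sketch of how each would treat a standalone raw-string literal (one that is not a direct argument to an re call). POLICIES and highlight_as_regex are hypothetical names, and the (?#) marker for choice b is taken from option 4 above.

```python
import re

# Hypothetical policy table mirroring choices a-d; not the bundle's grammar.
POLICIES = {
    "a": lambda prefix, marked: False,          # never highlight
    "b": lambda prefix, marked: marked,         # opt in with a (?#) marker
    "c": lambda prefix, marked: prefix == "r",  # on by default, R opts out
    "d": lambda prefix, marked: True,           # always highlight
}

def highlight_as_regex(literal, policy):
    """Return True if a standalone raw-string literal would be coloured as a regex."""
    m = re.match(r"([rR])('''|\"\"\"|'|\")", literal)
    if m is None:
        return False  # not a raw string literal at all
    body = literal[m.end():]
    return POLICIES[policy](m.group(1), body.startswith("(?#)"))
```

Under choice c, highlight_as_regex(r"r'\d+'", "c") is true while highlight_as_regex(r"R'C:\temp'", "c") is false, so a Windows path written with R would stay uncoloured.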
Nick