Re: [TxMt] Re: Python bundle syntax highlighting bug

15 Aug 2007

      On Aug 15, 2007, at 08:00, Alex Ross wrote:
...
I agree that prefixing all re's is not ideal.
So, we have five options:

1. Match all raw strings unambiguously as regular expressions.  We
   will sometimes have false-positives.

2. Match raw strings that are arguments to methods from the re
   module.  We will sometimes not match raw strings that are regular
   expressions, but can be pretty well guaranteed to never have a
   false-positive.

3. Require some prefix to a raw string to "turn on" regular
   expression matching.  This has an extremely high probability of
   removing false-positives and false-negatives, but at the cost of
   additional CRUFT.

4. A combination of 2. and 3.  Match raw strings that are arguments
   to re.compile and raw strings prefixed with (?#) as regular
   expressions, but no others.

5. Don't match re's at all.
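
To make the trade-offs concrete, here is a rough sketch of which raw strings each option would treat as a regex. The three patterns and the function name `highlighted_as_regex` are made up for illustration and only look at a single line of source; they are not the bundle's actual grammar rules.

```python
import re

# Hypothetical one-line approximations of the options; a real grammar would
# also have to handle triple quotes, other prefixes, and multi-line strings.
RAW_STRING = re.compile(r"""\b[rR](['"])""")            # any raw string opener
RE_CALL = re.compile(r"""\bre\.\w+\(\s*[rR](['"])""")   # raw string as first argument of an re.* call
MARKED = re.compile(r"""\b[rR](['"])\(\?#\)""")         # raw string that starts with the no-op (?#)


def highlighted_as_regex(line, option):
    """Return True if the raw string on `line` would be highlighted as a regex."""
    if option == 1:
        return bool(RAW_STRING.search(line))
    if option == 2:
        return bool(RE_CALL.search(line))
    if option == 3:
        return bool(MARKED.search(line))
    if option == 4:
        return bool(RE_CALL.search(line) or MARKED.search(line))
    return False  # option 5: never highlight


samples = [
    r'''pattern = re.compile(r"\d+")''',  # a regex passed straight to re
    r'''path = r"C:\temp\new"''',         # a raw string that is not a regex
    r'''digits = r"(?#)\d+"''',           # a regex marked with the (?#) prefix
]

for option in range(1, 6):
    print(option, [highlighted_as_regex(line, option) for line in samples])
```

Under these assumptions, option 1 highlights the Windows path as well (a false positive), option 2 misses the third sample (a false negative), and option 4 gets all three right.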

It would seem there is no perfect option.  I propose that we put it  
to a vote, and perhaps appeal to our BDFL Allan.
–Alex
...
My vote would be for 4, but I'll add two more options:

6. Parse r' and r''' but not r" and r""" (or vice-versa) as regexes.

7. Parse the "r" prefix, but not the "R" prefix, as regexes.

The last option is probably the simplest. I don't think I've ever  
seen the "R" prefix in use and didn't even know it was an option  
until I just read the spec moments ago.
j.
6. and 7. are really variations of 3 - which 'special' prefix do you  
use to turn highlighting on or off, and what is its default state?
I have incorporated them below.
...
...
The last option is probably the simplest. I don't think I've ever
seen the "R" prefix in use and didn't even know it was an option
until I just read the spec moments ago.
I've never seen 'R' in use either, but I'm sure somebody, somewhere is
doing it.  I think 6,7 are going to be too confusing.
I think that it's no more confusing than having r'(?# as the lead  
prefix, having R' instead, or having r' turning regex highlighting  
ON, and R' disabling it.  It also has the benefit that it's easier to  
read, and since there doesn't seem to be a standardized common-use of  
R', making it the 'I'm not a regex raw string' marker is reasonable.
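
As a quick check that options 6 and 7 would be purely a highlighter convention: Python itself treats the r and R prefixes, and the different quote styles, identically. The variable names below are just for illustration.

```python
import re

lower = r"\d+\s*"
upper = R"\d+\s*"

# Both prefixes produce exactly the same string object contents.
assert lower == upper == "\\d+\\s*"

# The quote style makes no difference to the value either.
assert r'\d+' == r"\d+" == r'''\d+''' == r"""\d+"""

# So a convention such as "r is a regex, R is a plain raw string" would
# only affect coloring, never behavior:
number_re = re.compile(r"\d+")   # would be highlighted as a regex
win_path = R"C:\temp\new"        # would be left as a plain raw string

print(number_re.findall("a1 b22"), win_path)
```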
...
And adding a no-op "signal" to raw strings that will later be used as
regexes just to turn on some coloring seems very unPythonic in that:
It is ugly.
It is implicit.
It adds complexity.
It detracts from readability.
It is not the obvious way to do it.
Partially true - but this is not a language definition, nor Python  
code, but something different - a highlighter FOR Python.  The  
underlying code that makes up Python is very unpythonic.... and it  
certainly will make it MORE readable in TextMate, as then it will be  
highlighted correctly!
I vote for 4 as well.  Method number 2 will cover the most common use
cases of regexes and keep the regex-using folk (like me) happy,
without highlighting non-regex raw strings, which keeps that group
happy too.  Part 3 is more touchy...
Once we start to get to the edge cases, such as feeding in a raw  
string defined in one line into an re.compile in another line, either  
we:
a  -  Never highlight - keeps raw string users happy, annoys regex
users quite a bit,
b  -  Highlight when a prefix turns it on - obscure and a bit ugly,
but keeps raw string users happy and annoys regex users much less,
though still a little,
c  -  Highlight by default, but adding a prefix can turn it OFF, e.g.
R instead of r - again, obscure and a bit ugly; annoys raw string
users slightly, keeps regex users happy, or
d  -  Always highlight all raw strings as regexes - annoys raw string
users, keeps regex users happy - which is what we have now.
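
For reference, the edge case looks something like this. The names are hypothetical, and the comments describe what each sub-option would do under the conventions discussed above, not what the bundle does today.

```python
import re

# The raw string is defined on one line...
WORD = r"\b\w+\b"
# ...and compiled on another, so a rule that only looks inside re.compile(...)
# never sees a raw string literal here.
word_re = re.compile(WORD)

# 4b: an explicit marker turns highlighting ON. (?#) is an empty regex
# comment, so it does not change what the pattern matches.
WORD_MARKED = r"(?#)\b\w+\b"

# 4c: highlighting is on by default, and the R prefix turns it OFF.
WINDOWS_PATH = R"C:\new\table"

print(word_re.findall("edge case"))
print(re.compile(WORD_MARKED).findall("edge case"))
```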
I use regexes quite a bit, but I could foresee a case where I might  
want a raw string non-highlighted.  If we change it at all, I would  
vote for 4c.
Nick
