Re: [TxMt] regex question

22 Feb 2006


Oliver,

By default, Perl's -p switch causes each line of the input to be

1. read,
2. operated on by whatever commands are given (typically an s///), and
3. printed.

So, as Paul McCann said, Perl can't do what you want it to because  
the command is seeing only one line of input at a time. To operate on  
the whole file at once, use the -0777 switch. Using -000 to read a  
paragraph at a time would work for this specific case, but -0777 is  
usually the more general solution.
I am wondering, though, why you are trying to do this as a one-liner.  
As far as I know, bundle commands can start with a "shebang" line, so  
you can use a full, multiline Perl program if the command starts with  
#!/usr/bin/perl. Thus:
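
     #!/usr/bin/perl
     local $/;             # put Perl in "slurp" mode
     $text = <>;           # read in the whole file
     # indent each capitalized line and the parenthesized line after it,
     # then set the following line off with a blank line
     $text =~ s/^([A-Z ]+)\n(\(.+\))\n(.+)$/\n\n\t\t\t\t$1\n\t\t\t$2\n\n$3/mg;
     print $text;
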
should do the trick. I changed the first part of your regex so the  
capitalized line can contain only capital (unaccented) letters and  
spaces, which is what I thought you wanted. The '.*' you have in your  
regex allows any character in that line.
FYI, the 'g' option at the end of the substitution command makes the  
substitution work for every instance of the pattern rather than just  
the first. The 'm' option allows ^ and $ to match beginnings and  
endings of lines in multiline text. It's common to see the 's' option  
in multiline regexes, but that would break this one; it allows '.' to  
match the newline character, and we're relying on '.' *not* matching  
newline.
I meant to mention yesterday that the most commonly recommended  
reference on regexes is Friedl's _Mastering Regular Expressions_ from  
O'Reilly. It's probably more encyclopedic than you want, but it's  
very highly regarded. For Perl-specific regex help, there are any  
number of tutorials on the Internet. There's also _Programming Perl_  
from O'Reilly and the 'perlre' man page.
On Feb 22, 2006, at 7:32 AM, Paul McCann wrote:
...
Hi Oliver,
...
I'm trying to run a perl search/replace command via a 'bundle  
command'. I've entered the following into the 'Edit Command' box:
perl -pe '
   s/^([A-Z]+.*[A-Z]*\s*)\n(\(.+\))\n(.+)$/\n\n\t\t\t\t$1\n\t\t\t$2\n\n$3/g;
'
I've set the input to 'Entire Document' and the output to 'Create  
New Document'.
The problem there is that the text won't be matching. You need to
indicate to perl that you want a multiline match to occur (i.e.,
that the regular expression should apply to all of the text in the
document, not just one line at a time). This is done by using the
"m" qualifier, as in
...
perl -pe 's/^([A-Z]+.*[A-Z]*\s*)\n(\(.+\))\n(.+)$/\n\n\t\t\t\t$1\n\t\t\t$2\n\n$3/mg;'
But written this way perl is just seeing the first line of the  
document; in order for it to see a paragraph at a time (where a  
paragraph is delimited by a blank line) you can use the flag  
"-000" (three zeroes):
...
perl -000 -pe 's/^([A-Z]+.*[A-Z]*\s*)\n(\(.+\))\n(.+)$/\n\n\t\t\t\t$1\n\t\t\t$2\n\n$3/mg;'
This gets you pretty close to what you're seeking, and I imagine  
you can tweak it to get it exactly right:
=======================================================================
		OLIVER
	(I want to tell you)


I've got things to say.
Dr. Robert
=======================================================================
Good luck,
Paul
--
Dr. Drang
