    #!/usr/bin/perl
    local $/;      # put Perl in "slurp" mode
    $text = <>;    # read in the whole file
    $text =~ s/^([A-Z ]+)\n(\(.+\))\n(.+)$/\n\n\t\t\t\t$1\n\t\t\t$2\n\n$3/mg;
    print $text;
Of all the solutions proposed here, this one seemed to work the best. Thanks for the tip about processing the entire document in one go.
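For anyone following along, here is a minimal sketch of why reading the whole document in one go matters for patterns that span more than one line. The sample text and variable names are made up for illustration, and an in-memory filehandle stands in for the real input file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: a character cue followed by its dialogue line.
my $sample = "WALLACE\nI am William Wallace.\n";

# Line by line: each read returns a single line, so a pattern that
# spans two lines never gets a chance to match.
open(my $fh, '<', \$sample) or die "cannot open in-memory file: $!";
my $line_hits = 0;
while (my $line = <$fh>) {
    $line_hits++ if $line =~ /^[A-Z ]+\n.+$/m;
}
close $fh;

# Slurp mode: with $/ undefined, one read returns the whole input.
open($fh, '<', \$sample) or die "cannot open in-memory file: $!";
my $text = do { local $/; <$fh> };
close $fh;
my $slurp_hits = () = $text =~ /^[A-Z ]+\n.+$/mg;

print "line-by-line matches: $line_hits\n";   # prints 0
print "slurp matches: $slurp_hits\n";         # prints 1
```

Wrapping `local $/` in a `do` block keeps the slurp setting from leaking into the rest of the script, which is a design choice for this sketch rather than something the scripts in this thread do.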
I ended up with the following:
    #!/usr/bin/perl
    local $/;      # put Perl in "slurp" mode
    $text = <>;    # read in the whole file
    $text =~ s/(\s+)$//mg;
    $text =~ s/((INT.|EXT.|I/E.|int.|ext.|i/e.)\s.*)$/***$1/mg;
    $text =~ s/^([A-Z].*[A-Z )])\n^((.*))$\n(.+)$/\n\t\t\t\t$1\n\t\t\t $2\n\t\t$3\n/mg;
    $text =~ s/^[A-Z]{2,}.*[A-Z)\d]$\n^(.*)$/\n\t\t\t\t$1\n\t\t$2\n/;
    $text =~ s/^***//mg;
    $text =~ s/^(.*(IN:|UP:|TO:))$/\n\t\t\t\t\t\t\t\t\t\t$1\n/;
    $text =~ s/^(\w+.*(.|?|!|"|-))$\n^\w+.*(.|?|!|"|-)$/\n$1/mg;
    print $text;
...but it doesn't work at all. I'm guessing that it's something obvious that I can't see, but in case it's not, here is an explanation of what I'm trying to do.
1. Removes trailing whitespace at the end of each line (which is common in the format I'm importing from).
2. Prepends three *'s to what I'll call well-formed sluglines so that step 4 doesn't apply to these as well.
3. Finds a specific set of lines, a Character followed by a Parenthetical followed by Dialogue, and formats it.
4. Finds a Character followed by Dialogue when there is no Parenthetical, and formats it.
5. Repairs the sluglines that were changed in step 2.
6. Formats transitions.
7. Formats regular paragraphs (this is a little funky, but it's my fault).
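Steps 2, 4 and 5 amount to a mark, transform, unmark pattern. A minimal sketch of just that part might look like the following; the sample text is invented, and the patterns are simplified versions of the ones in this thread, so treat it as an illustration rather than the finished bundle command.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: a slugline, an action line, then a character
# cue with one line of dialogue.
my $text = "INT. CASTLE - DAY\nThe hall is empty.\nWALLACE\nFreedom!\n";

# Step 2: mark sluglines so the all-caps character rule skips them.
# s{}{} delimiters avoid having to escape the slash in I/E.
$text =~ s{^((?:INT|EXT|I/E)\.\s.*)$}{***$1}mgi;

# Step 4: an all-caps line followed by a line of dialogue.
$text =~ s/^([A-Z]{2,}.*[A-Z)0-9])\n(.+)$/\t\t\t\t$1\n\t\t$2/mg;

# Step 5: strip the markers again.
$text =~ s/^[*]{3}//mg;

print $text;
```

Without the marker from step 2, the slugline itself would satisfy the all-caps pattern in step 4 and the action line under it would be indented as dialogue.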
I can do all these in the Find&Replace window, but I get nothing when running this bundle command.
You can find a sample document to test the script on here:
http://ollieman.net/files/bundles/braveheart-sample-unformatted.txt

And you can find what the result should look like here:
http://ollieman.net/files/bundles/braveheart-sample-formatted.txt
Thanks again for all the help. This will allow me to switch full-time from Final Draft to TextMate, something I'm really looking forward to.
On 2/22/06, Oliver Taylor <oliver@ollieman.net> wrote:
I ended up with the following:
    #!/usr/bin/perl
    local $/;      # put Perl in "slurp" mode
    $text = <>;    # read in the whole file
    $text =~ s/(\s+)$//mg;
    $text =~ s/((INT.|EXT.|I/E.|int.|ext.|i/e.)\s.*)$/***$1/mg;
    $text =~ s/^([A-Z].*[A-Z )])\n^((.*))$\n(.+)$/\n\t\t\t\t$1\n\t\t\t $2\n\t\t$3\n/mg;
    $text =~ s/^[A-Z]{2,}.*[A-Z)\d]$\n^(.*)$/\n\t\t\t\t$1\n\t\t$2\n/;
    $text =~ s/^***//mg;
    $text =~ s/^(.*(IN:|UP:|TO:))$/\n\t\t\t\t\t\t\t\t\t\t$1\n/;
    $text =~ s/^(\w+.*(.|?|!|"|-))$\n^\w+.*(.|?|!|"|-)$/\n$1/mg;
    print $text;
...but it doesn't work at all.
Try this
    #!/usr/bin/perl
    local $/;      # put Perl in "slurp" mode
    $text = <>;    # read in the whole file
    $text =~ s/(\s+)$//mg;
    $text =~ s{((INT.|EXT.|I/E.|int.|ext.|i/e.)\s.*)$}{***$1}mg;
    $text =~ s/^([A-Z].*[A-Z )])\n^(\(.*\))\n(.+)$/\n\t\t\t\t$1\n\t\t\t$2\n\t\t$3\n/mg;
    $text =~ s/^([A-Z]{2,}.*[A-Z)0-9])\n(.*)$/\n\t\t\t\t$1\n\t\t$2\n/mg;
    $text =~ s/^[*]{3}(.+)$/\n$1\n/mg;
    $text =~ s/^(.*(IN:|UP:|TO:))$/\n\t\t\t\t\t\t\t\t\t\t$1\n/mg;
    $text =~ s/^(\w+.*(\.|\?|!|"|-))\n\w+.*(\.|\?|!|"|-)$/\n$1/mg;
    print $text;
The errors I fixed were:
1. No 'mg' options on some of the substitutions. Without 'm', ^ is the beginning of the entire string and $ is the end of the entire string. Without 'g', only the first substitution is made.
2. The use of '$\n' and '\n^' in some of your match strings when just '\n' was needed. The newline defines the beginning and ending of lines; adding a $ or ^ is doubling up.
3. Unnecessary backslash escapes in character classes. Things like asterisks and parentheses are not special in a character class and don't need to be escaped.
4. Unnecessary backslash escapes in the substitution string. The substitution string is not a regex and doesn't follow the same rules as the match string.
The first two will make the matches fail; the last two just make the regex longer and more difficult to read.
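To see the effect of the 'm' and 'g' options in isolation, here is a small sketch on a made-up three-line string. The variable names are only for illustration, and the last check shows that characters such as parentheses and asterisks match literally inside a character class with or without a backslash.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";

# No options: ^ and $ refer to the whole string, so a pattern meant
# for a single line finds nothing to match.
(my $plain = $text) =~ s/^(\w+)$/[$1]/;

# 'g' alone does not help, because there is still no match at all.
(my $g_only = $text) =~ s/^(\w+)$/[$1]/g;

# 'm' alone: ^ and $ now work per line, but only the first line changes.
(my $m_only = $text) =~ s/^(\w+)$/[$1]/m;

# 'mg': every line is matched and replaced.
(my $both = $text) =~ s/^(\w+)$/[$1]/mg;

# Inside a character class, ( * ) are literal either way.
my $same = ("(*)" =~ /^[(][*][)]$/) && ("(*)" =~ /^[\(][\*][\)]$/);

print "no options: $plain";
print "g only:     $g_only";
print "m only:     $m_only";
print "mg:         $both";
print "escaped and unescaped classes agree: ", ($same ? "yes" : "no"), "\n";
```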
-- Dr. Drang