LaTeX bundle problem

Maxime Boissonneault

27 Jun 2007

8:13 p.m.

Hello, I'm writing my master's thesis with TextMate and the LaTeX bundle, and I have encountered two problems.

First, I want to compile it using latex, not pdftex, so I set the variable TM_LATEX_COMPILER to latex. However, this alone is not enough. I checked the Typeset & View script and had to add the following after the line TEX=${TM_LATEX_COMPILER:-$DEF_TEX} to make it work:

    if [ "$TEX" == "latex" ]; then
        TEX_PSTRICKS=1
    fi

Second, I write in French. French has a lot of accents (é, è, à, ù, ï, ë, and so on), and sometimes one appears in a chapter/section/subsection title. The accents go into the label too. Creating the chapter/section/subsection works fine with the accents. However, when I try to refer to them by typing \ref{sub followed by Alt-Esc to use the auto-completion, I can select the sections with accents in their labels, but when I do select one, instead of inserting the label, it deletes everything inside the \ref{}. I see two solutions to this:

- Change the auto-completion script so that it can handle accents.
- Change the chapter/section/subsection/... commands so that they replace accents with regular letters (é, è, ë with e, ...) in the labels.

However, I don't know how to do either of them.

Does anyone have a workaround for this? Or could someone fix this in the bundle?

Thanks,

Maxime Boissonneault maxime.boissonneault@usherbrooke.ca


Robin Houston

28 Jun

5:41 p.m.

On 6/27/07, Maxime Boissonneault maxime.boissonneault@usherbrooke.ca wrote:

...

First, I want to compile it using latex, and not pdftex. So, I set the variable TM_LATEX_COMPILER to latex. However, this alone is not enough.

You are right. You have to set TEX_PSTRICKS=1 as well (or do what you did, of course). This is just one of the unsatisfactory things about the configuration mechanism, as it stands. I do still mean to rationalise it, when I find the time!

...

Second, I write it in French. There are a lot of accents (é, è, à, ù, ï, ë, and so on) in French, and sometimes there is one in a chapter/section/subsection title. The accents go into the label too. Creating the chapter/section/subsection works fine with the accents. However, when I try to refer to them and use \ref{sub followed by Alt-Esc to use the auto-completion, I can select the sections with accents in the labels, but when I do select a section with an accent, instead of writing the label, it deletes everything inside the \ref{}.

I can't reproduce this problem. It works okay in my simple tests. Could you send an example that demonstrates it?

Robin

Maxime Boissonneault

29 Jun

3:56 a.m.

Robin Houston wrote:


You are right. You have to set TEX_PSTRICKS=1 as well (or do what you did, of course). This is just one of the unsatisfactory things about the configuration mechanism, as it stands. I do still mean to rationalise it, when I find the time!

OK. Well, I can send you the code I wrote if you want.


I can't reproduce this problem. It works okay in my simple tests. Could you send an example that demonstrates it?

Just copy this into a .tex file:

\documentclass[titlepage,oneside,letterpaper,openright,12pt]{article}
\usepackage[style/frenchb]{babel}
\usepackage[cyr]{aeguill}
\begin{document}
\subsection{Étude de la physique} % (fold)
\label{sub:Étude_de_la_physique}

\ref{}

% subsection Étude_de_la_physique (end)

\end{document}

Typeset it, then put the cursor inside the \ref{}, type sub and press Alt-Esc. The sub will disappear.

Robin Houston

9:09 a.m.

On 6/29/07, Maxime Boissonneault maxime.boissonneault@usherbrooke.ca wrote:

...

Run, then go in the \ref{}, type sub followed by Alt-Esc. The sub will disappear.

No, I'm afraid it works fine when I try it.

Can anyone reproduce this?

Robin

Maxime Boissonneault

2:28 p.m.

Hmmmm, what version of TextMate do you have? I have version 1.5.5 (1383).

Maxime

For new threads USE THIS: textmate@lists.macromates.com (threading gets destroyed and the universe will collapse if you don't) http://lists.macromates.com/mailman/listinfo/textmate

Robin Houston

3:57 p.m.

On 6/29/07, Maxime Boissonneault maxime.boissonneault@usherbrooke.ca wrote:

...

Hmmmm, What version of TextMate do you have ? I have Version 1.5.5 (1383)

Same here. :-/

Robin

Maxime Boissonneault

4:06 p.m.

Oh, I think I figured it out. It works if you save the .tex file in UTF-8. However, I always save in ISO-8859-1 (Latin 1), and it doesn't work if the file is saved using this encoding.

I could switch to UTF-8, but the PDF I get from a UTF-8 file has the accents all messed up. I have attached the PDFs generated from the UTF-8 and the ISO-8859-1 versions. The ISO-8859-1 one is correct; as you can see, the UTF-8 one is not.

Maxime


Édouard Gilbert

4:10 p.m.

It seems to work if encoding is set to UTF-8, but not when it's latin1.


Maxime Boissonneault

4:27 p.m.

...

It seems to work if encoding is set to UTF-8, but not when it's latin1.

Yeah, just what I wrote.

But if I encode to UTF-8, the PDF isn't displaying the accents correctly.

Robin Houston

4:34 p.m.

On 6/29/07, Maxime Boissonneault maxime.boissonneault@usherbrooke.ca wrote:

...

But if I encode to UTF-8, the PDF isn't displaying the accents correctly.

\usepackage[utf8x]{inputenc}

Robin

Maxime Boissonneault

6:34 p.m.

I just tried it. I saved the files in UTF-8, but I get these errors: /Introduction.tex:5: Package utf8x Error: Malformed UTF-8 sequence

In any case, it doesn't solve the incompatibility between Latin 1 and the auto-completion feature. Is it possible to investigate that problem?

Maxime


Maxime Boissonneault

6:46 p.m.

OK, never mind about the "Malformed UTF-8 sequence" error. I don't know what I did wrong the first time, but now it works.

I still believe that the incompatibility with Latin 1 should be investigated, and I can provide assistance. Just tell me what to do.

Maxime


Robin Houston

7:04 p.m.

On 6/29/07, Maxime Boissonneault maxime.boissonneault@usherbrooke.ca wrote:

...

I still believe that the incompatibility with Latin 1 should be investigated, and I can provide assistance. Just have to tell me what to do.

Well, you need to investigate the command "Label Based on Current Word/Selection…" in the LaTeX bundle. Presumably there is something there that only works in UTF8.

Robin

Maxime Boissonneault

7:39 p.m.

OK, I was not really talking about modifying the code myself. I meant providing more information that you might need.

However, I looked at the command, and even though I understand a bit of bash scripting (that is what it looks like to me), I don't know enough about how the bundle is coded to figure it out.

Maxime


Allan Odgaard

30 Jun

5:51 a.m.

On 29. Jun 2007, at 19:39, Maxime Boissonneault wrote:

...

Ok, I was not really talking about modifying the code myself. I meant providing more information that you might need. [...]

The problem is that your source documents are scanned for completion candidates in their on-disk state.

The scanner cannot know what encoding you have used, and so it assumes (for good reason) UTF-8.

But in fact the stuff is malformed, when treated as UTF-8, so TM will discard it, when the label completion command tries to insert it.

So, as I have repeatedly written on this list: all TextMate commands which work on your files assume these files are UTF-8. UTF-8 is the only sane encoding to use, and supporting more is not only tedious; in many cases it is impossible.

The only improvement I plan in this department is more warnings when stuff is not UTF-8 ;)
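A minimal Python sketch of the failure described above, assuming the label from the sample document earlier in the thread. It only illustrates why Latin-1 bytes are rejected when read as UTF-8; it is not TextMate's actual scanner, and the helper name is_valid_utf8 is hypothetical.

```python
# Label taken from the sample .tex file earlier in the thread.
label = "sub:Étude_de_la_physique"


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# What the label looks like on disk under each encoding.
utf8_bytes = label.encode("utf-8")
latin1_bytes = label.encode("iso-8859-1")

# In Latin-1, "É" is the single byte 0xC9. In UTF-8, 0xC9 announces a
# two-byte sequence, but the next byte is a plain "t", so the data is malformed.
print(is_valid_utf8(utf8_bytes))    # True
print(is_valid_utf8(latin1_bytes))  # False
```

A tool that assumes UTF-8 has no valid text to insert in the second case, which matches the completion silently clearing the \ref{}.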

Maxime Boissonneault

3 Jul

2:48 p.m.

Hello, I understand the problem now. However, how come TextMate can detect what encoding was used for a file, but the scanner cannot?

Also, UTF-8 can cause problems with BibTeX (http://www.unix-ag.uni-kl.de/~fischer/kbibtex/encoding.html).


Adam R. Maxwell

4:12 p.m.

On Jul 3, 2007, at 05:48, Maxime Boissonneault wrote:

...

Hello, I understand the problem now. However, how come TextMate can detect what encoding was used for a file, and the scanner can not ?

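UTF8 can cause problems with Bibtex (http://www.unix-ag.uni-kl.de/~fischer/kbibtex/encoding.html)

See also

http://groups.google.com/group/comp.text.tex/browse_frm/thread/7e406a5c250cc6bc/8aefd925c735c842?lnk=st&q=bibtex+unicode+group%3Acomp.text.tex&rnum=1&hl=en#8aefd925c735c842

which has a better explanation of the problem(s) of using multibyte encodings with BibTeX. Sorting is broken, of course, but apparently some BibTeX styles have even worse problems.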

...

Allan Odgaard wrote:

...
On 29. Jun 2007, at 19:39, Maxime Boissonneault wrote:

...
Ok, I was not really talking about modifying the code myself. I meant providing more information that you might need. [...]

The problem is that your source documents are scanned for completion candidates in their on-disk state.

The scanner cannot know what encoding you have used, and so it assumes (for good reason) UTF-8.

But in fact the stuff is malformed, when treated as UTF-8, so TM will discard it, when the label completion command tries to insert it.

I haven't been following this too closely, but ISTR that cross-reference labels, like TeX command names, are limited to ASCII letters and digits 0-9. You might get away with accented characters using some encodings, but it would probably break horribly with UTF-8.

-- Adam

Maxime Boissonneault

4:27 p.m.

...

See also

http://groups.google.com/group/comp.text.tex/browse_frm/thread/7e406a5c250cc...

which has a better explanation of the problem(s) of using multibyte encodings with BibTeX. Sorting is broken, of course, but apparently some BibTeX styles have even worse problems.

Thanks


I haven't been following this too closely, but ISTR that cross-reference labels, like TeX command names, are limited to ASCII letters and digits 0-9. You might get away with accented characters using some encodings, but it would probably break horribly with UTF-8.

LaTeX doesn't seem to have any problem with this. There are accents in my labels and it works pretty well.

I guess the ultimate solution would be to change the macro that generates the labels (from what is written in the section/chapter titles) so that it strips the accents and replaces each accented character with the plain one (é, è, ë, ê -> e; à, â -> a; ô, ¨o (I can't type the last one with my keyboard) -> o; and so on).

Maxime Boissonneault
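The accent-stripping idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name ascii_label is hypothetical, the prefix:words_joined_by_underscores shape is copied from the label in the sample document, and this is not how the bundle's own label command is implemented.

```python
import unicodedata


def ascii_label(title: str, prefix: str = "sub") -> str:
    """Build an ASCII-only label from a section title (hypothetical helper)."""
    # NFKD splits "É" into "E" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII (for example "œ") is simply discarded here.
    ascii_only = stripped.encode("ascii", "ignore").decode("ascii")
    return prefix + ":" + "_".join(ascii_only.split())


print(ascii_label("Étude de la physique"))  # sub:Etude_de_la_physique
```

Characters with no decomposition are dropped rather than transliterated, so a real implementation would probably want an explicit table for ligatures such as œ and æ.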

Adam R. Maxwell

4 Jul

3:24 a.m.

On Jul 3, 2007, at 07:27, Maxime Boissonneault wrote:

...

...
I haven't been following this too closely, but ISTR that cross-reference labels, like TeX command names, are limited to ASCII letters and digits 0-9. You might get away with accented characters using some encodings, but it would probably break horribly with UTF-8.

LaTeX doesn't seem to have any problem with this. There are accents in my labels and it works pretty well.

If you have an example you could send (off-list), I'm curious to see that. I tried using a section label with various accented characters and it failed. The LaTeX Companion (2nd) says that characters outside the ASCII range can never be used in labels and cite keys (p. 482).

thanks, Adam

Édouard Gilbert

3 Jul

5:18 p.m.

Doesn't that concern only BibTeX entries? One could use the good old LaTeX notation for them: \'e, \ss or \"o and other special characters. BibDesk even allows automatic and customizable conversion of UTF-8 characters to such strings.

On 3 Jul 07, at 16:12, Adam R. Maxwell wrote:

...

On Jul 3, 2007, at 05:48, Maxime Boissonneault wrote:

...
Hello, I understand the problem now. However, how come TextMate can detect what encoding was used for a file, and the scanner can not ?

UTF8 can cause problems with Bibtex (http://www.unix-ag.uni-kl.de/~fischer/kbibtex/encoding.html)

See also

http://groups.google.com/group/comp.text.tex/browse_frm/thread/7e406a5c250cc6bc/8aefd925c735c842?lnk=st&q=bibtex+unicode+group%3Acomp.text.tex&rnum=1&hl=en#8aefd925c735c842

which has a better explanation of the problem(s) of using multibyte encodings with BibTeX. Sorting is broken, of course, but apparently some BibTeX styles have even worse problems.


Adam R. Maxwell

4 Jul

3:32 a.m.

On Jul 3, 2007, at 08:18, Édouard Gilbert wrote:

...

Doesn't that concerns only BibTex entry? One could use this good old LaTeX encoding for them: \'e, \ss or \"o and other special characters.

Yes...although you should use {\"o} in BibTeX, I think. The point was that various parts of LaTeX/BibTeX are not compatible with UTF-8, although accented characters in labels aren't compatible with any encoding.

...

BibDesk even allows automatic and customizable conversion of UTF8 characters to such strings.

I know; I helped write it :).

-- Adam
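As a rough Python illustration of the kind of conversion discussed here, the sketch below turns a few accented characters into braced TeX accent commands such as {\"o}. It is not BibDesk's implementation; the names TEX_ACCENTS, SPECIAL and to_tex are hypothetical, and the tables are deliberately incomplete.

```python
import unicodedata

# Combining mark -> TeX accent command (small, incomplete table).
TEX_ACCENTS = {
    "\u0301": "'",   # acute
    "\u0300": "`",   # grave
    "\u0302": "^",   # circumflex
    "\u0308": '"',   # diaeresis / umlaut
}

# Characters that have no base-letter-plus-accent decomposition.
SPECIAL = {"ß": r"{\ss}"}


def to_tex(text: str) -> str:
    """Replace known accented characters with braced TeX escapes."""
    out = []
    for ch in text:
        if ch in SPECIAL:
            out.append(SPECIAL[ch])
            continue
        # NFD splits "ö" into "o" followed by a combining diaeresis.
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] in TEX_ACCENTS:
            out.append("{\\" + TEX_ACCENTS[decomposed[1]] + decomposed[0] + "}")
        else:
            # Unknown characters pass through unchanged.
            out.append(ch)
    return "".join(out)


print(to_tex("Gödel"))  # G{\"o}del
```

Wrapping each escape in braces follows the {\"o} form recommended above for BibTeX. Characters outside the two tables, such as ç, are left untouched in this sketch.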

Édouard Gilbert

6:54 p.m.

On 4 Jul 07, at 03:32, Adam R. Maxwell wrote:

BibDesk even allow automatic and customizable conversion of UTF8 characters to such strings.

I know; I helped write it :).

You did? I've a feature request, then :P


Allan Odgaard

3 Jul

4:37 p.m.

On 3. Jul 2007, at 14:48, Maxime Boissonneault wrote:

...

I understand the problem now. However, how come TextMate can detect what encoding was used for a file, and the scanner can not ?

TextMate can *not* detect the encoding of your files. If you use a UTF-n encoding, there is a 99.9999…% chance that it will get it right, but with *any* other encoding, TM’s guess is based on a frequency table and how well your file corresponds to this distribution when interpreted in the various encodings.

This however gives rather mixed results.

With Tiger and onward, we can store the encoding as meta data, but even ignoring the complexity of handling encodings for every single command that may read or write text, there are too many cases where the text is not tied to a file per se, or may even come from multiple files.

For example take the diff actions (including those tied to a version control system), the output here is the differences between two files, so maybe we can assume this output is in the encoding of the two input sources (hoping those share encoding)? Wrong… the diff output also contains file names, these also have an encoding (fortunately it’s UTF-8).

So anything else than mandating UTF-8 will make things break, and there is no technical solution to this problem. Sure, you can have things work “good enough” for some w/o going 100% UTF-8, and you can maybe fix some of the stuff that breaks when you are not using UTF-8, but you can never fix it all, so IMO it’s really not worth trying to support more than UTF-8, UTF-8 is the solution to the encoding problems of the past.
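A short Python sketch of why only UTF-8 can be recognised reliably. The byte string and the list of candidate encodings are illustrative assumptions, and this is not how TextMate's guessing works: it merely shows that the same bytes are rejected by UTF-8 yet accepted, with different readings, by several legacy encodings.

```python
# "Étude" as it would be stored in a Latin-1 file: b'\xc9tude'.
raw = "Étude".encode("iso-8859-1")


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if the bytes are valid in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Illustrative candidates; a real guesser would consider many more.
candidates = ["utf-8", "iso-8859-1", "cp1252", "mac_roman"]
accepted = [enc for enc in candidates if decodes_as(raw, enc)]
readings = {enc: raw.decode(enc) for enc in accepted}

# UTF-8 rejects the data outright; the legacy encodings all accept it,
# but they do not agree on what the first character is.
for enc, text in readings.items():
    print(enc, repr(text))
```

UTF-8 has enough internal structure that malformed input is detectable, whereas single-byte encodings accept almost any byte sequence, which is why choosing between them takes a statistical guess.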

...

UTF8 can cause problems with Bibtex (http://www.unix-ag.uni-kl.de/~fischer/kbibtex/encoding.html)

Ironically that page goes on to say you should convert your BibTeX files to UTF-8 ;)

As I understand it, the problem arises when generating alphabetic references, i.e. [ODGAARD03] instead of the plain [1] style. The error quoted is from the UTF-8 package, about a malformed UTF-8 sequence, so presumably it only occurs when the generated reference text has non-ASCII characters in it. I am a little puzzled by the error, though: it really sounds like what you would get if your LaTeX file is UTF-8, your BibTeX file is not, and non-ASCII characters from a reference end up in the generated text, at which point the UTF-8 package chokes on the malformed sequence.

If anyone has a sample project that shows the problem, please send it my way.

Robin Houston

5:50 p.m.

On 7/3/07, Allan Odgaard throw-away-1@macromates.com wrote:

...

If anyone has a sample project that shows the problem, please send it my way.

See attached.

It's thirteen years since BibTeX 1.0 was announced. Is it ever likely to materialise?

Robin
