[TxMt] which version control system to take?

Phil Schumm pschumm at uchicago.edu
Sat Mar 1 15:52:45 UTC 2008


On Mar 1, 2008, at 6:20 AM, Christoph Held wrote:
> I am very serious about starting using version control for academic  
> writing. And it is not as if you had nothing to do with it. When  
> doing some reading along these lines I stumbled upon some of your
> articles in the PracTeX Journal. Although you advocated using
> Subversion back then, the message for me was to use any kind of  
> versioning system at all.
>
> The last time doing a manual merge of three documents from  
> different authors was an absolute nightmare and took a lot of time.  
> Even if I were using version control for myself only, I'd imagine I  
> could still put it to good use by feeding it my coauthors' work
> after converting it to plain text myself. This is actually one of
> the reasons I am favouring MultiMarkdown over pure LaTeX at the
> moment, as it is human-readable and easy to convert to *.rtf or even
> *.doc format.


I use SVN for all my coding projects, but also for academic writing.   
WRT the latter, if I am the sole author, then I write in either LaTeX or
reStructuredText, and use SVN pretty much exactly like I would for a  
coding project.  In cases where I am not the sole author (which  
happens to be most of my work at the moment), I have found that the  
critical questions are (1) are my coauthors willing to work with
text files (e.g., LaTeX, reST, or some other markup), and (2) are  
they willing and/or interested in learning Subversion?  This yields 4  
cases:

(i) If your coauthors are willing to work with text files, then  
everything is still pretty straightforward.  Even if they don't  
access the repository themselves (or perhaps just have read-only  
access), you can still distribute copies of exactly what's in the  
repository to them, and handle the files you get back from them using  
standard diff and merge tools.
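
As a rough illustration of the "standard diff" step, here is a minimal
Python sketch using the standard library's difflib.  The file name
paper.tex, the function name review_diff, and the sample sentences are
all made up for the example; in practice you would read the copy you
distributed and the copy you got back from disk.

```python
import difflib

def review_diff(distributed: str, returned: str, coauthor: str) -> str:
    """Return a unified diff from the distributed text to the returned text."""
    diff = difflib.unified_diff(
        distributed.splitlines(keepends=True),
        returned.splitlines(keepends=True),
        fromfile="paper.tex (as distributed)",
        tofile=f"paper.tex (from {coauthor})",
    )
    return "".join(diff)

# Hypothetical sample text standing in for the real files.
distributed = "\\section{Methods}\nWe surveyed 100 adults.\n"
returned = "\\section{Methods}\nWe surveyed 120 adults.\n"
print(review_diff(distributed, returned, "coauthor A"))
```

An empty result means the coauthor sent the file back unchanged.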

(ii) Unfortunately, it can be difficult unless you are in a  
mathematical or computer science-related field to find coauthors who  
are comfortable working with text files.  Even if you're not asking  
them to use LaTeX (i.e., you use some kind of simplified markup) and  
are only asking them to enter their revisions into a master text file  
that you have already set up, many people will still try to do this  
in Word, and then want to use "track changes" to make their edits and  
comments.  If your coauthors insist on using Word to make their edits  
and comments, then I've found the following approach works pretty  
well.  I still maintain the master document in LaTeX or reST (stored
in the repository), but then translate it into Word for distribution  
to my coauthors (e.g., using LaTeX -> latex2rtf -> RTF -> open in  
Word and save, but there are many other options for doing this).   
When I get back comments, I check out a copy of the project at the  
revision from which it was distributed, and then enter each
coauthor's changes by hand.  If you use a separate checkout for each
coauthor, then you can use SVN's built-in features for resolving  
conflicts between their different sets of changes, rather than having  
to do it manually.  In addition, when I do this, I always use Word's  
"compare documents" feature to compare what they send me to the  
document I distributed to them.  That way, even if they forget to use  
"track changes" (or use it inconsistently), I'm sure not to miss any  
changes.
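
The separate-checkout idea can be sketched as follows.  This Python
snippet only builds and prints the svn commands; it does not run them.
The repository URL, the revision number 42, the coauthor names, and the
paper-NAME directory naming are assumptions for illustration.

```python
import shlex

def coauthor_checkout_commands(repo_url, revision, coauthors):
    """Build (but do not run) svn commands for one working copy per coauthor."""
    commands = []
    for name in coauthors:
        workdir = f"paper-{name}"  # hypothetical directory naming scheme
        # Check out the project as it stood when the draft was distributed.
        commands.append(["svn", "checkout", "-r", str(revision), repo_url, workdir])
        # After entering this coauthor's edits by hand, bring the working
        # copy up to date; svn merges newer commits and flags conflicts.
        commands.append(["svn", "update", workdir])
        commands.append(["svn", "commit", workdir, "-m", f"Changes from {name}"])
    return commands

# Hypothetical repository URL, revision, and coauthor names.
for cmd in coauthor_checkout_commands(
    "https://svn.example.org/paper/trunk", 42, ["alice", "bob"]
):
    print(shlex.join(cmd))
```

The manual step (typing in each coauthor's edits) happens between the
checkout and the update for that working copy.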

You can, if you want, check in the Word document(s) you distribute  
and the ones you get back from your coauthors (with their changes),  
just for completeness' sake.  However, this will begin to fill up your
repository with a lot of binary junk, and you can't use standard diff  
and merge tools with these files.  Moreover, since MS Office files  
are automatically modified every time you open them (even if you  
don't save any changes), they're lousy for strict tracking of changes  
(e.g., using diff or checksums).
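
To make the checksum point concrete, here is a small Python sketch
using hashlib.  The byte strings are made-up stand-ins and do not
reflect the real Word file format; the only point is that a checksum
changes whenever any byte changes, including embedded metadata that
has nothing to do with the text.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()

def unchanged(before: bytes, after: bytes) -> bool:
    """True if the two byte strings have the same checksum."""
    return sha256_hex(before) == sha256_hex(after)

# Made-up stand-ins: a plain-text file read twice, and a binary file
# whose embedded metadata (hypothetical) changed while the body did not.
text_before = b"We surveyed 100 adults.\n"
text_after = b"We surveyed 100 adults.\n"
binary_before = b"BODY:We surveyed 100 adults.|META:opened=1"
binary_after = b"BODY:We surveyed 100 adults.|META:opened=2"

print(unchanged(text_before, text_after))
print(unchanged(binary_before, binary_after))
```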

(iii) While I've had little luck getting people who are used to Word  
to use text files, I have had some luck getting non-technical people  
to learn how to use Subversion.  This is because it is easy-to-use  
(at least for most standard operations), well-documented, and tools  
like TortoiseSVN make it very easy for Windows users to pick up.  If  
your coauthors want to do this -- even if they are using Word -- it  
can still be helpful, because it eliminates your having to serve as  
the middle-man for exchanging files and provides you with an  
automatic log of all versions.  If you do this, one strategy is to  
create a "Word" branch of your project, where your coauthors can  
check out the latest Word version and check in their changes.  You can
then occasionally merge the changes from this branch onto the trunk  
and update the branch with your own changes (from the trunk), as  
necessary.  In fact, as long as you're careful not to copy files  
between the branch and trunk, this also makes it easy to purge the  
repository of all the Word files, once the paper has been published.
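
One possible layout for such a branch, sketched in Python as a list of
svn commands that are printed rather than run.  This assumes a
conventional trunk/branches repository layout, a branch directory
named "word" created empty (rather than copied from the trunk), and a
working copy named paper-word; all of these are my assumptions for the
example.  Carrying changes between the Word files and the master
document is still done by hand.

```python
import shlex

def word_branch_commands(repo_root: str) -> list[list[str]]:
    """Build (but do not run) svn commands for a separate Word branch."""
    branch = f"{repo_root}/branches/word"  # hypothetical branch location
    return [
        # Create an empty branch directory directly in the repository.
        ["svn", "mkdir", "--parents", branch, "-m", "Add Word branch for coauthors"],
        # Each coauthor checks out only the Word branch.
        ["svn", "checkout", branch, "paper-word"],
        # Coauthors check in their edited Word files.
        ["svn", "commit", "paper-word", "-m", "Coauthor edits to Word draft"],
        # Review the log of everything committed on the branch.
        ["svn", "log", "-v", branch],
    ]

# Hypothetical repository root.
for cmd in word_branch_commands("https://svn.example.org/paper"):
    print(shlex.join(cmd))
```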

(iv) If your coauthors use LaTeX (or some other text-based markup)  
*and* know or are willing to learn SVN, then you're in heaven.   
Everything works just like a software project.  Believe it or not,  
this has occasionally happened to me.

Just a few of my own experiences -- YMMV.


-- Phil

P.S. IMHO, the simplicity of Subversion, its excellent documentation,  
and the existence of graphical clients like TortoiseSVN (for those  
coauthors who aren't comfortable working at the command line) should  
not be overlooked, especially for projects like academic writing (or  
any writing, for that matter).  In addition, if you are going to  
essentially have your own, private repository, then the benefits of  
distributed version control become much less important, if they matter at all
(i.e., you can just keep a copy of your entire repository on your  
laptop).



