On Mar 1, 2008, at 6:20 AM, Christoph Held wrote:
I am very serious about starting to use version control for academic writing. And it is not as if you had nothing to do with it. While doing some reading along these lines, I stumbled upon some of your articles in The PracTeX Journal. Although you advocated using Subversion back then, the message for me was to use any kind of versioning system at all.
The last time I did a manual merge of three documents from different authors, it was an absolute nightmare and took a lot of time. Even if I were using version control for myself only, I'd imagine I could still put it to good use by feeding it my coauthors' work after converting it to plain text myself. This is actually one of the reasons I am favouring MultiMarkdown over pure LaTeX at the moment, as it is human-readable and easy to convert to *.rtf or even *.doc format.
I use SVN for all my coding projects, but also for academic writing. WRT the latter, if I am the sole author, then I write in either LaTeX or reStructuredText, and use SVN pretty much exactly as I would for a coding project. In cases where I am not the sole author (which happens to be most of my work at the moment), I have found that the critical distinctions are (1) whether my coauthors are willing to work with text files (e.g., LaTeX, reST, or some other markup), and (2) whether they are willing and/or interested in learning Subversion. This yields 4 cases:
(i) If your coauthors are willing to work with text files, then everything is still pretty straightforward. Even if they don't access the repository themselves (or perhaps just have read-only access), you can still distribute copies of exactly what's in the repository to them, and handle the files you get back from them using standard diff and merge tools.
(ii) Unfortunately, unless you are in a mathematical or computer science-related field, it can be difficult to find coauthors who are comfortable working with text files. Even if you're not asking them to use LaTeX (i.e., you use some kind of simplified markup) and are only asking them to enter their revisions into a master text file that you have already set up, many people will still try to do this in Word, and then want to use "track changes" to make their edits and comments. If your coauthors insist on using Word to make their edits and comments, then I've found the following approach works pretty well. I still maintain the master document in LaTeX or reST (stored in the repository), but then translate it into Word for distribution to my coauthors (e.g., using LaTeX -> latex2rtf -> RTF -> open in Word and save, but there are many other options for doing this). When I get back comments, I check out a copy of the project at the revision from which it was distributed, and then enter each coauthor's changes manually. If you use a separate checkout for each coauthor, then you can use SVN's built-in features for resolving conflicts between their different sets of changes, rather than having to do it manually. In addition, when I do this, I always use Word's "compare documents" feature to compare what they send me to the document I distributed to them. That way, even if they forget to use "track changes" (or use it inconsistently), I'm sure not to miss any changes.
You can, if you want, check in the Word document(s) you distribute and the ones you get back from your coauthors (with their changes), just for completeness' sake. However, this will begin to fill up your repository with a lot of binary junk, and you can't use standard diff and merge tools with these files. Moreover, since MS Office files are automatically modified every time you open them (even if you don't save any changes), they're lousy for strict tracking of changes (e.g., using diff or checksums).
(iii) While I've had little luck getting people who are used to Word to use text files, I have had some luck getting non-technical people to learn how to use Subversion. This is because Subversion is easy to use (at least for most standard operations) and well-documented, and tools like TortoiseSVN make it very easy for Windows users to pick up. If your coauthors want to do this -- even if they are using Word -- it can still be helpful, because it eliminates your having to serve as the middle-man for exchanging files and provides you with an automatic log of all versions. If you do this, one strategy is to create a "Word" branch of your project, where your coauthors can check out the latest Word version and check in their changes. You can then occasionally merge the changes from this branch onto the trunk and update the branch with your own changes (from the trunk), as necessary. In fact, as long as you're careful not to copy files between the branch and trunk, this also makes it easy to purge the repository of all the Word files, once the paper has been published.
(iv) If your coauthors use LaTeX (or some other text-based markup) *and* know or are willing to learn SVN, then you're in heaven. Everything works just like a software project. Believe it or not, this has occasionally happened to me.
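To make cases (i) and (ii) a bit more concrete, here is a minimal Python sketch, not a definitive recipe. The first function shows what a coauthor changed relative to the text you distributed, using the standard difflib module. The second only builds (and does not run) the svn commands for one checkout per coauthor at the distributed revision. The repository URL, the revision number, the coauthor names, and the "paper-<name>" directory naming are all made up for illustration.

```python
import difflib


def changes_since_distribution(distributed: str, returned: str, who: str) -> list[str]:
    """Unified diff from the text you sent out to the text a coauthor sent back."""
    return list(difflib.unified_diff(
        distributed.splitlines(),
        returned.splitlines(),
        fromfile="distributed",
        tofile=f"returned-by-{who}",
        lineterm="",
    ))


def per_coauthor_commands(repo_url: str, revision: int,
                          coauthors: list[str]) -> dict[str, list[list[str]]]:
    """Build, but do not execute, svn commands for one checkout per coauthor.

    The working-copy names ("paper-<name>") are hypothetical.
    """
    plan = {}
    for name in coauthors:
        workdir = f"paper-{name}"
        plan[name] = [
            # 1. Check out the revision that was actually distributed.
            ["svn", "checkout", "-r", str(revision), repo_url, workdir],
            # (Enter this coauthor's changes into the working copy by hand here.)
            # 2. Bring the working copy up to date; svn flags any conflicts.
            ["svn", "update", workdir],
            # 3. Commit once the conflicts are resolved.
            ["svn", "commit", "-m", f"Changes from {name}", workdir],
        ]
    return plan


if __name__ == "__main__":
    sent = "Introduction\nMethods\nResults\n"
    back = "Introduction\nRevised methods\nResults\n"
    for line in changes_since_distribution(sent, back, "alice"):
        print(line)

    # Hypothetical repository URL and revision number.
    plan = per_coauthor_commands("file:///path/to/repo/trunk", 42, ["alice", "bob"])
    for name, commands in plan.items():
        for command in commands:
            print(" ".join(command))
```

Printing the commands rather than running them keeps the sketch safe to try anywhere; to execute them you would pass each list to subprocess.run, assuming the svn client is installed.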
Just a few of my own experiences -- YMMV.
-- Phil
P.S. IMHO, the simplicity of Subversion, its excellent documentation, and the existence of graphical clients like TortoiseSVN (for those coauthors who aren't comfortable working at the command line) should not be overlooked, especially for projects like academic writing (or any writing, for that matter). In addition, if you are essentially going to have your own, private repository, then the benefits of distributed version control become much less important, if they matter at all (i.e., you can just keep a copy of your entire repository on your laptop).