In this article I discuss techniques for migrating source code repositories from darcs to git. I describe two approaches that failed for me, and introduce a new tool that I was able to use successfully with my own projects, and that can be used to create git mirrors of active darcs repositories.
Background
I've been a big fan of darcs for over two years, and have used it exclusively for my personal projects. However, in recent times I have been increasingly drawn in by the community and toolset that is growing around git, and I've naturally wanted to migrate some of my darcs repositories to git.
darcs2git
The first tool I tried, darcs2git, uses a low-level git component called git-fast-import to efficiently slurp data into git, but when I tried to use it with the latest git at the time of writing, git-fast-import choked on the binary data passed to it by darcs2git. Game over.
Tailor
Tailor, the Swiss Army Knife of inter-VCS synchronisation, has helped me several times in the past to migrate 80% of one VCS' contents into a new VCS-du-jour, ultimately leading me to abandon it or file bug reports. Your mileage might vary, depending on the source and target VCSes you try it with.
The key to tailor's versatility is also its Achilles heel: it has a standardised notion of a changeset, an intermediate form into which every source changeset is transformed. This notional changeset can contain renames of files and directories, additions and deletions of files, and suchlike.
When I tried to use tailor to convert my darcs repos to git, it became stuck on certain changesets, apparently due to it misunderstanding some of the move/rename cases in the source darcs changesets, and therefore being unable to replay them into the working copy it uses for migration.
A new approach
I decided to write my own naïve conversion script, which would initialize a dual darcs/git repository in a working directory set aside for migration, then gradually pull changes from a source darcs repo and record each changeset wholesale into the git repository.
Using tricks from darcs2git, tailor and git-svn, I was able to do this pretty easily, and my script even supports incremental importing from an active darcs repository, which would allow it to be used for maintaining public git mirrors of darcs repositories.
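The per-patch loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual darcs-to-git source: the Patch struct, commands_for and migrate are hypothetical names, and the sketch assumes the _darcs directory is already hidden from git (for example via .git/info/exclude) and that each author string is in the "Name <email>" form git expects.

```ruby
# Hypothetical sketch of the per-patch replay loop; not the real darcs-to-git code.
Patch = Struct.new(:patch_id, :author, :name)

# Build the external commands needed to replay one darcs patch into git.
# Returns argv arrays, so nothing here goes through a shell.
def commands_for(patch, source_repo)
  [
    # Pull exactly one patch from the source repo into the working directory.
    ["darcs", "pull", "--all", "--match", "hash #{patch.patch_id}", source_repo],
    # Stage every addition, change and deletion that darcs just applied.
    ["git", "add", "--all", "."],
    # Record the whole working tree as a single git commit.
    ["git", "commit", "--author", patch.author, "-m", patch.name]
  ]
end

# Replay each patch in order. `runner` is any callable that executes one
# argv array, which keeps the loop easy to dry-run.
def migrate(patches, source_repo, runner)
  patches.each do |patch|
    commands_for(patch, source_repo).each { |argv| runner.call(argv) }
  end
end

# Possible real runner (assumption):
#   run = ->(argv) { system(*argv) or raise "Failed to run: #{argv.inspect}" }
#   migrate(patches, "../mosaic-2.7b5/", run)
```

Passing the runner in as a callable means the same loop can print the commands for a dry run or execute them for real.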
Unoriginally dubbed 'darcs-to-git', the code can be found (and tracked) here:
darcs-to-git inserts tags into the comments of the git commits it produces, containing the globally unique darcs patch ID from which the commit originated. Having seen this technique used by the excellent git-svn, this seemed the neatest approach for making incremental migration of new darcs patches possible.
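To make the tagging idea concrete, here is a minimal sketch of how such tags could drive incremental imports. The marker prefix and the function names are assumptions chosen for illustration; the real tool's tag format may differ.

```ruby
require "set"

# Hypothetical marker prefix; the real tool's exact tag format may differ.
MARKER = "darcs-hash:"

# Append the darcs patch ID to a commit message as a trailing tag line.
def tag_message(message, patch_id)
  "#{message.rstrip}\n\n#{MARKER}#{patch_id}"
end

# Collect every darcs patch ID mentioned in a list of git commit messages.
def imported_ids(commit_messages)
  pattern = /^#{Regexp.escape(MARKER)}(\S+)$/
  commit_messages.flat_map { |msg| msg.scan(pattern).flatten }.to_set
end

# Keep only the patches with no matching tag in git yet, preserving order.
def pending(patch_ids, commit_messages)
  seen = imported_ids(commit_messages)
  patch_ids.reject { |id| seen.include?(id) }
end
```

Because pending checks every patch ID against the set of tags, it also finds patches that were added to the darcs repository out of order, rather than assuming that everything before the newest imported patch is already in git.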
Branches, and multiple source darcs repositories
darcs-to-git can only import changesets from a single darcs repository, which essentially means there is no support for importing the implicit branching that results from darcs' "cherry-picking" nature. It should be possible to add such support, however.
Each darcs repository is essentially a unique branch, consisting of a dependency-ordered bag of changesets that individually may or may not appear in other darcs repositories. By comparing the patch list of two darcs repositories starting with their earliest (common) revision, it should be possible to determine the points at which to create git branches, and use git-merge and/or git-cherry-pick to propagate later common patches between branches.
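A rough sketch of that comparison, given two lists of patch IDs in applied order, might look like this. The function names and the shape of the returned hash are hypothetical, and the sketch deliberately ignores the fact that darcs may commute patches, so two repositories can hold the same patches in different orders; a real implementation would have to account for that.

```ruby
# Length of the shared leading run of two ordered patch ID lists.
def common_prefix_length(a, b)
  a.zip(b).take_while { |x, y| x == y }.length
end

# Describe how two darcs repos relate, given their patch IDs in applied order.
# Assumption: both lists start from the same earliest patch when they share history.
def branch_plan(trunk_ids, branch_ids)
  fork_at = common_prefix_length(trunk_ids, branch_ids)
  trunk_tail = trunk_ids.drop(fork_at)
  branch_tail = branch_ids.drop(fork_at)
  {
    # Last patch both repos agree on: the point at which to create the git branch.
    fork_after: fork_at.zero? ? nil : trunk_ids[fork_at - 1],
    # Patches present in both repos after the fork: cherry-pick or merge candidates.
    shared_later: trunk_tail & branch_tail,
    # Patches that exist on one side only.
    trunk_only: trunk_tail - branch_tail,
    branch_only: branch_tail - trunk_tail
  }
end
```

The shared_later list is where git-cherry-pick or git-merge would come in, since those patches need to appear on both git branches.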
Patches to darcs-to-git for this or any other worthwhile purpose are welcome.

Thanks for the patch, Adam -- I've applied it now. You're right that the usage was somewhat unclear. Hmm, if plenty of people are actually going to use this, it could use some better documentation...
Cool!
I've used the tool for all my darcs repositories now; worked great (except when darcs ate all the memory in the box because I had large binary patches - a bad idea with darcs).
Best regards, Adam
I may be extremely dense, but how is this supposed to work?
$ mkdir test
$ cd test
$ /usr/src/darcs-to-git/darcs-to-git virgil:/var/lib/darcs/www.koldfront.dk/
Running: ["darcs", "-v"]
Running: ["darcs", "init"]
Running: ["git-init"]
Initialized empty Git repository in .git/
Running: ["darcs", "changes", "--reverse", "--repodir=virgil:/var/lib/darcs/www.koldfront.dk/", "--xml", "--summary"]
darcs failed: can't set directory to virgil:/var/lib/darcs/www.koldfront.dk/
/usr/src/darcs-to-git/darcs-to-git:47:in `output_of': Failed to run: ["darcs", "changes", "--reverse", "--repodir=virgil:/var/lib/darcs/www.koldfront.dk/", "--xml", "--summary"] (RuntimeError)
from /usr/src/darcs-to-git/darcs-to-git:81:in `read_from_repo'
from /usr/src/darcs-to-git/darcs-to-git:185
$
Okay, maybe getting the DARCSREPO via ssh isn't supported, let me try one via http:
$ /usr/src/darcs-to-git/darcs-to-git http://koldfront.dk/darcs/mosaic-2.7b5/
Running: ["darcs", "-v"]
Running: ["darcs", "init"]
Running: ["git-init"]
Initialized empty Git repository in .git/
Running: ["darcs", "changes", "--reverse", "--repodir=http://koldfront.dk/darcs/mosaic-2.7b5/", "--xml", "--summary"]
darcs failed: can't set directory to http://koldfront.dk/darcs/mosaic-2.7b5/
/usr/src/darcs-to-git/darcs-to-git:47:in `output_of': Failed to run: ["darcs", "changes", "--reverse", "--repodir=http://koldfront.dk/darcs/mosaic-2.7b5/", "--xml", "--summary"] (RuntimeError)
from /usr/src/darcs-to-git/darcs-to-git:81:in `read_from_repo'
from /usr/src/darcs-to-git/darcs-to-git:185
$
No, still no go.
Perhaps "usage: darcs-to-git DARCSREPO" should say "usage: darcs-to-git PATHTOLOCALDIRCONTAININGADARCSREPO"?
Let me try that:
$ darcs get http://koldfront.dk/darcs/mosaic-2.7b5/
Copying patch 9 of 9... done.
Applying patch 9 of 9... done.
Finished getting.
$ mkdir hop
$ cd hop
$ /usr/src/darcs-to-git/darcs-to-git ../mosaic-2.7b5/
Running: ["darcs", "-v"]
Running: ["darcs", "init"]
Running: ["git-init"]
Initialized empty Git repository in .git/
Running: ["darcs", "changes", "--reverse", "--repodir=../mosaic-2.7b5/", "--xml", "--summary"]
[...]
1 files changed, 4 insertions(+), 3 deletions(-)
$
Hey, it worked, cool!
Thanks for creating darcs-to-git and sharing it. May I suggest the attached tiny patch?
Best regards,
Adam.
P.S. As you can see, I'm not that used to git yet...
commit 4f71a3ecc89627f0f0cf1f26f49fa688a46f9666
Author: Adam Sjøgren <asjo@koldfront.dk>
Date: Wed Mar 19 23:22:57 2008 +0100
Hint at the fact that the darcs repository must be a local one.
diff --git a/darcs-to-git b/darcs-to-git
index 75b5323..5659817 100755
--- a/darcs-to-git
+++ b/darcs-to-git
@@ -17,12 +17,12 @@ if [nil, '--help', '-h'].include?(SRCREPO)
STDERR.write(<<-end_usage)
Creates git repositories from darcs repositories
- usage: darcs-to-git DARCSREPO
+ usage: darcs-to-git DARCSREPODIR
1. Create an *empty* directory that will become the new git repository
2. From inside that directory, run this program, passing the location
- of the source darcs repo as a parameter
+ of the local source darcs repo as a parameter
The program will git-init the empty directory, and migrate all patches
in the source darcs repo into commits in that repository.
I have been experimenting with this on a large existing darcs repo and have the following minor fixes, which I can send you. Steve, please email me if you would like me to send a git bundle of these commits (I can't find an email address on this site):
- Fixed darcs boringfile expression to correctly exclude .git repository from whatsnew
- Cope with 'darcs setpref' in source repository which overrides the boringfile under _darcs with one under source control
When this happens it removes the patterns we have added to exclude the .git directory.
Handle it by deleting the _darcs/prefs/prefs file whenever it is recreated by pulling patches.
- Force git commits to be recorded even when git can't see any changes
It's possible in some circumstances (e.g. with merge conflicts) to have non-tag darcs patches which don't create or change any files when pulled individually, or which only create/remove empty directories.
- Make the check for an empty git repo work even if git-gc has been run
- Show full output from 'darcs whatsnew' when it reports changes
- When re-pulling to get new patches, consider all patches instead of stopping at the first that is already in git
- More verbose output when deciding which patches to pull
feature request... can --help show some usage info?
A README file wouldn't hurt either. I haven't gotten it to work yet :-(.
Commit b0d3292137a0b in the git repo adds more detailed usage info, which will hopefully get you started.
Please let me know if/how this fails for you.
I found your tool after much struggling with Tailor, and it seems to be working like a charm.
Just had to make one change:
self.name = patch_xml.get_elements('name').first.get_text.value rescue ''
Kartik: I've committed a similar change; thanks.
I tried to convert one of my darcs repositories using your script (648f1b6916119492d7310163497a90e04f48aa69). It had two problems. It totally failed to convert (read: ignore) my setpref patches; I managed to work around this by manually committing and reverting a bogus tree that claims to commit that darcs patch. The other is that it creates broken Author lines, like so:
Author: Antti-Juhani Kaijanaho <Antti-Juhani Kaijanaho <antti-juhani@kaijanaho.fi>>
You can find a tarred and gzipped copy of the darcs repository at http://antti-juhani.kaijanaho.fi/stuff/dctrl-tools.darcs.tar.gz .
Antti-Juhani: Thanks for the feedback. The author issue is already fixed; I encountered it myself and fixed the issue, but hadn't pushed my changes up to the public repository.
Unless I'm misunderstanding what you mean, I think it's normal that the setpref commits would not be migrated to git, because there's no easy way to translate "boring" files etc. to .gitignore.
I'd imagine that a destination git repository would necessarily include one or more commits that would include git-specific repository customisations, e.g. .gitignore files.
First: thanks for fixing the problems :) Version 4f630f3549d255a50080597f0c622cfabd2edeaa finished converting that repository without problems and I don't see any obvious problems in the result.
What I meant about the setpref was that the script crashed when converting a setpref patch, but I assume you figured that out since you've fixed that. As I said in the original comment, I expected the setpref patches to be ignored.
One nit is that you don't seem to include the full commit message in the converted commits (you include just the patch name), but I can work with the repository that I got, so I'm not really complaining.
Thanks again :)
I hadn't noticed the commit message issue, but I've fixed that now too. (I rarely enter commit comments in darcs for the personal projects I have been converting so far.)