Reflections on a complaint from a frustrated git user

February 23, 2009
git
SCM

Last week, Scott James Remnant posted a series of “Git Sucks” entries on his blog, starting with this one here, with follow-up entries here and here. His problem? To quote Scott, “I want to put a branch I have somewhere so somebody else can get it. That’s the whole point of distributed revision-control, collaboration.” He thought this was a “mind-numbingly trivial” operation, and was frustrated when it wasn’t a one-line command in git.

Part of the problem here is that for most git workflows, most people don’t actually use “git push”. That’s why it’s not covered in the git tutorial (this was a point of frustration for Scott). In fact, in most large projects, the number of people who need to use the “scm push” command is a very small percentage of the developer population, just as very few developers have commit privileges and are allowed to use the “svn commit” command in a project using Subversion. When you have a centralized repository, only the privileged few will be given commit privileges, for obvious security and quality control reasons.

Ah, but in a distributed SCM world, things are more democratic: anyone can have their own repository, and so everyone can type the commands “git commit” or “bzr commit”. While this is true, the number of people who need to be able to publish their own branch is small. After all, the overhead in setting up your own server just so people can “pull” changes from you is quite large; and if you are just getting started, and only need to submit one or two patches, or even a large series of patches, e-mail is a far more convenient route. This was especially true in the early days of git’s development, before web sites such as git.or.cz, github, and gitorious made it much easier for people to publish their own git repository. Even for a large series of changes, tools such as “git format-patch” and “git send-email” are very convenient for sending a patch series, and on the receiving side, the maintainer can use “git am” to apply a patch series sent via e-mail.
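To make the e-mail route concrete, here is a minimal sketch of both sides of it, run entirely in a scratch directory. The repository names, identities, and file contents are made up for illustration. The actual mailing step (“git send-email”) is left out because it needs a configured mail server, so the patch file is simply handed over on disk.

```shell
#!/bin/sh
# Sketch of the patch-by-e-mail workflow using throwaway repositories.
# All names below (upstream, contributor, outgoing) are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# The maintainer's project, with a single commit.
git init -q upstream
cd upstream
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
cd ..

# The contributor clones the project and commits a fix locally.
git clone -q upstream contributor
cd contributor
git config user.name "Contributor"
git config user.email "contributor@example.com"
echo "hello, world" > README
git commit -q -a -m "README: expand greeting"

# Turn the newest commit into a mailable patch file in ../outgoing.
# In real use these files would then be sent with "git send-email".
git format-patch -1 -o ../outgoing

# The maintainer applies the received patch, keeping author and message.
cd ../upstream
git am -q ../outgoing/*.patch
git log -1 --format=%s   # prints: README: expand greeting
```

Note that the contributor never needed a server or push access; everything the maintainer needs travels inside the patch file.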

It turns out that from a maintainer’s point of view, reviewing patches via e-mail is often much more convenient. Especially for developers who are just starting out with submitting patches to a project, it’s rare that a patch is of sufficiently high quality that it can be applied directly into the repository without needing fixups of one kind or another. The patch might not have the right coding style compared to the surrounding code, or it might be fundamentally buggy because the patch submitter didn’t understand the code completely. Indeed, more often than not, when someone submits a patch to me, it is more useful for indicating the location of the bug than for anything else, and I often have to completely rewrite the patch before it enters the e2fsprogs mainline repository. Given that, publishing a patch that will require modification in a public repository where it is ready to be pulled just doesn’t make sense for many entry-level patch submitters. E-mail is in fact less work, and more appropriate for review purposes.

It is only when a mid-level to senior developer is trusted to create high quality patches that do not need review that publishing their branch in a pull-ready form really makes sense. That is fairly rare, which is why it is not covered in most entry-level git documentation and tutorials. Unfortunately, many people expect to see the command “scm push” in a distributed SCM, and since “git pull” is a commonly used command for beginning git users, they expect that they should use “git push” as well, not realizing that in a distributed SCM, “push” and “pull” are not symmetric operations. Therefore, while most git users won’t need to use “git push”, git tutorials and other web pages which are attempting to introduce git to new users probably do need to do a better job explaining why most beginning participants in a project probably don’t need their own publicly accessible repository that other people can pull from, and to which they can push changes for publication.

There is one exception to this, of course: a developer who wants to start using git for a new project of which he or she is the author/maintainer, or someone who is interested in converting their project to git. And this is where bzr has an advantage over git, in that bzr is primarily funded by Canonical, which has a strong interest in pushing an on-line web service, Launchpad. This makes it easier for bzr to have relatively simple recipes for sharing a bzr repository, since the user doesn’t need to have access to a server with a public IP address, or need to set up a web or bzr server; they can simply take advantage of Launchpad.

Of course, there are web sites which make it easy for people to publish their git repositories; earlier, I had mentioned git.or.cz, github, and gitorious. Currently, the git documentation and tutorials don’t mention them since they aren’t formally affiliated with the git project (although they are used by many git users and developers, and the maintainers of these sites have contributed a large amount of code and documentation to git). This should change, I think. Scott’s frustrations which kicked off his “git sucks” complaints would have been solved if the Git tutorial recommended that the easiest way for someone to publicly publish their repository is via one of these public web sites (although people who want to set up their own server are certainly free to do so).
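For what it’s worth, once a hosting site has given you an empty repository, publishing a branch really is only a couple of commands. The sketch below uses a local bare repository as a stand-in for the hosted one, since the real URL depends on the site; the remote name “public”, the branch name “my-feature”, and the file contents are all invented for the example.

```shell
#!/bin/sh
# Sketch: publish a branch to a hosted repository and fetch it elsewhere.
# hosted.git is a local stand-in for the URL a hosting site would give you.
set -e
scratch=$(mktemp -d)
cd "$scratch"

# Stand-in for the empty repository created on the hosting site.
git init -q --bare hosted.git

# A local project with a topic branch worth sharing.
git init -q project
cd project
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git checkout -q -b my-feature
echo "v2" > notes.txt
git commit -q -a -m "Update notes"

# Name the hosted repository once, then publish the branch to it.
git remote add public "$scratch/hosted.git"
git push -q public my-feature

# Somebody else can now get the branch from the published location.
cd ..
git clone -q -b my-feature hosted.git collaborator
git -C collaborator log -1 --format=%s   # prints: Update notes
```

With a real hosting site, the only change would be replacing the local path given to “git remote add” with the push URL the site provides.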

Most of these public repositories probably won’t have much reason to exist, but they don’t do much harm, and who knows? While most of the repositories published at github and gitorious will be like the hundreds of thousands of abandoned projects on Sourceforge, one or two of the new projects which someone starts experimenting on at github or gitorious could turn out to be the next Ruby on Rails or Python or Linux. And hopefully, they will allow more developers to experiment with publishing commits on their own repositories, and lessen the frustrations of people like Scott who thought they needed their own repositories; whether or not a public repository is the best way for them to do what they need to do, at least this way they won’t get as frustrated about git. 🙂