Rebasing Toward Independence

Make sure every commit within a pull request at least compiles. A linear git commit history makes this much easier to check.

Introduction

One of the things I will usually do when reviewing code is to verify that every commit within a pull request at least compiles. Since I learned a lot of my craft from developing and reviewing patches, rather than pull requests, this typically was innate in the structure of the code review. Nowadays, though, it's entirely possible that, while a given branch compiles after being merged into another branch, not every commit compiles independently.

You might ask why I'm such a stickler on this. The answer is twofold.

First, I think that commits should be independent of one another. This helps in a number of ways, but mostly it's about the way I think. I want to separate my changes into pieces that are independent of one another, because then I can better summarize them in my mind. This helps me think through, in detail, what steps are required before beginning a particular task. In turn, I've found this leads to fewer errors and more thought up front, before starting code development.

Second, I use git bisect very often when trying to diagnose a bug that's been found. Bisecting revisions to find where a change occurred is more useful when the commits you're bisecting are small, unrelated, and can be independently compiled. This last part is crucial, because if a specific commit can't be built without adding on other commits, you're going to have to run git bisect skip on that commit if it's a pivot point in the bisection search. This increases the likelihood that you're going to end up with multiple commits that could have introduced the bug.
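
To make the cost of an unbuildable commit concrete, here is a minimal sketch of a helper for git bisect run. It assumes a hypothetical project where make builds the code and make test reproduces the bug; both commands, and the name bisect_step, are placeholders rather than anything from a real project. git bisect run treats exit status 125 as "skip this commit", which is what you're forced into whenever a commit doesn't compile.

```shell
#!/bin/sh
# bisect_step: hypothetical helper for `git bisect run`.
# BUILD_CMD and TEST_CMD are placeholders; substitute your project's commands.
BUILD_CMD="${BUILD_CMD:-make}"
TEST_CMD="${TEST_CMD:-make test}"

bisect_step() {
    # Exit status 125 tells `git bisect run` that this commit cannot be
    # tested and must be skipped.
    sh -c "$BUILD_CMD" || return 125
    # Status 0 marks the commit good; a status from 1 to 124 marks it bad.
    sh -c "$TEST_CMD" || return 1
    return 0
}
```

A script that defines this function and ends by calling bisect_step could be passed to git bisect run after git bisect start. Every commit that returns 125 is one git could not classify, so the fewer of those there are, the narrower the final answer.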

Methodology

Now that I've explained why you want each of your commits to compile independently, how does one check that this is actually the case? The answer is git rebase.

You're probably thinking "yeah, I don't touch rebase because it changes the git history." That's a good rule of thumb to have for shared branches (e.g. master or develop), but for temporary branches[1], it's perfectly acceptable to change the history in order to make your commits have a better structure before merging them into your main trunk.

Rebasing Interactively

Git rebase has an interactive switch, -i, that allows you to specify which commits you want to operate on, and what you want to do to them. You just need to tell it how far back to go: give it the commit to start after (HEAD~2 below, which selects the last two commits), and it will operate on everything from there up to your current HEAD:

git checkout feature/my-cool-feature
git rebase -i HEAD~2

This will bring up an editor where you can interactively tell git which commit(s) you want to edit:

[Figure: Example of Editor in Interactive Rebase]

In order to edit specific commits, simply change the word pick at the beginning of each line to edit. Then, save the file and exit the editor. Git will now start stepping through the commits, one by one, until it gets to the end of the set of commits you told it to edit. At each step, you simply have to determine if the code compiles. If it does, then you can run git rebase --continue to move on to the next commit.

Iterating Through Commits

The following is an example of how you can step through a branch with a large number of commits to determine, manually, if each commit compiles.

Note: I'm using git-prompt with zsh on Mac OSX for all of the following examples, which is why you'll see the steps remaining when I run git rebase. Your mileage may vary if you're using a different shell or OS.

  1. Check out the branch
git checkout feature/super-cool-feature
  2. Determine how many commits this branch has diverged from the trunk[2]
git cherry --abbrev=6 -v <trunk-branch> | wc -l
  3. Start git rebase
# Use the commit count from the above step in place of XX
git rebase -i HEAD~XX

# You will need to use your editor to make
# every line in the todo list start with 'edit'
# instead of 'pick'
  4. Check each commit
# Run your compile command. If it's successful, run:
git rebase --continue
  5. Repeat until finished
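
Changing every pick to edit by hand is the dullest part of the steps above. As a sketch, git opens the todo list with whatever command is named in the GIT_SEQUENCE_EDITOR environment variable, so a sed one-liner can make the substitution for you. The function name mark_all_edit is hypothetical, and the -i.bak form is used on the assumption that you want something that works with both GNU and BSD sed.

```shell
# mark_all_edit N: start an interactive rebase over the last N commits with
# every "pick" in the todo list already rewritten to "edit".
# Hypothetical helper name; not part of git.
mark_all_edit() {
    count="$1"
    # git appends the path of the todo file to this command, and sed
    # rewrites that file in place (leaving a harmless .bak copy beside it).
    GIT_SEQUENCE_EDITOR="sed -i.bak -e 's/^pick /edit /'" \
        git rebase -i "HEAD~${count}"
}
```

After mark_all_edit 5, for example, git should stop on the first of the last five commits, and each git rebase --continue moves on to the next one.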

Automating It

The above example is fine, and, for most small branches, I'll just do it manually. However, once you have more than ~10 commits, it becomes tiresome, especially if you've just rebased onto master with no conflicts and only want to make sure everything still compiles, even though you're pretty sure it will.

Here's a small shell script that will enable you to do this automatically:
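
As a minimal sketch of what such a script could look like (an illustration, not necessarily the script originally embedded here), the function below leans on git rebase's --exec option, which runs a command after each commit is applied and stops the rebase when that command fails. The name check_commits and the default build command make are assumptions; pass your own compile command as the second argument.

```shell
# check_commits TRUNK [BUILD_CMD]: replay every commit on the current branch
# that is not on TRUNK and run BUILD_CMD after each one.
# Hypothetical helper; the `make` default is an assumption.
check_commits() {
    trunk="$1"
    build_cmd="${2:-make}"
    git_dir=$(git rev-parse --git-dir) || return 1

    # A leftover rebase directory means an earlier run stopped on a failing
    # commit, so resume that rebase rather than starting over.
    if [ -d "$git_dir/rebase-merge" ] || [ -d "$git_dir/rebase-apply" ]; then
        git rebase --continue
        return $?
    fi

    # Rebase onto the existing merge base so the branch keeps the same
    # starting point and only its own commits are replayed.
    base=$(git merge-base "$trunk" HEAD) || return 1
    # --exec runs the build after each commit and stops on the first failure.
    git rebase --exec "$build_cmd" "$base"
}
```

Running check_commits develop "make -j4" would walk the branch one commit at a time. If a build fails, the rebase stays stopped on that commit; once you've amended it, calling the function again picks up where it left off.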

Fixing a Bad Commit

If you encounter a commit that doesn't compile, you can fix the code and amend the commit with your fix before running git rebase --continue again. If you're using the above script, it will automatically stop for you if the compilation fails so you can fix it. Once fixed, you can re-run the script to continue.
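
While the rebase is stopped on a failing commit, the fix boils down to three git commands. The sketch below wraps them in a function; the name amend_and_continue is hypothetical, and it assumes your fix only touches files git already tracks (new files would need an explicit git add first).

```shell
# amend_and_continue: fold the fixes in the working tree into the commit the
# rebase is currently stopped on, then resume the rebase.
# Hypothetical helper name; not part of git.
amend_and_continue() {
    # Stage modifications and deletions of tracked files only.
    git add -u || return 1
    # Rewrite the stopped commit in place, keeping its original message.
    git commit --amend --no-edit || return 1
    # Move on to the remaining commits.
    git rebase --continue
}
```

Amending in place keeps the fix inside the commit that needed it, rather than adding a separate "fix the build" commit later in the branch.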

Conclusion

Now that you know a) how to quickly check whether the commits on a branch compile independently and b) how to fix commits that don't compile, I hope you'll use this knowledge to make your commits better. You can also check branches when reviewing code to verify that each commit compiles.

Be diligent about having a hygienic repository! It will save you tons of time in the long run.


  1. Almost all of my projects use a variation of git-flow. If you've never heard of it, I highly recommend checking it out.

  2. This command makes use of git cherry, a command I have found extremely useful in the past. You can use it to determine which commits are in one branch but not another. For example, if you wanted to find which commits are in branch my-cool-feature but haven't yet been merged to develop, you could use: git checkout my-cool-feature && git cherry --abbrev=6 -v develop