Many people treat commits as mere save points, but commits have great value when used correctly. They are not just a better way to save a file: used well, they document the history of the changes. If we treat them as linear save points in our work, the history becomes extremely noisy, full of commits like "oops, fix tests", "oops, fix indentation", and "oops, fix typo" (referred to as oops commits from here on). Committing something that is wrong is inevitable, but git allows us to rewrite history so that reading it becomes a pleasure and a lot more efficient.

The idea is to treat commits not as linear save points of our files, but as bundles of related changes. For example, changing a function and changing the unit tests of that function are related. However, changing a function and fixing a typo in the documentation of another function are not related. So if, during review, an improvement needs to be made to the change in one of your commits, simply add that improvement to the existing commit. Treating commits this way has a number of advantages, and it also incurs a cost.

Advantages

When Reviewing

The first advantage shows when you are the reviewer of a medium to large pull request. If the submitter has organized the changes into separate commits, the pull request is much easier to review because you can look at one type of change, or the changes related to one thing, at a time. That way you can review better, in less time. If the commits are disorganized, you have to review the whole thing at once, which is often much more confusing.

When Reverting

Reverting a commit that does not represent a complete change in itself, but is accompanied by a number of oops commits, is more tedious than reverting a change that is bundled correctly in a single commit. It is especially tedious when those commits are interleaved with commits that really should be bundled with another change.
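To illustrate, here is a minimal sketch of reverting a well-bundled change. The throwaway repository, the file names (calc.py, test_calc.py, README) and the commit messages are all made up for the example. Because the function and its tests live in one commit, a single revert removes the whole change and leaves the unrelated commit alone.

```shell
set -eu

# Throwaway repository; every file name and message here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo 'Calculator' > README
git add README
git commit -q -m "Initial commit"

# One bundled commit: the function together with its tests.
echo 'def divide(a, b): return a / b' > calc.py
echo 'assert divide(4, 2) == 2' > test_calc.py
git add calc.py test_calc.py
git commit -q -m "Add divide() with tests"

# A later, unrelated commit.
echo 'Usage: import calc' >> README
git commit -q -am "Document usage"

# Reverting the bundled change is a single command.
git revert --no-edit HEAD~1 > /dev/null

git log --format=%s
```

Had the tests arrived in a separate oops commit, the same cleanup would have needed a second revert, and someone would have had to know that the two commits belong together.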

When Blaming

When looking at the blame entries in the code, it is really not interesting to learn that the last change on a line is "oops forgot this". It is also not very interesting to see the message "Add feature X to module B" when you are looking at a file in module A that is completely unrelated. In those types of situations, you have to take additional steps to get further blame information, which may be uninteresting again, until you hit something interesting. On the other hand, if the changes are bundled well, you see the information you need without the noise.

When Cherry-picking

As in the reverting use case, cherry-picking commits is much easier when changes are bundled correctly. If related changes are bundled, you usually only have to cherry-pick one commit. If there are ten commits for a change, you have to cherry-pick all ten. Because it is hard to come up with ten good messages for one change, most of them will have bad messages, which means you will have to inspect a bunch of commits to find which ones are related, and you are likely to miss one. At the other extreme, if multiple changes have all been squashed into the same commit, you have to jump through a few hoops to cherry-pick only the part of the commit you need.

When Investigating

When investigating a new bug or a regression, it is often extremely valuable to see how the code changed. Typically, you move through the history of the file one commit at a time. If you have five oops commits for each real commit, going through the history becomes really painful. If the amount of noise is too high, you might end up not looking at the history at all because it wastes too much time. If the changes are well bundled, that work is much more efficient and you get to the root of the issue quicker.

When Rebasing

Working in parallel with many other people means that at some point you will encounter merge conflicts when trying to get your pull request merged. When you rebase over the other changes, sometimes you will get small hunks that are easy to resolve and sometimes you will get a huge hunk that is difficult to resolve. In those cases, it is usually useful to look at the history of the file to learn what was changed. If the changes are well bundled, you can easily find what the change was. On the other hand, if that change is split into fifteen commits, ten of which are oops commits, it is really painful to go through the commits, make sense of what the author intended, and work out how that fits with your change.

Costs

The cost of bundling changes judiciously is mainly the cost of learning how to use your favorite git client appropriately. This is a fixed cost that you only have to pay once. Then, you have the cost of the operations themselves. The main operations to know are amend, fixup, squash and edit. Let's take a look at the cost of each of them. My estimates are based on my own experience as someone used to doing these operations with my favorite git client, magit.

Amend

Amend means you add changes to the previous commit and optionally edit its message. Most of the time, that means you do not need to think of, or type, a commit message. Thus, if writing a commit message takes 10 seconds, amending saves you 10 seconds compared with a regular commit.
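On the plain git command line, amending can be sketched as follows. The temporary repository, the file calc.py and the commit message are hypothetical and only serve the example. The --no-edit flag keeps the existing message, which is why no message needs to be typed.

```shell
set -eu

# Throwaway repository; every file name and message here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

# First attempt, with a bug in it.
echo 'def add(a, b): return a - b' > calc.py
git add calc.py
git commit -q -m "Add add() to calc"

# Fix the bug and fold the fix into the previous commit,
# reusing its message instead of creating an oops commit.
echo 'def add(a, b): return a + b' > calc.py
git add calc.py
git commit -q --amend --no-edit

amend_count=$(git rev-list --count HEAD)
amend_subject=$(git log -1 --format=%s)
echo "$amend_count commit(s): $amend_subject"
```

The history still contains a single commit, and that commit now holds the corrected function.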

Fixup

Fixup means you fold a new commit into an existing one. Usually, I use this operation to add changes to a commit that is not the last one. With the magit feature called "instant fixup", it is just a matter of choosing the commit to which the changes should be added. That is usually quicker than writing a commit message.
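Without magit, roughly the same result can be obtained with two plain git commands: create a fixup commit, then let an autosquash rebase fold it into its target. The sketch below assumes a throwaway repository with made-up files and messages. Setting GIT_SEQUENCE_EDITOR=true is only there to accept the generated todo list without opening an editor.

```shell
set -eu

# Throwaway repository; every file name and message here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo 'Calculator' > README
git add README
git commit -q -m "Initial commit"

echo 'def add(a, b): return a - b' > calc.py
git add calc.py
git commit -q -m "Add add() to calc"

echo 'Run: pip install calc' > INSTALL.txt
git add INSTALL.txt
git commit -q -m "Document install steps"

# The bug is in a commit that is no longer the last one.
echo 'def add(a, b): return a + b' > calc.py
git add calc.py
git commit -q --fixup=HEAD~1   # creates "fixup! Add add() to calc"

# Fold the fixup commit into its target without opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

git log --format=%s
```

After the rebase, the history holds three commits again, and the fix lives inside "Add add() to calc" rather than in a separate commit.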

Squash

Squash means merging multiple commits together. I don't do this operation very often, but it is a matter of starting an interactive rebase, marking the commits to be squashed and executing the operation. Perhaps around 10 seconds in total.
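Here is a minimal, non-interactive sketch of that squash with plain git, assuming a throwaway repository with made-up files and messages. In normal use you would edit the todo list by hand. The sed call stands in for that manual edit, and GIT_EDITOR=true accepts the combined commit message as is.

```shell
set -eu

# Throwaway repository; every file name and message here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo 'Calculator' > README
git add README
git commit -q -m "Initial commit"

echo 'def subtract(a, b): return a - b' > calc.py
git add calc.py
git commit -q -m "Add subtract()"

echo 'assert subtract(3, 1) == 2' > test_calc.py
git add test_calc.py
git commit -q -m "Add tests for subtract()"

# Change the second todo line from "pick" to "squash" so the tests
# are merged into the commit that introduced the function.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -q -i HEAD~2

git log --format=%s
```

The two commits become one whose subject is "Add subtract()", with the second message kept in the body.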

Edit

Edit allows maximum flexibility and is usually used to split a commit into multiple commits. The resulting commits are created the very same way you make any commit, but you first need to start an interactive rebase (perhaps 3 seconds), mark the commit you want to split (1 second) and, once the new commits are made, continue the rebase (maybe 4 seconds).
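As a sketch of the split with plain git, assume a throwaway repository in which one commit mixes a new function with an unrelated typo fix. All names are made up for the example. The sed call stands in for marking the commit as "edit" by hand in the todo list.

```shell
set -eu

# Throwaway repository; every file name and message here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo 'Calulator' > README
git add README
git commit -q -m "Initial commit"

# One commit that bundles two unrelated changes.
echo 'def multiply(a, b): return a * b' > calc.py
echo 'Calculator' > README
git add calc.py README
git commit -q -m "Add multiply() and fix README typo"

# Mark the commit as "edit"; the rebase stops right after applying it.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -q -i HEAD~1

# Undo the commit but keep its changes in the working tree,
# then commit each change on its own.
git reset -q HEAD~1
git add calc.py
git commit -q -m "Add multiply()"
git add README
git commit -q -m "Fix typo in README"

GIT_EDITOR=true git rebase --continue

git log --format=%s
```

The single mixed commit is replaced by two focused ones, each of which can be reverted or cherry-picked on its own.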

Conclusion

In summary, the cost of properly bundling changes is minimal. It is often quicker than adding oops commits, and when it is not, it only takes a few extra seconds here and there to manipulate git history. The advantages, on the other hand, are enormous. Commits are an extremely valuable piece of documentation, notably because they are tied to the code at a certain date. Thus you know that the message was up to date for that code. Making better commits makes the code more understandable and also forces you to think about the changes you make and be more deliberate about them.

On git commits

Jean-Sebastien A. Beaudry
