Transform large pull-requests into smaller manageable ones.

Small pull requests of about 200 lines ought to be the default and most common mode of development. Many companies and researchers have published evidence for this: small PRs improve development velocity, code quality, and the quality of the final product, and they give the business the agility to change direction without significant waste or over-investment.

Unfortunately, we don’t all live in the world of small pull requests. Yes, you may know some individual developers who do this, or perhaps you work in one of the few companies that work effectively in small PRs, but the reality is that the “industry default mode” of development is submitting PRs for review with thousands of changed lines across multiple areas of an application. Here are some examples I’ve collected from real life.

You see this kind of PR in companies where 1 developer writes 1 feature over 2 weeks, and when “done” submits it for a review.

You will see this kind of PR in companies that “review an epic” in their development pipeline.

**Something to think about:** Using the well established 200 lines per hour metric, this review should take nearly 100 hours of *work* to discover the defects. Since the 200 lines per hour rule also requires a *break*, it will take 200 hours of *scheduled time* to review this work. At a 40-hour work week, that is **roughly 5 weeks** before the first pass can be completed. If it happens faster, the data shows us that many defects will be missed. Missed defects in code review become the **much more expensive** kind of defect: the ones your QA team (if you have one) or your customers find.
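
The arithmetic above can be written out as a small sketch. The 200 lines per hour rate and the doubling for breaks come from the paragraph above; the 40-hour week, the function name `review_schedule`, and the round 20,000-line PR size are assumptions made only for illustration.

```python
def review_schedule(lines_changed, lines_per_hour=200, break_ratio=1.0, hours_per_week=40):
    """Estimate review effort for a PR.

    break_ratio=1.0 assumes one hour of break per hour of review,
    which is how 100 hours of work becomes 200 scheduled hours.
    hours_per_week=40 is an assumed standard work week.
    """
    review_hours = lines_changed / lines_per_hour
    scheduled_hours = review_hours * (1 + break_ratio)
    weeks = scheduled_hours / hours_per_week
    return review_hours, scheduled_hours, weeks


# A hypothetical 20,000-line PR versus a 200-line PR.
for size in (20_000, 200):
    work, scheduled, weeks = review_schedule(size)
    print(f"{size} lines: {work:g} h of review, {scheduled:g} h scheduled, {weeks:g} weeks")
```

Under the same assumptions, a 200-line PR needs about one hour of review and two hours of scheduled time, which is why it can be turned around within a day.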

Ways to break up a large PR

Traditionally this has been an exhausting process of removing files, creating more branches, adding files back, and a lot of cut-and-paste work. This is error-prone and can lead to missing code or even to copying over outdated code that was not “final”. To be frank, it’s a massive pain, and the developers who create large PRs don’t like breaking them up after the fact either.

I’ve created a tool for you called “the git exfiltrator” (exfiltrate means the opposite of infiltrate). It will break your massive branch apart into smaller branches based on file paths.
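
The sketch below is not the git exfiltrator itself and does not reflect its command-line interface. It only illustrates the general idea of splitting a branch by file path, assuming plain `git switch`, `git checkout <branch> -- <paths>`, and `git commit`. The function name `plan_split`, the branch naming scheme, and the `main` base branch are all hypothetical. It builds the commands without running them, so you can inspect the plan first.

```python
from collections import defaultdict


def plan_split(changed_files, source_branch, base_branch="main"):
    """Group changed files by top-level folder and plan one branch per group.

    Returns {new_branch_name: [git command as an argument list, ...]}.
    Nothing is executed; the caller decides whether to run the commands.
    """
    groups = defaultdict(list)
    for path in changed_files:
        parts = path.split("/")
        # Files in the repository root share one group (assumed name "root").
        key = parts[0] if len(parts) > 1 else "root"
        groups[key].append(path)

    plan = {}
    for key in sorted(groups):
        branch = f"{source_branch}-{key}"
        paths = sorted(groups[key])
        plan[branch] = [
            # Start a fresh branch from the base, not from the large branch.
            ["git", "switch", "-c", branch, base_branch],
            # Copy only this group's files from the large branch (also stages them).
            ["git", "checkout", source_branch, "--", *paths],
            ["git", "commit", "-m", f"Split {key} out of {source_branch}"],
        ]
    return plan


files = ["api/users.py", "api/orders.py", "web/app.js", "README.md"]
for branch, commands in plan_split(files, "big-feature").items():
    print(branch)
    for command in commands:
        print("  " + " ".join(command))
```

One limitation of this sketch: copying files this way does not carry over files that were deleted on the large branch, so deletions would need separate handling.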

Who is working with large PRs?

A typical development process that causes large PRs can look like this:

If this process were pushing fine-grained tasks (around 200 LOC), it might be manageable, but that’s less common than we would like. We have this problem for any number of reasons, and most of those reasons require a long cultural transformation, the kind of thing you can’t change with a single email to the team.

We know that in large PRs:

Detailed code reviews aren’t happening.
Adding reviewers actually reduces quality.
Unrelated changes are easily overlooked.
It takes longer to ship the complete feature.
It demands a longer work backlog.
It reduces the company’s ability to quickly change goals.
It reduces the possibility for rapid learning and innovation.
If 9 out of 10 parts of a PR are perfect, but a single flaw exists in 1 part, then all 10 elements are held back and wait for that single part to get fixed.

Possible ways to break up a large PR

Move out all the noise: readme, meta files, dependencies, etc.
Align on the functionality: if you break up functionality by folder, this could be trivial.
Align on the high-level design: a module’s API interfaces, database schema, controller paths and URLs, cross-domain-calls, etc.
Focus on tests: review the tests first. If they’re incorrect or missing, you can start here and require development to incrementally fulfill the tests (a kind of strange, wormhole-style TDD).
Check out this concept piloted at Microsoft a few years ago: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/ClusterChages-ICSE2015.pdf and https://www.microsoft.com/en-us/research/publication/helping-developers-help-themselves-automatic-decomposition-of-code-review-changesets/
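
As a rough illustration of the first few strategies above, the following sketch sorts a list of changed files into review slices: noise first, then tests, then one slice per top-level folder. The list of “noise” file names, the test-detection rule, and the names `classify` and `slice_pr` are assumptions for this example; a real project would tune them to its own layout.

```python
from pathlib import PurePosixPath

# Assumed examples of "noise" files; adjust for your own repository.
NOISE_NAMES = {"README.md", "CHANGELOG.md", "package-lock.json", "yarn.lock", ".gitignore"}


def classify(path):
    """Return the name of the review slice a changed file belongs to."""
    p = PurePosixPath(path)
    if p.name in NOISE_NAMES:
        return "noise"
    # Assumed convention: tests live in a "tests" folder or are named test_*.
    if "tests" in p.parts or p.name.startswith("test_"):
        return "tests"
    # Everything else is grouped by top-level folder; root files share "root".
    return p.parts[0] if len(p.parts) > 1 else "root"


def slice_pr(paths):
    """Group changed file paths into {slice_name: [paths]}."""
    slices = {}
    for path in sorted(paths):
        slices.setdefault(classify(path), []).append(path)
    return slices


changed = [
    "README.md",
    "yarn.lock",
    "billing/invoice.py",
    "billing/tests/test_invoice.py",
    "search/index.py",
]
for name, files in slice_pr(changed).items():
    print(name, files)
```

Each slice is a candidate for its own small PR. Slicing by high-level design (APIs, schemas, URLs) needs human judgment and is not something a path-based rule like this can decide.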

git-exfiltrate: rescue large branches

Further reading about why long PRs are bad for everyone

Articles and Posts
