Understanding delta file changes and merge conflicts in Git pull requests
A while ago I worked on a support request from a user reporting unexpected behavior from Git when completing a large, long-lived pull request in Azure Repos. The pull request had known conflicts, but it was also missing changes in files and paths where no merge conflicts had been detected at all. This made me think that Git was indeed, as the user reported, ignoring some changes on its own. Long story short, it was not.
Let’s dig into this, and hopefully help anyone using Git understand the key steps of a pull request and why the merge performed at its completion produces the changes it does.
Delta file changes for a Git pull request
The changes applied when a pull request is completed are the result of merging the head of the source branch into the head of the target branch. We refer to these changes as delta file changes (Δchanges). Git determines them by comparing the heads of both branches against the merge-base commit, their most recent common ancestor.
Any commit before the merge-base commit will not be considered because it is part of both branches. When the source branch has grown so big that it is difficult to find the merge-base commit in the history you can run:
git merge-base refs/heads/master refs/heads/dev
In this example, the target branch is named master and the source branch is named dev. The command prints the SHA of the merge-base commit. All commits made on the source branch after the merge-base commit are considered Δchanges; these are the commits listed in the Commits tab of your pull request.
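As a rough illustration of the idea above, the following sketch builds a throwaway repository and lists the commits and files that make up the Δchanges. The repository, file names, and commit messages are all made up for the example; only master and dev come from the article.

```shell
set -eu

# Throwaway repository; every name and file below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "base commit"

# Source branch: two commits after the branch point.
git checkout -q -b dev
echo "one" >> file.txt
git commit -q -am "dev: first change"
echo "two" >> file.txt
git commit -q -am "dev: second change"

# The target branch keeps moving as well.
git checkout -q master
echo "other" > other.txt
git add other.txt
git commit -q -m "master: unrelated change"

base=$(git merge-base refs/heads/master refs/heads/dev)

# Commits reachable from dev but not from the merge base.
commits=$(git log --oneline "$base"..refs/heads/dev)
echo "$commits"

# Files whose content differs between the merge base and the head of dev.
files=$(git diff --name-only "$base" refs/heads/dev)
echo "$files"
```

The three-dot form git diff master...dev is shorthand for the same comparison between the merge base and the head of dev.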
You can pass any type of Git reference to the merge-base command, and you can mix different types in the same call:
– Explicit SHA values e.g., 5901ba2, 6b26d9e, 5597095
– Tags e.g., refs/tags/alpha, refs/tags/v1.1
– Remotes e.g., refs/remotes/origin/master
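A small sketch of mixing reference types, assuming a throwaway repository in which a hypothetical tag named alpha marks the branch point. Remote-tracking references such as refs/remotes/origin/master are passed the same way, but the sketch does not set up a remote.

```shell
set -eu

# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "base commit"
git tag alpha                      # hypothetical tag on the branch point

git checkout -q -b dev
echo "dev" >> file.txt
git commit -q -am "dev change"
dev_sha=$(git rev-parse --short HEAD)   # abbreviated SHA of the dev head

git checkout -q master
echo "m" > m.txt
git add m.txt
git commit -q -m "master change"

# Branch + branch, branch + explicit SHA, tag + branch.
by_branch=$(git merge-base refs/heads/master refs/heads/dev)
by_sha=$(git merge-base refs/heads/master "$dev_sha")
by_tag=$(git merge-base refs/tags/alpha refs/heads/dev)

echo "$by_branch"
echo "$by_sha"
echo "$by_tag"
```

In this setup all three calls print the same commit, because the tag sits exactly on the branch point.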
Merge conflicts in Git pull requests
A pull request has conflicts when both the source branch and the target branch contain competing changes made after the merge-base commit. This means both branches grew in parallel after the source branch was cut from the target branch and, at some point, both changed the same part of the same file.
Git is especially good at tracking changes in text/code files, where it can merge at the line level. If the file is not a text/code file (such as an image or a video), Git treats any change as a new version of the whole file.
This diagram shows an example of a merge conflict: both branches received a commit on the file represented by the square. If we attempt to merge these branches, Git won’t know which version of the file you intend to keep as final; we call these competing files.
For competing files you’ll have to reconcile the branches yourself by committing the version you want to keep. Git makes this easier by writing conflict markers into the affected lines of the file in your working tree once the conflict has been detected; see Resolving a merge conflict using the command line.
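The conflict described above can be reproduced locally. This is a minimal sketch in a throwaway repository, with square.txt standing in for the file drawn as a square; the contents and the choice to keep the dev version are arbitrary.

```shell
set -eu

# Throwaway repository; square.txt stands in for the "square" file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "original line" > square.txt
git add square.txt
git commit -q -m "base commit"

git checkout -q -b dev
echo "dev version" > square.txt
git commit -q -am "dev edits square.txt"

git checkout -q master
echo "master version" > square.txt
git commit -q -am "master edits square.txt"

# Both branches changed the same line, so the merge stops with a conflict.
if git merge dev >/dev/null 2>&1; then
  merge_status=clean
else
  merge_status=conflict
fi

conflicted=$(git diff --name-only --diff-filter=U)   # unmerged paths
markers=$(grep -c '^<<<<<<< ' square.txt)            # conflict marker blocks
cat square.txt

# Resolve by committing the version to keep (here: the dev one).
echo "dev version" > square.txt
git add square.txt
git commit -q -m "Merge dev, keeping the dev version of square.txt"
resolved=$(cat square.txt)
```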
Git ignoring file changes after a pull request completion with no conflicts
Remember we said Git determines which changes to apply based on the merge-base commit? Let’s work through a more complicated scenario to demonstrate this.
Which is the final version of the file represented by the square?
In this example, a user modified the file in the source branch and then rolled it back to the way it was at the merge-base commit, expecting Git to set this version as the final one for the merge operation.
However, Git compared the version of the file at the merge-base commit against the head of the source branch, determined that the source branch had made no net change to it, and therefore had nothing to apply. Meanwhile, the target branch received a change to the same file. Since the source branch contributed no change to this file, the target branch’s version was kept, untouched and with no conflicts, when the merge completed. Keep in mind that this behavior applies not only at the file level but also at the line level for text/code files.
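The scenario can be reproduced with a few commands. This is a sketch in a throwaway repository, where v1, v2, and v3 are placeholder contents for the file drawn as a square: the source branch changes the file and rolls it back, the target branch changes it once, and the merge completes without conflicts.

```shell
set -eu

# Throwaway repository; v1/v2/v3 are placeholder contents.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > square.txt
git add square.txt
git commit -q -m "base commit (v1)"

# Source branch: change the file, then roll it back to the base version.
git checkout -q -b dev
echo "v2" > square.txt
git commit -q -am "dev: change to v2"
echo "v1" > square.txt
git commit -q -am "dev: roll back to v1"

# Target branch: change the same file.
git checkout -q master
echo "v3" > square.txt
git commit -q -am "master: change to v3"

base=$(git merge-base master dev)

# From the merge base to the head of dev there is no net change to the file.
if git diff --quiet "$base" dev -- square.txt; then
  source_delta=none
else
  source_delta=changed
fi

# The merge completes cleanly and keeps the target's version.
git merge -q --no-edit dev
merged=$(cat square.txt)
echo "source delta: $source_delta, merged content: $merged"
```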
From the perspective of a support engineer, my suggestion to developers who see unexpected results from a merge is to identify the head of the source branch at the moment the pull request was completed and compare it against the merge-base commit, which you can find with the git merge-base command. With that comparison in hand, the possible explanations narrow down to merge conflicts or to changes that were never part of the Δchanges. If the issue remains unclear, we’ll be happy to help you in the Customer Service and Support team for Azure DevOps.
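One way to run this check after the fact is to start from the merge commit itself. This sketch assumes the pull request was completed with a regular merge commit (not a squash), so that the first parent is the target head and the second parent is the source head at completion time. The repository and file names are invented for the example.

```shell
set -eu

# Throwaway repository reproducing a completed pull request.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > square.txt
git add square.txt
git commit -q -m "base commit"

git checkout -q -b dev
echo "v2" > square.txt
git commit -q -am "dev: change square.txt"
echo "v1" > square.txt
git commit -q -am "dev: roll square.txt back"
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "dev: add feature.txt"

git checkout -q master
echo "v3" > square.txt
git commit -q -am "master: change square.txt"
git merge -q --no-ff --no-edit dev      # stands in for completing the pull request

# Assumption: HEAD is the merge commit created at completion.
merge_commit=$(git rev-parse HEAD)
target_head=$(git rev-parse "$merge_commit^1")
source_head=$(git rev-parse "$merge_commit^2")
base=$(git merge-base "$target_head" "$source_head")

# What each side contributed relative to the merge base.
source_delta=$(git diff --name-only "$base" "$source_head")
target_delta=$(git diff --name-only "$base" "$target_head")
echo "source contributed: $source_delta"
echo "target contributed: $target_delta"
```

A file that the developer touched but that is missing from the source side's list was rolled back to its merge-base content, so the merge had nothing to apply for it.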
merge-base can work, but it’s often easier to just rebase your source branch onto your target branch… and yes, this means you have to force push, but most of the time the source branch is a feature branch that only a subset of devs are working on, and it’s gonna go away after the merge anyway. If you rebased in the given scenario, the “revert” commit would have come up as a conflict and the dev would have realized someone in trunk made changes. If you’re scared of force pushing the history… rebase in yet a 3rd branch: cut source to source2, rebase there, PR that branch to target, and close your original PR. Rebase leads to much less surprise in practice.
You’re absolutely right, rebasing the source branch would prevent this scenario from happening. The article was originally meant to shed some light for users who had unexpected changes applied once the PR was completed and the source branch was probably deleted. I will update the article to mention this workflow as a safety measure. I believe pulling the origin/target branch into the source branch locally (reverse integration) would prevent these scenarios as well.
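Both workflows from this exchange can be tried on the rollback scenario. The sketch below uses a throwaway repository with placeholder contents; source2 is the extra branch suggested in the comment. In this particular setup the rebase stops on the first replayed commit, and the reverse integration merges cleanly but brings the target's change into the source branch where the developer can see it before completing the pull request.

```shell
set -eu

# Throwaway repository with the rollback scenario (v1 -> v2 -> v1 on dev).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > square.txt
git add square.txt
git commit -q -m "base commit (v1)"

git checkout -q -b dev
echo "v2" > square.txt
git commit -q -am "dev: change to v2"
echo "v1" > square.txt
git commit -q -am "dev: roll back to v1"

git checkout -q master
echo "v3" > square.txt
git commit -q -am "master: change to v3"

# Option 1: rebase a copy of the source branch, leaving dev untouched.
git checkout -q -b source2 dev
if git rebase master >/dev/null 2>&1; then
  rebase_status=clean
else
  rebase_status=conflict      # the developer now sees the competing change
  git rebase --abort          # abort here only to keep the sketch short
fi

# Option 2: reverse integration, merging the target into the source branch.
git checkout -q dev
git merge -q --no-edit master
after_reverse_integration=$(cat square.txt)

echo "rebase: $rebase_status"
echo "square.txt on dev after merging master: $after_reverse_integration"
```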
Adding a reference: Pull Requests with Rebase, by Edward Thomson.