Git is a distributed version control system, so by default each Git repository has a copy of all files in the entire history. Even moderately sized teams can create thousands of commits, adding hundreds of megabytes to the repository every month. As your repository grows,
In a previous blog series, we announced that Git has a new commit-graph feature, and described some future directions. Since then, the commit-graph feature has grown and evolved. In the recently released Git version 2.24.0, the commit-graph is enabled by default!
In previous posts I’ve talked about performance improvements that our team contributed to the Git community. At Microsoft, we’ve been pushing Git to its limits with the largest and busiest Git repositories on the planet, improving core Git as we go and sending these improvements back upstream.
We’ve been discussing the commit-graph feature in Git 2.18 and how we can use generation numbers to accelerate commit walks. One area where we can get a significant speedup is when presenting output in topological order. Using generation numbers there lets us walk a much smaller list of commits than before.
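To make the idea of generation numbers concrete, here is a minimal Python sketch, not Git's actual implementation. It assumes the usual definition: a commit with no parents has generation 1, and any other commit has a generation one greater than the largest generation among its parents. The function names (`generation_numbers`, `can_reach`) and the toy history are hypothetical, chosen only for illustration. Because every ancestor has a strictly smaller generation than its descendants, a walk looking for `target` can skip any commit whose generation is not greater than the target's.

```python
from typing import Dict, List, Tuple

def generation_numbers(parents: Dict[str, List[str]]) -> Dict[str, int]:
    """Compute gen(c) = 1 + max(gen(p) for each parent p); roots get 1."""
    gen: Dict[str, int] = {}
    for start in parents:
        stack = [start]
        while stack:
            commit = stack[-1]
            if commit in gen:
                stack.pop()
                continue
            missing = [p for p in parents[commit] if p not in gen]
            if missing:
                # Parents must be numbered before the child.
                stack.extend(missing)
            else:
                gen[commit] = 1 + max((gen[p] for p in parents[commit]), default=0)
                stack.pop()
    return gen

def can_reach(parents: Dict[str, List[str]], gen: Dict[str, int],
              source: str, target: str,
              use_generations: bool = True) -> Tuple[bool, int]:
    """Return (is target an ancestor of source?, number of commits walked)."""
    if source == target:
        return True, 0
    seen = {source}
    stack = [source]
    walked = 0
    while stack:
        commit = stack.pop()
        walked += 1
        for p in parents[commit]:
            if p == target:
                return True, walked
            if p in seen:
                continue
            # Pruning step: a commit whose generation is <= gen[target]
            # cannot have target as an ancestor, so do not walk into it.
            if use_generations and gen[p] <= gen[target]:
                continue
            seen.add(p)
            stack.append(p)
    return False, walked

# Hypothetical toy history: each commit maps to its list of parents.
parents = {
    "A": [],          # root commit
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["C", "D"],  # merge commit
    "F": ["A"],       # side branch
    "G": ["F"],
}
gen = generation_numbers(parents)

print(can_reach(parents, gen, "E", "G"))                         # pruned walk
print(can_reach(parents, gen, "E", "G", use_generations=False))  # full walk
```

In this toy history the pruned walk answers "is G an ancestor of E?" after looking at a single commit, while the unpruned walk visits every ancestor of E before giving the same answer.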
Earlier, we announced that Git 2.18 contains a new commit-graph feature, and we discussed the commit-graph file format. As shipped in Git 2.18, this file only speeds up commit walks by a constant multiple, due to parsing structured data from the commit-graph file.
Earlier, we announced the commit-graph feature in Git 2.18 and talked about some of its performance benefits. Today, we’ll discuss some of the technical details about how the commit-graph feature works, including some helpful properties of its file format. This file speeds up commit-graph walks so much that we were able to identify other ways to speed up these walks using small optimizations.
Have you ever run gitk and waited a few seconds before the window appears? Have you struggled to arrange your commit history into a sane order of contributions instead of a stream of parallel work? Have you ever run a force-push and waited seconds for Git to give any output?
Git was originally designed for Unix systems, and to this day all the build tools for the Git codebase assume you have standard Unix tools available in your path. If you have an open-source mindset and want to start contributing to Git,
Visual Studio Team Services (VSTS) hosts the largest Git repository in the world: the Windows source code. Keeping a primary copy of the code available in the cloud and having it be performant while being updated by over 4000 users at the same time is a monumental achievement,