{"id":12645,"date":"2017-02-07T20:04:14","date_gmt":"2017-02-08T01:04:14","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/bharry\/?p=12645"},"modified":"2019-02-16T22:46:05","modified_gmt":"2019-02-16T22:46:05","slug":"more-on-gvfs","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/bharry\/more-on-gvfs\/","title":{"rendered":"More on GVFS"},"content":{"rendered":"<p>After watching a couple of days of GVFS conversation, I want to add a few things.\n<strong>What problems are we solving?<\/strong>\nGVFS (and the related\u00a0Git optimizations)\u00a0really solves\u00a04 distinct problems:<\/p>\n<ol>\n<li>A large number of files &#8211; Git doesn&#8217;t naturally work well with hundreds of thousands or millions of files in your working set.\u00a0 We&#8217;ve optimized it so that operations like git status are reasonable, commit is fast, push and pull are comfortable, etc.<\/li>\n<li>A large number of users &#8211; Lots of users\u00a0create 2 pretty direct challenges.\n<ol>\n<li>Lots of branches &#8211; Users of Git create branches pretty prolifically.\u00a0 It&#8217;s not uncommon for an engineer to build up ~20 branches over time and multiply 20 by, say 5000 engineers and that&#8217;s 100,000 branches.\u00a0 Git just won&#8217;t be usable.\u00a0 To solve this, we built a feature we call &#8220;limited refs&#8221; into our Git service (Team Services and TFS) that will cause the service to pretend that only the branches &#8220;you care about&#8221; are projected to your Git client.\u00a0 You can favorite the branches you want and Git will be happy.<\/li>\n<li>Lots of pushes &#8211; Lots of people means lots of code flowing into the server.\u00a0 Git has critical serialization points that will cause a queue to back up badly.\u00a0 Again, we did a bunch of work on our servers to handle the serialized index file updates in a way that causes very little contention.<\/li>\n<\/ol>\n<\/li>\n<li>Big files &#8211; Big binary files are a problem 
in Git because Git copies all the versions to your local Git repo, which makes for very slow operations.\u00a0 GVFS&#8217;s virtualized .git directory means it only pulls down the files you need when you need them.<\/li>\n<li>Big .git folder &#8211; This one isn&#8217;t exactly distinct.\u00a0 It is related to a large number of files and big files but, more generally, the multiplication of lots of files, lots of history and lots of binary files creates a huge and unmanageable .git directory that gobbles up your local storage and slows everything down.\u00a0 Again, GVFS&#8217;s virtualization only pulls down the content you need, when you need it, making it much smaller and faster.<\/li>\n<\/ol>\n<p>There are other partial solutions to some of these problems &#8211; like LFS, sparse checkouts, etc.\u00a0 We&#8217;ve tackled all of these problems in an elegant and seamless way.\u00a0 It turns out #2 is solved purely on the server &#8211; it doesn&#8217;t require GVFS and will work with any Git client.\u00a0 #1, #3 and #4 are addressed by GVFS.\n<strong>GVFS really is just Git<\/strong>\nOne of the other things I&#8217;ve seen in the discussions is the claim that we are turning Git into a centralized version control system (and hence removing all the goodness).\u00a0 I want to be clear that I really don&#8217;t believe we are doing that and would appreciate the opportunity to convince you.\nLooking at the server from the client, it&#8217;s just Git.\u00a0 All TFS and Team Services hosted repos are *just* Git repos.\u00a0 Same protocols.\u00a0 Every Git client that I know of in the world works against them.\u00a0 You can choose to use the GVFS client or not.\u00a0 It&#8217;s your choice.\u00a0 It&#8217;s just Git.\u00a0 If you are happy with your repo performance, don&#8217;t use GVFS.\u00a0 If your repo is big and feeling slow, GVFS can save you.\nLooking at the GVFS client, it&#8217;s also &#8220;just Git&#8221; with a few exceptions.\u00a0 It preserves all of the 
semantics of Git &#8211; the version graph is a Git version graph.\u00a0 The branching model is the Git branching model.\u00a0 All the normal Git commands work.\u00a0 For all intents and purposes you can&#8217;t tell it&#8217;s not Git.\u00a0 There are three exceptions.<\/p>\n<ol>\n<li>GVFS only works against TFS and Team Services hosted repos.\u00a0 The server must have some additional protocol support to work with GVFS.\u00a0 Also, the server must be optimized for large repos or you aren&#8217;t likely to be happy.\u00a0 We hope this won&#8217;t remain the case indefinitely.\u00a0 We&#8217;ve published everything a Git server provider would need to implement GVFS support.<\/li>\n<li>GVFS doesn&#8217;t support Git filters.\u00a0 Git filters transform file content on the fly during a retrieval (like end-of-line translations).\u00a0 Because GVFS is projecting files into the file system, we can&#8217;t transform the file on &#8220;file open&#8221;.<\/li>\n<li>GVFS has limits on going offline.\u00a0 In short, you can&#8217;t do an offline operation if you don&#8217;t have the content it needs.\u00a0 However, if you do have the content, you can go offline and everything will work fine (commits, branches, everything).\u00a0 In the extreme case, you could pre-fetch everything and then every operation would just work &#8211; but that would kind of defeat virtualization.\u00a0 In a more practical case, you could just pre-fetch the content of the folders you generally use and leave out the stuff you don&#8217;t.\u00a0 We haven&#8217;t built tools yet to manage your locally cached state but there&#8217;s no reason we (or you) can&#8217;t.\u00a0 With proper management of pre-fetching, GVFS can even give a great, full-featured offline experience.<\/li>\n<\/ol>\n<p>That&#8217;s all I know of.\u00a0 Hopefully, if GVFS takes off, #1 will go away.\u00a0 But remember, if you have a repo in GVFS and you want to push to another Git server, that&#8217;s fine.\u00a0 Clone it again 
without the GVFS client, add a remote to the alternate Git server and push.\u00a0 That will work fine (ignoring the fact that it might be slow because it&#8217;s big).\u00a0 My point is, you are never locked in.\u00a0 And #3 can be improved with fairly straightforward tooling.\u00a0 It&#8217;s just Git.\nHopefully this sheds a little more light on the details of what we&#8217;ve done.\u00a0 Of course, all the client code is in our <a href=\"https:\/\/github.com\/Microsoft\/GVFS\">GitHub project<\/a>, so feel free to validate my assertions.\nThanks,\nBrian<\/p>\n","protected":false},"excerpt":{"rendered":"<p>After watching a couple of days of GVFS conversation, I want to add a few things. What problems are we solving? GVFS (and the related\u00a0Git optimizations)\u00a0really solves\u00a04 distinct problems: A large number of files &#8211; Git doesn&#8217;t naturally work well with hundreds of thousands or millions of files in your working set.\u00a0 We&#8217;ve optimized it [&hellip;]<\/p>\n","protected":false},"author":244,"featured_media":14617,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[9],"class_list":["post-12645","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized","tag-vs-team-services"],"acf":[],"blog_post_summary":"<p>After watching a couple of days of GVFS conversation, I want to add a few things. What problems are we solving? 
GVFS (and the related\u00a0Git optimizations)\u00a0really solves\u00a04 distinct problems: A large number of files &#8211; Git doesn&#8217;t naturally work well with hundreds of thousands or millions of files in your working set.\u00a0 We&#8217;ve optimized it [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/posts\/12645","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/users\/244"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/comments?post=12645"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/posts\/12645\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/media\/14617"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/media?parent=12645"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/categories?post=12645"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/tags?post=12645"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}