As Visual Studio transitions its core packages and services to an async loading/retrieval model, we’ve identified bugs in both the initial async implementation and the existing synchronous package loading mechanism.
While most of these bug fixes have been transparent to extenders, one requires attention due to changed constraints. This post highlights that change. Note that this is only applicable to VSSDK-based extensions and does not apply to VisualStudio.Extensibility. For more information on the differences between extensibility models in Visual Studio, please refer to this documentation: Choose the right Visual Studio extensibility model for you – Visual Studio (Windows) | Microsoft Learn
Summary
The change prevents cyclical package load requests to avoid a race condition where service requests, which should never return null, may occasionally do so. Cyclical package loads were never recommended, and any previous success was due to design decisions made over 25 years ago when Visual Studio was primarily single-threaded, and package loads had to occur on the UI thread.
If you trigger a cyclical package load request, it will fail with an HRESULT of 0x80049283 (VS_E_CYCLICPACKAGELOAD). For managed callers, this surfaces as a COMException with that HRESULT; for native callers, the HRESULT is returned directly.
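Because an exception's HResult in managed code is a signed 32-bit integer, the same code may show up in logs as a negative decimal number rather than as 0x80049283. The following Python sketch only illustrates that conversion arithmetic; the helper names are hypothetical and are not part of any Visual Studio API.

```python
VS_E_CYCLICPACKAGELOAD = 0x80049283  # value quoted in this post


def to_signed_hresult(value: int) -> int:
    """Reinterpret an unsigned 32-bit HRESULT as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def is_cyclic_package_load(hresult: int) -> bool:
    """Accept either the signed or the unsigned spelling of the code."""
    return (hresult & 0xFFFFFFFF) == VS_E_CYCLICPACKAGELOAD


print(to_signed_hresult(VS_E_CYCLICPACKAGELOAD))  # -2147183997
```

Comparing on the masked value means a check works whether the code was logged as hexadecimal or as a signed decimal.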
The change occurred in 17.12 Preview 2 (version 17.12.35309.182) and is present in all subsequent previews and releases.
Background Context
Service Discovery 101
In Visual Studio, a service is tied to a package. When the service is requested, the package loads (if not already loaded) and registers a service factory (IServiceProvider) to retrieve the service instance.
Typically, service factories are registered in the package’s Initialize(Async) method (or SetSite for native packages).
The service retrieval path in Visual Studio works as follows:
- Caller A asks for Service B
- The service manager checks if a service factory for Service B is already registered. If so, it invokes the factory and returns the result.
- If no factory is registered, it checks if the package providing the service factory is loaded. If loaded, the request fails, as the service cannot be acquired.
- If the package is not loaded, it loads the package, expecting it to register the service factory, and checks again for registration.
- If no factory is registered, the request fails.
- If a factory is registered, it is invoked to retrieve the service instance.
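The lookup steps above can be sketched as a small, single-threaded toy model. This is not the Visual Studio service manager: the class and method names (ToyServiceManager, declare, register_factory) are hypothetical, and the model deliberately ignores re-entrant loads, which are the subject of the following sections.

```python
from typing import Any, Callable, Dict, Optional, Set


class ToyServiceManager:
    """Simplified model of the service lookup steps (hypothetical names)."""

    def __init__(self) -> None:
        self.factories: Dict[str, Callable[[], Any]] = {}  # service -> factory
        self.providers: Dict[str, str] = {}                # service -> package
        self.initializers: Dict[str, Callable[["ToyServiceManager"], None]] = {}
        self.loaded: Set[str] = set()

    def declare(self, service: str, package: str,
                initialize: Callable[["ToyServiceManager"], None]) -> None:
        """Record which package is expected to provide a service."""
        self.providers[service] = package
        self.initializers[package] = initialize

    def register_factory(self, service: str, factory: Callable[[], Any]) -> None:
        self.factories[service] = factory

    def get_service(self, service: str) -> Optional[Any]:
        # Step 1: a factory is already registered, so invoke it.
        factory = self.factories.get(service)
        if factory is not None:
            return factory()
        # Step 2: the package is already loaded but never registered a factory.
        package = self.providers.get(service)
        if package is None or package in self.loaded:
            return None
        # Step 3: load the package, then check for the factory again.
        self.initializers[package](self)
        self.loaded.add(package)
        factory = self.factories.get(service)
        return factory() if factory is not None else None


manager = ToyServiceManager()
manager.declare("ServiceB", "PackageB",
                lambda m: m.register_factory("ServiceB", lambda: "service-b-instance"))
manager.declare("ServiceC", "PackageC", lambda m: None)  # never registers a factory

print(manager.get_service("ServiceB"))  # service-b-instance
print(manager.get_service("ServiceC"))  # None
```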
The Problem
The issue arises when a recursive package load is required.
For example, suppose Package A is currently loading and calls GetService for a service that it itself provides. This creates a loop in the package load request chain: A is loading and has entered a code path that, to complete, must wait for A to finish loading.
Another example would be if Package A is currently loading and calls GetService on a service that comes from Package B. If Package B, during its load, calls GetService on a service that comes from Package A, we again have a loop in the package load request chain.
These loops can be arbitrarily long (e.g., Package A triggers Package B to load, which triggers Package C to load, which triggers Package D to load, and so on, until some package triggers Package A to load).
If, during a package load, a service is requested from a package that is itself currently loading, then:
- If the package has registered the service factory before the request is made, the service manager invokes the factory and returns the service—this is the ideal path.
- If the package has not registered the service factory before the request is made, a problem occurs. When the service manager checks if the package is loaded, it detects the package is still loading (not fully loaded). If it blocks waiting for the package to finish loading, a deadlock occurs, as the package load waits for the service return, but that requires the package load to complete. If it ignores the loading state, it returns null, since the service factory hasn’t been registered yet.
To make this work, the original design marked a package as “loaded” right before its Initialize/SetSite was called.
This was a reasonable compromise in the context of the time, as all package loads were synchronous and occurred on the UI thread, eliminating concerns about concurrent load requests. It also allowed cycles to “work” if the service-providing package registered all its services before attempting to retrieve any services that could lead to a cycle.
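To make the ordering sensitivity concrete, here is a toy, single-threaded model of that legacy behavior, in which a package is flagged as loaded before its initializer runs. All names are hypothetical; this is a sketch of the idea described above, not Visual Studio's implementation.

```python
from typing import Any, Callable, Dict, Optional, Set


class LegacyToyManager:
    """Toy model: a package is marked loaded BEFORE its initializer runs."""

    def __init__(self) -> None:
        self.factories: Dict[str, Callable[[], Any]] = {}
        self.providers: Dict[str, str] = {}
        self.initializers: Dict[str, Callable[["LegacyToyManager"], None]] = {}
        self.loaded: Set[str] = set()

    def declare(self, service: str, package: str,
                initialize: Callable[["LegacyToyManager"], None]) -> None:
        self.providers[service] = package
        self.initializers[package] = initialize

    def get_service(self, service: str) -> Optional[Any]:
        factory = self.factories.get(service)
        if factory is not None:
            return factory()
        package = self.providers[service]
        if package in self.loaded:
            return None  # "loaded", yet no factory has been registered
        self.loaded.add(package)  # legacy: flag first, initialize second
        self.initializers[package](self)
        factory = self.factories.get(service)
        return factory() if factory is not None else None


seen: Dict[str, Optional[str]] = {}


def init_register_first(m: LegacyToyManager) -> None:
    m.factories["ServiceA"] = lambda: "service-a"
    seen["register_first"] = m.get_service("ServiceA")  # re-entrant request


def init_request_first(m: LegacyToyManager) -> None:
    seen["request_first"] = m.get_service("ServiceA")   # re-entrant request
    m.factories["ServiceA"] = lambda: "service-a"


for init in (init_register_first, init_request_first):
    m = LegacyToyManager()
    m.declare("ServiceA", "PackageA", init)
    m.get_service("ServiceA")

print(seen)  # {'register_first': 'service-a', 'request_first': None}
```

The two initializers differ only in the order of two statements, yet one of them silently receives a null service.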
So Why the Change?
The term “work” was used in quotes because, while it sometimes worked, it was unreliable. For it to work, all packages had to register their service factories before making requests that could introduce a cycle. This was unenforceable, error-prone, and led to issues where internal packages would re-query for services if the initial request returned null. It was also susceptible to failure from minor changes, like refactoring initialization code or altering the load order of packages, which could invert the request chain and break the cycle (or introduce another).
Additionally, this approach only worked in a single-threaded loading model, which Visual Studio no longer uses.
In a multi-threaded loading model, the likelihood of race conditions increases. If a package is marked as loaded before its SetSite/Initialize method is called, a request from another thread may consider the package already loaded and attempt to retrieve the service, racing with the original package’s service factory registration.
To prevent null services from being returned due to this race condition, we must ensure that all loads, except the original request, block until the package initialization completes (until Initialize(Async)/SetSite returns). However, blocking on in-progress loads should only occur if the calling thread is independent of the original thread that initiated the load; otherwise, it could deadlock the entire process.
The Fix
To solve this problem, we rely on AsyncLazy to manage package loads. AsyncLazy uses AsyncLocal to detect re-entrant invocations. It can determine, from the context of any thread, if the caller is related to the original thread that invoked AsyncLazy’s value factory. If so, it throws an exception; if not, it blocks the caller until the original invocation completes.
This behavior is exactly what we need, but it means cyclic load requests will now throw an exception, causing a failure. While this is unavoidable (and actually desirable), it represents a change in behavior. This approach ensures a package isn’t considered loaded until its Initialize(Async)/SetSite completes, eliminating the registration race condition, and properly handles concurrent load requests—blocking if unrelated to the original load, or failing otherwise.
Call to Action
This change is necessary to prevent a race condition, which arises from how Visual Studio services are discovered.
Ways to avoid cycles
Since cycles are no longer allowed, it’s important to avoid creating them. The good news is that changes made to address these issues will also work in older versions, as cycles were never a good idea. While race conditions still exist in older versions, these cannot be avoided without applying this fix more broadly, which would impact a larger user base.
The most common mistake we observed was developers being too eager in Initialize(Async)/SetSite, often leading to cycles. Specifically, people would:
- Eagerly create service instances, which often required fetching other services.
- Eagerly create tool windows, which also required fetching services.
- Eagerly fetch services to store in backing fields, even if they weren’t used in Initialize(Async)/SetSite.
Your Initialize(Async)/SetSite should do minimal work. You won’t know why your package was loaded, but if it was to retrieve a service or load a tool window, those actions will occur through a follow-up call after Initialize(Async)/SetSite completes.
To avoid cycles, services can be retrieved “just in time” using Lazy<T> or AsyncLazy<T> outside of Initialize(Async)/SetSite. Similarly, while service factories should be registered eagerly, the actual service creation (which can involve fetching other services) happens only after the package is fully loaded, eliminating the cycle risk.
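The shape of this pattern can be sketched independently of the VSSDK. In the toy model below, the Lazy class is a minimal, non-thread-safe stand-in for .NET's Lazy<T>, and ToyPackage, MyService, and SettingsService are hypothetical names chosen for illustration. The initializer registers a factory eagerly but fetches nothing; the dependency is only fetched when the service is first created.

```python
from typing import Any, Callable, Dict, Generic, List, TypeVar

T = TypeVar("T")


class Lazy(Generic[T]):
    """Minimal stand-in for a lazy value: runs the factory on first access only."""

    _UNSET = object()

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._value: Any = Lazy._UNSET

    @property
    def value(self) -> T:
        if self._value is Lazy._UNSET:
            self._value = self._factory()
        return self._value


class ToyPackage:
    def __init__(self, get_service: Callable[[str], Any]) -> None:
        self.calls: List[str] = []  # records when external services are fetched
        self._get_service = get_service
        self.factories: Dict[str, Callable[[], Any]] = {}
        # Declared up front, but nothing is fetched until .value is read.
        self._settings = Lazy(lambda: self._fetch("SettingsService"))

    def _fetch(self, name: str) -> Any:
        self.calls.append(name)
        return self._get_service(name)

    def initialize(self) -> None:
        # Register the factory eagerly; create the service and fetch its
        # dependencies only when the factory is actually invoked.
        self.factories["MyService"] = lambda: {"settings": self._settings.value}


package = ToyPackage(get_service=lambda name: f"<{name}>")
package.initialize()
print(package.calls)  # []

service = package.factories["MyService"]()
print(service, package.calls)  # {'settings': '<SettingsService>'} ['SettingsService']
```

Because initialize makes no outgoing service requests, it cannot take part in a load cycle, whichever package triggered the load.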
Tool windows should also be created outside Initialize(Async)/SetSite, and any required services can be fetched at that point. An example of the problematic pattern can be seen here: Upgrading to Visual Studio 2022 17.12.1 displays error Exception from HRESULT: 0x80049283 – Developer Community. In this case, the extension eagerly creates its tool window inside its InitializeAsync, which is unnecessary. A separate call will be made by the environment when your tool window is needed.
Customer Value
This change was made to reduce instances of services returning null when they should never do so. Such failures are problematic because if the caller doesn’t null-check the result, it could crash Visual Studio. If they do check for null, they must handle the missing service appropriately. Most users opted to disable functionality that depended on the missing service, leading to inconsistent behavior—where Visual Studio sometimes appeared to lack features that worked in previous sessions, even though the code hadn’t changed.
For extenders, this change adds some extra work, but it also eliminates the uncertainty around services occasionally returning null. They can now consistently treat null as either a fatal error or an indication that a feature is unavailable, rather than a strange, self-correcting issue that might resolve with a retry.
Conclusion + Thanks
We understand that changes to “working” code are generally undesirable. Initially, we didn’t expect many people to be affected, but after issues were reported by two internal teams and one external team, we realized a fix we anticipated would be non-disruptive was, in fact, impacting extenders.
This post was directly motivated by user feedback, which alerted us to the issue. Users could have silently adjusted their code without fully understanding the cause; instead, they told us, and for that we are grateful.
We appreciate the time you’ve spent reporting issues/suggestions and hope you continue to give us feedback when using Visual Studio on what you like and what we can improve. Your feedback is critical to help us make Visual Studio the best tool it can be! You can share feedback with us via Developer Community: report any bugs or issues via report a problem and share your suggestions for new features or improvements to existing ones.