Load more VSTs: Effectively removing the FLS slot allocation limit in Windows 10

Pete Brown

Not all limitations are great for creativity

Back in Windows 10 1903, we increased the number of FLS (Fiber Local Storage) slots for Windows 10 applications. This effectively removed a limit on the number of plugins a DAW could load.

I talked about this briefly on the DAWbench radio show, but felt it could use more explanation.

DAWBench Radio Show Episode 6 Main Page. Be sure to check it out, as well as the other episodes on Spotify, iTunes, and more.

Fiber Local Storage Slots

A Fiber is a lightweight type of thread that is cooperatively multi-tasked. Threads have Thread Local Storage, which is a place where, per-thread, one can store variables and data associated with that thread, and scoped to that thread. Fibers have a similar concept called Fiber Local Storage.

In computer science, everything that takes space or other resources has limits, either explicit or practical. In the case of FLS slots in Windows (going back to at least XP), that explicit limit was 128 per process. That means each application could have, at most, 128 FLS slots allocated. Once a fiber in that application tried to allocate a 129th slot, an error would be returned. This is a limit that almost no Windows applications ever come close to hitting. Well, that is, no applications except those that load dozens or hundreds of DLLs they have no development control over.
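
To make the "129th slot" failure concrete, here is a minimal, portable sketch of a fixed-size slot table. It is a toy model only: `SlotTable`, `kOutOfIndexes`, and `count_successful_allocs` are names invented for illustration, and nothing here reflects how Windows actually implements FLS. The sentinel value simply stands in for the real `FLS_OUT_OF_INDEXES` constant.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only: this is NOT how Windows implements FLS internally.
// kOutOfIndexes stands in for the real FLS_OUT_OF_INDEXES sentinel.
constexpr std::uint32_t kOutOfIndexes = 0xFFFFFFFFu;

class SlotTable {
public:
    // capacity plays the role of the per-process slot limit (e.g. 128).
    explicit SlotTable(std::uint32_t capacity) : in_use_(capacity, false) {}

    // Returns the lowest free index, or kOutOfIndexes when every slot is taken.
    std::uint32_t alloc() {
        for (std::uint32_t i = 0; i < in_use_.size(); ++i) {
            if (!in_use_[i]) {
                in_use_[i] = true;
                return i;
            }
        }
        return kOutOfIndexes;
    }

    // Releases a previously allocated slot so it can be handed out again.
    void release(std::uint32_t index) {
        assert(index < in_use_.size() && in_use_[index]);
        in_use_[index] = false;
    }

private:
    std::vector<bool> in_use_;
};

// Counts how many allocations succeed before the table reports exhaustion.
std::uint32_t count_successful_allocs(SlotTable& table) {
    std::uint32_t count = 0;
    while (table.alloc() != kOutOfIndexes) {
        ++count;
    }
    return count;
}
```

With a `SlotTable` of capacity 128, `count_successful_allocs` stops after 128 successes and the next `alloc()` returns the sentinel, which is the same error-on-exhaustion behavior described above.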

I used that behavior to produce a sample that would report on the number of FLS Slots available in a process. The code itself is super simple:

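#include "pch.h"      // Visual Studio precompiled header; drop this line if your project has none
#include <iostream>
#include <Windows.h>

int main()
{
    std::cout << "Testing available FLS slots\n";

    DWORD result = 0;
    UINT count = 0;

    // Keep allocating (and deliberately never freeing) slots until the
    // process runs out. No cleanup callback is needed, so pass NULL.
    while (result != FLS_OUT_OF_INDEXES)
    {
        result = FlsAlloc(NULL);
        count += 1;   // this also counts the final, failing attempt
    }

    std::cout << "Out of slots at attempt " << count << "\n";
    return 0;
}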

When I ran it on my PC, this was the output:

D:\Github\FLSReport\x64\Debug\>FLSReport.exe
Testing available FLS slots
Out of slots at attempt 4075

What uses the slots

You may wonder what is using the slots in a DLL. Great question! The slots could be allocated from any bit of code in the process, including libraries and your own code. In this case, it’s the Visual C runtime which allocates FLS slots (one or two per instance, depending upon version). This is true regardless of how it is linked in the project, and is per instance of the loaded and executing runtime code.

Note: When you have multiple instances of a plugin in the DAW, you still have only one loaded instance of the plugin in the process. As a result, each instance of the plugin doesn’t use more slots.

If all the plugins in the process are dynamically linked to the same version of the runtime, they will collectively only load one instance of that runtime into the process.

If each of the plugins in the process is statically linked to the runtime, each DLL will cause an instance of the runtime (well, a subset of the runtime) to load. Remember, each of those instances allocates FLS slots.
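
The difference between the two linking styles can be sketched as a small counting exercise. This is an illustrative model under stated assumptions, not a measurement tool: `PluginDll`, `runtime_instances`, and `slots_used` are hypothetical names, the version strings are placeholders, and the slots-per-instance figure is a parameter because it varies by runtime version (one or two, as noted above).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical description of one plugin DLL loaded into the host process.
struct PluginDll {
    std::string name;
    bool static_crt;          // true: the runtime is statically linked into this DLL
    std::string crt_version;  // only meaningful for dynamically linked DLLs
};

// Each statically linked DLL carries its own runtime instance; dynamically
// linked DLLs share one instance per distinct runtime version.
std::size_t runtime_instances(const std::vector<PluginDll>& dlls) {
    std::size_t static_copies = 0;
    std::set<std::string> shared_versions;
    for (const PluginDll& dll : dlls) {
        if (dll.static_crt) {
            ++static_copies;
        } else {
            shared_versions.insert(dll.crt_version);
        }
    }
    return static_copies + shared_versions.size();
}

// Slots consumed by runtime instances alone (assumption: every instance
// allocates the same number of slots).
std::size_t slots_used(const std::vector<PluginDll>& dlls,
                       std::size_t slots_per_instance) {
    assert(slots_per_instance > 0);
    return runtime_instances(dlls) * slots_per_instance;
}
```

In this model, three DLLs dynamically linked to the same runtime version count as one instance, while the same three DLLs statically linked count as three, so at two slots per instance they would use six slots instead of two.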

Now imagine a big plugin suite, which has lots of different modules. Maybe the developer followed a smart modular approach that only loads the ones needed. But if each of the loaded modules was statically linked to the runtime, they too will take up FLS slots. In some cases, we were seeing a single plugin suite using over 40 slots.

When you have a max of 128 slots, and the DAW (or other plugin host) uses some itself, you can very quickly run out of allocation. When this happens, the DLL fails to load, and the host either throws an error or fails silently.
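
As a rough worked example of how quickly the budget disappears, the sketch below walks a list of plugins in load order and reports the first one that no longer fits. The function name `first_failed_load` is hypothetical, and the numbers used with it (a host that takes 20 slots, suites that take 40 each) are assumptions chosen for illustration. It also simplifies by treating each plugin's slot needs as all-or-nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first plugin whose slot request cannot be
// satisfied, or -1 if every plugin in the list fits under the limit.
// Simplification: a plugin either gets all of its slots or fails to load.
long first_failed_load(std::size_t limit,
                       std::size_t host_slots,
                       const std::vector<std::size_t>& plugin_slots) {
    assert(host_slots <= limit);
    std::size_t used = host_slots;  // slots the host took before any plugin loaded
    for (std::size_t i = 0; i < plugin_slots.size(); ++i) {
        if (plugin_slots[i] > limit - used) {
            return static_cast<long>(i);  // not enough free slots left
        }
        used += plugin_slots[i];
    }
    return -1;
}
```

With a limit of 128, a host using 20 slots, and three suites needing 40 slots each, the third suite (index 2) is the one that fails: 20 + 40 + 40 = 100, which leaves only 28 free. Raise the limit to a few thousand and the same list loads without trouble.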

Why this matters to musicians and music app developers

As mentioned above, the limit of 128 Fiber Local Storage slots has existed in Windows for a long time. This wasn’t something new to Windows 8 or Windows 10. However, most musicians hadn’t previously run into this because they would hit processing and latency limits before they hit a limit on the number of unique plugins they could use at once.

With modern (powerful) processors, musicians can now have many more live and active tracks, each with different plugins loaded. The increase in large multi-function plugins like synth and sample libraries, and mastering suites, has also increased the number of DLLs in place in a single process.

This is why musicians had only recently started to hit the limit. Enough were, in fact, that one developer produced a nice FLS Slot plugin for DAWs which showed how many slots were left. This plugin still works, but it stops checking at 128 free slots. The limit was really causing a stir in the musician community, because folks would spend top money for top-of-the-line PCs, only to find they still had to freeze tracks or jump through other hoops due to this limit.

What we changed

First, we spoke to DAW and plugin developers to encourage them to dynamically link the Visual C runtimes in their products. This made a real difference, especially with DAW developers who optimized their product to minimize instances of the runtime. In several cases, the number of available FLS slots for loading plugins increased by double digits. However, many of us use plugins which aren’t actively updated or maintained, or which are unlikely to change their behavior in a version that is a free upgrade for current users, so we had to do something which would not require anyone to change any code or linking practices.

After that, we looked at the kernel code for this. I don’t personally mess with kernel code (this is a good thing), but I know folks who do.

As it turned out, the existing limitations were baked through several parts of the FLS source in the kernel, and were not simple to adjust up or down. So, the developer rewrote the FLS Slot storage and allocation/deallocation code to be more robust and extensible in the future. We picked a max number of slots for now (around 4k) with the understanding that any existing limit will eventually be hit, so it had to be much easier to change in the future. So, once we hit the 4k slot mark in an application (or get close) we can easily increase that limit to something larger.

You may wonder why we’d have limits at all. Having limits today helps prevent runaway allocations or other crazy things that buggy or malicious code might do. Allocating 4k FLS slots isn’t going to cause issues. An app allocating a few billion slots probably isn’t something we’d want to see happen.

So, nice new fresh code, well-written, easily updated in the future, and with reasonable smart limits. It brings a tear to my eye.

As a musician, the only thing you need to do to take advantage of this is run a recent version of Windows 10 (1903 or later).

If you are developing DLL plugins for applications, we recommend that you dynamically link the runtime, even on Windows 10 1903 and higher. Reasons for that include:

  • Less duplicated code in the DAW process, so a smaller working set
  • Easier runtime bug fixes without recompiling and redistribution
  • Less runtime initialization code running

The trade-off is that your installer becomes slightly more complex because it has to check for and install the runtime if missing. Most installers include optional steps in their templates which will handle this for you as long as you specify the version. For those which do not, you can learn more about deployment here:
