December 28th, 2012

The cost of context switches

Andrew Arnott
Principal Software Engineer

Context switches are not free. But how expensive are they? I wrote a small program to find out, and I’m sharing the program and its results here.

I focused purely on context switches, with no work performed between them. That is not a real-world scenario, but it really brings out the hidden costs. Below are the results for 500,000 context switches with no work performed between them.

Executing 500000 work cycles of 0 iterations each, in different ways...
Scenario            Total time (ms)     Time per unit (µs)
No-switch           0                   0.0002
Async w/o yield     67                  0.0353
Async w/ yield      664                 0.349
Thread switches     5215                2.7412

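Notice how the overhead grows by roughly an order of magnitude with each kind of switch. The code below will help you understand what each scenario name actually means. Note that dividing the total time by 500,000 gives per-unit figures about 3.8 times larger than those in the last column (for example, 5215 ms / 500,000 ≈ 10.4 µs per thread switch), so the ratios between rows are more trustworthy than the absolute per-unit values. Next, I add a bit of work (counting to 500) per context switch, which is closer to a plausible, if relatively lightweight, real-world workload for a given context: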

Executing 500000 work cycles of 500 iterations each, in different ways...
Scenario            Total time (ms)     Time per unit (µs)
No-switch           380                 0.1998
Async w/o yield     368                 0.1935
Async w/ yield      832                 0.4374
Thread switches     5185                2.7257

Suddenly the no-switch scenario and both async scenarios share an order of magnitude, while thread switches still take significantly longer. In fact, a close comparison shows Async w/o yield running faster than no switch at all. That is of course ludicrous and can be written off as noise. But several runs produced the same result, so we can conclude that with even a small amount of work per context switch, the no-yield async method adds insignificant overhead.
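To make the comparison concrete, the per-switch overhead can be estimated by subtracting the no-switch baseline from each scenario's total and dividing by the number of work cycles. The sketch below does that with the totals from the 500-iteration table; the `SwitchOverhead` class and `OverheadMicroseconds` helper are names made up for illustration, and the estimates inherit whatever noise is in the measurements.

```csharp
using System;

public static class SwitchOverhead {
	// Hypothetical helper (not part of the benchmark program): estimated overhead
	// per work cycle in microseconds, given a scenario's total time and the
	// no-switch baseline, both in milliseconds.
	public static double OverheadMicroseconds(double scenarioMs, double baselineMs, int cycles) {
		return (scenarioMs - baselineMs) * 1000.0 / cycles;
	}

	public static void Main() {
		const int cycles = 500000;
		const double baselineMs = 380; // "No-switch" total from the 500-iteration run

		// Totals below are copied from the 500-iteration table.
		Console.WriteLine("Async w/o yield: {0:F3} µs", OverheadMicroseconds(368, baselineMs, cycles));
		Console.WriteLine("Async w/ yield:  {0:F3} µs", OverheadMicroseconds(832, baselineMs, cycles));
		Console.WriteLine("Thread switches: {0:F3} µs", OverheadMicroseconds(5185, baselineMs, cycles));
	}
}
```

With these inputs the estimates come out to roughly -0.02 µs, 0.90 µs, and 9.6 µs per cycle. The slightly negative first value is the noise mentioned above: the no-yield async overhead is too small to separate from run-to-run variation.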

Following is the application that produced the above results.

using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class Program {
	const int unitSize = 500;
	const int workSize = 500000;
	const string spacing = "{0,-20}{1,-20}{2,-20}";

	private static void Main(string[] args) {
		Console.WriteLine("Executing {0} work cycles of {1} iterations each, in different ways...", workSize, unitSize);

		Console.WriteLine(spacing, "Scenario", "Total time (ms)", "Time per unit (μs)");
		Scenario("No-switch", DoSync);
		Scenario("Async w/o yield", DoAsyncNoYield);
		Scenario("Async w/ yield", DoAsyncWithYield);
		Scenario("Thread switches", ThreadSwitch);
	}

	static void Scenario(string name, Action operation) {
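		GC.Collect(); // start each scenario from a freshly collected heap
		operation(); // warm it up so JIT compilation is not included in the timing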
		var timer = Stopwatch.StartNew();
		operation();
		timer.Stop();
		Console.WriteLine(spacing, name, timer.ElapsedMilliseconds, MicroSecondsPerItem(timer));
	}

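	static void ThreadSwitch() {
		int workRemaining = workSize;
		// The event acts as a baton: only the thread that acquires it may run a work unit.
		var evt = new AutoResetEvent(true);
		ThreadStart worker = () => {
			while (true) {
				evt.WaitOne();
				// Check the counter while holding the baton so no extra work units run.
				bool done = workRemaining <= 0;
				if (!done) {
					workRemaining--;
					WorkUnit();
				}
				evt.Set();
				if (done) {
					break;
				}
			}
		};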
		var threads = new Thread[Environment.ProcessorCount];
		for (int i = 0; i < threads.Length; i++) {
			threads[i] = new Thread(worker);
			threads[i].Start();
		}

		for (int i = 0; i < threads.Length; i++) {
			threads[i].Join();
		}
	}

	static void DoAsyncNoYield() {
		var tcs = new TaskCompletionSource<object>();
		tcs.SetResult(null);
		var task = tcs.Task;
		Task.Run(
			async delegate {
				int workRemaining = workSize;
				while (--workRemaining >= 0) {
					await NoYieldHelper(task);
				}
			}).Wait();
	}

	static async Task NoYieldHelper(Task task) {
		WorkUnit();
		// The task is already complete, so this await continues synchronously.
		await task;
	}

	static void DoAsyncWithYield() {
		Task.Run(
			async delegate {
				int workRemaining = workSize;
				while (--workRemaining >= 0) {
					WorkUnit();
					// Task.Yield() always suspends; the continuation is rescheduled.
					await Task.Yield();
				}
			}).Wait();
	}

	static void DoSync() {
		int workRemaining = workSize;
		while (--workRemaining >= 0) {
			WorkUnit();
		}
	}

	private static double MicroSecondsPerItem(Stopwatch timer) {
		// Stopwatch ticks are not TimeSpan ticks: their length depends on Stopwatch.Frequency,
		// so convert through the elapsed TimeSpan instead of TimeSpan.FromTicks(ElapsedTicks).
		return timer.Elapsed.TotalMilliseconds * 1000 / workSize;
	}

	static void WorkUnit() {
		for (int i = 0; i < unitSize; i++) {
		}
	}
}
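
The gap between the two async scenarios comes down to whether an `await` actually suspends. Awaiting an already-completed task, as `NoYieldHelper` does, continues synchronously, whereas the awaiter returned by `Task.Yield()` always reports itself as incomplete, which forces the continuation to be rescheduled. The sketch below inspects the awaiters directly; the `AwaiterCheck` class and its method names are made up for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaiterCheck {
	// Builds an already-completed task the same way DoAsyncNoYield does and
	// reports whether its awaiter says no suspension is needed.
	public static bool CompletedTaskIsCompleted() {
		var tcs = new TaskCompletionSource<object>();
		tcs.SetResult(null);
		return tcs.Task.GetAwaiter().IsCompleted;
	}

	// Reports what the Task.Yield() awaiter says; an await suspends the method
	// whenever IsCompleted is false.
	public static bool YieldIsCompleted() {
		return Task.Yield().GetAwaiter().IsCompleted;
	}

	public static void Main() {
		Console.WriteLine("Completed task awaiter IsCompleted: {0}", CompletedTaskIsCompleted());
		Console.WriteLine("Task.Yield awaiter IsCompleted:     {0}", YieldIsCompleted());
	}
}
```

The first check reports `True` and the second `False`, which is why the with-yield loop pays for scheduling a continuation on every iteration while the no-yield loop only pays for the async method call itself.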

