September 11th, 2024

Optimizing large payloads in Fluid Framework

Fluid Framework is a platform for building apps in which multiple people can work together on the same document or project in real time. This makes collaboration smoother and more efficient, as changes are instantly visible to everyone involved. When multiple users co-author a document, each change made by any user must be communicated to all other users in real time.

The data sent over the wire (the payload) can get very large, especially if the document is complex or contains a lot of data. For example, imagine a team working on a large report with numerous images, charts, and tables. Fluid already optimizes real-time collaboration by only sending deltas over the wire; the large payload problem arises in the small set of cases where a large amount of data is updated at once (e.g., a user pastes a huge table). The resulting transfer can be big enough that the server throttles the client, causing performance problems and even data loss.

Fluid Framework offers three features that work around large payload scenarios and server throttling: Grouped Batching, Batch Compression, and Compressed Batch Chunking. These features are all enabled by default and are currently not publicly configurable.

What are batches?

Let’s first understand batches. In the context of Fluid, each change from a user is considered an operation (‘op’ for short), and batching is the way the framework accumulates and applies ops. A batch is a group of ops accumulated within a single JS turn; it is broadcast to all other connected clients and then applied synchronously by each of them. Additional logic and validation ensure that batches are never interleaved, nested, or interrupted. By default, Fluid Framework flushes batches at the end of the JS turn for the following reasons:

  • It is more efficient from an I/O perspective, as batching ops decreases the overall number of payloads sent to the server.
  • It reduces concurrency-related bugs, as it ensures that all ops generated within the same JS turn are also applied by all other clients within a single JS turn. Clients using the same pattern can safely assume ops will be applied exactly as they are observed locally (see the sketch after this list).
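
To make this concrete, here is a minimal TypeScript sketch, assuming an already-connected key-value DDS such as a SharedMap; the variable name, keys, and values are illustrative:

    // Minimal stand-in type for a Fluid key-value DDS such as SharedMap.
    interface KeyValueDds {
        set(key: string, value: unknown): void;
    }

    // Assume `sharedMap` was obtained from a connected container
    // (container setup omitted for brevity).
    declare const sharedMap: KeyValueDds;

    // Three synchronous edits within the same JS turn, so the runtime
    // accumulates them into a single batch.
    sharedMap.set("title", "Q3 Report");
    sharedMap.set("author", "contoso");
    sharedMap.set("lastEdited", Date.now());

    // The batch is flushed when the turn ends; every remote client then
    // applies all three ops synchronously, in the same order.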

Grouped batching

With Grouped Batching, all of a batch’s messages are combined into a single “grouped” message before compression. Upon receiving this “grouped” message, the client extracts the individual batch messages, and each is given the same sequence number – that of the parent “grouped” message.

The purpose of enabling grouped batching before compression is to eliminate the empty placeholder messages in the chunks. These empty messages are not free to transmit, can trigger service throttling, and in extreme cases can still produce a batch that is too large (from the empty op envelopes alone).

Grouped batching is opaque to the server and implementations of the Fluid protocol do not need to alter their behavior to support this client feature.

Grouped batching
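
As a rough illustration – these shapes are simplified and are not the exact Fluid wire format – grouping and ungrouping could look like this:

    // Simplified op shapes; real Fluid messages carry more metadata.
    interface Op {
        contents: unknown;
    }
    interface SequencedOp extends Op {
        sequenceNumber: number;
    }

    // Sending side: wrap every op of the batch in one "grouped" envelope.
    function groupBatch(batch: Op[]): Op {
        return { contents: { type: "groupedBatch", ops: batch } };
    }

    // Receiving side: extract the inner ops; each one is applied with the
    // sequence number the service assigned to the grouped envelope.
    function ungroupBatch(grouped: Op, sequenceNumber: number): SequencedOp[] {
        const { ops } = grouped.contents as { type: string; ops: Op[] };
        return ops.map((op) => ({ ...op, sequenceNumber }));
    }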

Op Compression

Compression targets payloads which exceed the max batch size, because compression may yield no benefit when the payload is small: compressing a single character, for example, produces a compression artifact larger than the character itself.

The current service payload size limit for Fluid Framework is 1MB. After testing with internal partners, we landed on 614,400 bytes (600KB) as the threshold – an ideal balance between compression gains and time spent compressing.

Compression only targets the contents of the ops and not the number of ops in a batch. It is opaque to the server and implementations of the Fluid protocol do not need to alter their behavior to support this client feature.

Lastly, compressing a batch yields a batch with the same number of messages: all the content is compressed into the first op, and the rest of the batch’s messages become empty placeholders that reserve sequence numbers for the compressed messages.

Op compression
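
Sketched below is what this flow could look like, using Node’s zlib as a stand-in for the framework’s actual codec and the 614,400-byte threshold described above; the op shape and function name are hypothetical:

    import { deflateSync } from "node:zlib";

    // Simplified op shape; real Fluid ops carry more metadata.
    interface Op {
        contents: string | undefined;
    }

    const MIN_SIZE_TO_COMPRESS = 614_400; // bytes, as described above

    function maybeCompressBatch(batch: Op[]): Op[] {
        const serialized = JSON.stringify(batch.map((op) => op.contents));
        if (Buffer.byteLength(serialized) < MIN_SIZE_TO_COMPRESS) {
            return batch; // small batches are not worth compressing
        }
        const compressed = deflateSync(serialized).toString("base64");
        // The first op carries the entire compressed payload; the remaining
        // ops become empty placeholders that still reserve sequence numbers.
        return batch.map((_, index) => ({
            contents: index === 0 ? compressed : undefined,
        }));
    }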

Chunking for compression

Op chunking for compression targets payloads which exceed the max batch size after compression. So, only payloads which are already compressed will be chunked. The default chunk size is set to 204,800 bytes and is not publicly configurable. This value represents both the size of the chunks and the threshold for the feature to activate. When this threshold is reached, the large op is split into smaller ops (chunks). The value of this config property enables a trade-off between large chunks / few ops and small chunks / many ops. Chunking is opaque to the server and implementations of the Fluid protocol do not need to alter their behavior to support this client feature.
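
To illustrate, here is a minimal sketch of splitting and reassembling a compressed payload at the 204,800-byte default; the function names are hypothetical, and real chunk ops also carry metadata to identify and order the chunks:

    const CHUNK_SIZE = 204_800; // bytes, the default described above

    // Sending side: split the compressed payload into fixed-size chunks.
    // (The payload is base64 text, so one character is one byte.)
    function toChunks(compressedPayload: string): string[] {
        const chunks: string[] = [];
        for (let offset = 0; offset < compressedPayload.length; offset += CHUNK_SIZE) {
            chunks.push(compressedPayload.slice(offset, offset + CHUNK_SIZE));
        }
        return chunks;
    }

    // Receiving side: buffer chunk ops as they arrive and reassemble the
    // original compressed payload once the final chunk is processed.
    function fromChunks(chunks: string[]): string {
        return chunks.join("");
    }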

Conclusion

In terms of performance and impact on latency, the results greatly depend on payload size, payload structure, network speed and CPU speed.

In general, compression trades higher compute costs for lower bandwidth consumption and lower storage requirements, while chunking slightly increases latency due to the overhead of splitting an op, sending the chunks, and reconstructing them on each client. Grouped batching heavily decreases the number of ops observed by the server and slightly decreases bandwidth requirements, as it merges all the ops in a batch into a single op and eliminates the per-op envelope overhead.

Developers can read more about these features, with examples, on GitHub.

Fluid Framework 2 is production-ready now! We are excited to see all the collaborative experiences that you will build with it.

