The Azure Event Grid client libraries support distributed tracing for the CloudEvents schema. They populate the Distributed Tracing extension that allows connecting event consumer telemetry to producer calls. The Event Grid documentation shows how to enable tracing in the producer. It also shows how to configure the Event Hubs or Service Bus subscription.
Azure Functions supports distributed tracing with Azure Monitor, which includes built-in tracing of executions and bindings, performance monitoring, and more.
Microsoft.Azure.WebJobs.Extensions.EventGrid package version 3.1.0 or later enables correlation for CloudEvents between producer calls and Functions Event Grid trigger executions, as shown in the screenshot below.
If you use an Event Grid trigger with the CloudEvents schema, you only need to update the Microsoft.Azure.WebJobs.Extensions.EventGrid package to the latest version. In other cases, Functions still traces trigger calls so you can monitor how events are processed. However, the connection between send calls on the producer and Function trigger calls is missing in Azure Monitor’s end-to-end trace.
In this post, I’ll show how to enable correlation in .NET and Java Functions for:
- Event Grid schema with Event Grid trigger
- CloudEvents schema with HTTP trigger (webhook)
Get started
Before enabling correlation, make sure you get telemetry from producer and Functions in Azure Monitor:
- If you publish CloudEvents using Azure Event Grid client library version 4 with distributed tracing enabled, it:
  - Populates the Distributed Tracing extension on the events.
  - Traces outgoing calls.
- If you use another library to publish events and it doesn’t support tracing, see the example of manual instrumentation for Event Grid schema below.
- Enable tracing on Azure Functions.
- If you use Event Grid trigger and CloudEvents schema in your Functions, update the Microsoft.Azure.WebJobs.Extensions.EventGrid package to version 3.1.0 or later. You don’t need to change your Functions code to enable correlation.
Connect send and Function calls
To correlate Azure Function executions to producer traces, we’re going to read context from the event and populate it on Azure Functions telemetry. For CloudEvents, we’ll just read it from the Distributed Tracing extension. For the Event Grid schema, we’ll have to read and write custom properties in the data payload.
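In both cases, the context travels as a W3C traceparent string. As a rough illustration of what “reading context” involves, here is a minimal sketch in plain Java that splits a traceparent into its trace ID, span ID, and flags. The class name TraceparentParser is hypothetical and isn’t part of any Azure SDK; the sketch only assumes the version 00 layout from the W3C Trace Context specification.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of any Azure SDK) that splits a W3C traceparent value. */
final class TraceparentParser {
    // Version "00", 32 hex chars of trace-id, 16 hex chars of parent-id, 2 hex chars of flags.
    private static final Pattern TRACEPARENT =
        Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    /** Returns {traceId, spanId, flags}, or null when the value is missing or malformed. */
    static String[] parse(String traceparent) {
        if (traceparent == null) {
            return null;
        }
        Matcher m = TRACEPARENT.matcher(traceparent);
        if (!m.matches()) {
            return null;
        }
        String traceId = m.group(1);
        String spanId = m.group(2);
        // The specification treats all-zero IDs as invalid.
        if (isAllZeros(traceId) || isAllZeros(spanId)) {
            return null;
        }
        return new String[] { traceId, spanId, m.group(3) };
    }

    private static boolean isAllZeros(String hex) {
        return hex.chars().allMatch(c -> c == '0');
    }
}
```

In the real examples later in this post, the .NET code uses fixed offsets and the Java code relies on the OpenTelemetry propagator, so you wouldn’t normally need a parser like this one.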
.NET
Azure Functions uses System.Diagnostics.Activity under the hood. We’ll update the Activity created by Functions to link the producer context. Links connect independent traces together in the Azure Monitor UX.
CloudEvents with webhook (HTTP) trigger
- Here, we extract the context from the CloudEvent and link it to the Activity. It’s done in the Azure Monitor-specific format:

    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Text.Json;

    // code omitted for brevity

    public static void LinkContext(this Activity activity, IDictionary<string, object> extensionAttributes)
    {
        if (activity != null &&
            extensionAttributes.TryGetValue("traceparent", out var tp) &&
            tp is string traceparent &&
            IsValidTraceparent(traceparent))
        {
            var link = new AzureMonitorLink
            (
                // parse traceparent according to https://www.w3.org/TR/trace-context/
                traceparent.Substring(3, 32), // traceId
                traceparent.Substring(36, 16) // spanId
            );

            // consider formatting JSON manually for best performance
            activity.AddTag("_MS.links", JsonSerializer.Serialize(new[] { link }));

            if (extensionAttributes.TryGetValue("tracestate", out var ts) && ts is string tracestate)
            {
                activity.TraceStateString = tracestate;
            }
        }
    }

    private static bool IsValidTraceparent(string traceparent) =>
        traceparent != null && traceparent.StartsWith("00-") && traceparent.Length == 55;

    // Property names match Azure Monitor's over-the-wire format. Don't change them.
    private readonly record struct AzureMonitorLink(string operation_Id, string id);
- Now we need to call the LinkContext method in the Function execution as early as possible.

    using Azure.Messaging;
    using System.Diagnostics;

    // code omitted for brevity

    [FunctionName("MyFunction")]
    public static async Task<IActionResult> RunAsync(
        [HttpTrigger(AuthorizationLevel.Anonymous, "POST", "OPTIONS", Route = "handler")] HttpRequest req,
        ILogger log)
    {
        // handshake, code omitted for brevity

        var @event = CloudEvent.Parse(BinaryData.FromStream(req.Body));

        // Activity can be null if Azure Monitor isn't enabled.
        Activity.Current?.LinkContext(@event.ExtensionAttributes);

        // ...
    }
- Deploy your Function and trigger it. It takes up to 5 minutes for data to propagate and become accessible through the Azure portal.
- View your traces in the Azure portal by navigating to Transaction search on the left for your Application Insights resource. Select See all data in the last 24 hours.
The following Azure portal screenshot shows Azure Function consumer calls linked to the producer:
CloudEvents: batch processing
With batching enabled on the Event Grid subscription, the Function receives multiple events at once, and we’ll link each of them.
- Let’s add a LinkContext method that populates several links to Activity. Adding multiple tracestate properties is unsupported, but it doesn’t affect correlation between producer and Function.

    using Azure.Messaging;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Text.Json;

    // code omitted for brevity

    public static void LinkContext(this Activity activity, IEnumerable<CloudEvent> events)
    {
        if (activity != null && events.Any())
        {
            var links = new List<AzureMonitorLink>();
            foreach (CloudEvent @event in events)
            {
                if (@event.ExtensionAttributes.TryGetValue("traceparent", out var tp) &&
                    tp is string traceparent &&
                    IsValidTraceparent(traceparent))
                {
                    links.Add(new AzureMonitorLink(
                        traceparent.Substring(3, 32),    // traceId
                        traceparent.Substring(36, 16))); // spanId

                    // multiple tracestates are not currently supported.
                }
            }

            activity.AddTag("_MS.links", JsonSerializer.Serialize(links));
        }
    }
- Now we need to call the LinkContext method in Function execution as soon as events are deserialized.

    using Azure.Messaging;
    using System.Diagnostics;

    // code omitted for brevity

    [FunctionName("BatchWebhook")]
    public static async Task<IActionResult> RunAsync(
        [HttpTrigger(AuthorizationLevel.Anonymous, "POST", "OPTIONS", Route = "handler")] HttpRequest req,
        ILogger log)
    {
        // handshake, code omitted for brevity

        var events = CloudEvent.ParseMany(BinaryData.FromStream(req.Body));
        Activity.Current?.LinkContext(events);

        // ...
    }
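To make the shape of the _MS.links tag value more concrete, here is a small sketch of the string that the batch code assembles. It’s written in plain Java with no dependencies, mirrors the IsValidTraceparent check and the substring offsets from the .NET code, and assumes the default System.Text.Json output for a record with operation_Id and id properties. The class name AzureMonitorLinks is hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of the "_MS.links" tag value; not an Azure SDK type. */
final class AzureMonitorLinks {
    // Same check as IsValidTraceparent in the .NET example: version 00 and fixed length.
    static boolean isValidTraceparent(String traceparent) {
        return traceparent != null && traceparent.startsWith("00-") && traceparent.length() == 55;
    }

    /** Builds a JSON array with one link object per valid traceparent, skipping the rest. */
    static String toJson(List<String> traceparents) {
        return traceparents.stream()
            .filter(AzureMonitorLinks::isValidTraceparent)
            .map(tp -> "{\"operation_Id\":\"" + tp.substring(3, 35)   // traceId
                     + "\",\"id\":\"" + tp.substring(36, 52) + "\"}") // spanId
            .collect(Collectors.joining(",", "[", "]"));
    }
}
```

Events without a valid traceparent simply produce no link, so a batch in which no event carries context yields an empty array.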
Event Grid schema
The Event Grid schema doesn’t have dedicated properties to propagate trace context. We’ll use custom properties, inject them into the event on the producer side, and then read them in the Function. We’ll use the same approach to link producer trace context to Activity created by Azure Functions.
- Update your Event Grid data model definition to include traceparent and tracestate properties:

    using System.Text.Json.Serialization;

    // code omitted for brevity

    internal readonly record struct EventGridData
    {
        [JsonPropertyName("traceparent")]
        public string Traceparent { get; init; }

        [JsonPropertyName("tracestate")]
        public string Tracestate { get; init; }

        // code omitted for brevity
    }
- On the producer side, we’ll create a new Activity and add traceparent and tracestate to event data.
  - If you use the Application Insights SDK, track the new DependencyTelemetry using the StartOperation method. It will create a new Activity under the hood. Inject the context of the new Activity to the event data.

        using Azure.Messaging.EventGrid;
        using Microsoft.ApplicationInsights;
        using Microsoft.ApplicationInsights.DataContracts;
        using System.Diagnostics;

        // code omitted for brevity

        using (var sendEventDependency = telemetryClient.StartOperation<DependencyTelemetry>("Send Event Grid event"))
        {
            sendEventDependency.Telemetry.Type = "InProc";

            var eventData = new EventGridData
            {
                Traceparent = Activity.Current.Id,
                Tracestate = Activity.Current.TraceStateString
            };

            var @event = new EventGridEvent("subject", "type", "data-version", eventData);
            await eventCollector.AddAsync(@event);
        }

  - If you use OpenTelemetry (experimental support), create a new Activity using custom ActivitySource. For more information on using ActivitySource, see Adding distributed tracing instrumentation. As a note, Activity can be null here.

        using Azure.Messaging.EventGrid;
        using System.Diagnostics;

        // code omitted for brevity

        // make sure to enable this ActivitySource when configuring OpenTelemetry
        private static readonly ActivitySource source = new ActivitySource("MyEventGridProducer");

        // code omitted for brevity

        using (var sendActivity = source.StartActivity("Send Event Grid event"))
        {
            var eventData = new EventGridData
            {
                Traceparent = sendActivity?.Id,
                Tracestate = sendActivity?.TraceStateString
            };

            var @event = new EventGridEvent("subject", "type", "data-version", eventData);
            await publisherClient.SendEventAsync(@event);
        }
- Azure Functions consumer changes are similar to the CloudEvents example above. The only difference is how traceparent and tracestate are obtained from the data property. Modify this code to use your data model definition.

    using System.Diagnostics;
    using System.Text.Json;
    using System.Text.Json.Serialization;

    // code omitted for brevity

    public static void LinkContext(this Activity activity, EventGridData eventData)
    {
        if (activity != null && IsValidTraceparent(eventData.Traceparent))
        {
            var link = new AzureMonitorLink(
                eventData.Traceparent.Substring(3, 32),
                eventData.Traceparent.Substring(36, 16));

            // consider formatting JSON manually for best performance
            activity.AddTag("_MS.links", JsonSerializer.Serialize(new[] { link }));
            activity.TraceStateString = eventData.Tracestate;
        }
    }
- Call the LinkContext method in Function execution as early as possible.

    using Azure.Messaging.EventGrid; // EventGridEvent lives in this namespace
    using System.Diagnostics;

    // code omitted for brevity

    [FunctionName("EventGridFunction")]
    public void RunEventGrid([EventGridTrigger] EventGridEvent @event, ILogger log)
    {
        Activity.Current?.LinkContext(@event.Data.ToObjectFromJson<EventGridData>());

        // ...
    }
The following screenshot shows Azure Function consumer calls correlated with producer in the Transaction diagnostics. In this case, the producer is instrumented with OpenTelemetry:
Java
Azure Functions supports distributed tracing for bindings in Java without extra configuration. Azure Monitor preview support enables collection of custom and rich telemetry from Java Functions. We’ll need it to correlate Azure Functions and event producer.
- Enable Azure Monitor for Java Function apps (preview).
- Add a dependency on the OpenTelemetry API package: io.opentelemetry:opentelemetry-api. For more information, see OpenTelemetry documentation.
- Obtain an OpenTelemetry tracer instance. We’ll use it to start a new span.

    import io.opentelemetry.api.GlobalOpenTelemetry;
    import io.opentelemetry.api.trace.Tracer;

    // code omitted for brevity

    private final static Tracer TRACER = GlobalOpenTelemetry.getTracer("my-function");
The examples below use com.azure.core.models.CloudEvent and com.azure.messaging.eventgrid.EventGridEvent models from Azure SDKs. You can get them by adding a dependency on com.azure:azure-messaging-eventgrid.
If you use different implementations, you might need to adjust these examples for your use case.
CloudEvents schema
The Microsoft.Azure.WebJobs.Extensions.EventGrid package (version 3.1.0 or later) enables correlation for CloudEvents within Event Grid triggers on Java workers. Check if the extension bundle you use includes this support. You may also update Microsoft.Azure.WebJobs.Extensions.EventGrid by switching to explicit extension installation; however, using extension bundles is recommended.
If you can’t update Microsoft.Azure.WebJobs.Extensions.EventGrid yet, you can still enable correlation using the following steps:
- Add a helper class instance that reads the trace context from the CloudEvent extension attributes:

    import io.opentelemetry.context.propagation.TextMapGetter;
    import java.util.List;
    import java.util.Map;

    // code omitted for brevity

    private static final Iterable<String> KEYS = List.of("traceparent", "tracestate");

    private static final TextMapGetter<Map<String, Object>> CLOUD_EVENT_GETTER = new TextMapGetter<Map<String, Object>>() {
        @Override
        public Iterable<String> keys(Map<String, Object> carrier) {
            return KEYS;
        }

        @Override
        public String get(Map<String, Object> carrier, String key) {
            // return null for events published without trace context
            Object value = carrier.get(key);
            return value == null ? null : value.toString();
        }
    };
- We’ll read events from string input here. Since there could be a batch of events, depending upon Event Grid subscription configuration, we’ll get the trace context from each of them. We can’t modify telemetry reported by the Azure Functions runtime here. So we’ll create a new span and link trace contexts from all of the events.

    import com.azure.core.models.CloudEvent;
    import io.opentelemetry.api.trace.Span;
    import io.opentelemetry.api.trace.SpanBuilder;
    import io.opentelemetry.api.trace.StatusCode;
    import io.opentelemetry.api.trace.propagation.W3CTraceContextPropagator;
    import io.opentelemetry.context.Context;
    import io.opentelemetry.context.Scope;

    // code omitted for brevity

    @FunctionName("CloudEvent")
    public void processCloudEvents(@EventGridTrigger(name="eventsStr") String eventsStr, final ExecutionContext context) {
        List<CloudEvent> events = CloudEvent.fromString(eventsStr);

        SpanBuilder spanBuilder = TRACER.spanBuilder("Process CloudEvents");
        events.stream().forEach(event -> {
            // extract trace context from the event using OpenTelemetry propagator.
            Context eventContext = W3CTraceContextPropagator.getInstance()
                .extract(Context.current(), event.getExtensionAttributes(), CLOUD_EVENT_GETTER);
            spanBuilder.addLink(Span.fromContext(eventContext).getSpanContext());
        });

        Span span = spanBuilder.startSpan();
        try (Scope scope = span.makeCurrent()) {
            // process events here
        } catch (Throwable t) {
            span.setStatus(StatusCode.ERROR);
            throw t;
        } finally {
            span.end();
        }
    }
If you’d like to trace each event in the batch separately, you can modify this example to create a span for each event. We’re rethrowing the exception here, so Azure Functions will record it. If you don’t want to rethrow the exception, you’ll probably want to record it with span.recordException(t).
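One detail in the getter from the first step is worth a closer look: events published without tracing enabled won’t carry a traceparent attribute, and OpenTelemetry’s TextMapGetter contract expects null for a missing key. The following dependency-free sketch shows a null-safe lookup over the extension attributes map. The class and method names are hypothetical.

```java
import java.util.Map;

/** Hypothetical helper showing a null-safe read from CloudEvent extension attributes. */
final class ExtensionAttributes {
    /** Returns the attribute as a string, or null when the map or the key is absent. */
    static String getString(Map<String, Object> carrier, String key) {
        if (carrier == null) {
            return null;
        }
        Object value = carrier.get(key);
        // Calling toString() directly on a missing value would throw NullPointerException.
        return value == null ? null : value.toString();
    }
}
```

With a lookup like this, events that carry no trace context are skipped by the propagator, and the remaining events in the batch are still linked.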
Event Grid schema
Similarly to the .NET example, we’ll use custom properties in event data. Those properties will be populated on the event producer. On the consumer side, we’ll start a new span and link it to the producer trace context.
- Update your Event Grid data model definition to include traceparent and tracestate properties:

    import com.fasterxml.jackson.annotation.JsonProperty;

    // code omitted for brevity

    static class EventGridData {
        @JsonProperty("traceparent")
        public String traceparent;

        @JsonProperty("tracestate")
        public String tracestate;

        // code omitted for brevity
    }
- Add a helper class instance that writes trace context to EventGridData:

    import io.opentelemetry.context.propagation.TextMapSetter;

    // code omitted for brevity

    private static final TextMapSetter<EventGridData> EVENT_GRID_SETTER = new TextMapSetter<EventGridData>() {
        @Override
        public void set(EventGridData carrier, String key, String value) {
            if ("traceparent".equals(key)) {
                carrier.traceparent = value;
            } else if ("tracestate".equals(key)) {
                carrier.tracestate = value;
            }
        }
    };
- Add traceparent and tracestate to event data on the producer side:

    private void sendEventGridEvent() {
        // change this to your event data model
        EventGridData eventData = new EventGridData();

        Span span = TRACER.spanBuilder("Send Event Grid event").startSpan();
        try (Scope unused = span.makeCurrent()) {
            // inject context into EventGridData
            W3CTraceContextPropagator.getInstance().inject(Context.current(), eventData, EVENT_GRID_SETTER);
            eventGridClient.sendEvent(new EventGridEvent("subject", "type", BinaryData.fromObject(eventData), "data-version"));
        } catch (Throwable t) {
            span.setStatus(StatusCode.ERROR);
            throw t;
        } finally {
            span.end();
        }
    }
- Similar to the CloudEvents example, read the trace context from the event data and trace event processing:

    import com.azure.messaging.eventgrid.EventGridEvent;

    // code omitted for brevity

    @FunctionName("EventGridEvent")
    public void processEventGridEvents(@EventGridTrigger(name="eventsStr") String eventsStr, final ExecutionContext context) {
        List<EventGridEvent> events = EventGridEvent.fromString(eventsStr);

        SpanBuilder spanBuilder = TRACER.spanBuilder("Process EventGridEvents");
        events.stream().forEach(event -> {
            EventGridData data = event.getData().toObject(EventGridData.class);
            Context eventContext = W3CTraceContextPropagator.getInstance()
                .extract(Context.current(), data, EVENT_GRID_GETTER);
            spanBuilder.addLink(Span.fromContext(eventContext).getSpanContext());
        });

        Span span = spanBuilder.startSpan();
        try (Scope scope = span.makeCurrent()) {
            // process events here
        } catch (Throwable t) {
            span.setStatus(StatusCode.ERROR);
            throw t;
        } finally {
            span.end();
        }
    }

    private static final TextMapGetter<EventGridData> EVENT_GRID_GETTER = new TextMapGetter<EventGridData>() {
        @Override
        public Iterable<String> keys(EventGridData carrier) {
            return KEYS;
        }

        @Override
        public String get(EventGridData carrier, String key) {
            if ("traceparent".equals(key)) {
                return carrier.traceparent;
            } else if ("tracestate".equals(key)) {
                return carrier.tracestate;
            }
            return null;
        }
    };
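To see how the setter on the producer and the getter on the consumer fit together, here is a small round-trip sketch that doesn’t depend on OpenTelemetry. It uses a hypothetical stand-in for the data model and a hand-formatted traceparent in the W3C version 00 layout; in the real code above, the propagator produces and parses that value for you.

```java
/** Hypothetical round trip of trace context through an event data carrier. */
final class TraceContextCarrier {
    /** Stand-in for the EventGridData model; only the two tracing fields are shown. */
    static final class EventData {
        String traceparent;
        String tracestate;
    }

    /** Formats a version 00 traceparent: 00-<trace-id>-<span-id>-<flags>. */
    static String formatTraceparent(String traceId, String spanId, boolean sampled) {
        return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
    }

    /** Producer side: writes a propagation field onto the carrier, ignoring unknown keys. */
    static void set(EventData carrier, String key, String value) {
        if ("traceparent".equals(key)) {
            carrier.traceparent = value;
        } else if ("tracestate".equals(key)) {
            carrier.tracestate = value;
        }
    }

    /** Consumer side: reads a propagation field back, or null for unknown or unset keys. */
    static String get(EventData carrier, String key) {
        if ("traceparent".equals(key)) {
            return carrier.traceparent;
        } else if ("tracestate".equals(key)) {
            return carrier.tracestate;
        }
        return null;
    }
}
```

Because both sides agree on the traceparent and tracestate key names, whatever the producer writes into the event data is exactly what the Function reads back before linking.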
The following screenshot shows Azure Function consumer calls correlated with the producer in the Transaction viewer. It shows the case in which batching is configured on the Event Grid subscription. The Functions runtime (with Microsoft.Azure.WebJobs.Extensions.EventGrid version 2) tracks a single Function execution that results in three calls to the Java worker.
Want to hear more?
Thanks for reading this Azure SDK blog post. What do you think of distributed tracing in the Azure SDK? We’re actively seeking feedback on this feature, so let us know!