Run Powerful Interactive Analytics Queries against Azure Service Fabric’s Internal Traces and Diagnostic Data
In this post, Sr. Consultant Kurt Schenk explains how to instrument your SF Cluster to send diagnostic data to Log Analytics.
CollectServiceFabricData (https://github.com/microsoft/CollectServiceFabricData) is a fantastic tool created and used by the Service Fabric Support team to easily ingest Azure Service Fabric traces and diagnostic data into your Log Analytics workspace or Azure Data Explorer instance. Once the data is ingested, you can run powerful interactive analytics queries against it.
These Service Fabric traces and diagnostic data are stored in the Azure Storage account that was configured in the cluster's ARM template. For example:
"storageAccountKey": "[listKeys(resourceId('Microsoft.Storage/storageAccounts', variables('applicationDiagnosticsStorageAccountName')),'2015-05-01-preview').key1]",
The Fabric Diagnostics Collection Agent (FabricDCA), which runs on each Service Fabric node, then sends the following traces and diagnostic data to this Azure Storage account:
- Service Fabric detailed .dtr logs in .zip files (fabriclogs-*)
- Service Fabric performance counter .blg files (fabriccounters-*)
- Service Fabric exception .dmp files (fabriccrashdump-*)
- Service Fabric events stored in Azure Storage tables
- Service Fabric setup .trace files
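To make the naming convention above concrete, here is a minimal Python sketch that sorts storage container names by the prefixes listed in the bullets. It is only an illustration: the helper names (`classify_container`, `group_containers`) and the sample container names are hypothetical, and the sketch does not call Azure Storage or reflect how CollectServiceFabricData itself discovers data.

```python
from fnmatch import fnmatchcase

# Glob patterns from the list above, mapped to the kind of data they hold.
CONTAINER_PATTERNS = {
    "fabriclogs-*": "detailed .dtr logs (zipped)",
    "fabriccounters-*": "performance counter .blg files",
    "fabriccrashdump-*": "exception .dmp files",
}


def classify_container(name):
    """Return the data kind for a container name, or None if no pattern matches."""
    for pattern, kind in CONTAINER_PATTERNS.items():
        if fnmatchcase(name, pattern):
            return kind
    return None


def group_containers(names):
    """Group container names by data kind, skipping names that match no pattern."""
    groups = {}
    for name in names:
        kind = classify_container(name)
        if kind is not None:
            groups.setdefault(kind, []).append(name)
    return groups


if __name__ == "__main__":
    # Hypothetical container names, for illustration only.
    sample = ["fabriclogs-example", "fabriccounters-example", "appdata"]
    for kind, containers in group_containers(sample).items():
        print(kind, "->", containers)
```

A listing like this can help you confirm that FabricDCA is actually uploading each kind of data before you point CollectServiceFabricData at the storage account.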
This data, especially the detailed .dtr logs, contains the most verbose traces available for Service Fabric. We therefore recommend first setting up the monitoring and diagnostics described here: https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-diagnostics-overview. Start troubleshooting (on Windows) with Perfmon counters, Windows Event Logs (Microsoft-ServiceFabric/Admin, Microsoft-ServiceFabric/Operational), ETW event sources (Microsoft-ServiceFabric-Services, Microsoft-ServiceFabric-Actors), and so on, all of which you can collect with your monitoring tools: for example, Azure SaaS offerings such as Azure Application Insights and Azure Log Analytics, or third-party solutions such as Splunk, ElasticSearch, and others. When these are not sufficient, having access to the detailed traces and diagnostic data that CollectServiceFabricData can capture is critical.
The CollectServiceFabricData GitHub repo includes example queries for both Azure Log Analytics and Azure Data Explorer: https://github.com/microsoft/CollectServiceFabricData/tree/master/docs. There are more examples for Azure Data Explorer (known internally at Microsoft as Kusto), but because both services use the Kusto query language, you can adapt the Azure Data Explorer examples to Azure Log Analytics with small modifications.
You can run this tool by downloading the latest release from https://github.com/microsoft/CollectServiceFabricData/releases. The full source is also available if you want to review it or compile it yourself.
Please try CollectServiceFabricData and share any issues or pull requests at https://github.com/microsoft/CollectServiceFabricData.