Perform advanced analytics on Application Insights data using Jupyter Notebook

To help you leverage your telemetry data and better monitor the behavior of your Azure applications, we are happy to provide a Jupyter Notebook template that extends the power of Application Insights. Instead of making ad hoc queries in the Application Insights portal when an issue arises, you can now write a Jupyter Notebook that routinely queries for telemetry data, performs advanced analytics, and sends the derived data back to Application Insights for monitoring and alerting. You can execute the Jupyter Notebook as an Azure WebJob, either on a schedule or via a webhook.

Through this approach, you can manipulate and analyze your telemetry data beyond the constraints of the query language and its limits. You can take advantage of the existing alerting system to monitor the newly derived data, rather than raw instrumentation data. The derived data can also be correlated with other metrics for root cause analysis, used to train machine learning models, and much more. In this blog post, you will find a step-by-step guide for operationalizing this template to perform advanced analytics on your telemetry data, as well as an example implementation.

Create a Jupyter Notebook

Create a new Notebook or clone the template. While Jupyter supports various programming languages, this blog post focuses on performing advanced analytics in Python 2.7.

Query for telemetry data from Application Insights

To query for telemetry data from an Application Insights resource, you need the Application ID and an API Key. Both can be found in the Application Insights portal, on the API Access blade under Configure.

!pip install --upgrade applicationinsights-jupyter

from applicationinsights_jupyter import Jupyter

API_URL = "https://api.aimon.applicationinsights.io/"
APP_ID = "REDACTED"
API_KEY = "REDACTED"
QUERY_STRING = """customEvents
| where timestamp >= ago(10m) and timestamp < ago(5m)
| where name == 'NodeProcessStarted'
| summarize pids=makeset(tostring(customDimensions.PID)) by cloud_RoleName, cloud_RoleInstance, bin(timestamp, 1m)"""

jupyterObj = Jupyter(APP_ID, API_KEY, API_URL)
jupyterObjData = jupyterObj.getAIData(QUERY_STRING)

Get more information by accessing the API.
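For readers who want to see what such a query looks like at the HTTP level, here is a minimal sketch using only the standard library. It assumes the query endpoint follows the public Application Insights REST API shape (`v1/apps/<Application ID>/query` with the key passed in an `x-api-key` header). Whether `applicationinsights-jupyter` does exactly this internally is an assumption, and `build_query_request` and `run_query` are hypothetical helper names.

```python
import json

try:  # Python 3
    from urllib.parse import quote
    from urllib.request import Request, urlopen
except ImportError:  # Python 2.7
    from urllib import quote
    from urllib2 import Request, urlopen


def build_query_request(api_url, app_id, api_key, query):
    """Build (but do not send) a GET request for the query endpoint."""
    url = "{0}/v1/apps/{1}/query?query={2}".format(
        api_url.rstrip("/"), app_id, quote(query, safe=""))
    # The API key travels in a header rather than in the URL.
    return Request(url, headers={"x-api-key": api_key})


def run_query(request):
    """Send the request and decode the JSON response body."""
    response = urlopen(request)
    try:
        return json.loads(response.read().decode("utf-8"))
    finally:
        response.close()
```

With the variables defined above, `run_query(build_query_request(API_URL, APP_ID, API_KEY, QUERY_STRING))` would issue the same query without the helper package.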

Send derived data back to Application Insights

To send data to an Application Insights resource, you need the Instrumentation Key. It can be found in the Application Insights portal, on the Overview blade.

!pip install applicationinsights

from applicationinsights import TelemetryClient

IKEY = "REDACTED"
tc = TelemetryClient(IKEY)

tc.track_metric("crashCount", 1)
tc.flush()

Get more information by accessing the API.
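When a Notebook derives several values in one run, it can be convenient to send them in a single pass and flush once at the end. The sketch below assumes a client object that exposes `track_metric(name, value, properties=...)` and `flush()`, as `TelemetryClient` does. `send_derived_metrics` and `RecordingClient` are hypothetical names introduced here, and the stand-in client exists only so the logic can be dry-run without sending telemetry.

```python
def send_derived_metrics(client, metrics, properties=None):
    """Track each derived metric, then flush so nothing stays buffered."""
    for name, value in sorted(metrics.items()):
        client.track_metric(name, value, properties=properties)
    client.flush()
    return len(metrics)


class RecordingClient(object):
    """Hypothetical stand-in that records calls instead of sending them."""

    def __init__(self):
        self.metrics = []
        self.flushed = False

    def track_metric(self, name, value, properties=None):
        self.metrics.append((name, value, properties))

    def flush(self):
        self.flushed = True
```

In the Notebook itself, `send_derived_metrics(tc, {"crashCount": 1})` would use the `tc` client created above. Flushing explicitly matters in a WebJob because the process exits as soon as the Notebook finishes.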

Execute the Notebook using Azure WebJob

To execute the Notebook using Azure WebJob, the Notebook, its dependencies, and the Jupyter server need to be uploaded onto an Azure App Service container.

Prepare the necessary resources

Download the Notebook onto your machine.
Install the Jupyter server using Anaconda.
Execute the Notebook on your machine to install all dependencies, because the App Service container does not allow changes to the directories where the modules would otherwise be installed automatically.
Update the path in a dependency to reflect the App Service container's directory. Replace the first line in Anaconda2/Scripts/jupyter-nbconvert-script.py with
#!D:/home/site/wwwroot/App_Data/resources/Anaconda2/python.exe
Update the local copy of the Notebook, excluding pip commands.
Create a run.cmd file containing the following script
D:\home\site\wwwroot\App_Data\resources\Anaconda2\Scripts\jupyter nbconvert --execute <Your Notebook Name>.ipynb
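The step of removing pip commands from the local copy can also be scripted. The following is a minimal sketch, assuming the Notebook is stored in the standard `.ipynb` JSON layout (a `cells` list whose code cells carry a `source` field) and that install commands are written as lines starting with `!pip`. `strip_pip_commands` is a hypothetical helper, not part of the template.

```python
import json


def strip_pip_commands(notebook):
    """Return a copy of a notebook dict without '!pip ...' lines in code cells."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy of plain JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # leave markdown and raw cells untouched
        source = cell.get("source", [])
        # nbformat allows "source" to be a list of lines or a single string.
        as_string = not isinstance(source, list)
        lines = source.splitlines(True) if as_string else source
        kept = [line for line in lines if not line.lstrip().startswith("!pip")]
        cell["source"] = "".join(kept) if as_string else kept
    return cleaned
```

To apply it, load the file with `json.load`, pass the result through `strip_pip_commands`, and write it back with `json.dump` before uploading.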

FTP resources

Obtain deployment credentials and FTP connection information.
FTP the Anaconda2 folder to a new directory in the App Service container
D:\home\site\wwwroot\App_Data\resources
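The Anaconda2 folder contains many files, so a scripted upload may be easier than a manual one. Below is a rough sketch built on Python's standard `ftplib`. It assumes the FTP root of the site maps to `D:\home`, so the target directory would appear as `/site/wwwroot/App_Data/resources` over FTP; verify this against your own connection information. `plan_uploads` and `upload` are hypothetical helper names.

```python
import ftplib
import os
import posixpath


def plan_uploads(local_root, remote_root):
    """Pair every file under local_root with its destination path on the server."""
    pairs = []
    for dirpath, _dirnames, filenames in os.walk(local_root):
        rel_dir = os.path.relpath(dirpath, local_root)
        parts = [] if rel_dir == os.curdir else rel_dir.split(os.sep)
        for name in filenames:
            remote_path = posixpath.join(remote_root, *(parts + [name]))
            pairs.append((os.path.join(dirpath, name), remote_path))
    return sorted(pairs, key=lambda pair: pair[1])


def upload(ftp, pairs):
    """Send each planned file through an ftplib.FTP-compatible connection."""
    known_dirs = set()
    for local_path, remote_path in pairs:
        prefix = ""
        # Create the remote directory one level at a time.
        for part in posixpath.dirname(remote_path).strip("/").split("/"):
            prefix = prefix + "/" + part
            if prefix not in known_dirs:
                try:
                    ftp.mkd(prefix)
                except ftplib.error_perm:
                    pass  # most likely the directory already exists
                known_dirs.add(prefix)
        with open(local_path, "rb") as handle:
            ftp.storbinary("STOR " + remote_path, handle)
```

A connection could be opened with `ftplib.FTP_TLS(host)`, followed by `login(user, password)` and `prot_p()`, using the deployment credentials obtained in the first step. The host, user, and password are placeholders you supply.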

Operationalize the Notebook

Create a new Azure WebJob and upload the Notebook and run.cmd file.

An example implementation

We operationalized this template and have been performing advanced analytics on telemetry data of one of our own services.

Our service runs four Node.js processes on each cloud instance. From root cause analysis, we have noticed cases of Node.js crashes. However, due to limitations of the SDK, we cannot log when the crash occurs. So, we created a Jupyter Notebook to analyze the existing telemetry data to detect Node.js crashes.

A custom event, NodeProcessStarted, is logged when a new Node.js process starts in a cloud instance. Normally, all four processes start nearly simultaneously when they are recycled every 8-11 hours. So, when we see fewer than four NodeProcessStarted events occurring at a different frequency, we can infer that one or more new processes started to replace recently crashed ones.
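The core of that inference can be sketched in a few lines of Python. This is not our full implementation. It only covers the count check, and it assumes the query results have already been converted into a list of dictionaries keyed by the column names of the query shown earlier (`cloud_RoleName`, `cloud_RoleInstance`, `timestamp`, `pids`). `find_suspected_crashes` is a hypothetical helper name.

```python
EXPECTED_PROCESSES = 4  # the service described here runs four Node.js processes per instance


def find_suspected_crashes(rows, expected=EXPECTED_PROCESSES):
    """Return one entry per time bin in which fewer than `expected` processes started."""
    suspected = []
    for row in rows:
        started = len(set(row["pids"]))  # distinct process IDs seen in this bin
        if 0 < started < expected:
            # A partial start suggests replacements for crashed processes,
            # not the regular recycle of all processes at once.
            suspected.append({
                "cloud_RoleName": row["cloud_RoleName"],
                "cloud_RoleInstance": row["cloud_RoleInstance"],
                "timestamp": row["timestamp"],
                "restartedProcesses": started,
            })
    return suspected
```

The number of flagged bins, or the sum of `restartedProcesses`, is the kind of derived value that could then be sent back with `tc.track_metric("crashCount", ...)`.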

In this implemented template, you will see how we query for telemetry data, analyze the data, query for more telemetry data to enrich the analysis, and then send the derived data back to Application Insights.

We hope this template helps you derive actionable insights from telemetry data and better manage your Azure applications.
Source: Azure
