Businesses of every size and shape need to better understand their customers, their systems, and the impact of external factors on their business. How rapidly businesses mitigate risks and capitalize on opportunities can set successful businesses apart from those that can’t keep up. Anomaly detection (or, in broader terms, outlier detection) allows businesses to identify and act on changing user needs, detect and mitigate malignant actors and behaviors, and take preventive actions to reduce costly repairs.

The speed at which businesses identify anomalies can have a big impact on response times and, in turn, associated costs. For example, detecting a fraudulent financial transaction hours or days after it happens often results in writing off the financial loss. Finding the anomalous transaction within seconds allows the transaction to be invalidated and corrective actions to be taken to prevent future fraud. Similarly, by detecting anomalies in industrial equipment, manufacturers can predict and prevent catastrophic failures that could cause capital and human loss by initiating proactive equipment shutdowns and preventative maintenance. Likewise, detecting anomalous user behavior (for example, signing in to multiple accounts from the same location or device) can prevent malignant abuse, data breaches, and intellectual property theft.

In essence, anomalous events have an immediate value. If you don’t seize that value, it vanishes into irrelevance until there’s a large enough collection of events to perform retrospective analysis. (See image below for an illustration of that concept.) To avoid falling off this “value cliff,” many organizations are looking to stream analytics to provide a real-time anomaly detection advantage.

At Google Cloud, our customer success teams have been working with an increasing number of customers to help them implement streaming anomaly detection.
In working with such organizations to help them build anomaly detection systems, we realized that providing these reference patterns can significantly reduce the time to solution for those and future customers.

Reference patterns for streaming anomaly detection

Reference patterns are technical reference guides that offer step-by-step implementation and deployment instructions and sample code. Reference patterns mean you don’t have to reinvent the wheel to create an efficient architecture. While some of the specifics (e.g., what constitutes an anomaly, desired sensitivity level, alert a human vs. display in a dashboard) depend on the use case, most anomaly detection systems are architecturally similar and leverage a number of common building blocks. Based on that learning, we have now released a set of repeatable reference patterns for streaming anomaly detection to the reference patterns catalog (see the anomaly detection section).

These patterns implement the following step-by-step process:

1. Stream events in real time
2. Process the events, extract useful data points, train the detection algorithm of choice
3. Apply the detection algorithm in near-real time to the events to detect anomalies
4. Update dashboards and/or send alerts

Here’s an overview of the key patterns that let you implement this broader anomaly detection architecture:

Detecting network intrusion using K-means clustering

We recently worked with a telecommunications customer to implement streaming anomaly detection for NetFlow logs. In the past, we’ve seen that customers have typically implemented signature-based intrusion detection systems. Although this technique works well for known threats, it is difficult to detect new attacks because no pattern or signature is available. This is a significant limitation in times like now, when security threats are ever-present and the cost of a security breach is significant. To address that limitation, we built an unsupervised learning-based anomaly detection system.
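To make the clustering idea concrete, here is a minimal sketch of distance-based anomaly detection with K-means. It is an illustration only, not the reference pattern's code: the pattern trains its clustering model in BigQuery ML, whereas this sketch uses a small pure-Python K-means as a stand-in. The two-dimensional feature vectors, the function names, and the "mean plus three standard deviations" threshold rule are all assumptions made for the example.

```python
import math
import random
import statistics


def nearest_centroid(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    distances = [math.dist(point, c) for c in centroids]
    index = min(range(len(centroids)), key=distances.__getitem__)
    return index, distances[index]


def fit_kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns k centroids as tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx, _ = nearest_centroid(p, centroids)
            clusters[idx].append(p)
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def fit_threshold(points, centroids, factor=3.0):
    """Assumed rule: mean + factor * std of training distances to the nearest centroid."""
    distances = [nearest_centroid(p, centroids)[1] for p in points]
    return statistics.mean(distances) + factor * statistics.pstdev(distances)


def is_anomaly(point, centroids, threshold):
    """Flag a point that lies farther from every centroid than the threshold."""
    return nearest_centroid(point, centroids)[1] > threshold


# Hypothetical "normal" traffic: two tight groups of 2-D feature vectors.
normal = [
    (bx + 0.1 * i, by + 0.1 * j)
    for bx, by in [(1.0, 1.0), (5.0, 5.0)]
    for i in range(5)
    for j in range(5)
]
centroids = fit_kmeans(normal, k=2)
threshold = fit_threshold(normal, centroids)

print(is_anomaly((1.2, 1.3), centroids, threshold))  # close to a known cluster
print(is_anomaly((9.0, 0.5), centroids, threshold))  # far from both clusters
```

Because the model only learns what normal traffic looks like, a previously unseen attack can still be flagged as long as its features fall far from every cluster, which is the gap that signature-based systems leave open.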
We also published a detailed writeup: Anomaly detection using streaming analytics and AI. The following video gives a step-by-step overview of implementing the anomaly detection system. Keep in mind that the architecture and steps in the video can be applied to other problem domains as well, not just network logs.

Detecting fraudulent financial transactions using Boosted Trees

While the previous pattern used a clustering algorithm (trained in BigQuery ML), the “Finding anomalies in financial transactions in real time using Boosted Trees” pattern uses a different ML technique called BoostedTrees. BoostedTrees is an ensemble technique that makes predictions by combining output from a series of base models. This pattern follows the same high-level architecture and uses Google Cloud AI Platform (CAIP) to perform predictions. One of the neat things in the reference pattern is the use of micro-batching to group together the calls to the CAIP Prediction API, so that each request carries several events instead of one. This helps keep a high volume of streaming data from running into API quota issues. Here’s what the architecture looks like:

Time series outlier detection using LSTM autoencoder

Many anomaly detection scenarios involve time series data (a series of data points ordered by time, typically evenly spaced in the time domain). One of the key challenges with time series data is that it needs to be preprocessed to fill any gaps in the data (caused by either source or transmission problems). Another common requirement is the need to aggregate metrics (e.g., Last, First, Min, Max, Count values) from the previous processing window when applying transforms to the current time window. We created a GitHub library that solves these problems for streaming data and jump-starts your implementation for working with time series data.

These patterns are driven by needs we’ve seen in partnering with customers to solve problems.
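The two time series preprocessing needs described above (filling gaps and carrying metrics over from the previous window) can be sketched in a few lines. This is not the GitHub library's API: the `WindowSummary` type, the `summarize` function, and the choice of forward-filling a gap with the previous window's Last value are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class WindowSummary:
    window_start: int            # start of the fixed window, in seconds
    first: float
    last: float
    minimum: float
    maximum: float
    count: int                   # 0 marks a gap-filled window
    prev_last: Optional[float]   # Last value of the previous window, if any


def summarize(events: List[Tuple[int, float]], window_size: int,
              start: int, end: int) -> List[WindowSummary]:
    """Aggregate (timestamp, value) events into fixed windows over [start, end)."""
    buckets: Dict[int, List[Tuple[int, float]]] = {}
    for ts, value in events:
        window_start = ts - (ts - start) % window_size
        buckets.setdefault(window_start, []).append((ts, value))

    summaries: List[WindowSummary] = []
    prev_last: Optional[float] = None
    for window_start in range(start, end, window_size):
        points = sorted(buckets.get(window_start, []))
        if points:
            values = [v for _, v in points]
            summary = WindowSummary(window_start, values[0], values[-1],
                                    min(values), max(values), len(values), prev_last)
        elif prev_last is not None:
            # Gap: no events arrived, so forward-fill with the previous Last value.
            summary = WindowSummary(window_start, prev_last, prev_last,
                                    prev_last, prev_last, 0, prev_last)
        else:
            continue  # nothing seen yet, so there is nothing to carry forward
        summaries.append(summary)
        prev_last = summary.last
    return summaries


# Hypothetical sensor readings with no events between t=10 and t=20.
events = [(0, 10.0), (3, 12.0), (7, 11.0), (21, 15.0)]
for s in summarize(events, window_size=5, start=0, end=25):
    print(s)
```

With evenly spaced, gap-free windows like these, a downstream model such as an LSTM autoencoder receives fixed-length sequences regardless of dropped or delayed events.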
The challenge of finding the important insight or deviation in a sea of data is not unique to any one business or industry; it applies to all. Regardless of where you are starting, we look forward to helping you on the journey to streaming anomaly detection. To get started, head to the anomaly detection section in our catalog of reference patterns. If you have implemented a smart analytics reference pattern, we want to hear from you. Complete this short survey to let us know about your experience.
Source: Google Cloud Platform