Introducing a New Way to Try Red Hat OpenShift Online Pro

Red Hat OpenShift Online hosting has been available since 2011, and to date, well over 4 million applications have been launched on OpenShift Online. This service has been available in two tiers: the free Starter plan and the paid Pro plan. Both tiers offered the same OpenShift experience, with the Starter plan geared toward developers […]
Source: OpenShift

How Box Skills can optimize your workflow with the help of Cloud AI

Have you ever had to manually upload and tag a lot of files? It's no fun. Increasingly, though, machine learning algorithms can help you or your team classify and tag large volumes of content automatically. And if your company uses Box, a popular file sharing, storage, and collaboration service, you can now apply Google ML services to your files with just a few lines of code, using the Box Skills Kit, a new framework within Box's developer toolkit.

With technologies like image recognition, speech-to-text transcription, and natural language understanding, Google Cloud makes it easy to enrich your Box files with useful metadata. For example, if you have lots of images in your repository, you can use the Cloud Vision API to understand more about each image, such as the objects or landmarks it contains; for documents, you can parse their contents and identify elements that determine the document's category. If your needs extend beyond the functionality provided by Cloud Vision, you can point your Skill at a custom endpoint that serves your own custom-trained model.

An example integration in action

Now, let's look at an example. Many businesses use Box to store images of their products. With the Box Skills Kit and the product search functionality in the Cloud Vision API, you can automatically catalog these products. When a user uploads a new product image into Box, the product search feature within the Vision API helps identify similar products in the catalog, as well as the maximum price for such a product.

Configuring and deploying a product search Box Skill

Let's look at how you can use the Box Skills Kit to implement the use case outlined above.

1. Create an endpoint for your Skill:
   a. Follow this QuickStart guide.
   b. You can use this API endpoint to call a pre-trained machine learning model to classify new data.
   c. Create a Cloud Function to point your Box Skill at the API endpoint created above.
   d. Clone the following repository.
   e. Next, follow the instructions to deploy the function to your project.
   f. Make a note of the endpoint's URI.
2. Create a Box Custom Skills App in Box, then configure it to point to the Cloud Function created above:
   a. Follow these instructions.
   b. Then these instructions.

And there you have it: you now have a new custom Box Skill, enabled by Cloud AI, that's ready to use. Try uploading a new image to your Box drive, and notice that the maximum retail price and information on similar products are both displayed under the "skills" console.

Using your new Skill

Now that you're all set up, you can begin by uploading an image file of household goods, apparel, or toys into your Box drive. The upload triggers a Box Skill event workflow, which calls the Cloud Function you deployed in Google Cloud, at the endpoint you specified in the Box Admin Console. The Cloud Function uses the Box Skills Kit's FileReader API to read the base64-encoded image string that Box automatically sends when the upload trigger occurs. The Function then calls the product search feature of the Cloud Vision API and creates a Topics Card from the data returned by product search. Next, it creates a Faces Card in which to populate a thumbnail scaled down from the original image. Finally, the Function persists the skills cards within Box using the SkillsWriter API.
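The sample Skill linked above is implemented with the Box Skills Kit for Node.js, so the following is only a hedged Python sketch of the Cloud Vision side of the flow. The project ID, location, product set ID, and file name are placeholders, and storing a maximum retail price in product labels is an assumption about how the catalog is modeled:

# Hedged sketch: look up similar catalog products for an uploaded image
# using the Cloud Vision API product search feature.
from google.cloud import vision

search_client = vision.ProductSearchClient()
annotator = vision.ImageAnnotatorClient()

# Placeholder identifiers -- substitute your own project and product set.
product_set = search_client.product_set_path(
    project='my-project', location='us-west1', product_set='my-product-set')

with open('uploaded_image.jpg', 'rb') as f:
    image = vision.Image(content=f.read())

params = vision.ProductSearchParams(
    product_set=product_set,
    product_categories=['homegoods-v2'])  # or 'apparel-v2', 'toys-v2'
context = vision.ImageContext(product_search_params=params)

response = annotator.product_search(image, image_context=context)
for result in response.product_search_results.results:
    product = result.product
    # Each result is a similar catalog product with a similarity score;
    # metadata such as a maximum retail price could live in product labels.
    print(product.display_name, result.score,
          [(label.key, label.value) for label in product.product_labels])

In the actual Skill, the equivalent lookup happens inside the Cloud Function, and the results are written back to Box as skills cards.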
Now you can open the image in Box drive and click the "skills" menu (which expands when you click the "magic wand" icon on the right), and you'll see product catalog information, with similar products and the maximum price populated.

What's next?

Over the past several years, Google Cloud and Box have built a variety of tools to make end users more productive. Today, the Box Skills integration opens the door to a whole new world of advanced AI tools and services: in addition to accessing pre-trained models via the Vision API, Video Intelligence API, or Speech-to-Text API, data scientists can train and host custom models written in TensorFlow, scikit-learn, Keras, or PyTorch on Cloud ML Engine. Lastly, Cloud AutoML lets you train a model on your dataset without having to write any code. Whatever your level of comfort with code or data science, we're committed to making it easy for you to run machine-learning-enhanced annotations on your data.

You can find all the code discussed in this post, along with its documentation, in its GitHub repository. Goodbye, tedious repetition! Hello, productivity.
Source: Google Cloud Platform

AI in Depth: serving a Keras text classifier with preprocessing using Cloud ML Engine

Cloud ML Engine now supports deploying trained models with custom online prediction Python code, currently in beta. In this blog post, we show how custom online prediction code helps maintain affinity between your preprocessing logic and your model, which is crucial to avoid training-serving skew. As an example, we build a Keras text classifier and deploy it for online serving on Cloud ML Engine, along with its text preprocessing components.

[Figure: Cloud ML Engine pre-processing, training, and classification diagram]

Background

The hard work of building a machine learning (ML) model pays off only when you deploy the model and use it in production, whether you integrate it into your pre-existing systems or incorporate it into a novel application. If your model has multiple possible consumers, you might want to deploy it as an independent, coherent microservice, invoked via a REST API that can automatically scale to meet demand. Although Cloud ML Engine may be better known for its training abilities, it can also serve TensorFlow, Keras, scikit-learn, and XGBoost models with REST endpoints for online prediction.

While training a model, it's common to transform the input data into a format that improves model performance. When performing predictions, the model then expects the input data to already exist in that transformed form: a normalized numerical feature, a TF-IDF encoding of terms in text, or a constructed feature based on a complex, custom transformation, for example. However, the callers of your model will send "raw", untransformed data, and the caller doesn't (or shouldn't) need to know which transformations are required. This means the model microservice is responsible for applying the required transformations to the data before invoking the model for prediction.

The affinity between the preprocessing routines and the model (that is, having both coupled in the same service) is crucial to avoid training-serving skew, since you'll want to ensure that these routines are applied to any data sent to the model, with no assumptions about how the callers prepare the data. Moreover, the model-preprocessing affinity helps decouple the model from the caller: if a new model version requires new transformations, the preprocessing routines can change independently of the caller, which keeps sending data in its raw format.

Besides preprocessing, your deployed model's microservice might also perform other operations, including postprocessing of the prediction produced by the model, or even more complex prediction routines that combine the predictions of multiple models.

To help maintain affinity of preprocessing between training and serving, Cloud ML Engine now enables users to customize the prediction routine that gets called when sending prediction requests to a model deployed on Cloud ML Engine. This feature allows you to upload a custom model prediction class, along with your exported model, to apply custom logic before or after invoking the model for prediction.

Customizing prediction routines can be useful in the following scenarios:

• Applying (state-dependent) preprocessing logic to transform incoming data points before invoking the model for prediction.
• Applying (state-dependent) post-processing logic to the model's prediction before sending the response to the caller. For example, you might want to convert the class probabilities produced by the model to a class label.
• Integrating rule-based and heuristics-based prediction with model-based prediction.
• Applying a custom transform used in fitting a scikit-learn pipeline.
• Performing complex prediction routines based on multiple models, that is, aggregating predictions from an ensemble of estimators, or calling one model based on the output of another in a hierarchical fashion.
These tasks can be accomplished with custom online prediction, using the standard frameworks supported by Cloud ML Engine, as well as with any model developed in your favorite Python-based framework, including PyTorch. All you need to do is include the dependency libraries in the setup.py of your custom model package (as discussed below). Note that without this feature, you would need to implement the preprocessing, post-processing, or any custom prediction logic in a "wrapper" service, using, for example, App Engine. Such an App Engine service would also be responsible for calling the Cloud ML Engine models, but this approach adds complexity to the prediction system, as well as latency to prediction time.

Next we'll demonstrate how we built a microservice that handles both preprocessing and post-processing using Cloud ML Engine custom online prediction, with text classification as the example. We chose to implement the text preprocessing logic and build the classifier using Keras, but thanks to Cloud ML Engine custom online prediction, you could implement the preprocessing using other libraries (like NLTK or scikit-learn) and build the model using any other Python-based ML framework (like TensorFlow or PyTorch). You can find the code for this example in this Colab notebook.

A text classification example

Text classification algorithms are at the heart of a variety of software systems that process text data at scale. The objective is to classify (categorize) text into a set of predefined classes, based on the text's content. This text can be a tweet, a web page, a blog post, user feedback, or an email; in the context of text-oriented ML models, a single text entry (like a tweet) is usually referred to as a "document."

Common use cases of text classification include:

• Spam filtering: classifying an email as spam or not.
• Sentiment analysis: identifying the polarity of a given text, such as tweets or product and service reviews.
• Document categorization: identifying the topic of a given document (for example, politics, sports, or finance).
• Ticket routing: identifying the department to which a ticket should be dispatched.

You can design your text classification model in two different ways, and choosing one or the other will influence how you'll need to prepare your data before training the model:

• N-gram models: In this option, the model treats a document as a "bag of words," or more precisely a "bag of terms," where a term can be one word (unigram), two words (bigram), or n words (n-gram). The ordering of the words in the document is not relevant. The feature vector representing a document encodes whether a term occurs in the document (binary encoding), how many times the term occurs in the document (count encoding), or, more commonly, a Term Frequency Inverse Document Frequency (TF-IDF) weight. Gradient-boosted trees and support vector machines are typical techniques used with n-gram models.
• Sequence models: With this option, the text is treated as a sequence of words or terms; that is, the model uses the word-ordering information to make the prediction. Types of sequence models include convolutional neural networks (CNNs), recurrent neural networks (RNNs), and their variations.

In our example, we utilize the sequence model approach.

Hacker News is one of many public datasets available in BigQuery. This dataset includes titles of articles from several data sources. For the following tutorial, we extracted the titles that belong to either GitHub, The New York Times, or TechCrunch, and saved them as CSV files in a publicly shared Cloud Storage bucket at the following location:

gs://cloud-training-demos/blogs/CMLE_custom_prediction

Here are some useful statistics about this dataset:

• Total number of records: 96,203
• Minimum, maximum, and average number of words per title: 1, 52, and 8.7
• Number of records from GitHub, The New York Times, and TechCrunch: 36,525, 28,787, and 30,891, respectively
• Training and evaluation split: 75% and 25%

The objective of the tutorial is to build a text classification model, using Keras, that identifies the source of an article given its title, and to deploy the model to Cloud ML Engine using custom online prediction so that text preprocessing and prediction post-processing can be performed at serving time.

Preprocessing text

Sequence tokenization with Keras

In this example, we perform the following preprocessing steps:

• Tokenization: Divide the documents into words. This step determines the "vocabulary" of the dataset, that is, the set of unique tokens present in the data. In this example, you make use of the 20,000 most frequent words and discard the rest from the vocabulary. This value is set through the VOCAB_SIZE parameter.
• Vectorization: Define a good numerical measure to characterize these documents. A given embedding's representation of the tokens (words) will be helpful when you're ready to train your sequence model; however, those embeddings are created as part of the model rather than as a preprocessing step. Thus, what you need here is simply to convert each token to a numerical indicator: each article's title is represented as a sequence of integers, each of which is an indicator of a token in the vocabulary that occurred in the title.
• Length fixing: After vectorization, you have a set of variable-length sequences. In this step, the sequences are converted to a single fixed length of 50, which can be configured using the MAX_SEQUENCE_LENGTH parameter. Sequences with more than 50 tokens are right-trimmed, while sequences with fewer than 50 tokens are left-padded with zeros.

Both the tokenization and vectorization steps are stateful transformations. In other words, you extract the vocabulary from the training data (after tokenization, keeping the most frequent words) and create a word-to-indicator lookup, for vectorization, based on that vocabulary. This lookup will be used to vectorize new titles for prediction, so after creating the lookup, you need to save it in order to (re-)use it when serving the model.

The TextPreprocessor class in the preprocess.py module implements this logic with two methods:

• fit(): applied to the training data to generate the lookup (tokenizer). The tokenizer is stored as an attribute of the object.
• transform(): applies the tokenizer to any text data to generate fixed-length sequences of word indicators.

Preparing training and evaluation data

To prepare the training and evaluation data, each raw text title is converted to a NumPy array of 50 numeric indicators. Note that you use both fit() and transform() on the training data, while you only use transform() on the evaluation data, in order to use the tokenizer generated from the training data. The outputs, train_texts_vectorized and eval_texts_vectorized, will be used to train and evaluate the text classification model, respectively. Finally, you save the processor object (which includes the tokenizer generated from the training data) to be used when serving the model for prediction, by pickling it to a processor_state.pkl file.
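The post's preprocess.py code isn't reproduced in this excerpt, so the following is a minimal sketch consistent with the description above, written against tf.keras. The class name and constructor arguments follow the article's prose; the rest is an assumption about the original implementation:

# preprocess.py -- hedged sketch of the stateful text preprocessor
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer


class TextPreprocessor(object):
    def __init__(self, vocab_size, max_sequence_length):
        self._vocab_size = vocab_size
        self._max_sequence_length = max_sequence_length
        self._tokenizer = None

    def fit(self, text_list):
        # Stateful step: learn the vocabulary from the training titles only.
        tokenizer = Tokenizer(num_words=self._vocab_size)
        tokenizer.fit_on_texts(text_list)
        self._tokenizer = tokenizer

    def transform(self, text_list):
        # Vectorization: map each token to its integer index in the vocabulary.
        sequences = self._tokenizer.texts_to_sequences(text_list)
        # Length fixing: left-pad short sequences with zeros, right-trim long ones.
        return pad_sequences(sequences, maxlen=self._max_sequence_length,
                             padding='pre', truncating='post')

Applied to the Hacker News titles, the preparation step described above would then look roughly like this (the two-title lists are toy stand-ins for the titles loaded from the CSV files):

import pickle

VOCAB_SIZE = 20000
MAX_SEQUENCE_LENGTH = 50

# Toy stand-ins for the titles loaded from the CSV files described above.
train_texts = ['Show HN: my tiny window manager', 'Stocks close higher on trade hopes']
eval_texts = ['New sensors promise cheaper self-driving cars']

processor = TextPreprocessor(VOCAB_SIZE, MAX_SEQUENCE_LENGTH)
processor.fit(train_texts)  # fit() on the training data only
train_texts_vectorized = processor.transform(train_texts)
eval_texts_vectorized = processor.transform(eval_texts)

# Persist the fitted processor (and its tokenizer) for serving time.
with open('./processor_state.pkl', 'wb') as f:
    pickle.dump(processor, f)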
Training a Keras model

The model itself is a Sequential Keras model: an Embedding layer and a Dropout layer, followed by two Conv1D-plus-pooling blocks, and finally a Dense layer with a softmax activation. The model is compiled with the sparse_categorical_crossentropy loss and the accuracy (acc) evaluation metric. After creating the model with the required parameters, we train it on the training data, evaluate the trained model's quality on the evaluation data, and save the trained model to a keras_saved_model.h5 file.
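The post's create_model code is likewise not shown in this excerpt. Here is a hedged sketch matching the stated architecture; the embedding dimension, filter counts, kernel sizes, optimizer, training settings, and toy labels are illustrative assumptions, not the article's actual values:

# Continuing the sketch above; the labels are toy stand-ins
# (0=github, 1=nytimes, 2=techcrunch) for the integer-encoded sources.
import numpy as np
from tensorflow.keras.layers import (Conv1D, Dense, Dropout, Embedding,
                                     GlobalAveragePooling1D, MaxPooling1D)
from tensorflow.keras.models import Sequential


def create_model(vocab_size, embedding_dim, max_sequence_length, num_classes):
    model = Sequential([
        # The embedding is learned as part of the model, not in preprocessing.
        Embedding(input_dim=vocab_size, output_dim=embedding_dim,
                  input_length=max_sequence_length),
        Dropout(rate=0.2),
        Conv1D(filters=64, kernel_size=3, activation='relu'),
        MaxPooling1D(pool_size=3),
        Conv1D(filters=64, kernel_size=3, activation='relu'),
        GlobalAveragePooling1D(),
        Dense(num_classes, activation='softmax'),
    ])
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer='adam', metrics=['acc'])
    return model


train_labels = np.array([0, 2])
eval_labels = np.array([1])

model = create_model(VOCAB_SIZE, embedding_dim=64,
                     max_sequence_length=MAX_SEQUENCE_LENGTH, num_classes=3)
model.fit(train_texts_vectorized, train_labels, epochs=10, batch_size=128,
          validation_data=(eval_texts_vectorized, eval_labels))
model.save('keras_saved_model.h5')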
Implementing a custom model prediction class

In order to apply a custom prediction routine that includes preprocessing and postprocessing, you need to wrap this logic in a custom model prediction class. This class, along with the trained model and the saved preprocessing object, is used to deploy the Cloud ML Engine online prediction microservice. In our text classification example, this class (CustomModelPrediction) is implemented in the model_prediction.py module. Note the following points about the implementation:

• from_path is a classmethod responsible for loading both the model and the preprocessing object from their saved files, and for instantiating a new CustomModelPrediction object with the loaded model and preprocessor (both stored as attributes of the object).
• predict is the method invoked when you call the "predict" API of the deployed Cloud ML Engine model. The method does the following:
   1. Receives the instances (the list of titles) for which predictions are needed.
   2. Prepares the text data for prediction by applying the transform() method of the "stateful" self._processor object.
   3. Calls self._model.predict() to produce the predicted class probabilities, given the prepared text.
   4. Postprocesses the output by calling the _postprocess method.
• _postprocess is the method that receives the class probabilities produced by the model, picks the label index with the highest probability, and converts this label index to a human-readable label: 'github', 'nytimes', or 'techcrunch'.
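The model_prediction.py module is also not included in this excerpt; the sketch below reconstructs the class from the description above. The predict(instances, **kwargs) and from_path(model_dir) signatures follow the interface that Cloud ML Engine's beta custom prediction code expected at the time, but treat the details as assumptions and check the current documentation:

# model_prediction.py -- hedged sketch of the custom prediction class
import os
import pickle

import numpy as np
from tensorflow.keras.models import load_model

# Note: unpickling processor_state.pkl requires the TextPreprocessor class
# (the preprocess module) to be importable in the serving environment,
# which is why both modules ship in the same package.


class CustomModelPrediction(object):
    def __init__(self, model, processor):
        self._model = model
        self._processor = processor

    def _postprocess(self, predictions):
        labels = ['github', 'nytimes', 'techcrunch']
        # Pick the most probable class for each instance and map it to a label.
        return [labels[int(np.argmax(probabilities))]
                for probabilities in predictions]

    def predict(self, instances, **kwargs):
        # Apply the same stateful preprocessing used at training time.
        preprocessed_data = self._processor.transform(instances)
        predictions = self._model.predict(preprocessed_data)
        return self._postprocess(predictions)

    @classmethod
    def from_path(cls, model_dir):
        model = load_model(os.path.join(model_dir, 'keras_saved_model.h5'))
        with open(os.path.join(model_dir, 'processor_state.pkl'), 'rb') as f:
            processor = pickle.load(f)
        return cls(model, processor)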
Deploying to Cloud ML Engine

Figure 1 shows an overview of how to deploy the model, along with the artifacts required for a custom prediction routine, to Cloud ML Engine.

Uploading the artifacts to Cloud Storage

The first thing you want to do is upload your artifacts to Cloud Storage. First, upload:

• Your saved (trained) model file, keras_saved_model.h5 (see the "Training a Keras model" section).
• Your pickled (serialized) preprocessing object, processor_state.pkl, which contains the state needed for data transformation prior to prediction (see the "Preprocessing text" section). Remember, this object includes the tokenizer generated from the training data.

Second, upload a Python package including all the classes you need for prediction (e.g., the preprocessing, model, and post-processing classes). In this example, you need to create a pip-installable tarball containing model_prediction.py and preprocess.py: write a setup.py file that names the package, its version, and its modules, then generate the package by running python setup.py sdist. This creates a .tar.gz package under a new dist/ directory in your working directory; the name of the package will be $name-$version.tar.gz, where $name and $version are the ones specified in setup.py. Once you have successfully created the package, upload it to Cloud Storage (for example, with gsutil cp).

Deploying the model to Cloud ML Engine

Next, define the model name, the model version, and the Cloud ML Engine runtime (which corresponds to a TensorFlow version) required to deploy the model. First, create the model in Cloud ML Engine with the gcloud ml-engine models create command. Second, create a model version with the corresponding versions create command, specifying the Cloud Storage location of the model and preprocessing object (--origin), the location of the package(s) including the scripts needed for your prediction (--package-uris), and a pointer to your custom model prediction class (--model-class). This should take one to two minutes.

Calling the deployed model for online predictions

After deploying the model to Cloud ML Engine, you can invoke it for online prediction through its REST "predict" API. Given the titles defined in the request object, the predicted source of each title from the deployed model would be as follows: ['techcrunch', 'techcrunch', 'techcrunch', 'nytimes', 'nytimes', 'nytimes', 'github', 'github', 'techcrunch']. Note that the last one was misclassified by the model.

Conclusion

In this tutorial, we built and trained a text classification model using Keras to predict the source of a given article from its title. The model required text preprocessing operations, both for preparing the training data and for preparing the incoming requests to the model deployed for online predictions. We then showed how you can deploy the model to Cloud ML Engine with custom online prediction code, in order to perform preprocessing on incoming prediction requests and post-processing on the prediction outputs. Enabling a custom online prediction routine in Cloud ML Engine allows for affinity between the preprocessing logic, the model, and the post-processing logic required to handle a prediction request end-to-end. This helps to avoid training-serving skew and simplifies deploying ML models for online prediction.

Thanks for following along. If you're curious to try out some other machine learning tasks on GCP, take this specialization on Coursera. If you want to try out these examples for yourself in a local environment, run this notebook on Colab. Send a tweet to @GCPcloud if there's anything we can change or add to make text analysis even easier on Google Cloud.
Source: Google Cloud Platform

Anomaly detection using built-in machine learning models in Azure Stream Analytics

Built-in machine learning (ML) models for anomaly detection in Azure Stream Analytics significantly reduce the complexity and costs associated with building and training machine learning models. This feature is now available for public preview worldwide.

What is Azure Stream Analytics?

Azure Stream Analytics is a fully managed, serverless PaaS offering on Azure that enables customers to analyze and process fast-moving streams of data and deliver real-time insights for mission-critical scenarios. Developers can use a simple SQL language (extensible with custom code) to author and deploy powerful analytics processing logic that can scale up and scale out to deliver insights with millisecond latencies.

Traditional way to incorporate anomaly detection capabilities in stream processing

Many customers use Azure Stream Analytics to continuously monitor massive volumes of fast-moving data in order to detect issues that do not conform to expected patterns and to prevent catastrophic losses. This, in essence, is anomaly detection.

For anomaly detection, customers traditionally relied on either sub-optimal methods of hard-coding control limits in their queries, or custom machine learning models. Developing custom models requires not only time, but also high levels of data science expertise along with nuanced data pipeline engineering skills. Such high barriers to entry precluded the adoption of anomaly detection in streaming pipelines, despite its value for many industrial IoT sites.

Built-in machine learning functions for anomaly detection in Stream Analytics

With built-in machine learning based anomaly detection capabilities, Azure Stream Analytics reduces the complexity of building and training custom machine learning models to simple function calls. Two new unsupervised machine learning functions are being introduced to detect the two most commonly occurring classes of anomalies: temporary and persistent.

AnomalyDetection_SpikeAndDip function to detect temporary or short-lasting anomalies such as spikes or dips. This is based on the well-documented kernel density estimation algorithm.
AnomalyDetection_ChangePoint function to detect persistent or long-lasting anomalies such as bi-level changes and slowly increasing or decreasing trends. This is based on another well-known algorithm, exchangeability martingales.

Example

SELECT sensorid, System.Timestamp AS time, temperature AS temp,
    -- 95 = confidence level (%); 120 = number of events the model trains on
    AnomalyDetection_SpikeAndDip(temperature, 95, 120, 'spikesanddips')
        OVER (PARTITION BY sensorid LIMIT DURATION(second, 120)) AS SpikeAndDipScores
FROM input

In the example above, the AnomalyDetection_SpikeAndDip function monitors a set of sensors for spikes or dips in their temperature readings. The underlying ML model uses a user-supplied confidence level of 95 percent to set the model's sensitivity, and a training event count of 120, corresponding to a 120-second sliding window, is supplied as a function parameter. Note that the query is partitioned by sensorid, which results in multiple ML models being trained under the hood, one for each sensor, all within the same single query.

Get started today

We’re excited for you to try out anomaly detection functions in Azure Stream Analytics. To try this new feature, please refer to the feature documentation, "Anomaly Detection in Azure Stream Analytics."
Source: Azure

Moving your Azure Virtual Machines has never been easier!

To meet customer demand, Azure is continuously expanding. We've been adding new Azure regions and introducing new capabilities. As a result, customers can now move their existing virtual machines (VMs) to new regions while adopting the latest capabilities. There are other factors that prompt customers to relocate their VMs as well; for example, you may want to move a VM to increase its availability SLA.

In this blog, we will walk you through the steps you need to follow to move your VMs across regions or within the same region.

Why do customers want to move their Azure IaaS Virtual Machines?

Some of the most common reasons that prompt our customers to move their virtual machines include:

•    Geographical proximity: “I deployed my VM in region A and now region B, which is closer to my end users, has become available.”

•    Mergers and acquisitions: “My organization was acquired, and the new management team wants to consolidate resources and subscriptions into one region.”

•    Data sovereignty: “My organization is based in the UK with a large local customer base. As a result of Brexit, I need to move my Azure resources from various European regions to the UK in order to comply with local rules and regulations.”

•    SLA requirements: “I deployed my VMs in Region A, and I would like to get a higher level of confidence regarding the availability of my services by moving my VMs into Availability Zones (AZ). Region A doesn’t have an AZ at the moment. I want to move my VMs to Region B, which is still within my latency limits and has Availability Zones.”

If you or your organization are going through any of these scenarios or you have a different reason to move your virtual machines, we’ve got you covered!

Move Azure VMs to a target region

For any of the scenarios outlined above, if you want to move your Azure Virtual Machines to a different region with the same configuration as the source region or increase your availability SLAs by moving your virtual machines into an Availability Zone, you can use Azure Site Recovery (ASR). We recommend taking the following steps to ensure a successful transition:

1.    Verify prerequisites: Before you move your VMs to a target region, we recommend reviewing a few prerequisites. This ensures you build a basic understanding of Azure Site Recovery replication, the components involved, the support matrix, and so on.

2.    Prepare the source VMs: This involves verifying the network connectivity of your VMs and the certificates installed on them, and identifying the networking layout of your source environment and its dependent components.

3.    Prepare the target region: You should have the necessary permissions to create resources in the target region, including for any resources that are not replicated by Site Recovery. For example, verify the permissions on your subscription in the target region and the quota available there, confirm that Site Recovery supports replication across your source-target regional pair, and pre-create resources such as load balancers, network security groups (NSGs), and key vaults.

4.    Copy data to the target region: Use Azure Site Recovery replication technology to copy data from the source VM to the target region.

5.    Test the configuration: Once replication is complete, test the configuration by performing a test failover into a non-production network.

6.    Perform the move: Once you’re satisfied with the testing and you have verified the configuration, you can initiate the actual move to the target region.

7.    Discard the resources in the source region: Clean up the resources in the source region and stop replication of data.

 

Move your Azure VM ‘as is’

If you intend to keep the same configuration in the target region that you had in the source region, you can do so with Azure Site Recovery. Your virtual machine configuration and availability SLA will be the same before and after the move: a single-instance VM comes back online as a single-instance VM, VMs in an Availability Set are placed into an Availability Set, and VMs in an Availability Zone are placed into an Availability Zone within the target region.

To learn more about the steps to move your VMs, refer to the documentation.
 

Move your Azure virtual machines to increase availability

As many of you know, we offer Availability Zones (AZs), a high-availability offering that protects your applications and data from datacenter failures. AZs are unique physical locations within an Azure region, each equipped with independent power, cooling, and networking. To ensure resiliency, there is a minimum of three separate zones in all enabled regions. With AZs, Azure offers a 99.99 percent VM uptime SLA.

You can use Azure Site Recovery to move a single-instance VM, or the VMs in an Availability Set, into an Availability Zone, thereby achieving a 99.99 percent uptime SLA. You choose the target Availability Zone placement when you enable replication for your VMs using Azure Site Recovery. Ideally, the VMs in an Availability Set should be spread across different Availability Zones. The SLA for availability will be 99.99 percent once you complete the move operation. To learn more about the steps to move the VMs and improve your availability, refer to our documentation.

Azure natively provides the high availability and reliability you need for your mission-critical workloads, and you can choose to increase your SLAs and meet compliance requirements using the disaster recovery features provided by Azure Site Recovery. You can use the same service to increase the availability of the virtual machines you have already deployed, as described in this blog. Getting started with Azure Site Recovery is easy: check out the pricing information and sign up for a free Azure trial. You can also visit the Azure Site Recovery forum on the Microsoft Developer Network (MSDN) for additional information and to engage with other customers.
Source: Azure