ONNX Runtime integration with NVIDIA TensorRT in preview

Today we are excited to open source the preview of the NVIDIA TensorRT execution provider in ONNX Runtime. With this release, we are taking another step towards open and interoperable AI by enabling developers to easily leverage industry-leading GPU acceleration regardless of their choice of framework. Developers can now tap into the power of TensorRT through ONNX Runtime to accelerate inferencing of ONNX models, which can be exported or converted from PyTorch, TensorFlow, and many other popular frameworks.

Microsoft and NVIDIA worked closely to integrate the TensorRT execution provider with ONNX Runtime and have validated support for all of the ONNX models in the ONNX Model Zoo. With the TensorRT execution provider, ONNX Runtime delivers better inferencing performance on the same hardware compared to generic GPU acceleration. We have seen up to 2X improved performance using the TensorRT execution provider on internal workloads from Bing MultiMedia services.

How it works

ONNX Runtime together with its TensorRT execution provider accelerates the inferencing of deep learning models by parsing the graph and allocating specific nodes for execution by the TensorRT stack on supported hardware. The TensorRT execution provider interfaces with the TensorRT libraries preinstalled on the platform to process the ONNX sub-graph and execute it on NVIDIA hardware. This lets developers run ONNX models across different flavors of hardware and build applications with the flexibility to target different hardware configurations. The architecture abstracts away the details of the hardware-specific libraries that are essential to optimizing the execution of deep neural networks.
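
To make the partitioning idea concrete, here is a purely illustrative, self-contained C++ sketch, not the actual ONNX Runtime internals: each node is assigned to the TensorRT provider when its operator appears in a hypothetical supported set, and everything else falls back to the default CPU provider.

#include <iostream>
#include <set>
#include <string>
#include <vector>

// Toy model of execution-provider partitioning (illustrative only).
struct Node { std::string name; std::string op_type; };

int main() {
    // Hypothetical set of operators handled by the TensorRT provider.
    const std::set<std::string> tensorrt_ops{"Conv", "Relu", "MaxPool", "Gemm"};
    const std::vector<Node> graph{
        {"conv1", "Conv"}, {"relu1", "Relu"}, {"custom1", "MyCustomOp"}, {"fc1", "Gemm"}};
    for (const auto& node : graph) {
        const bool on_trt = tensorrt_ops.count(node.op_type) > 0;
        std::cout << node.name << " -> " << (on_trt ? "TensorRT" : "CPU fallback") << "\n";
    }
    return 0;
}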

How to use the TensorRT execution provider

ONNX Runtime together with the TensorRT execution provider supports the ONNX spec v1.2 or higher, with opset version 9. TensorRT-optimized models can be deployed to all N-series VMs powered by NVIDIA GPUs on Azure.

To use TensorRT, you must first build ONNX Runtime with the TensorRT execution provider (pass the --use_tensorrt and --tensorrt_home <path to the TensorRT libraries on your local machine> flags to the build.sh tool). You can then take advantage of TensorRT by initiating the inference session through the ONNX Runtime APIs. ONNX Runtime will automatically prioritize the appropriate sub-graphs for execution by TensorRT to maximize performance.
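
For example, the build step might look like the following (the TensorRT path is a placeholder, and the exact flag set may vary by repo version); the C++ snippet below then registers the provider at session creation:

./build.sh --use_tensorrt --tensorrt_home <path/to/TensorRT>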

// 'so' is a previously configured SessionOptions instance.
InferenceSession session_object{so};
// Register the TensorRT execution provider before loading the model.
session_object.RegisterExecutionProvider(std::make_unique<::onnxruntime::TensorrtExecutionProvider>());
// Load the ONNX model; TensorRT-eligible sub-graphs will be assigned to the provider.
status = session_object.Load(model_file_name);
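
After the model loads, the session is initialized and then run. A hedged sketch against the same internal C++ API (Initialize exists on InferenceSession; the exact Run arguments vary by version and are omitted here):

status = session_object.Initialize();  // partitions the graph across the registered providers
// status = session_object.Run(...);   // executes the model; see the repo for the full Run signature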

Detailed instructions are available on GitHub. In addition, a collection of standard tests is available through the onnx_test_runner utility in the repo to help verify the ONNX Runtime build with the TensorRT execution provider.

What are ONNX and ONNX Runtime

ONNX is an open format for deep learning and traditional machine learning models that Microsoft co-developed with Facebook and AWS. ONNX allows models to be represented in a common format that can be executed across different hardware platforms using ONNX Runtime. This gives developers the freedom to choose the right framework for their task, as well as the confidence to run their models efficiently on a variety of platforms with the hardware of their choice.

ONNX Runtime is the first publicly available inference engine with full support for ONNX 1.2 and higher, including the ONNX-ML profile. ONNX Runtime is lightweight and modular with an extensible architecture that allows hardware accelerators such as TensorRT to plug in as “execution providers.” These execution providers unlock low-latency and high-efficiency neural network computations. Today, ONNX Runtime powers core scenarios that serve billions of users in Bing, Office, and more.

Another step towards open and interoperable AI

The preview of the TensorRT execution provider for ONNX Runtime marks another milestone in our effort to create an open and interoperable ecosystem for AI. We hope this makes it easier to drive AI innovation in a world with ever-stricter latency requirements for production models. We are continuously evolving and improving ONNX Runtime, and we look forward to your feedback and contributions!

To learn more about using ONNX for accelerated inferencing on the cloud and edge, check out the ONNX session at NVIDIA GTC. Have feedback or questions about ONNX Runtime? File an issue on GitHub, and follow us on Twitter. 
Source: Azure

7 considerations for choosing the right hybrid cloud provider

Hybrid cloud computing is an attractive option for businesses that want to combine the advantages of public and private clouds.
An organization’s hybrid cloud provider will be an important partner as it integrates on-premises systems with cloud-based ones. Before awarding this contract, consider the following seven aspects.
1. Get to know your workloads.
Before even talking to a hybrid cloud provider, understand the workloads that you want to pull into the hybrid environment and where you will locate them. For example, data backup and disaster recovery require a different kind of hybrid cloud service than complex analytics applications.
At the same time, ensure your provider can grow with you as your cloud strategy matures. Look to providers that can offer the services you will need as your cloud environment evolves. Seek out solutions that integrate well with other providers’ platforms in case you need to spread workloads across different hybrid cloud contracts in a multicloud environment.
2. Evaluate performance.
Your choice of workload informs the next question to ask a potential provider: for what kind of workload is its infrastructure optimized? As cloud services evolve, providers are beginning to specialize in the kinds of workloads that they support. For example, some might focus on supporting developers, while others might serve a particular kind of application, such as SAP.
Another aspect of performance is latency. Latency requirements are strict, especially in hybrid cloud environments where on-premises workloads communicate with cloud infrastructure. In these instances, your organization might require a provider with a local edge data center or at least one that can support the appropriate direct connectivity options.
3. Match public and private infrastructure.
Your hybrid cloud provider must also be able to support the technology options that you already use in on-premises infrastructure. Look for easy mappings between the virtual machine choices you’ve made in house and the formats that the service provider supports, for example.
Aligning the two infrastructures will make it easier to migrate workloads between one environment and the other.
4. Look for easy onboarding.
Ask your potential cloud provider what assistance it offers with migrating data and workloads to its infrastructure. Migration can be a challenging task, especially when working with large data sets. How can the provider help make it simpler and cheaper?
Some providers may offer hardware appliances to help you ship large data sets manually. At the very least, the provider should offer migration tools to help you map data between your on-premises infrastructure and its own, or provide a consulting service to walk you through the process.
5. Assess security.
The provider should also be able to help you as you secure your data in a hybrid environment.
Hybrid workloads often involve security controls such as tokenization. Tokens protect sensitive information in cloud data centers by pointing to records kept on customer premises. Ensure that hybrid cloud providers can help you implement these security measures.
A cloud provider should also be able to answer questions about its compliance processes and risk management. For a list of questions to ask, look through this cloud security assessment list from the Object Management Group.
6. Ensure availability and redundancy.
Security is only one aspect of computing risk. Another is availability. Check your provider’s approach to making your data available.
Service-level agreements (SLAs) will be a key factor here. They should include not only availability guarantees but also escalation and compensation procedures in case the service provider cannot meet them. Consider the provider’s ability to help you support multiple cloud service providers so that you can fail over between them in the event of a problem.
7. Weigh pricing.
Cost was one of the main initial drivers for cloud computing. While other considerations such as scalability have become increasingly prevalent as cloud computing strategies mature, budget is still a key factor.
“Cloud shock” is an issue in cloud computing contracts. It often happens when customers don’t keep track of the online resources they are using. Check operating fees with the hybrid cloud provider, including the cost of unplanned service expansions to cover spikes in demand.
Be mindful that ending a contract may come with a fee. Plan for any extraction costs to ensure you can migrate your data successfully at the close of the relationship.
Like any business partnership, a hybrid computing contract is something that customers should approach carefully and with an understanding of what they hope to achieve. This will help you choose the right hybrid cloud provider and craft a solid platform on which to build a long-term hybrid cloud strategy.
Learn more about the top 10 criteria for selecting a managed services provider that best matches your business’s IT needs.
Source: Thoughts on Cloud