Microsoft continues to build the case for data estate modernization on Azure

Special thanks to Rik Tamm-Daniels and the Informatica team for their contribution to this blog post.

With the latest release of Azure SQL Data Warehouse, Microsoft doubles down on Azure SQL DW as one of the core data services for digital transformation on Azure. In addition to the fundamental benefits of agility, on-demand scaling and unlimited compute availability, the most recent price-to-performance metrics from the GigaOm report are one of several compelling arguments Microsoft has made for customers to adopt Azure SQL DW. Interestingly, Microsoft is also announcing the general availability of Azure Data Lake Storage Gen2 and Azure Data Explorer. Along with Power BI for rich visualization, this enhanced set of capabilities cements Microsoft’s leadership position in cloud-scale analytics.

Every day, I speak with joint Informatica and Microsoft customers who are looking to transform their data estates with a cohesive data lake and cloud data warehousing solution architecture. These customers range from global logistics companies to auto manufacturers to the world’s largest insurers, and all of them see the tremendous potential of the Microsoft modern data estate approach. In fact, just via Informatica's iPaaS (integration platform-as-a-service) offering, Informatica Intelligent Cloud Services, we’ve seen significant quarter-over-quarter growth in customer data volumes being moved to Azure SQL DW.

Of course, as compelling as the Azure SQL DW technology is, for many customers modernizing a legacy enterprise data warehouse is a daunting proposition to even consider. The thought of touching the intricate web of dependencies around the warehouse can keep even the most battle-tested CIO up at night. A key consideration for any cloud data warehousing or data modernization initiative is to ensure you have intelligence about the existing schemas, lineage and dependencies, so you can incrementally unravel the data web surrounding the warehouse and, with laser-like precision, begin to move workloads and use cases to Azure SQL DW.

Enter Informatica’s Enterprise Data Catalog: with full end-to-end, source-to-destination lineage and searchable, AI- and machine-learning-driven intelligent metadata about what data lives where in the warehouse, it clears the fog of complexity and illuminates a clear path to cloud data warehousing. In fact, the concept of discovery- and catalog-driven modernization is such a compelling leap forward that Microsoft and Informatica developed a single-sign-on Data Accelerator on Informatica Intelligent Cloud Services on Azure that can be accessed directly from the Azure SQL DW management console with your Azure credentials.

Data Accelerator for Azure

Want to see how Informatica and Microsoft can jumpstart your cloud data warehouse modernization initiative? Join us on Informatica's world tour of hands-on workshops at a Microsoft Technology Center near you. Workshops are taking place in North America right now and will be coming to EMEA and APJ very soon!

Register here: Cloud Data Warehouse Modernization for Azure Workshop. 
Source: Azure

Modernizing financial risk computation with Hitachi Consulting and GCP

Editor’s note: Hitachi Consulting is the digital solutions and professional services organization within Hitachi Ltd., and a strategic partner of Google Cloud. Hitachi Consulting has deep experience in the financial services industry, working with many large banks on their adoption of digital solutions. Today we’ll hear how they used Google Cloud Platform (GCP) to build a proof-of-concept platform that moves traditional financial risk computation from on-premises infrastructure to the cloud, gaining flexibility, scalability and cost savings.

At Hitachi Consulting, we’ve found that GCP’s high-performance infrastructure and big data analysis tools are ideal for financial applications and data. We wanted to explore using GCP to help modernize the financial applications common to many of our banking customers. In this post, we’ll describe our experience building a proof-of-concept market risk computation solution.

Financial services companies need flexible infrastructure

Risk management is a core activity for financial services organizations. These organizations often have extensive hardware and software investments, typically in the form of high-performance computing (HPC) grids, to help with risk computations. Increasing regulation and the need for timely risk exposure calculations place great demands on this computing infrastructure. So financial services organizations have to increase the flexibility, scalability and cost-effectiveness of their risk infrastructure and applications to meet this growing demand.

We set out to build a proof-of-concept risk analytics platform that could tackle the downsides of traditional approaches to market risk exposure applications, such as:

- Managing large numbers of compute nodes within an on-premises grid architecture
- Dependency on expensive third-party orchestration software
- Lack of flexibility and scalability to meet growing demand

Modernizing risk applications with cloud-native tools

The cloud presents many opportunities for modernizing risk applications. A traditional lift-and-shift approach, where existing applications are moved to the cloud with minimal modification, can increase scalability and reduce costs. At the other end of the scale, applications can be fully redesigned around streaming pipeline architectures to help meet demands for results in near real-time. However, we think there’s a place for a middle path that lets financial institutions take advantage of cloud-native services to get cost and flexibility benefits, while continuing to use the risk models they’re used to.

Our approach uses a few key technology components:

- Containers as lightweight alternatives to traditional virtual machines to perform typical Value at Risk (VaR) calculations using the open-source QuantLib libraries
- Google Kubernetes Engine (GKE) as a managed container platform and replacement for the on-premises compute grid
- Cloud Pub/Sub and Cloud Dataflow for orchestration of the risk calculation pipeline
- Cloud Datastore as intermediate storage for checkpointing
- BigQuery for data warehousing and analytics

Here’s a look at how these pieces come together for risk calculation:

Ingestion

The first step is to ingest data into the pipeline. Here, the inputs take the form of aggregated portfolio and trade data. One key design goal was the ability to handle both batch and stream inputs. In the batch case, CSV files are uploaded to Google Cloud Storage, and the file upload triggers a message onto a Cloud Pub/Sub topic.
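To make that batch trigger concrete, here is a minimal sketch (using the google-cloud-storage Python client) of wiring a bucket to publish a Pub/Sub notification when an upload completes. The project, bucket, and topic names are illustrative assumptions, not the identifiers used in the actual deployment:

```python
from google.cloud import storage

# Illustrative names only; not the actual deployment's resources.
client = storage.Client(project="my-risk-project")
bucket = client.bucket("risk-portfolio-uploads")

# Publish to the trade-ingest topic whenever a CSV upload completes
# (OBJECT_FINALIZE fires once the object is fully written).
notification = bucket.notification(
    topic_name="trade-ingest",
    event_types=["OBJECT_FINALIZE"],
    payload_format="JSON_API_V1",
)
notification.create()
```

A downstream subscriber can then treat each notification as a pointer to a newly uploaded CSV file to read and process.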
For the streaming case, information is published directly onto a Cloud Pub/Sub topic. Cloud Pub/Sub is a fully managed service that provides scalable, reliable, at-least-once delivery of messages for event-driven architectures. Cloud Pub/Sub enables loose coupling of application components and supports both push and pull message delivery.

Preprocessing

Those Cloud Pub/Sub messages feed a Cloud Dataflow pipeline for trade data preprocessing. Cloud Dataflow is a fully managed, auto-scaling service for transforming and enriching data in both stream and batch modes, based on open-source Apache Beam. The portfolio inputs are cleansed and split into individual trade elements, at which point the required risk calculations are determined. The individual trade elements are published to downstream Cloud Pub/Sub topics to be consumed by the risk calculation engine.

Intermediate results from the preprocessing steps are persisted to Cloud Datastore, a fully managed, serverless NoSQL document database. This pattern of checkpointing intermediate results to Cloud Datastore is repeated throughout the architecture. We chose Cloud Datastore for its flexibility, as it brings the scalability and availability of a NoSQL database alongside capabilities such as ACID transactions, indexes and SQL-like queries.

Calculation

At the heart of the architecture sits the risk calculation engine, deployed on GKE. GKE is a managed, production-ready environment for deploying containerized applications. We knew we wanted to evaluate GKE, and Kubernetes more broadly, as a platform for risk computation for the following reasons:

- Existing risk models can often be containerized without significant refactoring
- Kubernetes is open source, minimizing vendor lock-in
- Kubernetes abstracts away the underlying compute infrastructure, promoting portability
- Kubernetes provides sophisticated orchestration capabilities, reducing dependency on expensive third-party tools
- GKE is a fully managed service, freeing operations teams to focus on managing applications rather than infrastructure

The risk engine is a set of Kubernetes services designed to handle data enrichment, perform the required calculations, and output results. Pods are independently auto-scaled via Stackdriver metrics on Cloud Pub/Sub queue depths, and the cluster itself is scaled based on overall CPU load. As in the preprocessing step, intermediate results are persisted to Cloud Datastore, and pods publish messages to Cloud Pub/Sub to move data through the pipeline. The pods can run inside a private cluster that is isolated from the internet but can still interact with other GCP services via private Google access.

Output

Final calculation results output by the risk engine are published to a Cloud Pub/Sub topic, which feeds a Cloud Dataflow pipeline. Cloud Dataflow enriches the results with the portfolio and market data used for the calculations, creating full-featured snapshots. These snapshots are persisted to BigQuery, GCP’s serverless, highly scalable enterprise data warehouse. BigQuery allows analysis of the risk exposures at scale, using SQL and industry-standard tooling, driving customer use cases like regulatory reporting.
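As a rough sketch of what this output stage might look like with the Apache Beam Python SDK (a recent version that supports file-loads writes in streaming mode), here is a minimal pipeline that reads results from Pub/Sub and appends them to BigQuery. The project, topic, and table names are illustrative assumptions, and the real pipeline's enrichment logic is elided:

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Illustrative names; not the identifiers used in the actual deployment.
options = PipelineOptions(streaming=True, project="my-risk-project")

with beam.Pipeline(options=options) as p:
    (
        p
        # Risk-engine pods publish final results to this topic.
        | "ReadResults" >> beam.io.ReadFromPubSub(
            topic="projects/my-risk-project/topics/risk-results")
        | "Parse" >> beam.Map(json.loads)
        # Enrichment with portfolio and market data would happen here.
        | "WriteSnapshots" >> beam.io.WriteToBigQuery(
            "my-risk-project:risk.var_snapshots",
            # Micro-batching via load jobs sidesteps streaming-insert
            # quotas (see the lessons-learned section below).
            method=beam.io.WriteToBigQuery.Method.FILE_LOADS,
            triggering_frequency=60,
            # Assumes the snapshot table already exists.
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```

The same write transform also supports streaming inserts, so switching between the two methods is a one-line change; that flexibility is what makes the blended strategy described below practical.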
Lessons learned building a proof-of-concept data platform

We learned some valuable lessons while building out this platform:

- Choosing managed and serverless options greatly improved team velocity.
- Be aware of quotas and limits; during testing we encountered BigQuery streaming-insert limits. We worked around that using a blended streaming and micro-batch strategy with Cloud Dataflow (the write sketched above uses this approach).
- We had to do some testing and investigation to get optimal auto-scaling of the Kubernetes pods.
- The system scaled well under load without warm-up or additional configuration.

What’s next for our risk solution

We built a modernized, cloud-native risk computation platform that offers several advantages over traditional grid-based architectures. The architecture is largely serverless, using managed services such as Cloud Dataflow, Cloud Pub/Sub and Cloud Datastore. The solution is open source at its core, using Kubernetes and Apache Beam via GKE and Cloud Dataflow, respectively. BigQuery provides an easy way to store and analyze financial data at scale (a minimal query sketch follows at the end of this post). The architecture handles both batch and stream inputs, and scales up and down to match load.

Using GCP, we addressed some of the key challenges associated with traditional risk approaches, namely inflexibility, high management overhead and reliance on expensive third-party tools. As our VP of financial services, Suranjan Som, put it: “The GCP risk analytics solution provides a scalable, open and cost-efficient platform to meet increasing risk and regulatory requirements.” We’re now planning further work to test the solution at production scale.

Read more about financial services solutions on GCP, and learn about Hitachi Consulting’s financial services solutions.
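To round out the picture, here is one way the persisted snapshots could be queried at scale with the BigQuery Python client. This is a hedged sketch: the dataset, table, and column names (risk.var_snapshots, portfolio_id, var_99, snapshot_date) are invented for illustration, not the solution's actual schema:

```python
from google.cloud import bigquery

# Illustrative project and schema; adjust to the real snapshot table.
client = bigquery.Client(project="my-risk-project")

query = """
    SELECT portfolio_id, SUM(var_99) AS total_var_99
    FROM `my-risk-project.risk.var_snapshots`
    WHERE snapshot_date = CURRENT_DATE()
    GROUP BY portfolio_id
    ORDER BY total_var_99 DESC
    LIMIT 10
"""

# Iterating the query job waits for completion and streams rows back.
for row in client.query(query):
    print(f"{row.portfolio_id}: {row.total_var_99:,.0f}")
```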
Source: Google Cloud Platform

HPC made easy: Announcing new features for Slurm on GCP

Now we’re sharing a new set of features for Slurm running on Google Cloud Platform (GCP), including support for preemptible VMs, custom machine types, image-based instance scaling, attachable GPUs, and customizable NFS mounts. In addition, this release features improved deployment scalability and resilience.

Slurm is one of the leading open-source HPC workload managers, used in TOP500 supercomputers around the world. Last year, we worked with SchedMD, the core company behind Slurm, to make it easier to launch Slurm on Compute Engine.

Here’s more information about these new features:

Support for preemptible VMs and custom machine types

You can now scale up a Compute Engine cluster with Slurm and preemptible VMs, while support for custom machine types lets you run your workloads on instances with an optimal amount of CPU and memory. Both features help you achieve much lower costs for your HPC workloads: preemptible VMs can be up to 80% cheaper than regular instances, and custom machine types can generate savings of 50% or more compared to predefined types. (A sketch of the underlying Compute Engine request appears at the end of this post.)

Image-based instance scaling

Rather than installing packages from the internet and applying script configurations, you can now create Slurm compute instances based on a Google-provided disk image. This feature significantly shortens the time required to provision each node and increases deployment resilience. Images are made automatically by provisioning an image creation node, and are then used as the basis of all other auto-scaled compute instances. This can yield a net-new cluster of 5,000 nodes in under 7 minutes.

Optional, attachable GPUs

Compute Engine supports a wide variety of GPUs (e.g., NVIDIA V100, K80, T4, P4 and P100, with others on the horizon), which you can attach to your instances based on region and zone availability. Now, Slurm will automatically install the appropriate NVIDIA/CUDA drivers and software according to GPU model and compatibility, making it easy to scale up your GPU workloads on Compute Engine using Slurm.

Customizable NFS mounts and VPC flexibility

Finally, you can now set the NFS hosts of your choice for storage. Cloud Filestore is a great option if you want a fully managed NFS experience. You can also specify a pre-existing VPC or Shared VPC to host your cluster.

Getting started

This new release was built by the Slurm experts at SchedMD. You can download it from SchedMD’s GitHub repository. For more information, check out the included README. If you need help getting started with Slurm, check out the quick start guide; for help with the Slurm features for GCP, check out the Slurm Auto-Scaling Cluster and Slurm Cluster Federation codelabs. If you have further questions, you can post on the Slurm on GCP Google discussion group, or contact SchedMD directly.
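For readers curious what the Compute Engine features behind the preemptible and custom-machine-type support look like at the API level, here is a minimal, hedged sketch using the Python API client. It is not part of SchedMD's Slurm deployment scripts; the project, zone, and image are illustrative assumptions:

```python
import googleapiclient.discovery

compute = googleapiclient.discovery.build("compute", "v1")

config = {
    "name": "slurm-compute-test",
    # Custom machine type: 8 vCPUs with 16 GB (16384 MB) of memory.
    "machineType": "zones/us-central1-a/machineTypes/custom-8-16384",
    # Preemptible VMs trade interruptibility for a much lower price.
    "scheduling": {"preemptible": True},
    "disks": [{
        "boot": True,
        "autoDelete": True,
        "initializeParams": {
            "sourceImage": "projects/debian-cloud/global/images/family/debian-9",
        },
    }],
    "networkInterfaces": [{
        "network": "global/networks/default",
        "accessConfigs": [{"type": "ONE_TO_ONE_NAT"}],
    }],
}

request = compute.instances().insert(
    project="my-hpc-project", zone="us-central1-a", body=config)
response = request.execute()
print(response["name"])
```

In the actual release, Slurm's auto-scaling scripts issue requests like this on your behalf as jobs queue up, so you normally configure these options in the deployment rather than calling the API directly.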
Source: Google Cloud Platform

Leading security companies use Google Cloud to deliver Security-as-a-Service

This week, innovation in the security industry is on display as more than 700 security vendors exhibit at RSA Conference. There is no shortage of vendor solutions attempting to help organizations address the business imperative of securing users, applications, and data in today’s challenging threat and regulatory environment.

In much the way organizations have embraced cloud-delivered solutions for collaboration, data analytics, CRM and ERP, they are also turning to cloud-delivered security solutions. Many organizations have found that the challenges of deploying and operating on-premises security solutions are reduced when those solutions are delivered in the cloud. These challenges are particularly acute with many next-generation security tools that require highly skilled operators, rely on large volumes of data, use high-speed analytics, and depend on continuous updates.

It’s no surprise, then, that many security companies have turned to public cloud providers to help deliver their newest products and services to customers. But the choice of cloud provider is a high-stakes one: finding a provider that offers reliability, performance, functionality, and above all, foundational security is essential. In addition to these considerations, security companies must build and maintain trust with their provider, as their businesses rely on protecting their reputations as “secure.”

At Google Cloud, we are proud to help numerous security companies deliver services to protect organizations around the world. Here are a few examples:

Palo Alto Networks is a global cybersecurity leader that safely enables tens of thousands of organizations and their customers, and in December 2018, we expanded our partnership. Palo Alto Networks will run Cortex on Google Cloud to take advantage of Google Cloud Platform’s secure, durable cloud storage and highly scalable AI and analytics tools. Services such as BigQuery will help Cortex customers accelerate time-to-insight as they work to detect and respond to security threats. Palo Alto Networks will also run their GlobalProtect cloud service on Google Cloud Platform. Google Cloud’s reliable, performant, and secure global-scale network and infrastructure offer many advantages for a service that helps protect branch and mobile workforces.

“Being a Google Cloud customer allows us to run important cloud-delivered security services at scale with the benefits of Google’s AI and analytics expertise,” said Varun Badhwar, SVP Products & Engineering for Public Cloud Security at Palo Alto Networks.

Shape Security helps organizations stop imitation attacks and ensure that only genuine customers use their websites and mobile apps. The company was looking for a scalable platform that could keep pace with the level of innovation required to stay ahead of attackers and fraudsters. Their deployment model, with appliances deployed in customer data centers, was difficult to scale and operate as their customer base grew. The answer was to transition to cloud-based service delivery.

GCP’s intuitive user management allowed them to rapidly onboard users and appropriately manage permissions for developers and admins. They take advantage of GCP’s modern microservices support to provide customized, isolated environments for each customer and, similar to Palo Alto Networks, leverage GCP’s advanced data analytics services like BigQuery and its support for machine learning.

“GCP’s robust support for Kubernetes and Spinnaker has made deployments significantly easier and more scalable. With Google Cloud, we have modernized our infrastructure so we can keep pace with our rapid growth,” said Andy Mayhew, Senior Director of Infrastructure Engineering at Shape Security.

Area 1 Security is a performance-based cybersecurity company changing how businesses protect against phishing attacks. Through the Area 1 Horizon anti-phishing service, the company analyzes a vast amount of information daily, using sensors across the internet, a high-speed web crawler that spiders up to eight billion URLs every few weeks, and a distributed sensor network that gathers billions of network events a day. It sends that information to a massive data warehouse for analysis, where it is processed to discover emerging and ongoing cyberattacks, and then uses that insight to block phish before customers are breached. The company turned to Google Cloud Platform for its scalability, performance, and sophisticated data analytics tools.

“With Google Cloud Platform, Area 1 Security has been able to identify millions of phishing attacks and malicious campaign events,” says Blake Darché, Chief Security Officer at Area 1 Security. “From reconnaissance through exfiltration, Google Cloud Platform provides us with unparalleled capabilities to discover attacks in their earliest formative stages and protect our customers.”

As a security company, Area 1 Security demanded a public cloud provider that could provide a highly secure infrastructure foundation:

“Google Cloud Platform has its own purpose-built chips, servers, storage, network, and data centers,” says Phil Syme, Chief Technology Officer at Area 1 Security. “Google’s dedication to hardened security across the entire infrastructure means that Area 1 Security can trust the software that we run in Google Cloud Platform to be secure.”

BlueVoyant helps defend businesses around the world against agile and well-financed cyber attackers by providing unparalleled visibility, insight and responsiveness. Time to market is essential for providers like BlueVoyant, and Google Cloud helps them innovate quickly without compromising on security or reliability.

“BlueVoyant chose to partner with Google Cloud because it is consistent with our security-first philosophy, but also didn’t compromise on flexibility, allowing us to bring our services to market faster,” said Milan Patel, COO Managed Security Services, BlueVoyant.

As more security functionality is delivered through cloud-based services, Google Cloud remains deeply committed to serving this industry through a highly secure platform for security application development and delivery. To learn more, visit our Security page for Google Cloud.
Source: Google Cloud Platform