New research: Enterprises more confident than ever in cloud security

Cloud-based solutions were a technology life raft for organizations during the COVID-19 pandemic as employees took to the virtual office and companies scrambled to adjust to a distributed, remote reality. However, these rapid and substantial changes in the role of cloud technologies in the business came with an increased focus on security. The accelerated move to the cloud also meant companies needed to rapidly evolve existing security practices to protect everything that matters at the core of the business—from their people and their operational and transactional data to customers and their most sensitive personal information. Suddenly, enterprises were keenly aware of where business practices, employee training, and security policies were falling short.

A recent Google-commissioned study by IDG explored the details behind the heightened focus on security solutions since the start of the pandemic, while highlighting the role cloud-based security solutions are playing in helping keep customers safe. The survey of 2,000 global IT leaders illustrates that, in this new and unfamiliar world, enterprises are more ready than ever to embrace cloud security.

Security is an even higher priority post-pandemic

In the wake of the pandemic, many organizations are facing a broader attack surface than ever before as employees moved to working temporarily from remote home offices (and were, in some cases, encouraged to stay there for the foreseeable future). With fewer inherent security protections on personal internet connections and more work meetings happening via video conferencing, attackers have launched a cyber pandemic of their own, designed to take advantage of and exploit new weaknesses. Even as businesses ramp up security initiatives and preventative measures, the growing wave of threats continues to keep security top of mind for IT leaders.
Security risks and concerns remain one of the top pain points impeding innovation, according to the IDG study respondents—surpassed only by insufficient IT and developer skills.

Enterprises looking to cloud providers for help with security

As a result, addressing security risks is a leading area where IT leaders turn to cloud providers for support. For these organizations, the ability to control access to data while using cloud services was the most required infrastructure security and compliance feature from a cloud provider.

Cloud security is more trusted than ever

A deeper look into the results also revealed a shift in perspective about whether cloud security is really up to the task of protecting enterprises against modern attacks. Despite past skepticism, the majority of IT leaders are now comfortable using cloud-based security solutions. Confidence in the security of cloud infrastructure is extremely high, with 85% of respondents stating they feel as secure (or more secure) in the cloud than with on-premises infrastructure—compared to just 15% who believe on-premises is still safer. This is a clear indication that there are fewer reservations about the efficacy of cloud-based security solutions, signaling an increase in trust as organizations invest in cloud-based infrastructure and solutions.

We are committed to safe, secure solutions

Google Cloud protects your data, applications, and infrastructure, as well as your customers, from fraudulent activity, spam, and other types of online abuse. We protect you against a growing list of cybersecurity threats using the same infrastructure foundation and security services that we use for our own operations, so you never have to compromise between ease of use and advanced security.
To learn more about the IDG findings and how IT leaders are addressing security concerns post-COVID, download the full report.

Interested in how Google Cloud’s commitment to providing safe, secure solutions helps you address your security needs? Our networking, data storage, and compute services encrypt data at rest and in transit to ensure the integrity, authenticity, and privacy of your data and your customers’ data. We also offer the ability to encrypt data in use, while it’s being processed in VM and container workloads, and our advanced security tools support compliance and data confidentiality with minimal operational overhead.
Source: Google Cloud Platform

Choosing the right machine learning approach for your application

Many of our customers want to know how to choose a technology stack for solving problems with machine learning (ML). There are many choices for these solutions, some that you can build and some that you can buy. We’ll focus on the build side here, exploring the various options and the problems they solve, along with our recommendations.

The best ML applications are trained with the largest amount of data

But first, keep in mind an important concept: the quality of your ML model improves with the size of your data. Dramatic ML performance and accuracy gains are driven by increases in data size, as shown in the graph below. This is a text model, but the same principles hold for all kinds of ML models.

(Graph sources: “The Unreasonable Effectiveness of Data” and “Deep Learning Scaling is Predictable, Empirically.”)

The x-axis represents the size of the data set and the y-axis is the error rate. As the size of the data set increases, the error rate drops. But notice something critical about the size of the data set: the x-axis is 2^20, 2^21, 2^22, and so on. In other words, each new tick is a doubling of the data set size. To get a linear decrease in your error rate, you need to exponentially increase the size of your data set. The blue curve in the graph represents a slightly more sophisticated ML model than the orange curve. Suppose you are deciding between two choices: create a better model or double the data set size. Assuming these two choices cost the same, it’s better to keep gathering more data. Only when improvements due to data size increases start to plateau does it become necessary to build a better model.

Second, ML systems need to be retrained for new situations. For example, if you have a recommendation system in YouTube and you want to provide recommendations in Google Now, you can’t use the same recommendations model. You have to train it in the second instance on the recommendations you want to make in Google Now.
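The scaling behavior described above can be sketched numerically. The snippet below models the error rate as a power law of data set size (a common empirical fit; the coefficient and exponent here are made up for illustration) and shows that each doubling of the data cuts the error by the same constant factor; in other words, a linear improvement in error requires exponential growth in data.

```python
def error_rate(n_examples, a=1.0, b=0.3):
    """Illustrative power-law fit: error ~ a * n^(-b).
    The coefficients are invented, not taken from the cited papers."""
    return a * n_examples ** (-b)

# Error at successive doublings of the data set (2^20, 2^21, ...):
errors = [error_rate(2 ** k) for k in range(20, 24)]

# Each doubling shrinks the error by the same constant factor (2**-b),
# which is why the curve looks like a straight line on a log-log plot.
ratios = [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]
print([round(r, 4) for r in ratios])  # each ratio equals 2**-0.3
```

The constant ratio is the whole point: to halve your error under a fit like this, you must multiply your data set size many times over, which is why gathering more data usually beats tweaking the model until the curve plateaus.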
So even though the model, the code, and the principles are the same, you have to retrain the model with new data for new situations.

Now, let’s combine these two concepts: you get a better ML model when you have more data, and an ML model typically needs to be retrained for a new situation. You have a choice of either spending your time building an ML model or buying a vendor’s off-the-shelf model. To answer the question of whether to buy or to build, first determine whether the buyable model solves the same problem you want to solve. Has it been trained on the same input and on similar labels? Let’s say you’re trying to do a product search, and the model has been trained on catalog images as inputs. But you want to do a product search based on users’ mobile phone photographs of the products. The model that was trained on catalog images won’t work on your mobile phone photographs, and you’d have to build a new model. But let’s say you’re considering a vendor’s translation model that’s been trained on speeches in the European Parliament. If you want to translate similar speeches, the model works well because it uses the same kind of data.

The next question to ask: does the vendor have more data than you do? If the vendor has trained their model on speeches in the European Parliament but you have access to more speech data than they have, you should build. If they have more data, then we recommend buying their model. Bottom line: buy the vendor’s solution if it’s trained on the same problem and has access to more data than you do.

Technology stack for common ML use cases

If you need to build, what is the technology stack you need? What are the skills your people need to develop? This depends on the type of problem you are solving. There are four broad categories of ML applications: predictive analytics, unstructured data, automation, and personalization. The recommended technology stack for each is slightly different.
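The buy-or-build questions above can be condensed into a tiny decision helper. This is a deliberately simplified sketch: in practice both inputs are judgment calls about training data and labels, not booleans.

```python
def buy_or_build(solves_same_problem: bool, vendor_has_more_data: bool) -> str:
    """Encode the rule of thumb: buy the vendor's model only if it was
    trained on the same problem AND the vendor has more data than you."""
    if solves_same_problem and vendor_has_more_data:
        return "buy"
    return "build"

# Vendor model trained on catalog images, but you need phone photos:
print(buy_or_build(solves_same_problem=False, vendor_has_more_data=True))  # build
# Same-domain translation model, and the vendor has the larger corpus:
print(buy_or_build(solves_same_problem=True, vendor_has_more_data=True))   # buy
```

Note that both conditions must hold before buying makes sense: a well-trained model for a different problem is as unhelpful as a same-problem model trained on less data than you already have.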
Predictive analytics

Predictive analytics includes detecting fraud, predicting click-through rates, and forecasting demand.

Step one: build an enterprise data warehouse

Here, your data set is primarily structured data, so our recommended first step is to store your data in an enterprise data warehouse (EDW). Your EDW is a source of training examples and product histories tracked over time, and it can break down silos and gather data from throughout your organization.

Step two: get good at data analytics

Next, you’d build a data culture, get skilled at data analytics, start to build dashboards, and enable data-driven decisions. At this point, you have all of the data and you know which pieces are trustworthy.

Step three: build ML

From your EDW, you can build your models using SQL pipelines. We recommend using BigQuery ML when doing ML with the data in your EDW. If you want to build a more sophisticated model, you can train TensorFlow/Keras models on BigQuery data. A third option is AutoML Tables, for state-of-the-art accuracy and for building online microservices.

Unstructured data

Examples of how our customers use ML to gain insights from unstructured data include annotating videos, identifying eye diseases, and triaging emails. Unstructured data can include videos, images, natural language, and text. Deep learning has revolutionized the way we do ML on unstructured data, whether you’re looking at language understanding, image classification, or speech-to-text, so the models you use for unstructured data will employ deep learning. Here, the ROI heavily favors using AutoML: the amount of time you’d spend trying to create a new ML model from scratch is almost never worth it, and you can spend your money more effectively collecting more data than trying to get a slightly better model. Regardless of the type of unstructured data, our recommendation is to use AutoML for small and medium data sizes.

But AutoML has a limit to scale.
At some point, the size of your data set is going to be so large that architecture search gets really expensive. At that point, you may want to move to a best-of-breed model with custom retraining, from TensorFlow Hub for example. If you have data sets in the millions of examples, you can build your own custom neural network (NN) architectures. But first determine whether your data set size has started to plateau, by plotting a graph similar to the one at the top of this post. Build a custom NN architecture only after you’ve plateaued, where increasing amounts of data won’t give you a better model.

Automation

Some examples of how customers are using ML for automation include scheduling maintenance, counting retail footfall, and scanning medical forms. The key thing to keep in mind as you pick a technology stack for these problems is that you’re not building just one ML model. If you want to schedule maintenance or want to reject transactions, for example, you’ll need to train multiple linked models. Instead of individual models, think in terms of ML pipelines, which you can orchestrate using all of the technologies already mentioned. Then you have three choices for operationalizing, with three levels of sophistication:

1. Vertex AI offers turnkey serverless training and batch/online predictions. This is what we recommend for a team of data scientists.
2. Deep Learning VM Image, Cloud Run, Cloud Functions, or Dataflow offer customized training and batch/online predictions. This is what we recommend if the team consists of data engineers and data scientists.
3. Vertex AI Pipelines are fully customizable, and recommended for organizations with separate ML engineering and data science teams.

When doing automation, the individual models that you chain together into a pipeline will be a mix: some will be prebuilt, some will be customized, and others will be built from scratch.
Vertex AI, by providing a unified interface for all these model types, simplifies the operationalization of these models.

Personalization

ML application examples of personalization include customer segmentation, customer targeting, and product recommendations. For personalization, we again recommend using an EDW, because customer segmentation uses structured marketing data. For product recommendations, you will similarly have prior purchases and web logs in your EDW. You can power clustering applications or recommendation systems like matrix factorization, and create embeddings directly from your EDW for sophisticated recommendation systems.

For specific use cases, choose the technology stack based on your data size and scope. Start with BigQuery ML for its quick, easy matrix factorization approach. Once your application proves viable and you want slightly better accuracy, try AutoML recommendations. But once your data set grows beyond the capabilities of AutoML recommendations, consider training your own custom TensorFlow and Keras models.

To summarize, successful ML starts with the question, “Do I build or do I buy?” If an off-the-shelf solution exists that was trained with similar data and with access to more data than you have, then buy it. Otherwise build it, using the technology stack recommended above for the four categories of ML applications. Learn more about our artificial intelligence (AI) and ML solutions and check out sessions from our Applied ML Summit on-demand.
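To make the matrix-factorization idea behind recommenders concrete, here is a minimal, dependency-free sketch that learns user and item embeddings from a toy rating matrix with stochastic gradient descent. All numbers and hyperparameters are invented for illustration; in practice you would use BigQuery ML’s matrix factorization or a TensorFlow model, as recommended above.

```python
import random

def matrix_factorization(ratings, k=2, steps=2000, lr=0.01, reg=0.02):
    """Learn embeddings U (users) and V (items) so that the dot product
    U[u] . V[i] approximates ratings[u][i]. Toy SGD version, not production code."""
    random.seed(0)
    n_users, n_items = len(ratings), len(ratings[0])
    U = [[random.random() for _ in range(k)] for _ in range(n_users)]
    V = [[random.random() for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u in range(n_users):
            for i in range(n_items):
                if ratings[u][i] is None:      # unobserved entry: skip
                    continue
                pred = sum(U[u][f] * V[i][f] for f in range(k))
                err = ratings[u][i] - pred
                for f in range(k):
                    u_f = U[u][f]              # use the old value for both updates
                    U[u][f] += lr * (err * V[i][f] - reg * u_f)
                    V[i][f] += lr * (err * u_f - reg * V[i][f])
    return U, V

# 3 users x 3 items; None marks a rating the model should fill in.
R = [[5, 3, None],
     [4, None, 1],
     [1, 1, 5]]
U, V = matrix_factorization(R)
predicted = sum(U[0][f] * V[2][f] for f in range(2))  # user 0's missing rating
```

The learned embeddings reconstruct the observed ratings and, because users and items now live in a shared low-dimensional space, also yield scores for the unobserved cells; that same space is what "create embeddings directly from your EDW" refers to.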
Source: Google Cloud Platform

Creating a unified analytics platform for digital natives

Digital native companies have no shortage of data, which is often spread across different platforms and Software-as-a-Service (SaaS) tools. As an increasing amount of data about the business is collected, democratizing access to this information becomes all the more important. While many tools offer in-application statistics and visualizations, centralizing data sources for cross-platform analytics allows everyone at the organization to get an accurate picture of the entire business. With Firebase, BigQuery, and Looker, digital platforms can easily integrate disparate data sources and infuse data into operational workflows, leading to better product development and increased customer happiness.

How it works

In this architecture, BigQuery becomes the single source of truth for analytics, receiving data from various sources on a regular basis. Here, we can make use of the broad Google ecosystem to directly import data from Firebase Crashlytics, Google Analytics, and Cloud Firestore, and to query data within Google Sheets. Additionally, third-party datasets can be easily pushed into BigQuery with data integration tools like Fivetran. Within Looker, data analysts can leverage pre-built dashboards and data models, or LookML, through source-specific Looker Blocks. By combining these accelerators with custom, first-party LookML models, analysts can join across the data sources for more meaningful analytics.
Using Looker Actions, data consumers can leverage insights to automate workflows and improve overall application health. The architecture’s components are described below.

Google data sources:
- Google Analytics 4: Tracks customer interactions in your application.
- Firebase Crashlytics: Collects and organizes Firebase application crash information.
- Cloud Firestore: Backend database for your Firebase application.
- Google Sheets: Spreadsheet service that can be used to collect manually entered, first-party data.

Third-party data sources:
- Customer relationship management (CRM) platform: Manages customer data. (While we use Salesforce as a reference, the same ideas can be applied to other tools.)
- Issue tracking or project management software: Helps product and engineering teams track bug fixes and new feature development in applications. (While we use JIRA as a reference, the same ideas can be applied to other tools.)
- Customer support software or help desk: Organizes customer communications to help businesses respond to customers more quickly and effectively. (While we use Zendesk as a reference, the same ideas can be applied to other tools.)

Cross-functional analytics

With the various data sources centralized in BigQuery, members across different teams can use the data to make informed decisions. Executives may want to combine business goals from a Google Sheet with CRM data to understand how the organization is tracking toward revenue goals. In preparation for board or team meetings, business leaders can use Looker’s integrations with Google Workspace to send query results into Google Sheets and populate a chart inside a Google Slides deck. Technical program managers and site reliability engineers may want to combine Crashlytics, CRM, and customer support data to prioritize bugs in the application that are affecting the highest-value customers, or that are often brought up inside support tickets.
Not only can these users easily link back to the Crashlytics console for deeper investigation into an error, they can also use Looker’s JIRA action to automatically create JIRA issues based on thresholds across multiple data sources. Account and customer success managers (CSMs) can use a central dashboard to track the health of their customers using inputs like usage trends in the application, customer satisfaction scores, and crash reports. With Looker alerts, CSMs can be immediately notified of problems with an account and proactively reach out to customer contacts.

Getting started

To get started creating a unified application analytics platform, be sure to check out our technical reference guide. If you’re new to Firebase, you can learn more here. To get started with BigQuery, check out the BigQuery Sandbox and these guides. For more information on Looker, sign up for a free trial here.
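As an illustration of the kind of cross-source query this architecture enables, the snippet below assembles a hypothetical BigQuery SQL statement that joins a Crashlytics export table with CRM account data to rank crash issues by the number of paying accounts they affect. Every table and column name here is invented for the sketch; a real Crashlytics BigQuery export has its own schema.

```python
# Hypothetical table names; substitute your own project/dataset/table IDs.
crash_table = "my_project.firebase_crashlytics.app_android"
crm_table = "my_project.crm.accounts"

query = f"""
SELECT
  c.issue_id,
  COUNT(*) AS crash_events,
  COUNT(DISTINCT a.account_id) AS affected_accounts
FROM `{crash_table}` AS c
JOIN `{crm_table}` AS a
  ON c.user_id = a.user_id
GROUP BY c.issue_id
ORDER BY affected_accounts DESC
"""

# With credentials configured, the query could be run via the official client:
#   from google.cloud import bigquery
#   for row in bigquery.Client().query(query):
#       print(row.issue_id, row.affected_accounts)
print(query)
```

Because both sources land in BigQuery, this kind of join is a single query rather than an export-and-merge exercise, and its results can feed a Looker dashboard or a Looker JIRA action threshold directly.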
Source: Google Cloud Platform

Rubin Observatory offers first astronomy research platform in the cloud

This week, the Vera C. Rubin Observatory is launching the first preview of its new Rubin Science Platform (RSP) for an initial cohort of astronomers. The observatory, which is located in Chile but managed by the U.S. National Science Foundation’s NOIRLab in Tucson, AZ and SLAC in California, is jointly funded by the NSF and the U.S. Department of Energy. The platform provides an easy-to-use interface to store and analyze the massive datasets of the Legacy Survey of Space and Time (LSST), which will survey a third of the sky each night for ten years, detecting billions of stars and galaxies, and millions of supernovae, variable stars, and small bodies in our Solar System.

The LSST datasets are unprecedented in size and complexity, and will be far too large for scientists to download to their personal computers for analysis. Instead, scientists will use the RSP to process, query, visualize, and analyze the LSST data archives through a mixture of web portal, notebook, and other virtual data analysis services. An initial launch with simulated data, called Data Preview 0, builds on the Rubin Observatory’s three-year partnership with Google to develop an Interim Data Facility (IDF) on Google Cloud to prototype hosting of the massive LSST dataset. This agreement marks the first time a cloud-based data facility has been used for an astronomy application of this magnitude.

Bringing the stars to the cloud

For Data Preview 0, the IDF leverages Cloud Storage, Google Kubernetes Engine (GKE), and Compute Engine to provide the Rubin Observatory user community access to simulated LSST data in an early version of the RSP. The simulated data were developed over several years by the LSST Dark Energy Science Collaboration to imitate five years of an LSST-like survey over 300 square degrees of the sky (about 1,500 times the area of the moon).
The resulting images are very realistic: they have the same instrumental characteristics, such as pixel size and sensitivity to photons, that are expected from the Rubin Observatory’s LSST Camera, and they were processed with an early version of the LSST Science Pipelines that will eventually be used to process LSST data. “This will be the first time that these workloads have ever been hosted in a cloud environment. Researchers will have an opportunity to explore an early version of this platform,” says Ranpal Gill, senior manager and head of communications at the Rubin Observatory.

Broadening access for more researchers

Over 200 scientists and students with Rubin Observatory data rights were selected to participate in Data Preview 0 from a pool of applicants representing a wide range of demographic criteria, regions, and experience levels. Participants will be supported with resources such as tutorials, seminars, communication channels, and networking opportunities—and they will be free to pursue their own science at their own pace using the data in the RSP.

“The revolutionary nature of the future LSST dataset requires a commensurately innovative system for data access and analysis paired with robust support for scientists,” says Melissa Graham, lead community scientist for the Rubin Observatory and research scientist in the astronomy department at the University of Washington. “I’m personally excited to enhance my own skills by using the RSP’s tools for big data analysis, while also helping others to learn and to pursue their LSST-related science goals during Data Preview 0.”

At the same time, the fact that the RSP is hosted in the cloud gives researchers at smaller institutions access to state-of-the-art astronomy infrastructure comparable to that of the largest national research centers. The launch benefits the observatory too: the development team can learn what researchers are interested in while also testing and debugging the platform.
Graham says that “the platform is still in active development so researchers using it will be able to follow along in the progress, and provide feedback on ways that we can optimize the development of the tools.”

Next steps

The LSST aims to begin the ten-year survey in 2023-24 and expects it to include 500 petabytes of data. Through the cloud, Google aims to help make this extraordinary project scalable and accessible to researchers everywhere. To learn more about Data Preview 0, watch this video.

Want to ramp up your own research in the cloud? We offer research credits to academics using Google Cloud for qualifying projects in eligible countries. You can find our application form on Google Cloud’s website or contact our sales team.
Source: Google Cloud Platform

Amazon Rekognition Custom Labels adds support for KMS encryption of training data for custom models

Amazon Rekognition Custom Labels now supports using an AWS Key Management Service (KMS) key to encrypt the image data you supply to the Rekognition Custom Labels service for training your custom machine learning model. By default, Amazon Rekognition Custom Labels encrypts image data at rest with server-side encryption. With this release, if you provide Amazon Rekognition Custom Labels with your KMS key or alias ARN, it uses that KMS key instead to encrypt your training and test images.
Source: aws.amazon.com

Amazon Neptune now supports copying DB cluster resource tags to DB snapshots

You can use Amazon Neptune resource tags to add metadata to your Amazon Neptune resources and to apply access policies to them. Now you can have the resource tags set on your Neptune clusters copied automatically to any automated or manual database snapshots created from those clusters. Tags can be used with AWS Identity and Access Management (IAM) policies to manage access to Neptune resources and to control which actions can be applied to those resources.
Source: aws.amazon.com