AWS Direct Connect launches second location in Italy

AWS Direct Connect is now available at Equinix ML2 in Milan, Italy. This is the second Direct Connect location in Italy. Equinix ML2 complements the existing Direct Connect location operated by CDLAN Srl at Via Caldera 21, Milan, Italy. Customers in Italy can use both Milan locations to establish site-redundant connectivity.
Source: aws.amazon.com

New Cloud DNS response policies simplify access to Google APIs

Organizations building applications on top of Google Cloud make heavy use of Google APIs, allowing developers to build feature-rich and scalable services on Google Cloud infrastructure. But accessing those APIs can be tough if an organization uses VPC Service Controls to isolate resources and mitigate data exfiltration risks. Today, we're introducing Cloud DNS response policies. This new feature allows a network administrator to modify the behavior of the DNS resolver according to organizational policies, making it easier to set up private connectivity to Google APIs from within a VPC Service Controls perimeter.

To date, this has been a challenge for customers, especially for services whose APIs are not available within restricted.googleapis.com and aren't accessible within the VPC Service Controls perimeter. In addition, configuring access to restricted.googleapis.com is not straightforward: you have to create a new private DNS zone just to access Google services, in addition to any existing private DNS zones, and add records corresponding to the APIs in use. The simple approach of creating a wildcard *.googleapis.com DNS zone and pointing it to the restricted VIP will break services that are not available on the restricted VIP.

Using Cloud DNS response policies helps simplify the user experience. Based on a subset of the Internet Draft for response policy zones (RPZ), they allow you to modify how the resolver behaves according to a set of rules. You can create a single response policy per network that allows for either:

- Alteration of results for selected query names (including wildcards) by providing specific resource records, or
- Triggering passthru behavior that exempts names from matching the response policy. Specifically, a name can be excluded from a wildcard match, allowing normal private DNS matching (or internet resolution) to proceed as if it never encountered the wildcard.

You can use this to set up private connectivity to Google APIs from within a VPC Service Controls perimeter. It works by creating a response policy (instead of a DNS zone) bound to the network, then adding a local-data rule for *.googleapis.com containing the CNAME to the restricted VIP. You can then exempt unsupported names (like www.googleapis.com) by creating a passthru rule. Queries then receive the restricted answer, unless they are for the unsupported name, in which case they receive the normal internet result. The first sketch below illustrates how to achieve this.

There are some caveats to using Cloud DNS response policies, though: passthru configurations cannot generate NXDOMAIN responses, so they are not a replacement for an actual DNS zone.

Response policies can also be used in a couple of other ways, as described here. A DNS zone with a name like example.com becomes responsible for the entire hierarchy beneath it; response policy rules, by contrast, do not require a DNS zone to be created to modify the behavior of specific DNS names. Matching against the response policy also happens before other processing, allowing other private DNS resources to be overridden. For instance, if a dev network environment imports (via DNS peering) a production private DNS zone, specific names can be "patched" to refer to dev endpoints without affecting the rest of the zone. In the second sketch below, you set up the response policy, attach it to your network, and then create the rule that serves the development server IP for names that end in dev.example.com.
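The following is a minimal sketch of that setup using the gcloud dns response-policies command group. The network name (my-vpc), the policy and rule names, and the TTL are illustrative assumptions, and exact flag syntax may vary between gcloud releases, so treat the Cloud DNS documentation as authoritative.

```bash
# Create a response policy bound to the VPC network.
gcloud dns response-policies create googleapis-response-policy \
    --networks=my-vpc \
    --description="Send googleapis.com traffic to the restricted VIP"

# Local-data rule: answer *.googleapis.com with a CNAME to the restricted VIP.
gcloud dns response-policies rules create restrict-googleapis \
    --response-policy=googleapis-response-policy \
    --dns-name="*.googleapis.com." \
    --local-data=name="*.googleapis.com.",type=CNAME,ttl=300,rrdatas=restricted.googleapis.com.

# Passthru rule: exempt a name that is not served on the restricted VIP,
# so it keeps resolving normally on the internet.
gcloud dns response-policies rules create allow-www-googleapis \
    --response-policy=googleapis-response-policy \
    --dns-name="www.googleapis.com." \
    --behavior=bypassResponsePolicy
```

And a similarly hedged sketch of the dev-environment override, assuming a dev VPC named dev-vpc and an illustrative development server IP:

```bash
# Override only names under dev.example.com in the dev network; everything else
# in the peered production example.com zone continues to resolve unchanged.
gcloud dns response-policies create dev-overrides \
    --networks=dev-vpc \
    --description="Patch selected production names to dev endpoints"

gcloud dns response-policies rules create patch-dev-example \
    --response-policy=dev-overrides \
    --dns-name="*.dev.example.com." \
    --local-data=name="*.dev.example.com.",type=A,ttl=300,rrdatas=10.128.0.42
```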
A second example allows you to block dangerous names on the internet by redirecting them to an informational IP, without the overhead of managing potentially thousands of "stub" private DNS zones. As in the sketch at the end of this article, you first create a response policy called 'blocklist-response-policy' attached to your existing network, then add a rule that redirects all DNS requests for bad.actor.com to an informational web server.

Services without sacrificing security

Building rich applications cannot come at the cost of sacrificing security, especially in complex, multi-tenant environments. Cloud DNS response policies offer a new and flexible way to configure access to Google APIs. Learn more and try out this new feature by reading the documentation.
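For reference, a hypothetical sketch of the blocklist example above; the network name, the blocked name bad.actor.com, and the informational server IP (a documentation address) are illustrative, and flag syntax should be verified against the Cloud DNS documentation.

```bash
# Response policy that redirects known-bad names to an informational page.
gcloud dns response-policies create blocklist-response-policy \
    --networks=my-vpc \
    --description="Redirect dangerous names to an informational web server"

gcloud dns response-policies rules create block-bad-actor \
    --response-policy=blocklist-response-policy \
    --dns-name="bad.actor.com." \
    --local-data=name="bad.actor.com.",type=A,ttl=300,rrdatas=203.0.113.10
```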
Source: Google Cloud Platform

Retailers find flexible demand forecasting models in BigQuery ML

Retail businesses understand the value of demand forecasting: using their intuition, product and market experience, and seasonal patterns and cycles to plan for future demand. Beyond the need for forecasts that are as accurate as possible, modern retailers also face the challenge of performing demand planning at scale. Product assortments that span tens of thousands of items across hundreds of individual selling locations or designated marketing areas lead to a number of time series that cannot be managed without the help of big data platforms, and time series modeling solutions that scale accordingly.

So far, there have been two ways to address this challenge: purchase a full end-to-end demand forecasting solution, which takes significant time and resources to implement and maintain, or leverage an all-purpose machine learning platform to run your own time series models, which requires deep experience in both modeling and data engineering.

To help retailers with an easier, more flexible solution for demand planning, we've published a Smart Analytics reference pattern for performing time series forecasting with BigQuery ML, using autoregressive integrated moving average (ARIMA) as a basis. This ARIMA model follows the BigQuery ML low-code design principle, allowing for accurate forecasts without advanced knowledge of time series models. Moreover, the BigQuery ML ARIMA model provides several innovations over the original ARIMA models that many are familiar with, including the ability to capture multiple seasonal patterns, automated model selection, a no-hassle preprocessing pipeline, and most of all, the ability to effortlessly generate thousands of forecasts at scale with nothing but a few lines of SQL.

In this blog, we'll take a look at the two most common ways demand forecasting teams have been organized, how BigQuery ML fills a gap between the two, and how BigQuery ML can help your demand planning recover from unforeseen events like COVID-19. To see the end-to-end process to implement the demand forecasting design pattern, check out this video.

Two types of demand forecasting teams

Historically, large organizations have had two types of demand forecasting teams. We'll call them the Business Forecasting team and the Science Forecasting team.

The Business Forecasting team typically uses full enterprise resource planning (ERP) or software-as-a-service (SaaS) forecasting solutions (or occasionally a homegrown solution) that don't require an advanced level of data science skill to use. These ERPs produce entirely automated forecasts. Team members often come from the business side of the organization and, instead of deep technical skills, bring extensive domain and business knowledge to their role. Many large brick-and-mortar organizations use this approach. These types of solutions may scale well, but they require significant time and resources, both to implement and to support. This typically includes large implementation and DevOps teams, multiple dedicated compute and data storage instances, and fixed-schedule, hours-long batch cycles to refresh the forecasts.

The Science Forecasting team typically features PhD- or MSc-level practitioners working within a data science or tech organization, who are fluent in Python or R. They work with a Cloud AI platform and perform all of the end-to-end forecasting themselves: choosing, building, training, and evaluating a model.
Then they deploy the model to production and communicate results to business stakeholders and leadership. This type of team is often found in digital-native organizations.

A new type of forecasting team

Recently, a new hybrid type of forecasting team has emerged. Often these are in businesses looking to become more data and model driven, but that don't have the resources to invest in an expensive ERP or hire a PhD-level data scientist. They may have a decent knowledge of forecasting and demand planning, but not enough experience or organizational resources to deploy custom models at scale. Still, this type of team, given the right tools, has the potential to merge the best of both worlds: the advanced modeling of the Science Forecaster and the deep domain knowledge of the Business Forecaster.

Responding to the unforeseen

As nearly every business experienced firsthand in 2020, events like the COVID-19 pandemic throw a wrench into demand forecasting signals, making existing models questionable. With an ERP forecasting solution, even a small change to the supply chain and store network configuration will result in a change in demand patterns that requires extensive reconfiguration of the demand planning solution, and the help of a large support team. BigQuery ML reduces the complexity of making such adjustments due to both expected and unexpected events, and because it's serverless, it autoscales and saves costs in DevOps time and effort. Regenerating forecasts to adapt to a change in the supply chain network configuration is now a matter of hours, not weeks.

Getting started with a BigQuery ML reference pattern

To make it easier to get up and running with Google Cloud tools like BigQuery ML, we recently introduced Smart Analytics reference patterns: technical reference guides with sample code for common analytics use cases. We've heard that you want easy ways to put analytics tools into practice, and previous reference patterns cover use cases like predicting customer lifetime value, propensity to purchase, product recommendation systems, and more. Our newest reference pattern on GitHub will help you get a head start on generating time series forecasts at scale. The pattern shows you how to use historical sales data to train a demand forecasting model using BigQuery ML, and then visualize the forecasts in a dashboard. For more details, and a walkthrough of this process using historical transactional data for Iowa liquor sales to forecast the next 30 days, check out our technical explainer. In the blog, you'll learn how to:

- Pre-process data into the correct format needed to create a demand forecasting model using BigQuery ML
- Fit multiple ARIMA time-series models in BigQuery ML
- Evaluate the models, and generate forward-looking forecasts for the desired forecast horizon
- Create a dashboard to visualize the projected demand using Data Studio
- Set up scheduled queries to automatically re-fit the models on a regular basis

A minimal SQL sketch of the core modeling steps appears after this section. Let's do a deeper dive into the concepts we just introduced.

BigQuery ML bridges the gap between the Business Forecaster and the Science Forecaster

Given the features we just described, we see how BigQuery ML helps fill the gap between the two current approaches to forecasting at scale, allowing you to build your own demand forecasting platform without the need for highly specialized time series data scientists. It's an ideal solution for hybrid forecasters, featuring tools you can use to generate forecasts at scale on the fly.
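To make the steps above concrete, here is a minimal sketch in BigQuery SQL. It is not the reference pattern's own code: it assumes a hypothetical sales table my_project.demand.sales with date, item_id, and quantity_sold columns, and all table, column, and model names are illustrative.

```sql
-- Fit one ARIMA model per item: BigQuery ML trains a separate time series
-- for each distinct value of time_series_id_col.
CREATE OR REPLACE MODEL demand.item_demand_arima
OPTIONS(
  model_type = 'ARIMA',
  time_series_timestamp_col = 'date',
  time_series_data_col = 'quantity_sold',
  time_series_id_col = 'item_id',
  holiday_region = 'US'
) AS
SELECT date, item_id, quantity_sold
FROM `my_project.demand.sales`;

-- Inspect the automatically selected ARIMA parameters and fit metrics per item.
SELECT * FROM ML.EVALUATE(MODEL demand.item_demand_arima);

-- Generate a 30-day forecast for every item, with 80% prediction intervals.
SELECT *
FROM ML.FORECAST(MODEL demand.item_demand_arima,
                 STRUCT(30 AS horizon, 0.8 AS confidence_level));
```

Re-running the CREATE MODEL statement as a BigQuery scheduled query is one simple way to implement the periodic re-fitting mentioned in the last step above.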
Since BigQuery ML lets you train and deploy ML models using SQL, it democratizes your data modeling challenges, opening up your demand forecasting tools and business insights to a larger pool of your organizational talent. For example, the BigQuery ML ARIMA model helps retailers recover from unexpected events with the ability to generate thousands of forecasts with fresh data over a shorter amount of time. You can recalibrate demand forecasts more cost effectively, detect changes in trends, and perform multiple iterations that capture new patterns as they emerge, without mobilizing an entire DevOps team to do so.

Using BigQuery ML as your forecast engine allows you to bridge the gap between your business or hybrid forecasting teams and advanced data science teams. For example, your forecast analysts will own the task of generating baseline statistical forecasts with BigQuery and reviewing them, but they can loop in a senior data scientist to perform a more advanced causal impact analysis on some of their demand data as needed, or to measure the effect of COVID-19 on shifting demand patterns. Think of it as "DemandOps" instead of "DevOps." This is also possible if you already have ERP demand planning tools, by simply exporting your forecasts and sales actuals into BigQuery whenever they are refreshed, or as needed.

Chances are, a retail organization actually has multiple time series forecasts being run by separate business functions. Your merchandising team runs tactical and operational demand forecasts, finance performs top-line revenue forecasts, and supply chain runs its own forecasts for capacity planning at the data center level, each using their own specific tool set. These forecasts are generated in isolation, but reconciling them would improve accuracy and provide the organization with valuable holistic insights into its business that siloed forecasts and analysis can't provide. For example, based on market and product signals, merchandising may forecast an increase in demand for a certain product. Separately, supply chain will be aware of various manufacturing and logistics stressors that project a decrease in product shipments. Typically this discrepancy won't be caught for several weeks, and will then be resolved via emails and meetings. By then it's too late, since conflicting planning decisions were already made by the separate teams, and the proverbial damage is done. Using BigQuery as a centralized forecast analysis platform would allow a retailer to detect such a discrepancy in a matter of hours or days, and react accordingly, instead of having to roll back planning decisions several weeks after the fact; a hedged sketch of what such a reconciliation check might look like follows below. BigQuery and BigQuery ML provide the perfect platform for collaboration between disparate and diverse forecasting teams, beyond just the powerful modeling capabilities of the ARIMA model.
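To illustrate, here is one hypothetical sketch of such a cross-team reconciliation check in BigQuery SQL; the forecast tables, columns, and the 25% tolerance are illustrative assumptions, not part of any Google Cloud product or reference pattern.

```sql
-- Compare two independently produced forecasts for the same item and week,
-- and surface large divergences for review.
SELECT
  m.item_id,
  m.forecast_week,
  m.forecast_units AS merchandising_units,
  s.forecast_units AS supply_chain_units,
  SAFE_DIVIDE(ABS(m.forecast_units - s.forecast_units),
              m.forecast_units) AS relative_gap
FROM `my_project.forecasts.merchandising` AS m
JOIN `my_project.forecasts.supply_chain` AS s
  USING (item_id, forecast_week)
WHERE SAFE_DIVIDE(ABS(m.forecast_units - s.forecast_units),
                  m.forecast_units) > 0.25  -- assumed 25% tolerance
ORDER BY relative_gap DESC;
```

Scheduled daily, a check like this can turn a weeks-long email thread into a report both teams review the next morning.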
Google Cloud offers several solutions to help you enhance your demand forecasting capabilities and optimize inventory levels amidst changing times. Besides the BigQuery ML tools described in this blog, you can also:

- Build your own time series models, either statistical or ML-based, using your preferred open source frameworks on Cloud AI Platform JupyterLab instances
- Use AutoML Forecast to automatically select and train cutting-edge deep learning time series models
- Use our upcoming fully managed forecasting solution, Demand AI (currently in experimental status)
- Work with a partner like o9 to implement their retail planning platform with forecasting capabilities on Google Cloud

For more examples of data analytics reference patterns, check out the predictive forecasting section in our catalog. Ready to get started with BigQuery ML? Read more in our product introduction. Want to dig deeper into BigQuery ML capabilities? Sign up here for free training on how to train, evaluate, and forecast inventory demand on retail sales data with BigQuery ML.
Source: Google Cloud Platform

The evolution of data architecture at The New York Times

Like virtually every business across the globe, The New York Times had to quickly adapt to the challenges of the coronavirus pandemic last year. Fortunately, our data system with Google Cloud positioned us to perform quickly and efficiently in the new normal.

How we use data

We have an end-to-end data platform. On one side, we work very closely with our product teams to collect the right level of data that they're interested in, such as which articles people are reading and how long they're staying onsite. We frequently measure our audience to understand our user segments, and how they come onsite or use our apps. We then provide that data to analysts for end-to-end analytics. On the other side, the newsroom is also focused on audience, and we build tools to help them understand how Google Search or different social promotions play a role in a person's decision to read The New York Times, and to get a better sense of their behavior on our pages. With this data, the newsroom can make decisions about the information that should be displayed on our homepage or in push notifications.

Ultimately, we're interested in behavioral analytics: how people engage with our site and our apps. We want to understand different behavioral patterns, and which factors or features will encourage users to register and subscribe with us. We also use data to create or curate preferences around personalization, to ensure we're delivering fresh content to our users, or content that they may not normally have read. Likewise, our data also gets used in our targeting system, so that we can send the right messaging about our various subscription packages to the right users.

Choosing to migrate to Google Cloud

When I came to The New York Times over five years ago, our data architecture was not working for us. Our infrastructure was gathering data that proved harder for analysts to crunch on a daily basis, and we were hitting hang-ups with how that data was streaming into our system and environment. Back then we'd run a query and then go grab some coffee, hoping that the query would finish or give us the right data by the time we came back to our desks. Sometimes it would, sometimes it wouldn't.

We realized that Hadoop was definitely not going to be the on-premises solution for us, and that's when we started talking with the Google Cloud team. We began our digital transformation with a migration to BigQuery, their fully managed, serverless data warehouse. We were under a pretty aggressive migration timeline, focusing first on moving over analytics. We made sure our analysts got a top-of-the-line system that treated them the way that they themselves would want to treat the data. One prominent requirement in our data architecture choice was enabling analysts to work as quickly as they needed to provide high-quality deliverables for their business partners. For our analysts, the transition to BigQuery was night and day. I still remember when my manager ran his very first query on BigQuery and was ready to go grab his coffee, but the query finished by the time he got up from his chair. Our analysts talk about that to this day.

While we were doing the BigQuery transition, we did have concerns about our other systems not scaling correctly. Two years ago, we weren't sure we'd be able to scale up to the audience we expected on that election day. We were able to band-aid a solution back then, but we knew we only had two more years to figure out a real, dependable solution.
During that time, we moved our streaming pipeline over to Google Cloud, primarily using App Engine, which has been a flexible environment that enabled quick scaling changes and requirements as needed. Dataflow and Pub/Sub also played significant roles in managing the data. In Q4 of 2020 we had our most significant traffic ever recorded, at 273 million global readers, and four straight days of the highest traffic we've had in any election week. We were proud to see that there was no data loss.

A couple of years ago, on our legacy system, I was up until three in the morning one night trying to keep data running for the newsroom's needs. This year, for election night, I relaxed and ate a pint of ice cream, because I was able to more easily manage our data environment, allowing us to set and meet higher expectations for data ingestion, analysis, and insight among our partners in the newsroom.

How COVID-19 changed our 2020 roadmap

The coronavirus pandemic definitely wasn't on my team's roadmap for 2020, and it's important to mention here that The New York Times is not fundamentally a data company. Our job is to get the news out to our users every single day in paper, on apps, and onsite. Our newsroom didn't expect the need to build out a giant coronavirus database that would enrich the news they share every day. But our newsroom moves quickly, and our engineers have built one of the most comprehensive datasets on COVID-19 in the U.S. With Google, The New York Times decided to make our data publicly available in Google's COVID-19 public dataset on BigQuery. Check out this webinar for more details on the evolution of our architecture.

Flexible approach

We have many different teams that work within Google Cloud, and they've been able to pick from the range of available services and tailor project requirements with those available tools in mind. One challenge we think about with the data platform at The New York Times is determining the priorities of what we build. Our ability to engage with product teams at Google through the Data Analytics Customer Council allows us to see into the BigQuery roadmap, or the data analytics roadmap, and plays a significant role in determining where we focus our own development. For example, we've built tools like our Data Reporting API, which reads data directly from BigQuery, in order to take advantage of tools like BigQuery BI Engine. This approach encourages our analysts to be better managers of their domains around dimensions and metrics, without having to focus on building caching mechanisms for their data. Getting that kind of clarity helps us plan how to build The New York Times in the new normal and beyond.

If you're interested in learning more about the data teams at The New York Times, take a look at our open tech roles, and you'll find many interesting articles on the NYT data blog.
Source: Google Cloud Platform