Google BigQuery ML Contribution Analysis for Metric Insights

Google BigQuery ML Contribution Analysis
BigQuery ML contribution analysis, which generates insights automatically, is now generally available.
Google Cloud has announced that BigQuery ML contribution analysis is now generally available (GA). First released in preview in September 2024, the feature helps users understand changes in a metric by finding the main sources of change in large-scale multidimensional data. BigQuery ML software engineers Jenny Ortiz and Katelin Amann made the announcement.
Extracting insights from massive volumes of multidimensional data, such as sales data across products, stores, locations, and consumers combined with other events, has traditionally required manual examination and trial-and-error querying and visualisation. That approach is laborious and leaves a very large number of combinations to study. Google BigQuery ML contribution analysis automates the process, allowing customers to discover problem areas quickly.
The GA release includes several new features to help identify the most important insights faster:
Automated support tuning with top-k insights: Instead of setting the min_apriori_support threshold manually, users can now tell the model how many insights they want and let it choose the threshold automatically. Apriori support filters insights by data segment size, so the model returns only the most significant ones. Both methods reduce query latency compared with returning every possible insight, which could number in the millions.
Improved insight readability with redundancy pruning: The new pruning_method option removes redundant insights. Redundancy can occur when several insights, especially across correlated dimensions, have the same output metrics. For example, if all of a store's sales took place in Iowa City, the segments described by [city='Iowa City', store_name='General Store / Iowa City'] and [store_name='General Store / Iowa City'] have the same metrics. With pruning enabled, the model returns only the unique insights, keeping the most descriptive segment.
New summable by category metric: The preview initially supported two metric types, summable (aggregating a single measure) and summable ratio (aggregating the ratio of two measures). The GA release adds summable by category, which lets users analyse metric totals normalised by a categorical variable, such as sales per customer or site visitors per day. The new metric type helps adjust for outliers when comparing groups with different numbers of rows, such as revenue per month across years with different data availability.
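The three features above can be illustrated with a small, self-contained Python sketch. It is not how BigQuery ML implements them. The rows, the column names, and the definition of support as a segment's share of total sales are all assumptions made for illustration, and redundancy is approximated by treating segments that cover exactly the same rows as duplicates.

```python
from itertools import combinations

DIMENSIONS = ("city", "store_name")

# Hypothetical toy data, invented for this sketch.
ROWS = [
    {"city": "Iowa City", "store_name": "General Store / Iowa City", "user_id": "u1", "sales": 40.0},
    {"city": "Iowa City", "store_name": "General Store / Iowa City", "user_id": "u2", "sales": 20.0},
    {"city": "Des Moines", "store_name": "Corner Shop", "user_id": "u3", "sales": 30.0},
    {"city": "Des Moines", "store_name": "Depot", "user_id": "u3", "sales": 10.0},
]


def build_segments(rows, dimensions=DIMENSIONS):
    """Aggregate sales and distinct users for every combination of dimension values."""
    segments = {}
    for index, row in enumerate(rows):
        for size in range(1, len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                key = tuple((d, row[d]) for d in dims)
                seg = segments.setdefault(key, {"rows": set(), "sales": 0.0, "users": set()})
                seg["rows"].add(index)
                seg["sales"] += row["sales"]
                seg["users"].add(row["user_id"])
    return segments


def summarize(segments, total_sales):
    """Turn raw aggregates into insights with a support value and a per-user metric."""
    return [
        {
            "segment": dict(key),
            "rows": frozenset(seg["rows"]),
            # Assumed stand-in for apriori support: share of total sales.
            "support": seg["sales"] / total_sales,
            # Summable by category: metric total divided by distinct category count.
            "sales_per_user": seg["sales"] / len(seg["users"]),
        }
        for key, seg in segments.items()
    ]


def prune_redundant(insights):
    """Keep one insight per identical row set, preferring the most descriptive segment."""
    best = {}
    for insight in insights:
        current = best.get(insight["rows"])
        if current is None or len(insight["segment"]) > len(current["segment"]):
            best[insight["rows"]] = insight
    return list(best.values())


def top_k_by_support(insights, k):
    """Return the k insights with the largest support."""
    return sorted(insights, key=lambda i: i["support"], reverse=True)[:k]


total = sum(row["sales"] for row in ROWS)
all_insights = summarize(build_segments(ROWS), total)
pruned = prune_redundant(all_insights)
top = top_k_by_support(pruned, 2)
for insight in top:
    print(insight["segment"], round(insight["support"], 2), insight["sales_per_user"])
```

On this toy data the eight candidate segments collapse to four after pruning, and the Iowa City store survives only in its most descriptive form, with both city and store_name set.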
Contribution analysis in action
The announcement includes an example that uses the feature to understand a reduction in apparel product sales per user between 2020 and 2021 on a public e-commerce dataset. To query the model for insights, users create an input table, define a model with MODEL_TYPE='CONTRIBUTION_ANALYSIS', specify the dimension and test columns, and use the new CONTRIBUTION_METRIC='SUM(sales)/COUNT(DISTINCT user_id)' together with TOP_K_INSIGHTS_BY_APRIORI_SUPPORT=15 and PRUNING_METHOD='PRUNE_REDUNDANT_INSIGHTS'. The output, automatically sorted by contribution, shows a reduction in US revenue per user associated with referral traffic. Insights of this kind are intended to inform business strategy.
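The model definition described above might be assembled roughly as follows. This is a sketch, not the statement from the announcement: the dataset, table, and column names are hypothetical, and the DIMENSION_ID_COLS and IS_TEST_COL option names and the ML.GET_INSIGHTS function are taken from the BigQuery ML documentation rather than from this post, so they should be checked against the current reference before use.

```python
def render_create_model(model_name, source_table, dimension_cols, test_col,
                        metric, top_k):
    """Render a CREATE MODEL statement for a contribution analysis model."""
    dims = ", ".join(f"'{col}'" for col in dimension_cols)
    return "\n".join([
        f"CREATE OR REPLACE MODEL `{model_name}`",
        "OPTIONS (",
        "  MODEL_TYPE = 'CONTRIBUTION_ANALYSIS',",
        f"  CONTRIBUTION_METRIC = '{metric}',",
        f"  DIMENSION_ID_COLS = [{dims}],",   # assumed option name
        f"  IS_TEST_COL = '{test_col}',",     # assumed option name
        f"  TOP_K_INSIGHTS_BY_APRIORI_SUPPORT = {top_k},",
        "  PRUNING_METHOD = 'PRUNE_REDUNDANT_INSIGHTS'",
        ") AS",
        f"SELECT * FROM `{source_table}`;",
    ])


def render_get_insights(model_name):
    """Render the query that reads insights back from a trained model."""
    return f"SELECT * FROM ML.GET_INSIGHTS(MODEL `{model_name}`);"


# Hypothetical names, chosen only for this illustration.
create_sql = render_create_model(
    model_name="my_dataset.apparel_sales_ca",
    source_table="my_dataset.apparel_sales_input",
    dimension_cols=["country", "traffic_source"],
    test_col="is_test",
    metric="SUM(sales)/COUNT(DISTINCT user_id)",
    top_k=15,
)
insights_sql = render_get_insights("my_dataset.apparel_sales_ca")
print(create_sql)
print(insights_sql)
```

The rendered statements could then be run in the BigQuery console or through a client library. The input table is expected to hold the dimension columns, the metric inputs, and a boolean column separating the test period (2021) from the control period (2020).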