Contribution analysis overview

You can use contribution analysis, also called key driver analysis, to generate insights about changes to key metrics in your multi-dimensional data. For example, you can use contribution analysis to see the change in revenue numbers across two quarters, or to compare two sets of training data to understand changes in an ML model's performance. You can use a CREATE MODEL statement to create a contribution analysis model in BigQuery.

Contribution analysis is a form of augmented analytics, which is the use of artificial intelligence (AI) to enhance and automate the analysis and understanding of data. Contribution analysis accomplishes one of the key goals of augmented analytics, which is to help users find patterns in their data.

A contribution analysis model detects segments of the data that show statistically significant changes in a metric by comparing a test set of data to a control set of data. This lets you see how the metric changes across time, location, customer segment, or any other dimension that you care about. For example, you might compare a table snapshot taken at the end of 2023 to a table snapshot taken at the end of 2022 to see how the data differs across two years.
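The test-versus-control comparison can be illustrated with a minimal Python sketch. This is not how BigQuery implements contribution analysis. The rows, the `differences_by_segment` helper, and the single `store_number` dimension are hypothetical, and the sketch ranks segments by raw difference instead of testing for statistical significance.

```python
from collections import defaultdict

# Hypothetical data: (store_number, is_test, revenue).
# is_test=False marks control rows, is_test=True marks test rows.
rows = [
    ("store 1", False, 100.0),
    ("store 1", True, 180.0),
    ("store 2", False, 50.0),
    ("store 2", True, 60.0),
    ("store 3", False, 70.0),
    ("store 3", True, 40.0),
]


def differences_by_segment(rows):
    """Sum the metric per segment for test and control, then rank by |test - control|."""
    totals = defaultdict(lambda: {"test": 0.0, "control": 0.0})
    for segment, is_test, value in rows:
        totals[segment]["test" if is_test else "control"] += value
    diffs = {seg: t["test"] - t["control"] for seg, t in totals.items()}
    # Largest absolute change first, whether it is an increase or a decrease.
    return sorted(diffs.items(), key=lambda item: abs(item[1]), reverse=True)


ranked = differences_by_segment(rows)
for segment, diff in ranked:
    print(f"{segment}: {diff:+.1f}")
```

In this toy data, the segment with the largest absolute change is listed first, regardless of whether the metric rose or fell.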

The metric is the numerical value that contribution analysis models use to measure and compare the changes between the test and control data. You can specify either a summable metric or a summable ratio metric with contribution analysis models.
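As a rough illustration of the two metric types, the following Python sketch treats a summable metric as the sum of one numeric column and a summable ratio metric as the ratio of two such sums. The column names (`revenue`, `units`, `is_test`) and helper functions are assumptions made for this example, not part of BigQuery.

```python
# Hypothetical rows; is_test separates the test set from the control set.
rows = [
    {"is_test": False, "revenue": 100.0, "units": 10},
    {"is_test": False, "revenue": 50.0, "units": 5},
    {"is_test": True, "revenue": 180.0, "units": 12},
    {"is_test": True, "revenue": 60.0, "units": 8},
]


def summable(rows, is_test, col):
    """Summable metric: sum one numeric column over the test or control rows."""
    return sum(r[col] for r in rows if r["is_test"] == is_test)


def summable_ratio(rows, is_test, num_col, den_col):
    """Summable ratio metric: a ratio of two sums, not a mean of per-row ratios."""
    return summable(rows, is_test, num_col) / summable(rows, is_test, den_col)


# Change in total revenue between the test and control sets.
revenue_change = summable(rows, True, "revenue") - summable(rows, False, "revenue")

# Change in revenue per unit between the test and control sets.
price_change = (
    summable_ratio(rows, True, "revenue", "units")
    - summable_ratio(rows, False, "revenue", "units")
)

print(revenue_change, price_change)
```

Computing the ratio from two sums keeps the metric well defined for any segment, because the numerator and denominator can each be aggregated over just the rows in that segment.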

A segment is a slice of the data identified by a given combination of dimension values. For example, for a contribution analysis model based on the store_number, customer_id, and day dimensions, every unique combination of those dimension values represents a segment. In the following table, each row represents a different segment:

| store_number | customer_id | day |
|---|---|---|
| store 1 | | |
| store 1 | customer 1 | |
| store 1 | customer 1 | Monday |
| store 1 | customer 1 | Tuesday |
| store 1 | customer 2 | |
| store 2 | | |
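To make the idea of a segment concrete, here is a small Python sketch that enumerates segments from sample rows. It assumes that a segment can be formed from any non-empty subset of the dimensions, which is an interpretation for illustration and may not match exactly how BigQuery enumerates segments. The rows and the `enumerate_segments` helper are hypothetical.

```python
from itertools import combinations

DIMENSIONS = ("store_number", "customer_id", "day")

# Hypothetical input rows, one per observation.
rows = [
    {"store_number": "store 1", "customer_id": "customer 1", "day": "Monday"},
    {"store_number": "store 1", "customer_id": "customer 1", "day": "Tuesday"},
    {"store_number": "store 1", "customer_id": "customer 2", "day": "Monday"},
    {"store_number": "store 2", "customer_id": "customer 1", "day": "Monday"},
]


def enumerate_segments(rows, dimensions):
    """Collect every distinct combination of dimension values seen in the rows.

    Assumption: a segment may fix any non-empty subset of the dimensions;
    dimensions left out of a segment are unconstrained.
    """
    segments = set()
    for row in rows:
        for size in range(1, len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                segments.add(tuple((d, row[d]) for d in dims))
    return segments


segments = enumerate_segments(rows, DIMENSIONS)
# Print shorter (broader) segments before longer (narrower) ones.
for segment in sorted(segments, key=lambda s: (len(s), s)):
    print(dict(segment))
```

Under this assumption the number of segments grows quickly with the number of dimensions and distinct values, which is why pruning small segments matters in practice.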

To model only the largest, and therefore most relevant, segments, specify an apriori support threshold. Segments whose support falls below the threshold are pruned from the model, which also reduces model creation time.
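The effect of a support threshold can be sketched in Python as follows. This sketch assumes that a segment's support is its share of the overall metric total, taking the larger of its test share and its control share. That definition, the sample numbers, and the `apriori_support` and `prune` helpers are assumptions for illustration; check the BigQuery reference for the exact definition it uses.

```python
def apriori_support(seg_test, seg_control, total_test, total_control):
    """Assumed definition: the larger of the segment's test and control shares."""
    return max(seg_test / total_test, seg_control / total_control)


def prune(segment_metrics, total_test, total_control, min_support):
    """Keep only segments whose support meets the threshold."""
    return {
        seg: (test, control)
        for seg, (test, control) in segment_metrics.items()
        if apriori_support(test, control, total_test, total_control) >= min_support
    }


# Hypothetical per-segment metric sums: segment -> (test, control).
segment_metrics = {
    "store 1": (180.0, 100.0),
    "store 2": (60.0, 50.0),
    "store 1 / customer 2": (6.0, 3.0),
}

kept = prune(segment_metrics, total_test=240.0, total_control=150.0, min_support=0.05)
print(sorted(kept))
```

With a threshold of 0.05, the small `store 1 / customer 2` segment is dropped because it accounts for less than 5% of the metric in both the test and control sets.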

After you have created a contribution analysis model, you can use the ML.GET_INSIGHTS function to retrieve the metric information calculated by the model.

What's next