
Using Google Analytics Data API to backfill GA4 data in BigQuery

Wolverine Worldwide + Napkyn

Challenge

For decades, Wolverine Worldwide Inc. has built a reputation for creating good products for good people, with an extensive product line sold in approximately 200 countries and territories worldwide.

After transitioning from Universal Analytics to GA4, Wolverine went through a period of instability in its GA4 data and waited to connect GA4 to BigQuery until the data had been validated. The result was an approximately 10-month gap in the native BigQuery exports across more than 40 properties.

Wolverine partnered with Napkyn to extend the range of historical GA4 data available in BigQuery for analysis and data modeling, consistent with Wolverine's business requirements.


Solution

Napkyn determined that the best way to close the gap was to pull reports through the Google Analytics Data API and insert them into BigQuery. The backfill covers a subset of the GA4 data, limited to the dates before the native export began.
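For illustration, a request like the following could pull one such report through the Data API. This is a minimal sketch using the google-analytics-data Python client; the property ID, date range, dimensions, and metrics are placeholders rather than the actual report definitions built for Wolverine.

```python
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import (
    DateRange,
    Dimension,
    Metric,
    RunReportRequest,
)

# Placeholder property ID; the real backfill covered 40+ GA4 properties.
PROPERTY_ID = "123456789"

client = BetaAnalyticsDataClient()  # authenticates via Application Default Credentials
request = RunReportRequest(
    property=f"properties/{PROPERTY_ID}",
    # Backfill window: the dates before the native BigQuery export began.
    date_ranges=[DateRange(start_date="2022-07-01", end_date="2023-04-30")],
    dimensions=[
        Dimension(name="date"),
        Dimension(name="sessionDefaultChannelGroup"),
    ],
    metrics=[
        Metric(name="sessions"),
        Metric(name="totalUsers"),
    ],
    limit=100000,
)
response = client.run_report(request)
```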

The Napkyn team constructed meaningful reports based on Wolverine's business requirements, scripted them in Python against the Google Analytics Data API, and pushed the results to BigQuery, where the rest of each property's GA4 data lives. Because all of the sites share a standard tracking implementation, Napkyn used an iterative approach: reports were built and tested for one GA4 property, then replicated across the 40 remaining properties, streamlining delivery.
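The loading step might look like the sketch below, assuming the google-cloud-bigquery client library. The table ID and column mapping are illustrative, mirroring the report shape from the previous sketch rather than Wolverine's actual schemas; in practice the destination would sit alongside each property's native GA4 export dataset.

```python
from google.analytics.data_v1beta.types import RunReportResponse
from google.cloud import bigquery

def load_report(response: RunReportResponse, table_id: str) -> None:
    """Append the rows of a Data API report to a BigQuery table."""
    # Flatten the API response into JSON rows matching the report's shape.
    rows = [
        {
            "date": row.dimension_values[0].value,
            "channel_group": row.dimension_values[1].value,
            "sessions": int(row.metric_values[0].value),
            "total_users": int(row.metric_values[1].value),
        }
        for row in response.rows
    ]
    job = bigquery.Client().load_table_from_json(
        rows,
        table_id,
        job_config=bigquery.LoadJobConfig(
            autodetect=True,                   # infer the table schema from the rows
            write_disposition="WRITE_APPEND",  # append to any earlier backfill loads
        ),
    )
    job.result()  # wait for the load job to finish

# Hypothetical destination table, one per report per property:
# load_report(response, "my-project.analytics_123456789.backfill_channel_sessions")
```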


Results

Increased historical data range: By closing the 10-month gap in the GA4 data available in BigQuery, Wolverine can now run predictive ML models that require at least two years of data to be accurate.

Saved implementation resources: The iterative approach and standardized tracking saved many hours of scripting, freeing the data science team to work on other projects.

Increased first-party data availability: While GA4 data retention is limited to 50 months, exporting first-party data to a data warehouse like BigQuery preserves it for future use.


