How to Recreate the 'Page Value' Metric in GA4 with BigQuery

How to Recreate the 'Page Value' Metric in GA4 with BigQuery

Learn to recreate the 'Page Value' metric in GA4 with BigQuery. This guide covers SQL queries for GA4, BigQuery tutorials, and custom metrics for deeper insights into page performance.

Shreya Banker

Data Scientist

Data Analyst enthusiast. More than 7 years of exposure in Data Analysis and Software programming. I am a highly motivated, versatile IT professional with experience in Data Analysis, Visualization and Database Management. I look for the hardest problem to solve and where I can learn and develop the most. I love a challenge and never run from a difficult task. I'm determined to succeed, and I look forward to what life has to offer.

One of the core metrics used in Universal Analytics (UA) to measure page performance was Page Value. It provided insight into which pages were contributing most to conversions and revenue. However, with the switch to Google Analytics 4 (GA4), this metric is no longer available by default. 

Fortunately, GA4’s BigQuery integration allows us to recreate the Page Value metric and extend its capabilities beyond what was possible in UA.  In this post, we will walk through how to build the Page Value metric in GA4 using SQL queries in BigQuery, with an emphasis on technical implementation and flexibility for future customization.

What was the 'Page Value' Metric in Universal Analytics?

The Page Value metric in UA calculated the average value of a page by attributing revenue from e-commerce transactions or goal completions to all the pages a user visited before the transaction occurred. It was particularly useful for identifying which content is most effective at driving users toward conversion actions. Pages that were not involved in a conversion were assigned a value of $0.

In GA4, where the event-driven data model replaces the session-based model of UA, recreating this metric requires a different approach. In GA4, all user interactions are captured as events, and sessions are not as explicitly defined. Thus, we'll need to manipulate the data to approximate the session structure and ensure page views preceding conversions are credited appropriately.

Why BigQuery is Essential for Custom Metrics in GA4?

GA4’s BigQuery integration allows you to access raw event-level data, giving you complete control over how metrics and dimensions are calculated. While GA4’s standard reporting interface is sufficient for basic analysis, BigQuery unlocks the ability to go beyond the preset metrics and dimensions. This makes it possible to recreate metrics like Page Value and even customize them to suit specific business needs.

BigQuery’s SQL-like syntax allows for complex data transformations and calculations that are often not possible in the GA4 interface. For instance, GA4 does not track Page Value natively, but by querying GA4 data in BigQuery, we can assign revenue to pages based on user interactions.

Key GA4 Data Fields for Building the Page Value Metric

GA4’s event-based schema is structured differently from UA’s. Some of the key fields we will use to recreate the Page Value metric include:

  • event_name: Identifies the type of event, such as page_view or purchase.

  • event_params: A nested array storing key-value pairs related to each event. For example, page_location (URL) for page views and ga_session_id for session tracking.

  • user_pseudo_id: A unique identifier for users across sessions.

  • event_value_in_usd: This field captures the revenue associated with purchase events, in USD.

Each interaction—whether it’s a page view, purchase, or custom conversion—is stored as an event, and the fields relevant to that event are stored within the event_params array. To build a useful query, we must carefully extract and aggregate data from these nested structures.

Step-by-Step Guide to Recreate the Page Value Metric in GA4 Using BigQuery

Let’s walk through the SQL query to recreate the Page Value metric in GA4 using BigQuery. This process will involve several subqueries that transform raw event data into the desired output, attributing revenue to the pages viewed before a transaction.

1. Defining the Date Range in BigQuery

First, we define a date range for the query using a WITH clause. This helps to ensure consistency throughout the query and allows for dynamic date adjustments.

WITH dates AS (

  SELECT

    '20241001' AS start_date,

    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)) AS end_date

),

In this example, we’ve hardcoded a start date (20241001) and set the end date dynamically as yesterday (DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)).

2. Extracting Revenue Events from GA4 Data

Next, we extract all revenue-generating events. In GA4, this is typically captured under the purchase event with a corresponding event_value_in_usd. We’ll concatenate user_pseudo_id with ga_session_id to create a unique session identifier.

revenue_events AS (

  SELECT

    CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) AS session_id,

    event_timestamp,

    SUM(event_value_in_usd) AS event_value

  FROM `your-project.your-dataset.events_*`

  WHERE _TABLE_SUFFIX BETWEEN (SELECT start_date FROM dates) AND (SELECT end_date FROM dates)

    AND event_value_in_usd IS NOT NULL

  GROUP BY session_id, event_timestamp

)

This subquery selects all events with a non-null event_value_in_usd and aggregates revenue by session.

3. Identifying and Extracting Page Views

We need to identify the pages viewed before a transaction to calculate page value. We extract page views based on the event_name = 'page_view', and retrieve the page’s URL from the page_location parameter within the event_params array.

, page_views AS (

  SELECT

    CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) AS session_id,

    event_timestamp,

    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') AS page_location

  FROM `your-project.your-dataset.events_*`

  WHERE _TABLE_SUFFIX BETWEEN (SELECT start_date FROM dates) AND (SELECT end_date FROM dates)

    AND event_name = 'page_view'

)

This subquery retrieves all page view events for the specified date range.

4. Combining Page Views and Revenue Events

Now we need to combine page views and revenue events within the same session. We create a "pseudo-session" that splits sessions based on revenue event timestamps. This allows us to track page views that occurred before each revenue event.

, revenue_and_pages AS (

  SELECT

    CONCAT(page_views.session_id, revenue_events.event_timestamp) AS pseudo_session_id,

    revenue_events.event_value,

    page_views.page_location,

    CASE

      WHEN page_views.event_timestamp < revenue_events.event_timestamp THEN TRUE

      ELSE FALSE

    END AS is_before_revenue_event

  FROM page_views

  FULL OUTER JOIN revenue_events

  ON page_views.session_id = revenue_events.session_id

)

Here, we perform a FULL OUTER JOIN between page_views and revenue_events, ensuring that all relevant combinations of page views and revenue in a session are considered. We then use a CASE statement to filter out page views after the revenue event.

5. Distributing Revenue Across Relevant Page Views

In this step, we distribute the revenue from each session across the relevant page views that preceded the conversion. This is done using a window function.

, page_revenue_split AS (

  SELECT

    pseudo_session_id,

    page_location,

    event_value / COUNT(page_location) OVER (PARTITION BY pseudo_session_id) AS page_revenue

  FROM revenue_and_pages

  WHERE is_before_revenue_event

)

The window function COUNT(page_location) OVER (PARTITION BY pseudo_session_id) ensures that the revenue is evenly distributed among all the page views that led up to the transaction.

6. Calculating Total Revenue

We now sum the total revenue attributed to each page across all sessions.

, total_page_revenue AS (

  SELECT

    page_location,

    SUM(page_revenue) AS total_revenue

  FROM page_revenue_split

  GROUP BY page_location

)

This gives us the total revenue contributed by each page.

7. Calculating Page Value

Finally, we calculate the average Page Value by dividing the total revenue for each page by the total number of times the page was viewed.

, page_views_count AS (

  SELECT

    page_location,

    COUNT(*) AS total_views

  FROM page_views

  GROUP BY page_location

)

SELECT

  views.page_location,

  views.total_views,

  IFNULL(revenue.total_revenue, 0) / views.total_views AS page_value

FROM page_views_count AS views

LEFT JOIN total_page_revenue AS revenue

ON views.page_location = revenue.page_location

ORDER BY views.total_views DESC

In this final query, we use a LEFT JOIN to combine the total number of page views with the total revenue for each page and calculate the Page Value by dividing the revenue by the number of views.

How to Use the Page Value Metric for Non-E-commerce Websites

For websites that do not have e-commerce transactions, the Page Value metric can still be applied by attributing value to other important conversion events, such as form submissions, sign-ups, or lead generation actions. Here’s how you can modify the query for non-e-commerce sites:

  1. Identify Key Conversion Events: Replace event_name = 'purchase' with your custom conversion event names, such as event_name = 'form_submit' or event_name = 'lead_generation'.

  2. Assign Default Conversion Values: If your site doesn’t track revenue, you can assign a default value to conversions. For instance, you might assign a value of 1 to each form submission, allowing you to measure the influence of pages on non-monetary goals.

, conversion_events AS (

  SELECT

    CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) AS session_id,

    event_timestamp,

    1 AS event_value  -- Assigning a default value to each conversion event

  FROM `your-project.your-dataset.events_*`

  WHERE _TABLE_SUFFIX BETWEEN (SELECT start_date FROM dates) AND (SELECT end_date FROM dates)

    AND event_name = 'form_submit'  -- Replace with your conversion event

)

By modifying the query to focus on conversions rather than revenue, you can still calculate a Page Value metric based on how pages contribute to non-revenue actions.

Putting it all together

Here’s a look at the full e-commerce version of the query described above:

WITH dates AS (

  -- Define the date range dynamically

  SELECT

    '20241001' AS start_date,

    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)) AS end_date

),

-- Extracting Revenue Events

revenue_events AS (

  SELECT

    CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) AS session_id,

    event_timestamp,

    SUM(event_value_in_usd) AS event_value

  FROM `{your_project}.{your_dataset}.events_*`

  WHERE _TABLE_SUFFIX BETWEEN (SELECT start_date FROM dates) AND (SELECT end_date FROM dates)

    AND event_value_in_usd IS NOT NULL

  GROUP BY session_id, event_timestamp

),

-- Extracting Page Views

page_views AS (

  SELECT

    CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) AS session_id,

    event_timestamp,

    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') AS page_location

  FROM `{your_project}.{your_dataset}.events_*`

  WHERE _TABLE_SUFFIX BETWEEN (SELECT start_date FROM dates) AND (SELECT end_date FROM dates)

    AND event_name = 'page_view'

),

-- Combining Page Views with Revenue Events

revenue_and_pages AS (

  SELECT

    CONCAT(page_views.session_id, revenue_events.event_timestamp) AS pseudo_session_id,

    revenue_events.event_value,

    page_views.page_location,

    CASE

      WHEN page_views.event_timestamp < revenue_events.event_timestamp THEN TRUE

      ELSE FALSE

    END AS is_before_revenue_event

  FROM page_views

  FULL OUTER JOIN revenue_events

  ON page_views.session_id = revenue_events.session_id

),

-- Distributing Revenue Across Page Views

page_revenue_split AS (

  SELECT

    pseudo_session_id,

    page_location,

    event_value / COUNT(page_location) OVER (PARTITION BY pseudo_session_id) AS page_revenue

  FROM revenue_and_pages

  WHERE is_before_revenue_event

),

-- Summing Revenue by Page

total_page_revenue AS (

  SELECT

    page_location,

    SUM(page_revenue) AS total_revenue

  FROM page_revenue_split

  GROUP BY page_location

),

-- Counting Page Views

page_views_count AS (

  SELECT

    page_location,

    COUNT(*) AS total_views

  FROM page_views

  GROUP BY page_location

)

-- Calculating Page Value

SELECT

  views.page_location,

  views.total_views,

  IFNULL(revenue.total_revenue, 0) / views.total_views AS page_value

FROM page_views_count AS views

LEFT JOIN total_page_revenue AS revenue

ON views.page_location = revenue.page_location

ORDER BY views.total_views DESC

Tips to Optimize BigQuery Performance for Page Value Metrics

Consider materializing the results in a separate table to make this query more efficient, especially if you’re running it frequently. This can help reduce BigQuery processing costs and speed up query execution in tools like Looker Studio.

Conclusion

Recreating the Page Value metric in GA4 using BigQuery allows you to measure the impact of individual pages on conversions and revenue, even when these metrics aren’t available by default. Whether you’re running an e-commerce site or tracking non-revenue goals, BigQuery provides the flexibility to build custom metrics that suit your business needs.

For non-e-commerce websites, you can still use this approach to track page contributions to key actions such as form submissions, lead generation, or other goals. With the ability to tailor the metric to your site’s unique requirements, you can gain deeper insights into how content drives value across your digital property.

By leveraging the power of BigQuery, you're not only replicating the old Page Value metric but also unlocking new possibilities for analysis in GA4.

About Napkyn

Napkyn specializes in empowering businesses with tailored analytics solutions, offering expertise in GA4 and BigQuery to transform data into actionable insights. Their custom BigQuery reports are designed to meet unique business needs, providing deep insights into user behavior, campaign performance, and e-commerce trends. By combining advanced analytics with robust reporting, Napkyn helps organizations make data-driven decisions to optimize marketing strategies and drive growth.


Looking to harness the full potential of GA4 and BigQuery for your business? Let Napkyn guide you with custom reporting solutions that deliver real results. Get in touch with us today to start transforming your data into your most valuable business asset!

Sign Up For Our Newsletter

Sign Up For Our Newsletter

Napkyn Inc.
204-78 George Street, Ottawa, Ontario, K1N 5W1, Canada

Napkyn US
6 East 32nd Street, 9th Floor, New York, NY 10016, USA

212-247-0800 | info@napkyn.com  

Napkyn Inc.
204-78 George Street, Ottawa, Ontario, K1N 5W1, Canada

Napkyn US
6 East 32nd Street, 9th Floor, New York, NY 10016, USA

212-247-0800 | info@napkyn.com