What is data sampling?

High dimension cardinality (the number of unique values for a dimension) or a large amount of data in a data set can make queries too complex to process in full. To handle this, Google Analytics reads as much data as it can until it hits a limit, and then uses the data points it has read to estimate the rest. This happens in the Google Analytics UI as well as in the API. In the UI, many reports show a yellow shield that displays a warning when you hover over it. The warning specifies the percentage of data points that were read. Data can also be sampled when you query very recent data or historical data. Sampling can often be avoided by making smaller queries, for example by using fewer dimensions or dimensions with lower cardinality.
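The same percentage can be worked out from an API response. The following is a minimal sketch, assuming the Reporting API v4, where the data section of a sampled report carries samplesReadCounts and samplingSpaceSizes (one entry per date range, encoded as strings) and omits both fields when the report is unsampled. The function name is hypothetical.

```python
def sampled_percentages(report_data):
    """Return the percentage of sessions read for each date range.

    `report_data` is assumed to be the "data" object of one report in a
    Reporting API v4 response. An empty list means no sampling fields
    were present, i.e. the report was not sampled.
    """
    read_counts = report_data.get("samplesReadCounts", [])
    space_sizes = report_data.get("samplingSpaceSizes", [])

    percentages = []
    for read, space in zip(read_counts, space_sizes):
        # Both values arrive as strings holding integers.
        read, space = int(read), int(space)
        if space > 0:
            percentages.append(100.0 * read / space)
    return percentages


# Example: 250,000 of 1,000,000 sessions were read.
example = {"samplesReadCounts": ["250000"], "samplingSpaceSizes": ["1000000"]}
print(sampled_percentages(example))
```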

Sampling thresholds

The amount of data in the queried date range is one reason your data can be sampled. The thresholds are listed below.

Multi-Channel Funnels reports

  • 1M conversions

All other reports

  • Analytics Standard: 500k sessions
  • Analytics 360: 1M sessions
  • Analytics 360 using resource based quota: 100M sessions

Read more about the thresholds in this Google help article.
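As a rough illustration, the thresholds above can be expressed as a lookup. This is only a sketch: the dictionary keys and the function name are made up for the example, and exceeding a threshold means a query may be sampled, not that it always will be.

```python
# Session thresholds for all reports except Multi-Channel Funnels.
SESSION_THRESHOLDS = {
    "standard": 500_000,                   # Analytics Standard
    "360": 1_000_000,                      # Analytics 360
    "360_resource_quota": 100_000_000,     # Analytics 360, resource based quota
}

# Multi-Channel Funnels reports are limited by conversions instead.
MCF_CONVERSION_THRESHOLD = 1_000_000


def may_be_sampled(count, account_type="standard", mcf=False):
    """Return True if `count` exceeds the relevant sampling threshold.

    `count` is the number of sessions in the queried date range, or the
    number of conversions when `mcf` is True.
    """
    limit = MCF_CONVERSION_THRESHOLD if mcf else SESSION_THRESHOLDS[account_type]
    return count > limit


print(may_be_sampled(750_000, "standard"))  # above 500k sessions
print(may_be_sampled(750_000, "360"))       # below 1M sessions
```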

Actions we take

  • We always ask for the highest precision when querying data. We do this by sending the LARGE or HIGHER_PRECISION parameter value (they mean the same thing but belong to different APIs). This results in slightly slower but more accurate reports.
  • We split reports that exceed the sampling threshold into multiple smaller reports. The shortest date range currently supported by GA is one day.
  • We ask for today's data separately so that nearby dates do not come back sampled, because intraday queries can always be sampled.
  • We automatically use resource based quota for GA360 users where we have encountered sampled data. This is not available for the MCF report or for data more than 1 year old.
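The splitting and precision settings can be sketched as follows. This is an illustration under stated assumptions, not our actual implementation: it assumes the Reporting API v4 request format (where the precision value is LARGE), and the helper names are hypothetical.

```python
from datetime import date, timedelta


def split_date_range(start, end, days_per_chunk=1):
    """Split an inclusive date range into consecutive smaller ranges."""
    if days_per_chunk < 1:
        raise ValueError("days_per_chunk must be at least 1 (GA's minimum is one day)")
    chunks = []
    current = start
    while current <= end:
        chunk_end = min(current + timedelta(days=days_per_chunk - 1), end)
        chunks.append((current, chunk_end))
        current = chunk_end + timedelta(days=1)
    return chunks


def build_report_request(view_id, start, end, metrics, dimensions):
    """Build one Reporting API v4 report request asking for the highest precision."""
    return {
        "viewId": view_id,
        "dateRanges": [{"startDate": start.isoformat(), "endDate": end.isoformat()}],
        "metrics": [{"expression": m} for m in metrics],
        "dimensions": [{"name": d} for d in dimensions],
        "samplingLevel": "LARGE",  # highest precision in v4
    }


# One request per day for a five-day range; the view ID is a placeholder.
requests = [
    build_report_request("123456", s, e, ["ga:sessions"], ["ga:source"])
    for s, e in split_date_range(date(2020, 1, 1), date(2020, 1, 5))
]
print(len(requests))
```

Each smaller request covers fewer sessions, so it is less likely to exceed the threshold. The results then need to be combined on the client side.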

Actions you can take

Use less complex reports

The number of dimensions in the data source and the complexity (cardinality) of those dimensions greatly affect the sampling level. Leaving out dimensions you do not need therefore helps reduce sampling.
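To get a feel for which dimensions are the most complex, you can count the unique values per dimension in data you have already exported. A minimal sketch, assuming each row is a dictionary mapping dimension names to values (the function name and sample rows are made up for the example):

```python
def dimension_cardinality(rows):
    """Return the number of unique values seen for each dimension."""
    unique_values = {}
    for row in rows:
        for name, value in row.items():
            unique_values.setdefault(name, set()).add(value)
    return {name: len(values) for name, values in unique_values.items()}


rows = [
    {"ga:deviceCategory": "mobile", "ga:pagePath": "/home"},
    {"ga:deviceCategory": "desktop", "ga:pagePath": "/pricing"},
    {"ga:deviceCategory": "mobile", "ga:pagePath": "/blog/post-1"},
]
print(dimension_cardinality(rows))
```

Dimensions with many unique values, such as page paths, are the first candidates to drop if you do not need them.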

Use custom tables

If you are a GA360 user, you can ask Google Analytics to set up a custom table that includes the dimensions, metrics and segments you want to have unsampled. The set of fields and segments should then match what you have selected for the data source in Funnel.

Read more about custom tables in this Google help article.
