What is data sampling
High dimension cardinality (the number of unique values for a dimension) or a large amount of data in a data set can make a query too complex to process in full. To handle this, Google Analytics reads as much data as it can until it hits a limit, and then uses the data points it has read to estimate the rest. This happens in the Google Analytics UI as well as in the API. In the UI, many reports show a yellow shield that displays a warning when you hover over it. The warning specifies the percentage of data points that were read.
You can also receive sampled data when querying very recent data or historical data. Sampling can often be avoided by making smaller queries, for example by using fewer or less complex dimensions.
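As a rough illustration of the percentage shown in that warning, the sketch below computes the same kind of figure from an API response. It assumes a report shaped like a Reporting API v4 response, where sampled reports carry `samplesReadCounts` and `samplingSpaceSizes` under `data`. The function name `sampled_percentage` is hypothetical, and the field handling should be checked against the API version you use.

```python
from typing import Optional


def sampled_percentage(report: dict) -> Optional[float]:
    """Return the percentage of data read, or None if the report looks unsampled.

    Assumption: `report` follows the Reporting API v4 response shape, where
    both fields are lists of integer strings, one entry per date range.
    """
    data = report.get("data", {})
    read = data.get("samplesReadCounts")
    space = data.get("samplingSpaceSizes")
    if not read or not space:
        # Neither field present: treat the report as unsampled.
        return None
    total_read = sum(int(n) for n in read)
    total_space = sum(int(n) for n in space)
    if total_space == 0:
        return None
    return 100.0 * total_read / total_space


# Example with made-up numbers: 250k of 1M sessions were read.
example = {"data": {"samplesReadCounts": ["250000"], "samplingSpaceSizes": ["1000000"]}}
pct = sampled_percentage(example)
```

A return value of `None` here only means the sampling fields were absent, which this sketch takes to mean the report was not sampled.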
The amount of data in the queried date range is one reason your data can be sampled. The thresholds are listed below.
Multi-Channel Funnels reports
- 1M conversions
All other reports
- Analytics Standard: 500k sessions
- Analytics 360: 1M sessions
- Analytics 360 using resource based quota: 100M sessions
Read more about the thresholds in this Google help article.
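The thresholds above can be written down as a small lookup, which may help when estimating whether a query is likely to be sampled. This is only a sketch: the tier keys and the function name `may_be_sampled` are hypothetical labels, and the numbers are copied from the list above rather than read from any API.

```python
# Thresholds from the list above. Multi-Channel Funnels counts conversions,
# all other reports count sessions.
SAMPLING_THRESHOLDS = {
    "mcf": 1_000_000,                   # Multi-Channel Funnels reports (conversions)
    "standard": 500_000,                # Analytics Standard (sessions)
    "360": 1_000_000,                   # Analytics 360 (sessions)
    "360_resource_quota": 100_000_000,  # Analytics 360 using resource based quota
}


def may_be_sampled(count: int, tier: str) -> bool:
    """Return True if `count` for the date range exceeds the tier's threshold."""
    return count > SAMPLING_THRESHOLDS[tier]


# Example: 750k sessions exceeds the Standard threshold but not the 360 one.
standard_result = may_be_sampled(750_000, "standard")
ga360_result = may_be_sampled(750_000, "360")
```

Because sampling also depends on query complexity, staying under a threshold does not guarantee unsampled data.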
Actions we take
- We always ask for the highest precision when querying data. We do this by sending the LARGE or HIGHER_PRECISION parameter value (the same setting, named differently in different APIs). This makes reports slightly slower but more accurate.
- We split reports that exceed the sampling threshold into multiple smaller reports. The shortest date range currently supported by GA is one day.
- For GA360 users, we automatically use resource based quota where we have encountered sampled data (not available for the MCF report).
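The actions above can be sketched for someone querying the API directly. The example splits a date range into one-day requests and asks for the highest precision in each. It assumes the Reporting API v4 request shape (`reportRequests`, `samplingLevel`, `useResourceQuotas`). The helper names are hypothetical, and it splits every range into days, which is not necessarily how Funnel splits reports.

```python
from datetime import date, timedelta
from typing import List


def split_into_days(start: date, end: date) -> List[date]:
    """Return every day from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def build_daily_requests(
    view_id: str,
    start: date,
    end: date,
    metrics: List[str],
    dimensions: List[str],
    use_resource_quotas: bool = False,
) -> List[dict]:
    """Build one request body per day, each asking for the highest precision."""
    bodies = []
    for day in split_into_days(start, end):
        iso = day.isoformat()
        bodies.append({
            "reportRequests": [{
                "viewId": view_id,
                "dateRanges": [{"startDate": iso, "endDate": iso}],
                "metrics": [{"expression": m} for m in metrics],
                "dimensions": [{"name": d} for d in dimensions],
                # "LARGE" is the v4 name; older APIs call it HIGHER_PRECISION.
                "samplingLevel": "LARGE",
            }],
            # Assumed to apply to GA360 properties only.
            "useResourceQuotas": use_resource_quotas,
        })
    return bodies


# Example with a placeholder view ID: three days give three request bodies.
requests_for_range = build_daily_requests(
    "123456", date(2020, 1, 1), date(2020, 1, 3), ["ga:sessions"], ["ga:source"]
)
```

Each body would then be sent as a separate report request and the daily results combined afterwards. Note that metrics such as unique users cannot simply be added up across days.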
Actions you can take
Using custom tables
If you are a GA360 user, you can ask Google Analytics to set up a custom table that includes the dimensions and metrics (and segments) you want to have unsampled. The set of fields and segments should then match what you have selected for the data source you have in Funnel.
Read more about custom tables in this Google help article.