In data analysis, sampling is the practice of analysing a subset of the data in order to uncover meaningful information about the larger data set. For example, if you wanted to estimate the number of trees in a 100-acre area where the distribution of trees was fairly uniform, you could count the trees in 1 acre and multiply by 100, or count the trees in half an acre and multiply by 200, to get a reasonable estimate for the entire 100 acres.
When you request a large amount of data from Google Analytics, Google analyses a smaller portion of that data and scales the result up to estimate the total. For example, if a sample of 100,000 sessions contains 1,000 transactions, you would expect around 5,000 transactions for 500,000 sessions.
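The arithmetic behind this kind of estimate can be sketched in a few lines of Python. This is only an illustration of proportional scaling under the assumption that the sample is representative; the function name `extrapolate` is hypothetical and this is not how Google Analytics computes its estimates internally.

```python
def extrapolate(sample_count, sample_size, total_size):
    """Scale a count observed in a sample up to the full data set.

    Assumes the sample is representative of the whole, which is the
    same assumption as in the tree-counting example.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return sample_count * total_size / sample_size


# 1,000 transactions in 100,000 sessions, scaled to 500,000 sessions
print(extrapolate(1_000, 100_000, 500_000))  # 5000.0
```

The less uniform the underlying data is, the further such an estimate can drift from the true total.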
You can learn more about sampling in Google Analytics in their help center.
Sampled data in Funnel
The data that Funnel pulls in from Google Analytics can be sampled, depending on how complex the Google Analytics data is. In general, Google will look at a maximum of 500,000 sessions when calculating the results of a particular query. The number of sessions a query covers depends on the date range and on which dimensions and metrics are included.
This means that if you look at one dimension and one metric in Google Analytics, say transactions by source, the data is less likely to be sampled. If you look at three dimensions with ten metrics, there's a good chance that the data will be sampled.
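To get a feel for what the 500,000-session figure means, the sketch below estimates what share of the sessions in a date range would be examined. It is a deliberate simplification: it assumes the limit is a hard cap applied to the total session count, and it ignores the effect of dimensions and metrics described above. The names `SESSION_LIMIT` and `sampled_fraction` are hypothetical.

```python
SESSION_LIMIT = 500_000  # the general maximum quoted above


def sampled_fraction(total_sessions, limit=SESSION_LIMIT):
    """Rough share of sessions examined, assuming a hard cap at `limit`."""
    if total_sessions <= 0:
        raise ValueError("total_sessions must be positive")
    return min(1.0, limit / total_sessions)


print(sampled_fraction(200_000))    # 1.0  (under the limit)
print(sampled_fraction(2_000_000))  # 0.25 (a quarter of the sessions)
```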
Whether data is sampled in Funnel or not is beyond our control. We can never guarantee unsampled data, but we do what we can to avoid sampling. This includes requesting small batches of data and using the HIGHER_PRECISION value for the samplingLevel attribute, which you can read more about here.
With the above in mind, when you view Google Analytics data in Funnel that covers more than 500,000 sessions, there's a good chance that the metric values will differ slightly.
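As a rough illustration of the two techniques just mentioned, the sketch below splits a date range into small batches and builds one set of query parameters per batch, each with samplingLevel set to HIGHER_PRECISION. It is not Funnel's actual implementation. It assumes the parameter names of the Google Analytics Core Reporting API v3 (where HIGHER_PRECISION is a valid samplingLevel value), the view ID is a made-up placeholder, and the helpers `date_batches` and `build_queries` are hypothetical. The code only builds the parameters and does not send any requests.

```python
from datetime import date, timedelta


def date_batches(start, end, days_per_batch=1):
    """Yield (batch_start, batch_end) pairs covering start..end inclusive."""
    current = start
    while current <= end:
        batch_end = min(current + timedelta(days=days_per_batch - 1), end)
        yield current, batch_end
        current = batch_end + timedelta(days=1)


def build_queries(view_id, start, end, metrics, dimensions, days_per_batch=1):
    """Build one parameter dict per batch (assumed Core Reporting API v3 names)."""
    return [
        {
            "ids": f"ga:{view_id}",
            "start-date": batch_start.isoformat(),
            "end-date": batch_end.isoformat(),
            "metrics": ",".join(metrics),
            "dimensions": ",".join(dimensions),
            "samplingLevel": "HIGHER_PRECISION",
        }
        for batch_start, batch_end in date_batches(start, end, days_per_batch)
    ]


# Hypothetical view ID; three one-day batches for transactions by source
queries = build_queries(
    "12345678",
    date(2023, 1, 1),
    date(2023, 1, 3),
    metrics=["ga:transactions"],
    dimensions=["ga:source"],
)
for query in queries:
    print(query["start-date"], query["end-date"], query["samplingLevel"])
```

Smaller batches mean each query covers fewer sessions, which makes it less likely that a single query exceeds the sampling limit, at the cost of more requests.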