Before we go into the best practices it is important to understand the nature of the data that is collected from different marketing platforms. For example, most of the advertising platforms already have aggregated their metrics to a particular day, Google Ads gives the number of clicks on that day instead of each click so the data is not raw but aggregated per day. Since the metrics are already aggregated, the value may occasionally be updated one or more days after a click occurs. This data takes a few days to finalize due to conversions occurring days after the initial click and invalid traffic that needs to be removed.

Below are the things to keep in mind when configuring exports

  • Use case: As a digital marketer, start by defining the business question which you want to answer, then export only the fields which are required. If your use case is small with 5-10 business questions then you can probably include all your fields in one export.
    But, if the scope of the business need is wide then it is better to divide these exports into different business areas. A business intelligence analyst might split their exports by source types, to follow their organizational data warehouse architecture.

  • Stakeholders: Think about which teams/processes in your organization will be dependent on the exported data. If changes performed by one team to the export configuration can affect the other team, it might be a better idea to split them into two different exports.

  • Data enrichment: If you just want to enrich the data already available in your data warehouse with marketing data then it might be best to export the data based on platform type. Harmonize some of the fields in Funnel for platforms like Facebook ads, Google Ads, Tik-Tok, Twitter, and export them to different tables. Later on, this data can be joined using Date.

For companies that want to build robust and reliable data analytics, they must take time to carefully plan on how to organize their exports and what to include in them. For a BI team, it is critical to understand which metrics are coming from which sources for better data governance which can avoid future headaches.

When you have i) growing business needs with multiple KPIs to track, ii) data is managed by another team, and iii) reporting is centralized it’s almost always a good idea to split your exports into platform types and export them into different tables.

Some of the advantages when this is done right:

  • General Sanity: Building an analytics stack is already a complex procedure, nobody wants to make it even harder by adding fields that may/may not be used, but still have to be maintained.

  • Dependence: If all my fields are in one table, then all the processes which are dependent on Funnel data will be affected. For example, if the Marketing team wants to change a field that is only for a specific use case, it might affect the bigger analytics team pipeline/workload (which might take hours to run)

  • Size and Cost: Modern data warehouses are built to handle huge amounts of data, but adding general inefficiencies at an early stage can be costly over a long period. For example, a table with 1000 columns for the same date range will take more space than 10 tables with 100 columns each for the same data range.

  • Reduce noise: When it comes to data, less is often more. It’s not about the quantity of data exported, it’s about exporting good quality, meaningful data that key business decisions can be based on.

Frequently asked questions regarding Funnel exports

How to export data that has changed since the last export (deltas)?

Funnel makes sure that whenever you look at the exported data you get the correct picture of what is happening across different sources which are connected i.e. a single source of truth. Our exports by default will update time ranges where data has changed. A change in exported data may be triggered by the addition of (i) a new data source (where we typically collect 2 years of historical data), (ii) updated data in a data source (within the look-back window), or (iii) changes in dimension/metric rules within Funnel that affect their data.

In general, it's not possible to identify individual records that change, but only time ranges. The reason is that depending on their data source configuration, data is already aggregated from the source e.g. Facebook Ads data by date and campaign.

But, If you would like to get only the data that has changed from the last export, then you have to implement that logic in your data pipeline. This can be achieved in multiple different ways depending on your ingestion pipeline. The two most common ways to achieve this is a) Include a RunId when setting a file export, that creates unique files for each export. b) Think Funnel data export is continuously updated, then take the latest export and then compare with what you have in your data warehouse to get the difference.

Both these approaches have their limitations and require considerable engineering work, using a direct export to data warehouse from Funnel can help mitigate this.

Where can I get the primary keys to the data rows?

Most of the advertising platforms already have aggregated their metrics to a day, For example, Google Ads gives the number of clicks on that day instead of each click so the data is not raw but aggregated on a day from the source. There are no real "primary keys" in this data and a single logical "conversion" may move both between dates and between ads/campaigns in their reporting as it evolves until the data is settled.
If you’d like, you can combine different dimensions to create a unique key that could be possible to identify a set of observations, though make sure to carefully select all the fields, as this process is error-prone and could lead to wrong results.

My data is exported by each table per source type, how do I join them?

It is a best practice to harmonize fields like "campaign" and "impressions" across their sources in Funnel before exporting those, but if exporting a large number of fields we recommend having separate exports with native fields for each platform. After exporting them into different tables this data can be joined using date, as that field would be common across all the exports.

Which files/tables have changed since the last export?

If you are using Funnel’s file storage options such as Google Cloud Storage, Amazon S3, Azure Blob Storage, and SFTP. To enable better data cataloging, pipeline management, and error handling Funnel provides an option to export a summary file with a few options of file types such as JSON, CSV, and TXT. The summary file contains the information about the files that were exported and if they were updated or not since the last run.
This information can be used to design a data pipeline, where the next step of the pipeline can be triggered based on the summary file.
If you are using a BigQuery export you can choose to create a summary table containing the same information in the export

What does "Only export periods with updated data" mean?

To simplify it a bit we mark each day from each source as ‘clean’ or ‘dirty’ when exporting, we then export from the first of the ‘dirty’ days. Since we’re exporting monthly files, this, in turn, results in all months from the month with the first ‘dirty’ daily source file. We have an article describing it here

This option is enabled by default for new exports and means that Funnel will try to reduce the number of files that are exported. It works like this:

  • If account configuration is changed (e.g. custom dimensions or metrics, new data sources) all data is exported.

  • If the export configuration is changed all data is exported.

  • Otherwise, start from the month that contains the oldest new data, so if Adwords data has changed in Jan 2017 the export will start from 2017-01-01.

Hope this article helps you to plan and optimally configure your data exports.

Did this answer your question?