If you would like to access your Funnel data in Amazon Redshift, you can do so by setting up an S3 export and importing the exported files into Redshift. Funnel's Data Warehouse connector can export all your data to S3, so once you have a file in your bucket, all you need to do is configure a Lambda function to periodically import the data into Redshift.

Set up Redshift

Once your export has run, you'll find a number of files in your S3 bucket, with names such as funnel_2018_01.tsv and so on, one file per month. Along with these you'll also find a file called schema.funnel_data_{startYYYY}_{startMM}.sql, which contains the CREATE TABLE syntax for a table that can hold your exported data. Run that query manually in Redshift and then continue to set up your Lambda import function.
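
As an illustration only, the generated schema file might look roughly like the sketch below. The table name matches the example used later in this article, but the column names and types here are assumptions; the actual file reflects the fields you selected for your export, so always run the generated file rather than this sketch.

-- Illustrative sketch only: the real schema.funnel_data_{startYYYY}_{startMM}.sql file is
-- generated by Funnel, and its columns depend on the fields you chose for the export.
CREATE TABLE funnel_data (
  report_date DATE,
  data_source VARCHAR(255),
  campaign VARCHAR(1024),
  impressions BIGINT,
  clicks BIGINT,
  cost DOUBLE PRECISION
);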

Set up Lambda

We have created an example Lambda module that should handle the import for you; all you need to do is set up a Lambda function in AWS. The file redshift-import.zip should be ready to upload to AWS Lambda.

The steps needed in Lambda are:

  1. Create a new Lambda function (Blank function)
  2. Configure Lambda:
    Runtime: Node.js 6.10
    Code entry: Upload (redshift-import.zip)
    Environment variables: REDSHIFT_CONNECTION_STRING and SQL (see below)
    Handler: redshift-import.handler
    Advanced: timeout >= 1 minute (see below)
    VPC: according to your setup, so the function can access Redshift
  3. Configure triggers (See below)

Environment variables

REDSHIFT_CONNECTION_STRING should contain the full connection string in the format: postgresql://user:password@host:port/database
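
For example, with purely hypothetical values, the variable could look like this:

postgresql://funnel_user:s3cretPassw0rd@my-cluster.abc123xyz789.eu-west-1.redshift.amazonaws.com:5439/analytics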

SQL should contain the full SQL script you would like to run (all on one line). For example:

set search_path to [SCHEMA]; begin; delete from funnel_data; copy funnel_data from 's3://[BUCKET]/funnel_data_' compupdate off delimiter as '\t' credentials '[CREDENTIALS]'; commit;

NOTE: In this example we have used the TSV format for the S3 export in Funnel, but you can just as well use CSV as long as you adjust the delimiter in the above SQL accordingly. You might also need to adjust the Redshift table name and the copy file pattern according to how you set up your export (funnel_data and funnel_data_ respectively, in the above example).
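
For instance, if your export is set to CSV instead of TSV, the same script could look like the sketch below, where the table name, file prefix and placeholders are the same assumptions as in the example above:

set search_path to [SCHEMA]; begin; delete from funnel_data; copy funnel_data from 's3://[BUCKET]/funnel_data_' compupdate off delimiter as ',' credentials '[CREDENTIALS]'; commit;

The [CREDENTIALS] placeholder is a standard Redshift credentials string, for example 'aws_iam_role=arn:aws:iam::123456789012:role/MyRedshiftRole' if your cluster uses an IAM role to read from S3.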

Configure Triggers

You can use any of the Lambda trigger events available in AWS. The most common setups are a CloudWatch Event with a schedule such as rate(1 day), or an S3 trigger that runs the Lambda function when an export completes. Each time an export completes, a file called lastRun.funnel_data_{startYYYY}_{startMM}.txt or similar will be created, and you can set up an S3 trigger on this file.

Timeout settings

The longest timeout you can configure in Lambda is 5 minutes. If you have a lot of data, this may not be long enough to complete the COPY operation in Redshift, in which case you will see a timeout message from the Lambda function. However, our tests indicate that (with Node.js in Lambda) the query will still run to completion within Redshift, as Lambda/Node will not signal any abort to the database. If you don't mind the timeout message, setting the timeout to 1 minute should be more than enough to start the COPY operation in Redshift.

Multiple exports in same database

To handle multiple exports from Funnel and import them into the same Redshift database, you either need to work with different schemas (e.g. set search_path to [SCHEMA] before running our provided SQL) or adjust the SQL used to create the tables and copy the data so that each export uses unique table names.
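
As a sketch of the schema-based approach, assuming two exports that should live side by side (the schema names export_a and export_b and the bucket placeholders are examples, not anything Funnel generates for you), each Lambda function would get its own SQL variable along these lines:

-- One-time setup, run manually in Redshift:
create schema if not exists export_a;
create schema if not exists export_b;
-- (run the generated CREATE TABLE script once per schema, with search_path set accordingly)

-- SQL environment variable for the Lambda importing export A:
set search_path to export_a; begin; delete from funnel_data; copy funnel_data from 's3://[BUCKET_A]/funnel_data_' compupdate off delimiter as '\t' credentials '[CREDENTIALS]'; commit;

-- SQL environment variable for the Lambda importing export B:
set search_path to export_b; begin; delete from funnel_data; copy funnel_data from 's3://[BUCKET_B]/funnel_data_' compupdate off delimiter as '\t' credentials '[CREDENTIALS]'; commit;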
