- The Athena Query instance and the AWS S3 bucket should be in the same region
- Funnel data exports are compressed by default. Using a compressed export format helps keep costs low when you query large amounts of data. Read more about How to set up an export to AWS S3.
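The same-region requirement can be checked before you continue. The following is a minimal Python sketch, not an official procedure. It relies on the S3 `get_bucket_location` response, where `LocationConstraint` is empty for buckets in us-east-1 and may be the legacy value `EU` for eu-west-1. The helper names and the bucket name in the usage comment are hypothetical.

```python
from typing import Optional


def bucket_region(location_constraint: Optional[str]) -> str:
    """Map the LocationConstraint returned by S3 to a region name."""
    if location_constraint is None:
        # S3 reports no constraint for buckets in us-east-1.
        return "us-east-1"
    if location_constraint == "EU":
        # Legacy value that S3 may return for eu-west-1.
        return "eu-west-1"
    return location_constraint


def same_region(location_constraint: Optional[str], athena_region: str) -> bool:
    """Return True when the bucket and Athena are in the same region."""
    return bucket_region(location_constraint) == athena_region


# Usage with boto3 (needs AWS credentials; the bucket name is a placeholder):
# import boto3
# s3 = boto3.client("s3")
# constraint = s3.get_bucket_location(Bucket="my-funnel-export")["LocationConstraint"]
# print(same_region(constraint, boto3.session.Session().region_name))
```

If the check returns False, consider exporting to a bucket in the region where you run Athena.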
Get started with setting up a crawler
- Connect a new data source by pressing the Connect data source button. You should then see options for choosing the location of the data and how to read its metadata catalog. Select Query data in Amazon S3 for the location and AWS Glue Data Catalog for the metadata.
- The AWS Glue Data Catalog requires a crawler that retrieves schema information automatically. Select that option and press Connect to AWS Glue, which opens a new window.
- Set up a new crawler for your data
- Choose an appropriate name for your crawler like 'funnel-data-s3-crawler'.
- Select Data stores as the source type
- Choose S3 as a data store, and leave the Connection blank
- Select Specified path in my account, and paste the path where your data is located into the Include path field
- Select No if you don't have any other data store to add
- Create a new IAM role with an appropriate name for the crawler to access the data files
- Choose a daily frequency for the crawler to run
- Add a database where the output data table will be created, like 'funnel-data'
- Review and submit. Then select the crawler you just created and press Run Crawler to execute it for the first time (this might take a few minutes)
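The crawler steps above can also be scripted. The sketch below builds the request for the AWS Glue `create_crawler` call in boto3. It is an illustration under assumptions, not the only way to do this: the role ARN, the S3 path, and the 06:00 UTC run time are placeholders, and the IAM role and the 'funnel-data' database are assumed to exist already.

```python
def build_crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Build keyword arguments for the Glue create_crawler call."""
    return {
        "Name": name,
        "Role": role_arn,            # IAM role the crawler uses to read the data files
        "DatabaseName": database,    # database where the output table is created
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        "Schedule": "cron(0 6 * * ? *)",  # daily; 06:00 UTC is an arbitrary choice
    }


request = build_crawler_request(
    name="funnel-data-s3-crawler",
    role_arn="arn:aws:iam::123456789012:role/funnel-crawler-role",  # placeholder
    database="funnel-data",
    s3_path="s3://my-funnel-export/data/",  # placeholder path
)

# Sending the request needs boto3 and AWS credentials:
# import boto3
# glue = boto3.client("glue")
# glue.create_crawler(**request)
# glue.start_crawler(Name=request["Name"])  # first run; may take a few minutes
```

Keeping the request in a plain dictionary makes it easy to review the settings before anything is created in your account.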
Start querying your data
If you go to Athena, you should see the database called 'funnel-data' under the Data source: AwsDataCatalog. Start by selecting your database and modifying your query.
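A first query can also be started from code. The sketch below prepares the arguments for the Athena `start_query_execution` call in boto3. The table name `funnel_export` and the results location are hypothetical: the crawler chooses the actual table name based on your S3 path, and Athena needs an S3 location where it can write query results.

```python
def build_query_request(database: str, table: str, output_location: str, limit: int = 10) -> dict:
    """Build keyword arguments for the Athena start_query_execution call."""
    # Double quotes keep table names with special characters valid in a SELECT.
    query = f'SELECT * FROM "{table}" LIMIT {int(limit)}'
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database, "Catalog": "AwsDataCatalog"},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


query_request = build_query_request(
    database="funnel-data",
    table="funnel_export",                             # hypothetical table name
    output_location="s3://my-athena-results/funnel/",  # placeholder results bucket
)
print(query_request["QueryString"])

# Running the query needs boto3 and AWS credentials:
# import boto3
# athena = boto3.client("athena")
# execution = athena.start_query_execution(**query_request)
# print(execution["QueryExecutionId"])
```

A small LIMIT is a cheap way to confirm that the crawler detected the expected columns before you run larger queries.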