Amazon Athena is an interactive query service for analysing data in Amazon S3 using standard SQL. Athena is serverless, and you pay only for the queries that you run.
In Athena, you point to your data in Amazon S3, define the schema, and start querying using standard SQL. There’s no need for complex ETL jobs to prepare your data for analysis.
Athena integrates out of the box with the AWS Glue Data Catalog, allowing you to create a unified metadata repository across various services, crawl data sources to discover schemas, populate your catalog with new and modified table and partition definitions, and maintain schema versioning.
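The point, define, query flow described above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the database, table, column, and bucket names are hypothetical, the data is assumed to be comma-delimited text, and `run_query` assumes the third-party boto3 library plus valid AWS credentials.

```python
def build_create_table_ddl(database, table, columns, s3_location):
    """Build a CREATE EXTERNAL TABLE statement for comma-delimited files in S3.

    `columns` is a list of (name, type) pairs. All names are illustrative.
    """
    column_defs = ",\n  ".join(f"{name} {col_type}" for name, col_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"  {column_defs}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'"
    )


def run_query(sql, database, output_location):
    """Submit a statement to Athena and return its query execution id.

    Assumes boto3 is installed and credentials are configured; the import is
    kept inside the function so the rest of the sketch runs without it.
    """
    import boto3

    client = boto3.client("athena")
    response = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical schema for web server logs stored under an example bucket.
    ddl = build_create_table_ddl(
        "weblogs",
        "requests",
        [("ip", "string"), ("status", "int")],
        "s3://example-bucket/logs/",
    )
    print(ddl)
```

Once the table is registered, an ordinary `SELECT` against `weblogs.requests` could be submitted through the same `run_query` helper, with query results written to the S3 output location you supply.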
Amazon Redshift Spectrum resides on dedicated Amazon Redshift servers that are independent of your cluster. Redshift Spectrum pushes many compute-intensive tasks, such as predicate filtering and aggregation, down to the Redshift Spectrum layer. Thus, Redshift Spectrum queries use much less of your cluster’s processing capacity than other queries. Redshift Spectrum also scales intelligently. Based on the demands of your queries, Redshift Spectrum can potentially use thousands of instances to take advantage of massively parallel processing.
You create Redshift Spectrum tables by defining the structure for your files and registering them as tables in an external data catalog. The external data catalog can be AWS Glue, the data catalog that comes with Amazon Athena, or your own Apache Hive metastore.
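As a rough illustration of registering Spectrum tables in an external data catalog, the sketch below builds the two Redshift statements typically involved: one that maps an external schema onto a catalog database, and one that defines an external table over files in S3. The schema, database, table, role ARN, and bucket names are all hypothetical, and the files are assumed to be comma-delimited text.

```python
def build_external_schema_ddl(schema, catalog_database, iam_role_arn):
    """Build a statement that maps a Redshift schema onto a data catalog database."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        "FROM DATA CATALOG\n"
        f"DATABASE '{catalog_database}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "CREATE EXTERNAL DATABASE IF NOT EXISTS"
    )


def build_external_table_ddl(schema, table, columns, s3_location):
    """Build a statement that defines an external table over delimited files in S3."""
    column_defs = ",\n  ".join(f"{name} {col_type}" for name, col_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n"
        f"  {column_defs}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{s3_location}'"
    )


if __name__ == "__main__":
    # Placeholder role ARN and names, for illustration only.
    print(build_external_schema_ddl(
        "spectrum", "weblogs", "arn:aws:iam::123456789012:role/ExampleSpectrumRole"
    ))
    print(build_external_table_ddl(
        "spectrum",
        "requests",
        [("ip", "varchar(45)"), ("status", "integer")],
        "s3://example-bucket/logs/",
    ))
```

The statements would be run against the Redshift cluster with whichever SQL client you already use. After that, `spectrum.requests` can be queried and joined with local Redshift tables.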
Both are serverless and pay-as-you-go. Query processing costs are the same, at around $5 per terabyte scanned. However, with Spectrum you also pay for the Redshift cluster, which can cost between $0.25 and $13.00 an hour depending on the vCPUs, memory and storage of the cluster.
If you are already a Redshift customer, then moving data out of Redshift into S3 and using Spectrum offers significant cost savings for storing large volumes of data. The processing is unchanged.
If you are already using Athena, then stick with Athena, as it offers much the same capabilities as Spectrum without the cost of running a Redshift cluster. Athena is being enhanced with federated queries and with ML (the ability to call SageMaker).
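To make the cost comparison concrete, here is a back-of-the-envelope sketch using the figures quoted above. It assumes all prices are in US dollars, that the cluster runs continuously, and that a month has roughly 730 hours. The function names and the example workload are illustrative, and real pricing varies by region and node type.

```python
USD_PER_TB_SCANNED = 5.0   # approximate scan price quoted above
HOURS_PER_MONTH = 730      # assumption: cluster runs all month


def scan_cost_usd(terabytes_scanned, usd_per_tb=USD_PER_TB_SCANNED):
    """Cost of the data scanned by queries; applies to Athena and Spectrum alike."""
    return terabytes_scanned * usd_per_tb


def spectrum_monthly_cost_usd(terabytes_scanned, cluster_usd_per_hour,
                              hours=HOURS_PER_MONTH):
    """Scan cost plus the cost of keeping the Redshift cluster running."""
    return scan_cost_usd(terabytes_scanned) + cluster_usd_per_hour * hours


if __name__ == "__main__":
    tb_scanned = 10  # hypothetical monthly workload
    print("Athena:  ", scan_cost_usd(tb_scanned))
    print("Spectrum:", spectrum_monthly_cost_usd(tb_scanned, cluster_usd_per_hour=0.25))
```

Under these assumptions, the scan charge is identical for both services, and the difference comes entirely from the hourly cluster cost.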
It is also worth considering which analytic tool you plan to use. Not all tools are compatible with Athena and Spectrum.
The choice may boil down to the complexity of the task at hand:
With Athena you can connect to data sources other than S3
With Spectrum you can connect to external tables: