AWS Data Pipeline

Last Updated: 25-Feb-2021

AWS Pipeline Overview

AWS Data Pipeline is a web-based ETL service for processing and moving data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals. It integrates with AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon EMR.
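A pipeline is described as a graph of objects: schedules, data nodes (where data lives), and activities (the work to do). The sketch below mirrors the shape of a pipeline definition, assuming a daily copy between two S3 locations; the bucket names and paths are hypothetical, and a real CopyActivity would also need a `runsOn` compute resource, which is omitted here for brevity.

```python
# Minimal pipeline definition, expressed as a Python dict that mirrors the
# pipeline-definition JSON format. Bucket names/paths are hypothetical.
pipeline_definition = {
    "objects": [
        {   # daily schedule driving the activity
            "id": "DailySchedule",
            "type": "Schedule",
            "period": "1 day",
            "startDateTime": "2021-02-25T00:00:00",
        },
        {   # source data node in S3
            "id": "S3Input",
            "type": "S3DataNode",
            "directoryPath": "s3://example-input-bucket/raw/",
            "schedule": {"ref": "DailySchedule"},
        },
        {   # destination data node in S3
            "id": "S3Output",
            "type": "S3DataNode",
            "directoryPath": "s3://example-output-bucket/processed/",
            "schedule": {"ref": "DailySchedule"},
        },
        {   # activity copying data between the two nodes
            # (a real pipeline would also set "runsOn" to a compute resource)
            "id": "CopyRawToProcessed",
            "type": "CopyActivity",
            "input": {"ref": "S3Input"},
            "output": {"ref": "S3Output"},
            "schedule": {"ref": "DailySchedule"},
        },
    ]
}

def referenced_ids(definition):
    """Collect every object id referenced via a {"ref": ...} field."""
    refs = set()
    for obj in definition["objects"]:
        for value in obj.values():
            if isinstance(value, dict) and "ref" in value:
                refs.add(value["ref"])
    return refs
```

Objects point at each other by id (`{"ref": ...}`), which is how the service builds its dependency graph; the helper above makes it easy to check that every reference resolves to a defined object.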

AWS Pipeline Benefits

  • reliable
  • easy to use
  • flexible
  • scalable
  • transparent
  • low cost

AWS Pipeline Features

  • distributed, highly available infrastructure designed for fault-tolerant execution
  • automatic retry capability
  • configured through a visual interface
  • library of templates
  • scheduling
  • dependency tracking
  • error handling
  • work can be dispatched to one machine or many in parallel
  • full execution logs are automatically delivered to Amazon S3
  • full control over the compute resources

AWS Pipeline Costs

  • low monthly rate: low-frequency jobs cost $0.60 per month
  • high-frequency jobs cost $1.00 per month
  • high frequency means an activity that runs more than once per day
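Using the per-activity rates above, a rough monthly estimate is simple arithmetic. Note this covers only the Data Pipeline charge itself; the EC2 or EMR resources the pipeline launches are billed separately.

```python
# Per-activity monthly rates from the list above (activities running on AWS).
LOW_FREQUENCY_RATE = 0.60   # USD/month, runs at most once per day
HIGH_FREQUENCY_RATE = 1.00  # USD/month, runs more than once per day

def monthly_cost(low_frequency_activities, high_frequency_activities):
    """Estimate the monthly Data Pipeline charge for a set of activities."""
    return (low_frequency_activities * LOW_FREQUENCY_RATE
            + high_frequency_activities * HIGH_FREQUENCY_RATE)
```

For example, a pipeline with three daily activities and two hourly activities would cost roughly $3.80 per month at these rates.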