Summary - Example AWS pipeline for predicting stock market data. Python code uses the Yahoo Finance API to stream stock market ticker data into AWS, where it flows into Amazon SageMaker, a machine learning platform for building, training, and deploying models. The streamed data is processed and used to predict time series data.
Project Resource Use Cases:
Python code leveraging Yahoo Finance API: Python code that streams real-time stock market data into Amazon Kinesis.
Amazon Elastic Container Service (ECS): Container service running the Python Stock Market Streaming Code.
Amazon Kinesis: Streaming data service that acts as the message queue for the real-time ticker data.
Amazon Kinesis Firehose: Service that takes the streaming data from Kinesis and delivers it to a destination in near real time. In this project, Firehose delivers the data to Redshift.
Amazon Redshift: Data warehouse chosen for its scalability in data processing. In this project, it is used to store all the collected streaming data in one place before distributing it to other data analysis resources.
Amazon SageMaker AI: An AWS service that provides a single interface for building, training, and visualizing machine learning models.
AWS Glue: A service that prepares, transforms, and moves data. Used in this project to split the data by ticker into individual data sets for consumption by SageMaker.
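The streaming step in the list above (Python code pushing ticker data into Kinesis) could look roughly like the sketch below. It is a minimal example, not the project's actual code: the record fields, the function names `build_record` and `send_quote`, and the choice of the ticker as partition key are assumptions made for illustration. Fetching the quotes from Yahoo Finance is left out because the client library used for that is not specified here.

```python
import json
from datetime import datetime, timezone


def build_record(ticker, price, volume, timestamp=None):
    """Build the put_record arguments for one quote (field names are assumed)."""
    if timestamp is None:
        timestamp = datetime.now(timezone.utc)
    payload = {
        "ticker": ticker,
        "price": float(price),
        "volume": int(volume),
        "timestamp": timestamp.isoformat(),
    }
    return {
        # Newline-delimited JSON keeps records separable downstream.
        "Data": (json.dumps(payload) + "\n").encode("utf-8"),
        # Assumption: partition by ticker so each symbol's quotes stay ordered.
        "PartitionKey": ticker,
    }


def send_quote(kinesis_client, stream_name, ticker, price, volume):
    """Send one quote to the stream; kinesis_client is boto3.client("kinesis")."""
    record = build_record(ticker, price, volume)
    return kinesis_client.put_record(StreamName=stream_name, **record)
```

With boto3 installed and AWS credentials configured, a call such as `send_quote(boto3.client("kinesis"), "stock-ticker-stream", "AAPL", 189.5, 1200)` would publish one record; the stream name here is hypothetical.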
Note: AWS Glue scheduled triggers can run no more often than every 5 minutes, which is enough to automate the flow of data in batches. For real-time processing, AWS Lambda can be used as an alternative to Glue.
The same process (exporting data from Redshift, separating the tickers into individual data sets, and writing the output to S3 for SageMaker consumption) can be implemented with Lambda for real-time data pipelines.
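The transformation described above, whether it runs in Glue or in Lambda, comes down to grouping rows by ticker and writing one data set per ticker to S3. The sketch below illustrates that logic in plain Python under a few assumptions: rows arrive as dictionaries with a `ticker` field, each data set is written as CSV with a header row, and the function names, bucket, and `tickers/` key prefix are hypothetical. Reading the rows out of Redshift is not shown.

```python
import csv
import io
from collections import defaultdict


def split_by_ticker(rows):
    """Group row dictionaries by their 'ticker' field (assumed column name)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["ticker"]].append(row)
    return dict(groups)


def to_csv(rows, fieldnames):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def upload_datasets(s3_client, bucket, rows, fieldnames, prefix="tickers/"):
    """Write one CSV object per ticker; s3_client is boto3.client("s3")."""
    keys = []
    for ticker, ticker_rows in split_by_ticker(rows).items():
        key = f"{prefix}{ticker}.csv"
        body = to_csv(ticker_rows, fieldnames).encode("utf-8")
        s3_client.put_object(Bucket=bucket, Key=key, Body=body)
        keys.append(key)
    return keys
```

Keeping the grouping and serialization separate from the upload makes the same functions reusable from either a Glue Python job or a Lambda handler.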