Real-time Data Processing with Amazon DynamoDB Streams and AWS Lambda

Post on 14-Aug-2015

TRANSCRIPT

<ol><li> 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Presenter: Vyom Nagrani, Sr. Product Manager, AWS Lambda. Q&amp;A Moderator: Ajay Nair, Sr. Product Manager, AWS Lambda. July 30th, 2015. Best Practices: Real-time Data Processing with Amazon DynamoDB Streams and AWS Lambda </li>
<li> 2. Amazon DynamoDB Streams: a time-ordered sequence of item-level changes. Time- and partition-ordered log. Provides a stream of inserts, deletes, and updates (old item, new item, primary key, change type). Stream items are delivered exactly once. Streams are asynchronous. Scales with your table. [Diagram: DynamoDB, DynamoDB Streams] </li>
<li> 3. Benefits of DynamoDB Streams for real-time data processing. Durability &amp; high availability: high-throughput consensus protocol, replicated across multiple AZs. Managed streams: simply enable streaming. Performance: designed for sub-second latency. Native integration with AWS Lambda: DynamoDB Triggers invoke a Lambda function to run your custom code. [Diagram: DynamoDB, DynamoDB Streams, DynamoDB Triggers, Lambda function, run custom code] </li>
<li> 4. AWS Lambda: a compute service that runs your code in response to events. Lambda functions: stateless, trigger-based code execution. Triggered by events: direct sync and async invocations, a put to an Amazon S3 bucket, a table update on Amazon DynamoDB, and many more. Makes it easy to build back-end services that perform at scale and to perform data-driven auditing, analysis, and notification. </li>
<li> 5. High performance at any scale; cost-effective and efficient. No infrastructure to manage. Pay only for what you use: Lambda automatically matches capacity to your request rate. Purchase compute in 100 ms increments. Bring your own code. Productivity-focused compute platform to build powerful, dynamic, modular applications in the cloud. Run code in a choice of standard languages. Use threads, processes, files, and shell scripts normally. Focus on business logic, not infrastructure. You upload code; AWS Lambda handles everything else.
Benefits of AWS Lambda for building a serverless data processing engine. </li>
<li> 6. DynamoDB Streams + Lambda = Database Triggers. Run multiple real-time applications in parallel. DynamoDB Streams natively supports cross-region replication. Triggers enable filtering, monitoring, auditing, notifications, aggregation, etc. No charge for the reads/polls that your AWS Lambda function makes to the DynamoDB stream associated with the table. </li>
<li> 7. Walkthrough of a simple stream logging application workflow. [Diagram: new table updates, Amazon DynamoDB, Streams, AWS Lambda, Amazon CloudWatch Logs] </li>
<li> 8-18. Walkthrough of setting up DynamoDB Triggers and Lambda functions through the AWS Console </li>
<li> 19. Today's demo: workflow of cross-region replication and real-time data auditing. [Diagram: Original Table, Data Stream, Amazon DynamoDB, AWS Lambda, Amazon DynamoDB, Amazon SNS] </li>
<li> 20.
Loop through the event array. Replicate each item to a different table. Send a notification if a record is suspicious. In both cases, wait for callbacks before exiting. </li>
<li> 21. Demo: cross-region replication and real-time data auditing using Amazon DynamoDB and AWS Lambda </li>
<li> 22. Attaching Lambda functions to DynamoDB Streams. Automatic shards: one Lambda function is invoked concurrently per DynamoDB shard. Each individual shard follows ordered processing. A given key will be present in at most one concurrently active shard. All changes (insert, remove, modify) are available on a rolling 24-hour basis. [Diagram: source, DynamoDB Streams shards, Lambda pollers and functions, destination 1, destination 2] DynamoDB Streams scales by grouping records into shards; Lambda will scale automatically. </li>
<li> 23. Attaching Lambda functions to DynamoDB Streams. Reading the stream: the stream is exposed via the familiar Amazon Kinesis Client Library interface. Read the stream using https://github.com/awslabs/dynamodb-streams-kinesis-adapter. Records can be retrieved at ~2x the rate of the table's provisioned write capacity. Automatic scaling: both DynamoDB and Lambda scale automatically with PUT rates. Default limit of 100 concurrent Lambda functions, which can be increased through the AWS Support Center. </li>
<li> 24. Performance tuning DynamoDB as an event source. Batch size: the maximum number of records that AWS Lambda will retrieve from DynamoDB at the time of invoking your function. Increasing the batch size causes fewer Lambda function invocations, with more data processed per invocation. Starting position: the position in the stream where Lambda starts reading. Set to Trim Horizon to start with the oldest record. Set to Latest to start with the most recent data. </li>
<li> 25.
Best practices for creating Lambda functions. Memory: CPU is proportional to the memory configured, so increasing memory makes your code execute faster (if CPU bound). Timeout: increasing the timeout allows for longer-running functions, but means a longer wait in case of errors. Retries: for DynamoDB Streams, Lambda has unlimited retries (until the data expires). Permission model: Lambda pulls data from DynamoDB, so no resource policy is needed, only an execution role that allows Lambda access to DynamoDB. </li>
<li> 26. Monitoring and debugging Lambda functions. Console dashboard: lists all Lambda functions; easy editing of resources, event sources, and other settings; at-a-glance metrics. Metrics in CloudWatch: requests, errors, latency, throttles. Logging in CloudWatch Logs. </li>
<li> 27. Three Next Steps. 1. Enable DynamoDB Streams for your existing DynamoDB tables. DynamoDB Streams provides a time-ordered sequence of item-level changes made to data in a table in the last 24 hours. 2. Create and test your first Lambda function. With AWS Lambda, there are no new languages, tools, or frameworks to learn. You can use any third-party library, even native ones. 3. Use AWS Lambda with DynamoDB Streams to create DynamoDB Triggers: with no infrastructure to manage, you can set up a clean and lightweight implementation of database triggers, NoSQL style! </li>
<li> 28. Thank you! Visit http://aws.amazon.com/dynamodb, the AWS blog, and the DynamoDB forum to learn more and get started using DynamoDB. Visit http://aws.amazon.com/lambda, the AWS Compute blog, and the Lambda forum to learn more and get started using Lambda. </li></ol>
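
The demo code itself is not part of this transcript. As a rough illustration of the workflow described on slide 20 (loop through the event array, replicate each item to a different table, send a notification if a record is suspicious), here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: the "amount" attribute and its threshold, the helper names is_suspicious and process_records, and the environment variable names REPLICA_REGION, REPLICA_TABLE and ALERT_TOPIC_ARN. It also assumes the stream is configured to include new item images. The Records, eventName, Keys and NewImage fields follow the DynamoDB Streams event format.

```python
import json
from typing import Any, Callable, Dict, Optional

# Hypothetical audit rule for this sketch: flag items whose numeric
# "amount" attribute exceeds this threshold.
SUSPICIOUS_AMOUNT = 10_000

Image = Dict[str, Any]


def is_suspicious(new_image: Image) -> bool:
    """True if the item's 'amount' attribute (DynamoDB type N) exceeds the threshold."""
    amount = new_image.get("amount", {}).get("N")
    return amount is not None and float(amount) > SUSPICIOUS_AMOUNT


def process_records(
    event: Dict[str, Any],
    replicate: Callable[[str, Image, Optional[Image]], None],
    notify: Callable[[str], None],
) -> Dict[str, int]:
    """Loop through the event array, replicating every change and flagging suspicious ones."""
    counts = {"replicated": 0, "notified": 0}
    for record in event.get("Records", []):
        change = record["dynamodb"]
        new_image = change.get("NewImage")  # absent for REMOVE events
        replicate(record["eventName"], change["Keys"], new_image)
        counts["replicated"] += 1
        if new_image is not None and is_suspicious(new_image):
            notify(json.dumps({"keys": change["Keys"], "event": record["eventName"]}))
            counts["notified"] += 1
    return counts


def lambda_handler(event, context):
    """Wires the logic above to a replica table and an SNS topic (names are assumptions)."""
    import os
    import boto3  # imported lazily so process_records can be exercised without AWS

    replica = boto3.client("dynamodb", region_name=os.environ["REPLICA_REGION"])
    sns = boto3.client("sns")
    table = os.environ["REPLICA_TABLE"]
    topic = os.environ["ALERT_TOPIC_ARN"]

    def replicate(event_name, keys, new_image):
        if event_name == "REMOVE":
            replica.delete_item(TableName=table, Key=keys)
        else:  # INSERT or MODIFY: overwrite the replica item with the new image
            replica.put_item(TableName=table, Item=new_image)

    def notify(message):
        sns.publish(TopicArn=topic, Message=message)

    # boto3 calls are synchronous, so all writes have completed by the time
    # the handler returns; an uncaught exception fails the whole batch.
    return process_records(event, replicate, notify)
```

Keeping the loop in process_records, separate from the AWS clients, lets the replication and auditing logic be checked locally with stub callables before the function is attached to a stream. Because an uncaught exception fails the batch, the retry behaviour described on slide 25 would apply to it.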