AWS Lambda from the Trenches

Post on 18-Jan-2017

TRANSCRIPT

<ul><li><p>+ =</p><p>AWS LAMBDA FROM THE TRENCHES</p><p>what you should know before you go to production</p></li><li><p>hi, my name is Yan Cui</p></li><li><p>@theburningmonk</p></li><li><p>lead time to someone saying thank you is the only reputation metric that matters.</p><p>- Dan North</p></li><li><p>complexity OUTSIDE the code</p><p>security</p><p>deployment</p><p>load balancing</p><p>caching</p><p>monitoring</p><p>config management</p><p>centralised logging</p><p>elastic scaling</p><p>setup server</p><p>https://www.infoq.com/presentations/complexity-simplicity-esb</p></li><li><p>THERE IS NO SERVER</p></li><li><p>automatic scaling</p></li><li><p>minimise undifferentiated heavy-lifting</p></li><li><p>simple, fast deployment</p></li><li><p>lead time to someone saying thank you is the only reputation metric that matters.</p><p>- Dan North</p></li><li><p>cost saving</p></li><li><p>not paying for idle servers</p></li><li><p>energy efficiency in DCs</p></li><li><p>easy to get started</p></li><li><p>fuelling the Yubl platform evolution</p></li><li><p>completely rebuilt search</p></li><li><p>Legacy Monolith Amazon Kinesis Amazon Lambda</p></li><li><p>Legacy Monolith Amazon Kinesis Amazon Lambda</p><p>Amazon CloudSearch Amazon API Gateway Amazon Lambda</p></li><li><p>analytics pipeline</p></li><li><p>Legacy Monolith Amazon Kinesis Amazon Lambda</p><p>Google BigQuery</p></li><li><p>Legacy Monolith Amazon Kinesis Amazon Lambda</p><p>Google BigQuery</p><p>1 developer, 2 days</p><p>design to production</p><p>(his 1st serverless project)</p></li><li><p>Legacy Monolith Amazon Kinesis Amazon Lambda</p><p>Google BigQuery</p><p>nothing ever got done this fast at Skype!</p><p>- Chris Twamley</p></li><li><p>lead time to someone saying thank you is the only reputation metric that matters.</p><p>- Dan North</p></li><li><p>Facebook login</p></li><li><p>Amazon Lambda GrapheneDB Amazon API Gateway</p><p>Amazon API 
Gateway Amazon Lambda Facebook Graph API</p></li><li><p>and many more</p></li><li><p>GET PRODUCTION-READY</p></li><li><p>USE A DEPLOYMENT FRAMEWORK</p></li><li><p>http://serverless.com</p></li><li><p>http://apex.run</p></li><li><p>https://github.com/claudiajs/claudia</p></li><li><p>TESTING</p></li><li><p>Amazon Lambda</p><p>Amazon Kinesis Amazon IOT Amazon IOT</p></li><li><p>I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages.</p><p>- Alan Kay</p></li><li><p>Amazon Lambda</p><p>Amazon Kinesis Amazon IOT Amazon IOT</p></li><li><p>OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.</p><p>- Alan Kay</p></li><li><p>http://amzn.to/29Lxuzu</p></li><li><p>Level of Testing</p><p>1. Unit: do our objects do the right thing? are they easy to work with?</p></li><li><p>Level of Testing</p><p>1. Unit</p><p>2. Integration: does our code work against code we can't change?</p></li><li><p>handler</p></li><li><p>handler</p><p>test by invoking the handler</p></li><li><p>Level of Testing</p><p>1. Unit</p><p>2. Integration</p><p>3. Acceptance: does the whole system work?</p></li><li><p>Level of Testing</p><p>unit</p><p>integration</p><p>acceptance</p></li><li><p>Level of Testing</p><p>unit</p><p>integration</p><p>acceptance</p><p>can do all 3 with Lambda</p></li><li><p>We find that tests that mock external libraries often need to be complex to get the code into the right state for the functionality we need to exercise. 
</p><p>The mess in such tests is telling us that the design isn't right but, instead of fixing the problem by improving the code, we have to carry the extra complexity in both code and test</p><p>Don't Mock Types You Can't Change</p></li><li><p>The second risk is that we have to be sure that the behaviour we stub or mock matches what the external library will actually do</p><p>Even if we get it right once, we have to make sure that the tests remain valid when we upgrade the libraries</p><p>Don't Mock Types You Can't Change</p></li><li><p>Don't Mock Types (Services) You Can't Change</p></li><li><p>Wherever possible, an acceptance test should exercise the system end-to-end without directly calling its internal code.</p><p>An end-to-end test interacts with the system only from the outside: through its interface</p><p>Testing End-to-End</p></li><li><p>Legacy Monolith Amazon Kinesis Amazon Lambda</p><p>Amazon CloudSearch Amazon API Gateway Amazon Lambda</p></li><li><p>Legacy Monolith Amazon Kinesis Amazon Lambda</p><p>Amazon CloudSearch Amazon API Gateway Amazon Lambda</p><p>Test Input</p></li><li><p>Legacy Monolith Amazon Kinesis Amazon Lambda</p><p>Amazon CloudSearch Amazon API Gateway Amazon Lambda</p><p>Test Input</p><p>Validate</p></li><li><p>We prefer to have the end-to-end tests exercise both the system and the process by which it's built and deployed</p><p>This sounds like a lot of effort (it is), but has to be done anyway repeatedly during the software's lifetime</p><p>Testing End-to-End</p></li><li><p>Jenkins build config deploys and tests</p><p>unit + integration tests</p><p>deploy</p><p>acceptance tests</p></li><li><p>build.sh allows repeatable builds on both local &amp; CI</p></li><li><p>TEAM WORK</p></li><li><p>GOALS</p><p>shared environments</p></li><li><p>GOALS</p><p>easily propagate environmental changes</p></li><li><p>PRO TIP: don't ignore _meta</p></li><li><p>centralised config service</p></li><li><p>config service goes here</p></li><li><p>APP 
SECRETS</p></li><li><p>GOALS</p><p>sensitive data are encrypted at rest</p><p>(credentials, connection string, etc.)</p></li><li><p>GOALS</p><p>has to work on CI</p></li><li><p>GOALS</p><p>role-based access</p></li><li><p>hand-rolled with KMS</p><p>(encrypted at rest)</p></li><li><p>hand-rolled with KMS</p></li><li><p>plug-ins</p><p>serverless-plugin-kmsvariables</p><p>serverless-secrets</p><p>serverless-meta-sync</p></li><li><p>centralised config service</p></li><li><p>DOCUMENTATION</p></li><li><p>set goals</p></li><li><p>set goals</p><p>choose a way</p></li><li><p>set goals</p><p>choose a way</p><p>document</p></li><li><p>create project templates/scaffolds</p></li><li><p>set goals</p><p>choose a way</p><p>evaluate document</p></li><li><p>set goals</p><p>choose a way</p><p>evaluate document</p><p>share</p></li><li><p>LOGGING</p></li><li><p>2016-07-12T12:24:37.571Z 994f18f9-482b-11e6-8668-53e4eab441ae GOT is off air, what do I do now?</p></li><li><p>2016-07-12T12:24:37.571Z 994f18f9-482b-11e6-8668-53e4eab441ae GOT is off air, what do I do now?</p><p>UTC Timestamp, API Gateway Request Id, your log message</p></li><li><p>organised by Function + Version</p></li><li><p>LOG OVERLOAD</p></li><li><p>centralise your logs</p></li><li><p>CloudWatch Logs AWS Lambda</p><p>LogStash ElasticSearch</p></li><li><p>CloudWatch Logs AWS Lambda</p><p>LogStash ElasticSearch</p><p>AWS Elasticsearch</p></li><li><p>CloudWatch Logs AWS Lambda</p><p>LogStash ElasticSearch</p><p>AWS Elasticsearch</p><p>Elastic Cloud</p></li><li><p>CloudWatch Logs AWS Lambda</p><p>LogStash ElasticSearch</p><p>AWS Elasticsearch</p><p>Elastic Cloud</p><p>?</p></li><li><p>correlation IDs</p></li><li><p>MONITORING</p></li><li><p>PRO TIP: set up dashboards</p></li><li><p>PRO TIP: don't forget to set up alarms</p></li><li><p>PRO TIP: add application-level metrics</p></li><li><p>ERROR HANDLING</p></li><li><p>how do I return HTTP error codes?</p></li><li><p>{ 
"status": 404, "errorMessage": "oops" }</p></li><li><p>s-templates.json</p><p>{ "status": 404, "errorMessage": "oops" }</p></li><li><p>PRO TIP: map timeouts to 504</p></li><li><p>every Lambda function has a timeout setting</p></li><li><p>use error regex to map it to an HTTP 504</p></li><li><p>s-templates.json</p></li><li><p>PRO TIP: avoid using the 128 MB setting for production</p></li><li><p>continuous timeout loop</p></li><li><p>PRO TIP: proactively time out your function</p></li><li><p>what's the retry strategy with Kinesis and SNS?</p></li><li><p>If the invocation for one record times out, is throttled, or encounters any other error, Lambda will retry until it succeeds (or the record reaches its 24-hour expiration) before moving on to the next record</p><p>http://aws.amazon.com/lambda/faqs</p></li><li><p>do nothing, swallow errors, track retry count</p><p>effort</p></li><li><p>retry forever, no retry, retry N times</p></li><li><p>PRO TIP: use local state to track no. 
of retries; move on after N retries</p></li><li><p>PRO TIP: record CloudWatch metrics for error count; alarm if necessary</p></li><li><p>retried 3-5 times</p></li><li><p>KEEP WARM</p></li><li><p>functions are unloaded if idle for a while</p></li><li><p>noticeable cold start time (package size matters)</p></li><li><p>CloudWatch Event AWS Lambda</p></li><li><p>CloudWatch Event AWS Lambda</p><p>ping</p><p>ping</p><p>ping</p><p>ping</p></li><li><p>CloudWatch Event AWS Lambda</p><p>ping</p><p>ping</p><p>ping</p><p>ping</p><p>HEALTH CHECKS?</p></li><li><p>even then</p></li><li><p>functions are recycled every few hours</p></li><li><p>PRO TIP: don't make hard assumptions about function lifetime</p></li><li><p>KNOW YOUR LIMITS</p></li><li><p>max 50 MB deployment package size</p></li><li><p>max 50 MB deployment package size</p><p>max 75 GB total deployment package size*</p><p>* limit is per AWS region</p></li><li><p>Janitor Monkey</p></li><li><p>Janitor Lambda</p></li><li><p>max 5 mins execution time</p></li><li><p>max 6 MB request payload size*</p><p>max 6 MB response payload size</p><p>* for a request-response event type</p></li><li><p>default max 100 concurrent executions*</p><p>* soft-limit, can be raised via support ticket</p></li><li><p>looking ahead</p></li><li><p>.NET Core?</p><p>SQS support?</p></li><li><p>v1.0 (coming soon)</p></li><li><p>MULTI-CLOUD FUTURE?</p></li><li><p>IBM OpenWhisk</p><p>Amazon Lambda Azure Web Functions</p><p>Google Cloud Functions</p><p>competition</p><p>faster innovation, lower prices</p></li><li><p>@theburningmonk</p><p>theburningmonk.com</p><p>github.com/theburningmonk</p></li></ul>
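For the tips about proactively timing out a function and mapping timeouts to 504, one possible shape is sketched here. It relies on the get_remaining_time_in_millis() method of the Lambda context object; the safety margin, the GatewayTimeout class, handle_item and the "items" field are hypothetical names chosen for illustration. The error message reuses the status/errorMessage shape from the slides so that an error regex in API Gateway could match on it.

```python
import json

# Assumption: stop working when less than 500 ms of the timeout budget is left.
SAFETY_MARGIN_MS = 500


class GatewayTimeout(Exception):
    """Raised before Lambda kills the function, so the error stays under our control."""


def handle_item(item):
    """Hypothetical unit of work."""
    return item


def handler(event, context):
    results = []
    for item in event.get("items", []):
        # Check the remaining budget before starting each unit of work.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            raise GatewayTimeout(
                json.dumps({"status": 504, "errorMessage": "timed out"})
            )
        results.append(handle_item(item))
    return {"status": 200, "results": results}
```

Failing early like this leaves time to log and to return a meaningful error, rather than letting the platform terminate the invocation mid-request.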
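The KEEP WARM slides show a CloudWatch Event pinging the function on a schedule. A minimal sketch of the receiving side follows, assuming the ping arrives as a CloudWatch scheduled event (those carry "source": "aws.events"); the return values and the is_keep_warm_ping helper are made up for the example.

```python
def is_keep_warm_ping(event):
    # CloudWatch scheduled events identify themselves with source "aws.events".
    return isinstance(event, dict) and event.get("source") == "aws.events"


def handler(event, context):
    if is_keep_warm_ping(event):
        # Return immediately: the only goal is to keep the container loaded.
        # A health check could be run here instead of a no-op.
        return {"warm": True}
    # Hypothetical placeholder for the function's real work.
    return {"status": 200, "body": "handled real request"}
```

Short-circuiting the ping keeps the warm-up invocations cheap and stops them from reaching business logic. As the slides note, functions are still recycled every few hours, so this reduces cold starts rather than removing them.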