How to build your next startup idea on Google Cloud Platform
Disclaimer
● This talk is about sharing information and ideas rather than best practices
● The choices we made might not suit your needs and are probably not the best ones
● Take whatever you think is useful and use your own judgement
About Me
● Python user since 2007
● Build web-based systems, mostly using Python + Django
● Gradually gained an interest in infrastructure
● Working on my own startup since 2014
twitter: @adieu
github: github.com/adieu
website: www.adieu.me
Porter.io
Google Cloud Platform
● Compute: App Engine, Compute Engine, Container Engine
● Storage: Cloud SQL, Cloud Storage, Cloud Datastore
● Service: BigQuery, Dataflow, Prediction API
Why it matters for startups
● Money
  ○ No initial investment
  ○ Pay as you go
  ○ Cheaper than self-hosting
● Availability and maintenance
  ○ No hardware/system maintenance needed
  ○ Zero system downtime expected
● Scalability
  ○ Autoscale with App Engine
  ○ Easy to scale with Compute Engine and Container Engine
● Fast development
  ○ Ready-to-use service APIs, SDKs and 3rd-party libraries
  ○ Start with App Engine PaaS
  ○ No need for a dedicated SA
What We Built
Porter.io
tail -f hackernews | grep github.com/your/starred > inbox
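The one-line pitch above can be read as a filter: follow the Hacker News stream and keep only the stories that link to a GitHub repository you have starred. Here is a minimal sketch of that idea in Python. The story dicts, the `github_repo` helper and the `inbox` function are hypothetical names made up for illustration; this is not how Porter.io is actually implemented.

```python
from urllib.parse import urlparse


def github_repo(url):
    """Return 'owner/repo' for a github.com URL, or None for anything else."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None  # a user page, not a repository
    return "/".join(parts[:2]).lower()


def inbox(stories, starred):
    """Yield the stories whose link points at one of the starred repositories.

    `stories` is assumed to be an iterable of dicts with a "url" key,
    `starred` an iterable of 'owner/repo' strings.
    """
    starred = {name.lower() for name in starred}
    for story in stories:
        if github_repo(story["url"]) in starred:
            yield story
```

Usage would look like `list(inbox(stories, {"django/django"}))`, where `stories` comes from whatever crawler feeds the system.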
Requirements
● Distributed crawler for Hacker News stories and their linked web pages
● Store all the stories and pages, then make them accessible to other services
● Process all stories and pages to extract the information we want
● The right balance between ease of use and cost
Requirements
● Distributed crawler for Hacker News stories and their linked web pages (Scrapy?)
● Store all the stories and pages, then make them accessible to other services (MongoDB?)
● Process all stories and pages to extract the information we want (Hadoop?)
● The right balance between ease of use and cost
Porter.io v1
[Architecture diagram. Components: Cloud Storage, Task Queue, App Engine (three boxes), Cloud Datastore, Compute Engine. Arrow labels: watch, notify, push, push, save, save, fetch, fetch.]
Lessons Learned
● App Engine is more suitable for fast requests than slow ones
● Autoscaling of App Engine workers on push queues can cause trouble when tasks stall
● Data processed by cloud workers still has to reach the local Redis cache that backs our web frontend
Porter.io v2
[Architecture diagram. Components: Cloud Storage, Task Queue, App Engine, Cloud Datastore, Compute Engine (three boxes). Arrow labels: watch, notify, push, pull, save, save, fetch, serve.]
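The main change in v2 is that workers on Compute Engine pull tasks instead of having them pushed. A pull queue is built around a lease-then-delete cycle: a worker leases a task for a limited time, and deletes it only after finishing. If the worker stalls, the lease expires and the task becomes available again. The sketch below imitates that cycle with an in-memory queue. `PullQueue` and `run_worker` are hypothetical stand-ins written for this example, not the Task Queue API.

```python
import time


class PullQueue:
    """In-memory stand-in for a pull queue with lease semantics (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._tasks = {}          # task_id -> payload
        self._leased_until = {}   # task_id -> time at which the lease expires
        self._next_id = 0

    def add(self, payload):
        task_id = self._next_id
        self._next_id += 1
        self._tasks[task_id] = payload
        self._leased_until[task_id] = float("-inf")  # never leased yet
        return task_id

    def lease(self, max_tasks, lease_seconds):
        """Return up to max_tasks (task_id, payload) pairs that are not currently leased."""
        now = self._clock()
        leased = []
        for task_id, payload in self._tasks.items():
            if len(leased) >= max_tasks:
                break
            if self._leased_until[task_id] <= now:
                self._leased_until[task_id] = now + lease_seconds
                leased.append((task_id, payload))
        return leased

    def delete(self, task_id):
        self._tasks.pop(task_id, None)
        self._leased_until.pop(task_id, None)


def run_worker(queue, handler, batch_size=10, lease_seconds=60):
    """Lease one batch, run handler on each task, delete only the tasks that succeed."""
    done = 0
    for task_id, payload in queue.lease(batch_size, lease_seconds):
        try:
            handler(payload)
        except Exception:
            continue  # keep the task; it is retried once its lease expires
        queue.delete(task_id)
        done += 1
    return done
```

With this shape the worker decides how much to take and how long it may hold a task, which is the control we were missing with autoscaled push workers.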
Lessons Learned
● Understand the pricing model
● Use pull queues outside of App Engine
● Use keys-only queries
● Don't index every field
● Pay attention to list fields
● Batch mutations (up to 500), not transactions
● gcloud-python
● appengine-mapreduce
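The "batch mutations (up to 500)" lesson can be sketched as a small helper that splits a large write into chunks of at most 500 entities and sends each chunk in one call. This is a rough illustration under stated assumptions: `chunked` and `save_all` are hypothetical names, and `put_multi` is whatever batch-write callable your Datastore client provides, passed in so the sketch stays independent of any library version.

```python
def chunked(items, size=500):
    """Yield consecutive lists of at most `size` items."""
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]


def save_all(entities, put_multi, batch_size=500):
    """Write entities in batches and return the number of batch calls made.

    `put_multi` is an assumed callable that accepts a list of entities.
    """
    calls = 0
    for batch in chunked(entities, batch_size):
        put_multi(batch)
        calls += 1
    return calls
```

For 1201 entities this makes three calls instead of 1201 single writes, and it avoids wrapping unrelated entities in a transaction just to group them.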
Wishlist
● Gevent for the App Engine Python runtime
● Shared Redis or memcache between App Engine and Compute Engine
● Cheaper pricing for storing and querying indexed data
● More ready-to-use apps for common admin tasks on App Engine
The hybrid cloud as a service
[Layered diagram. Shared Service: Cloud SQL, Cloud Storage, Cloud Datastore, Task Queue, … Service: Frontend, Backend, Mapreduce, ..., Web, Task Worker, Logging, ... Runtime: Compute Engine, App Engine, Container Engine.]
The AWS version
● EC2
● Container Service
● Lambda
● SQS
● SNS
● CloudWatch
● S3
● RDS
● DynamoDB
● Glacier
Benefits and challenges
● Adopting hybrid cloud as a service means adopting a microservice architecture
● The right tool is available for the right problem
● Solve the core problem and leave the infrastructure problem to the service provider
● Heavy reliance on continuous integration and monitoring
● Vendor lock-in
Links
● https://cloud.google.com/
● https://github.com/GoogleCloudPlatform/gcloud-python
● https://github.com/GoogleCloudPlatform/appengine-mapreduce
● https://porter.io
Q & A