build a smart app
DESCRIPTION
Presentation given at the DreamIt Fall '11 Philly class. Build a smart app using new technologies that let you scale more easily down the road while keeping maintenance and costs low.
TRANSCRIPT
Ryan Hubbard
Build a Smart App
Who Am I?
I Am You!
Just fast forward 4 yrs, add luck & late nights
Combination of Biz & Tech
Worked on high velocity web apps
2 Startups
  eVariant - MyHealthConnect.com
  Yellow Hammer Media - Geo Marketing Technology
An Archaic Question
“Build it Fast or Build it to Last?”
Build it Smart!
Build It Smart
Ingredients
  Low Cost
  Scalable
    Horizontal
    Vertical
  HA (High Availability)
  Maintainable
  GeoSmart or GSLB (Global Server Load Balancing)
  Multiple Layers Working Together
    DNS, File Serving, CDN, etc.
Don’t implement it, just plan for it (procrastination can be a good thing)
Shiny Balls
Don’t be Distracted by Shiny Balls
Don’t use it just because it’s cool
Focus on what you know
Leverage Experts
If all else fails, build it fast
Web Servers
IIS, Apache rule w/ 80%
But they are overhead pigs
C10k Problem
Event Driven / Newer Web Servers
  Node.js
  Nginx (lighttpd)
  Twisted (Tornado)
Web Servers - Node.js
Created by Ryan Dahl in 2009 (sponsored by Joyent)
Non Blocking, Event Driven (not just a web server)
Single Threaded (use Cluster; monitoring is a pain)
JavaScript based
Use Cases
  Push/Streaming Notification Stacks – Socket.IO (SocketStream)
  Super Fast Simple Apps
    Ex: Pixel Tracking (Click Redirect, Conversion, UI in Rails/PHP)
  C10k Problem is not a problem
Don’t Use It For
  Complex Apps
  Regular Web Server
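The pixel tracking use case above can be sketched in a few lines. This is a minimal illustration, not the presenter's implementation: Python's asyncio stands in for Node.js because both run handlers on a single-threaded event loop, and the `events` list, `build_response`, `handle`, and the port are all hypothetical names chosen for the example.

```python
import asyncio
import base64

# A commonly used 1x1 GIF, base64 encoded.
PIXEL = base64.b64decode("R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7")

events = []  # hypothetical in-memory sink; a real app would hand these to a database


def build_response(request_line: str) -> bytes:
    """Record the requested path and return an HTTP response carrying the pixel."""
    parts = request_line.split()
    path = parts[1] if len(parts) > 1 else "/"
    events.append(path)
    headers = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: image/gif\r\n"
        f"Content-Length: {len(PIXEL)}\r\n"
        "Cache-Control: no-store\r\n"
        "Connection: close\r\n\r\n"
    )
    return headers.encode("ascii") + PIXEL


async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Serve one connection; every await hands control back to the event loop."""
    request_line = (await reader.readline()).decode("latin-1")
    writer.write(build_response(request_line))
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def main() -> None:
    # Assumed address and port for local experimentation only.
    server = await asyncio.start_server(handle, "127.0.0.1", 8080)
    async with server:
        await server.serve_forever()
```

Calling `asyncio.run(main())` starts the listener. The handler does almost no work per request, which is why this kind of app suits an event loop; the reporting UI can live in a separate Rails or PHP app, as the slide suggests.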
Web Servers - Other
Nginx
  C10k is not a problem, single threaded like Node
  Easy to Use
  7.65% of the Web
  Great for
    Content Serving (CDNs), Load Balancing
    Alt to Apache to reduce memory and overhead
Twisted / Tornado
  Event driven (Python), similar in spirit to Ruby’s EventMachine
  Great for
    High C10k, Performance Needs
    Long Polling
    HTTP Streaming
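Why event driven servers shrug off the C10k problem can be shown with a small simulation. This is a sketch under stated assumptions: the "clients" are asyncio sleeps rather than real sockets, and `long_poll`, `serve_clients`, and `run_demo` are hypothetical names. The point is that thousands of idle long-poll connections cost one thread, not one thread each.

```python
import asyncio
import time


async def long_poll(client_id: int, wait_seconds: float) -> int:
    """Simulate a client held open until data is ready; sleeping yields to the loop."""
    await asyncio.sleep(wait_seconds)
    return client_id


async def serve_clients(count: int, wait_seconds: float) -> tuple[int, float]:
    """Hold `count` simulated connections open at once on a single thread."""
    start = time.perf_counter()
    done = await asyncio.gather(*(long_poll(i, wait_seconds) for i in range(count)))
    return len(done), time.perf_counter() - start


def run_demo(count: int = 10_000, wait_seconds: float = 0.2) -> tuple[int, float]:
    """Return how many clients finished and the wall-clock seconds it took."""
    return asyncio.run(serve_clients(count, wait_seconds))
```

Because the waits overlap, the elapsed time stays close to a single wait rather than growing with the number of clients. A thread-per-connection server would need one stack per waiting client to do the same.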
Databases
SQL: MySQL, Postgres, SQL Server, etc.
Big Data: Vertica, Greenplum, Netezza, Teradata
NoSQL: MongoDB, Cassandra, Redis, Riak, etc.
MoSQL: Cassandra
NewSQL: VoltDB
Databases - MongoDB
NoSQL version of MySQL
Fast, Scalable, HA, MapReduce
Tons of Community Support
Use Cases
  Capped Collections – Great for event logging
  Async Transactions – Great for analytics
Don’t Use It For
  Large data sets – 100-200GB+
    Vertica, Greenplum, Netezza, Teradata
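A capped collection is a fixed-size collection that keeps insertion order and overwrites its oldest entries once it is full, which is what makes it a good fit for event logs. The sketch below imitates that behavior with a bounded deque. It is not the MongoDB API: `CappedLog` and its methods are hypothetical, and it caps by document count for simplicity, whereas MongoDB caps primarily by size in bytes.

```python
from collections import deque


class CappedLog:
    """Fixed-size, insertion-ordered log that discards the oldest entry when full."""

    def __init__(self, max_docs: int) -> None:
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc: dict) -> None:
        self._docs.append(doc)  # the deque drops the oldest item once maxlen is reached

    def find(self) -> list[dict]:
        return list(self._docs)  # oldest first, i.e. insertion order


log = CappedLog(max_docs=3)
for n in range(5):
    log.insert({"event": "click", "seq": n})
```

After five inserts into a log capped at three, only the last three events remain. No cleanup job is needed, so storage for the log never grows.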
Databases – Vertica
Developed by Michael Stonebraker
Big Data – Tera or Peta
Use Cases
  Lots of data, need HA on cheap machines
  Correlation queries (Zynga)
  Swallow vast data sets quickly (5 TB in 1 hour)
  History
It’s Bad At
  OLTP
Databases – Cassandra
Developed by Facebook, now an Apache project
NoSQL & MoSQL
  CQL
Use Cases
  Counters
  Advanced Replication (GeoSmart)
  Super Fast Memory to Disk
It’s Bad At
  OLTP
  Relational Data
Databases – VoltDB
Created by Michael Stonebraker
NewSQL? – Fast Relational Data
Relational, in memory, lightning fast
HA, Shared Nothing, Self Healing
1.6 million transactions/sec on commodity servers
Use Cases
  Real Time Analytics
  OLTP
  Schemas that don’t change
Don’t Use It For
  Schemas that change often
  Complicated replication
  Large data sets
Cloud Serving / DNS
AWS, Rackspace, Linode, Firehost
I like AWS
  EC2 (Auto Scaling)
  Elastic MapReduce
  S3, CloudFront
  Databases – RDS, SimpleDB
  ElastiCache
  Elastic Beanstalk & CloudFormation
  Route 53 (not UltraDNS)
Dyn
  $30 - $200 / month
  GSLB
  Round Robin / Fail Over
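Round robin with fail over boils down to rotating through a list of servers and skipping the ones that fail a health check. A managed DNS provider does this at the DNS layer; the sketch below shows only the selection logic in-process. `RoundRobinPool` and its methods are hypothetical, and the addresses come from the reserved documentation range.

```python
from itertools import cycle


class RoundRobinPool:
    """Rotate through server addresses, skipping any that are marked unhealthy."""

    def __init__(self, addresses: list[str]) -> None:
        self._addresses = list(addresses)
        self._healthy = set(addresses)
        self._ring = cycle(self._addresses)

    def mark_down(self, address: str) -> None:
        self._healthy.discard(address)

    def mark_up(self, address: str) -> None:
        if address in self._addresses:
            self._healthy.add(address)

    def next_address(self) -> str:
        # Try each address at most once per call so a fully failed pool raises.
        for _ in range(len(self._addresses)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


pool = RoundRobinPool(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
```

In practice a health checker would call `mark_down` and `mark_up`. GSLB adds one more step on top of this, choosing the pool by the client's region before rotating within it.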
GeoMarketing App w/ Real Time Reports
1 ad can produce 130 million inserts / day
Running on EC2
  Poor I/O & Network
Tried MySQL (NDB), Cassandra, Redis, Riak
Solution
  VoltDB stores 2 hours of data
  Dump stats to MySQL
  Dump rows out to Vertica (Big Data)
Reporting – Aggregate query of Volt & MySQL
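For scale, 130 million inserts a day averages out to roughly 1,500 inserts per second for a single ad. The tiering described above can be sketched as a hot store that holds two hours of raw rows, a roll-up step that moves older rows into aggregated stats, and a report that sums both. This is an illustration only: one in-memory SQLite database stands in for both VoltDB and MySQL, the table and function names are hypothetical, and the export of raw rows to Vertica is left out.

```python
import sqlite3

HOT_WINDOW_SECONDS = 2 * 60 * 60  # the hot store keeps two hours of rows


def open_stores() -> sqlite3.Connection:
    """Create both stand-in tables in one in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for the VoltDB table of recent raw rows.
    conn.execute("CREATE TABLE hot_events (ad_id INTEGER, ts INTEGER)")
    # Stand-in for the MySQL table of hourly roll-ups.
    conn.execute(
        "CREATE TABLE stats (ad_id INTEGER, hour INTEGER, hits INTEGER, "
        "PRIMARY KEY (ad_id, hour))"
    )
    return conn


def record(conn: sqlite3.Connection, ad_id: int, ts: int) -> None:
    conn.execute("INSERT INTO hot_events VALUES (?, ?)", (ad_id, ts))


def roll_up(conn: sqlite3.Connection, now: int) -> None:
    """Aggregate rows older than the hot window into stats, then drop them."""
    cutoff = now - HOT_WINDOW_SECONDS
    rows = conn.execute(
        "SELECT ad_id, ts / 3600, COUNT(*) FROM hot_events "
        "WHERE ts < ? GROUP BY ad_id, ts / 3600",
        (cutoff,),
    ).fetchall()
    for ad_id, hour, hits in rows:
        updated = conn.execute(
            "UPDATE stats SET hits = hits + ? WHERE ad_id = ? AND hour = ?",
            (hits, ad_id, hour),
        ).rowcount
        if updated == 0:
            conn.execute("INSERT INTO stats VALUES (?, ?, ?)", (ad_id, hour, hits))
    conn.execute("DELETE FROM hot_events WHERE ts < ?", (cutoff,))
    conn.commit()


def report(conn: sqlite3.Connection, ad_id: int) -> int:
    """Total hits for an ad: rolled-up history plus rows still in the hot store."""
    (historic,) = conn.execute(
        "SELECT COALESCE(SUM(hits), 0) FROM stats WHERE ad_id = ?", (ad_id,)
    ).fetchone()
    (recent,) = conn.execute(
        "SELECT COUNT(*) FROM hot_events WHERE ad_id = ?", (ad_id,)
    ).fetchone()
    return historic + recent
```

The report returns the same total before and after a roll-up, which is the property that lets real time reports combine the fast store with the cheaper one.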
Questions?