Design for Scale: Building Real-Time, High-Performing Marketing Technology (presented with Bizo)
DESCRIPTION
DynamoDB, presented by David Pearson from AWS
Bizo Business Audience Marketing success story on AWS, by Alex Boisvert, Director of Engineering, Bizo

In today's world, consumer habits change fast and marketing decisions need to be made within seconds, not days. Delivering engaging advertising experiences requires real-time, high-performing architectures that give digital advertisers the ability to measure and improve the performance of their campaigns and tie them more closely to corporate goals. The insights gleaned from the massive amounts of data collected can then be used to dynamically adjust media spend and creative execution for optimal performance.

The AWS Cloud enables you to deliver marketing content and advertisements with the levels of availability, performance, and personalization that your customers expect. Plus, AWS lowers your costs. Join us to learn how big data and low-latency, high-performing architectures are changing the game for digital advertising.

TRANSCRIPT
October 3, 2013
Design for Scale: Building Real-Time, High-Performing Marketing Technology
David Pearson - Business Development
Amazon RDS
Amazon DynamoDB Amazon Redshift
Amazon ElastiCache
Compute Storage
AWS Global Infrastructure
Database
Application Services
Deployment & Administration
Networking
AWS Database Services
Scalable High Performance Application Storage in the Cloud
provision
manage
scale
EFFORT
differentiated?
AWS Big Data Services
• Amazon S3: Object Storage
• Elastic MapReduce: Batch Processing
• DynamoDB: Real-Time Transactions
• Redshift: Online Analysis and Reporting
Amazon DynamoDB
NoSQL Database
Predictable performance
Seamless & massive scalability
Fully managed; zero admin
Amazon DynamoDB
Amazon’s Path to DynamoDB
RDBMS DynamoDB
Amazon DynamoDB
DEVS
OPS
USERS
Fast Application Development
Time to Build New Applications
• Flexible data models
• Simple API
• High-scale queries
• Laptop development
Latest News… DynamoDB Local
• Disconnected development
• Full API support
• Download from http://aws.amazon.com/dynamodb/resources/#testing
Admin-Free (at any scale)
request-based capacity provisioning model
Provisioned Throughput
Throughput is declared and updated via the API or the console
CreateTable (foo, reads/sec = 100, writes/sec = 150)
UpdateTable (foo, reads/sec=10000, writes/sec=4500)
DynamoDB handles the rest
Capacity is reserved and available when needed
Scaling-up triggers repartitioning and reallocation
No impact to performance or availability
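The CreateTable and UpdateTable lines above are slide shorthand. As a rough sketch, the corresponding request parameters could be built as plain dictionaries like this. The `TableName` and `ProvisionedThroughput` keys follow the DynamoDB API; the helper names and the `id` hash key are hypothetical, and nothing here is sent to AWS.

```python
# Sketch only: builds request parameters as dictionaries; it does not call AWS.
# Helper names and the "id" key attribute are illustrative assumptions.

def throughput(reads_per_sec, writes_per_sec):
    """Provisioned throughput block, treating the slide's rates as capacity units."""
    return {
        "ReadCapacityUnits": reads_per_sec,
        "WriteCapacityUnits": writes_per_sec,
    }


def create_table_params(table_name, reads_per_sec, writes_per_sec):
    """Parameters for a CreateTable request with declared throughput."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [{"AttributeName": "id", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
        "ProvisionedThroughput": throughput(reads_per_sec, writes_per_sec),
    }


def update_table_params(table_name, reads_per_sec, writes_per_sec):
    """Parameters for an UpdateTable request that changes only the throughput."""
    return {
        "TableName": table_name,
        "ProvisionedThroughput": throughput(reads_per_sec, writes_per_sec),
    }


create_request = create_table_params("foo", 100, 150)
update_request = update_table_params("foo", 10000, 4500)
```

Only the throughput numbers change between the two requests; the table's key schema is fixed at creation time.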
Durable Low Latency
WRITES
• Replicated continuously to 3 AZs
• Persisted to disk (custom SSD)
READS
• Strongly or eventually consistent
• No latency trade-off
Server-side latency across all APIs: average < 3 ms, TP90 < 4.5 ms
AD SERVING
EC2
Profiles Database
ad request
ad url
visitor
Ad Servers
DynamoDB
1. Visitor loads a web page
2. Web page issues a request to ad servers on EC2
3. Query to DynamoDB returns the ad to display
4. Link is returned to visitor
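The four steps above can be sketched as a single request handler. This is a minimal illustration, assuming an in-memory dictionary in place of the DynamoDB profiles table; the visitor ids, URLs, and function name are all hypothetical.

```python
# In-memory stand-in for the DynamoDB profiles table (hypothetical data).
PROFILES = {
    "visitor-123": {"ad_url": "https://ads.example.com/campaign-a.png"},
}

# Fallback when no profile exists for the visitor (an assumption, not from the slides).
DEFAULT_AD_URL = "https://ads.example.com/default.png"


def handle_ad_request(visitor_id, profiles=PROFILES):
    """Steps 2-4: receive the ad request, look up the profile, return the ad link."""
    profile = profiles.get(visitor_id)  # step 3: single key lookup
    if profile is None:
        return DEFAULT_AD_URL
    return profile["ad_url"]  # step 4: link returned to the visitor


known = handle_ad_request("visitor-123")
unknown = handle_ad_request("visitor-999")
```

The handler performs one keyed read per request, which is the access pattern the slides pair with DynamoDB.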
Real Time Bidding
[Diagram labels: RTB platform, bid request, bidder, DynamoDB (ads, profiles), queues and buffer, bid response; alongside the ad-serving components (visitor, ad request, ad servers on EC2, profiles database in DynamoDB, ad url)]
Time budget shown on the slide: 20 ms, 20 ms, 20 ms, 40 ms
• Request network transit
• Response network transit
• Decision on best ad and bid price, based on optimization that needs multiple data look-ups
• Contingency time buffer
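The Real Time Bidding slide lists four time segments (20, 20, 20, and 40 ms) and four activities, but the extracted text does not preserve which value belongs to which activity. The sketch below only adds the segments up, assuming they are sequential parts of one end-to-end deadline; the function names are hypothetical.

```python
# The four segment values come from the slide; which activity each one
# belongs to is not recoverable from the extracted text.
BUDGET_SEGMENTS_MS = [20, 20, 20, 40]

# Assumption: the segments are sequential and add up to one overall deadline.
TOTAL_BUDGET_MS = sum(BUDGET_SEGMENTS_MS)


def remaining_ms(elapsed_ms, total_ms=TOTAL_BUDGET_MS):
    """Time left before the overall deadline, never negative."""
    return max(0, total_ms - elapsed_ms)


def can_still_bid(elapsed_ms, needed_ms, total_ms=TOTAL_BUDGET_MS):
    """True if the remaining time covers the work still to be done."""
    return remaining_ms(elapsed_ms, total_ms) >= needed_ms
```

Under that assumption the bidder has 100 ms in total, so each data look-up has to fit into a small share of it.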
Optimize for scale, elasticity, and availability
• Multi-AZ: maintain EC2 capacity in multiple availability zones
• Auto Scaling: scale EC2 capacity to automatically manage variations in workload
• Elastic Load Balancing: automatically distribute incoming traffic across multiple EC2 instances
EC2 (MAZ)
ad request
ad url
Ad Servers
DynamoDB Elastic Load Balancing
Profiles Database
visitor
1. Ad files are downloaded from CloudFront
2. Impressions captured into logs on S3
CloudFront
advertisement
impression logs
Static Repository Files
Amazon S3
Click-through requests are captured via EC2 into log files and persisted on S3
Click-through Servers
click through log files
click through requests
Elastic Load Balancing
EC2 (MAZ)
Analysis
new bids
updated profiles
new requests
Redshift
ETL
Amazon EMR
unstructured log files
Click-through Servers
click through log files
click through requests
Elastic Load Balancing
EC2 (MAZ)
Amazon Redshift
Business Analytics using Redshift
Bid Optimization: drive qualified users to advertisers’ sites
• Ad server logs
• 3rd party data
• Bid history
• User history
Cost Optimization: optimize return on advertising expenditure
• Impressions
• 3rd party data
• User history
• Enrichment
Optimizing the Data Tier
DynamoDB
cookies
writes reads
PutItem: insert new cookies into table
CreateTable
{ …
  "ProvisionedThroughput": {
    "ReadCapacityUnits": 100,
    "WriteCapacityUnits": 10000
  },
  "TableName": "User_Cookies_0"
}
User_Cookies_0
hash=userid
range=timestamp
<cookie payload>
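As an illustration of the PutItem step, the sketch below builds the request parameters for one cookie record keyed by hash=userid and range=timestamp. The typed attribute values (`S` for strings, `N` for numbers sent as strings) follow the low-level DynamoDB API; the `payload` attribute name, the helper name, and the sample values are hypothetical, and nothing is sent to AWS.

```python
def put_cookie_params(table_name, user_id, timestamp, payload):
    """Parameters for a PutItem request storing one cookie record."""
    return {
        "TableName": table_name,
        "Item": {
            "userid": {"S": user_id},            # hash key
            "timestamp": {"N": str(timestamp)},  # range key; numbers travel as strings
            "payload": {"S": payload},           # hypothetical attribute name
        },
    }


request = put_cookie_params("User_Cookies_0", "user-42", 1380758400, "segment=it")
```

Because the range key is the timestamp, repeated cookies from the same user are stored as separate items rather than overwriting one another.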
DynamoDB
cookies GetItem: lookup profile table, return action (url)
User_Profile
hash=userid
<profile data>
url
User_Cookies_0
hash=userid
range=timestamp
<cookie payload>
DynamoDB
cookies
Time Series Data
CreateTable: new cookie ingest table
PutItem: insert new cookies into new table
User_Cookies_0
hash=userid
range=timestamp
<cookie payload>
User_Profile
hash=userid
<profile data>
User_Cookies_1
hash=userid
range=timestamp
<cookie payload>
url
User_Cookies_1
hash=userid
range=timestamp
<cookie payload>
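The time-series pattern above writes new cookies into a fresh table (User_Cookies_1) while the older one is processed. One way to pick the active ingest table is to derive the numeric suffix from the clock. This is a sketch under stated assumptions: the slides do not say how often tables rotate, so the daily period and the start date below are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical rotation settings; the slides do not state the period.
ROTATION_START = datetime(2013, 10, 1, tzinfo=timezone.utc)
PERIOD_DAYS = 1


def cookie_table_for(when, start=ROTATION_START, period_days=PERIOD_DAYS):
    """Name of the ingest table that covers the moment `when`."""
    elapsed_days = (when - start).days
    if elapsed_days < 0:
        raise ValueError("timestamp is before the first rotation period")
    return "User_Cookies_%d" % (elapsed_days // period_days)


first = cookie_table_for(datetime(2013, 10, 1, 12, 0, tzinfo=timezone.utc))
second = cookie_table_for(datetime(2013, 10, 2, 0, 5, tzinfo=timezone.utc))
```

Writers only ever touch the newest table, so an older table can later be exported and dropped as a whole.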
DynamoDB
cookies
UpdateTable: prepare data for direct load into Redshift
User_Cookies_0
hash=userid
range=timestamp
<cookie payload>
User_Profile
hash=userid
<profile data>
writes reads
Redshift
url
User_Cookies_1
hash=userid
range=timestamp
<cookie payload>
DynamoDB
Redshift
cookies
User_Cookies_0
hash=userid
range=timestamp
<cookie payload>
User_Profile
hash=userid
<profile data>
COPY cookie_staging
userid
timestamp
:
insert new entries
user_history
userid
timestamp
:
url
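The load into Redshift shown here has two steps: COPY the old cookie table into cookie_staging, then insert the new entries into user_history. The sketch below only assembles the SQL text for both. The `dynamodb://` source and the `READRATIO` option are part of the Redshift COPY command; the credentials string is a placeholder, the anti-join used for "insert new entries" is one possible approach rather than the one used in the talk, and none of this has been run against a cluster.

```python
def copy_from_dynamodb_sql(staging_table, dynamodb_table, credentials, read_ratio=50):
    """COPY statement that loads a DynamoDB table into a Redshift staging table.

    read_ratio is the percentage of the table's provisioned read throughput
    that the load may consume.
    """
    if read_ratio <= 0:
        raise ValueError("read_ratio must be positive")
    return (
        "COPY {staging} FROM 'dynamodb://{source}' "
        "CREDENTIALS '{creds}' READRATIO {ratio};"
    ).format(staging=staging_table, source=dynamodb_table,
             creds=credentials, ratio=read_ratio)


def insert_new_entries_sql(staging_table, history_table):
    """INSERT that copies only rows not already present in the history table."""
    return (
        'INSERT INTO {history} '
        'SELECT s.* FROM {staging} s '
        'LEFT JOIN {history} h '
        'ON s.userid = h.userid AND s."timestamp" = h."timestamp" '
        'WHERE h.userid IS NULL;'
    ).format(history=history_table, staging=staging_table)


copy_sql = copy_from_dynamodb_sql("cookie_staging", "User_Cookies_0", "<credentials>")
insert_sql = insert_new_entries_sql("cookie_staging", "user_history")
```

Raising the old table's read capacity before the COPY, as the UpdateTable step suggests, gives the load more throughput to draw from.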
Redshift
user_history
userid
timestamp
:
cookie_staging
userid
timestamp
:
updated profiles
Conditional PutItem (insert new / update existing items)
User_Cookies_1
hash=userid
range=timestamp
<cookie payload>
DynamoDB
cookies
User_Profile
hash=userid
<profile data>
url
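The conditional PutItem step (insert new / update existing items) can be illustrated with an in-memory stand-in for the User_Profile table. This is a sketch of the semantics only: the `updated_at` attribute, the "newer wins" rule, and all names are assumptions, and a real table would evaluate the condition on the server.

```python
class ConditionFailed(Exception):
    """Raised when the stored item does not satisfy the write condition."""


def conditional_put(table, item, condition):
    """Write `item` only if `condition(existing_item)` holds."""
    existing = table.get(item["userid"])
    if not condition(existing):
        raise ConditionFailed(item["userid"])
    table[item["userid"]] = item


def upsert_profile(table, item):
    """Insert a new profile, or replace an existing one only if the new data is newer."""
    try:
        conditional_put(table, item, lambda existing: existing is None)
        return "inserted"
    except ConditionFailed:
        pass
    try:
        conditional_put(
            table, item,
            lambda existing: existing is not None
            and existing["updated_at"] < item["updated_at"],
        )
        return "updated"
    except ConditionFailed:
        return "skipped"


profiles = {}
r1 = upsert_profile(profiles, {"userid": "u1", "updated_at": 10, "url": "a"})
r2 = upsert_profile(profiles, {"userid": "u1", "updated_at": 20, "url": "b"})
r3 = upsert_profile(profiles, {"userid": "u1", "updated_at": 15, "url": "c"})
```

The condition keeps a late-arriving, older profile from overwriting fresher data.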
DeleteTable
query: build profile
Four Drivers of DynamoDB Adoption
• Scalable
• Easy To Use
• (Durably) Fast
• Inexpensive
Resources
David Pearson [email protected]
Best Practices http://aws.amazon.com/dynamodb/resources/
Questions
David Pearson
© 2013 Bizo, Inc
The Future of Digital Advertising with Cloud Computing
October 3rd, 2013
Marketing Automation
Nurture Anonymous Site Visitors
• 90% of site traffic doesn’t convert
• Marketing automation rules to optimize display ads
Extend Email Nurturing to Display Advertising
• 70% of emails are not opened
• Automatically coordinate display and email messaging
• Feedback Loop: metrics on performance, lift, ROI
Integrations
• Eloqua (now a subsidiary of Oracle) + Other Platforms
Targeting API
Dynamic Targets
• Business Professionals (Cookies + User Ids)
• Companies (by name or domain)
• IPs (single or range)
Attribution / Analytics
• Impressions, Clicks, Conversions, New Visitors, etc.
Key Metrics (beta)
• 300M cookies and user-ids (e.g., sha1 email hashes)
• 30M company records -> 200M employee cookies
• 100M Company -> IP range mappings
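The key metrics mention user ids such as sha1 email hashes. A minimal sketch of deriving such an id is shown below. The normalization (trimming whitespace and lower-casing) and the function name are assumptions; the slides do not say how addresses are prepared before hashing.

```python
import hashlib


def email_user_id(email):
    """SHA-1 hex digest of a normalized email address."""
    normalized = email.strip().lower()  # assumed normalization, not from the slides
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


id_a = email_user_id("Jane.Doe@Example.com")
id_b = email_user_id("  jane.doe@example.com ")
id_c = email_user_id("john.doe@example.com")
```

Hashing gives a fixed-length key that can serve as the hash attribute of a lookup table without storing the address itself.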
Environment
Project / Team
• 8 Months, 1 - 3 Engineers, ≈ 18 Man-Months
• Automated Deployment & Monitoring (“DevOps”)
Language
• Java / Scala (JVM)
Infrastructure
• EC2, ELB, Auto Scaling
• MySQL (RDS), DynamoDB, S3
• SQS, EMR, CloudWatch, etc.
Architecture
SQL vs NoSQL: Different Starting Points
SQL vs NoSQL: How Much Do You Need?
???
SQL vs NoSQL: False Dilemma?
NewSQL?
Database “Sweet Spots”
                 MySQL/RDS (SQL)                  DynamoDB (NoSQL)
Access Pattern   Small/Medium Batches             Per-Record; Random
Scaling          Moderate Growth                  Near Real-Time
Sharding         Bring-Your-Own                   Built-In; Transparent
Operations       Self-managed; Some Labor Costs   Fully Managed
Challenges
Common: Data Modeling/Indexing for Performance
MySQL/RDS
• Sharding
• Rebalancing
• Online Schema Migration
DynamoDB (NoSQL)
• Understanding Performance/Provisioned Capacity Model (Throttling / Non-Uniform Access Pattern)
• Expiring Data
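Throttling is listed above as a DynamoDB challenge. A common client-side response is to retry with exponential backoff and jitter; the sketch below shows that technique in isolation. It is not taken from the talk: `ThrottledError` is a hypothetical stand-in for whatever throttling error a database client raises, and the delay values are illustrative.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error from the database client."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=1.0, sleep=time.sleep):
    """Run `operation`, retrying throttled calls with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Sleep a random amount up to the ceiling to spread out retries.
            sleep(random.uniform(0, ceiling))
```

Backoff only smooths over short bursts. A persistently hot key (the non-uniform access pattern named above) still needs a change to the data model or more provisioned capacity.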