Cassandra: Seattle Tech Startups 3-10-10
Post on 20-Jan-2015
A highly scalable, eventually consistent, distributed, structured key-value store.
How we use it @ Frugal Mechanic
Eric Peters (@ericpeters) STS 3-10-10
About
• Search 2.5M unique Auto Parts, fitting 250M car configurations for over 100 retailers
Data Data Data, We Need To:
• Quickly Process More than 50 Feeds & Data Sources
• Support 10M source-specific SKUs
• Handle 300M SKU Part Fitments
• Be flexible with new columns
• Store and persist raw data before we cherry pick which data to use
Who Uses Cassandra?
Cassandra Design Goals
• High availability
• Eventual consistency
– trade-off strong consistency in favor of high availability
• Incremental scalability
• Optimistic Replication
• “Knobs” to tune tradeoffs between consistency, durability and latency
• Low total cost of ownership
• Minimal administration
Slide “borrowed” from Avinash Lakshman
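The “knobs” bullet can be illustrated with the replica-count arithmetic commonly used by Dynamo-style stores: with N replicas, a write acknowledged by W replicas and a read that consults R replicas are guaranteed to overlap when R + W > N. The sketch below is plain Python for illustration, not Cassandra's API; the function name is hypothetical.

```python
def read_sees_latest_write(n_replicas: int, write_acks: int, read_replicas: int) -> bool:
    """Return True when every read set must overlap every write set (R + W > N)."""
    if not (1 <= write_acks <= n_replicas and 1 <= read_replicas <= n_replicas):
        raise ValueError("W and R must be between 1 and N")
    return read_replicas + write_acks > n_replicas


# With N = 3 replicas:
fast_but_eventual = read_sees_latest_write(3, 1, 1)   # lowest latency, reads may be stale
quorum_both_ways = read_sees_latest_write(3, 2, 2)    # read and write sets always overlap
print(fast_but_eventual, quorum_both_ways)
```

Lower W and R favor latency and availability; higher values favor consistency, which is the trade-off the slide describes.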
Cassandra write properties
• No reads
• No seeks
• Fast
• Atomic within ColumnFamily
• Always writable
Slide “borrowed” from Avinash Lakshman
Cassandra read properties
• Read multiple SSTables
• Slower than writes (but still fast)
• Seeks can be mitigated with more RAM
• Scales to billions of rows
Slide “borrowed” from Avinash Lakshman
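The write and read properties on the two slides above follow from a log-structured design: a write is an append plus an in-memory update, while a read may have to merge several flushed tables. The toy model below is a sketch of that idea only; the class, its method names and the tiny flush threshold are assumptions made for illustration, and everything stays in memory.

```python
class ToyStore:
    """Toy log-structured store: append-only log, memtable, immutable flushed tables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only, so a write needs no read and no seek
        self.memtable = {}     # key -> (timestamp, value), held in memory
        self.sstables = []     # flushed, immutable tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        self.commit_log.append((key, value, timestamp))
        current = self.memtable.get(key)
        if current is None or timestamp >= current[0]:
            self.memtable[key] = (timestamp, value)
        if len(self.memtable) >= self.memtable_limit:
            self.sstables.append(dict(self.memtable))  # flush to a new "SSTable"
            self.memtable = {}

    def read(self, key):
        # A read may consult the memtable and every SSTable,
        # then keeps the version with the newest timestamp.
        tables = self.sstables + [self.memtable]
        candidates = [table[key] for table in tables if key in table]
        if not candidates:
            return None
        return max(candidates, key=lambda versioned: versioned[0])[1]


store = ToyStore()
store.write("a", "v1", 1)
store.write("b", "x", 2)    # second key reaches the limit and triggers a flush
store.write("a", "v2", 3)   # newer version of "a" lives in the fresh memtable
print(store.read("a"))
```

The read has to look in two places for "a" here, which is why reads are slower than writes and why more RAM (more data served from memory) helps.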
MySQL Comparison
• MySQL > 50 GB Data
– Writes Average: ~300 ms
– Reads Average: ~350 ms
• Cassandra > 50 GB Data
– Writes Average: 0.12 ms
– Reads Average: 15 ms
Slide “borrowed” from Avinash Lakshman
ColumnFamilies
{ // this is a column
  name: "emailAddress",
  value: "eric@example.com",
  timestamp: 123456789
}
Examples from http://www.slideshare.net/jbellis/cassandra-open-source-bigtable-dynamo
Super ColumnFamilies
{ // this is a SuperColumn
  name: "homeAddress",
  // with an infinite list of Columns
  value: {
    // note the key is the name of the Column
    street: {name: "street", value: "1234 x street", timestamp: 123456789},
    city: {name: "city", value: "san francisco", timestamp: 123456789},
    zip: {name: "zip", value: "94107", timestamp: 123456789},
  }
}
Examples from http://www.slideshare.net/jbellis/cassandra-open-source-bigtable-dynamo
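Every column above carries a timestamp, and that timestamp is what lets replicas agree on which version of a column to keep: the newer one wins. A minimal sketch of that rule follows, assuming a hypothetical `reconcile` helper and a made-up second email address; ties go to the first argument here, which is a simplification of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    value: str
    timestamp: int


def reconcile(a: Column, b: Column) -> Column:
    """Keep whichever version of the same column has the newer timestamp."""
    if a.name != b.name:
        raise ValueError("can only reconcile two versions of the same column")
    return a if a.timestamp >= b.timestamp else b


old = Column("emailAddress", "eric@example.com", 123456789)
new = Column("emailAddress", "eric@example.org", 123456790)  # hypothetical later write
print(reconcile(old, new).value)
```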
JSON Column Example
UserProfile = { // this is a ColumnFamily
  phatduckk: { // this is the key to this Row inside the CF
    // now we have an infinite # of columns in this row
    username: "phatduckk",
    email: "phatduckk@example.com",
    phone: "(900) 976-6666"
  }, // end row
  ieure: { // this is the key to another row in the CF
    // now we have another infinite # of columns in this row
    username: "ieure",
    email: "ieure@example.com",
    phone: "(888) 555-1212",
    age: "66",
    gender: "undecided"
  },
}
Examples from: http://arin.s3.amazonaws.com/pub/docs/WTF-is-a-SuperColumn.pdf
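The point of the UserProfile example is that rows in the same ColumnFamily need not share a column set. A rough Python equivalent uses nested dictionaries; timestamps are left out for brevity, and the helper names are hypothetical.

```python
# The ColumnFamily from the slide as row key -> {column name -> value}.
user_profile = {
    "phatduckk": {
        "username": "phatduckk",
        "email": "phatduckk@example.com",
        "phone": "(900) 976-6666",
    },
    "ieure": {
        "username": "ieure",
        "email": "ieure@example.com",
        "phone": "(888) 555-1212",
        "age": "66",
        "gender": "undecided",
    },
}


def get_column(column_family, row_key, column_name, default=None):
    """Look up one column; a column a row never wrote is simply absent."""
    return column_family.get(row_key, {}).get(column_name, default)


def column_names(column_family):
    """Union of the column names seen across all rows, sorted."""
    return sorted({name for row in column_family.values() for name in row})


print(get_column(user_profile, "ieure", "age"))
print(get_column(user_profile, "phatduckk", "age"))
print(column_names(user_profile))
```

This is what “be flexible with new columns” means in practice: a new feed can introduce a column without any schema change to existing rows.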
JSON Super Column Example
AddressBook = { // this is a ColumnFamily of type Super
  phatduckk: { // this is the key to this row inside the Super CF
    // the key here is the name of the owner of the address book
    // now we have an infinite # of super columns in this row
    // the keys inside the row are the names for the SuperColumns
    // each of these SuperColumns is an address book entry
    friend1: {street: "8th street", zip: "90210", city: "Beverley Hills", state: "CA"},
    // this is the address book entry for John in phatduckk's address book
    John: {street: "Howard street", zip: "94404", city: "FC", state: "CA"},
    Kim: {street: "X street", zip: "87876", city: "Balls", state: "VA"},
    …
    // we can have an infinite # of SuperColumns (aka address book entries)
  }, // end row
  ieure: { // this is the key to another row in the Super CF
    // all the address book entries for ieure
    joey: {street: "A ave", zip: "55485", city: "Hell", state: "NV"},
    William: {street: "Armpit Dr", zip: "93301", city: "Bakersfield", state: "CA"},
  },
}
Examples from: http://arin.s3.amazonaws.com/pub/docs/WTF-is-a-SuperColumn.pdf
Why Cassandra?
• It offers column-oriented data storage, so you have a bit more structure than plain key/value stores.
• Fast! Writes (0.12 ms each; 300M writes == 10 hrs)
• Written in Java + Apache Foundation Project
• People smarter than us are using it to solve even bigger problems than ours; if they can scale it, we will be able to as well
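The “300M == 10hrs” figure is the quoted average write latency multiplied by the number of part fitments. The check below assumes the writes are issued one after another by a single writer, which is the simplest reading of the slide.

```python
WRITE_LATENCY_MS = 0.12   # average write latency quoted on the slide
WRITES = 300_000_000      # 300M SKU part fitments

total_seconds = WRITES * WRITE_LATENCY_MS / 1000
total_hours = total_seconds / 3600
print(f"{total_seconds:,.0f} s = {total_hours:.1f} h")  # 36,000 s = 10.0 h
```

At the ~300 ms MySQL write average from the comparison slide, the same sequential load would take 2,500 times longer.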
Frugal Mechanic’s Data
Modeling Part Information
cassandra> get FrugalMechanic.RawParts['amazon/b0002jmuwk']
=> (column=thumb_image_url, value=http://ecx.images-amazon.com/images/I/31JQJNSAV6L._SL75_.jpg, timestamp=1264701499346)
=> (column=sku/FA1632, value=manufacturer, timestamp=1264701499346)
=> (column=sku/B0002JMUWK, value=asin, timestamp=1266339317829)
=> (column=name, value=Motorcraft FA1632 Air Filter, timestamp=1264701499346)
=> (column=large_image_url, value=http://ecx.images-amazon.com/images/I/31JQJNSAV6L.jpg, timestamp=1264701499346)
=> (column=description, value=Motorcraft Air Filter is designed to filter outside air that enters the vehicle. It is manufactured from leak proof polyurethane seals. This air filter chemically treats dry type cleaner elements to withstand damage from oil and moisture. It is resistant to temperature extremes and has a 98.5% efficiency standard. This air filter is treated to enhance capacity and efficiency as well as facilitates hassle free installation. It is backed by a 12 month warranty.<br/><br/>
Features:<br />
<ul>
<li>Efficiently filters outside air</li>
<li>Withstands damage from oil and moisture</li>
<li>Easy to install</li>
<li>12 months warranty</li>
<li>Leak proof</li>
</ul>
, timestamp=1264701499346)
=> (column=category/name, value=Automotive / Categories / Replacement Parts / Filters / Air Filters & Accessories / Air Filters, timestamp=1264701499346)
=> (column=category/browsenodeid, value=15727081,15727321, timestamp=1264701499346)
=> (column=cassandra_write_date, value=2010-02-16T16:55:17.809Z, timestamp=1266339317813)
=> (column=brand/name, value=Motorcraft, timestamp=1264701499346)
=> (column=brand/manufacturer, value=Motorcraft, timestamp=1264701499346)
Returned 11 results.
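The column names in this row appear to follow a `prefix/suffix` convention (`sku/…`, `category/…`, `brand/…`), and for the `sku/` columns the identifier sits in the column name while the value records what kind of SKU it is. A small sketch of grouping such names by prefix follows; the convention is inferred from the output above and the helper name is hypothetical.

```python
from collections import defaultdict

# Column names copied from the RawParts row above (values omitted).
columns = [
    "thumb_image_url", "sku/FA1632", "sku/B0002JMUWK", "name",
    "large_image_url", "description", "category/name",
    "category/browsenodeid", "cassandra_write_date",
    "brand/name", "brand/manufacturer",
]


def group_by_prefix(names):
    """Group 'prefix/suffix' names by prefix; names without '/' go under ''."""
    groups = defaultdict(list)
    for name in names:
        prefix, separator, suffix = name.partition("/")
        if separator:
            groups[prefix].append(suffix)
        else:
            groups[""].append(name)
    return dict(groups)


groups = group_by_prefix(columns)
print(groups["sku"])
print(groups["brand"])
```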
Modeling Part Prices
cassandra> get FrugalMechanic.RawPartPrices['amazon/b0002jmuwk']
=> (column=ATVPDKIKX0DER, value={"site":"ATVPDKIKX0DER","price":"13.45","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"ATVPDKIKX0DER","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
=> (column=AOMQHH38LHK76, value={"site":"AOMQHH38LHK76","price":"8.64","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"AOMQHH38LHK76","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
=> (column=ADG953YR6NRBF, value={"site":"ADG953YR6NRBF","price":"16.5","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"ADG953YR6NRBF","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
=> (column=A8F3HAQ1FDLH8, value={"site":"A8F3HAQ1FDLH8","price":"12.94","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A8F3HAQ1FDLH8","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
=> (column=A3TW1WCPSO49LP, value={"site":"A3TW1WCPSO49LP","price":"19.72","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A3TW1WCPSO49LP","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
=> (column=A3NMYM0J8WG63N, value={"price":"$18.14","buyUrlVar2":"A3NMYM0J8WG63N","buyUrlVar1":"B0002JMUWK"}, timestamp=1260473908931)
=> (column=A1DPIC5NQU31S0, value={"site":"A1DPIC5NQU31S0","price":"16.13","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A1DPIC5NQU31S0","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
=> (column=A1ATZ3MAARQNEF, value={"site":"A1ATZ3MAARQNEF","price":"16.0","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A1ATZ3MAARQNEF","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
Returned 8 results.
cassandra>
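In this row each column name is a site id and each value is a raw JSON document, so consumers have to parse and normalize on read. Note that the older A3NMYM0J8WG63N entry has a "$" in its price and no "site" or "updatedOn" field, which is the kind of inconsistency that storing raw data first allows. The sketch below uses three of the eight values verbatim; the helper names are hypothetical.

```python
import json
from decimal import Decimal

# Three of the eight columns from the row above: site id -> raw JSON value.
raw_prices = {
    "AOMQHH38LHK76": '{"site":"AOMQHH38LHK76","price":"8.64","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"AOMQHH38LHK76","updatedOn":"2010-01-28T17:58:19.346Z"}',
    "A8F3HAQ1FDLH8": '{"site":"A8F3HAQ1FDLH8","price":"12.94","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A8F3HAQ1FDLH8","updatedOn":"2010-01-28T17:58:19.346Z"}',
    "A3NMYM0J8WG63N": '{"price":"$18.14","buyUrlVar2":"A3NMYM0J8WG63N","buyUrlVar1":"B0002JMUWK"}',
}


def parse_price(raw_json):
    """Decode one column value and normalize its price to a Decimal."""
    record = json.loads(raw_json)
    return Decimal(record["price"].lstrip("$"))


def cheapest(prices):
    """Return (price, site id) for the lowest-priced column."""
    return min((parse_price(value), site) for site, value in prices.items())


print(cheapest(raw_prices))
```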
Modeling Part Fitments
cassandra> get FrugalMechanic.RawPartFitments['amazon/b0002jmuwk']
=> (column={"year":"2009","make":"Ford","model":"F-150","engine":"5.4L V8","notes":"TYPE: 269 - HEIGHT: 7.81 - OUTSIDE: 4.26B - INSIDE: 6.10T"}, value=1, timestamp=1266339317864)
=> (column={"year":"2009","make":"Ford","model":"F-150","engine":"4.6L V8","notes":"TYPE: 269 - HEIGHT: 7.81 - OUTSIDE: 4.26B - INSIDE: 6.10T"}, value=1, timestamp=1266339317862)
...
=> (column={"year":"1997","make":"Ford","model":"E-250 Econoline","engine":"5.4L V8 CNG","notes":"GAS ENG"}, value=1, timestamp=1266339317860)
=> (column={"year":"1997","make":"Ford","model":"E-150 Econoline","engine":"5.4L V8","notes":"All"}, value=1, timestamp=1266339317860)
=> (column={"year":"1997","make":"Ford","model":"E-150 Econoline Club Wagon","engine":"5.4L V8","notes":"All"}, value=1, timestamp=1266339317860)
=> (column={"year":"1996-1999","make":"Ford","model":"All","engine":"4.6L V8 DOHC","notes":"All"}, value=1, timestamp=1266339317860)
Returned 200 results.
cassandra>
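Here the data lives in the column name: each fitment is a JSON document used as the name, and the value is just the placeholder 1, so the row appears to behave like a set of fitments. A sketch of filtering such a row by vehicle follows, using three of the column names verbatim; the helper names are hypothetical, and the handling of a year range such as "1996-1999" is an assumption based on that one entry.

```python
import json

# Three column names from the row above; every value is the placeholder 1.
fitment_columns = [
    '{"year":"2009","make":"Ford","model":"F-150","engine":"5.4L V8","notes":"TYPE: 269 - HEIGHT: 7.81 - OUTSIDE: 4.26B - INSIDE: 6.10T"}',
    '{"year":"1997","make":"Ford","model":"E-150 Econoline","engine":"5.4L V8","notes":"All"}',
    '{"year":"1996-1999","make":"Ford","model":"All","engine":"4.6L V8 DOHC","notes":"All"}',
]


def year_matches(year_field, year):
    """Accept a single year ("2009") or an inclusive range ("1996-1999")."""
    first, _, last = year_field.partition("-")
    return int(first) <= year <= int(last or first)


def fits(columns, year, make):
    """Return the decoded fitments that match the given year and make."""
    matches = []
    for raw_name in columns:
        fitment = json.loads(raw_name)
        if fitment["make"] == make and year_matches(fitment["year"], year):
            matches.append(fitment)
    return matches


print([f["model"] for f in fits(fitment_columns, 1997, "Ford")])
```

Scanning and decoding every column works for a row of 200 fitments; a lookup by vehicle across all parts would need a different row layout.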
Great Resources
• NoSQL West Intro: http://cloudera-todd.s3.amazonaws.com/nosql.pdf (Video: http://www.vimeo.com/5145059)
• Cassandra Talk (Rackspace): Vid+PPT: http://www.parleys.com/#st=5&id=1866
• Cassandra Talk (Facebook): PPT: http://static.last.fm/johan/nosql-20090611/cassandra_nosql.ppt Video: http://vimeo.com/5185526
• Cassandra Talk (Digg): http://nosql.mypopescu.com/post/334198583/presentation-cassandra-in-production-digg-arin
• WTF is a Super Column: http://arin.s3.amazonaws.com/pub/docs/WTF-is-a-SuperColumn.pdf
• Get Up and Running w/Cassandra: http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/
Questions?