Inside Wordnik's Architecture
DESCRIPTION
Slides about Wordnik's architecture
TRANSCRIPT
![Page 1: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/1.jpg)
Inside Wordnik's Architecture
Tony Tam
@fehguy
![Page 2: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/2.jpg)
Who is Wordnik?
•Founded in 2008 by Erin McKean
•"Understand meaning of words automatically"
•Patented "Free-Range Definition" technology
•Constructed largest (known) English Word Graph
We do Discovery
![Page 3: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/3.jpg)
It's all about Data!
![Page 4: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/4.jpg)
Data?
•Word Graph is built by data
•Runtime answers needed fast
50M+ Nodes!
80 ms reads!
80M+ Edges!
![Page 5: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/5.jpg)
What we do with Data
•Update the Graph constantly
•Augment our NLP pipeline
•"Reality-based Annotation" with current, real-world data
Language is NOT static
Twitter?
Tumblr?
WordPress
Next???
![Page 8: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/8.jpg)
Is a 20 year-old corpus good enough?
![Page 9: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/9.jpg)
How we do it
•Amazon EC2-based deployment
•Efficiency through constraint-based architecture
• Small is Big!
•Horizontal scaling by adding servers!
• Yea, we can always go vertical
•Blah, blah, more details!
![Page 10: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/10.jpg)
Micro Services
•Services are stand-alone building blocks
•Increase capacity through a "more like this" button
![Page 11: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/11.jpg)
Micro Services
•Big application => micro services
Monolithic application
"Isn't this just SOA?"
![Page 15: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/15.jpg)
Not PO-SOA
•This is different
• No proprietary message bus
• Decoupled objects
• Dedicated storage***
•Speak REST
• Develop your services in…
• Java
• Scala
• Ruby
• PHP
![Page 16: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/16.jpg)
Speak REST?
•Sounds good but…
• REST semantics vary wildly
• HATEOAS vs. practical REST?
/api/pet.json/1?delete (GET)
/api/pet.json/1 (DELETE)
/api/pet.json/1 (POST empty)
All valid!
So…
API Styleguide!
Peer Review!
Better Docs!
API Council!
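One way to picture what a style guide buys you: a minimal sketch of a lint check for delete calls. The rule it enforces (deletes use the DELETE verb on the resource path) is an assumed convention chosen for illustration, not Wordnik's actual style guide, and `lintDeleteRequest` is a hypothetical helper.

```javascript
// Hypothetical style-guide rule: a delete must be a DELETE on the resource path.
// All three request shapes from the slide work; the guide just picks one.
function lintDeleteRequest({ method, url, body = "" }) {
  const query = url.includes("?") ? url.slice(url.indexOf("?") + 1) : "";
  const problems = [];

  // Action flag smuggled into the query string, e.g. /api/pet.json/1?delete
  if (query.split("&").includes("delete")) {
    problems.push("action flag in query string");
  }
  // Anything other than the DELETE verb breaks the assumed convention.
  if (method !== "DELETE") {
    problems.push(`expected DELETE, got ${method}`);
  }
  // An empty POST used as a delete is easy to misread.
  if (method === "POST" && body === "") {
    problems.push("empty POST used as delete");
  }
  return problems;
}

const candidates = [
  { method: "GET", url: "/api/pet.json/1?delete" },
  { method: "DELETE", url: "/api/pet.json/1" },
  { method: "POST", url: "/api/pet.json/1", body: "" },
];

for (const request of candidates) {
  const problems = lintDeleteRequest(request);
  console.log(`${request.method} ${request.url}:`, problems.length ? problems.join("; ") : "ok");
}
```

A check like this could run during peer review, so the discussion is about the rule rather than about each endpoint.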
![Page 18: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/18.jpg)
mSOA makes new Challenges
•It's communication (not easy)
•Need a consumer & provider contract
•Driving force to create Swagger
![Page 19: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/19.jpg)
What is Swagger?
•Swagger is…
• Spec for declaring and documenting an API
• A framework for auto-generating the spec
• A library for client library generation
• A JSON-based test framework
•It's open source!
• http://swagger.wordnik.com
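To make "declare once, generate docs and clients" concrete, here is a rough sketch. The declaration format below is made up for illustration and is NOT the Swagger spec; `petApi`, `renderDocs`, and `buildClient` are hypothetical names.

```javascript
// A made-up, simplified API declaration. This is NOT the Swagger schema;
// the field names (basePath, operations, summary) are illustrative only.
const petApi = {
  basePath: "/api",
  operations: [
    { name: "getPet", method: "GET", path: "/pet.json/{id}", summary: "Fetch a pet by id" },
    { name: "deletePet", method: "DELETE", path: "/pet.json/{id}", summary: "Delete a pet by id" },
  ],
};

// Documentation falls out of the declaration...
function renderDocs(api) {
  return api.operations
    .map((op) => `${op.method} ${api.basePath}${op.path} - ${op.summary}`)
    .join("\n");
}

// ...and so does a client: each operation becomes a function that
// fills the path template and returns a request description.
function buildClient(api) {
  const client = {};
  for (const op of api.operations) {
    client[op.name] = (args) => ({
      method: op.method,
      url: api.basePath + op.path.replace(/\{(\w+)\}/g, (_, key) => encodeURIComponent(args[key])),
    });
  }
  return client;
}

const client = buildClient(petApi);
console.log(renderDocs(petApi));
console.log(client.deletePet({ id: 1 }));
```

Because both outputs come from the same declaration, the docs and the client cannot drift apart, which is the consumer and provider contract the previous slide asks for.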
![Page 20: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/20.jpg)
How?
•Swagger Codegen
• Creates a client based on your Swagger Spec
scala src/main/scala/Codegen.scala \
${swagger-spec-url}
Scala
Ruby
![Page 21: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/21.jpg)
In the Wordnik Workflow
•Jenkins will…
• Build a service library
• Build a stand-alone application distro
• Build an installable image (RPM)
• Build a compatible client library
•Consumers will…
• Declare dependency on a service version
• Use a client for that version
• Be given a list of compatible services, by cluster, version
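The consumer side of this workflow might look something like the sketch below. The registry entries, host addresses, and the compatibility rule (same major version, minor at least as new) are all assumptions for illustration; the slides do not say how Wordnik decides compatibility.

```javascript
// Hypothetical registry entries; service names, clusters, and hosts are invented.
const registry = [
  { service: "word-api", version: "2.1.0", cluster: "us-west", host: "10.0.0.11" },
  { service: "word-api", version: "2.3.4", cluster: "us-east", host: "10.0.1.12" },
  { service: "word-api", version: "3.0.0", cluster: "us-east", host: "10.0.1.13" },
  { service: "user-api", version: "2.0.0", cluster: "us-east", host: "10.0.1.20" },
];

// Assumed rule: same service, same major version, and a minor version
// at least as new as the one the consumer was built against.
function compatibleServices(entries, service, wantedVersion) {
  const [wantMajor, wantMinor] = wantedVersion.split(".").map(Number);
  return entries.filter((entry) => {
    const [major, minor] = entry.version.split(".").map(Number);
    return entry.service === service && major === wantMajor && minor >= wantMinor;
  });
}

// Group the matches by cluster so a consumer can prefer nearby instances.
function byCluster(entries) {
  const grouped = {};
  for (const entry of entries) {
    (grouped[entry.cluster] ||= []).push(`${entry.host} (v${entry.version})`);
  }
  return grouped;
}

console.log(byCluster(compatibleServices(registry, "word-api", "2.1.0")));
```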
![Page 22: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/22.jpg)
Back to Data
•Micro services have small(ish) databases
• Share nothing across services
• YES To replica sets
•Deployed to ephemeral storage
• (more in a bit)
• Small by design
•How to keep them small?
![Page 23: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/23.jpg)
Keeping Databases Small
•Some easy tricks
• Schema-less => "schema per document"
• Keep field names short!
db.foo.save({user_name:"Tony"})
db.foo.save({un:"Tony"})
•Indexes
• They can get *huge*
• Make _id matter!
Repeat 10e9 times!
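A back-of-the-envelope sketch of why short field names matter: field names are stored inside every document, so each byte trimmed is multiplied by the document count. The count of 10e9 comes from the slide; `bytesSaved` is a hypothetical helper, and the figure ignores any compression or padding the storage layer applies.

```javascript
// Bytes saved across a collection by shortening one field name.
// Saving per document = bytes in the long name - bytes in the short name.
function bytesSaved(longName, shortName, documentCount) {
  const perDocument =
    Buffer.byteLength(longName, "utf8") - Buffer.byteLength(shortName, "utf8");
  return perDocument * documentCount;
}

// "user_name" (9 bytes) vs "un" (2 bytes), repeated 10e9 times.
const saved = bytesSaved("user_name", "un", 10e9);
console.log(`${saved / 1e9} GB saved`); // prints "70 GB saved"
```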
![Page 25: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/25.jpg)
Keeping Databases Small
•Don't make _id just an "auto increment"
You're stuck with it! Be smart
• User collection? Try _id: username
• Email collection? Try _id: email
• Date-driven collection? How about _id: "20120502"
• db.logins.find({_id:/^201205/})
Be lazy until you can't anymore!
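A small sketch of the date-driven _id idea, using an in-memory array as a stand-in for the logins collection. The `dayId` helper and the counts are made up; the filter applies the same anchored prefix pattern as the find() on the slide, which in MongoDB is the kind of regex that can use the _id index.

```javascript
// Build a "YYYYMMDD" key from a Date, using UTC so the key does not
// depend on the server's time zone.
function dayId(date) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  const day = String(date.getUTCDate()).padStart(2, "0");
  return `${year}${month}${day}`;
}

// Stand-in for a collection with one document per day (counts are invented).
const logins = [
  { _id: dayId(new Date(Date.UTC(2012, 3, 30))), n: 4 },
  { _id: dayId(new Date(Date.UTC(2012, 4, 2))), n: 9 },
  { _id: dayId(new Date(Date.UTC(2012, 4, 17))), n: 3 },
  { _id: dayId(new Date(Date.UTC(2012, 5, 1))), n: 5 },
];

// Same predicate as db.logins.find({_id:/^201205/}): an anchored prefix match.
const may = logins.filter((doc) => /^201205/.test(doc._id));
console.log(may.map((doc) => doc._id));
```

Because the key sorts chronologically as a string, "everything in May 2012" needs no extra date index.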
![Page 26: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/26.jpg)
Keeping Databases Small
•DAO or die!
• Fancy index scheme => control access to collections
NO!!!!
Yes
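A minimal sketch of the DAO idea: callers work with readable names while the key scheme and short field names live in one place. `UserDao` and its field mapping are hypothetical, and the Map stands in for a MongoDB collection; nothing here talks to a database.

```javascript
// Hypothetical DAO: the only code that knows how users are keyed and stored.
class UserDao {
  constructor(store = new Map()) {
    this.store = store; // stand-in for a collection
  }

  // Readable object in, compact document out: _id is the username,
  // and the email is stored under the short field name "em".
  save(user) {
    const doc = { _id: user.username, em: user.email };
    this.store.set(doc._id, doc);
    return doc;
  }

  // Lookups always go through _id, so no secondary index is needed.
  findByUsername(username) {
    const doc = this.store.get(username);
    return doc ? { username: doc._id, email: doc.em } : null;
  }
}

const users = new UserDao();
users.save({ username: "tony", email: "tony@example.com" });
console.log(users.findByUsername("tony"));
```

If the key scheme changes later, only the DAO changes; callers never see `_id` or `em`.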
![Page 27: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/27.jpg)
Keeping Databases Small
•If/when you need to shard…
Don't make your clients do this!
![Page 28: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/28.jpg)
Keeping Databases Small
•Again, why keep them small?
•Starting a new replica
• Initial sync
• Index rebuilding
•Backups
•Index Compaction
•Speed
•TCO
Everything is easier
This can take DAYS
![Page 30: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/30.jpg)
Ephemeral Storage?
•Every EC2 instance type has some (except micro)
•Only available via EC2 API
•Less prone to issues than EBS
•Faster ***
•Included in cost of server
But dies on host reboot!
![Page 32: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/32.jpg)
Keeping Data Safe
![Page 33: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/33.jpg)
Which Zone? Which Region?
Arbiter handles external connectivity issue detection
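A sketch of what such a layout could look like as a MongoDB replica set configuration document. The set name, hostnames, and zone placement are invented for illustration; the slides do not show Wordnik's actual topology. `arbiterOnly` and `tags` are standard member fields, and the `describe` helper assumes every member keeps its default single vote.

```javascript
// Hypothetical replica set: two data-bearing members in different zones,
// plus an arbiter that votes but holds no data.
// In the mongo shell, a document like this is what rs.initiate() accepts.
const replicaSetConfig = {
  _id: "wordnik0", // invented set name
  members: [
    { _id: 0, host: "db-a.example.com:27017", tags: { zone: "us-east-1a" } },
    { _id: 1, host: "db-b.example.com:27017", tags: { zone: "us-west-1a" } },
    { _id: 2, host: "arbiter.example.com:27017", arbiterOnly: true },
  ],
};

// Summarize the layout: how many voters, how many copies of the data,
// how many zones those copies span, and whether a tie can be broken.
function describe(config) {
  const dataMembers = config.members.filter((member) => !member.arbiterOnly);
  const zones = new Set(dataMembers.map((member) => member.tags.zone));
  return {
    voters: config.members.length, // assumes default votes: 1 per member
    dataCopies: dataMembers.length,
    zones: zones.size,
    canBreakTies: config.members.length % 2 === 1,
  };
}

console.log(describe(replicaSetConfig));
```

With an odd number of voters, losing contact with one zone still leaves a majority that can elect a primary.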
![Page 35: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/35.jpg)
How does this really stack up?
•Tuned indexes & access, split with services
• Was: 3 DAS Devices w/18 TB disk
• Now: 21 M1.large + M1.xlarge instances
• 3 Zones, 2 regions
•The Gory Details
blog.wordnik.com/with-software-small-is-the-new-big
![Page 36: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/36.jpg)
As for Services
•~1,000 requests/sec via Swagger-enabled micro services
•Direct to Consumer via SwaggerSocket
![Page 37: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/37.jpg)
What's Next
•Migrating all services to SwaggerSocket
• OSS WebSocket subprotocol
https://github.com/wordnik/swaggersocket
• 25%-100% speed increase (sync & async)
•Discovery via Wordnik
![Page 38: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/38.jpg)
If you're Interested…
![Page 45: Inside Wordnik's Architecture](https://reader033.vdocuments.mx/reader033/viewer/2022061618/554f9098b4c905d25b8b51ac/html5/thumbnails/45.jpg)
See more:
developer.wordnik.com
swagger.wordnik.com
github.com/wordnik
Questions?