Scaling your Web Application
Scaling your Web App
Topics to be covered:
1. What is Scaling?
2. Why do we need to scale?
3. How do we scale? Scaling Up vs. Scaling Out
4. What is Sharding?
5. RDBMS vs NoSQL (Perspective: Scalability)
Web is Data
Username, Birthday, Blog Post, Images, Videos
Desirable Properties of a Web App
Scalability, High Availability, Performance, Manageability, Low Cost, Feature Rich, Generates $$$ :D
What is Scaling?
Scalability
It is the ability of your system to handle a growing amount of work gracefully, or to be readily enlarged as demand increases.
Scaling and performance are different.
So what is Performance?
Performance:
The amount of useful work accomplished by a computer system compared to the time and resources used.
Better performance means more work accomplished in a shorter time and/or using fewer resources.
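A rough way to read "work accomplished per unit of time" is to time a task and divide. The sketch below is only an illustration: `measure_throughput`, `slow_square`, and `fast_square` are made-up names, and the numbers it prints will differ on every machine.

```python
import time


def measure_throughput(task, n_items):
    """Run `task` once per item and return completed items per second."""
    start = time.perf_counter()
    for i in range(n_items):
        task(i)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return n_items / elapsed if elapsed > 0 else float("inf")


def slow_square(x):
    # Same result as fast_square, but builds a throwaway list first,
    # so it spends more time and memory per item.
    return [x * x for _ in range(200)][-1]


def fast_square(x):
    return x * x


if __name__ == "__main__":
    print("slow:", round(measure_throughput(slow_square, 10_000)), "items/sec")
    print("fast:", round(measure_throughput(fast_square, 10_000)), "items/sec")
```

Both functions do the same useful work; the one that needs less time and fewer resources per item is the better performing one. Scalability is a separate question: what happens to those numbers as the load grows.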
OK. Now what?
I have an application up and running on servers and it's doing pretty well.
Why should I think about scaling it?
Will it really require scaling?
Why do I need scaling? Who knows, your app might be the next FB or Twitter...
How will you handle so many users doing so many things over the network? It might go up to processing millions of requests per second. So,
Scale it !!!!!! :)
OK. Cool. I really need to think about scaling my application now.
Wait a minute !!!!!
How should I do it? How should I scale?
Well, there are a couple of ways to scale your application:
1. Scaling Up
2. Scaling Out
Scaling Up vs. Scaling Out
Scaling Up: more CPU, bigger HD, more RAM, etc. Even the biggest, fastest single computer that exists is still not as fast as two such computers together, and each upgrade costs more for less gain, i.e. diminishing returns => not a good solution in the long run.
Nah!! Not Efficient !!!!
Scaling Up vs. Scaling Out
Scaling Out: add more nodes
1. Master / Slave architecture
2. Sharding
Master / Slave Architecture
Pros: Increased READ speed; takes READ load off the master; allows us to join across all tables
Cons: Doesn't buy increased WRITE throughput; the master is a single point of failure
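The pros and cons above can be sketched with plain dictionaries standing in for database servers. This is a toy model, not how any particular database implements replication; `ReplicatedStore` and its methods are hypothetical names chosen for the example.

```python
import itertools


class ReplicatedStore:
    """Toy master/slave setup: one writable master, N read-only replicas."""

    def __init__(self, n_replicas=2):
        self.master = {}
        self.replicas = [{} for _ in range(n_replicas)]
        # Round-robin iterator over replica indexes.
        self._next_replica = itertools.cycle(range(n_replicas))

    def write(self, key, value):
        # Every write goes through the single master ...
        self.master[key] = value
        # ... and is then copied to each replica. Real systems usually
        # replicate asynchronously; here it is immediate for simplicity.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Reads are spread over the replicas and never touch the master.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)
```

Adding replicas gives more places to read from, but `write` still funnels through one master, which is why write throughput does not improve and why losing the master stops all writes.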
What is Sharding?
Sharding is a method of splitting your database across several servers (together called a cluster).
Each shard can consist of one or multiple machines. No machine has all your data on it.
More machines => more RAM and CPU => more operations/sec => improved throughput.
Yeyyy !!!!
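One common way to decide which shard owns a record is to hash its key. The sketch below assumes hash-based sharding with a fixed number of shards; the slides do not say which scheme is meant, and `ShardedStore` is a hypothetical name with dictionaries standing in for servers.

```python
import hashlib


class ShardedStore:
    """Toy hash-sharded key/value store; each dict stands in for one shard."""

    def __init__(self, n_shards=4):
        self.shards = [{} for _ in range(n_shards)]

    def _shard_for(self, key):
        # A stable hash keeps the same key on the same shard across runs
        # (Python's built-in hash() is randomized per process for strings).
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)
```

Each key lives on exactly one shard, so both reads and writes are spread across machines. Note that changing `n_shards` in this simple modulo scheme moves most keys to a different shard, which is one reason real deployments often use range-based or consistent-hashing schemes instead.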
Sharding
Pros: Increased READ and WRITE throughput; no single point of failure
Individual features can fail, but the whole system won't go down all at once.
Cons: Can't run JOIN queries across shards
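When the database can't join across shards, the join usually moves into the application: ask every shard for its piece, then combine the pieces in code. A minimal sketch with made-up data follows; the `shards` layout and `posts_with_authors` are assumptions for illustration only.

```python
# Two shards, each holding a slice of the users and a slice of the posts.
# A post's author may live on a different shard than the post itself.
shards = [
    {"users": {1: "alice"}, "posts": [{"id": 10, "user_id": 2, "title": "Hello"}]},
    {"users": {2: "bob"}, "posts": [{"id": 11, "user_id": 1, "title": "Scaling"}]},
]


def posts_with_authors(shards):
    """Application-side join of posts to usernames across all shards."""
    # Step 1: gather the users from every shard.
    users = {}
    for shard in shards:
        users.update(shard["users"])
    # Step 2: walk the posts on every shard and attach the author name.
    result = []
    for shard in shards:
        for post in shard["posts"]:
            result.append((post["title"], users[post["user_id"]]))
    return sorted(result)


if __name__ == "__main__":
    print(posts_with_authors(shards))
```

This works, but it costs one round trip per shard and pulls data into application memory, which is the price paid for the extra write throughput.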
RDBMS vs NoSQL
(Perspective: Scalability)
Scaling : RDBMS
RDBMSs guarantee ACID operations, but when a relational database grows beyond one server, it is no longer that easy to use. In other words, they don't scale out very well in a distributed system.
The CAP theorem states that a distributed (i.e. scalable) system cannot guarantee all three of the following properties at the same time:
Consistency, Availability, Partition tolerance
Many NoSQL databases relax consistency in favour of availability. That's why they scale out more easily.
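What "relaxing consistency in favour of availability" looks like in practice can be sketched with a replica that always answers reads but applies writes late. This is a toy model of eventual consistency, not the behaviour of any specific NoSQL product; `LazyReplica`, `primary`, and `write` are hypothetical names.

```python
class LazyReplica:
    """Replica that stays available for reads but applies writes lazily."""

    def __init__(self):
        self.data = {}
        self.pending = []  # writes received but not yet applied

    def receive(self, key, value):
        self.pending.append((key, value))

    def sync(self):
        # Apply queued writes in arrival order, then clear the queue.
        for key, value in self.pending:
            self.data[key] = value
        self.pending.clear()

    def read(self, key):
        # Always answers (available), even if the answer is stale.
        return self.data.get(key)


primary = {}
replica = LazyReplica()


def write(key, value):
    """Write to the primary and queue the change for the replica."""
    primary[key] = value
    replica.receive(key, value)
```

Between `write(...)` and `replica.sync()`, a read from the replica returns the old value (or nothing) while the primary already has the new one. The system kept answering, at the cost of a temporarily inconsistent view.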
Why NoSQL Databases Scale & RDBMS does not (?)
Data sharding requires distinct data entities that can be distributed and processed independently.
An RDBMS can't do that easily, because a logical entity is typically spread across several related tables.
NoSQL databases do not distribute a logical entity across multiple tables; it's always stored in one place.
NoSQL databases only enforce consistency inside a single entity, and sometimes not even that.
Example
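As a hypothetical illustration of the point above, here is one blog post stored two ways: split across table-like structures, and as a single document. All names and data are invented for this sketch, with dictionaries and lists standing in for tables and collections.

```python
# Relational layout: one logical blog post is spread over three "tables".
users = {1: {"username": "alice"}}
posts = {10: {"user_id": 1, "title": "Scaling"}}
comments = [
    {"post_id": 10, "text": "Nice"},
    {"post_id": 10, "text": "Thanks"},
]


def load_post_relational(post_id):
    """Rebuild the post by joining the three tables."""
    post = posts[post_id]
    return {
        "title": post["title"],
        "author": users[post["user_id"]]["username"],
        "comments": [c["text"] for c in comments if c["post_id"] == post_id],
    }


# Document layout: the same post is one self-contained entity.
post_documents = {
    10: {"title": "Scaling", "author": "alice", "comments": ["Nice", "Thanks"]},
}


def load_post_document(post_id):
    """Fetch the whole post with a single lookup."""
    return post_documents[post_id]
```

Both loaders return the same post, but the document version needs only one lookup in one place, so the whole entity can sit on a single shard. The relational version has to touch three tables, which becomes awkward once those tables live on different servers.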
Does it mean that my app gives better performance now?
No, it doesn't. Performance depends on how well you implement a scalable solution for your case.
Besides, several other factors affect it, such as disk I/O, network, and caching.
So what should I choose?
MySQL OR MongoDB
“The real thing to point out is that if you are being held back from making something super awesome because you can’t choose a database, you are doing it wrong.
If you know mysql, just use it. Optimize when you actually need to.
Use it like a k/v store, use it like a rdbms, but for god sake, build your killer app! None of this will matter to most apps.
Facebook still uses MySQL, a lot. Wikipedia uses MySQL, a lot. FriendFeed uses MySQL, a lot.
NoSQL is a great tool, but it’s certainly not going to be your competitive edge, it’s not going to make your app hot, and most of all, your users won’t give a shit about any of this.”
MySQL OR MongoDB
“What am I going to build my next app on? Probably Postgres.
Will I use NoSQL? Maybe. I might also use Hadoop and Hive. I might keep everything in flat files.
Maybe I’ll start hacking on Maglev. I’ll use whatever is best for the job.
If I need reporting or ACIDIty, I won’t be using any NoSQL.
If I need caching, I’ll probably use Tokyo Tyrant.
If I need a ton of counters, I’ll use Redis.
If I need transactions, I’ll use Postgres.
If I have a ton of a single type of documents, I’ll probably use Mongo.
If I need to write 1 billion objects a day, I’d probably use Voldemort.
If I need full text search, I’d probably use Solr.
If I need full text search of volatile data, I’d probably use Sphinx.”
Conclusion of the Debate
If there's anything to take away from the RDBMS vs NoSQL debate, it's just to be happy there are more tools, because more cool tools mean a win-win situation for everyone.
Summary
We covered the following topics:
1. What is Scaling?
2. Why do we need to scale?
3. How do we scale? Scaling Up vs. Scaling Out
4. What is Sharding?
5. RDBMS vs NoSQL (Perspective: Scalability)
Thank You & Questions
Ketan Deshmukh.