overview of facebook scalable architecture

27
SEMINAR PRESENTATION PRESENTED BY SHARATH BASIL KURIAN GUIDED BY DR. SUDHEEP ELAYIDOM OVERVIEW OF FACEBOOK SCALABLE ARCHITECTURE

Upload: rishikese-mr

Post on 02-Jul-2015

598 views

Category:

Software


4 download

DESCRIPTION

This seminar will provide you information's about how Facebook work, what is it's architecture, etc.

TRANSCRIPT

Page 1: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

SEMINAR PRESENTATION

PRESENTED BY SHARATH BASIL KURIAN

GUIDED BY DR. SUDHEEP ELAYIDOM

OVERVIEW OF FACEBOOK SCALABLE ARCHITECTURE

Page 2: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

CONTENTS

Introduction

How Facebook works

Facebook’s Architecture

Front End

Back End

Conclusions

2

Page 3: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

Scalability is essential for web applications with massive numbers of users.

Hundreds of thousands of users accessing continuously lead to slow response time

The current architecture of Facebook is very large and consists of many technologies and thousands of servers.

In this presentation Facebook’s architecture and how it handles scalability issues is going to be described.

INTRODUCTION3

Page 4: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

HOW FACEBOOK WORKS

Facebook needs to handle large amounts of information every second.

Facebook continues to be a LAMP (Linux- Apache-Memcached-PHP) website, but

had to change and expand its operation to incorporate other elements and

services.

Personalized and implemented systems also exist inside Facebook, such as

Haystack.

4

Page 5: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

5Architecture of Facebook

Components of Facebook

Page 6: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

FRONT-END

Facebook uses a variety of services, tools, and programming languages to make

up its core infrastructure.

Main Components are

LAMP

PHP & BigPipe

HipHop

At the front-end, the servers run a LAMP (Linux, Apache, MySQL, and PHP) stack

with Memcache.

6

Page 7: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

LAMP 7

• Linux is the operating system used.

• It is open source, highly customizable and good for safety.

• Apache is also open source and is the most popular web serve Facebook uses

Linux with the Apache HTTP server.

Page 8: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

PHP & BIGPIPE 8

• fast iterations

• easy to use

• Simple to learn

• good web programming language with extensive community support.

• BigPipe is a dynamic web page system developed by Facebook.

• The general idea is to perform pipelining of sections through the implementation of various

stages within web browsers and servers.

WHY PHP ? ?

Page 9: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

9

• Browser sends an HTTP request to web server.

• Web server parses the request, pulls data from storage tier then formulates an HTML document

and sends it to the client in an HTTP response.

• HTTP response is transferred over the Internet to browser.

• Browser parses the response from web server, constructs a tree representation of the HTML

document, and downloads CSS and JavaScript resources referenced by the document.

• After downloading resources, browser parses and executes them.

• In this scenario, while the web server is processing and creating the HTML document, the

browser is idle and when the browser is rendering the html page, the web server remains idle.

Accessing Webpage

Page 10: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

Normal Web Page Request 10

Browser Web ServerData

base

HTTP request

Data

HTML Document

Displays

Webpage

Page 11: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

11

• The browser sends an HTTP request to web server.

• Server quickly renders a page skeleton containing the tags and a body.

The HTTP connection to the browser stays open as the page is not yet

finished.

• Browser will start rendering the page.

• The PHP server process is still executing and building the page lets. Once a

page let has been completed it's results are sent to the browser .

• Browser injects the html code for the page let received into the correct place.

If the page let needs any CSS resources those are also downloaded.

• After all page lets have been received the browser starts to load all external

files needed by those page lets asynchronously.

This results in a system where the page is rendered progressively. The initial page

content becomes visible much earlier, which dramatically improves user

perceived latency of the page.

PIPELINING

Page 12: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

HIP-HOP 12

• PHP compiler.

• Developed by Facebook

• The processing time for PHP language is slow

• Created to minimize server resources.

• Converts PHP scripts into optimized C++ code.

• Some key features of HipHop are:

Easy to implement extensions

Reduces CPU and Memory usage significantly

Page 13: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

13

Page 14: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

BACK-END 14

• At the heart of the application back end are the application servers.

• Application servers are responsible for answering all queries and take all the writes

into the system.

• They also interact with a number of services to achieve this

• Employ sharding and caching to process information

Page 15: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

HAYSTACK 15

• Haystack is an object store that is designed for sharing photos on Facebook where data

is written once, read often, never modified, and rarely deleted and replaced.

• traditional file systems perform poorly under huge work- load.

• High throughput and low latency.

• Fault-tolerant

• Efficient storage of billions of photos.• Highly scalable.

• Uses extensive caching in its main memory.

Page 16: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

Working of Haystack 16

Structure of Haystack

Traditional methods store files in directories

into several meta files..

No of Disk reads required for one photo

becomes high

• Translate filename to i-node

• Read i-node

• Read file

Haystack

• Stores several meta data into a large file

• Stored in main memory

• Data Replicated on physical volumes

helps against hard disk failures etc

Page 17: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

SCRIBE 17

• Scalable distributed logging framework

• Useful for logging a wide array of data

• Simple data model

• Built on top of Thrift

Page 18: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

THRIFT 18

• Lightweight software framework for cross- language development

• Very quick

• Swaps information between various applications developed in different languages.

• Thrift protocol offers serialization between several languages.

• Eg: PHP is used for the front-end, Erlang is used for Chat

Page 19: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

MYSQL 19

• Fast and reliable

• Thousands of MySQL servers

• Users randomly distributed across these servers

• Relational aspect of DB is not used

• No joins. Logically difficult(Data is distributed randomly)

• Primarily key-value store

Page 20: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

MEMCACHED 20

• System uses client-server architecture.

• Key-value memory storage system for arbitrary pieces of information that result either from

database research or page rendering.

• Server maintains a key-value vector association

• Client populates this vector and performs research on it.

• The server then stores these values in memory therefore decreases reading time.

• If a server consumes the available memory, it removes older values

Page 21: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

Memory Management using Memcached

21

Memcached based Architecture

Protects the main

database from high

read demands from

users

Page 22: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

HADOOP & HIVE 22

• Hadoop is ideal for scalable systems with an enormous need to store and process large

amounts of information.

• Scales horizontally.

• Instead of replacing existing storage systems, Hadoop complements them.

• This process allows existing systems to serve data in real time or provide transactional

interactive business intelligence.

Page 23: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

Storing 23

• Apache Hadoop is being used in

three broad types of systems:

• as a warehouse for web analytics

• as storage for a distributed database

• and for MySQL database backups.

Page 24: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

CONCLUSIONS 24

• Facebook has gained a tremendous amount of popularity in the last years and

has become the most successful social networking website ever created.

• Facebook realizes the power of social networking and is constantly innovating to

keep their service the best in the business.

Page 25: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

REFERENCES 25

• Research.fb.com

• https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Beaver.pdf

• https://www.fb.com/barrigas.pdf

Page 26: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

QUERIES ?? 26

Page 27: OVERVIEW  OF FACEBOOK SCALABLE ARCHITECTURE

THANK YOU !

27