key-value databases in practice redis @ dotnettoscana

Post on 10-May-2015


Key-value databases in practice

Matteo Baglini
Software Developer, Freelance
matteo.baglini@gmail.com
http://it.linkedin.com/in/matteobaglini
http://github.com/bmatte

www.dotnettoscana.org

2

What is Redis?

«Advanced key-value store. It is often referred to as a data structure server»

3

Key-Value

Key                 Value
page:index          <html><head>[...]
user:123:session    xDrSdEwd4dSlZkEkj+
user:123:avatar     77u/PD94bWwgdm+

Everything is a «blob».
Commands, primarily, can GET and SET the values.

4

Advanced Key-Value

Key                 Value                                         Type
page:index          <html><head>[...]                             String
events:timeline     { «Joe logged», «File X Uploaded», … }        List
logged:today        { 1, 2, 3, 4, 5 }                             Set
user:123:profile    time => 10927353, username => bmatte          Hash
game:leaderboard    joe ~ 1.3483, smith ~ 293.45, fred ~ 83.22    Sorted Set

Different «data type/structure».
Rich set of specialized commands.

5

Advanced Key-Value

Everything is stored in memory
Screamingly fast performance
Persistent via snapshot or append-only log file
Replication (only Master/Slave)
Extensible via embedded scripting engine (Lua)
Rich set of client libraries
High availability (in progress)
  ◦ Cluster (Fault tolerance, Multi-Node consistency)
  ◦ Sentinel (Monitoring, Notification, Automatic failover)

6

Project

Created by Salvatore Sanfilippo (@antirez).
First «public release» in March 2009.
Since 2010 sponsored by VMware.
Initially written to improve the performance of LLOOGG, the web analytics product of his startup.

7

Project

Written in ANSI C
No external dependencies
Single thread (asynchronous evented I/O)
Works on all POSIX-like systems
Unofficial builds exist for Windows
Open source, BSD licensed
Community (list, IRC & wiki)

8

Manifesto

1. A DSL for Abstract Data Types.
2. Memory storage is #1.
3. Fundamental data structures for a fundamental API.
4. Code is like a poem.
5. We're against complexity.
6. Two levels of API.
7. We optimize for joy.

9

Getting Started

10

Install

Latest stable version (2.6.*)

11

Install

Latest unstable version (2.9.7)

12

Server

13

Configuration

14

Client

15

Telnet

16

Data Types

17

Strings

18

Strings

Any blob will do
(A value can be at most 512 MB)

19

Strings

Operations on strings holding an integer

20

Strings Commands

21

Strings Patterns

Sharing state across processes
  ◦ Distributed lock, Incremental ID, Time series, User session.
Web Analytics
  ◦ User visit (day, week, month), Feature Tracking.
Caching
  ◦ String values can hold arbitrary data.
Rate limiting
  ◦ Limit number of API calls/minute.
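
The rate-limiting pattern above is commonly built from an INCR on one counter key per time window. What follows is only a minimal sketch in plain Python: the `FixedWindowLimiter` class and the `rate:<client>:<window>` key layout are hypothetical, and the dictionary merely stands in for Redis string keys (no server is contacted).

```python
import time


class FixedWindowLimiter:
    """In-memory model of the INCR-per-window rate-limiting pattern."""

    def __init__(self, limit, window_seconds=60, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock      # injectable so the example is deterministic
        self.counters = {}      # stands in for Redis string keys

    def allow(self, client_id):
        # One counter key per client and per time window (hypothetical layout).
        bucket = int(self.clock() // self.window)
        key = f"rate:{client_id}:{bucket}"
        # Like INCR: a missing key counts as 0 before being incremented.
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.limit


now = [0.0]
limiter = FixedWindowLimiter(limit=2, clock=lambda: now[0])
first_minute = [limiter.allow("42") for _ in range(3)]
now[0] = 60.0                   # a new window starts with a fresh counter
next_minute = limiter.allow("42")
```

With a real server the per-window key would normally also be given an expiry, so that old counters disappear on their own.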

22

Keys

23

Expiration

Any item in Redis can be made to expire after, or at, a certain time.
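
To make the "after" versus "at" distinction concrete, here is a small in-memory sketch. The `ExpiringStore` class is hypothetical and only models the behaviour; it removes expired keys lazily when they are read, which is an assumption of this sketch rather than a description of how the server is implemented.

```python
import time


class ExpiringStore:
    """Toy key-value store with per-key expiry deadlines."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.data = {}
        self.deadlines = {}     # key -> absolute expiry time in seconds

    def set(self, key, value):
        self.data[key] = value
        self.deadlines.pop(key, None)   # a plain set clears any old deadline

    def expire(self, key, seconds):
        # Expire "after" a number of seconds from now.
        return self.expire_at(key, self.clock() + seconds)

    def expire_at(self, key, timestamp):
        # Expire "at" an absolute point in time.
        if key not in self.data:
            return False
        self.deadlines[key] = timestamp
        return True

    def get(self, key):
        deadline = self.deadlines.get(key)
        if deadline is not None and self.clock() >= deadline:
            # Drop the key the first time it is read after its deadline.
            del self.data[key]
            del self.deadlines[key]
        return self.data.get(key)


now = [100.0]
store = ExpiringStore(clock=lambda: now[0])
store.set("user:123:session", "xDrSdEwd4dSlZkEkj+")
store.expire("user:123:session", 30)
before = store.get("user:123:session")
now[0] = 130.0
after = store.get("user:123:session")
```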

24

Keys Commands

25

Lists

26

Lists

Sequence of string values

27

Lists

Sequence of string values
(Max length is 2^32 - 1 elements)

28

Lists

Prevent indefinite growth
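
A common way to keep a list bounded is to push at the head and then trim to the first N entries (LPUSH followed by LTRIM). The sketch below models that on a plain Python list; the helper name `push_capped` and the last two sample events are made up for illustration.

```python
def push_capped(items, value, max_len):
    """Model of LPUSH followed by LTRIM 0 max_len-1 on a Python list."""
    items.insert(0, value)      # push at the head, newest first
    del items[max_len:]         # keep only the first max_len entries
    return items


timeline = []
for event in ["Joe logged", "File X Uploaded", "Joe logged out", "File Y Uploaded"]:
    push_capped(timeline, event, max_len=3)
```

After the loop the list holds only the three most recent events, newest first.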

29

Lists Commands

30

Lists Patterns

Events Store or Notification
  ◦ Logs, Social Network Timelines, Notifications.
Fixed Data
  ◦ Last N activities.
Message Passing
  ◦ Durable MQ, Job Queue.
Circular list

31

Sets

32

Sets

Unordered set of unique values

33

Sets

Unordered set of unique values
(Max number of members is 2^32 - 1)

34

Sets

You can do unions, intersections, differences of sets in very short time.
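
The three set operations map directly onto Python's built-in set type, which makes for a quick illustration. The first set reuses the `logged:today` example from earlier; `logged_yesterday` is a hypothetical second set added for the comparison.

```python
logged_today = {1, 2, 3, 4, 5}      # from the logged:today example
logged_yesterday = {4, 5, 6}        # hypothetical second set

both_days = logged_today & logged_yesterday     # intersection, like SINTER
either_day = logged_today | logged_yesterday    # union, like SUNION
only_today = logged_today - logged_yesterday    # difference, like SDIFF
```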

35

Sets Commands

36

Sets Patterns

Web Analytics
  ◦ Unique Page View, IP addresses visiting.
Relations
  ◦ Friends, Followers, Tags.
Caching Result
  ◦ Store result of expensive intersection of data.

37

Sorted Sets

38

Sorted Sets

Ordered set of unique values

39

Sorted Sets

Access by rank

40

Sorted Sets

Access by score
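
Access by rank and access by score can be illustrated with a toy model that keeps members ordered by score. The `SortedSetModel` class is hypothetical and is not how the server stores sorted sets; the data comes from the `game:leaderboard` example.

```python
import bisect


class SortedSetModel:
    """Toy model of a sorted set: unique members ordered by score."""

    def __init__(self):
        self.scores = {}        # member -> score
        self.ordered = []       # sorted list of (score, member) pairs

    def add(self, member, score):
        # Like ZADD: re-adding a member replaces its previous score.
        if member in self.scores:
            self.ordered.remove((self.scores[member], member))
        self.scores[member] = score
        bisect.insort(self.ordered, (score, member))

    def top(self, n):
        # Access by rank, highest score first (like ZREVRANGE key 0 n-1).
        if n <= 0:
            return []
        return [member for _, member in reversed(self.ordered[-n:])]

    def by_score(self, low, high):
        # Access by score, inclusive bounds (like ZRANGEBYSCORE key low high).
        return [member for score, member in self.ordered if low <= score <= high]


board = SortedSetModel()
board.add("joe", 1.3483)
board.add("smith", 293.45)
board.add("fred", 83.22)
top_two = board.top(2)
low_scores = board.by_score(1, 100)
```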

41

Sorted Sets Commands

42

Sorted Sets Patterns

Web Analytics
  ◦ Online users, Most visited pages.
Leaderboard
  ◦ Show top N.
Order by data
  ◦ Maintain a set of ordered data, such as users by age.

43

Hashes

44

Hashes

Key → Value map (as value)

45

Hashes

Set attributes
(Store up to 2^32 - 1 field-value pairs)

46

Hashes

Get attributes

47

Hashes Commands

48

Hashes Patterns

Storing Objects
  ◦ Hashes are maps between string fields and string values, so they are the perfect data type to represent objects.
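
Because both fields and values are strings, storing an object means flattening its attributes to strings and parsing them back on read. A minimal sketch, assuming a hypothetical `UserProfile` type modelled on the `user:123:profile` example:

```python
from dataclasses import asdict, dataclass


@dataclass
class UserProfile:
    username: str
    time: int


def to_hash_fields(profile):
    # Hash values are strings, so every attribute is stringified.
    return {field: str(value) for field, value in asdict(profile).items()}


def from_hash_fields(fields):
    # Reverse mapping: parse each string back into its typed attribute.
    return UserProfile(username=fields["username"], time=int(fields["time"]))


profile = UserProfile(username="bmatte", time=10927353)
fields = to_hash_fields(profile)
restored = from_hash_fields(fields)
```

The resulting `fields` dictionary is the shape of data one would pass, field by field, to a hash-setting command.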

49

Persistence

50

Snapshot (RDB)

Dump data to disk after certain conditions are met
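
The conditions are typically expressed as "at least N changes within M seconds", written in the configuration file as `save <seconds> <changes>` lines. The sketch below models that check; the function name is hypothetical and the three rules are example values, so check them against your own configuration.

```python
# (seconds, changes) pairs, mirroring "save <seconds> <changes>" lines.
# Example values only.
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save, rules=SAVE_RULES):
    """True if any rule is met: enough time elapsed AND enough keys changed."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in rules
    )


busy = should_snapshot(61, 10000)       # many changes in about a minute
quiet = should_snapshot(120, 50)        # too few changes for the 60 s rule
slow = should_snapshot(900, 1)          # one change in 15 minutes
idle = should_snapshot(1000, 0)         # nothing changed at all
```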

51

Snapshot (RDB)

Pros:
  ◦ RDB is a very compact single file.
  ◦ RDB files are perfect for backups.
  ◦ RDB is very good for disaster recovery.
  ◦ RDB allows faster restarts with big datasets.
  ◦ RDB maximizes performance (background I/O via fork(2)).
Cons:
  ◦ RDB is NOT good if you need to minimize the chance of data loss in case Redis stops working (for example after a power outage).
  ◦ Fork can be time consuming if the dataset is very big.

52

Append-only (AOF)

Append all write operations to a log

53

Append-only (AOF)

Durability depends on fsync(2) policy
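
The trade-off can be sketched with a tiny append-only log that calls fsync according to a policy and is replayed to rebuild the data. Everything here is illustrative: the `AppendOnlyLog` class, the one-command-per-line format, and the `replay` helper are assumptions of this sketch (a real AOF file uses the Redis protocol format). The policy names mirror the `always`, `everysec` and `no` options.

```python
import os
import tempfile
import time


class AppendOnlyLog:
    """Toy append-only log with a configurable fsync policy."""

    def __init__(self, path, fsync_policy="everysec"):
        if fsync_policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown fsync policy: {fsync_policy}")
        self.policy = fsync_policy
        self.file = open(path, "a", encoding="utf-8")
        self.last_sync = time.monotonic()

    def append(self, *command):
        # One write command per line (simplified; values must not contain spaces).
        self.file.write(" ".join(command) + "\n")
        self.file.flush()
        now = time.monotonic()
        due = self.policy == "everysec" and now - self.last_sync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.file.fileno())    # force the data onto disk
            self.last_sync = now

    def close(self):
        self.file.close()


def replay(path):
    """Rebuild the dataset by re-applying SET/DEL commands in order."""
    data = {}
    with open(path, encoding="utf-8") as log:
        for line in log:
            name, key, *rest = line.split()
            if name == "SET":
                data[key] = rest[0]
            elif name == "DEL":
                data.pop(key, None)
    return data


path = os.path.join(tempfile.mkdtemp(), "appendonly.log")
log = AppendOnlyLog(path, fsync_policy="always")
log.append("SET", "counter", "1")
log.append("SET", "counter", "2")
log.append("SET", "tmp", "x")
log.append("DEL", "tmp")
log.close()
restored = replay(path)
```

With `always`, every write pays for a disk sync; with `everysec`, up to about a second of writes can be lost; with `no`, syncing is left to the operating system.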

54

Append-only (AOF)

Pros:
  ◦ AOF is much more durable.
  ◦ AOF is an append-only log: no seeks, no corruption problems (for example after a power outage).
  ◦ AOF contains a log of all the operations, one after the other, in a format that is easy to understand and parse.
Cons:
  ◦ AOF files are usually bigger than the equivalent RDB.
  ◦ AOF can be slower than RDB, depending on the exact fsync policy.

55

What should I use?

Use both persistence methods if you want a degree of data safety comparable to what any RDBMS can provide you.

If you care a lot about your data, but can still live with a few minutes of data loss in case of disasters, you can simply use RDB alone.

There are many users using AOF alone, but we discourage it: having an RDB snapshot from time to time is a great idea for database backups and for faster restarts.

56

C# Clients

57

C# clients

Rich set of clients

58

C# clients

59

BookSleeve

60

Code

61

Transactions

62

Transactions

Multiple commands (ACID)

63

Transactions Commands

64

Transactions Patterns

Classic scenario
  ◦ Multiple commands executed atomically.
Optimistic locking
  ◦ Check and Set (CAS pattern): write only if the value has not changed.
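
The check-and-set idea can be modelled with a per-key version counter: read the value, remember its version, and apply the write only if the version is unchanged. This is a toy in-memory sketch in the spirit of WATCH / MULTI / EXEC; the `VersionedStore` class and its method names are hypothetical.

```python
class VersionedStore:
    """Toy model of optimistic locking with per-key version counters."""

    def __init__(self):
        self.data = {}
        self.versions = {}      # key -> number of writes seen so far

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def watch(self, key):
        # Remember the version at the time the value is read.
        return self.versions.get(key, 0)

    def exec_if_unchanged(self, key, watched_version, new_value):
        # Apply the write only if nobody modified the key since watch().
        if self.versions.get(key, 0) != watched_version:
            return False
        self.set(key, new_value)
        return True


store = VersionedStore()
store.set("stock", 10)

version = store.watch("stock")
value = store.get("stock")
store.set("stock", 7)           # another client writes in the meantime
first_try = store.exec_if_unchanged("stock", version, value - 1)

version = store.watch("stock")  # retry: read again, then write
second_try = store.exec_if_unchanged("stock", version, store.get("stock") - 1)
```

The first attempt is rejected because the key changed after it was watched; the retry succeeds.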

65

Publish Subscribe

66

PubSub

Provide 1-N messaging

67

PubSub

Subscribe to multiple channels, decoupled from the key space

68

PubSub

Publish on some channel

69

PubSub

Subscriber getting notified
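
The subscribe / publish / notify flow on the last few slides can be modelled in-process with a small broker. The `Broker` class, channel names and messages below are hypothetical; the sketch only shows the 1-N fan-out, not the network protocol.

```python
from collections import defaultdict


class Broker:
    """Toy 1-N publish/subscribe broker."""

    def __init__(self):
        self.subscribers = defaultdict(list)    # channel -> callbacks

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, message):
        # Deliver to every current subscriber; return how many received it.
        receivers = list(self.subscribers.get(channel, []))
        for callback in receivers:
            callback(channel, message)
        return len(receivers)


inbox_a, inbox_b = [], []
broker = Broker()
broker.subscribe("news", lambda channel, message: inbox_a.append((channel, message)))
broker.subscribe("news", lambda channel, message: inbox_b.append((channel, message)))
broker.subscribe("sport", lambda channel, message: inbox_b.append((channel, message)))

delivered = broker.publish("news", "hello")
unheard = broker.publish("weather", "rain")     # no subscribers on this channel
```

Messages published to a channel without subscribers are simply dropped, which is why this model keeps no backlog.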

70

PubSub Commands

71

PubSub Patterns

Message Passing
  ◦ Distributed message-oriented systems, Event-Driven Architecture, Service Bus.

72

Code

73

Replication

74

Replication

One master replicates to multiple slaves

75

Replication

The slave sends the SYNC command and the master transfers the database file to the slave

76

Replication

Slaves can perform only read operations

77

Replication Patterns

Scalability
  ◦ Multiple slaves for read-only queries.
Redundancy
  ◦ Data replication.
Slave of Slave
  ◦ Graph-like structure for more scalability and redundancy.
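
On the application side, the scalability pattern amounts to sending writes to the master and spreading reads across the slaves. A minimal routing sketch, where the `ReplicaRouter` class, the host names and the short list of write commands are all assumptions made for illustration:

```python
import itertools

# Illustrative subset of commands that modify data.
WRITE_COMMANDS = {"SET", "DEL", "INCR", "LPUSH", "SADD", "ZADD", "HSET"}


class ReplicaRouter:
    """Pick a node for each command: master for writes, slaves for reads."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None

    def node_for(self, command):
        # Slaves are read-only, so every write must go to the master.
        if command.upper() in WRITE_COMMANDS or self._slaves is None:
            return self.master
        return next(self._slaves)   # round-robin over the slaves


router = ReplicaRouter("master:6379", ["slave-1:6379", "slave-2:6379"])
targets = [router.node_for(command) for command in ["GET", "SET", "GET", "GET"]]
```

Reads served by a slave may lag slightly behind the master, so this kind of routing suits queries that tolerate slightly stale data.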

78

Performance

79

Performance

~50K read/write operations per second.

~100K read/write ops per second on a regular EC2 instance.

Screamingly fast performance

80

Performance

redis-benchmark tool on an Ubuntu virtual machine: ~36K rps

81

Application Architecture

82

Infrastructure

Application Server

SQL Server

Redis

83

Who is using Redis?

84

Finally

85

This is Redis

«I see Redis definitely more as a flexible tool than as a solution specialized to solve a specific problem: his mixed soul of cache, store, and messaging server shows this very well»

Salvatore Sanfilippo

86

Resources

http://redis.io/
http://github.com/antirez/redis
http://groups.google.com/group/redis-db

That’s all!
