Transcript
  • Scalable SQL and NoSQL Data Stores. Advisor (GVHD): Assoc. Prof. Dr. ng Th Bch Thy. Student (HVTH): Hunh Th Thu Nga, 14 12 007

    Scalable SQL and NoSQL Data Stores

  • Why NoSQL? RDBMS: no longer suitable in the Internet era

    Maximal Objects and the Semantics of Universal Relation Databases

  • Why NoSQL? Diagram labels: "Post A has ID: 1234", "Post A has ID: 1234", "Post B has ID: 1234". Result: data conflict.

  • Why NoSQL?
    RDBMS LIMITATIONS
    - Slow reads/writes
    - Limited storage
    - Hard to scale
    - High operating cost
    NEEDS OF THE ERA
    - Fast reads/writes
    - Storing large data, Big Data
    - Easy to scale
    - Low operating cost

  • NoSQL: Non-Relational. Timeline: 1998, 2009.

  • Identifying characteristics of NoSQL
    - Schema-free.
    - Easy to scale out.
    - Simple API.
    - Eventual consistency, with transactions limited to single data items.
    - No limit on data space.

  • Some NoSQL concepts

    RDBMS          NoSQL
    Columns        Fields
    Row            Document
    Table          Collection
    Query: SQL     Query: using API
    Foreign keys   No foreign keys
    Schema         Free schema

  • Characteristics of NoSQL
    - No ACID guarantees.
    - BASE properties, in contrast to ACID:
      - Basically Available
      - Soft state
      - Eventually consistent

  • Advantages of NoSQL
    - High performance
    - Paging capability
    - Open source
    - Scale-out capability
    - Different NoSQL databases for different projects
    - Economical

  • Disadvantages of NoSQL
    - Non-relational data structure
    - Open source, with support that is uneven across companies
    - Limited domain expertise
    - Lack of intelligence features
    - Compatibility issues

  • Classification of NoSQL

  • Key Value Store: a simple API (Application Programming Interface)
    void Put(string key, byte[] data);
    byte[] Get(string key);
    void Remove(string key);
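
As a rough illustration of the Put/Get/Remove API above, the following Python sketch keeps values in a dictionary. The class name `InMemoryKeyValueStore`, the lowercase method names, and the `cart:42` key are assumptions made for this example; they do not come from the slides or from any particular product.

```python
from typing import Dict, Optional


class InMemoryKeyValueStore:
    """Hypothetical in-memory store mirroring the Put/Get/Remove API on the slide."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # The value is an opaque blob; the store never inspects it.
        self._data[key] = data

    def get(self, key: str) -> Optional[bytes]:
        # Returning None for a missing key is a choice made for this sketch.
        return self._data.get(key)

    def remove(self, key: str) -> None:
        # Removing a missing key is treated as a no-op.
        self._data.pop(key, None)


store = InMemoryKeyValueStore()
store.put("cart:42", b'{"items": ["book", "pen"]}')
cart = store.get("cart:42")
store.remove("cart:42")
```

Because every operation goes through a single key and the value is never interpreted, such a store could in principle be partitioned by key across machines, which is one reason this model is considered easy to scale.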

  • Key Value Store
    - Values are retrieved, deleted, and updated through their corresponding key.
    - Values are stored as BLOBs (binary large objects).
    - Good performance.
    - Simple to build and easy to scale.
    - The foundation for the other kinds of NoSQL databases.
    - Example: the Amazon shopping cart (Amazon Dynamo).

  • Key Value Store: some popular kinds of key-value store
    - Key/value cache in RAM: memcached, Citrusleaf database, Velocity, Redis, Tuple space...
    - Key/value stored on disk: Memcachedb, Berkeley DB, Tokyo Cabinet, Redis...
    - Eventually consistent key-value store: Amazon Dynamo, Dynomite, KAI, Cassandra, Hibari, Project Voldemort
    - Ordered key-value store: NMDB, Memcachedb, Berkeley DB...
    - Distributed systems: Apache River, MEMBASE, Azure Table Storage, Amazon Dynamo...

  • Column Families / Wide Column Store
    - A column-family database is a distributed database that allows random, real-time access while storing extremely large amounts of structured data.
    - Column family: a column family defines how data is laid out on disk. All the data in one column is stored in the same file. A column family can contain super columns or columns.

  • Column Families / Wide Column Store
    - Column: a column is a tuple of name, value, and timestamp (usually only the key-value pair matters).
    - Super column: a super column can be used like a dictionary. It is a column that can contain other columns (but not other super columns).
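
The column and super column definitions above can be pictured as nested dictionaries. The sketch below is only a mental model in Python, not the storage format of any real wide column store; the names `Column`, `make_super_column`, `user_profiles`, and the sample city are made up for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Column:
    """A column as described above: a name, a value, and a timestamp."""
    name: str
    value: str
    timestamp: float = field(default_factory=time.time)


# A super column behaves like a dictionary of plain columns
# (it may not contain other super columns).
SuperColumn = Dict[str, Column]


def make_super_column(values: Dict[str, str]) -> SuperColumn:
    return {name: Column(name, value) for name, value in values.items()}


# Hypothetical column family: row key -> super column name -> column name -> Column
user_profiles: Dict[str, Dict[str, SuperColumn]] = {
    "user:1": {
        "address": make_super_column({"street": "5 Oak St.", "city": "Springfield"}),
    },
}

street = user_profiles["user:1"]["address"]["street"].value
```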

  • Document Database
    - Essentially, a document database is a key-value store whose value is in a specific format: XML, YAML, JSON, BSON, or a binary type.
    - The key is a simple string: a URI or a path.

  • Document Database

    Document 1
    { FirstName:"Bob", Address:"5 Oak St.", Hobby:"sailing" }

    Document 2
    { FirstName:"Jonathan", Address:"15 Wanamassa Point Road",
      Children:[ {Name:"Michael",Age:10}, {Name:"Jennifer", Age:8},
                 {Name:"Samantha", Age:5}, {Name:"Elena", Age:2} ] }

  • Document Database
    - Project a document's data into another format.
    - Run aggregate computations over a set of documents.
    - Update part of a document.
    - Easy to distribute.
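
The first three operations listed above can be sketched on the two example documents using plain Python dictionaries. The helper names `project`, `count_children`, and `partial_update`, and the keys `doc1` and `doc2`, are hypothetical; a real document database would expose these operations through its own API or query language.

```python
from typing import Any, Dict, Iterable, List

documents: Dict[str, Dict[str, Any]] = {
    "doc1": {"FirstName": "Bob", "Address": "5 Oak St.", "Hobby": "sailing"},
    "doc2": {
        "FirstName": "Jonathan",
        "Address": "15 Wanamassa Point Road",
        "Children": [
            {"Name": "Michael", "Age": 10},
            {"Name": "Jennifer", "Age": 8},
            {"Name": "Samantha", "Age": 5},
            {"Name": "Elena", "Age": 2},
        ],
    },
}


def project(doc: Dict[str, Any], fields: List[str]) -> Dict[str, Any]:
    # Projection: keep only the requested fields that the document actually has.
    return {name: doc[name] for name in fields if name in doc}


def count_children(docs: Iterable[Dict[str, Any]]) -> int:
    # Aggregation over a set of documents; a missing field counts as zero.
    return sum(len(doc.get("Children", [])) for doc in docs)


def partial_update(doc: Dict[str, Any], changes: Dict[str, Any]) -> None:
    # Partial update: only the given fields change, the rest stay untouched.
    doc.update(changes)


names = [project(doc, ["FirstName"]) for doc in documents.values()]
total_children = count_children(documents.values())
partial_update(documents["doc1"], {"Hobby": "fishing"})
```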

  • Graph Database
    - A graph database is a kind of database designed specifically for storing graph information such as edges, nodes, and properties.
    - A graph database can be seen as a document database with special document types and relationships.

  • Graph Database

  • Graph Database
    - Graph databases are typically used to solve network problems.
    - Hard to scale, because independent subgraphs are hard to find.
    - Some representative graph database products: Neo4J, Sones, AllegroGraph, Core Data, DEX, FlockDB, InfoGrid, OpenLink Virtuoso,...

  • CONCLUSION

  • ACID properties
    - Atomicity: This property guarantees that each transaction is a single unit that is either carried out in full or not carried out at all. If an error occurs inside a transaction, it is rolled back to its initial state. When you group several statements into one transaction (enclosed between BEGIN TRAN and COMMIT), only two outcomes are allowed: either all of the statements are executed or none of them is. SQL Server also guarantees atomicity at the level of a single statement: for example, if an INSERT of 10 records hits an error after 5 records have been added, the system cancels the statement and no record is added. If the statement has a trigger attached, an error in the trigger also cancels the statement. When you issue ROLLBACK, all statements already executed are undone and the data returns to the state it was in before the transaction began.
    - Consistency: SQL Server guarantees that data is consistent at all times, meaning it obeys the defined constraints (for example, a date column must contain date values, and a sales record must have a valid product code). After a transaction runs, the updated data must also be in a consistent state. If a transaction violates a data constraint, the system refuses to continue and cancels the whole transaction.
    - Isolation: Like other server systems, SQL Server can serve many requests concurrently. Each transaction is nevertheless guaranteed to run in its own context, unaffected by other transactions. When two transactions update the same data, SQL Server ensures they are executed one after the other so that they do not step on each other.
    - Durability: Once a transaction has completed (committed), its updates become permanent. If the system fails unexpectedly, the recovery process guarantees that the data of committed transactions is restored.
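
The all-or-nothing behaviour of a transaction can be tried out with a small script. The sketch below uses SQLite through Python's standard `sqlite3` module as a stand-in for SQL Server, so it illustrates the transaction-level case only. The `sales` table, its columns, and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, product_code TEXT NOT NULL)"
)
conn.commit()

# The third row violates the NOT NULL constraint on product_code.
rows = [(1, "A-100"), (2, "B-200"), (3, None), (4, "D-400")]

try:
    # Used as a context manager, the connection commits if the block succeeds
    # and rolls back if the block raises an exception.
    with conn:
        for row in rows:
            conn.execute(
                "INSERT INTO sales (id, product_code) VALUES (?, ?)", row
            )
except sqlite3.IntegrityError:
    pass  # the failed transaction has already been rolled back

row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

Although two inserts succeeded before the error, `row_count` ends up as 0: the constraint violation (consistency) cancels the whole transaction (atomicity).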

    Authors: David Maier (SUNY at Stony Brook) and Jeffrey D. Ullman (Stanford University).

    ACID: Atomicity, Consistency, Isolation, and Durability.

    For the past 40 years, SQL and relational databases (RDBMS) have been the trusted choice for data storage systems, with ACID as the core strength of the relational data model. Since the arrival of the Internet, however, and especially since the Web 2.0 boom, that very strength has become its biggest weakness in the Internet environment. SQL and the relational data model can no longer keep up with the growth of the Internet.

    In a connected world, large companies need data centers in many places around the world to keep access fast, and these data centers need to stay synchronized with each other. A distributed system built on an RDBMS requires continuous, uninterrupted connections between the data servers. If the connection between data servers fails, duplicate data can easily arise, which causes data conflicts and violates the consistency of the relational data model. The servers are connected in a Master-Master configuration and use the 2PC protocol.

    Today's Internet applications have hundreds of millions or even billions of users, so servers must execute an extremely large number of read and write operations at the same moment. Server systems backed by relational databases process queries slowly because of the many interdependent constraints, and so can no longer meet these demands.
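
The ID conflict between two master servers can be reproduced with a toy model. This is a simplified sketch that assumes each master hands out auto-increment IDs locally while the link between them is down; the `MasterNode` class and the starting counter 1233 are assumptions chosen so that the result matches the "ID: 1234" labels on the slide.

```python
from typing import Dict


class MasterNode:
    """Hypothetical master server that assigns auto-increment IDs on its own."""

    def __init__(self, name: str, last_id: int) -> None:
        self.name = name
        self.last_id = last_id
        self.posts: Dict[int, str] = {}

    def insert(self, title: str) -> int:
        # Without coordination, the next ID depends only on local state.
        self.last_id += 1
        self.posts[self.last_id] = title
        return self.last_id


# Both masters were in sync up to ID 1233 before the connection failed.
master_1 = MasterNode("master-1", last_id=1233)
master_2 = MasterNode("master-2", last_id=1233)

id_a = master_1.insert("Post A")  # written on master-1 during the outage
id_b = master_2.insert("Post B")  # written on master-2 during the outage

# When the link comes back, the same ID refers to two different posts.
conflicts = {
    post_id
    for post_id, title in master_1.posts.items()
    if post_id in master_2.posts and master_2.posts[post_id] != title
}
```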

    In recent years, the growth of the Internet and modern technology has brought a wave of new data formats, especially media data. Whereas RDBMSs were designed to store data items of at most a few hundred megabytes, the new formats reach several gigabytes, and ultra-high-resolution images can now reach terabytes. Alongside these come unstructured data types such as geographic location (GIS) data, user session data, hardware device activity data, aircraft engine data, sensor data, and so on. These kinds of data are collectively called Big Data, and they are beyond the processing capability of SQL and relational databases.

    Needs:
    - The ability to handle millions of reads/writes at high speed (low latency)

    NoSQL means Non-Relational; today, however, it is usually read as Not Only SQL. It is the general term for database systems that do not use the relational data model.

    The term NoSQL was first introduced in 1998 as a name for small open-source relational databases that did not use SQL for querying. In 2009, Eric Evans, a Rackspace employee, reintroduced the term at an event on open-source distributed databases. The term NoSQL marks the arrival of a new generation of databases: distributed + non-relational.

    - Fields: equivalent to columns in SQL.
    - Document: replaces the row in SQL. This is also the concept that sets NoSQL apart from SQL: a document holds a variable number of fields, whereas a row has a number of columns fixed in advance.
    - Collection: equivalent to a table in SQL. A collection is a set of documents. Notably, a collection can hold documents that are completely different from one another.
    - Key-value: the key-value pair used to store data in NoSQL.
    - Cursor: a cursor is used to fetch data from the database.

    Eventual consistency: data consistency does not have to be guaranteed immediately after every write. A distributed system lets changes spread by propagation, and after some time (not immediately) a change reaches every point in the system; that is, eventually the data in the system returns to a consistent state.
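
The propagation described above can be simulated in a few lines. This is a deliberately simplified sketch: replicas are dictionaries, each write carries a version number, and a replica accepts an incoming value only if its version is newer. The ring-shaped `propagate` round and all names are assumptions for illustration, not the replication protocol of any specific NoSQL system.

```python
from typing import Dict, List, Optional, Tuple

# Each replica maps a key to a (version, value) pair.
Replica = Dict[str, Tuple[int, str]]


def write(replica: Replica, key: str, value: str, version: int) -> None:
    replica[key] = (version, value)


def sync(source: Replica, target: Replica) -> None:
    # Copy an entry only when the source holds a newer version of it.
    for key, (version, value) in source.items():
        if key not in target or target[key][0] < version:
            target[key] = (version, value)


def propagate(replicas: List[Replica]) -> None:
    # One round: every replica pushes its state to the next one in a ring.
    for i, replica in enumerate(replicas):
        sync(replica, replicas[(i + 1) % len(replicas)])


replicas: List[Replica] = [{}, {}, {}]
write(replicas[0], "profile:1", "new avatar", version=1)

# Right after the write, another replica has not seen the change yet.
stale_read: Optional[Tuple[int, str]] = replicas[2].get("profile:1")

propagate(replicas)
converged = all(replica == replicas[0] for replica in replicas)
```

A read from the third replica right after the write returns nothing, which is the window of inconsistency that eventual consistency accepts; after propagation all replicas agree.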

    NoSQL databases typically use clusters of cheap commodity servers to handle growing data and transaction volumes, whereas RDBMSs tend to rely on expensive proprietary servers and storage systems. As a result, the cost per gigabyte or per transaction per second for NoSQL can be many times lower than for an RDBMS, letting you store and process more data at a much lower price.

    API (Application Programming Interface)

    Building a key/value store is very simple, and scaling it is also very easy.

    Data can exist as tables with billions of records, where each record can contain millions of columns. A deployment of a few hundred to several thousand commodity hardware nodes can store petabytes of data while still delivering high performance.

    The central concept of a document database is the document. Each document database differs in its implementation details, but all of them package and encode document data in some standard format or encoding. Encodings in use include XML, YAML, JSON, and BSON, as well as binary forms such as PDF and Microsoft Office documents (MS Word, Excel, ...). In practice, document databases use JSON (or BSON) or XML.

    Documents are identified in a document database by a unique key that represents the document. Usually this key is a simple string; in some cases it is a URI or a path. The key can be used to retrieve the document from the database. The database typically also keeps an index on the document key so that documents can be looked up quickly. In addition, the database provides an API or query language that lets you retrieve documents based on their content, for example all documents that have certain fields set to certain values.
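
The two access paths described above, lookup by key and query by content, can be sketched as follows. The `DocumentStore` class, its method names, and the path-style keys are hypothetical and only mimic the idea; real document databases provide their own APIs and build indexes instead of scanning every document.

```python
from typing import Any, Dict, List, Optional


class DocumentStore:
    """Hypothetical document store: documents are dictionaries addressed by a string key."""

    def __init__(self) -> None:
        self._documents: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, document: Dict[str, Any]) -> None:
        self._documents[key] = document

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        # Lookup by the unique key (here a path-like string).
        return self._documents.get(key)

    def find(self, **criteria: Any) -> List[str]:
        # Query by content: keys of documents whose fields match every criterion.
        # This sketch scans all documents; it does not use an index.
        return [
            key
            for key, document in self._documents.items()
            if all(
                name in document and document[name] == value
                for name, value in criteria.items()
            )
        ]


people = DocumentStore()
people.put("/people/bob", {"FirstName": "Bob", "Address": "5 Oak St.", "Hobby": "sailing"})
people.put("/people/jonathan", {"FirstName": "Jonathan", "Address": "15 Wanamassa Point Road"})

by_key = people.get("/people/bob")
by_content = people.find(Hobby="sailing")
```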

    Essentially, a document database is a key-value store whose value is in a known format.

    The two documents above share some information and differ in the rest. In a traditional relational database, every record (row) has the same set of fields (columns), and unused fields are stored empty. In a document database, by contrast, a document has no empty fields. This lets new information be added without having to declare it explicitly.

    An important benefit of a document database is working with documents directly. There is little or no impedance mismatch between objects and documents, which means that storing data in a document database is much easier than in an RDBMS when the data has a complex structure. Designing the physical data model in an RDBMS is often laborious because the way data is laid out in the database and the way we think about it in the application are completely different. An RDBMS also has a schema, and changing the schema is genuinely hard when the system is deployed across many nodes.

    Documents do not support relationships. Each document is therefore independent, and data is easier to distribute than in an RDBMS: there is no need to keep all related data on the same shard, and no need to support joins across a distributed system.

    A typical example is a social network. The example figure has 4 documents and 3 relationships. A relationship in a graph database means more than a plain pointer. A relationship can be one-way or two-way, but more importantly it is typed: one person can be linked to another in many ways, for example as a customer or as a family member. A relationship can itself carry information. In this example we simply store the type of relationship and the degree of closeness (friend, family member, partner).

    Graph databases are typically used to solve network problems. In fact, most social networking sites use some form of graph database for the things we know, such as friending and friends of friends.

    One problem with scaling a graph database is that an independent subgraph is very hard to find, which means the graph database is very hard to split into shards.
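
To make typed relationships and the "friends of friends" query concrete, here is a small in-memory sketch. It models 4 nodes and 3 relationships like the social network example, but the names (`alice`, `bob`, `carol`, `dave`), the `closeness` property, and the `SocialGraph` class are invented for illustration and do not correspond to any of the products listed on the slides.

```python
from collections import defaultdict
from typing import Any, Dict, List, Set, Tuple


class SocialGraph:
    """Hypothetical in-memory graph: nodes plus typed relationships with properties."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Dict[str, Any]] = {}
        # node id -> list of (target id, relationship type, relationship properties)
        self.edges: Dict[str, List[Tuple[str, str, Dict[str, Any]]]] = defaultdict(list)

    def add_node(self, node_id: str, **properties: Any) -> None:
        self.nodes[node_id] = properties

    def relate(self, source: str, target: str, kind: str,
               two_way: bool = True, **properties: Any) -> None:
        # A relationship has a type and may carry its own information.
        self.edges[source].append((target, kind, properties))
        if two_way:
            self.edges[target].append((source, kind, properties))

    def neighbors(self, node_id: str, kind: str) -> Set[str]:
        return {
            target
            for target, edge_kind, _ in self.edges.get(node_id, [])
            if edge_kind == kind
        }

    def friends_of_friends(self, node_id: str) -> Set[str]:
        direct = self.neighbors(node_id, "friend")
        indirect: Set[str] = set()
        for friend in direct:
            indirect |= self.neighbors(friend, "friend")
        # Exclude the person and the people they are already friends with.
        return indirect - direct - {node_id}


graph = SocialGraph()
for person in ("alice", "bob", "carol", "dave"):
    graph.add_node(person)
graph.relate("alice", "bob", "friend", closeness="close")
graph.relate("bob", "carol", "friend", closeness="casual")
graph.relate("carol", "dave", "family", closeness="close")

suggestions = graph.friends_of_friends("alice")
```

The traversal hops from node to node along relationships, so it may touch any part of the graph. That is the practical side of the scaling problem: when no independent subgraph exists, any split into shards cuts through relationships that queries need to follow.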
