nosql - universitetet i agdergrimstad.uia.no/.../slides/ikt437-nosql-20151116-c.pdf · 11/16/2015...

IKT437 Knowledge Engineering and Representation NoSQL Terje Gjøsæter, Ph.D. UiA, Grimstad – 16. November 2015


Page 1: NoSQL - Universitetet i Agdergrimstad.uia.no/.../slides/IKT437-NoSQL-20151116-c.pdf · 11/16/2015  · Social Media and Web 2.0 –Example of Big Data •Google, Facebook, Twitter,

IKT437 Knowledge Engineering and Representation

NoSQL

Terje Gjøsæter, Ph.D.

UiA, Grimstad – 16. November 2015


Overview

• Introduction and Motivation

• History of NoSQL

• Categories of NoSQL

• Examples of NoSQL systems

• Encodings

• Querying

• Examples

• Summary


Introduction

• NoSQL has become increasingly popular and important lately.

• NoSQL – No SQL, or Not Only SQL?

• Many different variants, covering many different needs and use cases.

• So what is NoSQL? Every data store that is not an SQL-based RDBMS?

• Q: Opinions?


Typical characteristics

• Non-relational

• Flexible schema

• Less structured data

• Supports big data

• Query languages other than, or in addition to, SQL

• Distributed – horizontal scaling

• Eventual consistency – tradeoff due to CAP theorem

• Q: Are you all familiar with the CAP theorem and consistency models?


CAP Theorem

• It is impossible for a distributed system to provide all three of the following at the same time:

• Consistency (all nodes see the same data at the same time)

• Availability (a guarantee that every request receives a response)

• Partition tolerance (the system continues despite partitioning due to network failures)


Consistency Models (from distributed computing)

• Eventual Consistency

• A weak consistency model for systems in which updates are not applied to all replicas simultaneously.

• If no new updates are made for a sufficiently long time, all replicas will eventually become consistent.

• Strict consistency

• The strongest consistency model.

• Requires that if a process reads any memory location, the value returned by the read operation is the value written by the most recent write operation to that location.
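
As a rough illustration of eventual consistency, the sketch below models three replicas that accept writes independently and later exchange state, keeping the value with the newest timestamp (last-write-wins). The `Replica` class, the logical timestamps and the `sync` helper are assumptions made for this example; they are not taken from any particular NoSQL system.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A replica holding key -> (timestamp, value) pairs."""
    name: str
    data: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        # Accept the write locally; other replicas do not see it yet.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Anti-entropy step: both replicas keep the newest version of each key."""
    for key in set(a.data) | set(b.data):
        versions = [r.data[key] for r in (a, b) if key in r.data]
        newest = max(versions, key=lambda v: v[0])
        a.data[key] = newest
        b.data[key] = newest


r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")
r1.write("post", "hello", timestamp=1)
r2.write("post", "hello, world", timestamp=2)

# Before synchronisation the replicas disagree (stale or missing reads).
before = [r.read("post") for r in (r1, r2, r3)]

# Once updates stop and the replicas exchange state, they converge.
sync(r1, r2)
sync(r2, r3)
sync(r1, r3)
after = [r.read("post") for r in (r1, r2, r3)]
```

Under strict consistency, by contrast, no reader would ever have observed the stale values in `before`.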


Motivation

• Why NoSQL?

• Less structured databases needed.

• Not all data fits into a relational, table-based structure.

• Social Media and Big Data are the big drivers for new database types.

• Data tends to be less structured and too big for traditional RDBMS.

• Let’s briefly introduce the data storage needs of Social Media and Big Data.


Social Media and Web 2.0 – Example of Big Data

• Google, Facebook, Twitter, Instagram, Amazon and Yahoo, among others, need to store and handle enormous amounts of data.

• These data tend to have different characteristics and requirements compared to «typical» structured database data.

• Less strict structure in the data.

• Need for a way to distribute data across clusters that is easy to manage and use.

• Different requirements for consistency (see the CAP theorem).

• Example: sometimes we see a post on Facebook disappearing and then showing up again later.


Big Data – Early Definitions

• "Large data sets, taxing the capacities of main memory, local disk, and even remote disk" (1997)

• "Data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges"

• "Datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze" (McKinsey)


HANDLING AND STORAGE OF «TOO BIG» DATA

Source: Georgia Tech Library (http://d7.library.gatech.edu)


Meanwhile: Big Data – Opportunity-Enablers

• Business Intelligence: Analysing data to make informed business decisions.

• Data Warehousing: Central repository of integrated (and highly structured) data for reporting and analysis.

• Data Mining: Searching for interesting trends and patterns in data.


Defining Big Data as an Opportunity

Mayer-Schönberger & Cukier 2013: “The ability of society to harness information in novel ways to produce useful insights…” and “…things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value.”

License: CC0 Public Domain


Big Data can be……..

Source: Wikimedia Commons, Camelia Boban, licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license

SECURITY AND PRIVACY


Big Data Aspects and Life Cycle Overview

• Selection, Harvesting, Data Integration: Store all? Include external data from different sources?

• Structuring and Storage: Structured or unstructured? With meta-information? SQL or NoSQL?

• Analysis, Visualisation: Machine learning; graphs, maps

• Protection and Usage Policy: Security, privacy; sharing policy


NoSQL to the Rescue!

• NoSQL is able to cover the needs of Social Media and Big Data

• But different variants of NoSQL also support

• Small data

• Simple data

• Awkwardly shaped data

• Funny data

• Odd data


Typical Features of NoSQL

• Running well on clusters

• Mostly open-source

• Schema-less

• No need to convert your data to and from a relational data model; the data model of your software can be used directly.


History of NoSQL

• Q: When did people first start talking about NoSQL?

• The term NoSQL was used by Carlo Strozzi in 1998 as the name of his lightweight open-source relational database, Strozzi NoSQL, which did not expose the standard SQL interface but was still relational.

• Johan Oskarsson of Last.fm reintroduced the term NoSQL in early 2009 when he organised an event to discuss "open source distributed, non relational databases".

• Most early NoSQL systems did not support ACID transactions or joins. This has been changing lately…

• Q: Are you all familiar with the ACID requirements for databases?


ACID

• Atomicity means that database modifications must follow an all or nothing rule. Each transaction is said to be atomic. If one part of the transaction fails, the entire transaction fails.

• Consistency means that only valid data will be written to the database.

• Isolation requires that multiple transactions occurring at the same time not impact each other’s execution.

• Durability ensures that any transaction committed to the database will not be lost.
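
A minimal sketch of atomicity and consistency, using Python's built-in `sqlite3` module purely because it ships with the standard library (SQLite is a relational database, not a NoSQL system). The `account` table, the `CHECK` constraint and the `transfer` function are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move money between accounts; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # valid transfer
failed = transfer(conn, "alice", "bob", 500)  # would make alice's balance negative
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

In the second call the credit to `bob` has already been executed when the debit violates the constraint; the rollback undoes it, so no partial transfer is visible afterwards.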


Categories of NoSQL – Key-value-based

• Key-value-based

• Supports a dictionary or map of key-value pairs.

• The value may be simple or an (un-/semi-)structured blob of data.

• Often used as the basis for more complex data models.

• Wide Column Store

• A type of key-value database. It uses tables, rows, and columns, but unlike a relational database, the names and format of the columns can vary from row to row in the same table.
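
The key-value idea above can be sketched as a thin wrapper around a dictionary. The `KeyValueStore` class and the key naming scheme (`session:42`, `user:alice`) are assumptions for illustration; the point is that the store treats every value as an opaque blob and leaves interpretation to the application.

```python
import json


class KeyValueStore:
    """Minimal in-memory key-value store; values are opaque bytes to the store."""

    def __init__(self):
        self._items = {}

    def put(self, key, value: bytes):
        self._items[key] = value

    def get(self, key, default=None):
        return self._items.get(key, default)

    def delete(self, key):
        self._items.pop(key, None)


store = KeyValueStore()
# A simple value and a semi-structured blob; the store looks inside neither.
store.put("session:42", b"logged-in")
store.put(
    "user:alice",
    json.dumps({"name": "Alice", "langs": ["no", "en"]}).encode("utf-8"),
)

# Interpreting the blob is the application's job.
profile = json.loads(store.get("user:alice"))
store.delete("session:42")
```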


Categories of NoSQL – Document-oriented

• Document-oriented

• Supports storing, retrieving, and managing document-oriented information.

• Documents encapsulate and encode data in some standard formats or encodings.

• XML

• A subclass of document-oriented databases that are optimized to extract their metadata from XML documents.

• Object store

• An object includes the data itself, a variable amount of metadata, and a globally unique identifier.

• Examples: storing photos on Facebook, songs on Spotify, or files in online collaboration services such as Dropbox.


Categories of NoSQL – Graph-based

• Graph-based

• Uses graph structures for semantic queries, with nodes, edges and properties to represent and store data.

• Triplestore RDF

• Variant of graph-based.

• Stores triples: subject-predicate-object

• Alice knows Bob; Bob has Cat; Cat catches Mouse; Alice fears Mouse

• Adding a name to the triple makes a "quad store" or named graph.
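
The triples from the slide can be held in a plain set and queried by pattern matching, which is roughly what a triplestore does at a much larger scale. The `match` function and the graph name `graph:pets` are hypothetical and only serve this sketch.

```python
triples = {
    ("Alice", "knows", "Bob"),
    ("Bob", "has", "Cat"),
    ("Cat", "catches", "Mouse"),
    ("Alice", "fears", "Mouse"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in store
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )


about_alice = match(triples, subject="Alice")
related_to_mouse = [s for (s, _, _) in match(triples, obj="Mouse")]

# A quad adds a graph name as a fourth element.
quads = {(s, p, o, "graph:pets") for (s, p, o) in triples}
```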


Categories of NoSQL – Hybrids

• Multi-model

• Support multiple data models against a single, integrated backend.

• May also contain relational elements

• MultiValue

• Differs from an RDBMS in that it supports and encourages the use of attributes which can take a list of values, rather than all attributes being single-valued.


Key-Value Databases

• Key-value systems treat the data as a single opaque collection which may have different fields for every record.

• This offers considerable flexibility and more closely follows modern concepts like object-oriented programming.

• Because optional values are not represented by placeholders as in most RDBs, key-value stores often use far less memory to store the same database, which can lead to large performance gains in certain workloads.

• Examples:

• CouchDB,

• Oracle NoSQL Database,

• Dynamo,

• MemcacheDB,

• Redis


Column-oriented Databases


• Wide Column-store

• The names and format of the columns can vary from row to row in the same table.

• A column has three elements:

• Unique name: Used to reference the column.

• Value: The content of the column. Simple type.

• Timestamp: The system timestamp, used to distinguish the valid (most recent) content from stale content.

• Examples:

• Accumulo,

• Cassandra,

• Druid,

• HBase,

• Vertica
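
The three-element column described above (name, value, timestamp) can be sketched as follows. `Column` and `WideColumnTable` are hypothetical names, and real wide-column stores resolve versions in more elaborate ways; this only shows rows with differing columns and the timestamp deciding which write is current.

```python
from collections import namedtuple

Column = namedtuple("Column", ["name", "value", "timestamp"])


class WideColumnTable:
    """Rows are addressed by a row key; each row holds its own set of columns."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, name, value, timestamp):
        row = self._rows.setdefault(row_key, {})
        current = row.get(name)
        # Keep the write only if it is newer than what is already stored.
        if current is None or timestamp > current.timestamp:
            row[name] = Column(name, value, timestamp)

    def get_row(self, row_key):
        return {c.name: c.value for c in self._rows.get(row_key, {}).values()}


table = WideColumnTable()
table.put("user1", "email", "a@example.org", timestamp=1)
table.put("user1", "city", "Grimstad", timestamp=1)
table.put("user2", "phone", "555-0100", timestamp=1)           # different columns
table.put("user1", "email", "alice@example.org", timestamp=5)  # newer write wins
table.put("user1", "email", "old@example.org", timestamp=3)    # stale, ignored
```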


Document-oriented Databases

• Databases have collections, which contain documents, which hold semi-structured data.

• In a key-value store the data is considered to be opaque to the database.

• A document-oriented system relies on the internal structure of the document to extract metadata that the database engine uses for optimization.

• Designed to offer a richer experience with modern programming techniques.

• XML databases are a specific subclass of document-oriented databases that are optimised to extract their metadata from XML documents.

• Examples:

• Clusterpoint,

• Apache CouchDB,

• Couchbase,

• Lotus Notes,

• MongoDB
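
To illustrate the difference from an opaque key-value store, the sketch below lets the store look inside each document to maintain an index on one field. The `Collection` class and its methods are assumptions for this example and do not mirror the API of any of the systems listed above.

```python
from collections import defaultdict


class Collection:
    """A collection of dict documents with an optional index on one field."""

    def __init__(self, indexed_field=None):
        self.documents = {}
        self.indexed_field = indexed_field
        self.index = defaultdict(set)  # field value -> document ids

    def insert(self, doc_id, document):
        self.documents[doc_id] = document
        # The store inspects the document's structure to maintain the index.
        if self.indexed_field in document:
            self.index[document[self.indexed_field]].add(doc_id)

    def find(self, field, value):
        if field == self.indexed_field:
            ids = self.index.get(value, set())  # index lookup
        else:
            ids = {i for i, d in self.documents.items() if d.get(field) == value}
        return sorted(ids)


posts = Collection(indexed_field="author")
posts.insert(1, {"author": "alice", "title": "NoSQL", "tags": ["db"]})
posts.insert(2, {"author": "bob", "title": "CAP"})  # no tags field
posts.insert(3, {"author": "alice", "title": "ACID", "year": 2015})

by_alice = posts.find("author", "alice")  # served from the index
from_2015 = posts.find("year", 2015)      # full scan over the documents
```

Note that the three documents do not share the same fields, which is the flexible-schema property the slide refers to.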


Graph-based Databases

• Graph databases are based on graph theory.

• Nodes, properties, and edges.

• Nodes represent entities such as people, businesses, accounts, etc.

• Properties are information that relate to nodes.

• Edges are the lines that connect nodes to nodes or nodes to properties

• Most of the important information is really stored in the edges.

• Meaningful patterns emerge when one examines the connections and interconnections of nodes, properties, and edges.

• Examples:

• AllegroGraph, Neo4J, InfiniteGraph, OrientDB,

• Virtuoso,

• Stardog
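
A small sketch of nodes, properties and labelled edges, with two typical graph queries: friends of friends, and a shortest path found by breadth-first search. The node names, edge labels and helper functions are made up for illustration.

```python
from collections import deque

# Nodes with their properties.
nodes = {
    "alice": {"type": "person"},
    "bob": {"type": "person"},
    "carol": {"type": "person"},
    "acme": {"type": "business"},
}
# Directed, labelled edges: (source, label, target).
edges = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "works_at", "acme"),
]


def neighbours(node, label=None):
    """Targets of outgoing edges from node, optionally filtered by label."""
    return [t for (s, lbl, t) in edges if s == node and label in (None, lbl)]


def shortest_path(start, goal):
    """Breadth-first search over the edges; returns a list of nodes or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbours(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


friends_of_friends = [
    f2 for f1 in neighbours("alice", "knows") for f2 in neighbours(f1, "knows")
]
path = shortest_path("alice", "acme")
```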


How do we Choose a NoSQL Database for our Project?

• Key-value databases are generally useful for storing session information, user profiles, preferences, and shopping cart data. We would avoid using key-value databases when we need to query by data, have relationships between the data being stored, or need to operate on multiple keys at the same time.

• Column-oriented databases are generally useful for content management systems, blogging platforms, maintaining counters, expiring usage, and heavy write volume such as log aggregation. We would avoid using column-family databases for systems that are in early development or have changing query patterns.

• Document databases are generally useful for content management systems, blogging platforms, web analytics, real-time analytics, and e-commerce applications. We would avoid using document databases for systems that need complex transactions spanning multiple operations or queries against varying aggregate structures.

• Graph databases are very well suited to problem spaces where we have connected data, such as social networks, spatial data, routing information for goods and money, and recommendation engines.


Performance

Data Model              | Performance | Scalability     | Flexibility | Complexity | Functionality
------------------------|-------------|-----------------|-------------|------------|-------------------
Key–Value Store         | high        | high            | high        | none       | variable (none)
Column-Oriented Store   | high        | high            | moderate    | low        | minimal
Document-Oriented Store | high        | variable (high) | high        | low        | variable (low)
Graph Database          | variable    | variable        | high        | high       | graph theory
Relational Database     | variable    | variable        | low         | moderate   | relational algebra

Source: Ben Scofield http://www.slideshare.net/bscofield/nosql-codemash-2010


Examples of NoSQL Systems

• Examples of real world popular NoSQL database systems:

• MongoDB

• CouchDB

• BaseX

• Apache Cassandra

• Amazon DynamoDB

• Redis

• Neo4J


MongoDB

• Document-oriented

• Search by field, range queries, regular expression searches. Queries can return specific fields of documents and also include user-defined JavaScript functions.

• Any field in a document can be indexed

• Replication

• MongoDB provides high availability with replica sets.

• Load balancing

• MongoDB scales horizontally. The data is split into ranges and distributed across multiple servers.

• MapReduce can be used for batch processing of data and aggregation operations.

• JavaScript can be used in queries and aggregation functions (e.g. MapReduce).
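
The idea of splitting data into ranges and distributing them across servers can be sketched in a few lines. This is not how MongoDB implements sharding internally; the split points, the `username` shard key and the `shard_for` helper are assumptions chosen to show range-based placement only.

```python
import bisect

# Exclusive upper bounds of each key range; the last shard takes the rest.
split_points = ["g", "n", "t"]
shards = [dict() for _ in range(len(split_points) + 1)]


def shard_for(key):
    """Pick the shard whose key range contains the given shard key."""
    return bisect.bisect_right(split_points, key)


def insert(document):
    shards[shard_for(document["username"])][document["username"]] = document


names = ["alice", "greta", "mona", "nils", "terje", "zoe"]
for name in names:
    insert({"username": name})

placement = {name: shard_for(name) for name in names}
shard_sizes = [len(s) for s in shards]
```

Because neighbouring keys land on the same shard, a range query such as "all usernames from g to m" only needs to contact one server in this sketch.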


Apache CouchDB

• Document-oriented

• “A database that completely embraces the web”

• Uses JSON to store data

• JavaScript as query language using MapReduce

• HTTP for API
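
CouchDB views are written in JavaScript; the sketch below only illustrates the MapReduce idea behind them, in Python, on an in-memory list. The documents and the `map_doc`, `reduce_values` and `map_reduce` functions are hypothetical.

```python
from collections import defaultdict

documents = [
    {"_id": "1", "type": "post", "author": "alice"},
    {"_id": "2", "type": "post", "author": "bob"},
    {"_id": "3", "type": "comment", "author": "alice"},
    {"_id": "4", "type": "post", "author": "alice"},
]


def map_doc(doc):
    """Map step: emit (key, value) pairs for each document of interest."""
    if doc["type"] == "post":
        yield doc["author"], 1


def reduce_values(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(docs, mapper, reducer):
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


posts_per_author = map_reduce(documents, map_doc, reduce_values)
```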


BaseX

• XML-database

• XPath query language

• XQuery 3.1

• Client-Server architecture with user and transaction management and logging facilities

• APIs: RESTful API, WebDAV, XML:DB, Java, C#, Perl, PHP, Python and others

• Supported data formats: XML, HTML, JSON, CSV, Text, binary data

• GUI including several visualisations: Treemap, table view, tree view, scatter plot


Apache Cassandra

• Hybrid between key-value and wide-column-oriented

• Decentralized: Every node in the cluster has the same role - no single point of failure.

• Data is distributed across the cluster but every node can service any request.

• Replication strategies are configurable.

• Scalable: Read and write throughput increase linearly as new machines are added.

• Data is automatically replicated to multiple nodes for fault-tolerance.

• Tunable consistency

• MapReduce support, Hadoop integration

• Query language: CQL
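
Tunable consistency is commonly explained with the rule that a read is guaranteed to see the latest acknowledged write when the number of replicas read (R) plus the number of replicas written (W) exceeds the replication factor (N). The sketch below checks that rule for a few combinations named after Cassandra's ONE, QUORUM and ALL levels; the function names are assumptions, and the dictionary keys are written as write level/read level.

```python
def is_strongly_consistent(replicas, write_acks, read_acks):
    """True if every read set overlaps every write set (R + W > N)."""
    if not (1 <= write_acks <= replicas and 1 <= read_acks <= replicas):
        raise ValueError("acknowledgement counts must be between 1 and the replica count")
    return read_acks + write_acks > replicas


def quorum(replicas):
    """Smallest majority of the replicas."""
    return replicas // 2 + 1


n = 3  # replication factor
settings = {
    "ONE/ONE": is_strongly_consistent(n, 1, 1),
    "QUORUM/QUORUM": is_strongly_consistent(n, quorum(n), quorum(n)),
    "ALL/ONE": is_strongly_consistent(n, n, 1),
}
```

With ONE/ONE the system favours latency and availability and is only eventually consistent; the other two combinations trade some of that for reads that always overlap the latest write.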


Amazon DynamoDB

• Key-value db

• Fully managed, proprietary NoSQL database service offered by Amazon.com.

• "built on the principles of Dynamo" (used initially for their own website).

• Language bindings for Java, Node.js, .NET, Perl, PHP, Python, Ruby, and Erlang.


Redis

• Key-value db

• Redis maps keys to types of values.

• Redis typically holds the whole dataset in memory.

• By default, Redis syncs data to the disk at least every 2 seconds, with more or less robust options available if needed. In the case of a complete system failure on default settings, only a few seconds of data would be lost.

• Language bindings include ActionScript, C, C++, C#, Clojure, Common Lisp, D, Dart, Erlang, Go, Haskell, Haxe, Io, Java, JavaScript (Node.js), Julia, Lua, Objective-C, Perl, PHP, Pure Data, Python, R, Racket, Ruby, Rust, Scala, Smalltalk and Tcl.


Neo4J

• Graph-oriented

• Implemented in Java and accessible from software written in other languages using the Cypher query language through a transactional HTTP endpoint.

• ACID-compliant transactional database with native graph storage and processing.

• The most popular graph database.

• Everything is stored as an edge, a node or an attribute.

• Each node and edge can have any number of attributes.

• Both the nodes and edges can be labelled.

• Labels can be used to narrow searches.


Encodings

• XML

• Includes structure and meta-data

• JSON

• JavaScript Object Notation

• Simpler and less formal than XML.

• YAML

• a human-readable data serialization format, inspired by XML and JSON

• BSON

• Binary variant of JSON used by MongoDB

• RDF – Many variants!

• Q: Mention some RDF encodings?
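
To compare the first two encodings in the list above, the sketch below writes the same record as JSON and as XML using only the Python standard library. The `reading` record and its field names are invented for the example.

```python
import json
import xml.etree.ElementTree as ET

reading = {"sensor": "thermo-1", "value": 21.5, "unit": "C"}

# JSON: a direct rendering of the dictionary.
as_json = json.dumps(reading, sort_keys=True)

# XML: the same record, with element names carrying the structure.
root = ET.Element("reading")
for key, value in reading.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="unicode")

# Both encodings can be parsed back.
from_json = json.loads(as_json)
from_xml = {child.tag: child.text for child in ET.fromstring(as_xml)}
```

JSON keeps the number as a number, while the plain XML text content comes back as a string unless a schema or the application says otherwise.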


Encodings

• RDF – Many different serialisation formats for RDF graphs!

• Turtle: a compact, human-friendly format.

• N-Triples: a very simple, easy-to-parse, line-based format that is not as compact as Turtle.

• N-Quads: a superset of N-Triples, for serializing multiple RDF graphs.

• JSON-LD: a JSON-based serialization.

• N3 or Notation3: a non-standard serialization that is very similar to Turtle, but has some additional features, such as the ability to define inference rules.

• RDF/XML: an XML-based syntax that was the first standard format for serializing RDF.
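
As a sketch of the two line-based formats, the functions below write simple triples as N-Triples and N-Quads lines. The `http://example.org/` namespace, the graph name `pets` and the function names are hypothetical; literals, blank nodes and escaping are left out.

```python
BASE = "http://example.org/"  # hypothetical namespace for this example

triples = [
    ("Alice", "knows", "Bob"),
    ("Bob", "has", "Cat"),
]


def to_ntriples(items, base=BASE):
    """Serialise (subject, predicate, object) names as one N-Triples line each."""
    lines = []
    for s, p, o in items:
        lines.append(f"<{base}{s}> <{base}{p}> <{base}{o}> .")
    return "\n".join(lines)


def to_nquads(items, graph, base=BASE):
    """N-Quads adds a fourth term naming the graph the triple belongs to."""
    lines = []
    for s, p, o in items:
        lines.append(f"<{base}{s}> <{base}{p}> <{base}{o}> <{base}{graph}> .")
    return "\n".join(lines)


ntriples = to_ntriples(triples)
nquads = to_nquads(triples, "pets")
```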


Querying

• How to Query a NoSQL database?

• SPARQL (SPARQL Protocol and RDF Query Language)

• HTTP REST API (with JSON)

• Specialised query language, e.g. CQL (Cassandra Query Language)

• Specialised API and/or client app (e.g. mongo client for MongoDB)

• Java or various other general purpose programming languages…

• Q: And?

• SQL – NoSQL = «not only SQL»

• But SQL is still not very common.


Querying

• The lack of joins means that we will often either:

• Do multiple queries.

• Create and store complex documents with everything the application needs inside, e.g. a blog post with all comments included.

• Denormalise the database by duplicating information that is needed in multiple locations.
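
The three options above can be contrasted with plain dictionaries standing in for collections. All names and data here are invented for the example; each lookup in `load_post_with_queries` stands for one round trip to the database.

```python
# Option 1: separate collections, combined by the application with several queries.
posts = {"p1": {"title": "NoSQL", "author_id": "u1"}}
comments = [
    {"post_id": "p1", "author_id": "u2", "text": "Nice overview"},
    {"post_id": "p1", "author_id": "u1", "text": "Thanks"},
]
users = {"u1": {"name": "Alice"}, "u2": {"name": "Bob"}}


def load_post_with_queries(post_id):
    post = dict(posts[post_id])                        # query 1
    post["comments"] = [
        c["text"] for c in comments if c["post_id"] == post_id
    ]                                                  # query 2
    post["author"] = users[post["author_id"]]["name"]  # query 3
    return post


# Options 2 and 3: one embedded document, with author names duplicated into it.
embedded_posts = {
    "p1": {
        "title": "NoSQL",
        "author": "Alice",  # denormalised copy of users["u1"]["name"]
        "comments": [
            {"author": "Bob", "text": "Nice overview"},
            {"author": "Alice", "text": "Thanks"},
        ],
    }
}

joined = load_post_with_queries("p1")
single_read = embedded_posts["p1"]
```

The embedded form needs one read, at the price that a changed user name has to be updated in every document that copied it.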


Example – Semistructured Sensor Data

• Case: Storing sensor readings from mobile phone (CIEM: SmartRescue Project)

• MongoDB for storing readings in semi-structured documents containing available measurements at a given time.

• Transforming data to JSON document

• Querying, analysis and visualisation
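
A sketch of what such a transformation might look like, assuming each reading carries a device identifier, a timestamp and whichever measurements happened to be available. The field names and values are invented and are not taken from the SmartRescue project.

```python
import json


def to_document(device_id, timestamp, **measurements):
    """Build a document holding only the measurements that are available."""
    available = {
        name: value for name, value in measurements.items() if value is not None
    }
    return {"device": device_id, "time": timestamp, "readings": available}


# Hypothetical readings: the second phone has no GPS fix and no barometer.
docs = [
    to_document("phone-1", "2015-11-16T10:00:00Z",
                accel=[0.1, 0.0, 9.8], gps=[58.34, 8.59], pressure=1013.2),
    to_document("phone-2", "2015-11-16T10:00:00Z",
                accel=[0.0, 0.2, 9.7], gps=None, pressure=None),
]

as_json = [json.dumps(d, sort_keys=True) for d in docs]

# A simple query: which devices reported a GPS position?
with_gps = [d["device"] for d in docs if "gps" in d["readings"]]
```

Missing measurements are simply absent from the document rather than stored as NULL columns, which is what makes the semi-structured form convenient here.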


Example – Native XML Storage

• Case: Storing anonymised IDS alarms in XML database.

• Alarms are formatted as IDMEF messages (an XML-based format)

• To be stored in BaseX XML database.


Example – Emergency information integration

• Case: Collecting information about emergencies from multiple sources (CIEM)

• Storage: Redis database.


Summary

• Scalable, distributed databases for un/semi-structured (big) data.

• Very flexible concerning data models.

• Not always ACID.

• CAP theorem: there may be less emphasis on consistency.

• Not only SQL means SQL still allowed!

• Relational databases are still alive and have their uses; choose wisely!


The End
