

November 15, 2017 Sam Siewert

CS317 - File and Database Systems

Lecture 14, Part 2 – Review of Concepts

http://www.thedailybeast.com/articles/2013/03/22/less-is-moo-the-genius-of-gary-larson.html

Reminders

Quiz #3 - Finish Before End of Day Today

[Quiz #4] - Take Anytime Between Now and Monday

Submit Assignment #6 This Weekend (No Later than End of Day Monday Next Week)

Exam #2, Part #1 - Monday (Shorter Questions)

Exam #2, Part #2 - Wednesday (More in-depth)

Final Oral Exam on DB Design - Saturday, 12/9, 8AM, HERE, 20 Minutes Per Team

Rubric

Inspection Review Sheets (All Must Participate)

Quiz #3 and #4

Quiz #3 - Online Wed, 11/29 Until Friday End-of-Day

5 So Far - 88% Ave, 100% High, 75% Low - Average of up to 3 Tries

Log On and Take BEFORE 11:59 PM Tonight!

Quiz #4 - Available Online Today Until 12/3 for Practice
– Transaction Concepts Review - Lecture Week-12-2
– OODBMS Concepts Review - Lecture Week-13-1

E-mail Questions you Have

See Me During Monday Office Hours - 9-10am, 11am-noon, 1pm-2pm

REVIEW

Exam #2
– Lecture-Week-8-1 through Lecture-Week-14-1
– DBS Text, Chapters 4, (SQL Review in 5), 6, 7 (Transactions) & 10 (Security, DBA)

Exam #2 Expects you to Recall Fundamentals from Exam #1 and Basic SQL

Final Exam – Assessment of your Ability to translate Theory and Concepts into a functional DB
– SQL DDL, DML, DBA
– View Integration Logical Design
– Schema Design and Normalization (Domains)
– Logical to Physical Design Mapping and Implementation
– Views and Lossless Join (Relational Algebra)
– Data and Referential Integrity (Triggers, Stored Procedures)
– Scripts to Test and Evaluate
– Transactions
– User Access (Views and Connectors)

FIRST HALF

Week 1-6, Week 7 Exam-1

Week 1-6 – As Described in Syllabus

Fundamental Concepts Only - SQL, Theory for SQL

Notes on File Systems Presented Week-1
– Meta-data, Namespace, I-nodes

DBS Text Chapters 1,2,3,6 [Ref. Connolly-Begg Chapters 1 to 13]

– Part-I - DBMS Background
Background on DBMS, History
Chapter 3 – Architecture (CB Ref. and Notes on ANSI SPARC)

– Part-II - Relational Model, E.F. Codd, SQL DML/DDL
DBS Chapter 5, CB Ref. Chapters 4, 5 – Relational Model, Algebra and Calculus
– Relational Algebra (P. 132)
– Tuple Relational Calculus (P. 133 to 136)
– NO DOMAIN RELATIONAL CALCULUS
DBS Chapter 5, CB Ref. Chapters 6, 7, 8 – Core SQL, SQL/PSM
Notes on OO vs. RDBMS and CB Ref. Chapter 9 – SQL:2011 ORDBMS Extensions
– OOA, OOD, OOP and OODBMS Alternative (Persistence of Instantiated Objects from Classes)
– OODBMS Discussed in concept, but deferred until Chapter 27

– Part-III - Database Analysis and Design
Notes – DBMS Development Lifecycle
Notes – DreamHome Case Study (Assignment #3), CB Ref. P. 112
Notes – Basic ER Models (Chen’s Diagrams)
Notes – Enhanced ER Models

Study Suggestions

Review Exam-1

Scan Book Chapters 4, (5 for Review), 6, 7, 8

Read Through Notes and Study any Unfamiliar Concepts


NORMALIZATION

Week 8

Normalization – Week 8

Most Starting Points for RDBMS Design include UNF Data (e.g. Spreadsheets) – Duplication in Columns!

We studied Process to go from UNF to 3NF

We studied Residual Issues with 3NF and BCNF as Potential Solution

Goal in UNF->3NF->BCNF Normalization is to Eliminate Insert, Delete and Modify Hazards while Preserving Loss-less Join and Dependencies

Tends to Produce More Tables, Simpler, with FKs!


The Process of Normalization

Ignore Higher Forms


The Process of Normalization

Preview of UNF -> 3NF

Minimizes Update Anomalies [Insert, Update, Delete], Page 420 to 426 – One Client Renting Multiple Properties – Typical of Spreadsheet

cNo  | cName    | pNo       | pAddr                            | rentStart  | rentFinish | Rent    | oNo        | oName
CR76 | John Kay | PG4, PG16 | 6 Lawrence Street, 5 Novar Drive | 7/12, 9/13 | 8/31, 9/14 | 350, 50 | CO40, CO93 | Tina Murphy, Tony Shaw

[Figure labels: Client, PropertyOwner, PropertyForRent, Owner, Rental]
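The UNF-to-3NF walk on this slide can be sketched in a few lines of Python. This is only an illustration, not the textbook's procedure: the relation names follow the figure labels, the functional dependencies (cNo -> cName, pNo -> pAddr, Rent, oNo, and oNo -> oName) are assumptions based on the column names, and the cell values are copied exactly as they appear in the table above.

```python
# One UNF row from the slide: repeating groups are packed into single cells.
unf_row = {
    "cNo": "CR76", "cName": "John Kay",
    "pNo": ["PG4", "PG16"],
    "pAddr": ["6 Lawrence Street", "5 Novar Drive"],
    "rentStart": ["7/12", "9/13"],
    "rentFinish": ["8/31", "9/14"],
    "Rent": [350, 50],
    "oNo": ["CO40", "CO93"],
    "oName": ["Tina Murphy", "Tony Shaw"],
}

REPEATING = ["pNo", "pAddr", "rentStart", "rentFinish", "Rent", "oNo", "oName"]


def to_1nf(row):
    """Flatten the repeating group: one output row per rented property."""
    count = len(row["pNo"])
    return [
        {"cNo": row["cNo"], "cName": row["cName"],
         **{col: row[col][i] for col in REPEATING}}
        for i in range(count)
    ]


def project(rows, cols):
    """Relational projection with duplicates removed, first-seen order kept."""
    seen = {}
    for r in rows:
        seen.setdefault(tuple(r[c] for c in cols), None)
    return list(seen)


rows_1nf = to_1nf(unf_row)

# Assumed 3NF decomposition: each relation holds one key and its dependents.
client = project(rows_1nf, ["cNo", "cName"])
rental = project(rows_1nf, ["cNo", "pNo", "rentStart", "rentFinish"])
property_for_rent = project(rows_1nf, ["pNo", "pAddr", "Rent", "oNo"])
owner = project(rows_1nf, ["oNo", "oName"])
```

The client name is now stored once in `client` instead of once per rented property, which is the duplication the update anomalies come from; the foreign keys (cNo, pNo, oNo) are what allow the original row to be rebuilt by joins.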

CONCEPTUAL / LOGICAL DBMS DESIGN

Week 9


Conceptual DBMS Design - Week 9

User View Integration Strategy (Compared to Centralized) - CB Ref. pp. 304-307

MySQL Workbench CASE Tool

ER/EER Use to Specify an RDBMS Schema

Forward and Reverse Engineering of DB Schemas

Goals and Concepts for Conceptual and Logical Design

General Familiarity with MySQL Workbench Major Features and Methods of Use


PHYSICAL DBMS DESIGN

Week 10-11

DBMS Logical and Physical Design – Week 10/11

Conceptual DBMS Design Methods Mapping onto Physical

Physical Design
– Indexing – B-Tree, B+ Tree, Hash Tables
– Storage Engines – MyISAM, InnoDB (File or Partition), MEMORY
– RAID – RAID 1, RAID 0, RAID 5; RAID 50, RAID 10 (Ignore other RAID levels)

Differences between B-Tree and B+ Tree

Value of R-Tree - Concept Only
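For the RAID levels listed above, usable capacity follows from simple arithmetic. The sketch below assumes identical disks, RAID 1 as a single mirror set in which every disk holds a full copy, and RAID 50 as a stripe across equally sized RAID 5 groups; the function name and the `groups` parameter are hypothetical and chosen only for this illustration.

```python
def usable_capacity(level, disks, disk_tb, groups=2):
    """Usable capacity in TB for `disks` identical drives (simplified model)."""
    if level == "RAID0":   # striping only, no redundancy
        return disks * disk_tb
    if level == "RAID1":   # mirroring: every disk holds a full copy
        return disk_tb
    if level == "RAID5":   # striping with one disk's worth of parity
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_tb
    if level == "RAID10":  # stripe across mirrored pairs
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return (disks // 2) * disk_tb
    if level == "RAID50":  # stripe across RAID 5 groups, one parity disk each
        if disks % groups or disks // groups < 3:
            raise ValueError("each RAID 5 group needs at least 3 disks")
        return (disks - groups) * disk_tb
    raise ValueError(f"unsupported level: {level}")
```

With six 2 TB disks, for example, `usable_capacity("RAID50", 6, 2)` gives 8 TB, compared with 12 TB for RAID 0 (no protection) and 6 TB for RAID 10.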

DBMS SECURITY

Week 11-12

DBMS Security - Week 11/12

Authorization, Authentication

One-Way Hash

Symmetric Encryption

Public Key Encryption

Signing (E.g. Driver Signing)

Networked DBMS Client to DBMS Server

Threats – What are they, how Significant to DBMS

Basic Encryption Concepts and Methods

RAID and RTO/RPO Disaster Recovery Concepts
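As one concrete use of a one-way hash for authentication, a server can store a salted hash instead of the password itself. The sketch below uses PBKDF2 from Python's standard `hashlib`; the function names, salt length, and iteration count are assumptions for illustration and are not taken from the lecture.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Return (salt, digest); the digest cannot be inverted to the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per stored password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected, iterations=100_000):
    """Re-hash the candidate with the stored salt and compare digests."""
    _, digest = hash_password(password, salt, iterations)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected)
```

Only the salt and digest need to be stored in the user table, so a breach of that table does not directly expose the passwords.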

DBMS TRANSACTIONS

Week 12

DBMS Transactions – Week 12

Recall That Many DBMS Clients can Issue Many Transaction Requests
– Thread Per Connection
– Thread Created for Each Transaction Request
– ACID Goals (Atomicity, Consistency, Isolation, Durability)

Two Solutions in Use Today – Strict 2PL (Locks with Automatic Scoping), Time-stamp Request Queue
– Similar Issues in OS, but Not Exactly the Same [DBMS - Lost Update, Uncommitted Dependency, Inconsistent Analysis]
– Similar Solutions in OS, but Not Exactly the Same (E.g. Semaphores and Locks)

Alternatives for Scheduling Reads/Writes encapsulated in Transactions are Challenging (NP-Hard) and/or Insufficient
– Conflict Serialization – Insufficient to Solve all 3 Problems
– View Serialization – NP Hard Algorithm, Not Practical
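Conflict serializability can be tested by building a precedence graph and checking it for cycles. The sketch below is a minimal version of that standard test; the schedule encoding as (transaction, operation, item) tuples and the function names are assumptions made for this example.

```python
def precedence_edges(schedule):
    """Edge Ti -> Tj when an operation of Ti precedes a conflicting one of Tj.

    Two operations conflict when they come from different transactions,
    touch the same item, and at least one of them is a write.
    """
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def is_conflict_serializable(schedule):
    """True when the precedence graph has no cycle."""
    edges = precedence_edges(schedule)
    remaining = {txn for txn, _, _ in schedule}
    while remaining:
        # Transactions with no incoming edge from another remaining one.
        free = {
            n for n in remaining
            if not any(dst == n and src in remaining for src, dst in edges)
        }
        if not free:
            return False  # every remaining node sits on a cycle
        remaining -= free
    return True


# Lost-update interleaving: both transactions read x before either writes it.
lost_update = [("T1", "R", "x"), ("T2", "R", "x"),
               ("T1", "W", "x"), ("T2", "W", "x")]
serial = [("T1", "R", "x"), ("T1", "W", "x"),
          ("T2", "R", "x"), ("T2", "W", "x")]
```

The lost-update schedule yields edges in both directions between T1 and T2, so it is rejected, while the serial schedule yields only T1 -> T2 and passes.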


Comparison of Methods

Pearson Education © 2014

Orders conflicting operations as “some” serial schedule

Including those with data corruption and inconsistent DB states

Less restrictive operation ordering as “some” serial schedule [Impractical – NP-Complete]

DBMS Transaction Summary

Concurrency due to Multiple Clients and Transactions Mapped to OS Threads [Readers, Writers]

Causes Problems with Split Transactions
– Lost Update
– Dirty Read or Uncommitted Dependency
– Fuzzy Read, Phantom Read

Serialization Using Locks – Not Sufficient for Consistency if Lock Scope Too Small and Can Cause Deadlock!

Recoverability – Atomicity Requirement

2PL – Growing and Shrinking Phase – Helps with Lock Scope, but Still have Deadlock/Livelock

Cascading Roll-back and Rigorous/Strict 2PL

Deadlock Problem – Prevention, Detect & Break

Alternative Timestamp Methods

Rigorous/Strict 2PL with Deadlock Prevention or Detect & Break or Timestamp is Best
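The lost update problem and the effect of lock scope can be illustrated with a small sketch. The `Account` class and function names are hypothetical, and a single lock held across the read and the write is only a simplified stand-in for strict 2PL, where locks are held until commit.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()


def interleaved_lost_update(acct):
    """Replay the bad schedule by hand: R1, R2, W1, W2 with no locking."""
    t1_read = acct.balance       # T1 reads the balance
    t2_read = acct.balance       # T2 reads the same value
    acct.balance = t1_read + 50  # T1 writes its deposit
    acct.balance = t2_read - 30  # T2 overwrites it: T1's update is lost


def update_locked(acct, amount):
    """Hold the lock across read and write so the two cannot be split."""
    with acct.lock:
        current = acct.balance
        acct.balance = current + amount


def run_locked(start=100):
    """Run a +50 and a -30 update concurrently under the lock."""
    acct = Account(start)
    threads = [
        threading.Thread(target=update_locked, args=(acct, 50)),
        threading.Thread(target=update_locked, args=(acct, -30)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return acct.balance
```

Starting from 100, the unlocked interleaving leaves 70 because the deposit of 50 is overwritten, whereas the locked version always ends at 120 regardless of which thread runs first.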


DBMS CONNECTORS [END OF NEW MATERIAL]

Week 14


RDBMS Client Connector - Week 14

Data Processing and User Interface Client Applications Need to Connect to DBMS Server

Improved Features Relative to Direct SQL Command Line for Users

Language Specific Connectors
– ODBC – Open Database Connectivity, Middleware API Standard, Typically Implemented in C Language
– JDBC – Java Database Connectivity (ODBC Bridge)
– MySQL C++ API Connector
– MySQL C API and Driver for C Connector

Study C Examples – conntest.c
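The connect, execute, fetch, close pattern that a client connector provides can be sketched as follows. This is not conntest.c and does not use the MySQL C API: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in so that it runs without a server, and the table and function names are assumptions for illustration. A MySQL connector would take host, user, and password arguments at the connect step instead of a file path.

```python
import sqlite3


def run_client(db_path=":memory:"):
    """Open a connection, run DDL/DML through a cursor, and fetch rows."""
    conn = sqlite3.connect(db_path)  # stand-in for connecting to a DB server
    try:
        cur = conn.cursor()
        cur.execute("CREATE TABLE client (cNo TEXT PRIMARY KEY, cName TEXT)")
        # Placeholders keep user data out of the SQL text itself.
        cur.execute("INSERT INTO client VALUES (?, ?)", ("CR76", "John Kay"))
        conn.commit()
        cur.execute("SELECT cNo, cName FROM client WHERE cNo = ?", ("CR76",))
        return cur.fetchall()
    finally:
        conn.close()  # always release the connection
```

Passing values through placeholders rather than string concatenation is one way a client application reduces the SQL injection risk between the application and the server.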


OODBMS AND UML [CONCEPTS ONLY]

Week 14


OODBMS and UML – Week 14

OODBMS, Object Stores, and NoSQL Alternatives to RDBMS

Basic Definitions of Class Hierarchy, Inheritance, Encapsulation, Abstract Class, Concrete or Refined Class, Instantiation of Object, Dynamic Binding, Parametric and Ad-hoc Polymorphism

OOP – Programming Languages for Most Direct Implementation of OOA/OOD

PPL – Persistent Programming Language – One OODBMS Approach to Integrate OOP with DBMS

Type Gap in RDBMS and Type Mapping Between Store and Memory in OODBMS and Object Stores


Recall Object Definition – From Week 4

Instance of a Class [Hierarchy from Abstract that Can’t Be instantiated to Concrete Sub-classes]

Classes Define Public and Private Data and Methods to Operate on that Data

Sub-classes Inherit from Parent Classes and Refine Data Abstraction and Methods

OOPS – Java, C++, …, back to Smalltalk and Lisp CLOS

Encapsulation and Abstraction is the Goal

Implementation Hiding and Interface Definition

Each OOP Has Variations on Support [E.g. Multiple Inheritance, Abstract Classes and Methods, Polymorphism [Parametric, Ad-hoc, Operator Overloading]]

E.g. Oracle’s Java Object Tutorial
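The terms on this slide map onto a few lines of code. The sketch below uses Python rather than Java or C++, and the `Shape` hierarchy is a hypothetical example chosen only to show an abstract class, concrete sub-classes, encapsulated data, and dynamic binding.

```python
import math
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract class: defines the interface but cannot be instantiated."""

    def __init__(self, name):
        self._name = name  # leading underscore: private by convention

    @abstractmethod
    def area(self):
        """Each concrete sub-class must supply its own implementation."""

    def describe(self):
        # Dynamic binding: self.area() resolves to the sub-class method.
        return f"{self._name} with area {self.area():.2f}"


class Rectangle(Shape):
    def __init__(self, width, height):
        super().__init__("rectangle")
        self._width, self._height = width, height

    def area(self):
        return self._width * self._height


class Circle(Shape):
    def __init__(self, radius):
        super().__init__("circle")
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2


# Instantiation of objects from the concrete classes.
shapes = [Rectangle(2, 3), Circle(1)]
descriptions = [s.describe() for s in shapes]
```

Calling `Shape("x")` directly raises `TypeError`, which is the "abstract that can't be instantiated" case, while the loop over `shapes` calls the same `describe` method and gets sub-class-specific results.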


Origins of the Object-Oriented Data Model


Two-Level Storage Model for RDBMS


Single-Level Storage Model for OODBMS


OODBMS Type Mapping

Persistent OOPL Approach
– Need to Store Instantiated Objects with Persistence Attributes
– Store Attribute Data and Methods (Code)

Pointer Swizzling – OIDs (Index into Look-Up Table) Provides Mapping between Methods (Function Addresses) and Data (Data Addresses)
– Similar to General Descriptor Concepts – E.g. a File pointer and a File descriptor (index that is simple integer)
– Involves OID Look-Up Indirection

No Swizzle – VM Paging of Objects from Storage to Memory
– VM Basic Concept – Logical Memory Space Larger than Physical, Disk Backs Up Pages in Virtual Memory Space not Loaded in RAM
– VM Addressing is Logical Memory Address translated to Physical Address with Page Fault to Handle Unloaded Pages
– OODBMS can Page Objects from Disk to Memory
– VM Pages are Not Normally Durable, but Objects Paged Must be!
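The OID look-up indirection can be sketched with a small in-memory model. The `ObjectStore` class below is hypothetical: a dictionary stands in for disk, a second dictionary is the look-up table of resident objects, and only the indirection step is shown. Full pointer swizzling would go further and replace the stored OID with a direct memory reference after the first load.

```python
class ObjectStore:
    """Toy object store: OIDs are integer handles, like file descriptors."""

    def __init__(self):
        self._disk = {}      # OID -> stored attribute data (stands in for disk)
        self._resident = {}  # OID -> object already loaded into memory
        self._next_oid = 1
        self.faults = 0      # number of loads from "disk"

    def persist(self, state):
        """Store an object's attribute data and return its OID."""
        oid = self._next_oid
        self._next_oid += 1
        self._disk[oid] = dict(state)
        return oid

    def deref(self, oid):
        """Resolve an OID through the look-up table, loading on first use."""
        if oid not in self._resident:
            self.faults += 1  # analogous to a page fault on an unloaded page
            self._resident[oid] = dict(self._disk[oid])
        return self._resident[oid]


store = ObjectStore()
owner_oid = store.persist({"oName": "Tina Murphy"})
# The property refers to its owner by OID, not by memory address.
prop_oid = store.persist({"pNo": "PG4", "owner": owner_oid})

prop = store.deref(prop_oid)
owner = store.deref(prop["owner"])
```

The first dereference of each OID costs a load, and later dereferences return the already resident object, which is the cost of the indirection that the slide compares to a file descriptor look-up.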


Accessing an Object with an OODBMS


Client-Server Architecture


Exam #2 - Summary

OODBMS is a Work in Progress, but Promising

NoSQL, Object Stores, Innovation in File Systems

RDBMS Remains Ideal for Transaction Workloads
– Security Challenges with Concentrated Large R-DBMS Breaches
– Block-chain ledgers and distributed DBs potential alternative
– R-DBMS less well adapted to unstructured data (video, images, documents), solved with parallel R-DBMS + file system

OODBMS, Object Stores, NoSQL – Growing Alternative for Newer Workloads – Object Oriented Data (BLOBs) and Big Data Analytics


Concepts Only for OODBMS and UML

UML, No-SQL (OODBMS) and Design Methods


RDBMS Impedance with Procedural Programming

SQL Extended to Include Stored Procedures, Functions and Type Extension
– Useful for secure execution and data processing on a server
– SQL is not ideal however for GUIs, Apps, etc.

Connectors are a patch solution for clients
– Security issues between client App and SQL server
– Apps can be any procedural programming language

No-SQL, OODBMS is another alternative
– Persistent OO Class instances – objects are stored
– UML and OOA/OOD -> OOP in SE310 for SEs

UML – Week 13

Compare to ER/EER and Concepts Only

Unified Set of OO Diagramming Methods for OOA/OOD

Basic Concepts on Indicated Diagram Types

UML – Design Goals

Provide ready-to-use, expressive visual modeling language …

Provide extensibility and specialization …

Be independent of particular programming languages and development processes.

Provide a formal basis …

Encourage growth of object-oriented tools market.

Support higher-level development concepts such as collaborations, frameworks, patterns, and components.

Integrate best practices.

UML - Diagrams

Structural:
– Class diagrams (similar to EER with Hierarchy, Attributes, and Class Name, Methods Added)
– object diagrams (similar to ER)
– component diagrams
– deployment diagrams.

Behavioral:
– use case diagrams
– sequence diagrams
– collaboration diagrams
– statechart diagrams
– activity diagrams.

UML Class Diagram

Very Similar to EER (Almost Identical, Just Added Detail and Extensions) – Lecture-Week-6-2
– Encapsulation of Attributes and Methods to Operate on Them
– Recall RDBMS has SQL/PSM and ORDBMS Has Extensions for Methods
– EER However Has Class Hierarchy and Attributes, but No Methods Specified

http://www.agilemodeling.com/artifacts/classDiagram.htm

Lecture-Week-6-2, Slide 11 to 16

UML – Object Diagrams

Model instances of classes and are used to describe the system at a particular point in time.

Can be used to validate class diagram with “real world” data and record test cases.

Similar to ER (Lecture Week-6-1)


UML – Deployment Diagrams

Depict configuration of run-time system, showing hardware nodes, components that run on these nodes, and connections between nodes.

UML – Use Case Diagrams


UML – Use Case Diagrams


UML – Statechart Diagrams

Show how objects can change in response to external events.

Usually model transitions of a specific object.

DISTRIBUTED DBMS [NOT ON EXAM]

Week 15


Distributed DBMS – Week 15

Why? – Scaling and Disaster Recovery

DR
– Datacenters in 2 Geo Locations – Natural Disaster, Terrorist Attack, WAN outages, Power grid failure, etc.
– RTO / RPO – Recovery Time and Point Objectives
– Asynchronous Mirrors, Active / Passive Fail-Over and Fail-Back

Cluster Scaling – Shared Nothing
– Load Balance Large Number of Clients and Transactions
– Scaling Beyond Single Server
– Synchronous Low-latency / High-bandwidth Network (e.g. 10GE, 40G Infiniband, 100G Converged Enhanced Ethernet)

MySQL Replication

Used for DR [Disaster Recovery] and for Geo Content and Services Distribution
– East Coast / West Coast – ERAU Daytona, ERAU Prescott
– ACTIVE-ACTIVE for Reads, Writes to MASTER

http://dev.mysql.com/doc/refman/5.0/en/replication.html

Asynchronous Compared to Clustering [Synchronous]

One Server is Master - MySQL Replication Solutions

MySQL Replication - DR

Fail-Over Master in Case of Disaster

BEFORE MASTER FAILURE

MySQL Replication – DR Fail-Over

Fail-Over Master in Case of Disaster

AFTER MASTER FAILURE

Challenges – Fail-Back [Manual] and Split-Brain

To Restore MASTER We Must:
1) Repair Issue [E.g. HW fix]
2) Re-Sync Writes
3) Fail-Back to Original MASTER

MySQL Cluster

MySQL Shared Nothing Cluster – NDBD, Network DB Daemon

Synchronous – So Ideally Gigabit, 10GE, 40G IB or Better Cluster

Similar to PNFS Concept – Parallel Network File System (NDB Mgt Bottleneck?)