lecture 14, part 2 – review of concepts
TRANSCRIPT
November 15, 2017 Sam Siewert
CS317 - File and Database Systems
Lecture 14, Part 2 – Review of Concepts
http://www.thedailybeast.com/articles/2013/03/22/less-is-moo-the-genius-of-gary-larson.html
Reminders
Quiz #3 - Finish Before End of Day Today
Quiz #4 - Take Anytime Between Now and Monday
Submit Assignment #6 This Weekend (No Later than End of Day Monday Next Week)
Exam #2, Part #1 - Monday (Shorter Questions)
Exam #2, Part #2 - Wednesday (More in-depth)
Final Oral Exam on DB Design - Saturday, 12/9, 8AM, HERE, 20 Minutes Per Team
Rubric
Inspection Review Sheets (All Must Participate)
Quiz #3 and #4
Quiz #3 - Online Wed, 11/29 Until Friday End-of-Day
5 So Far - 88% Ave, 100% High, 75% Low - Average of up to 3 Tries
Log On and Take BEFORE 11:59 PM Tonight!
Quiz #4 - Available Online Today Until 12/3 for Practice
– Transaction Concepts Review - Lecture Week-12-2
– OODBMS Concepts Review - Lecture Week-13-1
E-mail Questions you Have
See Me During Monday Office Hours - 9-10am, 11am-noon, 1pm-2pm
REVIEW
Exam #2
– Lecture-Week-8-1 through Lecture-Week-14-1
– DBS Text, Chapters 4, (SQL Review in 5), 6, 7 (Transactions) & 10 (Security, DBA)
Exam #2 Expects you to Recall Fundamentals from Exam #1 and Basic SQL
Final Exam – Assessment of your Ability to translate Theory and Concepts into a functional DB
– SQL DDL, DML, DBA
– View Integration Logical Design
– Schema Design and Normalization (Domains)
– Logical to Physical Design Mapping and Implementation
– Views and Lossless Join (Relational Algebra)
– Data and Referential Integrity (Triggers, Stored Procedures)
– Scripts to Test and Evaluate
– Transactions
– User Access (Views and Connectors)
Week 1-6 – As Described in Syllabus
Fundamental Concepts Only - SQL, Theory for SQL
Notes on File Systems Presented Week-1
– Meta-data, Namespace, I-nodes
DBS Text Chapters 1,2,3,6 [Ref. Connolly-Begg Chapters 1 to 13]
– Part-I - DBMS Background
  Background on DBMS, History
  Chapter 3 – Architecture (CB Ref. and Notes on ANSI SPARC)
– Part-II - Relational Model, E.F. Codd, SQL DML/DDL
  DBS Chapter 5, CB Ref. Chapters 4, 5 – Relational Model, Algebra and Calculus
    – Relational Algebra (P. 132)
    – Tuple Relational Calculus (P. 133 to 136)
    – NO DOMAIN RELATIONAL CALCULUS
  DBS Chapter 5, CB Ref. Chapters 6, 7, 8 – Core SQL, SQL/PSM
  Notes on OO vs. RDBMS and CB Ref. Chapter 9 – SQL:2011 ORDBMS Extensions
    – OOA, OOD, OOP and OODBMS Alternative (Persistence of Instantiated Objects from Classes)
    – OODBMS Discussed in concept, but deferred until Chapter 27
– Part-III - Database Analysis and Design
  Notes – DBMS Development Lifecycle
  Notes – DreamHome Case Study (Assignment #3), CB Ref. P. 112
  Notes – Basic ER Models (Chen’s Diagrams)
  Notes – Enhanced ER Models
Study Suggestions
Review Exam-1
Scan Book Chapters 4, (5 for Review), 6, 7, 8
Read Through Notes and Study any Unfamiliar Concepts
Normalization – Week 8
Most Starting Points for RDBMS Design include UNF Data (e.g. Spreadsheets) – Duplication in Columns!
We studied Process to go from UNF to 3NF
We studied Residual Issues with 3NF and BCNF as Potential Solution
Goal in UNF->3NF->BCNF Normalization is to Eliminate Insert, Delete and Modify Hazards while Preserving Loss-less Join and Dependencies
Tends to Produce More Tables, Simpler, with FKs!
Preview of UNF -> 3NF
Minimizes Update Anomalies [Insert, Update, Delete], Page 420 to 426 – One Client Renting Multiple Properties – Typical of Spreadsheet

cNo   cName     pNo    pAddr               rentStart  rentFinish  Rent  oNo    oName
CR76  John Kay  PG4,   6 Lawrence Street,  7/12,      8/31,       350,  CO40,  Tina Murphy,
                PG16   5 Novar Drive       9/13       9/14        50    CO93   Tony Shaw
…

Relations in diagram: Client, PropertyOwner, PropertyForRent, Owner, Rental
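The split of the spreadsheet row above into separate relations can be sketched in Python. This is only an illustration: the dictionary layout, the function names `normalize` and `rejoin`, and the assumed dependencies (cNo determines cName, pNo determines pAddr, rent and oNo, oNo determines oName) are assumptions for the sketch, not taken from the slide.

```python
# Hypothetical UNF record modeled on the spreadsheet row: one client with a
# repeating group of rented properties. Field names follow the column headers.
unf_row = {
    "cNo": "CR76",
    "cName": "John Kay",
    "rentals": [
        {"pNo": "PG4", "pAddr": "6 Lawrence Street", "rentStart": "7/12",
         "rentFinish": "8/31", "rent": 350, "oNo": "CO40", "oName": "Tina Murphy"},
        {"pNo": "PG16", "pAddr": "5 Novar Drive", "rentStart": "9/13",
         "rentFinish": "9/14", "rent": 50, "oNo": "CO93", "oName": "Tony Shaw"},
    ],
}


def normalize(row):
    """Split one UNF row into four relations, each keyed by its determinant."""
    client = {row["cNo"]: {"cName": row["cName"]}}
    rental, prop, owner = {}, {}, {}
    for r in row["rentals"]:
        # Rental is keyed by the composite (cNo, pNo).
        rental[(row["cNo"], r["pNo"])] = {
            "rentStart": r["rentStart"], "rentFinish": r["rentFinish"]}
        # Property attributes depend only on pNo; oNo is kept as a foreign key.
        prop[r["pNo"]] = {"pAddr": r["pAddr"], "rent": r["rent"], "oNo": r["oNo"]}
        # Owner name depends only on oNo.
        owner[r["oNo"]] = {"oName": r["oName"]}
    return client, rental, prop, owner


def rejoin(client, rental, prop, owner):
    """Join the relations back into flat rows to check the split is lossless."""
    rows = []
    for (c_no, p_no), dates in rental.items():
        p = prop[p_no]
        rows.append({"cNo": c_no, **client[c_no], "pNo": p_no, **dates,
                     "pAddr": p["pAddr"], "rent": p["rent"], "oNo": p["oNo"],
                     **owner[p["oNo"]]})
    return rows
```

Rejoining on the keys reproduces the flattened rows, which is the lossless-join property the normalization goal asks for.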
Conceptual DBMS Design - Week 9
User View Integration Strategy (Compared to Centralized) - CB Ref. pp. 304-307
MySQL Workbench CASE Tool
ER/EER Use to Specify an RDBMS Schema
Forward and Reverse Engineering of DB Schemas
Goals and Concepts for Conceptual and Logical Design
General Familiarity with MySQL Workbench Major Features and Methods of Use
DBMS Logical and Physical Design – Week 10/11
Conceptual DBMS Design Methods Mapping onto Physical
Physical Design
– Indexing – B-Tree, B+ Tree, Hash Tables
– Storage Engines – MyISAM, InnoDB (File or Partition), MEMORY
– RAID – RAID 1, RAID 0, RAID 5; RAID 50, RAID 10 (Ignore other RAID levels)
Differences between B-Tree and B+ Tree
Value of R-Tree - Concept Only
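One difference between the two index structures is that a B+ tree keeps records only in its leaves and links the leaves together, so a range query descends once and then walks sideways. The sketch below hand-builds a tiny two-level B+ tree to show that; the class names, the record labels and the convention that a separator key equals the smallest key of its right child are assumptions for illustration, and insertion with node splitting is left out.

```python
from bisect import bisect_right


class Leaf:
    """Leaf node: holds the keys and their records, plus a link to the next leaf."""
    def __init__(self, keys, values):
        self.keys = keys
        self.values = values
        self.next = None


class Inner:
    """Inner node: separator keys only, no record data (unlike a plain B-tree)."""
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children


def find_leaf(node, key):
    """Descend from the root to the leaf that would contain key."""
    while isinstance(node, Inner):
        node = node.children[bisect_right(node.keys, key)]
    return node


def search(root, key):
    """Point lookup: always ends in a leaf."""
    leaf = find_leaf(root, key)
    for k, v in zip(leaf.keys, leaf.values):
        if k == key:
            return v
    return None


def range_scan(root, lo, hi):
    """Range lookup: one descent, then follow the leaf chain."""
    leaf = find_leaf(root, lo)
    out = []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > hi:
                return out
            if k >= lo:
                out.append((k, v))
        leaf = leaf.next
    return out


def build_example():
    """Hand-built tree with three linked leaves under one inner node."""
    leaves = [Leaf([1, 4], ["r1", "r4"]),
              Leaf([7, 10], ["r7", "r10"]),
              Leaf([13, 16], ["r13", "r16"])]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return Inner([7, 13], leaves)
```

In a plain B-tree the same range query would have to move up and down through inner nodes, because records also live there.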
DBMS Security - Week 11/12
Authorization
Authentication
One-Way Hash
Symmetric Encryption
Public Key Encryption
Signing (E.g. Driver Signing)
Networked DBMS Client to DBMS Server
Threats – What are they, how Significant to DBMS
Basic Encryption Concepts and Methods
RAID and RTO/RPO Disaster Recovery Concepts
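As one illustration of the one-way hash concept in an authentication setting, the sketch below stores a salted hash instead of the password itself, using Python's standard `hashlib.pbkdf2_hmac`. The function names, the salt length and the iteration count are assumptions chosen for the example, not a recommendation from the lecture.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor (assumption)


def hash_password(password, salt=None):
    """Return (salt, digest); the digest cannot be reversed to the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per stored password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

The server only ever keeps the salt and the digest, so a leaked table does not directly reveal the passwords.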
DBMS Transactions – Week 12
Recall That Many DBMS Clients can Issue Many Transaction Requests
– Thread Per Connection
– Thread Created for Each Transaction Request
– ACID Goals (Atomicity, Consistency, Isolation, Durability)
Two Solutions in Use Today – Strict 2PL (Locks with Automatic Scoping), Time-stamp Request Queue
– Similar Issues in OS, but Not Exactly the Same [DBMS - Lost Update, Uncommitted Dependency, Inconsistent Analysis]
– Similar Solutions in OS, but Not Exactly the Same (E.g. Semaphores and Locks)
Alternatives for Scheduling Reads/Writes encapsulated in Transactions are Challenging (NP-Hard) and/or Insufficient
– Conflict Serialization – Insufficient to Solve all 3 Problems
– View Serialization – NP Hard Algorithm, Not Practical
Comparison of Methods
Pearson Education © 2014
Orders conflicting operations as “some” serial schedule
Including those with data corruption and inconsistent DB states
Less restrictive operation ordering as “some” serial schedule [Impractical – NP-Complete]
DBMS Transaction Summary
Concurrency due to Multiple Clients and Transactions Mapped to OS Threads [Readers, Writers]
Causes Problems with Split Transactions
– Lost Update
– Dirty Read or Uncommitted Dependency
– Fuzzy Read, Phantom Read
Serialization Using Locks – Not Sufficient for Consistency if Lock Scope Too Small and Can Cause Deadlock!
Recoverability – Atomicity Requirement
2PL – Growing and Shrinking Phase – Helps with Lock Scope, but Still have Deadlock/Livelock
Cascading Roll-back and Rigorous/Strict 2PL
Deadlock Problem – Prevention, Detect & Break
Alternative Timestamp Methods
Rigorous/Strict 2PL with Deadlock Prevention or Detect & Break or Timestamp is Best
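The lost update problem and the effect of holding a lock for the whole read-modify-write can be sketched as follows. The first function replays the bad interleaving by hand so the result is deterministic; the second uses a `threading.Lock` as a stand-in for an exclusive lock held until commit. The `Account` class, the amounts and the function names are hypothetical, and this is not how a DBMS lock manager is actually implemented.

```python
import threading


def lost_update(balance):
    """Replay the bad interleaving by hand: both transactions read, then both write."""
    t1_seen = balance          # T1 reads
    t2_seen = balance          # T2 reads before T1 has written
    balance = t1_seen - 10     # T1 writes its result
    balance = t2_seen + 100    # T2 writes, overwriting T1's update
    return balance


class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()  # stand-in for an exclusive lock on the row


def transact(account, delta):
    """Acquire the lock before the first read and release only after the write."""
    with account.lock:
        seen = account.balance
        account.balance = seen + delta


def run_locked(balance):
    """Run the same two updates as real threads, serialized by the lock."""
    account = Account(balance)
    threads = [threading.Thread(target=transact, args=(account, d))
               for d in (-10, 100)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance
```

Starting from 100, the unprotected interleaving ends at 200 and silently drops the withdrawal, while the locked version ends at 190 whichever thread runs first.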
RDBMS Client Connector - Week 14
Data Processing and User Interface Client Applications Need to Connect to DBMS Server
Improved Features Relative to Direct SQL Command Line for Users
Language Specific Connectors
– ODBC – Open Database Connectivity, Middleware API Standard, Typically Implemented in C Language
– JDBC – Java Database Connectivity (ODBC Bridge)
– MySQL C++ API Connector
– MySQL C API and Driver for C Connector
Study C Examples – conntest.c
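The course example conntest.c is not reproduced here. As a rough illustration of the same connect, execute, fetch pattern that a connector provides, the sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for a MySQL server; the `client` table and the `run_demo` function are hypothetical.

```python
import sqlite3


def run_demo():
    """Open a connection, run parameterized SQL, fetch a row, close."""
    conn = sqlite3.connect(":memory:")  # stand-in for connecting to a DBMS server
    try:
        cur = conn.cursor()
        cur.execute("CREATE TABLE client (cNo TEXT PRIMARY KEY, cName TEXT)")
        # Placeholders keep application data out of the SQL text itself.
        cur.execute("INSERT INTO client VALUES (?, ?)", ("CR76", "John Kay"))
        conn.commit()
        cur.execute("SELECT cName FROM client WHERE cNo = ?", ("CR76",))
        return cur.fetchone()
    finally:
        conn.close()
```

A MySQL connector follows the same overall shape, with the connection call taking host, user and password instead of a file name.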
OODBMS and UML – Week 14
OODBMS, Object Stores, and NoSQL Alternatives to RDBMS
Basic Definitions of Class Hierarchy, Inheritance, Encapsulation, Abstract Class, Concrete or Refined Class, Instantiation of Object, Dynamic Binding, Parametric and Ad-hoc Polymorphism
OOP – Programming Languages for Most Direct Implementation of OOA/OOD
PPL – Persistent Programming Language – One OODBMS Approach to Integrate OOP with DBMS
Type Gap in RDBMS and Type Mapping Between Store and Memory in OODBMS and Object Stores
Recall Object Definition – From Week 4
Instance of a Class [Hierarchy from Abstract that Can’t Be instantiated to Concrete Sub-classes]
Classes Define Public and Private Data and Methods to Operate on that Data
Sub-classes Inherit from Parent Classes and Refine Data Abstraction and Methods
OOPS – Java, C++, …, back to Smalltalk and Lisp CLOS
Encapsulation and Abstraction is the Goal
Implementation Hiding and Interface Definition
Each OOP Has Variations on Support [E.g. Multiple Inheritance, Abstract Classes and Methods, Polymorphism [Parametric, Ad-hoc, Operator Overloading]]
E.g. Oracle’s Java Object Tutorial
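The definitions above (abstract class, concrete sub-class, inheritance, encapsulation, dynamic binding) can be seen together in a small sketch. The `Shape`, `Rectangle` and `Square` classes are made up for illustration, and Python marks private data only by the leading-underscore convention.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract class: defines the interface but cannot be instantiated."""

    def __init__(self, name):
        self._name = name  # leading underscore: private by convention

    @abstractmethod
    def area(self):
        """Each concrete sub-class must supply its own implementation."""

    def describe(self):
        # Dynamic binding: self.area() resolves to the sub-class method at run time.
        return f"{self._name} with area {self.area():.1f}"


class Rectangle(Shape):
    """Concrete sub-class: inherits describe() and implements area()."""

    def __init__(self, width, height):
        super().__init__("rectangle")
        self._width = width
        self._height = height

    def area(self):
        return self._width * self._height


class Square(Rectangle):
    """Refined sub-class: reuses Rectangle's data and methods."""

    def __init__(self, side):
        super().__init__(side, side)
        self._name = "square"
```

Calling `describe()` on a list of mixed `Rectangle` and `Square` objects runs the same inherited code for each while reporting each object's own name and area.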
OODBMS Type Mapping
Persistent OOPL Approach
– Need to Store Instantiated Objects with Persistence Attributes
– Store Attribute Data and Methods (Code)
Pointer Swizzling – OIDs (Index into Look-Up Table) Provides Mapping between Methods (Function Addresses) and Data (Data Addresses)
– Similar to General Descriptor Concepts – E.g. a File pointer and a File descriptor (index that is simple integer)
– Involves OID Look-Up Indirection
No Swizzle – VM Paging of Objects from Storage to Memory
– VM Basic Concept – Logical Memory Space Larger than Physical, Disk Backs Up Pages in Virtual Memory Space not Loaded in RAM
– VM Addressing is Logical Memory Address translated to Physical Address with Page Fault to Handle Unloaded Pages
– OODBMS can Page Objects from Disk to Memory
– VM Pages are Not Normally Durable, but Objects Paged Must be!
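The OID look-up indirection can be sketched with a dictionary acting as the look-up table. The sketch assumes the common textbook reading of swizzling, in which a stored OID is replaced by a direct in-memory reference the first time it is followed; the `DISK` layout, the class names and the `deref` method are hypothetical and do not model any particular OODBMS.

```python
# Hypothetical on-disk image: OID -> attribute data plus references held as OIDs.
DISK = {
    1: {"data": {"pNo": "PG4"}, "refs": {"owner": 2}},
    2: {"data": {"oName": "Tina Murphy"}, "refs": {}},
}


class PersistentObject:
    def __init__(self, oid, data, refs):
        self.oid = oid
        self.data = data
        self.refs = refs  # field -> OID (unswizzled) or PersistentObject (swizzled)


class ObjectStore:
    def __init__(self, disk):
        self.disk = disk
        self.table = {}  # OID look-up table: OID -> object resident in memory
        self.loads = 0   # counts how many objects were read from "disk"

    def fetch(self, oid):
        """Resolve an OID through the look-up table, loading the object on a miss."""
        obj = self.table.get(oid)
        if obj is None:
            rec = self.disk[oid]
            obj = PersistentObject(oid, dict(rec["data"]), dict(rec["refs"]))
            self.table[oid] = obj
            self.loads += 1
        return obj

    def deref(self, obj, field):
        """Follow a reference; swizzle it so later accesses skip the OID look-up."""
        ref = obj.refs[field]
        if isinstance(ref, PersistentObject):
            return ref                 # already a direct in-memory reference
        target = self.fetch(ref)       # OID look-up indirection
        obj.refs[field] = target       # replace the OID with the reference
        return target
```

The integer OID plays the same role as a file descriptor: a simple index that the store translates into the real in-memory object.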
Exam #2 - Summary
OODBMS is a Work in Progress, but Promising
NoSQL, Object Stores, Innovation in File Systems
RDBMS Remains Ideal for Transaction Workloads
– Security Challenges with Concentrated Large R-DBMS Breaches
– Block-chain ledgers and distributed DBs potential alternative
– R-DBMS less well adapted to unstructured data (video, images, documents), solved with parallel R-DBMS + file system
OODBMS, Object Stores, NoSQL – Growing Alternative for Newer Workloads – Object Oriented Data (BLOBs) and Big Data Analytics
RDBMS Impedance with Procedural Programming
SQL Extended to Include Stored Procedures, Functions and Type Extension
– Useful for secure execution and data processing on a server
– SQL is not ideal however for GUIs, Apps, etc.
Connectors are a patch solution for clients
– Security issues between client App and SQL server
– Apps can be any procedural programming language
No-SQL, OODBMS is another alternative
– Persistent OO Class instances – objects are stored
– UML and OOA/OOD -> OOP in SE310 for SEs
UML – Week 13
Compare to ER/EER and Concepts Only
Unified Set of OO Diagramming Methods for OOA/OOD
Basic Concepts on Indicated Diagram Types
UML – Design Goals
Provide ready-to-use, expressive visual modeling language …
Provide extensibility and specialization …
Be independent of particular programming languages and development processes.
Provide a formal basis …
Encourage growth of object-oriented tools market.
Support higher-level development concepts such as collaborations, frameworks, patterns, and components.
Integrate best practices.
UML - Diagrams
Structural:
– Class diagrams (similar to EER with Hierarchy, Attributes, and Class Name, Methods Added)
– object diagrams (similar to ER)
– component diagrams
– deployment diagrams.
Behavioral:
– use case diagrams
– sequence diagrams
– collaboration diagrams
– statechart diagrams
– activity diagrams.
UML Class Diagram
Very Similar to EER (Almost Identical, Just Added Detail and Extensions) – Lecture-Week-6-2
– Encapsulation of Attributes and Methods to Operate on Them
– Recall RDBMS has SQL/PSM and ORDBMS Has Extensions for Methods
– EER However Has Class Hierarchy and Attributes, but No Methods Specified
http://www.agilemodeling.com/artifacts/classDiagram.htm
Lecture-Week-6-2, Slide 11 to 16
UML – Object Diagrams
Model instances of classes and are used to describe the system at a particular point in time.
Can be used to validate class diagram with “real world” data and record test cases.
Similar to ER (Lecture Week-6-1)
UML – Deployment Diagrams
Depict configuration of run-time system, showing hardware nodes, components that run on these nodes, and connections between nodes.
UML – Statechart Diagrams
Show how objects can change in response to external events.
Usually model transitions of a specific object.
Distributed DBMS – Week 15
Why? – Scaling and Disaster Recovery
DR
– Datacenters in 2 Geo Locations – Natural Disaster, Terrorist Attack, WAN outages, Power grid failure, etc.
– RTO / RPO – Recovery Time and Point Objectives
– Asynchronous Mirrors, Active / Passive Fail-Over and Fail-Back
Cluster Scaling – Shared Nothing
– Load Balance Large Number of Clients and Transactions
– Scaling Beyond Single Server
– Synchronous Low-latency / High-bandwidth Network (e.g. 10GE, 40G Infiniband, 100G Converged Enhanced Ethernet)
MySQL Replication
Used for DR [Disaster Recovery] and for Geo Content and Services Distribution
– East Coast / West Coast – ERAU Daytona, ERAU Prescott
– ACTIVE-ACTIVE for Reads, Writes to MASTER
http://dev.mysql.com/doc/refman/5.0/en/replication.html
Asynchronous Compared to Clustering [Synchronous]
One Server is Master - MySQL Replication Solutions
MySQL Replication – DR Fail-Over
Fail-Over Master in Case of Disaster
AFTER MASTER FAILURE
Challenges – Fail-Back [Manual] and Split-Brain
To Restore MASTER We Must:
1) Repair Issue [E.g. HW fix]
2) Re-Sync Writes
3) Fail-Back to Original MASTER
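The three restore steps can be walked through in a toy simulation. This is not MySQL's replication mechanism or API: the `Server` class, the `ship_log` function and the write labels are hypothetical, a Python list stands in for the replicated write log, and the sketch assumes the replica had caught up before the master failed, so there is no divergence or split-brain to resolve.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.log = []    # ordered write log, stand-in for a replicated log
        self.up = True


def ship_log(source, target):
    """Asynchronous catch-up: copy the entries the target has not applied yet."""
    missing = source.log[len(target.log):]
    target.log.extend(missing)
    return len(missing)


def failover_and_failback():
    master, replica = Server("master"), Server("replica")

    active = master
    active.log += ["w1", "w2"]           # normal operation: writes go to the master
    ship_log(master, replica)            # replica catches up asynchronously

    master.up = False                    # disaster: master is lost
    active = replica                     # fail-over: replica takes the writes
    active.log += ["w3", "w4"]

    master.up = True                     # 1) repair the issue
    resynced = ship_log(replica, master) # 2) re-sync writes made during the outage
    active = master                      # 3) fail-back to the original master
    return active.name, resynced, master.log
```

Any write accepted by the old master but not yet shipped at the moment of failure would be missing from the replica, which is the data-loss window that RPO describes.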