day1 - rhodeskirlinp/courses/db/f16/lectures/day1.pdf · equivalent representations of a relation...
TRANSCRIPT
Databases
Standardstuff
• Classwebpage:cs.rhodes.edu/db• Textbook:getitsomewhere;usedisfine– Stayupwithreading!
• Prerequisite:CS241• Coursework:– Homework,groupproject,midterm,final
• Bepreparedtobringlaptopseverysooften.
Groupproject• Youwilldesignandimplementyourowndatabase-drivenwebsite.
• Ideas:shopping,auctions,writeabetterBannerWeb,library/bibliographysystem,reviewsalaYelp,bank,finance/stocks,jobpostings,socialnetworkingalaFacebook,recipes,movies,apartments,…
• Groups:probably4-5people,formedonyourown.
• Spreadoutoverthewholesemester;check-insalongtheway.
Whystudydatabases?
• Academicreasons• Programmingreasons• Business(getajob)reasons• Studentreasons
Whatwillyoulearn?
• Databasedesign– Howdoyoumodelyourdatasoitcanbestoredinadatabase?
• Databaseprogramming– HowdoIuseadatabasetoaskitquestions?
• Databaseimplementation– Howdoesthedatabaseitselfwork;i.e.,howdoesitstore,find,andretrievedataefficiently?
Whatisthegoalofadatabase?
• Electronicrecord-keeping,enablingfast andconvenient accesstotheinformationinside.
• DBMS=Databasemanagementsystem– Softwarethatstoresindividualdatabasesandknowshowtosearchtheinformationinside.
– RDBMS=RelationalDBMS– Examples:Oracle,MSSQLServer,MSAccess,MySQL,PostgreSQL,IBMDB2,SQLite
DBMSFeatures
• Supportmassiveamountsofdata– Giga-,tera-,petabytes
• Persistentstorage– Datacontinuestolivelongafterprogramfinishes.
• Efficientandconvenientaccess– Efficient:don'tsearchtheentirethingtoansweraquestion!
– Convenient:allowuserstoaskquestionsaseasilyaspossible.
• Secure,concurrent,andatomicaccess
Example:buildabetterBannerWeb
• Professorsofferclasses,studentssignup,getgrades
• Whataresomequestionswe(studentsorfaculty)couldask?– FindmyGPA.– …
• Whyaresecurity,concurrency,andatomicityimportanthere?
Obvioussolution:Folders
• Advantages?
• Disadvantages?
Obvioussolution++
• TextfilesandPython/C++/Javaprograms
Obvioussolution++
• Let'suseCSV:
Hermione,Granger,R123,Potions,ADraco,Malfoy,R111,Potions,BHarry,Potter,R234,Potions,ARonald,Weasley,R345,Potions,C
Hermione,Granger,R123,Potions,ADraco,Malfoy,R111,Potions,BHarry,Potter,R234,Potions,ARonald,Weasley,R345,Potions,CHarry,Potter,R234,Herbology,BHermione,Graner,R123,Herbology,A
File1:Hermione,Granger,R123 Draco,Malfoy,R111 Harry,Potter,R234 Ronald,Weasley,R345File2:R123,Potions,AR111,Potions,BR234,Potions,AR345,Potions,CR234,Herbology,BR123,Herbology,A
Problems
• Inconvenient– needtoknowPython/C++/Javatogetatdata!
• Redundancy/inconsistency• Integrityproblems• Atomicityproblems• Concurrentaccessproblems• Securityproblems
Whyarethereproblems?
• Twomainreasons:– ThedescriptionofhowthefilesarelaidoutisburiedwithinthePython/C++/Javacodeitself(ifit'sdocumentedatall)
– Thereisnosupportfortransactions (supportingconcurrency,atomicity,integrity,andrecovery)
• DBMSshandleexactlythesetwoproblems.
Relationaldatabasesystems
• EdgarF.Codd wasaresearcheratIBMwhoconceivedanewwayoforganizingdatabasedonthemathematicalconceptofarelation.
• Relation:asetoforderedtuples(oh,no,CS172stuff…)
• RDBMS=Relationaldatabasemanagementsystem.
• Therelationalmodelusesrelations(akatables)tostructuredata.
• Gradesrelation:First Last Course Grade
Hermione Granger Potions A
Draco Malfoy Potions B
Harry Potter Potions A
Ronald Weasley Potions C
• Relationalmodelisanabstraction.• Separatesthelogicalview(asviewedbytheDBuser)fromthephysicalview(DB'sinternalrepresentationofthedata)
First Last Course Grade
Hermione Granger Potions A
Draco Malfoy Potions B
Harry Potter Potions A
Ronald Weasley Potions C
• Structuredquerylanguage(SQL)foraccessing/modifyingdata:
• FindallstudentswhoaregettingaB.– SELECT First, Last FROM Grades WHERE Grade = "B"
First Last Course Grade
Hermione Granger Potions A
Draco Malfoy Potions B
Harry Potter Potions A
Ronald Weasley Potions C
Transactionprocessing• OneormoreDBoperationscanbegroupedintoatransaction.
• ForaDBMStoproperlyimplementtransactions:• Atomicity:All-or-nothingexecutionoftransactions.
• Consistency:ADBcanhaveconsistencyrulesthatshouldnotbeviolated.
• Isolation:Eachtransactionmustappear tobeexecutedasifnoothertransactionsarehappeningsimultaneously.
• Durability:Anychangesatransactionmakesmustneverbelost.
Ontotherealstuffnow…
DataModels
• Awayofdescribingdata.– Better:adescriptionofhowtoconceptuallystructurethedata,whatoperationsarepossibleonthedata,andanyconstraintsonthedata.
• Structure:howweviewthedataabstractly• Operations:whatispossibletodowiththedata?
• Constraints:howcanwecontrolwhatdataislegalandwhatisnot?
Relationalmodel
• Structure:relation(table)• Operations:relationalalgebra(selectcertainrows,certaincolumns,wherepropertiesaretrue/false)
• Constraints:canenforcerestrictionslikeGrademustbein{A,B,C,D,F}
First Last Course Grade
Hermione Granger Potions A
Draco Malfoy Potions B
Harry Potter Potions A
Ronald Weasley Potions C
Othermodels• Semi-structureddatathatisstill“structured”butnotinrelationalformat.– XML,JSON
• Objectdatabases,orobject-relational• Graphdatabases• NoSQL,NewSQL
Semi-structuredmodel
• Structure:Treesorgraphs– e.g.,XML
• Operations:Followpathsintheimpliedtreefromoneelementtoanother.– e.g.,XQuery
• Constraints:canconstraindatatypes,possiblevalues,etc.– e.g.,DTDs(documenttypedefinition),XMLSchema
Object-relational
• Similartorelational,but– Valuesinatablecanhavetheirownstructure,ratherthanbeingsimplestringsorints.
– Relationscanhaveassociatedmethods.
Relationalmodelismostcommon
• Simple:builtaroundasingleconceptformodelingdata:therelationortable.– Arelationaldatabaseisacollectionofrelations.– Eachrelationisatablewithrowsandcolumns.– AnRDBMScanmanagemanydatabasesatonce.
• Supportshigh-levelprogramminglanguage(SQL)– Limitedbutusefulsetofoperations.
• Haselegantmathematicaltheorybehindit.
RelationTerminology
• Relation==2Dtable– Attribute ==columnname– Tuple ==row(nottheheaderrow)
• Database==collectionofrelationsFirst Last Course Grade
Hermione Granger Potions A
Draco Malfoy Potions B
Harry Potter Potions A
Ronald Weasley Potions C
RelationTerminology
• Arelationincludestwoparts:– Therelationschema definesthecolumnheadingsofthetable(attributes/fields)
– Therelationinstance definesthedatarows(tuples,rows,orrecords)ofthetable.
First Last Course Grade
Hermione Granger Potions A
Draco Malfoy Potions B
Harry Potter Potions A
Ronald Weasley Potions C
Schema
• Aschemaiswrittenbythenameoftherelationfollowedbyaparenthesizedlistofattributes.– Grades(First, Last, Course, Grade)
• ArelationaldatabaseschemaisthesetofschemasforalltherelationsinaDB.
First Last Course Grade
Hermione Granger Potions A
Draco Malfoy Potions B
Harry Potter Potions A
Ronald Weasley Potions C
Domains
• ArelationalDBrequiresthateverycomponentofarow(tuple)haveaspecificelementarydatatype,ordomain.– string,int,float,date,time(nocomplicatedobjects!)
Grades(First:string, Last:string, Course:string, Grade:char)
Equivalentrepresentationsofarelation
Grades(First, Last, Course, Grade)• Relationisaset oftuples,notalist.• Attributesinaschemaareaset aswell.– However,theschemaspecifiesa"standard"orderfortheattributes.
• Howmanyequivalentrepresentationsarethereforarelationwithm attributesandn tuples?
First Last Course Grade
Hermione Granger Potions A
Draco Malfoy Potions B
Harry Potter Potions A
Ronald Weasley Potions C
Degreeandcardinality
• Degree/arity ofarelationisthenumberofattributesinarelation.
• Cardinality isthenumberoftuplesinarelation.
First Last Course Grade
Hermione Granger Potions A
Draco Malfoy Potions B
Harry Potter Potions A
Ronald Weasley Potions C
Keystoagoodrelation(ship)
Keysofarelation
• Keysareakindofintegrityconstraint.• AsetofattributesKformsakeyforarelationRif:– weforbidtwotuplesinaninstanceofRtohavethesamevaluesforallattributesofK.
First Last Course Grade
Hermione Granger Potions A
Draco Malfoy Potions B
Harry Potter Potions A
Ronald Weasley Potions C
Grades(First, Last, Course, Grade)
Keysofarelation
• Keyshelpassociatetuplesindifferentrelations.
SID CRN Grade
123 777 A
111 777 B
234 777 A
345 777 C
SID First Last
123 Hermione Granger
111 Draco Malfoy
234 Harry Potter
345 Ronald Weasley
CRN Name Semester Year
777 Potions Fall 1997
888 Potions Spring 1997
999 Transfiguration Fall 1996
789 Transfiguration Spring 1996
Example
• Let'sexpandtheserelationstohandlethekindsofthingsyou'dliketoseeinBannerWeb.
• Keeptrackofstudents,professors,courses,whoteacheswhat,enrollments,pre-requisites,grades,departments&theirchairs.– Onlyonechairperdepartment.– Studentcannotenrollinmultiplecopiesofthesamecourseinonesemester.
– Otherconstraintsthatarelogical.