New York Conference 2005
Creating and Maintaining a Database
The DBA's Job
Tasks
- Design
  - Logical Design
  - Physical Design
  - Documentation
- Implementation
  - Test
  - Performance
  - Security
  - Concurrent Updates
- Maintenance
  - Backup
  - Recovery
  - Data Integrity
- New Releases
  - SIR
  - Application
Designing a Relational Database
- Normalization
  - Eliminate redundant data
  - Identify data dependencies - keys
- 1st Normal Form
  - One value per column
  - Unique primary key
- 2nd Normal Form
  - No subsets of data in multiple rows of a table
- 3rd Normal Form
  - All columns fully dependent on primary key
Example
Order #  Customer #  Address  Part #  Description  Unit Price  Quantity  Total
1234     409         xxxxxx   10      xxxxxx       $10.00      5         50.00
1234     409         xxxxxx   20      xxxxxx       $15.00      3         45.00
Possible Tables
- Order - Order #
  - Customer #
- Order Item - Order # Line #
  - Product Code
  - Qty
  - Unit Price ?
- Customer - Customer #
  - Address
- Product - Product #
  - Description
  - Unit Price
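
The split into four tables can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names are assumptions, and copying Unit Price onto the order item is one possible answer to the question mark on the slide, not a recommendation from the talk.

```python
import sqlite3

# Hypothetical schema for the four possible tables; all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, address TEXT);
CREATE TABLE product  (product_no INTEGER PRIMARY KEY, description TEXT, unit_price REAL);
CREATE TABLE orders   (order_no INTEGER PRIMARY KEY,
                       customer_no INTEGER REFERENCES customer(customer_no));
CREATE TABLE order_item (order_no INTEGER REFERENCES orders(order_no),
                         line_no INTEGER,
                         product_no INTEGER REFERENCES product(product_no),
                         qty INTEGER,
                         unit_price REAL,  -- price at time of order (a design choice)
                         PRIMARY KEY (order_no, line_no));
""")

# Load the two example rows from the unnormalized table.
conn.execute("INSERT INTO customer VALUES (409, 'xxxxxx')")
conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                 [(10, "xxxxxx", 10.00), (20, "xxxxxx", 15.00)])
conn.execute("INSERT INTO orders VALUES (1234, 409)")
conn.executemany("INSERT INTO order_item VALUES (?, ?, ?, ?, ?)",
                 [(1234, 1, 10, 5, 10.00), (1234, 2, 20, 3, 15.00)])

# Total is derived data, so it is computed in the query rather than stored.
order_lines = conn.execute("""
    SELECT o.order_no, o.customer_no, i.product_no, i.qty, i.unit_price,
           i.qty * i.unit_price AS total
    FROM orders o JOIN order_item i ON i.order_no = o.order_no
    ORDER BY i.line_no
""").fetchall()
```

Joining the tables reproduces the original two rows, while the customer address and product description are each stored once.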
Keys
- Must be unique
- Good if real world
  - Employee Id/Product Code etc.
- May not be the only access required
- Should be short
- Avoid unformatted alphabetic
- If subordinate repeating group, consider sequence number
Normalized Implementation
- Know the rules
  - Know the application
- Alternatives
  - How many repeats of a column/group?
  - Dependent data volatility/convenience
- Document
  - Variables - labels, descriptions
  - Records - keys, variables, foreign keys
SIR Schema
- Case Definition
  - Case Id
  - Max Counts
- Record Definition
  - Key Fields
  - Max Counts
  - Default Security
  - Variables within records
- Documentation command for case and record
- Variable Definition
  - Format & Position
  - Missing Values
  - Valid Values
  - Value Labels
  - Categorical Vars
  - Variable Ranges
  - Variable Label
  - Extended label for variable documentation
  - Variable Security
Schema functions in PQL
- 60+ database functions
- 30 tabfile functions
- Examples:
  - NRECS
  - RECNAME
  - NKEY
  - KEYNAME
  - NVARS
  - VARNAME
  - VARLABSC
  - VFORMAT
  - VTYPE
- Sec Index Functions
  - DBINDS
  - DBINDR
  - DBINDV
  - DBINDT
Quick Data Dictionary
- Four Record Types
  - Variables
  - Records
  - Record keys
  - Record data
- Populate from any database
- Check consistency
Example Data Dictionary
- Create
- Populate from MNYR
  - 55 record types
  - 2216 variables in records
- Check consistent use of variables
  - Labels
  - Formats
  - Types
- Identify foreign keys
- Look at secondary indexes
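
The consistency check on variables might look like the following Python sketch. It assumes the dictionary has been exported as one row per variable per record type; the row layout and the sample record and variable names are hypothetical, not taken from MNYR.

```python
from collections import defaultdict

def inconsistent_variables(dictionary_rows):
    """Return variable names whose label, format or type differs between record types."""
    seen = defaultdict(set)
    for row in dictionary_rows:
        seen[row["name"]].add((row["label"], row["format"], row["type"]))
    return sorted(name for name, variants in seen.items() if len(variants) > 1)

# Hypothetical dictionary rows: one per variable per record type.
rows = [
    {"record": "PATIENT", "name": "ID", "label": "Patient id", "format": "I4", "type": "integer"},
    {"record": "VISIT", "name": "ID", "label": "Patient id", "format": "I4", "type": "integer"},
    {"record": "VISIT", "name": "VDATE", "label": "Visit date", "format": "DATE", "type": "date"},
    {"record": "LAB", "name": "VDATE", "label": "Date of visit", "format": "DATE", "type": "date"},
]
```

A variable that appears in several record types with identical definitions is also a natural foreign key candidate, which is why the same grouping helps with the next bullet.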
SIR Structures
- Multiple Database
  - Until SIR2000 exactly one database in SIR session
  - Design suggests separate databases for separate hierarchies
  - Had to use 'dummy' cases in single database
- Inverted Lists
  - Until SIR2002 no secondary index
  - Had to use 'dummy' cases for inverted list
- Auto Increment Keys
Physical Structure
- Single Data File
- Two types of blocks
  - Index
    - Contain keys plus pointers to other blocks
    - Single top level block
    - From one to six further levels
    - Bottom level points to data block
  - Data
    - Contain keys and data
SIR Data File
[Diagram: a tree with the Top Level Index at the root, Index Level 1 blocks beneath it, then the Bottom Level Index, then Data Blocks]
Data Record
- Header
  - Size
  - Update level
  - Lock status
- Separate Key in front of record
  - All keys same size in single database
  - Case id, record number, record key fields
  - Special so can be searched
- Record organized by data format
  - Real8, real4, I4, I2, I1, Character
CIR (Common Information Record)
- One per case
- Count for each record type
  - e.g. max rec types 100: 100 integers
  - I1 - up to 123
  - I2 - up to 32,000
  - I4 - over this
- Common variables
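
As a rough illustration of the count sizes, the Python sketch below picks an integer format per record type. It assumes the format is chosen from each record type's maximum count, which the slide implies but does not state outright; the function names are hypothetical.

```python
def count_format(max_count):
    """Smallest integer format for a record-type count, using the limits on the slide."""
    if max_count <= 123:
        return "I1"
    if max_count <= 32_000:
        return "I2"
    return "I4"

BYTES = {"I1": 1, "I2": 2, "I4": 4}

def cir_count_bytes(max_counts):
    """Bytes used by the counts in one CIR (assumption: one count per record type)."""
    return sum(BYTES[count_format(m)] for m in max_counts)
```

For example, record types with maximum counts of 100, 5,000 and 100,000 would need 1 + 2 + 4 = 7 bytes of counts under these assumptions.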
Size of Blocks
- Index
  - Calculated from key size and number of records
  - Minimum 2K (253 dwords), maximum 32K
- Data
  - Calculated from maximum record size and number of records
  - Minimum 2K (254 dwords), maximum 32K
  - Minimum 4 records per block
Index Calculation
- Example: Key Length 16, Number of records 1 million
  - At 4 per block need 250,000 blocks
  - Each index entry takes 3 dwords (key in dwords + 1 dword for pointer & count)
  - Minimum block holds 253/3 = 84 entries
  - Top level: single block, 84 entries
  - Second level: 84 blocks each with 84 entries
  - Two level index points to 7,056 data blocks
  - Three level points to 592,704 data blocks
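
The arithmetic on this slide can be checked with a short Python sketch. It assumes a dword is 8 bytes, which is inferred from a 2K block holding 253 usable dwords and from a 16-byte key taking 2 dwords; the function names are hypothetical.

```python
import math

DWORD_BYTES = 8  # assumption: inferred from 2K = 253 usable dwords

def entries_per_index_block(key_bytes, block_dwords=253):
    """Index entries per block: each entry is the key in dwords plus 1 dword."""
    entry_dwords = math.ceil(key_bytes / DWORD_BYTES) + 1  # key + pointer & count
    return block_dwords // entry_dwords

def index_levels_needed(data_blocks, fanout):
    """Index levels needed to reach every data block (the slides imply at least two)."""
    levels, reach = 2, fanout * fanout
    while reach < data_blocks:
        levels += 1
        reach *= fanout
    return levels

fanout = entries_per_index_block(16)               # 16-byte key, minimum block
data_blocks = math.ceil(1_000_000 / 4)             # 4 records per data block
levels = index_levels_needed(data_blocks, fanout)  # two levels reach only 7,056 blocks
```

With these inputs the fan-out is 84, so two levels are not enough for 250,000 data blocks and a third level is needed.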
Data Block Growth
- Add first record
- Start with 3 blocks
  - Top Level Index
    - 1 entry
    - Key of record
    - Points to second level
  - Second level index
    - 1 entry
    - Key of record
    - Points to data block
  - Data Block - 1 entry
- Add records
  - Find data block
  - Put record in block in correct key sequence
  - If new record is first in block, update the higher level entry that points to it
- Block Too Big
  - Split into two blocks
  - Add new key to higher level
Loading Factor
- How to split block
- Records being added randomly
  - Split block in middle
  - Any block added to likely to have space
  - .5 loading factor
- Records added in ascending key
  - Leave original block as full as possible
  - All adding to new block
  - .99 loading factor
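
The effect of the loading factor on a block split can be sketched in Python as follows. This is a simplified model that counts keys rather than bytes, and `split_block` is a hypothetical helper, not a SIR routine.

```python
def split_block(keys, loading_factor):
    """Split a full, sorted block, keeping about loading_factor of it in the original."""
    # Always leave at least one key on each side of the split.
    keep = max(1, min(len(keys) - 1, round(len(keys) * loading_factor)))
    return keys[:keep], keys[keep:]

full_block = list(range(1, 11))                 # ten keys in ascending order
random_case = split_block(full_block, 0.5)      # split in the middle
ascending_case = split_block(full_block, 0.99)  # original stays as full as possible
```

With a factor of .5 both halves have room for random inserts; with .99 the new block starts almost empty, which suits keys that only ever arrive in ascending order.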
Suggested Loading Factors
- Standard Updates - .5
  - If set very high and activity all on original block, lots of empty new blocks
- IMPORT - .99
  - Let standard updates split blocks when needed
- RELOAD - .99
  - No block splitting
  - Block filled to loading factor
  - Space for largest record
List Stats Info
  Number of Index Levels         2
  Max Entries Per Index Block    42
  Index/Data Block Size          253/3314
  Active/Inactive Data Blocks    92/0
  Active/Inactive Index Blocks   4/0
  Keysize In Bytes               34
  Min/Max Record Size            0/808
Inactive Blocks
- New Blocks added at end
- List maintained of empty blocks
- Re-used when new block needed
- All records deleted in stand alone
- Block update strategy in Master
Master
- Allows multiple users to update a database concurrently
- Intended primarily for multiple interactive users
- Communicates via TCP/IP
- Machine dependent database access
- Provides a consistent database view for independent retrievals from database
Master Operation
- Start Master
  - Starts with an address
  - Waits for client message
  - Does nothing else, NO database access
- Master is NOT permanently connected to any specific database
- Client tells master which database to connect to
Database Access
- Database is opened and closed during SIR session as needed
  - During PQL retrievals
  - During utilities (No master)
    - Batch Data Input
    - Export, Unload, Spreadsheet, ...
  - During schema updates (No master)
  - 'Old' Forms while form is running
Database Access
- Database open for write
  - Single User - Exclusive Use
  - Master - Shared Read
- SIR database files
  - sr1/sr2 - meta data - needed by both client (read only) and master
  - sr3 - data - controlled by master
  - sr4 - procedures - controlled by client
  - sr5 - journal - controlled by master
  - sr6 - sec. index - controlled by master
Master: How it Works, Part 1
Single user SIR allows multiple readers OR a single writer to a database
[Diagram: three SIR sessions (Copy 1, Copy 2, Copy 3); either User A reads and writes exclusively, OR Users B and C share read only access]
Master: How it Works, Part 2
Master allows multiple writers, readers plus independent readers
[Diagram: three SIR sessions (Copy 1, Copy 2, Copy 3), Master and the SIR Data File; Users A and B read and write, User C is an independent read only user]
How does Master work?
- Client changes access to use Master
- Lock Manager for clients accessing through Master
- Delayed view of updates
- 'Difference File Copy' for independent readers
Client
- SIR session switches from single user to use a specific master
  - Master must be available at this point
  - All subsequent retrievals then automatically use Master
  - Various utilities not available
- Sends Master a request for single data record at a time
  - Master selects on key values
  - Client does any selection on data values
- Data requests preceded by lock requests
Master
- Gets initial logon from client
  - Allocates identifier
- Gets database open from client
  - Checks if already known (open by another client)
  - Allocates identifier
- Database Identification
  - Full pathname is passed by client
  - Path is 'as seen' by client
    - Client needs to find database
  - Master needs to find database
  - Master needs to know that database referenced by multiple clients is same database
Master Resource Control
- Gets request for lock on resource (e.g. case/record key)
  - Checks lock table
  - Creates entry if resource available
- Gets request for record retrieval
- Gets request for record update
Lock Types
- Transmitted by client from PQL. Checks existing lock on resource
  1 = Null - becomes exclusive in Update, concurrent read in retrieval
  2 = Concurrent read - Fails if exclusive
  3 = Concurrent write - Fails if protected or exclusive
  4 = Protected read - Fails if concurrent write, protected write or exclusive
  5 = Protected write - Fails if concurrent write, protected or exclusive
  6 = Exclusive - Fails if any prior lock
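
The failure rules listed for each lock type amount to a compatibility table. The Python sketch below encodes them; it is an illustration rather than Master's actual lock manager, and it reads "protected" on its own as meaning both protected read and protected write. The names `BLOCKED_BY`, `resolve_null` and `can_grant` are hypothetical.

```python
# Lock codes as listed on the slide; the Null lock (1) is resolved before checking.
CR, CW, PR, PW, EX = 2, 3, 4, 5, 6

# For each requested lock, the already-held locks that make the request fail.
BLOCKED_BY = {
    CR: {EX},
    CW: {PR, PW, EX},         # assumption: "protected" covers PR and PW
    PR: {CW, PW, EX},
    PW: {CW, PR, PW, EX},     # assumption: "protected" covers PR and PW
    EX: {CR, CW, PR, PW, EX}, # fails if any prior lock
}

def resolve_null(lock, updating):
    """Null becomes exclusive in an update and concurrent read in a retrieval."""
    if lock == 1:
        return EX if updating else CR
    return lock

def can_grant(requested, held_locks):
    """True if the requested lock is compatible with every lock already held."""
    return not (BLOCKED_BY[requested] & set(held_locks))
```

For instance, `can_grant(CR, [CW, PW])` succeeds because a concurrent read only fails against an exclusive lock, while `can_grant(EX, [CR])` fails because any prior lock blocks an exclusive request.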
Locks in PQL
- Ignored in single user mode
- RETRIEVAL
  - LOCK = CR, CW, PR, PW, EX (2,3,4,5,6)
  - CIRLOCK, RECLOCK
  - Default: Update - EX, Retrieval - CR
- CASE/RECORD commands
  - LOCK = numeric_expression
  - Nested case/records inherit outer lock
  - Lock held until NEXT or EXIT at this level
Locked Case/Record
- Block is entered
- Variables set to undefined
- Test status with functions:
  - SYSTEM(36) = 1 Record available
  - SYSTEM(37) = 1 Case available
  - SYSTEM(38) = 1 Master mode
- Wait and retry, tell user with option, ...
  - RETRY CASE|RECORD
'Delayed' Updates
- Enables independent retrieval to have consistent view of data, i.e. no updates seen while retrieval running
- Master creates local copy of master index block
- Whenever index or data block rewritten for first time, Master allocates new block
- Keeps list of redundant blocks (index & data)
- Identical process on secondary indexes
Difference File Copy
- Makes updates available to independent retrieval
- Increments update level
- Creates journal header
- Writes master index
- Makes redundant blocks available if no other users (can get exclusive access)
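
The delayed update scheme and the difference file copy behave like copy-on-write with a periodically published index. The Python sketch below is a toy model of that idea only: it ignores journals, secondary indexes and the index tree, and every class and method name is hypothetical.

```python
class DelayedUpdateStore:
    """Toy model: independent readers see the published index; Master updates a local copy."""

    def __init__(self, blocks):
        self.storage = dict(blocks)              # physical blocks by id
        self.published = {b: b for b in blocks}  # logical id -> physical id seen by independent readers
        self.working = dict(self.published)      # Master's local copy of the master index
        self.redundant = []                      # superseded physical blocks
        self.update_level = 0
        self.next_id = max(blocks) + 1

    def write(self, logical_id, data):
        current = self.working[logical_id]
        if current == self.published[logical_id]:
            # First rewrite since the last copy: allocate a new block instead of overwriting.
            self.redundant.append(current)
            current = self.next_id
            self.next_id += 1
            self.working[logical_id] = current
        self.storage[current] = data

    def independent_read(self, logical_id):
        return self.storage[self.published[logical_id]]

    def difference_file_copy(self):
        self.update_level += 1
        self.published = dict(self.working)  # "write the master index"
        for block in self.redundant:         # assumes no other users are active
            del self.storage[block]
        self.redundant.clear()
```

Until `difference_file_copy()` runs, an independent reader keeps seeing the old block contents, however many times Master rewrites the new block.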
Managing Master
- Start Parameters
  - MST =
  - PW =
  - DFC =
MST = parameter
- Master finds machine name, port 3000
- MST = change port number to even_number
- CLIENT MST = machine_name[:port]
  - The machine name consists of a host and a domain. It makes the start up for clients faster to quote both the host and domain name (DNS)
  - Start Master - Master started SirNT:3000
  - Start Forms
    - MST=SirNT
    - MST=SirNT.sir.com.au
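
The client form `machine_name[:port]` can be illustrated with a small Python parser. This is a sketch of the syntax only, not SIR's own parsing; `parse_mst` is a hypothetical name and port 3002 in the usage note is an invented example.

```python
DEFAULT_PORT = 3000  # default port named on the slide

def parse_mst(value):
    """Split 'machine_name[:port]' into (host, port), defaulting the port."""
    host, sep, port = value.partition(":")
    return host, (int(port) if sep else DEFAULT_PORT)
```

`parse_mst("SirNT")` gives `("SirNT", 3000)`, and `parse_mst("SirNT.sir.com.au:3002")` gives the fully qualified host with the explicit port.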
Other parameters
- PW = password
  - Any remote user who wants to administer master must specify a matching password
- DFC = minutes since a difference file copy which would force an automatic copy
Administering Master
- Interrupt
  - No users being served
  - Commands
  - No password
  - Usage Statistics
- Remote
  - Other users still active
  - Menu driven
  - Password Protected
Administering Master
- List logged on users
- List attached databases
- Stop
  - Immediately
  - After users logoff
- Difference File copy
  - Set interval
Backup and Recovery
- Unload
  - Header
  - Internal copy of database
  - Machine specific/SIR version specific
  - Brings all unloaded records up to current schema definition
  - Can have multiple unloads on same file
  - Accessed sequentially
- Journal
  - Header
  - Copy of database record after update
Update Level & Restructure
- Update level incremented when database open/closed for update
- When record written, update level held in record header
- If schema changed, old and new version kept with update level
- When record read, if record update level earlier than schema change, record is transformed
- When record written, in new format
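
The read-time transformation can be sketched in Python as below. The record layout, the list of schema versions and the transform function are all hypothetical stand-ins for whatever SIR keeps internally; the point is only that a record older than a schema change is converted when it is read.

```python
def read_record(record, schema_versions):
    """Bring a stored record up to the current schema when it is read.

    schema_versions is a list of (update_level, transform) pairs in ascending
    order; each transform converts a record written before that level.
    """
    data = dict(record["data"])
    for level, transform in schema_versions:
        if record["update_level"] < level:
            data = transform(data)
    return data

# Hypothetical schema change at update level 5: a new variable is added.
schema_versions = [(5, lambda d: {**d, "STATUS": None})]
old = {"update_level": 3, "data": {"ID": 1}}
new = {"update_level": 7, "data": {"ID": 2, "STATUS": "A"}}
```

Reading `old` adds the missing variable as undefined, while `new` is returned unchanged because it was written after the schema change.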
Immediate Unload
- Length of key changes
  - Record type is in key, so an increase in its length forces an unload (number of record types > 123)
- Key definition changes for existing record
- Record is locked until reloaded
Reload
- Reload takes specific unload
- Defined by update level
- Database is recreated
- If complete set of journals, can be applied to bring reloaded database up to date
- ITEMIZE lists unloads or journals
EXPORT
- Text version of database
- Machine Independent
- IMPORT rebuilds completely
- SIR version independent
- Choose for long term archive
VERIFY
- Walks index
  - Retrieves each data block
  - Checks counts and pointers
  - Reports structural problems
  - Patch puts calculated value in counts
  - Clear corruption flag
- Walks secondary indexes
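
The kind of check VERIFY performs can be illustrated with a simplified Python sketch. It models a single index level as a list of entries and data blocks as lists of keys; the entry layout and the `verify` function are assumptions for illustration, not the utility's real structures.

```python
def verify(index_entries, data_blocks, patch=False):
    """Walk index entries and compare each with the data block it points to."""
    problems = []
    for entry in index_entries:
        block = data_blocks.get(entry["block"])
        if block is None:
            problems.append(f"index points to missing block {entry['block']}")
            continue
        if block and block[0] != entry["first_key"]:
            problems.append(f"block {entry['block']}: first key mismatch")
        if len(block) != entry["count"]:
            problems.append(f"block {entry['block']}: count {entry['count']} != {len(block)}")
            if patch:
                entry["count"] = len(block)  # put the calculated value in the count
    return problems

# Hypothetical index with one stale count.
index_entries = [
    {"first_key": 10, "block": 1, "count": 3},
    {"first_key": 40, "block": 2, "count": 5},
]
data_blocks = {1: [10, 20, 30], 2: [40, 50]}
```

Running with `patch=True` reports the stale count and corrects it, so a second run finds nothing.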