jennifer dilly ferris state university september 25, 2011 database design

Database Design

Jennifer Dilly
Ferris State University
September 25, 2011
Database Design

Table of Contents
Overview of Relational Database Theory
Database design example
  Description/overview
  Design standards
  Rationale
  Normalization examples

This presentation will start with a brief, general overview of relational database theory. The overview covers only the general background, because as the presentation moves forward an example database will be used to explain the elements of design and the rationale behind them. Examples of correct and incorrect normalization within the design are also included to show why normalization is important.

Overview of Relational Database Theory
What are relational databases?
Useful why?
How is structure important?

What are relational databases and why are they useful?
Relational databases are set up so that organizations can not only store data, but store it in an organized manner from which it can be retrieved in a variety of arrangements. For instance, think of the work of a physician. A physician might want to view a patient's test result during a hospitalization. He might also want a historical view of test results from prior hospitalizations, in combination with other pertinent information. Viewing the data together helps the physician make sound decisions about his patient's care. Good relational databases link pieces of information together so that they can later be viewed in multiple ways. Other types of databases can easily store data, but if a user is unable to view the data in relation to other pertinent stored data, the information gathered in the first place loses much of its meaning.

Structure within the design of the database is important!
Data is stored in tables, each of which holds its own distinct set of data. To create relationships between the data, the tables can be linked. Data can then be retrieved in a number of ways, either by viewing one element of the data or by viewing multiple pieces together (provided they have been linked in the design). When structuring the database, certain rules need to be followed to keep the data efficiently organized. Much of this is accomplished by ensuring that certain levels of normalization are maintained. There are multiple levels of normalization, but the three most important rules are: 1) everything depends on the key (the primary and foreign keys link the tables), 2) there are no repeating groups of data, and 3) there is no redundant data.

Other standards, such as keeping data atomic, must also be followed so that the database contains data that is consistent, accurate, logical, and easily retrievable. An example database will be used to explain many of these design standards.

Database Design Example - Description
How to start
What data is needed?

Foley: FoleyID, Foley, InsertDate, DcDate, POA
MedicalRecord: MedRecID, LName, FName, MName, DOB
Staff: StaffID, LName, FName, Title
Admission: AdmID, AdmDate, Diagnosis
BloodCx: BloodCxID, BloodCx, CollectedDate, Result
UrineCx: UrineCxID, UrineCx, CollectedDate, Result

Here is an example of the beginnings of a design. Subsequent slides explain further design standards and their rationales, and eventually show the relationships between the tables.

For this design, the first step was working out what information an end user would need to view. This database deals with catheter-associated urinary tract infections (CAUTIs) and with determining whether a patient has or has had a CAUTI. A healthcare provider would want to view details such as who the patient is, admission dates, diagnoses, when an indwelling catheter (Foley) was placed or whether it was present on admission (POA), who placed it, and whether an infection developed, as evidenced by the results of urine and blood cultures. All of this information is organized into separate tables, each listing the information that needs to be stored about one specific topic.

One piece of information about a patient is never enough. It would be ideal if a patient had only one admission to the hospital, so that a single patient identifier would suffice. Unfortunately, some patients experience multiple hospital admissions, so the database design needs to account for this with a table specific to admissions. The design also needs to account for multiple Foley insertions and multiple cultures (blood and urine) during one admission. There are certain guidelines to follow when determining whether the patient has developed a CAUTI. The end user needs to be able to determine whether the patient actually had a Foley in place 48 hours prior to the urine culture collection, especially if the urine culture results show infection. There are other pieces of the puzzle that the provider needs to know, either to treat the patient appropriately or to identify which staff were involved in the care.

Database Design Example - Standards/Rationales
Naming
Atomic data
Keys
  Primary (PK)
  Foreign (FK)

Foley: FoleyID (PK), Foley, InsertDate, DcDate, POA
MedicalRecord: MedRecID (PK), LName, FName, MName, DOB
Staff: StaffID (PK), LName, FName, Title
Admission: AdmID (PK), AdmDate, Diagnosis, StaffID (FK)
BloodCx: BloodCxID (PK), BloodCx, CollectedDate, Result
UrineCx: UrineCxID (PK), UrineCx, CollectedDate, Result

Before getting to the creation of the relationships, it is important to review a few other standards.

Naming
The table names were created to accurately represent what is contained in each table and to make sense to most care providers within a hospital. For example, in the MedicalRecord table, most providers know that MedRec means medical record, and in the UrineCx and BloodCx tables, Cx means culture. Keeping the names as short as possible helps when typing them in SQL statements. A data dictionary would also be helpful to define what the abbreviations mean. Notice that there are no spaces between words and that all names are in camel case: the first letter of each word is capitalized while the rest of the letters are lowercase. The field names follow the same naming standards.

Atomic data
Each field must contain only one value. The MedicalRecord and Staff tables show good instances of this: the first and last names are separated into their own fields. A field that contains more than one distinct value would make it difficult to sort or group records by that field and would affect the ability to retrieve the information.

Keys
Assigning keys within a table is one of the most important standards to follow. Each table contains a primary key, making certain that each record is correctly identified. For example, the Admission table contains AdmID, which is the primary key for that table. A new AdmID is assigned each time a patient is admitted, and every time a provider wants to view what happened in a specific admission, that specific admission ID is looked up.

Another key, the foreign key, has its own level of importance in the design. The next slide displays further points about relationships between tables, but first: taking a primary key from one table and putting it into another creates a relationship between the data in the two tables. Using the Admission and Staff tables as examples once again, the diagram shows that each table has its own primary key, AdmID for the Admission table and StaffID for the Staff table. When a patient is admitted, a new admission ID is assigned; however, the staff taking care of the patient during that admission could change. When the StaffID field is put into the Admission table it becomes a foreign key, and the staff member taking care of the patient for that particular admission can now be identified.
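The foreign key described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the presentation's own implementation: it uses Python's built-in sqlite3 module with an in-memory database, the Title value 'RN' is a made-up sample value, and the dates are written in ISO form. SQLite treats declared types loosely, so the varchar and datetime names are kept only to mirror the diagram.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (
    StaffID INTEGER PRIMARY KEY,
    LName   varchar(30),
    FName   varchar(30),
    Title   varchar(30)
);
CREATE TABLE Admission (
    AdmID     INTEGER PRIMARY KEY,
    AdmDate   datetime,
    Diagnosis varchar(30),
    StaffID   INTEGER REFERENCES Staff(StaffID)  -- foreign key into Staff
);
-- Sample rows; the title 'RN' is a hypothetical value.
INSERT INTO Staff VALUES (50, 'Bean', 'Jean', 'RN');
INSERT INTO Admission VALUES (901, '2011-09-25', 'Heart Attack', 50);
""")

# Follow the foreign key from one admission to the staff member who gave care.
row = conn.execute("""
    SELECT a.AdmID, s.FName, s.LName
    FROM Admission AS a
    JOIN Staff AS s ON s.StaffID = a.StaffID
    WHERE a.AdmID = ?
""", (901,)).fetchone()
print(row)  # (901, 'Jean', 'Bean')
```

The join works only because StaffID appears in both tables, which is the relationship the notes describe.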

Database Design Example - Standards/Rationale
Referential integrity
Datatypes
Indexes

Foley: FoleyID INT PK, Foley varchar(20), InsertDate datetime, DcDate datetime, POA char(1)
MedicalRecord: MedRecID INT PK, LName varchar(30), FName varchar(30), MName varchar(30), DOB datetime
Staff: StaffID INT PK, LName varchar(30), FName varchar(30), Title varchar(30)
Admission: AdmID INT PK, AdmDate datetime, Diagnosis varchar(30)
BloodCx: BloodCxID INT PK, BloodCx varchar(20), CollectedDate datetime, Result varchar(30)
UrineCx: UrineCxID INT PK, UrineCx varchar(20), CollectedDate datetime, Result varchar(30)
AdmissionFoley: AdmID INT FK, FoleyID INT FK
AdmissionBloodCxUrineCx: AdmID INT FK, BloodCxID INT FK, UrineCxID INT FK
MedRecAdmission: MedRecID INT FK, AdmID INT FK
AdmissionStaff: AdmID INT FK, StaffID INT FK

Referential integrity
In this design, the relationship of the Staff table to the Admission table, enforced through primary and foreign keys used as constraints, is an example of referential integrity. Referential integrity ensures that the referenced data, in this case data about the staff member, actually exists in the database, and that the relationships between tables remain consistent. Without the relationship, this type of data would stand alone and would not be very useful for the kind of information the database as a whole aims to provide. In the diagram, the arrows indicate the relationships between the tables, showing a primary key in the parent table and a foreign key in the child or related table.

To take this further, another example is the bridge table MedRecAdmission, created between the MedicalRecord and Admission tables to link their information together. A patient usually has one medical record ID, but multiple admission IDs. While the patient's name and date of birth will not change, each admission's date, diagnosis, and staff member assignment will. Creating a bridge table not only creates a relationship between the two tables, but also ensures that multiple admissions can be viewed when that particular patient's medical record ID is pulled up. This is also the case with the AdmissionFoley and AdmissionBloodCxUrineCx bridge tables: the Admission and Foley tables are related through the AdmissionFoley bridge table, and so on.
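A small sketch can show both ideas at once: the bridge table listing every admission for one medical record, and referential integrity rejecting a row that points at an admission that does not exist. It assumes Python's built-in sqlite3 module and an in-memory database. The rows come from the sample data shown later in the deck, with dates rewritten in ISO form, and admission 999 is a deliberately nonexistent ID chosen for illustration. SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`; other database engines may enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE MedicalRecord (
    MedRecID INTEGER PRIMARY KEY,
    LName varchar(30), FName varchar(30), MName varchar(30), DOB datetime
);
CREATE TABLE Admission (
    AdmID INTEGER PRIMARY KEY, AdmDate datetime, Diagnosis varchar(30)
);
-- Bridge table: each row links one medical record to one admission.
CREATE TABLE MedRecAdmission (
    MedRecID INTEGER NOT NULL REFERENCES MedicalRecord(MedRecID),
    AdmID    INTEGER NOT NULL REFERENCES Admission(AdmID)
);
INSERT INTO MedicalRecord VALUES (1520, 'Base', 'Date', 'Ah', '1942-08-15');
INSERT INTO Admission VALUES (900, '2011-09-08', 'Hypertension');
INSERT INTO Admission VALUES (901, '2011-09-25', 'Heart Attack');
INSERT INTO MedRecAdmission VALUES (1520, 900);
INSERT INTO MedRecAdmission VALUES (1520, 901);
""")

# Pulling up one medical record ID returns all of that patient's admissions.
admissions = conn.execute("""
    SELECT a.AdmID, a.AdmDate, a.Diagnosis
    FROM MedRecAdmission AS x
    JOIN Admission AS a ON a.AdmID = x.AdmID
    WHERE x.MedRecID = ?
    ORDER BY a.AdmID
""", (1520,)).fetchall()
print(admissions)

# Admission 999 does not exist, so the bridge row is rejected.
try:
    conn.execute("INSERT INTO MedRecAdmission VALUES (?, ?)", (1520, 999))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```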

Datatypes
The type of data that a field stores is very important. If a database user tries to type a name into a date or time field, the database should reject the entry. A datatype is chosen for each field based on the kind of data it needs to hold. For example, a patient's name typically contains only text and no numeric data, so a datatype such as character varying (varchar) should be used to define the field. A varchar field accepts text of any length up to the maximum specified. In this design, the fields LName and FName are both defined as varchar(30), giving enough space for a last name or a first name of up to 30 characters. Similarly, in the Foley table, the field POA (was the Foley present on admission?) is defined as char(1), so it only has room for a y or an n, indicating yes or no. All of the IDs in the tables, such as MedRecID, FoleyID, and StaffID, should be defined as integer (INT) because they are numeric fields. The date of birth field uses the datetime datatype.
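The rejection behavior described above can be tried out with a rough sketch. One caveat: this example assumes Python's built-in sqlite3 module, and SQLite does not enforce declared types such as datetime or char(1) the way a strictly typed engine would. CHECK constraints are therefore used here as a stand-in for that enforcement, which is a substitution and not what the slides specify. The Foley value 'indwelling' and the helper name `try_insert` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Foley (
    FoleyID    INTEGER PRIMARY KEY,
    Foley      varchar(20) CHECK (length(Foley) <= 20),
    -- datetime() returns NULL for text it cannot read as a date,
    -- so this CHECK stands in for a strict datetime column.
    InsertDate datetime CHECK (InsertDate IS NULL OR datetime(InsertDate) IS NOT NULL),
    DcDate     datetime CHECK (DcDate IS NULL OR datetime(DcDate) IS NOT NULL),
    POA        char(1)  CHECK (POA IN ('y', 'n'))
)
""")

def try_insert(values):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Foley VALUES (?, ?, ?, ?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False

# A well-formed row is accepted.
ok = try_insert((1, "indwelling", "2011-09-08 10:00", "2011-09-10 09:00", "y"))
# A name typed into the date field is rejected.
bad_date = try_insert((2, "indwelling", "Smith", None, "n"))
# POA only has room for 'y' or 'n'.
bad_poa = try_insert((3, "indwelling", "2011-09-08 10:00", None, "maybe"))

print(ok, bad_date, bad_poa)  # True False False
```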

Indexes
Finally, an index is a structure that makes it faster to find information within a database. In this design, MedRecID is an index. Other indexes could include LName or even DOB (date of birth). Looking a patient up by MedRecID would be the quickest way to find the appropriate patient's record in the database. From there, further details, such as previous or current admission information linked to the MedRecID, can be found. LName or DOB could also be used as indexes, but since other patients may have the same last name or even the same date of birth, further searches may be needed to ensure the correct patient is found.

Examples of Normalization
Key dependency

Violation
  Admission: AdmID INT PK, AdmDate datetime, Diagnosis varchar(30)
  MedicalRecord: MedRecID INT PK, LName varchar(30), FName varchar(30), MName varchar(30), DOB datetime

Correction
  Admission: AdmID INT PK, AdmDate datetime, Diagnosis varchar(30)
  MedicalRecord: MedRecID INT PK, LName varchar(30), FName varchar(30), MName varchar(30), DOB datetime
  MedRecAdmission: MedRecID INT FK, AdmID INT FK

Sample data

Admission
  AdmID  Admit Date  Diagnosis
  900    09/08/2011  Hypertension
  901    09/25/2011  Heart Attack

MedicalRecord
  MedRecID  LName  FName  MName  DOB
  1520      Base   Date   Ah     08/15/1942

MedRecAdmission
  MedRecID  AdmID
  1520      900
  1520      901

So far, the structuring of the database and its rules have been covered, except for maintaining normalization. Again, the rules are: 1) everything depends on the key (the primary and foreign keys link the tables), 2) there are no repeating groups of data, and 3) there is no redundant data.

First of all, the keys identify the record. Take, for instance, the Admission and MedicalRecord tables. The violation on the left shows no foreign keys assigned in either table. There is no link between the tables, so the data at the bottom will not show a relationship either. The correction on the right shows how adding a bridge table, linking the medical record to the admission, creates the relationship. It is common practice to assign a new admission number for each new admission, but if the medical record ID were not present as a key, the patient would also be assigned a new medical record ID at each admission. Now we can tell that this patient has one medical record ID, but multiple admissions.

Examples of Normalization
No repeating groups

Violation
  Admission: AdmID INT PK, AdmDate datetime, Diagnosis varchar(30), Staff varchar(max)

  Admission
    AdmID  AdmDate     Diagnosis     Staff
    900    09/08/2011  Hypertension  Joe Schmo, Larry Parry
    901    09/25/2011  Heart Attack  Jean Bean
    902    10/01/2001  Stroke        Jean Bean

Correction
  Staff: StaffID INT PK, LName varchar(30), FName varchar(30), Title varchar(30)
  Admission: AdmID INT PK, AdmDate datetime, Diagnosis varchar(30)
  AdmissionStaff: AdmID INT FK, StaffID INT FK

  AdmissionStaff
    AdmID  StaffID
    900    01
    900    40
    901    50
    902    50

Secondly, there should be no repeating groups of data. In the Admission table on the left, the fields include AdmID, AdmDate, Diagnosis, and Staff. The problem is that multiple staff members could take care of the patient during the admission. One could type in multiple staff member names if the field allowed enough characters, but there might still be issues if this data needed to be sorted, say, by staff last name. The correction shows a design that conforms to the standards. A bridge table, AdmissionStaff, was created to link the admission to the staff. The table at the bottom shows that in admission 900, staff members 01 and 40 provided care; in admission 901, staff member 50 provided care; and in admission 902, staff member 50 again provided care.
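The sorting problem mentioned above goes away once each staff assignment is its own row. The sketch below assumes Python's built-in sqlite3 module and an in-memory database. The bridge rows come from the slide, but matching staff ID 01 to Joe Schmo and 40 to Larry Parry is an assumption made for illustration, since the slides do not list which ID belongs to which name. Dates are rewritten in ISO form, and staff ID 01 is stored as the integer 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (
    StaffID INTEGER PRIMARY KEY,
    LName varchar(30), FName varchar(30), Title varchar(30)
);
CREATE TABLE Admission (
    AdmID INTEGER PRIMARY KEY, AdmDate datetime, Diagnosis varchar(30)
);
CREATE TABLE AdmissionStaff (
    AdmID   INTEGER REFERENCES Admission(AdmID),
    StaffID INTEGER REFERENCES Staff(StaffID)
);
-- ID-to-name pairing for staff 1 and 40 is assumed, not taken from the slides.
INSERT INTO Staff (StaffID, LName, FName) VALUES
    (1, 'Schmo', 'Joe'), (40, 'Parry', 'Larry'), (50, 'Bean', 'Jean');
INSERT INTO Admission VALUES
    (900, '2011-09-08', 'Hypertension'),
    (901, '2011-09-25', 'Heart Attack'),
    (902, '2001-10-01', 'Stroke');
-- One row per staff member per admission, instead of a list in one field.
INSERT INTO AdmissionStaff VALUES (900, 1), (900, 40), (901, 50), (902, 50);
""")

# Sorting by staff last name is now an ordinary ORDER BY.
rows = conn.execute("""
    SELECT s.LName, s.FName, a.AdmID
    FROM AdmissionStaff AS x
    JOIN Staff AS s ON s.StaffID = x.StaffID
    JOIN Admission AS a ON a.AdmID = x.AdmID
    ORDER BY s.LName, a.AdmID
""").fetchall()
for row in rows:
    print(row)
```

With the names packed into a single Staff field, the same sort would require splitting the text of each row first.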

Examples of Normalization
No redundant data

Violation
  MedicalRecord: MedRecID INT PK, LName varchar(30), FName varchar(30), MName varchar(30), DOB datetime
  Admission: AdmID INT PK, LName, FName, MName, AdmDate datetime, Diagnosis varchar(30), StaffID INT FK
  MedRecAdmission: MedRecID INT FK, AdmID INT FK

Correction
  MedicalRecord: MedRecID INT PK, LName varchar(30), FName varchar(30), MName varchar(30), DOB datetime
  Admission: AdmID INT PK, AdmDate datetime, Diagnosis varchar(30), StaffID INT FK
  MedRecAdmission: MedRecID INT FK, AdmID INT FK

Finally, there should be no redundant data. In the example on the left, both the MedicalRecord and the Admission tables contain fields for last name, first name, and middle name. This means that even though the patient's last, first, and middle names have already been entered in the medical record, they will be entered again each time the patient is admitted. The example on the right shows the correct version: because a bridge table creates the relationship, the Admission table does not need to contain the patient's name again.
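One practical consequence of the corrected design is that a name is stored in exactly one place, so a single correction shows up under every admission. The sketch below assumes Python's built-in sqlite3 module and an in-memory database. The rows reuse the deck's sample patient with dates in ISO form, the StaffID column is left out for brevity, and the replacement last name 'Bass' and the helper `names_by_admission` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MedicalRecord (
    MedRecID INTEGER PRIMARY KEY,
    LName varchar(30), FName varchar(30), MName varchar(30), DOB datetime
);
-- Admission holds no name fields; the name lives only in MedicalRecord.
CREATE TABLE Admission (
    AdmID INTEGER PRIMARY KEY, AdmDate datetime, Diagnosis varchar(30)
);
CREATE TABLE MedRecAdmission (
    MedRecID INTEGER REFERENCES MedicalRecord(MedRecID),
    AdmID    INTEGER REFERENCES Admission(AdmID)
);
INSERT INTO MedicalRecord VALUES (1520, 'Base', 'Date', 'Ah', '1942-08-15');
INSERT INTO Admission VALUES
    (900, '2011-09-08', 'Hypertension'),
    (901, '2011-09-25', 'Heart Attack');
INSERT INTO MedRecAdmission VALUES (1520, 900), (1520, 901);
""")

def names_by_admission():
    """Return (AdmID, LName) pairs, reading the name through the bridge table."""
    return conn.execute("""
        SELECT a.AdmID, m.LName
        FROM Admission AS a
        JOIN MedRecAdmission AS x ON x.AdmID = a.AdmID
        JOIN MedicalRecord AS m ON m.MedRecID = x.MedRecID
        ORDER BY a.AdmID
    """).fetchall()

before = names_by_admission()
# Correct the last name once, in the only table that stores it.
conn.execute("UPDATE MedicalRecord SET LName = ? WHERE MedRecID = ?", ("Bass", 1520))
after = names_by_admission()

print(before)  # [(900, 'Base'), (901, 'Base')]
print(after)   # [(900, 'Bass'), (901, 'Bass')]
```

In the violating design, the same correction would have to be repeated in every Admission row, and any row that was missed would disagree with the medical record.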

Summary
Overview of Relational Database Theory
Database design
  Standards
  Rationale
  Normalization examples

In summary, an efficiently designed database can be very helpful when tracking multiple points of information, whether for patients or for customers. Designs that closely follow the accepted standards will be consistent and will provide useful data that can be displayed in various arrangements.
