Relational Database
CISC/QCSE 810, some materials from Software Carpentry
Introduction
Text files have a long and useful history for storing data
  they are human-readable
  they can always be imported into new formats
  it is simple to write reading routines in any language
Limitations of Text Files 1
No meta-data
  experimental/simulation data has conditions/parameters attached
  if your file names are like "results_s10_n5_p43.txt", you should consider recording results in a database
Limitations of Text Files 2
Redundancy (and error)
  one way to include meta-data is to add extra columns to your text files
  "Any duplication will eventually lead to errors"
  extra columns of constants are expensive: they are largely wasted space, and they don't scale well as you add new meta-data
Limitations of Text Files 3
Searching and subsetting
  searching for matching entries in a text file is cumbersome
  grep is fine for some easy matches, but not for searches like "Find all experiments done with the Mark VII that had yields greater than 30% and that didn't use cadmium disulfide as a reagent"
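A search like the one quoted above becomes a single WHERE clause once the data is in a table. The sketch below is only illustrative: it assumes SQLite through Python's built-in sqlite3 module as a stand-in for a database server, and the Experiment table, its column names, and its rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database; table, columns, and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Experiment ("
    "ExperimentID INTEGER PRIMARY KEY, "
    "Equipment TEXT NOT NULL, "
    "Reagent TEXT NOT NULL, "
    "Yield REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO Experiment (Equipment, Reagent, Yield) VALUES (?, ?, ?)",
    [
        ("Mark VII", "cadmium disulfide", 42.0),
        ("Mark VII", "zinc oxide", 35.5),
        ("Mark VII", "zinc oxide", 12.0),
        ("Mark VI", "zinc oxide", 50.0),
    ],
)

# Mark VII experiments with yield above 30% that did not use cadmium disulfide.
matches = conn.execute(
    "SELECT ExperimentID, Reagent, Yield FROM Experiment "
    "WHERE Equipment = ? AND Yield > ? AND Reagent <> ?",
    ("Mark VII", 30.0, "cadmium disulfide"),
).fetchall()

for row in matches:
    print(row)
```

Each condition in the sentence maps to one term joined by AND, which is the kind of query grep cannot express easily.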
Non-text files
Collections of documents need to be both stored and searched
  graphic images: index by date, source, resolution, format, processing done, comments
  video clips: same
Relational Databases
Relational databases are data repositories that can help store relations between different types of data
They allow you to
  focus on one aspect of data modeling at a time
  reduce redundancy
  improve search
Getting Started
A database is a collection of zero or more tables, each of which:
  has a name
  stores a single relation (i.e., a set of information of a particular kind)
Each table has a fixed set of named columns
  all the values in a column have the same type
Each table has zero or more rows
  also called records
SQL Interface, DBMS
Interact with the database management system (DBMS) using a specialized language called SQL
  every vendor implements its own extensions to the standard
Table and database names are generally not case sensitive: gravity, Gravity and GRAVITY are considered the same
  in MySQL this depends on the server's operating system and configuration
Data can be case sensitive
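The difference between names and data can be checked directly. This is a minimal sketch assuming SQLite through Python's built-in sqlite3 module; other systems, MySQL included, may behave differently on both points depending on configuration. The gravity table's columns and its single row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gravity (Body TEXT NOT NULL, Acceleration REAL NOT NULL)")

# The table name is written in a different case each time; SQLite treats them as one table.
conn.execute("INSERT INTO GRAVITY VALUES (?, ?)", ("Earth", 9.81))
same_table = conn.execute("SELECT COUNT(*) FROM Gravity").fetchone()[0]

# The stored text, by contrast, is compared case-sensitively by SQLite's default collation.
exact = conn.execute(
    "SELECT COUNT(*) FROM gravity WHERE Body = ?", ("Earth",)
).fetchone()[0]
lower = conn.execute(
    "SELECT COUNT(*) FROM gravity WHERE Body = ?", ("earth",)
).fetchone()[0]

print(same_table, exact, lower)
```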
Building vs Querying
Most time with a database is spent querying
  generating graphs, tables from subsets of data
Still need to build it in the first place!
  record-by-record
  table-by-table
  Perl/Python program to do fancy file-to-database upload
Create a Database
A database is a grouped collection of tables
CREATE DATABASE <database name>
Creating a Table
To create a table, specify its name, and the names and types of its columns
CREATE TABLE Person(
  Login TEXT NOT NULL,
  LastName TEXT NOT NULL,
  FirstName TEXT NOT NULL
);
The expression NOT NULL means that the value must be present
Data types are similar to C, Java types
To erase a table, use DROP TABLE name
  very handy when you're first starting…
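The table lifecycle described here (create, rely on NOT NULL, drop) can be tried without a server. The sketch below assumes SQLite through Python's built-in sqlite3 module rather than MySQL, so details such as type names and error classes are specific to that setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Person(
        Login TEXT NOT NULL,
        LastName TEXT NOT NULL,
        FirstName TEXT NOT NULL
    )
    """
)

# List the column names the table was created with.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Person)")]

# NOT NULL means a missing value is refused: this insert leaves LastName empty.
try:
    conn.execute("INSERT INTO Person VALUES (?, ?, ?)", ("skol", None, "Sofia"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# DROP TABLE removes the table and everything in it.
conn.execute("DROP TABLE Person")
remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(columns, rejected, remaining)
```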
Filling in a Table
Adding records manually
INSERT INTO <table> VALUES (….)
INSERT INTO Person VALUES("skol", "Kovalevskaya", "Sofia");
Adding from a file
LOAD DATA LOCAL INFILE
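Both ways of filling a table can be sketched from a script. The example assumes SQLite through Python's built-in sqlite3 module; LOAD DATA LOCAL INFILE is MySQL-specific, so the file load is approximated here with the csv module. The in-memory "file" and the two extra people in it are illustrative, not course data.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person("
    "Login TEXT NOT NULL, LastName TEXT NOT NULL, FirstName TEXT NOT NULL)"
)

# Adding one record manually; the ? placeholders let the driver handle quoting.
conn.execute("INSERT INTO Person VALUES (?, ?, ?)", ("skol", "Kovalevskaya", "Sofia"))

# Adding from a file: io.StringIO stands in for an open comma-separated text file.
people_file = io.StringIO("ipav,Pavlov,Ivan\nmlom,Lomonosov,Mikhail\n")
conn.executemany("INSERT INTO Person VALUES (?, ?, ?)", csv.reader(people_file))
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print(row_count)
```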
Filling in a Table 2
For anything more complicated, you'll want to write a script in some other language that does the uploading for you, through tailored "insert" commands
e.g. experiment just finished
  check for existing experiment w/same conditions
  create new record for experiment if necessary
  get new experiment ID (or old if it exists)
  upload results, with DB experiment ID attached
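The upload steps listed above can be sketched as a short script. This is one possible shape under stated assumptions, not the course's own uploader: it uses SQLite through Python's built-in sqlite3 module, and the Experiment and Result tables, their columns (Temperature, Pressure, Value), and the helper function names are all hypothetical.

```python
import sqlite3


def get_or_create_experiment(conn, temperature, pressure):
    """Return the ID of the experiment with these conditions, creating it if needed."""
    row = conn.execute(
        "SELECT ExperimentID FROM Experiment WHERE Temperature = ? AND Pressure = ?",
        (temperature, pressure),
    ).fetchone()
    if row is not None:
        return row[0]  # an experiment with the same conditions already exists
    cursor = conn.execute(
        "INSERT INTO Experiment (Temperature, Pressure) VALUES (?, ?)",
        (temperature, pressure),
    )
    return cursor.lastrowid  # ID assigned to the newly created record


def upload_results(conn, temperature, pressure, values):
    """Store result values with the matching experiment ID attached."""
    experiment_id = get_or_create_experiment(conn, temperature, pressure)
    conn.executemany(
        "INSERT INTO Result (ExperimentID, Value) VALUES (?, ?)",
        [(experiment_id, value) for value in values],
    )
    conn.commit()
    return experiment_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Experiment ("
    "ExperimentID INTEGER PRIMARY KEY, "
    "Temperature INTEGER NOT NULL, "
    "Pressure INTEGER NOT NULL)"
)
conn.execute("CREATE TABLE Result (ExperimentID INTEGER NOT NULL, Value REAL NOT NULL)")

first_id = upload_results(conn, 300, 5, [0.12, 0.15])
repeat_id = upload_results(conn, 300, 5, [0.14])   # same conditions, so the ID is reused
other_id = upload_results(conn, 310, 5, [0.20])    # new conditions, so a new record
print(first_id, repeat_id, other_id)
```

Storing the conditions once in Experiment and attaching only the ID to each result is what avoids the repeated constant columns of the text-file approach.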
Querying
Suppose we want to get everyone's name and login ID
Write a query that specifies what we want, and where to find it
SELECT Person.FirstName, Person.LastName, Person.Login
FROM Person;
Querying and Sorting
SELECT Person.FirstName, Person.LastName, Person.Login
FROM Person ORDER BY Person.Login;
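To see what ORDER BY changes, the same query can be run with and without it. The sketch assumes SQLite through Python's built-in sqlite3 module; apart from the "skol" record, the rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person("
    "Login TEXT NOT NULL, LastName TEXT NOT NULL, FirstName TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [
        ("skol", "Kovalevskaya", "Sofia"),
        ("ipav", "Pavlov", "Ivan"),        # illustrative row
        ("mlom", "Lomonosov", "Mikhail"),  # illustrative row
    ],
)

# Without ORDER BY the database may return rows in any order it likes.
unsorted_rows = conn.execute(
    "SELECT Person.FirstName, Person.LastName, Person.Login FROM Person"
).fetchall()

# With ORDER BY the rows come back sorted by login.
sorted_rows = conn.execute(
    "SELECT Person.FirstName, Person.LastName, Person.Login "
    "FROM Person ORDER BY Person.Login"
).fetchall()

for first_name, last_name, login in sorted_rows:
    print(login, first_name, last_name)
```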
Selection
Frequently only want a subset of data
SELECT FirstName
FROM Person WHERE Login = "skol";
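From a script, the value being matched is usually passed as a parameter rather than pasted into the SQL text. A minimal sketch, assuming SQLite through Python's built-in sqlite3 module; the second row is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person("
    "Login TEXT NOT NULL, LastName TEXT NOT NULL, FirstName TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [
        ("skol", "Kovalevskaya", "Sofia"),
        ("ipav", "Pavlov", "Ivan"),  # illustrative row
    ],
)

# WHERE keeps only the rows whose Login matches the supplied value.
wanted_login = "skol"
first_names = conn.execute(
    "SELECT FirstName FROM Person WHERE Login = ?", (wanted_login,)
).fetchall()

print(first_names)
```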
Simple Joins
Getting project IDs from names
SELECT ProjectID, LastName
FROM Person, Involved
WHERE Person.login = Involved.login
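The join can be tried on a small data set. The sketch assumes SQLite through Python's built-in sqlite3 module; the layout of the Involved table (a Login and a ProjectID column) is inferred from the query, and its rows are hypothetical. The explicit JOIN ... ON form is shown next to the comma form as an equivalent spelling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person("
    "Login TEXT NOT NULL, LastName TEXT NOT NULL, FirstName TEXT NOT NULL)"
)
# Assumed layout: one row per (person, project) pairing.
conn.execute("CREATE TABLE Involved(Login TEXT NOT NULL, ProjectID INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [("skol", "Kovalevskaya", "Sofia"), ("ipav", "Pavlov", "Ivan")],
)
conn.executemany(
    "INSERT INTO Involved VALUES (?, ?)",
    [("skol", 1), ("ipav", 1), ("ipav", 2)],
)

# Comma-style join: pair up rows from both tables, keep those whose logins match.
implicit_rows = conn.execute(
    "SELECT ProjectID, LastName FROM Person, Involved "
    "WHERE Person.Login = Involved.Login "
    "ORDER BY ProjectID, LastName"
).fetchall()

# The same join written with explicit JOIN ... ON syntax.
explicit_rows = conn.execute(
    "SELECT ProjectID, LastName FROM Person "
    "JOIN Involved ON Person.Login = Involved.Login "
    "ORDER BY ProjectID, LastName"
).fetchall()

print(implicit_rows)
```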
Query Joins w/3 Tables
More interesting query: what are the names of the projects Ivan is involved in?
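One way such a query might look is sketched below. The transcript does not include the query itself, so this is a guess at its shape: it assumes SQLite through Python's built-in sqlite3 module, a hypothetical Project table with ProjectID and ProjectName columns, an Involved table linking logins to project IDs, and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person("
    "Login TEXT NOT NULL, LastName TEXT NOT NULL, FirstName TEXT NOT NULL)"
)
conn.execute("CREATE TABLE Involved(Login TEXT NOT NULL, ProjectID INTEGER NOT NULL)")
# Hypothetical table holding the project names.
conn.execute("CREATE TABLE Project(ProjectID INTEGER NOT NULL, ProjectName TEXT NOT NULL)")

conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [("skol", "Kovalevskaya", "Sofia"), ("ipav", "Pavlov", "Ivan")],
)
conn.executemany(
    "INSERT INTO Involved VALUES (?, ?)",
    [("skol", 1), ("ipav", 1), ("ipav", 2), ("skol", 3)],
)
conn.executemany(
    "INSERT INTO Project VALUES (?, ?)",
    [(1, "Alpha"), (2, "Beta"), (3, "Gamma")],
)

# Person links to Involved by login; Involved links to Project by project ID.
ivan_projects = conn.execute(
    "SELECT Project.ProjectName "
    "FROM Person, Involved, Project "
    "WHERE Person.Login = Involved.Login "
    "AND Involved.ProjectID = Project.ProjectID "
    "AND Person.FirstName = ? "
    "ORDER BY Project.ProjectName",
    ("Ivan",),
).fetchall()

print(ivan_projects)
```

Each extra table adds one more matching condition to the WHERE clause, with the final condition selecting the person of interest.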
Far more to see!
Design of databases
  "normal forms"
  knowing the relationships between your data tables
Efficient access of databases
  optimization of queries
Accessing an existing database
Download a MySQL client
Try to connect to 130.15.100.140
  username aableson
  password
First steps
Create a database named for your Queen's NetID
Try some of the operations outlined in the MySQL tutorial
  skip the logging in with command line, go straight to building tables, queries
Resources
MySQL Query Browser download
  http://dev.mysql.com/downloads/gui-tools/5.0.html
MySQL tutorial
  http://dev.mysql.com/doc/refman/5.0/en/tutorial.html
  tailored for command-line usage; can skip some if you're using a GUI
MySQL reference
  http://dev.mysql.com/doc/refman/5.1/en/index.html