sql - structured query language for database analyses
TRANSCRIPT
SQL - Structured Query Language
For database analyses
SQL – Structured Query Language
• SQL is designed to do variable selection and manipulation on data stored in multiple tables
Why multiple tables? (aka database)• Avoid duplication of information• Ease of updating information• Linked by unique identifiers/codesWhere would we use? A single entity with many
different operations with common information
University
• Many functions: students taking classes & getting grades, scheduling classes in rooms, students paying tuition & fees, faculty teaching classes, faculty getting paid, etc
Tables of • Students, faculty with demographics• Students*classes, faculty*classes• Tuition payments, faculty payroll, etc.• Linked by studentID, facultyID, classID, etc
Business
• Multiple functions including selling, buying, production, and paying employees
Tables of orders, suppliers, customers, products, accounts payable & receivable, payroll, ….
Linked by productID, customerID, …
Baseball
Lahman database – season summaries•Tables of teams, batting (player*season), pitching (player*season), salaries, demographics (Master)•Linked by playerID, teamID, etc.•Why baseball? Nonproprietary, not small, interesting? … maybe
In SAS
• PROC SQL• Designed to run interactively (end with QUIT;)• Tables are stored as SAS datasets
Let’s look at the baseball data in SAS….