mis, database management system. management information system

Database Management Systems

MIS

Learning Objectives

What are the problems of managing data resources in a traditional file environment?

What are the major capabilities of database management systems (DBMS) and why is a relational DBMS so powerful?

What are the principal tools and technologies for accessing information from databases to improve business performance and decision making?

Why are information policy, data administration, and data quality assurance essential for managing the firm’s data resources?

Organizing Data in a Traditional File Environment

• File organization terms and concepts
• A computer system organizes data in a hierarchy:
  – Bit: Smallest unit of data; a binary digit (0 or 1)
  – Byte: Group of bits that represents a single character
  – Field: Group of characters forming a word, a group of words, or a number
  – Record: Group of related fields
  – File: Group of records of the same type
  – Database: Group of related files
• Entity: Person, place, or thing on which we store information
• Attribute: Each characteristic or quality describing an entity
  – E.g., the attributes Date or Grade belong to the entity COURSE

The Data Hierarchy

A computer system organizes data in a hierarchy that starts with the bit, which represents either a 0 or a 1. Bits can be grouped to form a byte to represent one character, number, or symbol. Bytes can be grouped to form a field, and related fields can be grouped to form a record. Related records can be collected to form a file, and related files can be organized into a database.
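The hierarchy can be walked through with ordinary Python values. This is only an illustrative sketch: the student and course values below are made up, and real systems store these levels very differently.

```python
# Bit -> byte -> field -> record -> file -> database, using plain Python values.
# All names and values here are hypothetical illustrations.
char = "A"
byte_value = char.encode("ascii")[0]   # one byte represents one character
bits = format(byte_value, "08b")       # the eight bits that make up that byte

field = "Jones"                        # a field: a group of characters
record = {"student_id": 1, "last_name": field, "course": "MIS"}  # related fields
student_file = [                       # a file: records of the same type
    record,
    {"student_id": 2, "last_name": "Smith", "course": "MIS"},
]
database = {                           # a database: a group of related files
    "student": student_file,
    "course": [{"course": "MIS", "title": "Management Information Systems"}],
}

print(bits)
```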


Traditional File Processing

The use of a traditional approach to file processing encourages each functional area in a corporation to develop specialized applications and files. Each application requires a unique data file that is likely to be a subset of the master file. These subsets of the master file lead to data redundancy and inconsistency, processing inflexibility, and wasted storage resources.


Database

• A database is a collection of information that is organized so that it can easily be accessed, managed, and updated. In one view, databases can be classified according to types of content: bibliographic, full-text, numeric, and images.

• A database is a logically coherent collection of data with some inherent meaning, representing some aspect of real world and which is designed, built and populated with data for a specific purpose. A database is not necessarily computerized. It can be generated and maintained manually, or it may be computerized.

http://searchsqlserver.techtarget.com/definition/information


Databases are used in every part of day-to-day life. Examples of common database use include: depositing or withdrawing money from a bank, making a travel reservation, accessing a library catalog, buying something from the internet etc. These are examples of traditional database applications, where data is stored either in textual or numeric format. Less traditional database applications that are starting to become more popular include multimedia databases, which store pictures, video clips, and sounds, Geographic Information Systems (GIS) that store maps, satellite images and weather data.

Database Management System

A database management system (DBMS) is a collection of interrelated data and a set of programs to access those data. The collection of data, usually referred to as a database, contains information relevant to an enterprise.


A database management system is a collection of programs that enable users to create and maintain a database. (Elmasri & Navathe)

A database management system, or DBMS, is software designed to assist in maintaining and utilizing large collections of data. (Ramakrishnan & Gehrke)

Database Systems vs. File Systems (Why DBMS?)

An ordinary file system has a number of major drawbacks:

1. Data redundancy and inconsistency
   – Multiple file formats, duplication of information in different files

2. Difficulty in accessing data
   – Need to write a new program to carry out each new task

3. Data isolation
   – Multiple files and formats

4. Integrity problems
   – Integrity constraints (e.g., account balance > 0) become part of program code
   – Hard to add new constraints or change existing ones

5. Atomicity problems
   – Failures may leave the database in an inconsistent state with partial updates carried out
   – E.g., a transfer of funds from one account to another should either complete or not happen at all

6. Concurrent-access anomalies
   – Concurrent access is needed for system performance and usability
   – Uncontrolled concurrent accesses can lead to inconsistencies, e.g., two people reading a balance and updating it at the same time

7. Security problems
   – Not every user of the database system should be able to access all the data

Database systems offer solutions to all these problems.
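Two of the drawbacks above, integrity and atomicity, can be illustrated with a small sketch using Python's built-in sqlite3 module. The account table, the `transfer` helper, and the amounts are assumptions made for this example, not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity: the "balance > 0" rule lives in the schema, not in program code.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Hypothetical helper: both updates succeed, or neither is kept."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

ok = transfer(conn, 1, 2, 30)       # allowed
failed = transfer(conn, 1, 2, 500)  # debit violates the CHECK, so the credit is undone too
print(ok, failed, balances(conn))
```

In the failed transfer the credit to account 2 has already been applied when the debit is rejected; the rollback removes it, so no partial update survives.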

Some Commercial Database Management Software

For Personal Computers

1. Microsoft Access
2. FoxPro
3. dBase

Some Commercial Database Management Software

1. Oracle (Oracle 8i, Oracle 9i, Oracle 10g, Oracle 11g)
2. Microsoft SQL Server
3. IBM DB2/DB2 UDB
4. Informix
5. Sybase
6. Ingres

Some Open Source Database Management Software

1. CUBRID
2. Firebird
3. MariaDB
4. MongoDB
5. PostgreSQL
6. MySQL
7. SQLite

Database System Applications

Databases are widely used. Some representative applications are:

1. Banking: For customer information, accounts, loans, and banking transactions.

2. Airlines/Railways/Road Transport: For ticket reservations, schedules, and routes.

3. Universities: For student information, courses, and grades (education management).

4. Credit card transactions: For purchases on credit cards and monthly statement generation.

5. Telecommunication: For keeping records of calls made, generating monthly bills, maintaining balances on prepaid calling cards, and storing information about the communication networks.

6. Finance: For storing information about holdings, sales, and purchases of financial instruments such as stocks and bonds.

7. Sales: For customer, product, and purchase information.

8. Manufacturing: For management of supply chains and for tracking production of items in factories, inventories of items in warehouses/stores, and orders for items.

9. Human resources: For information about employees, salaries, payroll taxes, and benefits, and for generation of paychecks.

Data Models

A data model is a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints.

A. Base Models: Describe the design of the database at the logical level.

A. 1. Entity-Relationship Model: This is a higher-level data model. It is based on a perception of the real world as a collection of basic objects, called entities, and the relationships among these objects.

Entity: An entity is a “thing” or “object” in the real world that is distinguishable from all other objects. An entity has a set of properties, called attributes, and the values for some set of attributes may uniquely identify an entity. An entity may be concrete, such as a person or a book, or it may be abstract, such as a loan, a holiday, or a concept.

customer: customer-id, customer-name, customer-street, customer-city

loan: loan-number, amount


Relationship: A relationship is an association among several entities. A depositor relationship associates a customer with each account that he or she has.

The set of all entities of the same type and the set of all relationships of the same type are termed an entity set and relationship set, respectively


A. 2. Relational Model: This is a lower level model. It uses a collection of tables to represent both data and relationships among those data.

Each table has multiple columns, and each column has a unique name.

The relational model is an example of a record-based model, because the database is structured in fixed-format records of several types. Each table contains records of a particular type. Each record type defines a fixed number of fields, or attributes. The columns of the table correspond to the attributes of the record type.

The relational model is the most widely used data model, and a vast majority of current database systems are based on it. The relational model is at a lower level of abstraction than the E-R model. Database designs are often carried out in the E-R model and then translated to the relational model.

B. Other Models:

B. 1. Object-oriented data model: Drawing increasing attention. It can be seen as extending the E-R model with notions of encapsulation, methods (functions), and object identity.

An object database (also object-oriented database management system, OODBMS) is a database management system in which information is represented in the form of objects as used in object-oriented programming. Object databases are different from relational databases which are table-oriented.


B. 2. Object-relational data model: Combines the features of object-oriented data model and relational data model.

An object-relational database (ORD), or object-relational database management system (ORDBMS), is a database management system (DBMS) similar to a relational database, but with an object-oriented database model: objects, classes and inheritance are directly supported in database schemas and in the query language.

https://en.wikipedia.org/wiki/Database_management_system

https://en.wikipedia.org/wiki/Relational_database

https://en.wikipedia.org/wiki/Object_database

https://en.wikipedia.org/wiki/Database_schema

https://en.wikipedia.org/wiki/Query_language

B. 3. Semi-structured data model: Permits the specification of data where individual data items of the same type may have different sets of attributes. The Extensible Markup Language (XML) is widely used to represent semi-structured data.

The semi-structured model is a database model where there is no separation between the data and the schema, and the amount of structure used depends on the purpose. It can represent the information of some data sources that cannot be constrained by schema.
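As a small illustration of items of the same type carrying different sets of attributes, the sketch below parses a made-up XML fragment with Python's standard xml.etree.ElementTree module. The element names and values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two <customer> items of the same type with different sets of attributes.
# The element names and values are invented for this example.
doc = """
<customers>
  <customer><name>Ann</name><city>Center City</city></customer>
  <customer><name>Bob</name><phone>555-123-6666</phone><email>bob@example.com</email></customer>
</customers>
"""

root = ET.fromstring(doc.strip())
# No fixed column list is declared anywhere; the structure travels with the data.
customers = [{child.tag: child.text for child in cust} for cust in root.findall("customer")]
attribute_sets = [sorted(c) for c in customers]
print(attribute_sets)
```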


C. Historical Models: These are in little use now.

C. 1. Network data model

C. 2. Hierarchical model


The network model is a database model conceived as a flexible way of representing objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in which object types are nodes and relationship types are arcs, is not restricted to being a hierarchy or lattice.

Network DBMS:

• Depicts data logically as many-to-many relationships

The Network Data Model

TYPES OF RELATIONS

• One-to-one: STUDENT and ID
• One-to-many: CLASS and STUDENT A, STUDENT B, STUDENT C
• Many-to-many: STUDENT A, STUDENT B, STUDENT C and CLASS 1, CLASS 2

A hierarchical database model is a data model in which the data is organized into a tree-like structure. The data is stored as records which are connected to one another through links. A record is a collection of fields, with each field containing only one value.

Hierarchical DBMS:

• Organizes data in a tree-like structure

• Supports one-to-many parent-child relationships

• Prevalent in large legacy systems


A Hierarchical Database for a Human Resources System

HUMAN RESOURCES DATABASE WITH MULTIPLE VIEWS

A single human resources database provides many different views of data, depending on the information requirements of the user. Illustrated here are two possible views, one of interest to a benefits specialist and one of interest to a member of the company’s payroll department.

Primary Key and Foreign Key

Each record requires a key field, or unique identifier. The best example of this is your social security number—there is only one per person. That explains in part why so many companies and organizations ask for your social security number when you do business with them.


In a relational database, each table contains a primary key, a unique identifier for each record. To make sure the tables relate to each other, the primary key from one table is stored in a related table as a foreign key. For instance, in the customer table below the primary key is the unique customer ID. That primary key is then stored in the order table as the foreign key so that the two tables have a direct relationship.


Customer Table

Field Name                 Description
Customer Name              Self-Explanatory
Customer Address           Self-Explanatory
Customer ID                Primary Key
Order Number               Foreign Key

Order Table

Field Name                 Description
Order Number               Primary Key
Order Item                 Self-Explanatory
Number of Items Ordered    Self-Explanatory
Customer ID                Foreign Key
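The customer-to-order link described above can be sketched with Python's sqlite3 module. This is an assumption-laden illustration: the table is named `orders` because ORDER is a reserved word in SQL, the column names and sample rows are made up, and only the Customer ID foreign key in the order table is modeled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,      -- primary key: unique per customer
    name        TEXT,
    address     TEXT
);
CREATE TABLE orders (
    order_number INTEGER PRIMARY KEY,     -- primary key of the order table
    order_item   TEXT,
    quantity     INTEGER,
    customer_id  INTEGER NOT NULL REFERENCES customer(customer_id)  -- foreign key
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'John Jones', '111 Main St')")
conn.execute("INSERT INTO orders VALUES (1001, 'Washer', 2, 1)")

try:
    # There is no customer 99, so the DBMS refuses the orphan order.
    conn.execute("INSERT INTO orders VALUES (1002, 'Dryer', 1, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(orphan_rejected, order_count)
```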

Relational Database

There are two important points you should remember about creating and maintaining relational database tables. First, you should ensure that attributes for a particular entity apply only to that entity. That is, you would not include fields in the customer record that apply to products the customer orders. Fields relating to products would be in a separate table. Second, you want to create the smallest possible fields for each record. For instance, you would create separate fields for a customer’s first name and last name rather than a single field for the entire name. This makes it easier to sort and manipulate the records later when you are creating reports.

Wrong way:

Name             Address                               Telephone number
John L. Jones    111 Main St Center City Ohio 22334    555-123-6666

Right way:

First Name   Middle Initial   Last Name   Street        City          State   Zip     Telephone
John         L.               Jones       111 Main St   Center City   Ohio    22334   555-123-6666

THE THREE BASIC OPERATIONS OF A RELATIONAL DBMS

The select, join, and project operations enable data from two different tables to be combined and only selected attributes to be displayed.
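A minimal sketch of the three operations in one SQL query, run through Python's sqlite3 module. The `part` and `supplier` tables and their rows are hypothetical and are not taken from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample tables for the illustration.
conn.executescript("""
CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT, city TEXT);
CREATE TABLE part (part_number INTEGER PRIMARY KEY, part_name TEXT, supplier_number INTEGER);
INSERT INTO supplier VALUES (1, 'Acme', 'Dayton'), (2, 'Globex', 'Akron');
INSERT INTO part VALUES (137, 'Door latch', 1), (145, 'Side mirror', 2), (150, 'Door seal', 2);
""")

rows = conn.execute("""
    SELECT p.part_number, p.part_name, s.supplier_name            -- project: only these columns
    FROM part AS p
    JOIN supplier AS s ON s.supplier_number = p.supplier_number   -- join: combine the two tables
    WHERE p.part_number IN (137, 150)                             -- select: only matching rows
    ORDER BY p.part_number
""").fetchall()
print(rows)
```

Note that the relational "select" operation corresponds to the WHERE clause, while the SQL keyword SELECT lists the projected columns.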

Non-Relational Databases and Databases in the Cloud

Data are now stored in text messages, social media postings, maps, and the like. Non-relational database management systems are better at managing large data sets on distributed computing networks. They can easily be scaled up or down depending on the particular needs of your business at a particular time.

Cloud computing service companies provide a way for you to manage your company’s data through Internet access using a Web browser.

• Non-relational databases: “NoSQL”
  – More flexible data model
  – Data sets stored across distributed machines
  – Easier to scale
  – Handle large volumes of unstructured and structured data (Web, social media, graphics)
• Databases in the cloud
  – Typically, less functionality than on-premises DBs
  – Amazon Relational Database Service, Microsoft SQL Azure
  – Private clouds

Capabilities of Database Management Systems (DBMSs)

There are three important capabilities of a DBMS that traditional file environments lack: data definition, a data dictionary, and a data manipulation language.

– Data definition capability: Specifies the structure of database content; used to create tables and define characteristics of fields
– Data dictionary: Automated or manual file storing definitions of data elements and their characteristics
– Data manipulation language: Used to add, change, delete, and retrieve data from the database
  • Structured Query Language (SQL)
  • Microsoft Access user tools for generating SQL
– Many DBMSs have report generation capabilities for creating polished reports (Crystal Reports)
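The three capabilities can be seen side by side in a short sketch with Python's sqlite3 module. SQLite's `PRAGMA table_info` stands in here for a data dictionary; the `supplier` table and its rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: create a table and define the characteristics of its fields.
conn.execute(
    "CREATE TABLE supplier ("
    " supplier_number INTEGER PRIMARY KEY,"
    " supplier_name TEXT NOT NULL,"
    " city TEXT)"
)

# Data dictionary: ask the DBMS what it knows about each field.
# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
dictionary = [
    (name, col_type, bool(pk))
    for _cid, name, col_type, _notnull, _default, pk
    in conn.execute("PRAGMA table_info(supplier)")
]

# Data manipulation: add, change, delete, and retrieve data.
conn.execute("INSERT INTO supplier VALUES (1, 'Acme', 'Dayton'), (2, 'Globex', 'Akron')")
conn.execute("UPDATE supplier SET city = 'Toledo' WHERE supplier_number = 1")
conn.execute("DELETE FROM supplier WHERE supplier_number = 2")
rows = conn.execute("SELECT * FROM supplier").fetchall()

print(dictionary)
print(rows)
```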

MICROSOFT ACCESS DATA DICTIONARY FEATURES

Microsoft Access has a rudimentary data dictionary capability that displays information about the size, format, and other characteristics of each field in a database. Displayed here is the information maintained in the SUPPLIER table. The small key icon to the left of Supplier_Number indicates that it is a key field.

AN ACCESS QUERY

• Designing Databases
  – Conceptual (logical) design: Abstract model from a business perspective
  – Physical design: How the database is arranged on direct-access storage devices
• Design process identifies:
  – Relationships among data elements, redundant database elements
  – Most efficient way to group data elements to meet business requirements, needs of application programs
• Normalization
  – Streamlining complex groupings of data to minimize redundant data elements and awkward many-to-many relationships

Normalization

Normalization: Database normalization is a technique of organizing the data in the database. It is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics such as insertion, update, and deletion anomalies. It is a multi-step process that puts data into tabular form by removing duplicated data from the relation tables.

Database normalization, or simply normalization, is the process of organizing the columns (attributes) and tables (relations) of a relational database to reduce data redundancy and improve data integrity.

Normalization is a process of organizing the data in a database to avoid data redundancy and insertion, update, and deletion anomalies.

Through the normalization process, the collection of data in a single table is replaced by the same data distributed over multiple tables, with a specific relationship set up between the tables.

Streamlining complex groupings of data to minimize redundant data elements and awkward many-to-many relationships.

Process of creating small, stable data structures from complex groups of data.

AN UNNORMALIZED RELATION FOR ORDER

An unnormalized relation contains repeating groups. For example, there can be many parts and suppliers for each order. There is only a one-to-one correspondence between Order_Number and Order_Date.

NORMALIZED TABLES CREATED FROM ORDER

After normalization, the original relation ORDER has been broken down into four smaller relations. The relation ORDER is left with only two attributes and the relation LINE_ITEM has a combined, or concatenated, key consisting of Order_Number and Part_Number.
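A sketch of what the four normalized relations might look like, using Python's sqlite3 module. Only ORDER, LINE_ITEM, Order_Number, Order_Date, and Part_Number are named in the text; the PART and SUPPLIER tables, the other columns, and the sample rows are assumptions, and the table is called `orders` because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
CREATE TABLE part (
    part_number     INTEGER PRIMARY KEY,
    part_name       TEXT,
    supplier_number INTEGER REFERENCES supplier(supplier_number)
);
CREATE TABLE orders (                 -- ORDER keeps only two attributes
    order_number INTEGER PRIMARY KEY,
    order_date   TEXT
);
CREATE TABLE line_item (
    order_number  INTEGER REFERENCES orders(order_number),
    part_number   INTEGER REFERENCES part(part_number),
    part_quantity INTEGER,
    PRIMARY KEY (order_number, part_number)   -- combined (concatenated) key
);
INSERT INTO supplier VALUES (1, 'Acme');
INSERT INTO part VALUES (137, 'Door latch', 1), (150, 'Door seal', 1);
INSERT INTO orders VALUES (1001, '2024-01-15');
INSERT INTO line_item VALUES (1001, 137, 2), (1001, 150, 4);
""")

try:
    # The same part on the same order a second time violates the combined key.
    conn.execute("INSERT INTO line_item VALUES (1001, 137, 9)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The order date is stored once and rejoined to the line items when needed.
order_lines = conn.execute("""
    SELECT o.order_number, o.order_date, l.part_number, l.part_quantity
    FROM orders AS o JOIN line_item AS l ON l.order_number = o.order_number
    ORDER BY l.part_number
""").fetchall()
print(duplicate_rejected, order_lines)
```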

Tools for Improving Business Performance and Decision Making

• Business intelligence infrastructure
  – Today includes an array of tools for separate systems, and big data
• Contemporary tools:
  – Data warehouses
  – Data marts
  – Hadoop
  – In-memory computing
  – Analytical platforms

Data Warehouses

A data warehouse is a large store of data accumulated from a wide range of sources within a company and used to guide management decisions.

A data warehouse is a collection of data drawn from other databases used by the business.

It is a database that stores current and historical data of potential interest to decision makers throughout the company.

• Supports reporting and query tools
• Stores current and historical data
• Consolidates data for management analysis and decision making
• Improved and easy accessibility to information
• Ability to model and remodel the data

Components of a Data Warehouse

DATABASE TRENDS

Data Mart

The data mart is a subset of the data warehouse and is usually oriented to a specific business line or team. Whereas data warehouses have an enterprise-wide depth, the information in data marts pertains to a single department.

A data mart represents the specific data from a data warehouse which a user needs.

It is a subset of data warehouse in which a summarized or highly focused portion of the organization’s data is placed in a separate database for a specified function or group of users.

CONTEMPORARY BUSINESS INTELLIGENCE INFRASTRUCTURE

A contemporary business intelligence infrastructure features capabilities and tools to manage and analyze large quantities and different types of data from multiple sources. Easy-to-use query and reporting tools for casual business users and more sophisticated analytical toolsets for power users are included.


• Hadoop
  – Enables distributed parallel processing of big data across inexpensive computers
  – Key services:
    • Hadoop Distributed File System (HDFS): data storage
    • MapReduce: breaks data into clusters for work
    • HBase: NoSQL database
  – Used by Facebook, Yahoo, NextBio
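The MapReduce idea can be sketched in plain Python as a word count. This is not the Hadoop API; the function names and the two text chunks are hypothetical, and on a real cluster each chunk would be mapped on a different machine.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each key's values into one result."""
    return {key: sum(values) for key, values in groups.items()}

# Each chunk stands in for a block of data held on a separate computer.
chunks = ["big data big", "data across computers"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```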


• In-memory computing
  – Used in big data analysis
  – Uses the computer’s main memory (RAM) for data storage to avoid delays in retrieving data from disk storage
  – Can reduce hours/days of processing to seconds
  – Requires optimized hardware
• Analytic platforms
  – High-speed platforms using both relational and non-relational tools optimized for large datasets


• Analytical tools: Relationships, patterns, trends
  – Tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions
    • Multidimensional data analysis (OLAP)
    • Data mining
    • Text mining
    • Web mining

Data Mining

Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information that can be used to increase revenue, cut costs, or both. It is a process used by companies to turn raw data into useful information.

Data mining is the analysis of data for relationships that have not previously been discovered. For example, the sales records for a particular brand of tennis. It is the technique of searching for patterns in the data.


• Online analytical processing (OLAP)
  – Supports multidimensional data analysis
    • Viewing data using multiple dimensions
    • Each aspect of information (product, pricing, cost, region, time period) is a different dimension
    • Example: How many washers were sold in the East in June compared with other regions?
  – OLAP enables rapid, online answers to ad hoc queries
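The washer question above can be answered with a small roll-up over a list of facts. This is only a sketch of the idea in plain Python, not an OLAP product; the `rollup` helper and the unit figures are invented for the example.

```python
from collections import defaultdict

# Each fact has three dimensions (product, region, month) and one measure (units).
# The figures are made up for illustration.
facts = [
    {"product": "washer", "region": "East",  "month": "June", "units": 120},
    {"product": "washer", "region": "West",  "month": "June", "units": 95},
    {"product": "washer", "region": "North", "month": "June", "units": 70},
    {"product": "washer", "region": "East",  "month": "May",  "units": 80},
    {"product": "dryer",  "region": "East",  "month": "June", "units": 60},
]

def rollup(facts, group_by, **filters):
    """Sum units along one dimension after fixing values on the others."""
    totals = defaultdict(int)
    for fact in facts:
        if all(fact[dim] == value for dim, value in filters.items()):
            totals[fact[group_by]] += fact["units"]
    return dict(totals)

# Washers sold in June, compared across regions.
by_region = rollup(facts, "region", product="washer", month="June")
# The same data viewed along another dimension: washers in the East by month.
by_month = rollup(facts, "month", product="washer", region="East")
print(by_region, by_month)
```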

Multidimensional Databases

A multidimensional database presents the data to the user in several dimensions. A three-dimensional database might present the information by sales region, season, and product line.


MULTIDIMENSIONAL DATA MODEL: The view that is showing is product versus region. If you rotate the cube 90 degrees, the face that shows will be product versus actual and projected sales. If you rotate the cube 90 degrees again, you will see region versus actual and projected sales. Other views are possible.


• Data mining: Data mining technology allows a digital firm to get more information than ever before from its data.
  – Finds hidden patterns, relationships in datasets
    • Example: customer buying patterns
  – Infers rules to predict future behavior
  – Types of information obtainable from data mining:
    • Associations
    • Sequences
    • Classification
    • Clustering
    • Forecasting


• Text mining: Text mining tools help scrub text files to find data or to discern patterns and relationships.
  – Extracts key elements from large unstructured data sets
    • Stored e-mails
    • Call center transcripts
    • Legal cases
    • Patent descriptions
    • Service reports, and so on
  – Sentiment analysis software
    • Mines e-mails, blogs, social media to detect opinions


• Web mining
  – Discovery and analysis of useful patterns and information from the Web
    • Understand customer behavior
    • Evaluate effectiveness of Web site, and so on
  – Web content mining
    • Mines content of Web pages
  – Web structure mining
    • Analyzes links to and from Web page
  – Web usage mining
    • Mines user interaction data recorded by Web server


• Databases and the Web
  – Many companies use the Web to make some internal databases available to customers or partners
  – Typical configuration includes:
    • Web server
    • Application server/middleware/CGI scripts
    • Database server (hosting DBMS)
  – Advantages of using the Web for database access:
    • Ease of use of browser software
    • Web interface requires few or no changes to database
    • Inexpensive to add Web interface to system

Database Users

Users are differentiated by the way they expect to interact with the system. Four different types:

1. Naive users – are unsophisticated users who interact with the system by invoking one of the permanent application programs that have been written previously.

E.g. people accessing database over the web, bank tellers, clerical staff

2. Application programmers – are computer professionals who write application programs. Application programmers can choose from many tools to develop user interfaces.

3. Sophisticated users – interact with the system without writing programs. Instead, they form their requests in a database query language; for example, analysts who submit queries to explore data in the database.

E.g., engineers, scientists, and analysts who implement applications to meet their requirements.

E.g., an analyst looking at sales data (OLAP, online analytical processing) or using data mining, which finds certain kinds of patterns in data.

4. Specialized users – are sophisticated users who write specialized database applications that do not fit into the traditional data processing framework.

E.g., computer-aided design systems, knowledge-base and expert systems, and environment-modeling systems, which use complex data types.

LINKING INTERNAL DATABASES TO THE WEB

Users access an organization’s internal database through the Web using their desktop PCs and Web browser software.

Managing the Firm’s Data Resources

• Establishing an information policy
  – Firm’s rules, procedures, roles for sharing, managing, standardizing data
• Data administration
  – Establishes policies and procedures to manage data
• Data governance
  – Deals with policies and processes for managing availability, usability, integrity, and security of data, especially regarding government regulations
• Database administration
  – Creating and maintaining database


• Ensuring data quality
  – More than 25 percent of critical data in Fortune 1000 company databases are inaccurate or incomplete
  – Redundant data
  – Inconsistent data
  – Faulty input
  – Before a new database is in place, need to:
    • Identify and correct faulty data
    • Establish better routines for editing data once the database is in operation


• Data quality audit:
  – Structured survey of the accuracy and level of completeness of the data in an information system
    • Survey samples from data files, or
    • Survey end users for perceptions of quality
• Data cleansing
  – Software to detect and correct data that are incorrect, incomplete, improperly formatted, or redundant
  – Enforces consistency among different sets of data from separate information systems
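A very small sketch of the detection half of data cleansing, in plain Python. The `audit` function, the phone-number format, and the sample records are assumptions for illustration; commercial cleansing software applies far richer rules and also corrects what it finds.

```python
import re

# Assumed format for this example: NNN-NNN-NNNN.
PHONE = re.compile(r"^\d{3}-\d{3}-\d{4}$")

def audit(records):
    """Flag records that are incomplete, improperly formatted, or redundant."""
    issues = []
    seen = set()
    for i, rec in enumerate(records):
        name = (rec.get("name") or "").strip()
        phone = rec.get("phone") or ""
        if not name:
            issues.append((i, "incomplete"))
        if not PHONE.match(phone):
            issues.append((i, "improperly formatted"))
        key = (name.lower(), phone)   # normalize before comparing for duplicates
        if key in seen:
            issues.append((i, "redundant"))
        seen.add(key)
    return issues

records = [
    {"name": "John Jones", "phone": "555-123-6666"},
    {"name": "john jones ", "phone": "555-123-6666"},  # same customer, entered twice
    {"name": "", "phone": "555-123-0000"},             # missing name
    {"name": "Ann Lee", "phone": "5551234567"},        # phone not in the assumed format
]
issues = audit(records)
print(issues)
```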

Thank You
