CSE3180 Lecture 8: Extended Topics and Applications

Upload: darren-moody

Post on 11-Jan-2016


Page 1: CSE3180 Lecture 8 Extended Topics / 1 CSE3180 Extended Topics and Applications

Lecture 8 Extended Topics / 1

CSE3180

Extended Topics and Applications

Page 2:

Introduction

• As a prelude, there is much development, change and ‘new products’ in the database environment
• In this lecture we will cover some of this in broad outline only
• It is an environment of much energy, driven largely by the Business sector and by anticipation
• It is an environment of constant change

Page 3:

Some Extended Topics

The materials in this lecture will cover a wide scope:
• databases and their purpose or contribution to the Business world
• optimisation concepts
• some of the additional functions in SQL
• extended scope of access languages and hardware
• relationship to changing and/or expanding Business conditions
• relational and non-relational data
• Open Source DBMSs

Page 4:

Introduction

• The Relational Data model was first proposed by E.F. Codd in 1970.
• It is now the dominant model for commercial database implementations.
• It is based on a sound theoretical foundation.
• Dr. Codd also developed his 12 Rules, which we will now examine briefly.
• Notice that he did not ‘create’ SQL, but he did create its foundation.

Page 5:

Codd’s 12 Rules (plus 1)

1. The Information Rule
2. The Guaranteed Access Rule
3. Systematic Treatment of Nulls
4. Active On-Line Catalog Based on the Relational Model
5. The Comprehensive Data Sub-Language Rule
6. The View Updating Rule

Page 6:

Codd’s 12 Rules

7. High-level Insert, Update and Delete
8. Physical Data Independence
9. Logical Data Independence
10. Integrity Independence
11. Distribution Independence
12. The ‘Non-Subversion’ Rule

Page 7:

Codd’s Rule 0

“For any system that is advertised as, or claimed to be, a RELATIONAL DATABASE MANAGEMENT SYSTEM, that system must be able to manage databases entirely through its relational capabilities”

Codd, E.F., An Evaluation Scheme for Database Management Systems that are claimed to be Relational

Page 8:

Rule 12: NonSubversion

If a relational system has a low-level language (single record-at-a-time), that low-level language cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level relational language.

All data manipulation languages supported by the relational DBMS must rely only on the stored database definition (including integrity rules and security constraints) for control of processing. This Rule and Rule 5 imply that it should be neither possible nor necessary to access a relational database using any language that bypasses the definition catalog.

Page 9:

Some More Oracle Aspects

In the next few overheads, we will be looking at
• some inside information on the Data Dictionary, or System Catalog(ue)
• Query Optimisation

Page 10:

Oracle Data Dictionary

Oracle maintains information about all tables, views, indexes and many other objects that make up the database in special system-maintained tables.

(So do other RDBMSs.)

These tables make up the Oracle Data Dictionary and are accessible through data dictionary views.

Page 11:

Oracle Data Dictionary

Most data dictionary views begin with one of three prefixes:

USER - objects owned by the account performing the query.

ALL - USER objects plus objects on which access has been granted, either to that account or to PUBLIC.

DBA - all database objects regardless of owner.

Page 12:

Oracle Data Dictionary

Some useful data dictionary views

VIEW                DESCRIPTION
DICT                Data dictionary objects
USER_TABLES         User’s own tables
USER_VIEWS          User’s own views
USER_INDEXES        User’s own indexes
USER_TABLESPACES    User accessible tablespaces
USER_CATALOG        Objects owned by user
ALL_CATALOG         As above + other accessible objects
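
As a small illustration of the list above, the dictionary can be used to describe itself. This is only a sketch: it assumes an ordinary Oracle account, and the LIKE pattern and the choice of USER_TABLES are just examples.

```sql
-- List the USER_ dictionary views whose names mention tables,
-- together with Oracle's own one-line description of each.
SELECT table_name, comments
FROM   dict
WHERE  table_name LIKE 'USER%TAB%'
ORDER  BY table_name;

-- Describe the columns of one dictionary view (USER_TABLES here).
SELECT column_name, comments
FROM   dict_columns
WHERE  table_name = 'USER_TABLES'
ORDER  BY column_name;
```

Swapping 'USER%TAB%' for another pattern (for example 'ALL%IND%') is a quick way to find the view you need before querying it.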

Page 13:

Oracle Data Dictionary

SQL> select table_name from user_tables;

TABLE_NAME
------------------------------
BONUS
DEPARTMENT
DEPT
EMP
SALGRADE
STUDENT

6 rows selected.

(You should be familiar with this.)

Page 14:

Oracle Data Dictionary

SQL> select table_name, tablespace_name from USER_TABLES;

TABLE_NAME               TABLESPACE_NAME
------------------------ ------------------------------
BONUS                    USER_DATA
DEPARTMENT               USER_DATA
DEPT                     USER_DATA
EMP                      USER_DATA
SALGRADE                 USER_DATA
STUDENT                  USER_DATA

6 rows selected.

Page 15:

Oracle Data Dictionary

SQL> select table_name, tablespace_name from ALL_TABLES;

TABLE_NAME                     TABLESPACE_NAME
------------------------------ ---------------------------
DUAL                           SYSTEM
SYSTEM_PRIVILEGE_MAP           SYSTEM
TABLE_PRIVILEGE_MAP            SYSTEM
STMT_AUDIT_OPTION_MAP          SYSTEM
AUDIT_ACTIONS                  SYSTEM
PSTUBTBL                       SYSTEM
DEPT                           USER_DATA
EMP                            USER_DATA
BONUS                          USER_DATA

... and many more !

Page 16:

Oracle Data Dictionary

SQL> select index_name, table_name from USER_INDEXES;

INDEX_NAME           TABLE_NAME
-------------------- --------------------
DEPT_INDX            DEPARTMENT
EXP_INDEX            STUDENT
PK_DEPT              DEPT
PK_EMP               EMP
SURNAME_INDEX        STUDENT
SYS_C00589           STUDENT

6 rows selected.

Page 17:

Oracle Data Dictionary

SQL> select * from USER_CATALOG;

TABLE_NAME                     TABLE_TYPE
------------------------------ -----------
BONUS                          TABLE
DEPARTMENT                     TABLE
DEPT                           TABLE
EMP                            TABLE
ENROLMENT                      TABLE
SALGRADE                       TABLE
STUDENT                        TABLE
STUDENT_DATA                   VIEW

8 rows selected.

Page 18:

Query Costs

The “costs” of executing a query are made up of
• access cost to files
• main memory processing cost
• writing and storing any intermediate results
• communication costs (distributed system)
• writing final results to storage

Some aspects: Hashing, Indexing, Clustered index, Secondary key indexing, Sorted file, B+ tree, No Index (Binary search), Linear Search (Unordered file, No Index), etcetera.

Page 19:

Optimiser Operation Modes

The Oracle Optimiser has 3 modes of operation.

1. Rule
2. Cost
3. Choose (when a user cannot decide).

The choice of optimiser is normally set up at installation and resides in the initialisation parameter file, init.ora.

It can be overridden by a user at query and session level.
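
The session-level and query-level settings mentioned above might look like the sketch below. It assumes an Oracle release from the period of this lecture, where the OPTIMIZER_MODE parameter accepts RULE, CHOOSE, ALL_ROWS and FIRST_ROWS (later releases no longer support RULE and CHOOSE). The EMP columns are those of the usual demonstration schema, which is an assumption here.

```sql
-- Instance level: a line such as  OPTIMIZER_MODE = CHOOSE  in init.ora.

-- Session level: applies to all later statements in this session.
ALTER SESSION SET OPTIMIZER_MODE = RULE;

-- Query level: a hint inside the statement takes precedence over the
-- session setting for this one query (ALL_ROWS asks for cost-based
-- optimisation aimed at best overall throughput).
SELECT /*+ ALL_ROWS */ ename, sal
FROM   emp
WHERE  deptno = 10;
```

Note that there is no mode literally called COST; the cost-based optimiser is requested through ALL_ROWS or FIRST_ROWS.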

Page 20:

Modes

1. Rule

This mode evaluates the possible execution paths and ranks the alternatives using syntactical rules.

(Rule Based Optimiser - RBO)

2. Cost

Evaluates the ‘cost’ of the available execution paths, and selects the one with the lowest relative cost.

It relies on statistics gathered by another function called analyse (and this needs to be run frequently).

(Cost Based Optimiser - CBO)
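
The ‘analyse’ step referred to above is the ANALYZE command. A minimal sketch, assuming the EMP and STUDENT tables from the earlier listings exist in your schema (later Oracle releases recommend the DBMS_STATS package for the same job):

```sql
-- Compute exact statistics for a small table.
ANALYZE TABLE emp COMPUTE STATISTICS;

-- Estimate statistics from a sample, which is cheaper on large tables.
ANALYZE TABLE student ESTIMATE STATISTICS SAMPLE 10 PERCENT;

-- Check when each of the user's tables was last analysed.
SELECT table_name, num_rows, last_analyzed
FROM   user_tables
ORDER  BY table_name;
```

A NULL in LAST_ANALYZED means the table has no statistics, which is the case in which Choose mode falls back to the rule-based optimiser.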

Page 21:

Modes

3. Choose

Invokes the CBO if the tables have been analysed.

Invokes the Rule Based Optimiser if the tables have not been analysed.

Generally regarded as not being a good method.

And not available to the ‘normal’ user.

Page 22:

Full Table Scan - Table Access Full

Row access is by one of 2 methods:
• a full table scan
• a RowID based table access

So, what are these ?

A full table scan reads each row of a table sequentially. This operation is known as TABLE ACCESS FULL. Oracle reads multiple blocks during each database read.

A block is normally 2048 bytes (the size is set by the DB_BLOCK_SIZE parameter).

A full table scan is used if no WHERE clause occurs in the query.

Page 23:

Table Access - Full Table Scan

The ‘cost’ of this increases as the size of the table increases - performance drops off.

If there are multiple concurrent users using full table scans on the same table, then performance drops off alarmingly - and the result is unhappy users.

A typical example of a full table scan query would be

select * from (tablename)

with possibly an extension of order by (attribute).
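
One way to see a full table scan for yourself is EXPLAIN PLAN. The sketch below assumes that a PLAN_TABLE already exists in your schema (Oracle supplies the utlxplan.sql script to create it) and that EMP has an ENAME column; the statement id 'fts_demo' is an arbitrary label.

```sql
-- Ask the optimiser for its plan without actually running the query.
EXPLAIN PLAN SET STATEMENT_ID = 'fts_demo' FOR
  SELECT * FROM emp ORDER BY ename;

-- Display the plan steps. A full scan shows up as
-- OPERATION = 'TABLE ACCESS' with OPTIONS = 'FULL'.
SELECT id, parent_id, operation, options, object_name
FROM   plan_table
WHERE  statement_id = 'fts_demo'
ORDER  BY id;
```

The same two statements can be reused for the index examples later in the lecture by changing the query after FOR.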

Page 24:

RowID Access

All rows in all tables, including the catalogue tables, have RowID values.

The RowID records the file and block reference and also a sequential number in the block.

Oracle uses indexes to relate data values with RowID values - and this leads to the physical location(s) of data in the database.

Page 25:

RowID Access

A ‘typical’ RowID would look like 00001234.0001.0010, in the form block.row.file: block 00001234, row 0001 within that block, and file 0010.

‘File’ is an Oracle term meaning a storage area used to store database data.

The RowID given above is in hexadecimal.
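
RowID values can be inspected directly, because ROWID is available as a pseudocolumn on every table. A minimal sketch, assuming the EMP table with an ENAME column; the format displayed depends on the Oracle release, and later releases show a longer ‘extended’ RowID rather than the block.row.file form.

```sql
-- ROWID can be selected like any other column.
SELECT rowid, ename
FROM   emp;

-- Fetch one row directly by its RowID
-- (here, whichever row has the lowest RowID in the table).
SELECT *
FROM   emp
WHERE  rowid = (SELECT MIN(rowid) FROM emp);
```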

Page 26:

RowID Access

This is the fastest way to access a row in a table.

Few human users know the RowID values of data - and even if they did, the RowIDs would change as the table contents altered.

Indexes are used to look up the RowIDs, and this improves performance.

Page 27:

Indexes and Indexing

Oracle has 2 major types of index:

Unique: each row contains a unique value for the indexed columns (remember the ‘constraint’ features in the create table command ?)

Non-Unique: the indexed value for rows can repeat (remember the 1:M primary key/foreign key ? And the distinct possibility of a repeating value in many rows, such as PostCode ?)

Page 28:

Indexes and Indexing

Consider this schema:

create table parts (
    part_id       varchar2(6) primary key,
    part_name     varchar2(25),
    stocking_qty  number(5,0),
    charge_price  number(4,2),
    supplier_id   varchar2(12));

The ‘primary key’ constraint causes Oracle to create a unique index.

Supplier_id could be non-unique - an index on this would be useful.

Page 29:

Indexes and Indexing

This would be:

create index parts$supplier_id
    on parts(supplier_id)
    tablespace INDEXES;

There are now 2 indexes on this table, and either (or both) could be used in a query.

To use an index, the query must be written to allow the optimiser to use the index - normally via the ‘where’ clause.

E.g. select * from parts where part_id = 'CA4180';
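
To confirm which indexes now exist on the table, and whether each one is unique, the data dictionary can be queried. This sketch assumes the PARTS table and the index shown above have been created in your own schema.

```sql
-- One row per index on PARTS; UNIQUENESS is 'UNIQUE' or 'NONUNIQUE'.
SELECT index_name, uniqueness
FROM   user_indexes
WHERE  table_name = 'PARTS';

-- Which column(s) each index covers, in key order.
SELECT index_name, column_name, column_position
FROM   user_ind_columns
WHERE  table_name = 'PARTS'
ORDER  BY index_name, column_position;
```

The primary key index will appear with a system-generated name (like SYS_C00589 in the earlier listing) unless the constraint was given a name.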

Page 30:

Indexes and Indexing

2 operations take place.

1. The primary key index will be accessed via an INDEX UNIQUE SCAN. The RowID which matches the part number (CA4180) will be returned from the index.

2. This RowID will be used to locate the row via a Table Access by RowID operation.

If the query had asked only for values held in the index (for example, just part_id), the Table Access by RowID would not have been necessary. The data would have been in the index.

By the way, how many rows would have been returned ?

Page 31:

Indexes and Indexing

An INDEX RANGE SCAN.

This is used where
• a query is over a range of values
• or, a query uses a non-unique index

select part_id from parts where part_id like 'C%';

Here the ‘where’ clause cannot specify a unique value. This means that the Primary Key index will be accessed by an Index Range Scan operation.

This requires more accesses BUT as the values for part_id are stored in the Primary Key index, the data table (parts) is not accessed.

Page 32:

Indexes and Indexing

In this case, the primary key INDEX will be accessed by an Index Range Scan.

If the query was of the form

select * from parts where supplier_id = 'Smith';

then the Index ‘parts$supplier_id’ would be scanned by an Index Range Scan, and Table Access by RowID performed for each occurrence of ‘Smith’.

If 2 columns, or a string and a column, are concatenated in the ‘where’ clause, then indexes on those columns will not be used.
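
The point about concatenation can be illustrated with three queries against the PARTS table. This is a sketch of the general behaviour for ordinary indexes like the two created above; the literal values are made up, and the plans are worth confirming with EXPLAIN PLAN on your own system.

```sql
-- Can use the primary key index: the indexed column stands alone
-- on one side of the comparison.
SELECT * FROM parts WHERE part_id = 'CA4180';

-- Would normally NOT use that index: the column is wrapped in a
-- concatenation, so the predicate is on an expression, not the column.
SELECT * FROM parts WHERE part_id || '' = 'CA4180';

-- Same problem when two columns are concatenated together.
SELECT * FROM parts WHERE supplier_id || part_id = 'SmithCA4180';
```

Rewriting the last query as two separate conditions joined by AND (supplier_id = 'Smith' and part_id = 'CA4180') gives the optimiser the chance to use either index again.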

Page 33:

Indexes and Indexing

The previous notes led to the decision of an index being used, or not.

The Cost Based Optimiser (CBO) calculates whether the use of an index will lower the ‘cost’ of a query, or not.

Page 34:

Indexes and Indexing

If there are 800 distinct values in an attribute set (column) of 1000 rows, then the index selectivity for that set of values is 800/1000 or 0.8. A higher selectivity means that fewer rows are returned for each distinct value.

If the index selectivity is low, the inference is that the Index Range Scan operations and the Table Access by RowID may be more costly than a Table Access Full.
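
The selectivity figure described above can be calculated from the data itself, or read from the dictionary once statistics exist. A sketch, assuming the PARTS table and its supplier_id index from the earlier overheads; NULLIF is used only to avoid a divide-by-zero error on an empty table.

```sql
-- Selectivity = distinct values / total rows, computed from the data.
-- (COUNT(DISTINCT ...) ignores NULLs.)
SELECT COUNT(DISTINCT supplier_id)                       AS distinct_values,
       COUNT(*)                                          AS total_rows,
       COUNT(DISTINCT supplier_id) / NULLIF(COUNT(*), 0) AS selectivity
FROM   parts;

-- The same figure from the dictionary, once the table and its
-- indexes have been analysed.
SELECT i.index_name,
       i.distinct_keys / t.num_rows AS selectivity
FROM   user_indexes i, user_tables t
WHERE  i.table_name = t.table_name
AND    t.table_name = 'PARTS'
AND    t.num_rows > 0;
```

With the lecture's figures (800 distinct values in 1000 rows) the first query would report a selectivity of 0.8; a unique index always has a selectivity of 1.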

Page 35:

More Aspects of Optimising

These are some additional aspects:
• Nested loops
• Hash join
• Subqueries
• Update
• Outer join
• Filter
• Database links - remote data
• Clusters (caution: data manipulation on cluster-stored tables performs poorly compared with non-clustered tables)

Page 36:

Some of the 770 DBA Tables

DBA_VIEWS               USER_CLU_COLUMNS
ALL_ERRORS              USER_AUDIT_STATEMENT
ALL_TABLES              USER_AUDIT_TRAIL
ALL_OBJECTS             USER_CATALOG
USER_COLL_TYPES         USER_TAB_PRIVS
USER_COL_COMMENTS       USER_ARGUMENTS
USER_COL_PRIVS          USER_ALL_TABLES
USER_COL_PRIVS_MADE     USER_TAB_PRIVS
USER_ASSOCIATIONS       V$SQL
USER_AUDIT_OBJECT       V$SQLAREA
USER_AUDIT_SESSION      V$SHARED_MEMORY
USER_VIEWS              GV$DISPATCHER

Page 37:

Oracle SQL and DB2 (UDB V.7.1)

The next few overheads are intended to convince you that DBMSs do alter.

These are some of the recent changes to Oracle:
• Set SQLBlanklines (On or Off. Off is the default)
• Show SQLBlanklines
• A modification to the return message from Create or Replace/Alter/Drop Snapshot

Page 38:

Oracle SQL and DB2 (UDB V.7.1)

New or Modified Commands:

Describe (m) - level as set in Set Describe

SQL> set describe depth 3 linenum on indent on

SQL> describe <table name>

Recover (m) - performs media recovery on one or more tablespaces, one or more datafiles or the entire database

Set (m) - this has 4 new clauses: Autorecovery, Describe, Instance, Logsource

Show - 6 new clauses: Autorecovery, Describe, Instance, Logsource, Parameters, SGA (system global area)

Page 39:

Oracle SQL and DB2 (UDB V.7.1)

Shutdown - option of closing and dismounting a database

Connect - Connect [logon] as Sysoper | Sysdba | Internal

logon is of the form username/password@database_specification

These commands have been modified:
• Create type
• Describe
• Password
• Connect

Page 40:

Oracle SQL and DB2 (UDB V.7.1)

Set: maxdata, closecursor, compatibility, constraint, newpage, loboffset

Page 41:

Oracle SQL and DB2 (UDB V.7.1)

Variable (bind variables: nchar, nvarchar2, nclob)

Show Errors

Attribute

Exit - allows numeric bind variables to be used.

{Exit | Quit} [Success | Failure | Warning | n | variable | :BindVariable] [Commit | Rollback]

Page 42:

Oracle SQL and DB2 (UDB V.7.1)

These are some of the recent changes to DB2:
• A Net Search Extender is included
• Joins between different versions of DB2 are now supported
• An XML Extender is included
• Net.Data (connects Web applications to DB2)
• DB2 Warehouse Manager (includes Query Patroller, QMF for Windows)
• OLAP Server Starter kit
• Spatial Extender - introduces time and distance attributes into business intelligence queries

Page 43:

Oracle SQL and DB2 (UDB V.7.1)

And in DB2 SQL, these functions are included:
• Moving average
• Moving count
• Moving sum
• Rank
• Correlation
• Stddev
• Variance
• CoVariance

Page 44:

New SQL Commands

The OLAP functions proposed for SQL-99 are

ceiling       percentile_cont   regr_slope
corr          percent_rank      regr_sxx
covar_pop     power             regr_sxy
covar_samp    range             regr_syy
cume_dist     rank              row_number
dense_rank    regr_avg          sqrt
exp           regr_avgx         stddev_pop
floor         regr_avgy         stddev_samp
ln            regr_count        var_pop
moving_avg    regr_intercept    var_samp
moving_sum    regr_r2
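
A few of these functions are shown in use below. This is a sketch only: it assumes the EMP demonstration table with ENAME, HIREDATE, SAL and DEPTNO columns, and it expresses ‘moving’ calculations with the windowed OVER clause, which is how Oracle and DB2 releases that support these functions generally write them.

```sql
SELECT ename,
       hiredate,
       sal,
       -- Rank by salary, highest first; ties share a rank.
       RANK() OVER (ORDER BY sal DESC)            AS sal_rank,
       -- Moving average over this row and the two hired before it.
       AVG(sal) OVER (ORDER BY hiredate
                      ROWS BETWEEN 2 PRECEDING
                               AND CURRENT ROW)   AS moving_avg_sal,
       -- Running count of employees in hire-date order.
       COUNT(*) OVER (ORDER BY hiredate
                      ROWS UNBOUNDED PRECEDING)   AS moving_count
FROM   emp
ORDER  BY hiredate;

-- The statistical functions work as ordinary aggregates.
SELECT deptno, STDDEV(sal) AS sal_stddev, VARIANCE(sal) AS sal_variance
FROM   emp
GROUP  BY deptno;
```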

Page 45:

Oracle SQL and DB2 (UDB V.7.1)

Oracle Data Types        DB2 Data Types
Char(n)                  Char(n)
Varchar2(n)              Varchar(n)
Long                     Clob (2 GB)
Number(p,s)              Numeric(p,s)
Decimal(p,s)             Decimal(p,s)
Integer                  Integer
Smallint                 Smallint
Raw(n)                   Char(n) for Bit Data
Long Raw                 Blob (2 GB)
Date                     Timestamp
Date (only the date)     Date (MM/DD/YYYY)
Date (only the time)     Time (HH24:Mi:Ss)
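
To make the correspondence concrete, here is the same hypothetical table declared once for each product. The table and column names are invented for illustration; each statement is meant to be run only on its own DBMS.

```sql
-- Oracle version
CREATE TABLE shipment (
    ship_id   NUMBER(8,0),
    carrier   VARCHAR2(30),
    notes     LONG,          -- long text
    shipped   DATE           -- holds both date and time
);

-- DB2 version of the same table
CREATE TABLE shipment (
    ship_id   NUMERIC(8,0),
    carrier   VARCHAR(30),
    notes     CLOB(2G),      -- long text
    shipped   TIMESTAMP      -- holds both date and time
);
```

The main trap when moving between the two is the Oracle Date type, which always carries a time of day, whereas DB2 separates Date, Time and Timestamp.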

Page 46:

Some Interesting Aspects

An interesting about-face:

Work has been done on ‘unconventional’ concurrency models - and Oracle has implemented a non-locking based model (perhaps a cache based database ?).

Could all decision support systems work on the same ‘truth’ data ?

Page 47:

IBM’s Direction

IBM has an ongoing learning optimisation research project (eLiza) which is aimed at automating adjustments to
• the configuration parameters
• memory space allotment
• schemas (and, more importantly, changes to schemas, as these do change over time)

Page 48:

Emerging Standards

SQL-X is an emerging standard for using SQL together with XML syntax to navigate XML documents and to express XML-related queries.

• SQL offers a much simpler view of data
• The language is about value-based relationships
• Data (in many cases) is maintained without value-based relationships

Page 49:

Emerging Standards

• XML is widely used for web based database applications
• It is a standard for ‘describing’ data in data exchange
• It ‘embeds’ information about the text in a text message
• XML code can be reused
• The World Wide Web Consortium (W3C) completed XML’s definition in 1998
• It is a ‘language about languages’
• It uses ‘embedded’ tags for its ‘intelligence’
• XQuery runs queries against XML-tagged documents

Page 50:

A Brief History of I.T. Trends

• Move from centralised computing to distributed or decentralised computing
• Business process re-engineering
• Rapid advance, development and establishment in database technology
• Advanced systems in Enterprise Resource Planning, Customer Relationship Management
• Expansion and use of the World Wide Web
• Internet capabilities

Page 51:

Business - Analytical Applications

Growth and Expansion of Financial Analytic Applications - 21st Century Focus

Why:
• Costs and Cost Management and Containment
• Profits and Profit Management
• Enterprise, Corporate, Business Management
• Regulation Compliance - e.g. Sarbanes-Oxley Act

Page 52:

Pressures ?

• Client/Server computing
• Distributed Computing
• New generation of users + their requirements
• Intelligent Data
• Data Management
• and more Data and more Data Management

Page 53:

A solution ? Virtualisation

• Addresses the problems of the rapid development of databases,
• which has resulted in a heterogeneous array of systems
• and is a barrier to Business exploiting, or gaining full value from, its information sources

Page 54:

Data Federation

To unite
- on a common basis
- for a common objective

Do these qualify ?
• Law Enforcement Agencies
• Airline Industry
• Healthcare Providers
• Retailers
• Manufacturers
• Suppliers
• Insurance Agencies
• Government Agencies

Page 55:

Data Federation

• The concept of Information as a ‘shared resource’
• Insurers can improve satisfaction levels and reduce costs: doctors, health agents, hospitals with Web access
• Required data is held in ‘older’ systems - legacy systems ?
• New IT systems - business intelligence, enterprise portals, e-commerce
• Critical to competitive positioning, cost efficiencies, operational performance monitoring

Page 56:

Data Federation

Can the gap be overcome ?

IBM’s product - Classic Federation
• provides a means of access to mainframe non-relational and relational databases and files
• employs ODBC and JDBC client tools and applications
• ‘Fits seamlessly into existing mainframe infrastructures, reporting tools and application environments’

Page 57:

Data Federation

• Standard SQL commands - Select, Insert, Update, Delete
• Business ‘able to tap into’ multivendor legacy systems - DB2, IMS, VSAM, Adabas, CA-Datacom, CA-IDMS

How ?
• DB2 II Classic Federation maps logical relational table and view structures over existing physical databases
• Unix, Linux and mainframe tools access this data using the SQL commands

Page 58:

Data Federation

‘Classic Federation’ generates native data access commands for each database and file type.

JDBC Client provides SQL developers with
• WebSphere Studio (mainframe operational data to customer Web site)
• WebSphere Portal (access to mainframe payroll, policy, accounting, claims data)
• WebSphere Business Integrator

Page 59:

Data Federation

Oracle: Real Application Clusters (9i and 10g)
• Shared disk approach - ‘unified view’
• Transaction processing applications
• current trends in storage networks
• ‘grid’ computing with ‘blade’ servers (attachable software)

Page 60:

Data Federation

Oracle’s policy ?

“virtualisation enables each component of the grid to react to changing circumstances more quickly and to adapt to component failures without compromising performance of the system as a whole”. (Brajesh Goyal)

Also interested in Linux and Intel based hardware.

Page 61:

Grid Computing

Grid computing is based on the concept of networked computing resources, managed such that they can be quickly and efficiently re-allocated for use by different departments, applications, and users.

It embraces high speed networking technologies, and advances in clustering and storage technologies.

Page 62:

Grid Computing

It also embraces automation of system administration

And the adoption of industry standard technologies

Allows ‘customers’ to provide cost efficient supply, access, management and sharing of computing and storage

Page 63:

Associated Technologies

Data mining - the automated extraction of hidden predictive information from databases
• It allows users to analyse large databases to solve business decision problems
• It is not a business solution - it is a technology

Data Warehouse - a repository which stores integrated information for efficient querying and analysis
• This information may come from different sources
• It is translated into a common data model and integrated with existing data in the data warehouse

Page 64:

Future Directions

Nanotechnology - smaller, faster, mobile, more efficient

Mobile services will continue to become
• smaller
• faster
• and embedded in many objects we touch

Page 65:

Future Directions

They will enable
• real-time interaction with customers
• participation in collaborative projects
• access to a global network of intelligence

And the distinction between communication and computing will become imprecise

Page 66:

Future Directions

Molecular Memory
• A means of ‘cramming’ more data into a memory cell
• Molecular wires - nanotechnology
• Molecular wires - parcels of charge around a molecule
• A grid of wires, each about 2 nanometres in diameter
• A nanometre is one millionth of a millimetre (roughly 10 carbon atoms long)

Page 67:

Holographic Memory - What’s That ?

• It could be the replacement for hard disks
• Devices which use light (photo-optic) to store and read data
• Compact disks (CD) - 783 megabytes (soon 1.3 GB)
• DVD (Digital Versatile Disks) - 15.9 Gigabytes
• Data is stored as bits (binary digits) - and on the surface of the recording media

Page 68:

Holographic Memory

New optical storage research is focused on 3D storage - to use the volume of the storage media, not just the surface area.

Possibility of storing a terabyte (10^12 bytes, or 1,000 gigabytes) of data in a sugar-cube-size crystal

The data on 1,000 CDs could fit onto a holographic memory system

Page 69:

Holographic Memory

Current PC hard disk drives hold about 80 to 120 Gigabytes - which is a considerably smaller capacity than 1,000 Gigabytes

Have you seen any advertising for an HDSS (desktop holographic storage system) ?

Data transfer rate of 40 Megabytes per second

Page 70:

In the Future ?

Not to be outdone, Microsoft has signalled that it intends to ‘remove the divide between “High Performance Computing” and “Personal Computing”’.

This probably means that Microsoft will focus on Windows clustering, and exploiting Web services for large-scale, federated and distributed processing

Page 71:

In the Future ?

Open Source Database Management Systems
• MySQL + SAP = MaxDB
• PostgreSQL 7.4 Relational Database. 64 bit processor

Page 72:

In the Future ?

• Database replaced traditional ‘file keeping and management’
• Will Data Warehousing eventually replace existing ‘databases’ and database technology ?
• Will analytical tools (ERP, CRM, SCM, BOM …) eventually be the ‘core’ processes of databases ?
• Will ‘Grid’ computing be the next wave of user access capability ?

And then ? ? ? ?

How will the ‘communications’ load be met or supported ?