EXPERIMENTAL METHODS AND TECHNIQUES IN ENGINEERING
The Data Management Perspective
Fabio A. Schreiber
Politecnico di Milano
Dipartimento di Elettronica, Informazione e Bioingegneria
THE DATA MANAGEMENT PERSPECTIVE
Experiments on Databases and DBMSs
Data organization and management as a service to the experiments of the scientific community
Experimenting with the Database content itself
F. A. Schreiber Experimental Methods ... Data Perspective
EXPERIMENTS & DATA MANAGEMENT
Experiments for optimizing data structures and management
  Database Management System (DBMS)
  Data Structures
    Conceptual/Logical Schema optimization and evolution
    Physical structures design
Data organization and management for collecting experimental results (e-Science)
Exploring Database content (data mining)
Assessing Data Quality
EXPERIMENTS & DATA MANAGEMENT
Goals
  Systems Performance Evaluation and Tuning
    How performant is a system? How can I improve its performance?
  Benchmarking
    Comparison among different systems under similar workload
  System Effectiveness
    How much a system conforms to the user's needs w.r.t. a defined metric
WORKLOAD AND FACTORS
Synthetic vs. Real
  Synthetic workload allows for controlled experiment repeatability. Useful in systems comparison
  Real workload can be highly variable and can be used in assessing the overall performance of a single system in its real environment
Single-user (to test specific algorithms) vs. Multi-user (to test system procedures)
  Multiprogramming level
  Query mix
  Degree of data sharing (buffer and cache sizes)
FACTORS IN DBMS PERFORMANCE EVALUATION (Boral & DeWitt 84)[2]
Multiprogramming level (MPL)
  Number of concurrent queries in any phase of execution
  Use precompiled queries and minimize the data volume of the results in order to exhibit as much as possible the true «execution» time
FACTORS IN DBMS PERFORMANCE EVALUATION
Degree of Data Sharing (DDS)
  Concurrent access affects both data pages (rare) and index pages (frequent)
  Expressed as a percentage of the multiprogramming level:
    0%: each query references only its partition
    100%: all queries reference the same partition
    0% < DDS < 100%:
      Queries randomly distributed among partitions
      Application programs uniformly distributed among partitions
The DDS affects the buffer pages replacement algorithm
  MRU best for relational operators
  LRU best for replacement of shared data pages
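Why MRU can beat LRU for a relational operator is easiest to see on a looping scan. The following is a minimal sketch, not taken from the slides: the function name `simulate`, the 5-page scan and the 4-frame buffer are all hypothetical choices made for illustration.

```python
from collections import OrderedDict

def simulate(policy, references, capacity):
    """Count buffer hits for a page reference string under 'LRU' or 'MRU'."""
    buffer = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for page in references:
        if page in buffer:
            hits += 1
            buffer.move_to_end(page)  # mark as most recently used
            continue
        if len(buffer) >= capacity:
            # LRU evicts the first (oldest) entry, MRU evicts the last (newest)
            buffer.popitem(last=(policy == "MRU"))
        buffer[page] = True
    return hits

# Hypothetical workload: an operator scanning the same 5 pages four times,
# with only 4 buffer frames available
scan = list(range(5)) * 4
lru_hits = simulate("LRU", scan, capacity=4)
mru_hits = simulate("MRU", scan, capacity=4)
print(lru_hits, mru_hits)
```

On this cyclic pattern LRU always evicts the page that will be needed next, so it never hits, while MRU keeps most of the loop resident. The sketch does not model shared data pages, where the slide indicates LRU as the better choice.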
FACTORS IN DBMS PERFORMANCE EVALUATION
Query Mix Selection (multiuser) (Boral & DeWitt 84)[2]
Consumed resources
  CPU cycles: actual query execution, access path selection, buffer pool management, OS disk operations
  Disk bandwidth: get/store data, page swapping

Query type | CPU           | Disk       | Query example
I          | Low, 0.18 s   | Low, 2-3   | Select one tuple from 10000 using a clustered index
II         | Low, 0.90 s   | High, 91   | Select 100 tuples from 10000 using a non-clustered index
III        | High, 18.96 s | Low, 206   | Join 10000 tuples with 1000 tuples using a clustered index on the join attribute of the first relation
IV         | High, 35.62 s | High, 1008 | Aggregate function on 10000 tuple relation (100 partitions)
METRIC (Schwartz 11) [15]
Measured entities
  Observation interval (OI)
  Number of queries in the observation interval (NQ)
  Busy time: total time of the queries in the system (BT)
  Weighted time: total execution time of the queries (WT)
Derived variables
  Throughput: NQ/OI
  Execution time: WT/NQ
  Concurrency: WT/OI
  Utilization: BT/OI
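The four derived variables are plain ratios of the measured entities. A minimal sketch follows; the function name `derived_metrics` and the sample numbers are made up for illustration and are not measurements from the slides.

```python
def derived_metrics(oi, nq, bt, wt):
    """Derive the four metrics from the measured entities.

    oi: observation interval (s), nq: number of queries in the interval,
    bt: busy time (s), wt: weighted time (s).
    """
    return {
        "throughput": nq / oi,       # queries per second
        "execution_time": wt / nq,   # average seconds per query
        "concurrency": wt / oi,      # average number of queries in execution
        "utilization": bt / oi,      # fraction of the interval spent busy
    }

# Hypothetical measurement: 120 queries observed over 60 s
metrics = derived_metrics(oi=60.0, nq=120, bt=45.0, wt=90.0)
print(metrics)
```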
METRIC
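$S_{i,j}$, $E_{i,j}$: starting and ending times of the $j$-th query of the $i$-th concurrent program, $1 \le i \le \mathrm{MPL}$, $1 \le j \le N$
$T_{\text{last-to-start}} = \max\{S_{i,1},\ 1 \le i \le \mathrm{MPL}\}$ ; $T_{\text{first-to-finish}} = \min\{E_{i,N},\ 1 \le i \le \mathrm{MPL}\}$
$NQ$: number of totally executed queries
System throughput: $NQ / (T_{\text{first-to-finish}} - T_{\text{last-to-start}})$
Average response time: $\left(\sum_{k=1}^{NQ} \text{exec\_time}_k\right) / NQ$
[Figure: MPL concurrent programs along the time axis t, with T_last-to-start, T_first-to-finish and the number of totally executed queries NQ marked]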
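The measurement window can be computed directly from the per-query timestamps. This is a sketch under one stated assumption: "totally executed" is read as "started and finished inside the window [T_last-to-start, T_first-to-finish]", which the slide does not spell out. The function name `window_metrics` and the sample timestamps are hypothetical.

```python
def window_metrics(start, end):
    """start[i][j], end[i][j]: start/end time of the j-th query of program i."""
    t_last_to_start = max(row[0] for row in start)
    t_first_to_finish = min(row[-1] for row in end)
    # Assumption: NQ counts only the queries that run entirely inside the window
    exec_times = [
        e - s
        for s_row, e_row in zip(start, end)
        for s, e in zip(s_row, e_row)
        if s >= t_last_to_start and e <= t_first_to_finish
    ]
    nq = len(exec_times)
    throughput = nq / (t_first_to_finish - t_last_to_start)
    avg_response = sum(exec_times) / nq
    return nq, throughput, avg_response

# Hypothetical run with MPL = 2 programs and N = 3 queries each
start = [[0.0, 2.0, 4.0], [1.0, 3.0, 5.0]]
end = [[2.0, 4.0, 6.0], [3.0, 5.0, 7.0]]
nq, throughput, avg_response = window_metrics(start, end)
print(nq, throughput, avg_response)
```

Restricting the count to the window where all MPL programs are active avoids measuring the ramp-up and ramp-down phases, in which the effective multiprogramming level is lower.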
EXPERIMENTAL MODALITY
Simulation vs. real-life testbeds
  Simulation often provides imprecise results owing to the many parameters of the system which are not accounted for by the simulation programs
  Testbeds with a very large number of components are very difficult, if not impossible, to organize
  Use testbeds to tune and calibrate simulation programs???
Repeatability is essential for the experiment credibility (Manolescu & Al.) [8]
A DBMS QUEUING MODEL
[Figure: queuing model of a DBMS. Labeled elements: USERS, TRANSACTION REQUEST, PRIORITY ASSIGNMENT, RESTART / RESUBMIT / TERMINATE, COMMIT / ABORT, REQUEST/RELEASE A DATA OBJECT, CONCURRENCY CONTROL, BLOCK / WAIT, DB OPERATION, BUFFER ACCESS (HIT / MISS), DISK ACCESS, COMPUTATION, CPU]
SCALABILITY
Systems COMPLEXITY
  The memory and time behaviour of algorithms cannot be inferred by testing systems composed of only a bunch of nodes: constants matter!
[Figure: growth of O(log n), O(n), O(n^2) and O(2^n) as functions of n]
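A tiny numerical sketch shows why small testbeds can mislead. The two cost functions below are invented for illustration (a linear algorithm with a large constant against a quadratic one with a small constant); they are not measurements of any real system.

```python
def linear_cost(n):
    """Hypothetical O(n) algorithm with a large constant factor."""
    return 100 * n

def quadratic_cost(n):
    """Hypothetical O(n^2) algorithm with a small constant factor."""
    return n * n

# On a small testbed the quadratic algorithm looks cheaper...
small = 10
print(small, linear_cost(small), quadratic_cost(small))

# ...but the ranking flips once n passes the crossover point
crossover = next(n for n in range(1, 10**6) if quadratic_cost(n) > linear_cost(n))
large = 10_000
print(crossover, linear_cost(large), quadratic_cost(large))
```

With these made-up constants, any experiment run on fewer than about a hundred nodes would rank the two algorithms the wrong way round for large systems.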
DATABASE SYSTEMS BENCHMARKS
Useful for comparing different DBMS
Systems must be fully installed and operational
They rely on the effectiveness of synthetic workloads
Benchmarking is an experimental activity which requires three steps (as usual):
  Design
  Execution
  Analysis
DATABASE SYSTEMS BENCHMARKS
The good thing about standards is that there are so many of them
Unknown
… and each one has so many options that can be chosen at will …
DATABASE SYSTEMS BENCHMARKS
Transaction Processing Performance Council (TPC) [16]
  TPC-C: for OLTP systems. It simulates a multi-user environment making concurrent queries to a central Database. Suited for on-line handling of orders and for managing inventories.
  TPC-E: similar to TPC-C, but with transactions designed for brokerage environments such as on-line trading, market research, account inquiries, …
  TPC-H: tuned for Decision Support Systems, complex data mining queries, concurrent data modifications.
Other benchmarks for specific products (Oracle, MySQL, …) or functionalities (security, web servers, …)
DATA ORGANIZATION AND MANAGEMENT FOR COLLECTING EXPERIMENTAL RESULTS (e-Science) (Vanschoren & Blockeel 10) [14]
Experimental data collection
  create searchable, community-wide repositories to automatically publish experimental results on-line
  a formal experiment description language to import a large number of experiments and make them immediately available to everyone
  ontologies providing a controlled vocabulary clearly describing the interpretation of each concept
Generate a collaborative approach to experimentation
  Experiments freely shared
  Linked together
  Reused by querying and data mining
e-SCIENCES
Computationally intensive sciences, which use the internet as a global collaborative workspace
  Bioinformatics
    Microarrays (Stoeckert & Al. 02) [12]
    Proteomics (Masseroli 07) [10]
  Astronomy
    Virtual observatories (Szalay & Gray 01) [13]
  Physics
    High energy nuclear physics (Brown & Al. 07) [3]
e-SCIENCES
e-science applications as well as other Web Information Systems share a collaborative and distributed nature of their development and content management (Curino & Al. 08) [5,6]
Evolution in time
  DB migration
    while preserving the past contents of the DB and the history of its schema
  Applications maintenance
    while allowing legacy applications to access new contents through old schema versions
DATA MODELS AND STRUCTURES MAINTENANCE (Marche 93) [9], (Sjoberg 93) [11], (Curino & Al. 08) [5,6]
Conceptual/Logical level
  Schema evolution
  Restructuring
  Optimization
  ...
Observational study (analogous to natural sciences) made on the evolution of the Wikipedia Database schema
Goal:
  Create a benchmark for schema evolution (and in general a standard relational DB dataset).
  Extend the analysis to several other Open-Source WIS (Joomla!, TikiWiki, Slashcode, Zen-Cart, Wordpress)
  Extend the analysis towards Public Scientific DB (Genome, HGVS)
EXPERIMENTS ON SCHEMA EVOLUTION: the Wikipedia case (Curino & Al. 08) [5,6]
• Schema Evolution:
  • 170+ versions in 4.5 years
  • almost 250% increase
• WIS evolve faster than Traditional IS
  • 38% w.r.t. [Sjoberg93]
  • 539% w.r.t. [Marche93]
EXPERIMENTS ON SCHEMA EVOLUTION: the Wikipedia case
Previous queries on new schema
Major restructuring
EXPERIMENTS ON SCHEMA EVOLUTION: the Wikipedia case
New queries on all previous schema versions
Major restructuring
DATA MODELS AND STRUCTURES MAINTENANCE (Babu & Al. 09)[1], (Davcev & Al. 08) [7]
Physical level
  Tables
    Sorting
    Clustering
    ...
  Indexes
    Trees
    Hashing
    ...
  Memory
    Shared buffers
    Cache size
    ...
EXPLORING DATABASE CONTENT
ALL HUMAN KNOWLEDGE STARTS FROM INTUITIONS, PROCEEDS THROUGH CONCEPTS, AND REACHES ITS CLIMAX WITH IDEAS
I. Kant
KNOWLEDGE HIERARCHY
[Figure: knowledge hierarchy. Levels with examples: DATA (invoices), INFORMATION (sales trend), KNOWLEDGE (market rules), WISDOM (strategic decisions). Other labels: ELEMENTS (VOLUME), VARIABLES, VALUE ADDED, PROCEDURES, STATISTICAL PROCESSING, KNOWLEDGE DISCOVERY, EXPERIENCE]
KNOWLEDGE DISCOVERY AND DATA MINING
Knowledge Discovery in Databases and Data Warehouses
  To identify the most significant information
  To show it to the user in the most convenient way
Data Mining
  Algorithm application to raw data in order to extract knowledge (relations, paths, …)
  Predictive aim (signal analysis, voice recognition, etc.)
  Descriptive aim (decision support systems, natural sciences)
WHAT KIND OF INFORMATION DO WE GET?
Associations
  Set of rules specifying the joint occurrence of two (or more) elements
Sequences
  Possibility of stating temporal sequences of events
Classifications
  Grouping of elements into classes following a given model
Clusters
  Grouping of elements into classes which have not been defined a-priori
Trends
  Discovery of peculiar temporal paths having a forecasting value
KNOWLEDGE DISCOVERY PROCESS (1)
Even if specialized tools are available it requires
  A competence in used techniques
  A very good application domain knowledge
Sequential steps
  Selection
    Choice of the sample data the analysis shall be focused on
  Preprocessing
    Data sampling in order to reduce their volume
    Data scrubbing for errors and omissions
KNOWLEDGE DISCOVERY PROCESS (2)
  Transformation
    Data types homogenization and/or conversion
  Data mining
    Choice of the method/algorithm
  Interpretation and evaluation
    Retrieved information filtering
    Possible refining by previous steps repetition
    Search results visual presentation (graphical or logical)
KNOWLEDGE DISCOVERY PROCESS (3)
[Figure: knowledge discovery process. RAW DATA → (SELECTION) → TARGET DATA → (PRE-PROCESSING) → PRE-PROCESSED DATA → (TRANSFORMATION) → TRANSFORMED DATA → (DATA MINING) → CORRELATIONS AND PATHS → (INTERPRETATION) → KNOWLEDGE. Source: G. Piatetsky-Shapiro 1996]
DATA MINING ALGORITHMS
Model representation
  Formalisms to represent and describe possible paths
Model evaluation
  Statistical or logical estimate of the correspondence of a path to the search criteria
Search method
  Of parameters
    Search of the parameters which optimize the evaluation criteria, the observations set and the model representation being given
  Of model
    The parameters are applied to models belonging to the same family, differentiated by the representation type, for quality evaluation
THE “MARKET BASKET” MODEL
The best-known model on which data mining techniques are applied
Mainly, but not exclusively, used for retail sale problems
The goal is to discover recurrent patterns in data (association rules)
THE “MARKET BASKET” MODEL
$I = \{i_1, \dots, i_k\}$: SET OF $k$ ELEMENTS (ITEMS)
  e.g. goods in a supermarket, words in a dictionary
$B = \{b_1, \dots, b_n\}$: SET OF $n$ SUBSETS (BASKETS) OF $I$, $b_i \subseteq I$
  e.g. a customer's purchase, a document in a corpus
ASSOCIATION RULE $i_1 \Rightarrow i_2$
  SUPPORT: $i_1$ AND $i_2$ SHOW TOGETHER IN AT LEAST $s\%$ OF THE $n$ BASKETS
  CONFIDENCE: OF ALL THE BASKETS CONTAINING $i_1$, AT LEAST $c\%$ ALSO CONTAIN $i_2$
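Support and confidence of a rule can be computed by simple counting over the baskets. A minimal sketch follows; the function names and the four sample baskets are hypothetical and only illustrate the definitions above.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(baskets, lhs, rhs):
    """Among the baskets containing `lhs`, fraction that also contain `rhs`."""
    return support(baskets, set(lhs) | set(rhs)) / support(baskets, lhs)

# Hypothetical purchases
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
s = support(baskets, {"bread", "milk"})        # rule bread => milk
c = confidence(baskets, {"bread"}, {"milk"})
print(f"support = {s:.0%}, confidence = {c:.0%}")
```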
THE “MARKET BASKET” MODEL: ANY PROBLEM?
c: COFFEE IS IN THE BASKET; ¬c: NO COFFEE IN THE BASKET
t: TEA IS IN THE BASKET; ¬t: NO TEA IN THE BASKET

          |  c | ¬c | Σ rows
t         | 20 |  5 |  25
¬t        | 70 |  5 |  75
Σ columns | 90 | 10 | 100

t ⇒ c IS TRUE???
support s = P[t ∧ c] = 20%; confidence = P[t ∧ c] / P[t] = 20/25 = 80%
PERHAPS, BUT ... THOSE WHO BUY COFFEE ANYHOW REACH 90% !!!
WARNING! A CORRELATION EXISTS BETWEEN TEA AND COFFEE: r = P[t ∧ c] / (P[t] × P[c]) = 0.20 / (0.25 × 0.90) ≈ 0.89
Since r < 1, buying tea makes buying coffee less likely, not more.
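The figures on this slide can be re-derived from the four cell counts of the contingency table. This is only an arithmetic check; the variable names are chosen here for illustration.

```python
# Cell counts from the tea/coffee table (100 baskets in total)
counts = {
    ("t", "c"): 20,
    ("t", "not_c"): 5,
    ("not_t", "c"): 70,
    ("not_t", "not_c"): 5,
}
n = sum(counts.values())

p_tc = counts[("t", "c")] / n                            # P[t and c]
p_t = (counts[("t", "c")] + counts[("t", "not_c")]) / n  # P[t]
p_c = (counts[("t", "c")] + counts[("not_t", "c")]) / n  # P[c]

support_tc = p_tc
confidence_tc = p_tc / p_t
r = p_tc / (p_t * p_c)

print(f"support = {support_tc:.2f}, confidence = {confidence_tc:.2f}, "
      f"P[c] = {p_c:.2f}, r = {r:.2f}")
```

The rule t ⇒ c passes a confidence threshold of 80%, yet its confidence is lower than the unconditional probability of coffee, which is exactly what r < 1 expresses.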
CLASSIFICATION PROBLEM : AN EXAMPLE
Training set:

AGE | CAR TYPE | RISK
17  | sports   | high
43  | family   | low
68  | family   | low
32  | truck    | low
23  | family   | high
18  | family   | high
20  | family   | high
45  | sports   | high
50  | truck    | low
64  | truck    | high
46  | family   | low
40  | family   | low

MINE: rules extracted from the training set
1. IF Age ≤ 23 THEN Risk IS High;
2. IF CarType = sports THEN Risk IS High;
3. IF CarType IN {family, truck} AND Age > 23 THEN Risk IS Low;
4. DEFAULT Risk IS Low

TEST set and CLASSIFICATION obtained by applying the rules in order:

AGE | CAR TYPE | CLASS
22  | family   | high
60  | family   | low
35  | sports   | high
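The four rules translate directly into an ordered chain of conditions. The sketch below applies them to the test set and, as an extra check not present on the slide, to the training set; the function name `classify` is chosen for illustration.

```python
def classify(age, car_type):
    """Apply the four mined rules in order; the first match wins."""
    if age <= 23:
        return "high"
    if car_type == "sports":
        return "high"
    if car_type in {"family", "truck"} and age > 23:
        return "low"
    return "low"  # default rule

training = [
    (17, "sports", "high"), (43, "family", "low"), (68, "family", "low"),
    (32, "truck", "low"), (23, "family", "high"), (18, "family", "high"),
    (20, "family", "high"), (45, "sports", "high"), (50, "truck", "low"),
    (64, "truck", "high"), (46, "family", "low"), (40, "family", "low"),
]
test_set = [(22, "family"), (60, "family"), (35, "sports")]

predictions = [classify(age, car) for age, car in test_set]
correct = sum(classify(age, car) == risk for age, car, risk in training)

print(predictions)
print(f"{correct}/{len(training)} training records reproduced")
```

The rules reproduce all training records except (64, truck, high), which rule 3 labels as low: a reminder that mined rules summarize the data rather than memorize it.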
EFFECTIVENESS
No established results on metric and methodologies
Application dependent
Context
  Physical
  Sociological
User psychology
DATA QUALITY DIMENSIONS (Cappiello & Schreiber 12) [4]
ACCURACY: the degree of conformity of a measured or computed quantity to its actual (true) value ($|v_{avg} - v_{ref}| < \varepsilon_{acc}$)
PRECISION: the degree to which repeated measurements show the same or similar results (small variance: $\frac{1}{N}\sum_{n=1}^{N}(v_n - \mu)^2 < \varepsilon_{prec}$)
TIMELINESS
  CURRENCY: the time interval from the instant the value was sampled to the instant at which it is sent to the base station
  VOLATILITY: the amount of time during which data remain valid
  $\text{Timeliness} = \max(1 - \text{Currency}/\text{Volatility},\ 0)^{s}$
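The three quality dimensions can be checked with a few lines of code. This is a sketch, not the authors' implementation: the function names and the sample readings are hypothetical, and the trailing "s" of the timeliness formula is read here as an exponent defaulting to 1, which is an assumption.

```python
from statistics import fmean, pvariance

def is_accurate(values, v_ref, eps_acc):
    """|v_avg - v_ref| < eps_acc"""
    return abs(fmean(values) - v_ref) < eps_acc

def is_precise(values, eps_prec):
    """Population variance of the values below eps_prec."""
    return pvariance(values) < eps_prec

def timeliness(currency, volatility, s=1.0):
    """max(1 - currency/volatility, 0) ** s; the exponent s is an assumption."""
    return max(1.0 - currency / volatility, 0.0) ** s

# Hypothetical sensor readings around an expected value of 10.0
readings = [9.9, 10.1, 10.0, 10.0]
print(is_accurate(readings, v_ref=10.0, eps_acc=0.2))
print(is_precise(readings, eps_prec=0.01))
print(timeliness(currency=2.0, volatility=10.0))   # data still mostly fresh
print(timeliness(currency=12.0, volatility=10.0))  # data already expired
```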
BASIC PRINCIPLES OF A PROPOSED AGGREGATION ALGORITHM
Accuracy is represented by the window height
Values falling within the window can be considered similar enough to be fairly represented by their average
Values falling outside the window are outliers
Outliers can be occasional or consecutive: in any case outliers information must be preserved for further investigation
[Figure: measured values (x) plotted as v over t, with the band from vref − εacc to vref + εacc around vref]
CONSIDERED CASES
[Figure: panels (a) and (b), each plotting v over t with the band vref ± εacc. Labels: OSCILLATORY / BURSTY, EXPECTED TREND, SLOW CHANGE, window width W, window height H, OUTLIER]
Considering Z aggregate values and J outliers out of a set of N measures, the algorithm is considered efficient if the output is composed of (Z+J) values instead of N, where (Z+J) << N
ALGORITHM BANDWIDTH
Compressing data amounts to lowering the bandwidth of the measurement system
The window width determines the number of measured values which are aggregated
  1-point window: no compression, max bandwidth
The window width also determines the timeliness by which data are delivered to the base station
ALGORITHM INPUT/OUTPUT
INPUT PARAMETERS
  TIME SERIES $V = \langle v_1, v_2, \dots, v_n \rangle$
  EXPECTED VALUE $v_{ref}$
  ACCURACY TOLERANCE $\varepsilon_{acc}$
  PRECISION TOLERANCE $\varepsilon_{prec}$
  WINDOW WIDTH $N$
  CONTINUITY INTERVAL $C$
OUTPUT PARAMETERS
  AGGREGATE VALUES $T = \langle a_1,t_1 \rangle; \langle a_2,t_2 \rangle; \dots; \langle a_z,t_z \rangle$
  OUTLIERS $O = \langle o_1,t_1 \rangle; \langle o_2,t_2 \rangle; \dots; \langle o_j,t_j \rangle$
ALGORITHM COMPLEXITY: O(N)
ALGORITHM FOOTPRINT: 11 KB RAM; 1 KB ROM
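The window-based idea can be illustrated with a deliberately simplified sketch. It is not the algorithm of [4]: it ignores the precision tolerance and the continuity interval C, it assumes sample indices as timestamps, and it stamps each aggregate with the last index of its window. The function name `aggregate` and the sample series are hypothetical.

```python
def aggregate(values, v_ref, eps_acc, width):
    """Average in-window values per window; keep out-of-window values as outliers.

    Simplified illustration only: precision tolerance and continuity
    interval are not modelled, and timestamps are sample indices.
    """
    aggregates, outliers = [], []
    for start in range(0, len(values), width):
        window = values[start:start + width]
        inside = []
        for offset, v in enumerate(window):
            if abs(v - v_ref) <= eps_acc:
                inside.append(v)
            else:
                outliers.append((v, start + offset))  # preserve outlier and time
        if inside:
            # Assumption: the aggregate is stamped with the window's last index
            aggregates.append((sum(inside) / len(inside), start + len(window) - 1))
    return aggregates, outliers

# Hypothetical series: 8 samples, one spike at t = 3
series = [10.0, 10.2, 9.8, 15.0, 10.1, 9.9, 10.0, 10.0]
aggregates, outliers = aggregate(series, v_ref=10.0, eps_acc=0.5, width=4)
print(aggregates, outliers)
print(f"{len(aggregates) + len(outliers)} values transmitted instead of {len(series)}")
```

Each sample is visited once, which is consistent with the O(N) complexity stated on the slide.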
EXPERIMENTAL SET UP
[Figure: measurement circuit. Labels: +, −, R, i, v1, ΔV = v2, Z(t)]
100 Ω < Z_R(t) < 1000 Ω (measured)
R = 1 Ω
0 mV < ΔV < 30 mV
0 mA < i < 30 mA (data sheet)
R + Z_R ≈ Z_R
ALGORITHM BEHAVIOUR
WITH AGGREGATION: 7 TRANSMITTED VALUES, 30 mJ
WITHOUT AGGREGATION (BYPASS): 60 TRANSMITTED VALUES, 120 mJ
70% ENERGY SAVINGS
COMPARISON CRITERIA (1/2)
Two real-world data sets have been processed by using the proposed algorithm and two other aggregation algorithms:
  I. Lazaridis, S. Mehrotra, Capturing Sensor-Generated Time Series with Quality Guarantees, in: ICDE, 2003, pp. 429–439.
  T. Schoellhammer, E. Osterweil, B. Greenstein, M. Wimbrow, D. Estrin, Lightweight Temporal Compression of Microclimate Datasets, in: LCN, 2004, pp. 516–524.
COMPARISON CRITERIA (2/2)
The comparison among algorithms has been based on three main criteria:
  Compression rate: the degree to which data have been aggregated.
  Energy savings: the degree to which the aggregation allows sensors to save energy with respect to the case in which all the original values are sent to the base station.
  Correctness: the degree to which the aggregated data allow the base station to retrieve the original trend. Correctness has been evaluated by using the Mean Absolute Error (MAE) and the related Mean Absolute Percentage Error (MA%E).
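The two correctness measures follow their standard textbook definitions. A minimal sketch is given below; the function names and the three sample points are hypothetical, and how the base station reconstructs the series from the aggregates is outside its scope.

```python
def mae(original, reconstructed):
    """Mean Absolute Error between two equally long series."""
    return sum(abs(o - r) for o, r in zip(original, reconstructed)) / len(original)

def mape(original, reconstructed):
    """Mean Absolute Percentage Error; original values must be non-zero."""
    return 100.0 * sum(
        abs((o - r) / o) for o, r in zip(original, reconstructed)
    ) / len(original)

# Hypothetical original samples and the trend rebuilt at the base station
original = [10.0, 20.0, 40.0]
reconstructed = [11.0, 18.0, 40.0]
print(f"MAE = {mae(original, reconstructed):.2f}")
print(f"MA%E = {mape(original, reconstructed):.2f}%")
```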
DATA SET (A) RESULTS
[Figure: data set (A), C2N2 absorption spectrum. Output of the Cappiello and Schreiber algorithm, [V] (0,13 to 0,19) over [t] (0 to 160), with regions a, b, c marked]
DATA SET (A) RESULTS
[Figure: data set (A). Output of the Lazaridis et al. algorithm, [V] (0,13 to 0,19) over [t] (0 to 160), with regions a, b, c marked]
DATA SET (A) RESULTS
[Figure: data set (A). Output of the Schoellhammer et al. algorithm, [V] (0,13 to 0,19) over [t] (0 to 160), with regions a, b, c marked]
DATA SET (A) RESULTS
[Figure: data set (A), bar charts comparing [Authors], [Lazaridis et al.] and [Schoellhammer et al.]: Compression rate (axis 60% to 90%) and Energy Reduction (axis 0% to 60%)]
MAE in case of non linear trends
DATA SET (B) RESULTS
[Figure: data set (B), C2N2 absorption spectrum FM. Input dataset and output of the Cappiello and Schreiber algorithm, values from -8 to 6 over [t] (0 to 160)]
Systematic error due to the processing time shift
DATA SET (B) RESULTS
[Figure: data set (B). Input dataset and output of the Lazaridis et al. algorithm, values from -8 to 6 over [t] (0 to 160)]
DATA SET (B) RESULTS
[Figure: data set (B). Input dataset and output of the Schoellhammer et al. algorithm, values from -8 to 6 over [t] (0 to 160)]
DATA SET (B) RESULTS
[Figure: data set (B), bar charts comparing [Authors], [Lazaridis et al.] and [Schoellhammer et al.]: Compression rate (axis 65% to 95%) and Energy Savings (axis 0% to 60%)]
SUMMARY COMPARISON AND COMMENTS
No single algorithm is «the best»
Transmission procedures with packet-based protocols can affect the analysis
  Higher packing factors improve energy efficiency
  Higher transmission delays negatively affect timeliness
Adaptable procedures should be used on the basis of
  The peculiar features of the signals to be processed
  The quality requirements of the applications
Programs and Data
Philosophy without Science is empty, Science without Philosophy is blind
I. Kant
PARAPHRASE
Programs without Data are empty, Data without Programs are blind
F. A. Schreiber
SUMMARY AND CONCLUSIONS
Experiments on Databases and DBMSs for optimizing data structures and management including Data Quality
Data organization and management as a service to the experiments of the scientific community
Experimenting with the Database content itself (data mining)
Experimentation is both:
  a science, because it requires formal and rigorous methodologies, languages, and instruments
  an art, because it requires intuition, imagination, and … it gives emotions
BIBLIOGRAPHICAL REFERENCES
1. Babu S. et Al. – Automated Experiment-Driven Management of (Database) Systems – Proc. 12th HotOS, pp. 1 – 5, 2009
2. Boral H., DeWitt D. J. – A Methodology for Database Systems Performance Evaluation – SIGMOD Record, Vol. 14, n. 2, pp. 176-185, 1984
3. Brown D. et Al. – High energy nuclear database: a testbed for nuclear data information technology – Int. Conf. On nuclear data for Science and Technology, art. 250, 2007
4. Cappiello C., Schreiber F.A. - Experiments and analysis of quality and Energy-aware data aggregation approaches in WSNs - 10th Int. Workshop on Quality in Databases QDB 2012, Istanbul, Aug. 26, 2012, pp. 1- 8 http://www.purdue.edu/discoverypark/cyber/qdb2012/papers/7data%20aggregation.pdf
5. Curino C. et Al. – Schema Evolution in Wikipedia: Toward a Web Information System Benchmark – Proc. ICEIS, pp. 323 – 332, 2008
6. Curino C. et Al. – Graceful Database Schema Evolution: the PRISM Workbench – Proc. VLDB’08, pp. 761 – 772, 2008
7. Davcev D. et Al. – Experiments in Data Management for Wireless Sensor Networks – Proc. 2nd Int. Conf. on Sensor Technologies and Applications, pp. 198 – 202, 2008
8. Manolescu I. et Al. - The Repeatability Experiment of SIGMOD 2008 - SIGMOD Record, Vol. 37, n. 1, pp. 39 – 45, 2008
9. Marche S. – Measuring the stability of data models – European Journal of Information Systems, Vol.2, n.1, pp. 37 – 47, 1993
10. Masseroli M. - Management and Analysis of Genomic Functional and Phenotypic Controlled Annotations to Support Biomedical Investigation and Practice - IEEE Transactions on Information Technology in Biomedicine, Vol. 11, n. 4, pp. 376-385, 2007
11. Sjoberg D. I. – Quantifying schema evolution – Information and Software Technology, Vol. 35, n. 1, pp. 35 – 44, 1993
12. Stoeckert C. et Al. – Microarray databases: standards and ontologies – Nature genetics, Vol. 32, pp. 469 – 473, 2002
13. Szalay A., Gray J. – The world-wide telescope – Science, Vol. 293, pp. 2037 – 2040, 2001
14. Vanschoren J., Blockeel H. – Experiment Databases - In: Dzeroski S., Goethals B., Panov P. (Eds.), Inductive Databases and Queries: Constraint-based Data Mining, Chapt. 14, Springer, pp. 335 - 360, 2010
15. Schwartz B. – The four fundamental performance metrics – PERCONA, 2011 http://www.mysqlperformanceblog.com/2011/04/27/the-four-fundamental-performance-metrics/
16. http://www.tpc.org/information/benchmarks.asp