
European Organization for Nuclear Research

Capturing, storing and using time-series data for the world's largest scientific instrument

The LHC Logging Service

By Chris Roderick

UKOUG Conference 2006, Birmingham

14-11-2006 The LHC Logging Service - UKOUG Conference 2006 2

Outline

Introduction to CERN
The LHC Logging Project
LHC Logging Service Architecture
LHC Logging Service Design & Implementation
Database Techniques Used
Lessons Learned
Conclusion
Questions


CERN: European Organization for Nuclear Research

Founded in 1954 by 12 countries; today: 20 member states
CERN does pure scientific research into the laws of nature
Providing to >7000 users (particle physicists) from all over the world:

Accelerators accelerate particles to almost the speed of light
Detectors make the particles visible

The birthplace of the World Wide Web


CERN

[Map: the CERN Main site (CH) and CERN 2nd site (F), Geneva Airport, and the LHC, SPS and PS accelerator complexes straddling the border between France and Switzerland]


The Large Hadron Collider (LHC)

Most powerful instrument ever built to investigate particle properties
27 km circumference, 100 m underground
4 cathedral-size underground caverns hosting huge detectors
The highest energy of any accelerator in the world
The most intense beams of colliding particles
Operating at a temperature just above absolute zero



The LHC Logging Project

Need to know what's going on in and around the LHC

Log heterogeneous time-series data: cryogenics temperatures, magnetic field strengths, power dissipation, vacuum pressures, beam intensities and positions, etc.

Integration with other services (e.g. alarms):
Coherent data time-stamping
Consistent identification of the data

Oracle has a long history at CERN (since 1983, version 2.3)
Decision to go with the latest available Oracle technology:

Oracle 9i Database
Oracle Application Server
Oracle Enterprise Manager

Logging Project started in 2001
First operational implementation used in autumn 2003


The LHC Logging Project

Exponential increase in data volumes, expected to stabilise after the 1st year of LHC operation at ~5 TB per year



LHC Logging Service Architecture

[Architecture diagram: multiple data sources send XML over HTTP to a WRITE API hosted on Oracle 10g Application Server; the WRITE API loads the Logging DB, Measurement DB and Technical Services DB via JDBC and PL/SQL; data users (custom Java applications and generic tools) extract data through a READ API over HTTP and JDBC]

XML data sending: a SAX parser extracts the data as objects, and a Java API loads the data.
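The SAX-based extraction step can be sketched in Python's standard library. The XML payload format, element and attribute names below are invented for illustration; the actual CERN loader is written in Java against its own schema.

```python
# Illustrative SAX handler: pull (variable, timestamp, value) records out of
# a hypothetical time-series XML payload as the document is parsed.
import xml.sax

class RecordHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.records = []

    def startElement(self, name, attrs):
        # Each <record> element carries one time-series sample as attributes.
        if name == "record":
            self.records.append(
                (attrs["variable"], attrs["stamp"], float(attrs["value"]))
            )

def parse_records(xml_text):
    handler = RecordHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.records

payload = """<data>
  <record variable="TEMP1" stamp="2006-08-22 00:00:00" value="1.9"/>
  <record variable="TEMP1" stamp="2006-08-22 00:01:00" value="1.8"/>
</data>"""

print(parse_records(payload))
```

SAX is a natural fit here because the payloads are large batches of flat records: the handler emits objects as elements stream past, without building a DOM in memory.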

Application server: Oracle Application Server 10g

Oracle RAC Database host:
2 x SUN Fire V240, dual 1 GHz CPU, 4 GB RAM, dual internal disks, dual power supply
SUN StorEdge 3510FC disk array: 2-Gb fibre channel bus, dual RAID controllers, 1 GB cache
12 x 146 GB 10k rpm FC disks; 24 x 300 GB 10k rpm FC disks (2007)
SUN Solaris 9, Veritas Volume Manager 3.5
Oracle9i Enterprise Edition Release 9.2.0.7.0

Data rates from the various sources: 60,000,000, 40,000,000, 4,000,000 and 135,000 records per day



Design & Implementation

Metadata tables:
  Variables for which time-series data will be logged
  Users
  User-variable relations
  Variable groupings

Time-series data tables:
  Index-organized, range-partitioned (one tablespace per range)
  TIMESTAMP data type
  One table per data type; vector data stored as a VARRAY of NUMBER

Audit & error tables:
  Error definitions
  Error logging
  Administrator notifications
  Audit data: who did what, when, and how long it took


Partitioned Measurement Data

Time-series data tables in the Measurement Database are also implemented as range-partitioned IOTs, with benefits for administration and query performance
7 days of data are kept; a table may contain several billion rows
Data is organized in daily range partitions
A daily DBMS_JOB executes a procedure:
  adding a new partition
  dropping the oldest partition, using UPDATE GLOBAL INDEXES (avoids the need to rebuild the global table index as an offline operation)

[Diagram: range-partitioned IOT time-series data table shown as a sliding window of daily partitions, P-8 ... P+5 around today's partition P; each day the oldest partition is dropped and a new one added]
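The daily add/drop logic can be sketched as a small function that derives the DDL a maintenance job would issue. The function name, the exact VALUES LESS THAN syntax and the partition-bound convention are illustrative assumptions, not the production procedure; only the <alias>_PARTYYYYMMDD naming convention comes from the deck.

```python
from datetime import date, timedelta

def partition_maintenance(table, alias, today, keep_days=7):
    """Return the two ALTER TABLE statements a daily job would issue:
    add tomorrow's partition, drop the one falling out of the 7-day window."""
    new_day = today + timedelta(days=1)
    old_day = today - timedelta(days=keep_days)
    # Partitions follow the <alias>_PARTYYYYMMDD naming convention.
    add = (
        f"ALTER TABLE {table} ADD PARTITION {alias}_PART{new_day:%Y%m%d} "
        f"VALUES LESS THAN (TO_DATE('{new_day + timedelta(days=1)}', 'YYYY-MM-DD'))"
    )
    # UPDATE GLOBAL INDEXES keeps the global index usable during the drop.
    drop = (
        f"ALTER TABLE {table} DROP PARTITION {alias}_PART{old_day:%Y%m%d} "
        f"UPDATE GLOBAL INDEXES"
    )
    return add, drop

add_sql, drop_sql = partition_maintenance("data_numeric", "DN", date(2006, 11, 14))
print(add_sql)
print(drop_sql)
```

Run for 14-11-2006, this would add DN_PART20061115 and drop DN_PART20061107, matching the 7-day sliding window in the diagram.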


Automated Tasks

Check available storage (used space and partition creation)
Check for broken jobs
Gather CBO statistics
Calculate system usage statistics:
  No. of records logged, per user, per data type, during the previous day
  No. of repeating values

Counting repeating values per variable, using the LAG analytic function:

SELECT variable_id, COUNT(*) repeating_cnt
FROM ( SELECT variable_id, value,
              LAG(value) OVER (PARTITION BY variable_id
                               ORDER BY utc_stamp) last_value
       FROM data_numeric
       WHERE variable_id IN ( SELECT variable_id
                              FROM meta_variable
                              WHERE user_name = p_user_name
                              AND datatype_name = l_dn_datatype )
       AND utc_stamp >= p_start_stamp
       AND utc_stamp < p_end_stamp )
WHERE value = last_value
GROUP BY variable_id;

Counting all repeating numeric values:

SELECT COUNT(*) repeating_dn_cnt
FROM ( SELECT value,
              LAG(value) OVER (PARTITION BY variable_id
                               ORDER BY utc_stamp) last_value
       FROM data_numeric
       WHERE variable_id IN ( SELECT variable_id
                              FROM meta_variable
                              WHERE user_name = p_user_name
                              AND datatype_name = l_dn_datatype )
       AND utc_stamp >= p_start_stamp
       AND utc_stamp < p_end_stamp )
WHERE value = last_value;
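The LAG-based repeating-value count is plain window-function SQL, so it can be tried outside Oracle. The following sketch runs a simplified version of the query on Python's built-in sqlite3 (window functions need SQLite 3.25+); the sample data is invented and the metadata subquery is omitted for brevity.

```python
# Demonstrate the repeating-value statistic with LAG() on an in-memory table.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_numeric (variable_id INT, utc_stamp TEXT, value REAL)")
con.executemany("INSERT INTO data_numeric VALUES (?, ?, ?)", [
    (1, "2006-08-22 00:00:00", 1.9),
    (1, "2006-08-22 00:01:00", 1.9),   # repeat
    (1, "2006-08-22 00:02:00", 1.8),
    (2, "2006-08-22 00:00:00", 5.0),
    (2, "2006-08-22 00:01:00", 5.0),   # repeat
    (2, "2006-08-22 00:02:00", 5.0),   # repeat
])

# A row "repeats" when its value equals the previous value for that variable.
result = con.execute("""
    SELECT variable_id, COUNT(*) AS repeating_cnt
    FROM (
        SELECT variable_id, value,
               LAG(value) OVER (PARTITION BY variable_id
                                ORDER BY utc_stamp) AS last_value
        FROM data_numeric
    )
    WHERE value = last_value
    GROUP BY variable_id
""").fetchall()
print(result)   # [(1, 1), (2, 2)]
```

Note that the first row of each variable has a NULL LAG value and is therefore never counted as a repeat, which is exactly the desired behaviour.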


Data Extraction

Correlation of logged data from different systems is a key functional requirement

Dedicated Java Web applications based on the Apache Struts framework provide:
Time-series data statistics
Data extraction to file, in various formats
Data visualization in interactive charts


Time Scaling

Need to correlate asynchronous data from different systems

Scale-up: summary information derived from finer-granularity information
Scale-down: finer-granularity information derived from coarser-granularity information

Since 9i there is no built-in PL/SQL time scaling (formerly the TimeSeries cartridge)

Time-scaling functions implemented in PL/SQL using the analytic functions LAG and LEAD:
Scale-down: Repeat
Scale-down: Interpolate
Scale-up: MIN/MAX/AVG
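The scale-up variant is essentially "bucket the raw timestamps into fixed intervals and aggregate". A minimal sketch with sqlite3 (the table, variable ID and sample values are invented; Oracle would use TRUNC on a TIMESTAMP rather than strftime):

```python
# Scale-up to hourly granularity: one MIN/MAX/AVG summary row per hour bucket.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_numeric (variable_id INT, utc_stamp TEXT, value REAL)")
con.executemany("INSERT INTO data_numeric VALUES (?, ?, ?)", [
    (24288, "2006-08-22 00:10:00", 1.0),
    (24288, "2006-08-22 00:40:00", 3.0),
    (24288, "2006-08-22 01:15:00", 5.0),
])

result = con.execute("""
    SELECT strftime('%Y-%m-%d %H:00:00', utc_stamp) AS stamp,
           MIN(value) AS min_value,
           MAX(value) AS max_value,
           AVG(value) AS avg_value
    FROM data_numeric
    WHERE variable_id = 24288
    GROUP BY stamp
    ORDER BY stamp
""").fetchall()
print(result)
# [('2006-08-22 00:00:00', 1.0, 3.0, 2.0), ('2006-08-22 01:00:00', 5.0, 5.0, 5.0)]
```

Scale-down is the harder direction, since intermediate timestamps must be manufactured and values repeated or interpolated between the surrounding raw samples, which is where LAG and LEAD come in.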


Accessible via PL/SQL functions, returning a TABLE (user-defined SQL type) containing the time-scaled data

TABLE data is returned row by row using the PIPELINED clause (no need to wait for the complete set of time-scaled data to be derived)

Example SCALEUP function usage:

SELECT stamp, min_value, max_value, avg_value
FROM TABLE( TIME_SCALE.SCALEUP(
    24288,                  -- ID of variable for which data is to be scaled
    1,                      -- No. of intervals between generated timestamps
    'HOUR',                 -- Type of interval between generated timestamps
    '2006-08-22 00:00:00',  -- Desired start time for generated data
    '2006-08-22 23:59:59',  -- Desired end time for generated data
    'utc_stamp',            -- Name of column containing the raw timestamps
    'value',                -- Name of column containing the raw values
    'data_numeric',         -- Name of table containing time series data
    'variable_id'           -- Name of column containing variable ID
))
ORDER BY stamp;


Demo: time scaling in TIMBER


Exception Handling

Extremely important: deals with abnormal situations
System administrators are quickly made aware of problems
The event and conditions which caused a problem are captured
Allows APIs to behave in a predictable manner

The Logging Service uses an adaptation of the exception handling framework proposed by S. Feuerstein in Oracle Magazine (May - August 2003)

Deals with:
Exception definitions (name, description, severity)
Exception capturing (what, who, when, parameter values)
Exception treatment (log, mask, raise)
Administrator notifications (depending on severity)
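The capture-then-treat pattern can be sketched in Python as a decorator; everything here (names, the ERROR_LOG list standing in for the audit tables, the severity strings) is invented for illustration and is not the PL/SQL framework itself.

```python
# Sketch of the log / mask / raise treatments: every failure is recorded
# (what, where, parameter values), then either swallowed or re-raised.
import functools

ERROR_LOG = []   # stand-in for the error/audit tables

def handled(action="raise", severity="ERROR"):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                ERROR_LOG.append({
                    "where": func.__name__,
                    "what": repr(exc),
                    "params": (args, kwargs),
                    "severity": severity,
                })
                if action == "raise":
                    raise          # 'raise': record, then propagate
                return None        # 'mask': record, then continue
        return wrapper
    return decorator

@handled(action="mask", severity="WARNING")
def parse_value(text):
    return float(text)

print(parse_value("1.8"))      # 1.8
print(parse_value("broken"))   # None, but the failure is recorded
print(len(ERROR_LOG))          # 1
```

The point of the pattern is that callers see predictable behaviour (a value or None, or a deliberate re-raise) while administrators still get the full failure record.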



Database Techniques Used: MERGE

Used to simultaneously insert and update batches of metadata
Used for time-series data loading on ORA-00001 events
Combined with UPDATE triggers to capture client software or hardware problems without fatal exceptions

MERGE INTO meta_variables
USING ( SELECT :var_name vn, :var_desc vd, :var_units vu FROM dual )
ON ( variable_name = vn )
WHEN MATCHED THEN
  UPDATE SET description = vd, units = vu
WHEN NOT MATCHED THEN
  INSERT ( variable_id, variable_name, description, units )
  VALUES ( meta_variable_seq.nextval, vn, vd, vu );


Database Techniques Used: SQL Types

Used as tools for loading and querying data
Custom time-scaling functions return SQL table types
INSERT-SELECT statements based on data in SQL table types

CREATE TYPE table_of_varchar AS TABLE OF VARCHAR2(4000);

SELECT column_value || ' unregistered'
FROM TABLE( CAST( :p_var_names AS table_of_varchar ) ) v
WHERE column_value NOT IN (
    SELECT variable_name
    FROM meta_variables );


Database Techniques Used: Bulk PL/SQL

Used by the PL/SQL data loading API
BULK COLLECT for loading collections
FORALL for inserting data
Significant performance gains (depending on the LIMIT clause value)

PROCEDURE LOG_LOGGING_DN IS
  l_data logging_data_loading.data_numeric_tab;  -- Collection of logging data
  CURSOR c IS
    SELECT variable_name, utc_stamp, value
    FROM to_log_data_numeric;
BEGIN
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO l_data LIMIT 100;
    EXIT WHEN l_data.COUNT = 0;  -- note: c%NOTFOUND here would skip a partial last batch
    /* Send the data to the Logging DB 100 records at a time */
    logging_data_loading.LOG_NUMERIC_DATA(l_data);
  END LOOP;
  CLOSE c;
  ...


Database Techniques Used: INTERVAL Data Types

Used in PL/SQL, SQL queries, and DBMS_JOB scheduling
Add clarity to code
Allow arithmetic to be carried out on them, which simplifies establishing the time elapsed between consecutive client data loading sessions

CASE
  /* Log the value if it is outside the fixed logging interval */
  WHEN l_vars(i).last_logged
       + NUMTODSINTERVAL( l_vars(i).fixed_log_interval_size,
                          l_vars(i).fixed_logging_interval_type )
       < l_data(j).utc_stamp
  THEN
    INSERT INTO to_log_data_numeric (variable_name, utc_stamp, value)
    VALUES ( l_vars(i).variable_name,
             l_data(j).utc_stamp,
             ROUND(l_data(j).value, l_vars(i).rounding) );
  WHEN ...

DBMS_JOB.INTERVAL(:job_no, 'SYSDATE + 1/24/4');
vs.
DBMS_JOB.INTERVAL(:job_no, 'SYSDATE + INTERVAL ''15'' MINUTE');


Database Techniques Used: Multi-table Inserts

Used when inserting certain metadata
Remove the need for procedural logic and multiple INSERT statements in the data loading API

INSERT FIRST
  WHEN dtype IN ('NUMERIC', 'NUMSTATUS', 'TEXTUAL', 'VECTORNUMERIC') THEN
    INTO prof_variables (user_prof_id, variable_id)
    VALUES (profid, varid)
  WHEN dtype = 'FUNDAMENTAL' THEN
    INTO prof_fundamentals (user_prof_id, fundamental_id)
    VALUES (profid, varid)
SELECT :pid profid, :vid varid, :dt dtype FROM DUAL;


Database Techniques Used: Analytic Functions

Used for time scaling, auditing, data derivation, and filtering
Bring a new power to SQL: obtain results previously only possible using procedural code
Give better performance and reduce development time

Example: find old range partitions to be dropped, based on the convention of naming partitions as <table_alias>_PARTYYYYMMDD (e.g. DN_PART20061114):

SELECT table_name, partition_name
FROM ( SELECT partition_name, table_name,
              RANK() OVER ( PARTITION BY table_name
                            ORDER BY partition_name DESC ) p_rank
       FROM user_tab_partitions )
WHERE p_rank > p_max_parts
ORDER BY table_name, p_rank DESC;
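Because the naming convention sorts chronologically, RANK() over the descending partition names is all that is needed. The same query can be exercised against a mock of USER_TAB_PARTITIONS with Python's sqlite3 (window functions need SQLite 3.25+; the table contents are invented):

```python
# Find partitions older than the p_max_parts most recent ones, per table.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE user_tab_partitions (table_name TEXT, partition_name TEXT)")
con.executemany("INSERT INTO user_tab_partitions VALUES (?, ?)", [
    ("DATA_NUMERIC", "DN_PART20061112"),
    ("DATA_NUMERIC", "DN_PART20061113"),
    ("DATA_NUMERIC", "DN_PART20061114"),
])

p_max_parts = 2   # keep the two most recent partitions of each table
result = con.execute("""
    SELECT table_name, partition_name
    FROM (
        SELECT partition_name, table_name,
               RANK() OVER (PARTITION BY table_name
                            ORDER BY partition_name DESC) AS p_rank
        FROM user_tab_partitions
    )
    WHERE p_rank > :max_parts
    ORDER BY table_name, p_rank DESC
""", {"max_parts": p_max_parts}).fetchall()
print(result)   # [('DATA_NUMERIC', 'DN_PART20061112')]
```

Only DN_PART20061112 ranks below the two newest partitions, so it is the one flagged for dropping.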



Lessons Learned

Handling large data volumes: range-partitioned, index-organized tables
  Good query performance
  Simplified administration

Storing arrays of numeric data using VARRAY:
  Advantage of storing the data in its original order
  Not supported over database links (need for PL/SQL type conversion)
  Problematic for bulk data processing operations, where VARRAY size can vary greatly between records
  No native means of comparing the content of VARRAYs


Lessons Learned: Getting the data access right

Variety of data providers, with varying data volumes and rates; maintaining accurate CBO statistics is an ongoing challenge
Combinations of skewed data and bind-variable peeking
Temporary fix: forcing execution paths

Impedance mismatch between OO developers and database developers
Controlled database access via APIs is essential
Analytic functions simplify development and improve performance

Knowing what's going on, and keeping control:
Enterprise Manager Application Server Control 10g (9.0.4.3.0)
  Intuitive, reliable, fulfils requirements for OAS administration
  Functionality removed/added in the subsequent version
Client and system audit data is invaluable



Conclusion

The Logging Project delivers an operational Logging Service, used by an increasing number of clients
Vital to monitor client behaviour and overall service performance
The design has been validated, but the implementation continues to evolve in sync with Oracle technology
Scalability has been anticipated within the design, but additional hardware needs to be foreseen and plugged in to meet future demands
Database tuning can be improved, particularly the CBO statistics strategy
With hindsight: reconsider data storage using objects

Convinced we have followed the right path, we look forward to the full exploitation of the Oracle-powered logging service when CERN's flagship LHC produces its first scientific results.


Questions


Additional Slides


The LHC Logging Project: The need for time-series data logging

The LHC will be an installation of unprecedented complexity, with dense high-energy particle beams that can potentially damage the equipment of the machine itself. Hundreds of thousands of signals coming from equipment surveillance and beam observation will be monitored to carefully examine the behaviour of this machine. Cryogenics temperatures, magnetic field strengths, power dissipation, vacuum pressures, beam intensities and positions are all parameters that will influence the correct functioning and performance of the installation as a whole.

Each subsystem has its own data acquisition system to track parameter values over time. In order to be able to correlate this heterogeneous information, the data is "logged" in a central database, allowing end-users to:
Visualize and extract any data
Compare over time
Examine trends
Find patterns
Discover unforeseen parameter relationships

The need and value of such centralized logging was already demonstrated on previous occasions:
LEP Logging: data logging for CERN's previous flagship accelerator, the Large Electron-Positron Collider (LEP), operated from 1989 till 2000
LHC-String Logging: data logging from the prototype LHC magnet test stand, operated from 2001 till 2003

The LHC Logging project was launched in October 2001, and the first operational implementation was used in autumn 2003.


The LHC Logging Project: Initial system requirements

Two types of clients acting as data providers:
Equipment/hardware related: data is progressively and increasingly available before LHC start-up, as individual subsystems go through their hardware commissioning
Particle beam related: data will only become available when the LHC is fully commissioned, with the green light to receive the first particles from the injecting accelerator chain

Rough figures on data volumes: an exponential increase, foreseen to stabilise after the first year of beam operation at around 5 TB per year

Historic data of previous years will be kept on-line in the same data store, in order to facilitate data comparison over the lifetime of the LHC

Throughput and latency performance requirements could not be quantified at project start; the design had to anticipate scalability

Overall software environment: Java, J2EE
Supervisory control and data acquisition (SCADA) systems on top of programmable logic controllers (PLCs)
Heterogeneous data providers due to the specificities of the individual data acquisition systems


The LHC Logging Project: Integration with other projects

The logging service is not an isolated tool to monitor the performance of the LHC and its subsystems; its functionality needs to be combined with other services to get a more complete picture of the machine behaviour. The Logging service is designed to be a steady, continuous, long-term data capturing tool for a wide range of clients.

The following services have complementary functionality:
Post-Mortem service: collects and analyzes transient recordings, i.e. buffers of rapid data points (kilohertz range) from a set of crucial LHC subsystems, triggered by special events
Alarm service: centralizes notifications that indicate malfunctioning of any component, to inform control room operators and incite them to immediate action

Two prerequisites to allow a smooth integration of the services have been enforced:
Coherent time-stamping
Consistent identification of the data


Oracle Environment at CERN

CERN was one of the first European customers to purchase Oracle (1983, version 2.3)
Many operational systems developed with the Oracle database as a foundation
Oracle is the standard RDBMS at CERN
Database administration is managed by the CERN IT department
Decentralized software development teams are responsible for database & software design
Hardware and system software infrastructure is established by IT according to specific database service requirements
In general, only terminal releases of Oracle versions are deployed as operational production platforms


Oracle Environment for LHC Logging

Starting the logging project five years before the full-blown operational use of the service, the evolution of the technology needed to be taken into account. Strategic decisions were made from the outset: go for the latest technology, as soon as possible.

Database technology:
Design and implementation based on Oracle9i technology
Early 2007: migration to Oracle 10g Release 2, Enterprise Edition

Application servers:
Java is the application software platform for all new projects
An in-depth evaluation was made in 2003-2004 to compare J2EE deployment solutions
Final choice: standalone Oracle Container for Java (OC4J) or Oracle Application Server (OAS); the Logging project opted for OAS

Enterprise Manager:
Allows simple monitoring, diagnostics and repair of the deployed service
We anticipate that these actions will be partially performed by control room operators in the future
Only OAS instances are currently monitored; the complete service (including the database) could potentially be managed in the future


Development & Runtime Environments

Development tools:
JDeveloper 10.1.3 is used for all Java development and database modelling (replacing Oracle Designer)
Oracle SQL Developer is used for SQL and PL/SQL development

Three runtime environments, each containing the complete setup of: database schemas, data loading APIs, data reading APIs, and application server(s)

Development:
Developing and testing features to satisfy new requirements
Upgrading existing implementations to use new techniques or technologies

Test:
Identical to the production environment in terms of hardware and software deployment
Used by new clients of the logging service, allowing them to validate their software against our system
Opportunity to validate the behaviour of new clients, ensuring they conform to logging service requirements:
  Send data in batches
  Send data at appropriate frequencies
  Filter unchanged or uninteresting data
  Round numeric values to appropriate precisions
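The "filter unchanged data" and "round to appropriate precision" requirements can be sketched as a small client-side filter. The function and data below are invented for illustration; real clients implement this in their own acquisition software before sending.

```python
# Forward a reading only when its rounded value differs from the last value
# sent for that variable, so unchanged readings never reach the service.
def filter_unchanged(readings, precision=2):
    """readings: iterable of (variable, stamp, value).
    Yields only readings whose rounded value changed since the last one sent."""
    last_sent = {}
    for variable, stamp, value in readings:
        rounded = round(value, precision)
        if last_sent.get(variable) != rounded:
            last_sent[variable] = rounded
            yield variable, stamp, rounded

readings = [
    ("TEMP1", "00:00", 1.9012),
    ("TEMP1", "00:01", 1.9049),   # rounds to the same 1.90 -> filtered out
    ("TEMP1", "00:02", 1.8100),
]
print(list(filter_unchanged(readings)))
# [('TEMP1', '00:00', 1.9), ('TEMP1', '00:02', 1.81)]
```

Filtering at the client keeps noise out of the database entirely, which is why the repeating-value statistics gathered on the server side are used to spot clients that skip this step.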

Logged data is either deleted after an elapsed time, or transferred to the production environment

Production:
For clients who have passed validation in the TEST environment and wish to continue logging data, to be stored beyond the lifetime of the LHC machine


Diagnostics

Database: ViewDB (in-house tool):
Database activity
User sessions
Resource consumption

Application Server: Oracle Enterprise Manager Application Server Control:
Monitor application server usage and performance
Access server and application log files
Control application server components
Edit configuration files
Modify Web application runtime parameters in the various OC4Js

System usage: audit tables, populated by the various data loading APIs, database triggers, and automated procedures which analyse logged time-series data:
Monitor the evolution of system usage
Plan for new system resources in advance
Identify bad clients
Track system performance
Identify the causes of bursts of activity


Evolution and Future Direction

Scalability: difficult to estimate final throughput and volume rates
Not all subsystems are currently available
Fine details of the data pushing capabilities of each subsystem are not known yet
Rationalization of information (e.g. filtering, data reduction) is of low priority to data providers
Peak loads may occur unexpectedly from inexperienced clients

We can assume the eventual need to increase the resources of the service:
Number of application servers
Number of nodes in the database RAC cluster
Number of disks, and disk controllers
No fundamental changes to the overall system design or the database design are foreseen

Planned upgrades:
For the year 2007, an extra raw disk capacity of 7.2 TB will be installed

Unexplored possibilities:
Clustering of application server hosts at software or hardware level
Automatic load balancing between the database servers
Data replication on a redundant system to ensure 100% uptime
Application notification of data modifications, to avoid polling by external applications
Bringing in an Oracle consultant for system tuning