Web 2.0. Elder Matias, CLS, 09-04-28

Post on 19-Dec-2015

TRANSCRIPT

  • Slide 1
  • Web 2.0 Elder Matias CLS 09-04-28
  • Slide 2
  • What Is Web 2.0? In plain English … Automating tedious tasks using web technology. Tools to help people and software collaborate.
  • Slide 3
  • Scientific American, May 2008. Science 2.0: The Risk and Reward of Web-Based Research. "Our real mission isn't to publish journals but to facilitate scientific communication." Timo Hannay, Head of Web Publishing at Nature Publishing Group
  • Slide 4
  • ScienceStudio Elder Matias CLS 09-04-28
  • Slide 5
  • User Access to Synchrotrons. Who is the community that will use your platform? Synchrotrons are electron storage rings that emit high-intensity photons, which are used for experiments by a large scientific community (tens of thousands worldwide). Access is normally granted for single periods of 1-3 days in a half-year cycle. What couldn't your community do without the platform? Physical distance and episodic access prevent rapid scientific progress and limit scientific collaboration. Why was that a problem or limitation? Governments worldwide have invested >$2B in these facilities, yet the scientific outcomes could be optimised.
  • Slide 6
  • User Access to Synchrotrons. What middleware was needed to resolve the limitations? A workflow management engine for the User Office; a web portal for remote data access (during and post experiment); an Enterprise Service Bus and SOA to integrate internal and external data analysis services. How do your plans meet the needs? Users will have frequent remote access to the VESPERS beamline at the Canadian Light Source under conditions where many collaborators can participate in the experiment.
  • Slide 7
  • Science Studio serves three purposes: (1) management of all aspects of a scientific experiment, including data storage, collaboration with others, and processing of data; (2) control of, or interaction with, remote experiments on the CLSI VESPERS Beamline and UWO Nanofabrication Laboratory; and (3) user services (sample management, scheduling, peer review, user training).
  • Slide 8
  • Team: People and Orgs. Remote Control; User Services; System Deployment; Integration; System Architecture; System Requirements; Testing; Data Analysis/Grid Computing; User Office Software; Scientific Workflow Engines
  • Slide 9
  • Team: People and Orgs. Dionisio Medrano, Dylan Maxwell, Daron Chabot, Elder Matias, Chris Armstrong, John Haley, Mike Bauer, Stewart McIntyre, Marina Suominen Fuller, Jinhui Qin, Nathaniel Sherry, Yuhong Yan, Zahid Anwar, Ludeng (Eric) Zhao, Dan Ni, Yaofeng Xu
  • Slide 10
  • System Architecture (diagram labels: Web Application, Beamline Control Module, DB, SAN, JMS, CA, VESPERS, HTTP)
    1. VESPERS Beamline
    2. EPICS control system
    3. Beamline Control Module (BCM)
    4. Web Application
    5. Database
    6. File Storage
    7. Web Interface
  • Slide 11
  • VESPERS Beamline. VESPERS: Very Sensitive Elemental and Structural Probe Employing Radiation from a Synchrotron. A bending magnet beamline on sector 6 at the Canadian Light Source synchrotron in Saskatoon, Saskatchewan. A hard X-ray microprobe with an energy range of 6 to 30 keV. Techniques: X-Ray Fluorescence (XRF) and X-Ray Diffraction (XRD).
  • Slide 12
  • VESPERS Endstation (photo labels: CCD Detector (XRD), Microscope, MCA Detector (XRF), Sample)
  • Slide 13
  • EPICS Low-level Control System. EPICS: Experimental Physics and Industrial Control System. The standard control system at the CLS. EPICS consists of a network of Input/Output Controllers (IOCs), which are connected directly to devices. An IOC provides many Process Variables (PVs); each PV relates to either an input or an output of a device and has a unique name. Channel Access (CA) is used to read or write any PV without knowing which IOC provides it. There are more than 50,000 PVs in the CLS control system.
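The key idea on this slide is that a client addresses a PV by its unique name and never needs to know which IOC hosts it. The sketch below is a toy, in-memory model of that lookup only. It does not speak the real Channel Access protocol, and the class names, method names and PV names are all hypothetical.

```python
class IOC:
    """Hypothetical stand-in for an Input/Output Controller hosting named PVs."""

    def __init__(self, name, pvs):
        self.name = name
        self.pvs = dict(pvs)  # PV name -> current value


class ToyNetwork:
    """Resolves a PV name to whichever IOC hosts it (a toy, not real CA)."""

    def __init__(self, iocs):
        self.iocs = list(iocs)

    def _find(self, pv_name):
        # Assumption: PV names are unique, so the first IOC that has it wins.
        for ioc in self.iocs:
            if pv_name in ioc.pvs:
                return ioc
        raise KeyError(f"no IOC hosts PV {pv_name!r}")

    def read_pv(self, pv_name):
        return self._find(pv_name).pvs[pv_name]

    def write_pv(self, pv_name, value):
        self._find(pv_name).pvs[pv_name] = value


# Two IOCs, each providing different PVs (names invented for illustration).
network = ToyNetwork([
    IOC("ioc-a", {"DEMO:stage:x": 1.5}),
    IOC("ioc-b", {"DEMO:shutter": 0}),
])

# The caller uses only the PV name; the hosting IOC is never mentioned.
network.write_pv("DEMO:shutter", 1)
stage_x = network.read_pv("DEMO:stage:x")
```

A real client would use an EPICS Channel Access library rather than a dictionary lookup; the point here is only the name-based addressing.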
  • Slide 14
  • Beamline Control Module (BCM). The BCM provides a high-level interface to the low-level control system (EPICS). Logical and physical separation of business logic and control logic. Virtual device abstraction that provides independence from the low-level control system. Virtual devices can be logically organized into a device hierarchy. Basic devices can be combined to build more functional devices. Communication with external applications uses two message queues (ActiveMQ).
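The virtual device idea can be illustrated with a small sketch: basic devices wrap a low-level value, and a composite device is built from basic ones so that callers never touch the control system directly. This is an assumption-laden illustration, not the BCM's actual design; `Backend`, `Axis`, `SampleStage` and the key names are hypothetical.

```python
class Backend:
    """Stand-in for the low-level control system: a dict of named values."""

    def __init__(self, values):
        self.values = dict(values)

    def read(self, key):
        return self.values[key]

    def write(self, key, value):
        self.values[key] = value


class Axis:
    """Basic virtual device: one positionable axis backed by one low-level value."""

    def __init__(self, name, backend, key):
        self.name = name
        self._backend = backend
        self._key = key

    def get(self):
        return self._backend.read(self._key)

    def set(self, value):
        self._backend.write(self._key, value)


class SampleStage:
    """Composite virtual device built from two basic devices."""

    def __init__(self, name, x_axis, y_axis):
        self.name = name
        self.children = {"x": x_axis, "y": y_axis}  # simple device hierarchy

    def move_to(self, x, y):
        self.children["x"].set(x)
        self.children["y"].set(y)

    def position(self):
        return (self.children["x"].get(), self.children["y"].get())


backend = Backend({"hypo:stage:x": 0.0, "hypo:stage:y": 0.0})
stage = SampleStage(
    "stage",
    Axis("x", backend, "hypo:stage:x"),
    Axis("y", backend, "hypo:stage:y"),
)
stage.move_to(2.0, 3.5)
```

Swapping `Backend` for a different low-level system would leave `SampleStage` and its callers unchanged, which is the independence the slide refers to.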
  • Slide 15
  • Web Application. A J2EE Servlet application that provides a web-based interface to Science Studio. Tools: Spring (MVC), iBATIS (ORM), JSecurity (Apache Ki), Apache Tomcat. Divided into two parts: the Core application and the VESPERS beamline application. The Core application is responsible for providing access to the business objects. The VESPERS application is responsible for remote control of the VESPERS beamline.
  • Slide 16
  • Database. Metadata associated with the operation of a remote-controlled beamline and the organization of experimental data collected on that beamline. A project is the top-level organizational unit and is associated with a project team. A session defines a period of time allocated to a project team to conduct experiments. An experiment relates a sample and the technique being applied to that sample. A scan records the location of the acquired experimental data.
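The four entities described on this slide can be sketched as plain data classes. The slide does not say exactly how the entities are linked, so the nesting below (project, then session, then experiment, then scan) is an assumption, as are all field names and the helper `scan_locations`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scan:
    name: str
    data_location: str  # where the acquired experimental data is stored


@dataclass
class Experiment:
    name: str
    sample: str      # the sample being studied
    technique: str   # the technique applied to that sample, e.g. "XRF"
    scans: List[Scan] = field(default_factory=list)


@dataclass
class Session:
    name: str
    start: str  # period of time allocated to the project team
    end: str
    experiments: List[Experiment] = field(default_factory=list)


@dataclass
class Project:
    name: str
    team: List[str]  # members of the project team
    sessions: List[Session] = field(default_factory=list)


def scan_locations(project):
    """Collect the data location of every scan in a project."""
    return [
        scan.data_location
        for session in project.sessions
        for experiment in session.experiments
        for scan in experiment.scans
    ]


# Hypothetical records, for illustration only.
demo = Project("demo-project", ["A. User", "B. User"])
demo_session = Session("session-1", "2009-05-01", "2009-05-03")
demo_experiment = Experiment("exp-1", sample="sample-1", technique="XRF")
demo_experiment.scans.append(Scan("scan-1", "/hypothetical/path/scan-1"))
demo_session.experiments.append(demo_experiment)
demo.sessions.append(demo_session)
```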
  • Slide 17
  • Database Schema (tables): person, project_person, project_role, project, session, laboratory, sample, experiment, scan, technique, instrument, instrument_technique, facility
  • Slide 18
  • Experimental Data Storage. Experimental data is stored at the CLS. Common directory structure shared with other beamlines. A large data storage facility is now operational at the University of Saskatchewan as part of WestGrid.
  • Slide 19
  • VESPERS Web Interface. Rich web interface to Science Studio and the VESPERS beamline. Designed to be used over commodity broadband internet. Developed for the Firefox web browser without any additional plugins or extensions. Known to work with other browsers, but requires the Canvas HTML tag. AJAX is used for the VESPERS interface to provide device values in pseudo real time. ExtJS, a JavaScript framework, provides many advanced GUI elements.
  • Slide 20
  • Beamline Setup
  • Slide 21
  • Experiment Setup
  • Slide 22
  • XRF (X-Ray Fluorescence)
  • Slide 23
  • Beamline Hutch Cameras
  • Slide 24
  • Experimental Data Viewer
  • Slide 25
  • User Office Workflow. There are many tasks in proposal and sample management at CLS. Goal: to develop a workflow management system that manages the ordering of tasks (e.g. training before shipping) and tracks manual as well as SS task progression. Timeline diagram (Mar, 6-month cycle): CLS call for proposals; proposal submission to CLS; CLS gathers proposals; CLS reviews proposals; CLS grants scientist beamline time; scientist packs sample ("I wonder if CLS received my sample yet?"); scientist must complete online SS training; CLS health & safety inspection; many other tasks; perform experiment; return sample; take survey.
  • Slide 26
  • User Office Workflow Status (diagram labels: Workflow Management Engine, Beamline User, User Office, Task: Training, Completed, Notify, Approved, Notify, Record Progress). Features: open source; Petri-net based; direct support for workflow control-flow patterns; ability to interact with web services declared in WSDL; relies on XML standards (e.g. XPath and XQuery) for data and doesn't use proprietary languages. Architecture: the system core is the YAWL engine. The engine instantiates specifications that are designed using the YAWL designer and managed by the YAWL repository. The environment is composed of YAWL services; inspired by the web services paradigm, end-users, applications, and organizations are all services in YAWL.
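Since the engine is described as Petri-net based, a minimal sketch may help show how a Petri net enforces an ordering rule such as the "training before shipping" example from the previous slide. This is a generic token-firing model, not YAWL's specification format or API; the place and transition names are hypothetical.

```python
class PetriNet:
    """Minimal place/transition net with integer token counts."""

    def __init__(self, marking):
        self.marking = dict(marking)  # place -> number of tokens
        self.transitions = {}         # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (list(inputs), list(outputs))

    def enabled(self, name):
        # A transition is enabled when every input place holds enough tokens.
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) >= inputs.count(p) for p in inputs)

    def fire(self, name):
        if not self.enabled(name):
            raise RuntimeError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for place in inputs:
            self.marking[place] -= 1
        for place in outputs:
            self.marking[place] = self.marking.get(place, 0) + 1


# Shipping needs BOTH a trained user and a packed sample.
net = PetriNet({"needs_training": 1, "needs_packing": 1})
net.add_transition("complete_training", ["needs_training"], ["trained"])
net.add_transition("pack_sample", ["needs_packing"], ["packed"])
net.add_transition("ship_sample", ["trained", "packed"], ["shipped"])

net.fire("pack_sample")
can_ship_before_training = net.enabled("ship_sample")  # False: no "trained" token yet
net.fire("complete_training")
net.fire("ship_sample")
```

The ordering is not coded as an explicit check; it falls out of the token requirements on `ship_sample`.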
  • Slide 27
  • Screenshot: User Training Test Creation
  • Slide 28
  • Screenshot: User Survey Taking Page
  • Slide 29
  • Screenshot: User Survey Edit Page
  • Slide 30
  • Screenshot: Workflow Sample Management
  • Slide 31
  • Screenshot: Workflow Call for Proposals
  • Slide 32
  • User Office Workflow Example: Prototype Implementation
    1. CLS issues a call for proposals and gives a deadline
    2. Beamline users submit proposals
    3. User Office administrator ends registration or extends the deadline
    4. User Office administrator assigns proposals to User Office reviewers
    5. Reviewers look at proposals and rank them
    6. User Office looks at the ranking and chooses the proposals to accept
    7. Contact persons of accepted proposals are notified
    8. Beamline user completes training (web service)
    9. After training is completed (simulated by a delay), the CLS is notified
  • Slide 33
  • Scheduling Module. Goal: to automate the review process and the method by which beam time is allocated and scheduled to users, depending on the access mechanism chosen by the user and the stage of operation (construction, commissioning or operation) of the beamline. Side effects: facilitate the management of cycles, runs and modes of operation; use automatic scheduling to handle more scheduling conditions and constraints than human beings are able to handle manually, and identify optimal solutions.
  • Slide 34
  • Scheduling Module Features
    INPUT: users submit proposals.
    SEARCH AND CONSTRAINT SATISFIABILITY: integer programming and heuristic algorithm.
    OUTPUT: schedule.
    Beamlines: 2; Experiments: 3; Release Times: [1,1,2]; Deadlines: [8,15,5]; Weights: [4,5,1]; Processing Times: [10,4,3]; Eligibility: [[0,1,0],[1,0,1]]
    CONSTRAINTS
    1. One beamline per experiment
    2. Start time after release time
    3. Only eligible beamlines can be selected.
    7. No overlap of experiment per beamline
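The four constraints quoted on this slide can be checked mechanically against the example input. The sketch below is only a feasibility checker, not the integer programming or heuristic search the slide mentions. It assumes that `Eligibility[b][e]` means beamline `b` may run experiment `e` (the matrix has 2 rows and 3 columns), that "after release time" allows starting exactly at the release time, and that a schedule is a list of hypothetical `(beamline, start)` pairs, one per experiment. Deadlines and weights are not used because the slide text quotes no constraint or objective involving them.

```python
RELEASE = [1, 1, 2]
PROCESSING = [10, 4, 3]
ELIGIBILITY = [[0, 1, 0], [1, 0, 1]]  # assumed layout: ELIGIBILITY[beamline][experiment]


def violations(schedule):
    """Return a list of (constraint, detail) pairs for every violated constraint.

    schedule[e] = (beamline, start) for experiment e. Giving each experiment
    exactly one pair enforces constraint 1 (one beamline per experiment)
    by construction.
    """
    problems = []
    for e, (beamline, start) in enumerate(schedule):
        if start < RELEASE[e]:
            problems.append(("release", e))       # constraint 2
        if not ELIGIBILITY[beamline][e]:
            problems.append(("eligibility", e))   # constraint 3
    for i in range(len(schedule)):
        for j in range(i + 1, len(schedule)):
            (bi, si), (bj, sj) = schedule[i], schedule[j]
            # Half-open intervals [start, start + processing) on the same beamline.
            if bi == bj and si < sj + PROCESSING[j] and sj < si + PROCESSING[i]:
                problems.append(("overlap", (i, j)))  # constraint 7
    return problems


# Experiment 1 on beamline 0; experiments 2 then 0 back to back on beamline 1.
feasible = [(1, 5), (0, 1), (1, 2)]
# Experiment 0 placed on an ineligible beamline, too early, and overlapping experiment 1.
infeasible = [(0, 0), (0, 1), (1, 2)]
```

With this eligibility matrix the beamline assignment is forced, so only the order of experiments 0 and 2 on the second beamline is left to decide.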
  • Slide 35
  • X-Ray Fluorescence (XRF): Reveals Elemental Composition. Characteristic element lines selected and mapped over a 2D scan area: S: Kα; Cr: Kα & Cr: Kβ; Fe: Kα & Fe: Kβ; Ni: Kα & Ni: Kβ. 2D maps generated for selected elemental lines.
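One common way to turn per-point spectra into a 2D map for a selected line is to sum the counts in a channel window around that line at every scan point. The sketch below assumes this region-of-interest approach; the slide does not say how Science Studio builds its maps, and the spectra and channel windows here are invented for illustration.

```python
def roi_map(spectra, lo, hi):
    """Sum counts over channels lo..hi (inclusive) at every scan point.

    spectra[row][col] is a list of counts per detector channel.
    Returns a 2D list with the same row/column layout as the scan.
    """
    return [[sum(spectrum[lo:hi + 1]) for spectrum in row] for row in spectra]


# A hypothetical 2 x 2 scan with 6-channel spectra.
scan = [
    [[0, 5, 7, 1, 0, 0], [0, 1, 1, 0, 9, 8]],
    [[1, 2, 3, 0, 0, 1], [0, 0, 0, 0, 4, 4]],
]

# Two hypothetical channel windows standing in for two elemental lines.
map_line_a = roi_map(scan, 1, 2)
map_line_b = roi_map(scan, 4, 5)
```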
  • Slide 36
  • X-Ray Diffraction (XRD): Reveals Structural Information. Peak fitting and indexing of an image set to create a grain orientation map. Figure labels: Peak Search; Old IDL Programme Matched Peak; New C Programme Matched Peak; New C Programme Expected Peak; Indexing Process; Grain Orientations. The XRD indexing programme examines the locations of peaks in an image in order to determine the kind of lattice structure the sample's constituent atoms are arranged in. Shown here are the results of an older indexing programme, written in IDL, and the new indexing programme, written in C. The new indexing programme is proving to be more versatile and more reliable than the old programme, often indexing sets of data that the old programme failed with.
  • Slide 37
  • High Performance Computing Elder Matias CLS 09-04-28
  • Slide 38
  • Is this about making processors faster? Moore's Law has limited us. There are also other fundamental limits. We need to look at parallel computers.
  • Slide 39
  • What is High Performance Computing? Special-purpose machines, configured to solve complex problems. Usually multi-processor (tens to thousands). Requires parallel programming. Models: Grid (multiple inter-connected machines solving the same problem); Supercomputer (multi-processor with shared memory).
  • Slide 40
  • Limitation of Parallel Programming (Amdahl's Law and Gustafson's Law). The degree to which a problem can be expressed using a parallel algorithm will limit the speedup achieved on a multi-processor machine. Amdahl's Law: P = % parallelism; S = speedup (× sequential); N = number of processors.
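The formula itself did not survive in the transcript. In its standard form, Amdahl's Law is S = 1 / ((1 - P) + P / N), and Gustafson's Law (scaled speedup) is S = (1 - P) + P × N. The sketch below evaluates both; note that it takes P as a fraction between 0 and 1 rather than a percentage, and the function names are illustrative.

```python
def amdahl_speedup(p, n):
    """Amdahl's Law: speedup for a fixed-size problem.

    p: parallel fraction of the work (0..1); n: number of processors (>= 1).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


def gustafson_speedup(p, n):
    """Gustafson's Law: scaled speedup when the problem grows with n."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return (1.0 - p) + p * n


# With 90% of the work parallel, the serial 10% caps Amdahl speedup below 10x
# no matter how many processors are added.
speedup_10 = amdahl_speedup(0.9, 10)
speedup_1000 = amdahl_speedup(0.9, 1000)
```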
  • Slide 41
  • Examples: LHC. The LHC at CERN is an example of a grid application where no one country has sufficient processing capabilities. 15 million gigabytes of data per year. In 2006 the LHC Tier 1 Grid was tested. TRIUMF is the Canadian Tier 1 Centre for LHC experiments. (Courtesy TRIUMF)
  • Slide 42
  • How about in the synchrotron community? Many synchrotrons understand the need for HPC. Some CLS users make use of WestGrid for computation. The new WestGrid data storage facility is intended to support CLS experiments and is located on campus. UWO/ORNL/APS/CLS are working on a joint crystallography application. SharcNet using the Cell environment.
  • Slide 43
  • Diamond - Racks layout (Courtesy: Nick Rees, Diamond, Oct/08)
  • Slide 44
  • Diamond - Current situation (photo labels: Water pipes, Cable Tray). Courtesy: Nick Rees, Diamond, Oct/08
  • Slide 45
  • How do I get access to an HPC machine? Compute Canada is responsible for High Performance Computing in Canada. Each regional grid is a member of Compute Canada: ACEnet - Atlantic Canada; CLUMEQ - Quebec; SCINET - UofT; HPCVL - Queen's, Royal Military College, St. Lawrence, Carleton, Ottawa; RQCHP - Quebec; SHARCNET - Ontario; WESTGRID - Western Canada.
  • Slide 46
  • Grid Data Storage? UofS is the host for the new WestGrid data storage facility. Cost: $3.2 M. Includes on-line and archival storage. Two sites on campus. Photo: tape backup unit holding 6,000 tapes (1 TB each).
  • Slide 47
  • IBM Cell Processor (3.2 GHz)
  • Slide 48
  • Slide 49
  • ANISE Elder Matias CLS 09-04-28
  • Slide 50
  • ANISE: Active Network for Information from Synchrotron Experiments. "Active" means near-instantaneous stream processing of complex data during transfer to the user or to storage. Cell processing using InfoSphere Streams software from IBM and a lightpath provided by the CANARIE network. Distributed processing on facilities provided by SHARCNET and WESTGRID. Objective: develop such a network to provide processed results from experiments such as Laue diffraction at APS (34-ID) and VESPERS at CLS. The network would assist the integration of diffraction data from multiple and large-area detectors. The network would facilitate faster resolution of research problems and free up time for more users. The network would encourage common data formats and protocols, leading to closer collaboration.
  • Slide 51
  • ANISE: Active Network for Information from Synchrotron Experiments. Some project outcomes:
    1. Accessibility of Laue diffraction methods to a greater number and variety of users could be achieved by reducing the time required to accumulate meaningful data.
    2. The results of complex diffraction measurements involving a wider segment of angles could be assessed rapidly.
    3. Data and experiment management processes of Science Studio could enable very brief follow-up experiments to answer crucial questions sometime later.
    4. Distant collaborators could participate in, and learn from, experiments on samples of critical importance to a project.
    5. User support software could mean more rapid publication.
    6. Expansion to include APS and NSLS beamlines.
  • Slide 52
  • Slide 53
  • Slide 54
  • Slide 55
  • Slide 56
  • X-Ray Fluorescence (XRF): Reveals Elemental Composition. Characteristic element lines selected and mapped over a 2D scan area: S: Kα; Cr: Kα & Cr: Kβ; Fe: Kα & Fe: Kβ; Ni: Kα & Ni: Kβ. X-Ray Diffraction (XRD): Reveals Structural Information. Peak fitting and indexing of an image set to create a grain orientation map. The XRD indexing programme examines the locations of peaks in an image in order to determine the kind of lattice structure the sample's constituent atoms are arranged in. Shown here are the results of an older indexing programme, written in IDL, and the new indexing programme, written in C. The new indexing programme is proving to be more versatile and more reliable than the old programme, often indexing sets of data that the old programme failed with. Figure labels: Peak Search; Indexing Process; Grain Orientations; Apply to Entire Data Set; 2D Maps Generated for Selected Elemental Lines. VESPERS Beamline Experimental Setup (diagram labels: Sample, Beam, XRD Area Detector, XRF Output, XRD Output).