Bringing Data Science, Xinformatics and Semantic eScience into the Graduate Curriculum (solicited)

EGU2012-11224 (EOS 6/ ESSI2.3)

April 25, 2012, Vienna

Peter Fox (RPI) [email protected]

Tetherless World Constellation



tw.rpi.edu

Themes

• Future Web (Hendler)
– Web Science
– Policy
– Social

• Xinformatics (Fox)
– Data Science
– Semantic eScience
– Data Frameworks

• Semantic Foundations (McGuinness)
– Knowledge Provenance
– Ontology Engineering Environments
– Inference, Trust

Multiple depts/schools/programs ~ 35 (Post-doc, Staff, Grad, Ugrad)

Application Themes

• Govt. Data (Hendler/Erickson)
– Open
– Linked
– Apps

• Env. Informatics (Fox)
– Ecosystems
– Sea Ice
– Ocean imagery
– Carbon

• Health Care/Life Sciences (McGuinness/Luciano)
– Population Science
– Translational Med
– Health Records

Platforms: Bio-nano tech center; Exp. Media and Perf. Arts Ctr.; Comp. Ctr. Nano. Innov.

Data Intensive

http://tw.rpi.edu/web/Courses


[Figure: Data, Information, Knowledge diagram. Labels: Context; Presentation, Organization; Integration, Conversation; Creation, Gathering; Experience. Courses shown: Data Science, Xinformatics, Semantic eScience, Web Science.]

Also at RPI

• Data Science Research Center and Data Science Education Center

• http://www.rpi.edu/about/inside/issue/v4n17/datacenter.html
– Over 35 research faculty, 5 post-docs, ? grad students

• Data is one of the Rensselaer Plan’s five thrusts

• Other key faculty
– Fran Berman (VPR)
– Jim Myers (Director CCNI)

Curriculum

• Web Science and IT – undergrad, and MSc. and PhD. (with science concentrations)

• Environmental Science with Geoinformatics concentration

• Bio, geo, chem, astro, materials - informatics

• GIS for Science

• Master of Science – Data Science (pending)

• Multi-disciplinary science program (2012) PhD in Data and Web Science

E.g. IT with Env. Sci.

• ERTH-1200 Geology II (4 credits) - spring

• CHEM-2250 Organic Chemistry I (4 credits) - spring  

• ERTH-2210 Field Methods (2 credits) - fall

• IENV-1920 Environmental Seminar (2 credits) - spring

• BIOL-2120 Intro. to Cell and Molecular Biology (4 credits) - spring

• IENV-4500 Global Environmental Change (4 credits) - fall

• ERTH-4180 Environmental Geology (4 credits) – spring

• ERTH-4963 Xinformatics (4 credits) – spring

• IENV-4700 One Mile of the Hudson River (4 credits) - fall

Geoinformatics concentration

• CSCI1000 - Computer Science I

• CSCI1200 - Data Structures

• CSCI2300 - Introduction to Algorithms or ERTH 4750 - Geographic Information Systems in the Sciences

• CSCI4380 - Databases

• CSCI4961 - Data Science

• CSCI4960 - Xinformatics

• ERTH 4980 - Senior Thesis

Web Science Learning Objectives

• Students will demonstrate knowledge and be able to explain the three different "named" generations of the web (a/k/a Web 1.0, Web 2.0, and Web 3.0) from mathematical, engineering, and social perspectives

• Students will demonstrate the ability to use the dynamic programming language Python to develop programs relating to Web applications and the analysis of Web data.

• Students will be able to understand and analyze key Web applications including search engines and social networking sites.

• Students will be able to understand and explain the key aspects of Web architecture and why these are important to the continued functioning of the World Wide Web.

• Students will be able to analyze and explain how technical changes affect the social aspects of Web-based computing.

• Students will be able to develop "linked data" applications using Semantic Web technologies.
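As a rough illustration of the last objective, the sketch below shows what a minimal "linked data" artifact can look like in Python: a few subject/predicate/object statements written out in N-Triples syntax. The `example.org` dataset URIs and the helper name `to_ntriples` are hypothetical, and treating any string that starts with `http://` or `https://` as an IRI is a simplifying assumption, not a rule of the format.

```python
# Hypothetical sample data: the example.org URIs are placeholders.
dataset_triples = [
    ("https://example.org/dataset/1",
     "http://purl.org/dc/terms/title",
     "Sea ice extent"),
    ("https://example.org/dataset/1",
     "http://www.w3.org/2000/01/rdf-schema#seeAlso",
     "https://example.org/dataset/2"),
]


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples as N-Triples lines."""
    lines = []
    for subject, predicate, obj in triples:
        if obj.startswith(("http://", "https://")):
            # Assumption: objects that look like web addresses are IRIs (links).
            rendered = f"<{obj}>"
        else:
            # Everything else becomes a plain string literal with basic escaping.
            escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(dataset_triples))
```

The second statement is what makes the data "linked": its object is itself a URI that another application could dereference to find more statements.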

Data Science Objectives

• To instruct future scientists how to sustainably generate/collect and use data for their research as well as for others: data science.

• To instruct future technologists how to understand and support essential data and information needs of a wide variety of producers and consumers

• For both to know the tools and requirements needed to handle data and information properly

• Students will learn and be evaluated on the full data life-cycle and the relevant methods, technologies and best practices.


Learning Objectives

• Develop and demonstrate skill in data collection and management

• Know how to develop and apply data models and metadata models

• Demonstrate knowledge of data standards

• Develop and demonstrate the application of skill in data science tool use and evaluation

• Demonstrate the application of data life-cycle principles and data stewardship

• Demonstrate proficiency in data and information product generation

Xinformatics Objectives

• To instruct future information architects how to sustainably generate information models, designs and architectures

• To instruct future technologists how to understand and support essential data and information needs of a wide variety of producers and consumers

• For both to know the tools and requirements needed to handle data and information properly

• Students will learn and be evaluated on the underpinnings of informatics, including theoretical methods, technologies and best practices.


Learning Objectives

• Through class lectures, practical sessions, written and oral presentation assignments and projects, students should:

– Develop and demonstrate skill in development and management of multi-skilled teams in the application of informatics

– Demonstrate ability to develop conceptual and logical information models and explain them to non-experts

– Demonstrate knowledge and application of informatics standards

– Demonstrate skill in informatics tool use and evaluation


Modern informatics enables a new scale-free framework approach

Semantic eScience Objectives

• Ontology Development, Merging and Validation

• Semantic Language and Tool Use and Evaluation

• Use Case Development and Elaboration

• Semantic eScience Implementation and Evaluation via Use Cases

• Semantic Application Development and Demonstration

• Group Project and Team Development, Use Case Implementation and Evaluation

Discussion…

• Science and interdisciplinary from the start!

– Not a question of: do we train scientists to be technical/data people, or do we train technical people to learn the science

– It’s a skill/course level approach that is needed

• Education and research semi-coupled

• We must teach methodology and principles over technology *

• Data science must be a skill, and natural like using instruments, writing/using codes

• Team/collaboration aspects are key **

• Foundations and theory must be taught ***


Progression after progression

[Figure: layered progression. Layers: IT; Cyber Infrastructure; Cyber Informatics; Core Informatics; Science Informatics; Science, Societal Benefit Areas. Additional label: Informatics.]

Example:

• CI = OPeNDAP server running over HTTP/HTTPS

• Cyberinformatics = Data (product) and service ontologies, triple store

• Core informatics = Reasoning engine (Pellet), OWL

• Science (X) informatics = Use cases, science domain terms, concepts in an ontology
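To make the layering in this example concrete, here is a minimal Python sketch of how the four layers might interact. Everything in it is an assumption for illustration: the `ex:` terms, the placeholder endpoint URL, and the function names are invented, the "triple store" is just a Python set, and the "reasoning" is a simple transitive closure over subclass links, which is far less than what an OWL reasoner such as Pellet provides.

```python
SUBCLASS_OF = "rdfs:subClassOf"

# A toy in-memory "triple store". All ex: names are hypothetical.
triples = {
    # Science (X) informatics: domain terms and concepts
    ("ex:SeaSurfaceTemperature", SUBCLASS_OF, "ex:OceanParameter"),
    ("ex:OceanParameter", SUBCLASS_OF, "ex:PhysicalParameter"),
    # Cyberinformatics: data product and service descriptions
    ("ex:sst_product", "ex:measures", "ex:SeaSurfaceTemperature"),
    ("ex:sst_product", "ex:servedBy", "ex:opendap_service"),
    # Cyberinfrastructure: where the (placeholder) server lives
    ("ex:opendap_service", "ex:endpoint", "https://example.org/opendap/sst.nc"),
}


def objects(subject, predicate, graph):
    """Return all objects o such that (subject, predicate, o) is in graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def superclasses(cls, graph):
    """Core informatics stand-in: cls plus every class reachable via subClassOf."""
    seen = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for parent in objects(current, SUBCLASS_OF, graph):
            if parent not in seen:
                seen.add(parent)
                frontier.append(parent)
    return seen


def find_endpoints(parameter_class, graph):
    """Find service endpoints for products measuring parameter_class or a subclass."""
    endpoints = set()
    for product, predicate, measured in graph:
        if predicate != "ex:measures":
            continue
        if parameter_class not in superclasses(measured, graph):
            continue
        for service in objects(product, "ex:servedBy", graph):
            endpoints |= objects(service, "ex:endpoint", graph)
    return endpoints


if __name__ == "__main__":
    # A use case phrased in domain terms: "which servers hold physical parameters?"
    print(find_endpoints("ex:PhysicalParameter", triples))
```

The query is asked in science-domain vocabulary, answered by inference over the ontology, and resolved to a service endpoint only at the last step, which is the separation of concerns the four layers describe.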

Requirements