ICHASS Workshop SEASR

SEASR Introduction. High Performance Computing in the Humanities, Arts, and Social Science Workshop, UIUC/NCSA, July 28, 2008. Loretta Auvil, National Center for Supercomputing Applications, University of Illinois at Urbana Champaign

Upload: loretta-auvil

Post on 28-Nov-2014

TRANSCRIPT

Page 1: ICHASS Workshop Seasr

SEASR Introduction

High Performance Computing in the Humanities, Arts, and Social Science Workshop

UIUC/NCSA, July 28, 2008

Loretta Auvil

National Center for Supercomputing Applications, University of Illinois at Urbana Champaign

Page 2: ICHASS Workshop Seasr

SEASR Goals

This project will focus on developing, integrating, deploying, and sustaining a set of reusable and expandable software components and a supporting framework that will benefit a broad set of data mining applications for scholars in the humanities.

The key goals established for this effort are a set of software-centric directives:

–  Support the development of a state-of-the-art software environment for data management and analysis of digital libraries, repositories and archives, as well as educational platforms that are expected to contribute to many of the humanities breakthroughs of the 21st century.

–  Support the continued development, expansion, and maintenance of an end-to-end software system (user interfaces, workflow engines, data management, analysis and visualization tools, collaborative tools, and other software integrated into a complete environment) to bring the full power of data analytics to the scholars.

–  Support education and training for use of this software environment for analysis through workshops to promote its usage among scholars.

Page 3: ICHASS Workshop Seasr

Project Highlights

•  SEASR will employ a comprehensive environment that integrates two complementary and revolutionary technical advances, Service Oriented Architecture and Semantic Web, into a single computing architecture: Semantic Enabled Service Oriented Architecture

•  SEASR addresses the challenges of transforming information into knowledge by constructing the software bridges that are required to move from the unstructured and semi-structured data world to the structured data world

Page 4: ICHASS Workshop Seasr

What does this mean for the DH community?

SEASR will:

•  help scholars access existing large data stores more readily
•  provide scholars with enhanced data synthesis and query analysis
–  from focused data retrieval and data integration
–  to intelligent human-computer interactions for knowledge access
–  to semantic data enrichment
–  to entity and relationship discovery
–  to knowledge discovery and hypothesis generation
•  empower collaboration among scholars by enhancing and innovating virtual research environments

Page 5: ICHASS Workshop Seasr

Semantically Enabled SOA

Page 6: ICHASS Workshop Seasr

Semantically Enabled SOA 2

Page 7: ICHASS Workshop Seasr

Technical Components

•  High-Level Component Requirements
–  Hardware abstraction (virtualization)
–  Assets storage and curation
–  Task creation and definition (components)
–  Process description (flows)
–  Open services and standardized metadata exchange
–  Easy reach to a non-technical community (visual programming and interaction UIs)
–  Social interaction platform for researchers
–  NLP, machine learning, and understandable visualizations

Page 8: ICHASS Workshop Seasr

Technical Components

Technical architecture that emphasizes flexibility, scalability, and modularity, provides a community hub to heterogeneous systems, and reduces path dependence

•  Semantic-web driven architecture to standardize interoperability

•  Design for community building and to encourage sharing and participation

•  Data-intensive flows to move from a simple desktop to a large cluster transparently

•  Movable computation. Computation can be transparently shipped to the assets (complying with privacy issues)

•  Quick re-configurability (flows can be adapted and reused in seconds)

•  Build to reuse and cross-fertilization across domains

Page 9: ICHASS Workshop Seasr

SEASR Components

[Figure: SEASR component architecture diagram. Recoverable labels: Virtualization Infrastructure; Hadoop FS; Shared Stores; SOA Gateways; Meandre Infrastructure; Visualization; Metadata Stores; Component Repository; Component Discovery; Meandre Data-Intensive Flows; SEASR Apps; SEASR Services; SEASR Plugins; SEASR Web Apps; Analytics; Data Gateway Connections; Data Persistence; Data Transformation; Predictive Modeling; Discovery; Natural Lang Processing; Charting; Modeling Vis; Info Vis; Developer Tools]

Page 10: ICHASS Workshop Seasr

SEASR Apps: Community Hub

Page 11: ICHASS Workshop Seasr

More Community Hub

Page 12: ICHASS Workshop Seasr

Community Hub Implementation

Implementing Community Hub functionality as WordPress plugins

Page 13: ICHASS Workshop Seasr

More Community Hub

Page 14: ICHASS Workshop Seasr

Meandre Workbench Design

Page 15: ICHASS Workshop Seasr

Meandre Workbench

Page 16: ICHASS Workshop Seasr

SEASR Apps: Web App

•  Administration tool
–  Future: Add security levels

•  Job management control

•  User management/profile

•  Repository exploration

Page 17: ICHASS Workshop Seasr

Meandre Infrastructure

•  Component and Flow API
•  Repository
–  Future: Versioning of Components and Flows
•  Execution Engine
–  Future: Parallelism, checkpointing, fault tolerance, extend firing policy
•  Debugger/Monitor for flow execution
•  ZigZag
–  High level language for describing flows
–  Interpreter/compiler for executing the flows
–  Automatic parallelization at component level
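
The component-and-flow idea on this slide can be illustrated with a small sketch. This is not the Meandre API and not ZigZag syntax; the `Flow` class, its methods, and the component names are all hypothetical, and the sketch only handles a linear chain in which each component passes its output to one successor.

```python
from typing import Any, Callable, Dict, Optional


class Flow:
    """Toy linear dataflow (hypothetical, not the Meandre API)."""

    def __init__(self) -> None:
        self.components: Dict[str, Callable[[Any], Any]] = {}
        self.next_of: Dict[str, str] = {}

    def add(self, name: str, func: Callable[[Any], Any]) -> "Flow":
        # Register a component: a named function from one input to one output.
        self.components[name] = func
        return self

    def connect(self, source: str, target: str) -> "Flow":
        # Wire the output of `source` to the input of `target`.
        if source not in self.components or target not in self.components:
            raise KeyError("both components must be added before connecting")
        self.next_of[source] = target
        return self

    def run(self, start: str, value: Any) -> Any:
        # Execute components in connection order, starting at `start`.
        current: Optional[str] = start
        seen = set()
        while current is not None:
            if current in seen:
                raise ValueError("cycle detected in flow")
            seen.add(current)
            value = self.components[current](value)
            current = self.next_of.get(current)
        return value


# Describe a three-component flow, then execute it.
flow = (
    Flow()
    .add("lowercase", str.lower)
    .add("tokenize", str.split)
    .add("count", len)
    .connect("lowercase", "tokenize")
    .connect("tokenize", "count")
)
result = flow.run("lowercase", "Call me Ishmael")
print(result)  # 3
```

Keeping the flow description (which components exist and how they connect) separate from execution is what lets a real engine re-run, reconfigure, or parallelize the same description.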

Page 18: ICHASS Workshop Seasr

Meandre Infrastructure

•  Web Service Operations
–  Calls to the repository for flows and components
–  Current: REST
–  Future: SOAP enable

•  Web UI
–  Current: Components use web UI fragment (which pass HTML)
–  Future: Enable more complex visual components for landscape construction

Page 19: ICHASS Workshop Seasr

Component Repository

•  Meandre Repository
–  RDF descriptions for components and flows
–  Support for RDF on local file; web accessible files; JDBC enabled relational database (Derby) or a triple store
–  Support for rdf, ttl, nt formats
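
To make the "nt" (N-Triples) format concrete, here is a minimal sketch that writes a component description as N-Triples lines using only plain strings. The `example.org` IRIs and the `Component` class are assumptions for illustration and are not Meandre's actual vocabulary; only the `rdf:type` and Dublin Core `title` property IRIs are standard.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"


def iri(value: str) -> str:
    # N-Triples writes IRIs inside angle brackets.
    return f"<{value}>"


def literal(text: str) -> str:
    # Escape backslashes, quotes and newlines, then wrap in double quotes.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def to_ntriples(triples: Iterable[Triple]) -> str:
    # One triple per line, each terminated by " ." as the format requires.
    return "".join(f"{s} {p} {o} .\n" for s, p, o in triples)


# Hypothetical component IRI and class, for illustration only.
component = iri("http://example.org/component/tokenizer")
description = [
    (component, iri(RDF_TYPE), iri("http://example.org/vocab#Component")),
    (component, iri(DC_TITLE), literal("Tokenizer")),
]
document = to_ntriples(description)
print(document, end="")
```

Because each line is a complete statement, N-Triples files can be concatenated or streamed line by line, which suits the file-based and web-accessible storage options listed above.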

Page 20: ICHASS Workshop Seasr

SEASR Components: NLP

•  Syntactic analysis
–  Tokenization
–  POS tagging
–  Shallow parsing
–  Custom literary tagging

•  Semantic analysis
–  Named Entity tagging
–  Semantic Category (unnamed entity) tagging
–  Co-reference resolution
–  Ontological association (WordNet, VerbNet)
–  Semantic Role analysis
–  Concept-Relation extraction
–  Logical analysis
–  Event sequence inference
–  Event causal inference

•  Topic Filtering
–  by topic
–  by time period
–  by location
–  etc.

•  Semantic network
–  Extract predicate-argument & other triples
–  Convert to RDF triples
–  Add triples to RDF store
–  Pose structured queries
–  Graph-based inference

•  Exploration, Discovery and Knowledge Extraction
–  Query-based: question answering
–  Visual: navigation
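
The semantic network steps (add triples to a store, then pose structured queries) can be sketched with a tiny in-memory store. The `TripleStore` class is hypothetical and stands in for a real RDF store; the triples are hand-written and assumed to come from an upstream extraction component that is not shown.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """Minimal in-memory store of (subject, predicate, object) triples."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def query(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        # None acts as a wildcard; results are sorted for stable output.
        return sorted(
            t
            for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
# Assumed output of a predicate-argument extraction step (illustrative data).
for s, p, o in [
    ("Ahab", "hunts", "Moby Dick"),
    ("Ishmael", "narrates", "Moby Dick"),
    ("Ahab", "commands", "Pequod"),
]:
    store.add(s, p, o)

about_ahab = store.query(subject="Ahab")
involving_whale = store.query(obj="Moby Dick")
print(about_ahab)
print(involving_whale)
```

A real deployment would use an RDF store with a query language such as SPARQL; the wildcard pattern match here is only the simplest form of a structured query.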

Page 21: ICHASS Workshop Seasr

SEASR Component: Machine Learning

•  Data Transformation
–  Feature extraction and construction
–  Boosting and Bagging

•  Unsupervised Learning
–  Clustering, SOMs
–  Hypothesis Generation

•  Supervised Learning
–  Traditional Statistical Linear Methods
–  Bayesian, Support Vector Machines, Decision Trees
–  Ensemble Models

•  Optimization Approaches
–  GAs
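
As one illustration of the clustering item under unsupervised learning, the following is a minimal sketch of k-means on one-dimensional values. It is a generic textbook version, not SEASR's implementation; the function names, the sample values, and the fixed starting centers are assumptions chosen to keep the example deterministic.

```python
from typing import List, Sequence, Tuple


def assign(points: Sequence[float], centers: Sequence[float]) -> List[List[float]]:
    # Put each point in the cluster of its nearest center.
    clusters: List[List[float]] = [[] for _ in centers]
    for x in points:
        nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
        clusters[nearest].append(x)
    return clusters


def kmeans_1d(
    points: Sequence[float], initial_centers: Sequence[float], max_iterations: int = 100
) -> Tuple[List[float], List[List[float]]]:
    centers = list(initial_centers)
    for _ in range(max_iterations):
        clusters = assign(points, centers)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        updated = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if updated == centers:
            break  # converged: assignments can no longer change
        centers = updated
    return centers, assign(points, centers)


# Illustrative values only, e.g. some per-document feature.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(values, initial_centers=[0.0, 5.0])
print(centers)   # [2.0, 11.0]
print(clusters)  # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

Results depend on the starting centers, so practical implementations usually try several initializations and keep the best clustering.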

Page 22: ICHASS Workshop Seasr

Developers: Eclipse Plugin

Page 23: ICHASS Workshop Seasr

SEASR @ Work - MONK

Page 24: ICHASS Workshop Seasr

SEASR @ Work - NEMA