Enroller Colloquium: Sulman Sarwar

Upload: johanna-green

Post on 30-Jun-2015

DESCRIPTION

Introduction to the Enroller Portal by Sulman Sarwar, University of Glasgow

TRANSCRIPT

Page 1: Enroller Colloquium: Sulman Sarwar

ENROLLER - An e-Research infrastructure for humanities researchers

Sulman Sarwar

Research Associate

National e-Science Center

University of Glasgow

[email protected]

Page 2: Enroller Colloquium: Sulman Sarwar

OUTLINE

Introduction

Current Work

DEMO

Future Work

Conclusions

Page 3: Enroller Colloquium: Sulman Sarwar

Introduction - ENROLLER

ENROLLER - An Enhanced Repository for Language and Literature Researchers

JISC funded project (2009 - 2011)

National e-Science Center and University of Glasgow English Department

Data sets participating in the project

• OED (Oxford English Dictionary)

• HTE (Historical Thesaurus of English)

• DSL (Dictionary of the Scots Language)

• SCOTS (Scottish Corpus of Texts and Speech)

• CMSW (Corpus of Modern Scottish Writing)

• NECTE (Newcastle Electronic Corpus of Tyneside English)

Page 4: Enroller Colloquium: Sulman Sarwar

Objectives

• To develop an interactive research infrastructure providing seamless access to participating data collections

• A well-designed, easy-to-use search system with access to digital sound, video and textual data

• Develop tools for linguistic analysis (such as concordancing, collocation and frequency analysis)

• Seamless, secure access to licensed data through automatically enforced access and usage policies

• Support for addition of new data collections

• Building large-scale data indexes for searching, exploiting HPC and e-Science facilities (such as ScotGrid and NGS)
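The analysis tools named in the objectives can be illustrated with a small sketch. The code below is not ENROLLER's implementation; it is a minimal illustration of frequency and collocation counting, and the function names, the sample sentence and the window size are assumptions made for the example.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def frequencies(tokens):
    """Frequency analysis: count how often each token occurs."""
    return Counter(tokens)

def collocates(tokens, node, window=2):
    """Collocation: count words found within `window` tokens of each `node` occurrence."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts

# Invented sample sentence, used only to exercise the functions.
sample = "the timid cat sat by the fire and the timid dog sat by the door"
tokens = tokenize(sample)
freq = frequencies(tokens)
coll = collocates(tokens, "timid", window=1)

print(freq.most_common(3))
print(coll.most_common())
```

With a window of 1, only the immediate neighbours of "timid" are counted; a wider window trades precision for recall.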

Page 5: Enroller Colloquium: Sulman Sarwar

Current Work

Simple Searches

Data resides on local (to the institution) servers

Input: a file containing word(s) or phrases

Output: to Display, Save to file.

Cross-collection Searches

Search same word(s) in multiple collections

Page 6: Enroller Colloquium: Sulman Sarwar

Current Work ...

Bulk Searches (over NGS or ScotGrid)

Data resides on NGS

Input: Word(s), Phrases

Output: to Display, Save as file

Execute Workflows (over NGS or ScotGrid)

Data, Input and Output same as above.

Page 7: Enroller Colloquium: Sulman Sarwar

Workflow Example

• Input a word OR upload a file containing the words/phrases (terms) to be searched

• Search the terms in thesaurus (for example: HTE)

• Search the results from thesaurus-search in Scottish Corpus

• Find concordances for each of the words

• Display the thesaurus-search results

• Display the corpus-search results

• Display the concordances

• Save/Download the results

Example search term: Timid

HTE thesaurus results (51 entries):

{acolmod, egeful, (ge)forht, forhtful, forhtiendlic, forhtig, forhtmod, herebleaþ, ungedyrstig, unþriste, blethe<bleaþ, fey<fæge, unbold<unbeald, unbold<unbeald, argh<earg, frightful, feared, ferdy, fearful, ferdful, g(h)astful, trembling, timorous, cremetous, cremeuse, craintive, sheepish, meticulous, timid, tremebund, awful, soft, pale, timorsome, tremulous, pigeon-hearted, affrightful, formidolous, pavid, timidous, unsupported, tender-nosed, scary, pippin-hearted, kitten-hearted, funky, tender-footed, fearsome, misventurous, scare, cotton-wool}

Concordances for "timid":

talk chapped knocked blate bashful timid rax stretch galluses braces yont

defend inverewe handkerchief trees wave timid surrender while rhododendrons hurl defiance

1 lawrence wrote of his timid typist nelly morrison dirty bitch

through you i see the timid indeterminate puzzled soul behind that

together in one corner a timid scrum we tried to celebrity

bit rupert wis aye a timid cat it jist hatit e

wis a tom it wis timid i min fine on it
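The workflow steps can be sketched end to end as follows. This is an illustration only, not the ENROLLER workflow engine: the in-memory `THESAURUS` and `CORPUS` are stand-ins built from words and fragments shown on this slide, the function names are hypothetical, and the five-word context window is an assumption.

```python
import re

# Stand-ins for the real resources; HTE and SCOTS are not queried here.
THESAURUS = {"timid": ["timid", "fearful", "sheepish"]}
CORPUS = [
    "bit rupert wis aye a timid cat it jist hatit e",
    "talk chapped knocked blate bashful timid rax stretch galluses braces yont",
]

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def thesaurus_search(term):
    """Return the words listed for `term`, or just the term if it is unknown."""
    return THESAURUS.get(term.lower(), [term.lower()])

def concordances(word, documents, window=5):
    """Return keyword-in-context lines with up to `window` words on each side."""
    lines = []
    for doc in documents:
        tokens = tokenize(doc)
        for i, tok in enumerate(tokens):
            if tok == word:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                lines.append(f"{left} [{word}] {right}".strip())
    return lines

def run_workflow(term):
    """Thesaurus search first, then corpus search and concordancing per result."""
    synonyms = thesaurus_search(term)
    hits = {w: concordances(w, CORPUS) for w in synonyms}
    return synonyms, hits

synonyms, hits = run_workflow("Timid")
for word in synonyms:
    for line in hits[word]:
        print(line)
```

Saving the results would be one further step, for example writing the printed lines to a file.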

Page 8: Enroller Colloquium: Sulman Sarwar

Typical Interaction Flow

1. User points browser at Grid resource/portal

2. Shibboleth redirects user to W.A.Y.F. service

3. User selects their home institution

4. Home site authenticates user and pushes attributes to the service provider

5. Pass authentication info and attributes to authZ function

[Diagram labels: Home Institution, Identity Provider, LDAP, AuthN, AuthZ, Federation, Shibb Frontend, Portal, DB, Service Provider, Web Services, Grid Services, OED, SCOTS, NECTE, HTE, NGS, Map, Reduce, Data, Uni. of Glasgow, Uni. of Newcastle, OUP, Results Aggregator]
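Step 5 of the interaction flow, where attributes released by the home institution reach an authorisation function, might look roughly like the sketch below. The policy table, the attribute name `affiliation` and its values are hypothetical and do not describe ENROLLER's actual access rules or any specific Shibboleth attribute.

```python
# Hypothetical usage policy: which affiliations may read each collection.
POLICY = {
    "OED": {"staff", "student"},             # assumed to be licensed
    "HTE": {"staff", "student"},             # assumed to be licensed
    "SCOTS": {"staff", "student", "guest"},  # assumed to be openly readable
}

def authorise(attributes, collection):
    """Return True if any affiliation asserted for the user is allowed."""
    allowed = POLICY.get(collection, set())
    return bool(allowed & set(attributes.get("affiliation", [])))

def accessible_collections(attributes):
    """List the collections the user may search, in alphabetical order."""
    return sorted(c for c in POLICY if authorise(attributes, c))

guest = {"user": "visitor", "affiliation": ["guest"]}
staff = {"user": "researcher", "affiliation": ["staff"]}

print(accessible_collections(guest))
print(accessible_collections(staff))
```

Keeping the policy in data rather than in code is one way to get the automatically enforced access described in the objectives: the portal consults the table before dispatching any search.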

Page 9: Enroller Colloquium: Sulman Sarwar

Using NGS/ScotGrid

#!/bin/bash
# Launch scots-app on the cluster; the first argument is passed on to the Java main class.
echo "Starting application: scots-app"
echo " submitting to job-manager at: "
echo "$(/bin/hostname -f)"
echo " with arguments to main "
echo "$*"
cd /home/ngs0273/javaprog/scots-app || exit 1
# Properties file and thread count handed to the application
props=/home/ngs0273/javaprog/scots-app/src/main/resources/project_ngs.properties
echo "properties file: $props"
nthreads=4
echo "# threads= $nthreads"
/usr/local/Cluster-Apps/java-1.6.0_03/bin/java -cp target/scots-app-1.0-SNAPSHOT.jar scots.app.App "$1" "$props" "$nthreads"

[Diagram labels: Job Submission Client, Job Submission Script, Web Services, Grid Services, Head Node, Job Manager, CE-1, CE-2, CE-3, CE-4, CE-N, DATA, MapReduce Application, Output]
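The slide names a MapReduce application but does not show its internals. As a rough illustration of the pattern, the sketch below counts search terms per data partition (map) and merges the partial counts (reduce). It uses local threads in place of compute elements, and every name and sample line in it is an assumption made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def map_partition(args):
    """Map step: count occurrences of each search term in one data partition."""
    partition, terms = args
    counts = Counter()
    for line in partition:
        for token in line.lower().split():
            if token in terms:
                counts[token] += 1
    return counts

def reduce_counts(partials):
    """Reduce step: merge the per-partition counts into a single result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

def bulk_search(partitions, terms, nthreads=4):
    """Run the map step on a pool of `nthreads` workers, then reduce."""
    wanted = {t.lower() for t in terms}
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        partials = list(pool.map(map_partition, [(p, wanted) for p in partitions]))
    return reduce_counts(partials)

# Invented sample partitions; real data would live on the cluster.
partitions = [
    ["a timid cat", "the timid typist"],
    ["blate and bashful", "a timid scrum"],
]
result = bulk_search(partitions, ["timid", "blate"])
print(dict(result))
```

On a grid, each partition would be processed by a separate job on a compute element, and the reduce step would correspond to aggregating the outputs those jobs return.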

Page 10: Enroller Colloquium: Sulman Sarwar

DEMO

ENROLLER Search

ENROLLER Advanced Search

Page 11: Enroller Colloquium: Sulman Sarwar

Types of Searches

Simple Word (single/multiple) Searches

Free text queries / phrase searches

Wild-card searches (can* , t?ll)

Fuzzy searches - to search for a term similar in spelling (roam~ : foam, roams)

Field searches (title: BBC)
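The query syntax on this slide resembles that of Apache Lucene's query parser, although the slides do not name the search engine. The sketch below shows one plausible reading of wildcard and fuzzy matching over a small word list. It is not the portal's code: the vocabulary is invented, and the one-edit threshold for fuzzy matches is an assumption.

```python
from fnmatch import fnmatchcase

def wildcard_match(pattern, vocabulary):
    """Match words where * stands for any run of characters and ? for exactly one."""
    return [w for w in vocabulary if fnmatchcase(w, pattern)]

def edit_distance(a, b):
    """Levenshtein distance, computed one row of the table at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def fuzzy_match(term, vocabulary, max_edits=1):
    """Return other words within `max_edits` single-character edits of `term`."""
    return [w for w in vocabulary if w != term and edit_distance(term, w) <= max_edits]

VOCAB = ["can", "canny", "cant", "tall", "tell", "till", "trill",
         "roam", "foam", "roams", "rhyme"]

print(wildcard_match("can*", VOCAB))   # ['can', 'canny', 'cant']
print(wildcard_match("t?ll", VOCAB))   # ['tall', 'tell', 'till']
print(fuzzy_match("roam", VOCAB))      # ['foam', 'roams']
```

A production engine would match against an index instead of scanning a list, but the matching rules themselves are the same idea.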

Page 12: Enroller Colloquium: Sulman Sarwar

Types of Searches

Term boosting - to control the relevance (salmon^4 reid)

Boolean Searches ("ayr" AND "scotland", "ayr" -"scotland", "ayr" OR "edinburgh". Likewise + and NOT operators)

Grouping - ((ayr OR glasgow) AND BBC)
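Boolean operators and grouping map naturally onto set operations over an inverted index, and boosting onto a weighted score. The sketch below illustrates that reading with invented documents. The scoring function is a deliberately simple weighted term count, not the relevance formula of any particular search engine.

```python
# Invented sample documents, keyed by document id.
DOCS = {
    1: "bbc report from ayr scotland",
    2: "glasgow news from the bbc",
    3: "edinburgh festival in scotland",
    4: "ayr races",
}

def build_index(docs):
    """Build an inverted index mapping each token to the ids of documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index.setdefault(token, set()).add(doc_id)
    return index

INDEX = build_index(DOCS)

def term(word):
    """Look up the set of documents containing `word` (case-insensitive)."""
    return INDEX.get(word.lower(), set())

both = term("ayr") & term("scotland")                     # "ayr" AND "scotland"
ayr_only = term("ayr") - term("scotland")                 # "ayr" -"scotland"
either = term("ayr") | term("edinburgh")                  # "ayr" OR "edinburgh"
grouped = (term("ayr") | term("glasgow")) & term("BBC")   # (ayr OR glasgow) AND BBC

def score(text, boosts):
    """Weighted term count: each occurrence of a word adds its boost to the score."""
    tokens = text.lower().split()
    return sum(boost * tokens.count(word) for word, boost in boosts.items())

# salmon^4 reid: a match on "salmon" counts four times as much as one on "reid".
boosts = {"salmon": 4, "reid": 1}
print(both, ayr_only, either, grouped)
print(score("reid caught a salmon", boosts), score("reid wrote a letter", boosts))
```

Parentheses in the query become parentheses in the set expression, which is why grouping changes the result whenever AND and OR are mixed.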

Page 13: Enroller Colloquium: Sulman Sarwar

Future Work

• Cross-collection searches

• Development of Language Analysis Tools

• Addition of new data collections

• Addition of UI features in portal for better user experience

• Working towards the development of a VRE for the Language and Literature community

Page 14: Enroller Colloquium: Sulman Sarwar

• Thank you.

• Questions?

THE END.