September 6, 2013 A HUBzero Extension for Automated Tagging Jim Mullen Advanced Biomedical IT Core Indiana University


September 6, 2013

A HUBzero Extension for Automated Tagging

Jim Mullen

Advanced Biomedical IT Core

Indiana University

My Work on Extension

I implemented the automated tagging extension, but others came up with the idea and contributed to the design, including Bill Barnett, Michael Grobe and Anurag Shankar.

Automated Tagging Extension

Goal. Support automated tagging of Indiana CTSI (Clinical and Translational Sciences Institute) Hub (http://indianactsi.org) pages using the NCBO (National Center for Biomedical Ontology) Annotator.

Motivation. Tagging (assigning terms from a controlled vocabulary/ontology to pages) can be very helpful for site search and navigation, but manual tagging is expensive.

NCBO Annotator

A web site that includes web services for annotating text using various controlled vocabularies and ontologies, such as SNOMED and MeSH (Medical Subject Headings).

NCBO Annotator Example

Text: “Gene therapy vectors based on murine retroviruses have now been in clinical trials for over 20 years. During that time, a variety of novel vector pseudotypes were developed in an effort to improve gene transfer.”

Ontology: MeSH

Terms/Tags: Genes, Gene Therapy, Retroviridae, therapy, Time, Transfer (Psychology)
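
The example above can be imitated with a toy dictionary matcher. This is only a sketch of what term annotation means, not the NCBO service or its algorithm: the `VOCABULARY` table and the `annotate` function are hypothetical, and the entries are copied from the slide rather than taken from real MeSH data.

```python
import re

# Hypothetical mini-vocabulary: surface phrase -> preferred term.
# Entries mirror the slide's example; they are not real MeSH records.
VOCABULARY = {
    "gene": "Genes",
    "gene therapy": "Gene Therapy",
    "retroviruses": "Retroviridae",
    "therapy": "therapy",
    "time": "Time",
    "transfer": "Transfer (Psychology)",
}


def annotate(text, vocabulary=VOCABULARY):
    """Return preferred terms whose phrases occur in text as whole words."""
    lowered = text.lower()
    found = set()
    for phrase, term in vocabulary.items():
        # \b keeps "time" from matching inside "timeline".
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            found.add(term)
    return sorted(found, key=str.lower)


text = (
    "Gene therapy vectors based on murine retroviruses have now been in "
    "clinical trials for over 20 years. During that time, a variety of novel "
    "vector pseudotypes were developed in an effort to improve gene transfer."
)
print(annotate(text))
# ['Gene Therapy', 'Genes', 'Retroviridae', 'therapy', 'Time', 'Transfer (Psychology)']
```

A context-free matcher like this one maps "transfer" to Transfer (Psychology) even though the text is about gene transfer, which shows how a tag of that kind can end up on a page.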

Extension Overview

• The Indiana CTSI HUB was built using HUBzero (http://hubzero.org), which was built on top of the Joomla content management system.
• Extension works with Joomla (version 1.5) as well as HUBzero
• Extension consists of:
  o Plugin – conditionally tags pages when they are accessed, and displays the tags on pages
  o Component – provides user interface for search and navigation and administrative interface for configuration

Extension Overview (continued)

• User interface (front-end)
  o Information/help page
  o Multi-word auto-complete tag search
  o Tag cloud
  o Tag information page
• Admin interface (back-end)
  o Configuration of extension

Tags on Pages

The extension adds tags to the bottom of pages (using a plugin).

Information/Help Page

You can create an article that users will be directed to when they click on the “What’s this?” link.

Tag Search

You can select the ontology to use for the search.

Auto-completion is provided for search terms.
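
One plausible reading of "multi-word" auto-completion is that every word typed so far must be a prefix of some word in the tag. The sketch below illustrates that reading only. The `autocomplete` function and the tag list are hypothetical and are not taken from the extension's code.

```python
def autocomplete(query, tags, limit=10):
    """Suggest tags in which every query word is a prefix of some tag word."""
    words = query.lower().split()
    if not words:
        return []
    matches = []
    for tag in tags:
        tag_words = tag.lower().split()
        # Each typed word must start at least one word of the tag.
        if all(any(tw.startswith(w) for tw in tag_words) for w in words):
            matches.append(tag)
    return sorted(matches, key=str.lower)[:limit]


# Hypothetical tag list for the selected ontology.
tags = ["Genes", "Gene Therapy", "Retroviridae", "Time", "Transfer (Psychology)"]
print(autocomplete("gen", tags))        # ['Gene Therapy', 'Genes']
print(autocomplete("gene ther", tags))  # ['Gene Therapy']
```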

Tag Cloud

The size of a term is proportional to the number of pages that are tagged with it.
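
A minimal sketch of proportional sizing, assuming font sizes in pixels. The `cloud_sizes` function, the `max_px` and `min_px` parameters, and the page counts are all hypothetical. The `min_px` floor is an added assumption so that rare tags stay legible, which means very small sizes are no longer strictly proportional.

```python
def cloud_sizes(page_counts, max_px=32.0, min_px=10.0):
    """Map each tag to a font size proportional to its page count.

    The most frequent tag gets max_px; other tags scale linearly with
    their count, but never drop below min_px (a legibility floor).
    """
    if not page_counts:
        return {}
    top = max(page_counts.values())
    if top == 0:
        # No page is tagged yet: show every term at the minimum size.
        return {tag: min_px for tag in page_counts}
    return {
        tag: max(min_px, max_px * count / top)
        for tag, count in page_counts.items()
    }


# Hypothetical counts of pages per tag.
counts = {"Genes": 40, "Gene Therapy": 20, "Time": 2}
print(cloud_sizes(counts))
# {'Genes': 32.0, 'Gene Therapy': 16.0, 'Time': 10.0}
```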

Tag Information Page

The Tag Info page lists the pages that contain the specified tag.
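
Listing the pages for a tag amounts to a lookup in an inverted index. The sketch below assumes the tags are stored per page. The `build_tag_index` function and the page paths are hypothetical, and the extension's actual storage is not described in the slides.

```python
from collections import defaultdict


def build_tag_index(page_tags):
    """Invert a page -> tags mapping into tag -> sorted list of pages."""
    index = defaultdict(set)
    for page, tags in page_tags.items():
        for tag in tags:
            index[tag].add(page)
    return {tag: sorted(pages) for tag, pages in index.items()}


# Hypothetical pages and their tags.
page_tags = {
    "/resources/101": ["Genes", "Gene Therapy"],
    "/resources/102": ["Genes", "Time"],
}
index = build_tag_index(page_tags)
print(index["Genes"])  # ['/resources/101', '/resources/102']
```

The length of each list also gives the per-tag page count that a tag cloud would need.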

Extension Installation

Upload a zip file using the HUBzero/Joomla admin interface.

Extension Configuration

After the component is installed, its admin interface is used to configure it.

Component Configuration - Steps

1. Get and enter an NCBO API key

2. Download ontology information from NCBO

3. Select the ontologies to use and a primary/default ontology

4. Set tagging options

5. Turn tagging on

Component Configuration - Tagging Display Options

Extension Configuration - Tagging Update Options

• Turn on/off tag updates
• Limit IP addresses for tag updates
• Time limit before tag updates are made
• Pages to exclude from tagging
• Components to exclude from tagging
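
The options above could combine into a single check made when a page is accessed. The sketch below is one interpretation, not the extension's logic. `UpdateOptions`, `should_update_tags`, and all field names are hypothetical. It assumes an empty IP list means no restriction, and that the time limit is a minimum age before a page is re-tagged.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class UpdateOptions:
    enabled: bool = True
    allowed_ips: set = field(default_factory=set)   # empty = no restriction (assumption)
    min_age: timedelta = timedelta(days=7)          # assumed meaning of "time limit"
    excluded_pages: set = field(default_factory=set)
    excluded_components: set = field(default_factory=set)


def should_update_tags(opts, page, component, client_ip, last_tagged, now):
    """Decide whether this page access should trigger a tag update."""
    if not opts.enabled:
        return False
    if opts.allowed_ips and client_ip not in opts.allowed_ips:
        return False
    if page in opts.excluded_pages or component in opts.excluded_components:
        return False
    if last_tagged is None:
        return True  # never tagged before
    return now - last_tagged >= opts.min_age


# Example values only; the IP address and component names are illustrative.
opts = UpdateOptions(allowed_ips={"10.0.0.5"}, excluded_components={"com_search"})
now = datetime(2013, 9, 6, 12, 0)
print(should_update_tags(opts, "/about", "com_content", "10.0.0.5", None, now))  # True
print(should_update_tags(opts, "/about", "com_content", "10.0.0.5",
                         now - timedelta(days=1), now))                          # False
```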

Conclusions

Pros

• Automatically tags pages
• Will work on all pages (not component-dependent)
• Works with Joomla as well as HUBzero

Cons

• Tagging dependent on NCBO Annotator:
  o Does not seem to be very intelligent
  o Too slow to have real-time tagging
  o Extension will break if NCBO changes their web services
  o Limited to biomedical ontologies

It’s possible to change the extension’s annotator, so this extension could be used as a basis for using or testing other annotators.
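
Swapping annotators is easier when the tagging code depends only on a small interface. The sketch below shows that idea in Python for illustration only. The extension itself runs on Joomla, and `Annotator`, `KeywordAnnotator`, `NullAnnotator`, and `tag_page` are hypothetical names, not part of the extension.

```python
from typing import Iterable, List, Protocol


class Annotator(Protocol):
    """Anything that turns page text into a list of tags."""

    def annotate(self, text: str) -> List[str]: ...


class KeywordAnnotator:
    """Stand-in annotator: reports keywords found as substrings of the text."""

    def __init__(self, keywords: Iterable[str]):
        self._keywords = [k.lower() for k in keywords]

    def annotate(self, text: str) -> List[str]:
        lowered = text.lower()
        return [k for k in self._keywords if k in lowered]


class NullAnnotator:
    """Annotator that never returns tags, useful as a baseline in tests."""

    def annotate(self, text: str) -> List[str]:
        return []


def tag_page(page_text: str, annotator: Annotator) -> List[str]:
    """Tag a page with whichever annotator is plugged in."""
    return sorted(set(annotator.annotate(page_text)))


page = "Gene therapy vectors based on murine retroviruses"
print(tag_page(page, KeywordAnnotator(["gene therapy", "vector"])))  # ['gene therapy', 'vector']
print(tag_page(page, NullAnnotator()))                               # []
```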

License Terms

• Please cite as: Mullen, J. A HUBzero Extension for Automated Tagging. 2013. Presentation. Presented at HUBbub 2013 (Indianapolis, IN, 6 September 2013). http://hdl.handle.net/2022/17012

• Items indicated with a © are under copyright and used here with permission. Such items may not be reused without permission from the holder of copyright except where license terms noted on a slide permit reuse.

• Except where otherwise noted, contents of this presentation are copyright 2013 by the Trustees of Indiana University.

• This document is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – to copy, distribute and transmit the work and to remix – to adapt the work under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.