Data Harmony Version 3.9 Features Update

Marjorie M.K. Hlava, Access Innovations, Inc., www.accessinn.com. Leveraging your content semantically. 10th Annual Data Harmony User Group Meeting.

Upload: accessinnovations

Post on 11-May-2015


DESCRIPTION

Marjorie M.K. Hlava, President and founder of Access Innovations, Inc., unveils the newest version and module updates of the Data Harmony indexing software suite.

TRANSCRIPT

Page 1: Data Harmony Version 3.9 Features Update

Marjorie M.K. Hlava

Access Innovations, Inc.

www.accessinn.com

Leveraging your content semantically

10th Annual Data Harmony User Group Meeting

Page 2: Data Harmony Version 3.9 Features Update

DH Technical Support Team

Development programming team: Lamine Idjeraoui **, Allexander Lyons, Daniel Vasicek, Scott Roberts, Doug Vendcat

Customer support: Mary Garcia **, Jack Bruce, Gabe Carr, Samantha Lewis

Documentation: Jack Bruce **, Kirk Sanders, Gena San Nicolas, Barbara Gilles

Systems: Tom Peterson **, SWCP

Page 3: Data Harmony Version 3.9 Features Update

DH Customer Support Team

Sales and Licensing: Marjorie Hlava, Janice McIntyre, Bill Richardson, Jay Ven Eman **, Leland Yates

Blog and Web team: Barbara Gilles, Melody Smith **, Timothy Soholt **

Marketing: Heather Kotula **, Ashley Beard

Page 4: Data Harmony Version 3.9 Features Update

Editorial Team Taxonomy and Rule Building

Gabe Carr, Jack Bruce, Kathy Brown, Barbara Gilles, Bob Kasenchak **, Samantha Lewis, Kirk Sanders, Tim Soholt, Gena San Nicolas, Alice Redmond-Neal, Eric Ziecker

Page 5: Data Harmony Version 3.9 Features Update

Access Integrity

Kathy Brown, Jerry Jorgeson, John Kuranz **, Leland Yates
Access Rule Building Team
Access Programming Team

Page 6: Data Harmony Version 3.9 Features Update

Who’s Who?

Introduce yourself
Relationship to Data Harmony
Where do you use Data Harmony
Project Name(s)

Page 7: Data Harmony Version 3.9 Features Update

Access Innovations: What do we do?

Page 8: Data Harmony Version 3.9 Features Update

Four Divisions

Database Services
Data Harmony
  NewsIndexer
National Information Center for Educational Media (NICEM)
  MediaSleuth
Access Integrity
  Medical Claims Compliance
  Integracoder

Page 9: Data Harmony Version 3.9 Features Update

Database Services

Database Design Consulting DTD / Metadata Schemas Workflow Scheduling

Editorial Services Metadata capture and creation Tagging – XML, SGML Abstracting Indexing Author disambiguation

Page 10: Data Harmony Version 3.9 Features Update

Database Services - 2

Taxonomy Construction Thesaurus Vocabulary Ontology Data Linking (linked data) Authority Files – pick lists Rule Bases

Semantic Enrichment Data Format Conversion Database Applications Retrospective metadata tagging Author disambiguation

Page 11: Data Harmony Version 3.9 Features Update

Database Services - 3

Applications development Search – Lucene and Solr Search Harmony interface Web services layer

Link to user experience or user interface Web calls

API setup and linking www.accessinn.com

Page 12: Data Harmony Version 3.9 Features Update

Data Harmony

Built for our use starting in 1987
Visual Basic
C++
Java
Aid to the editorial and indexing processes
Alleviate the clerical aspects
Speed the tagging process
Guarantee accuracy, consistency, and depth of indexing

Page 13: Data Harmony Version 3.9 Features Update

Data Harmony Suite – Main Modules

M.A.I. Thesaurus Master XIS

XML Intranet System Administrative configuration module “The Data Harmony Suite”

Page 14: Data Harmony Version 3.9 Features Update

Tech stuff Downloadable Documentation revised 2014 APIs for client server versions Internet accessible Cloud and SaaS Full multilingual display Unicode - Accepts ASCII data Entification tables converted Drivers for display and print

For most languages

Page 15: Data Harmony Version 3.9 Features Update

Data Harmony

Java
Platform independent
Applet modules
Web services APIs
XML
TCP/IP
JSON and SSL on Web Start
GlassFish for extension support
www.dataharmony.com

Page 16: Data Harmony Version 3.9 Features Update

Full multilingual display

Page 17: Data Harmony Version 3.9 Features Update

Data Harmony Machine Aided Indexing (M.A.I.)

Semantic, syntactic, morphological, etc. layer
Rule Builder for users
Concept Extractor for text
Statistics for Machine Learning
Use in automatic, batch, or assisted mode

Thesaurus Master
For creating taxonomies, thesauri, ontologies, and authority files

MAIstro
Thesaurus Master and M.A.I. combined

Page 18: Data Harmony Version 3.9 Features Update

Data Harmony Extensions

Inline Tagging
Metadata Extractor
MAIChem
Search Harmony
SharePoint integration
Recommender

Page 19: Data Harmony Version 3.9 Features Update

New

DH Author Submission System
Author / Name Disambiguation
MAIBatch GUI
Semantic Fingerprinting
Web Start
Sneak Peek at “Ontology Master”

Page 20: Data Harmony Version 3.9 Features Update

Retiring

Automatic Summarizer
WebThes
ThesViewer

Page 21: Data Harmony Version 3.9 Features Update

TaxoDiary

Daily blog
Weekly feature
3 + items per day
Big archive
Launched in June 2010

Page 22: Data Harmony Version 3.9 Features Update

DH Bulletin Board Exchange

http://dhd.accessinn.com

Page 23: Data Harmony Version 3.9 Features Update

Data Harmony Forum

Discussion threads Solutions to reported problems Access to the newest documentation Announcements of features Bug reports Enhancement requests

Page 24: Data Harmony Version 3.9 Features Update

Data Harmony Partners

EJ Press MarkLogic

Really Strategies (RSuite) Yuxi Xquire

Publishing Technology More ….

Page 25: Data Harmony Version 3.9 Features Update

Some DH Connectors & Exports…

ACD/Labs’

Lucene (org. & Solr)

Perfect Search

Oracle/Stellent Universal Content Management

Jive Software’s Clearspace

EJ Press

Publishing Technology

OpenOffice

Mark Logic’s MarkLogic Server

Microsoft’s SharePoint

NorthPlains

Temis

Synaptica

and more…

Page 26: Data Harmony Version 3.9 Features Update

Other DH offerings

Off-the-shelf taxonomy Term records Browseable list Rule bases

Consulting Information architecture DTD and schema creation

Search implementation

Page 27: Data Harmony Version 3.9 Features Update

Knowledge Domains in over 40 subject areas.

• Agriculture
• Applied Technologies
• Business (popular)
• Business and Finance
• Communications
• Computer and Information Science (popular)
• Computer Science
• Consumer and Homemaking Education
• Corporate Names
• Counseling and Guidance
• Economics
• Education
• Engineering
• Environment
• Geography (subject)
• Geographical Place Names
• Health and Safety
• History
• Language Arts
• Languages
• Literature and Drama
• Mathematics
• News
• Occupations
• Organizational Names
• Personal Names
• Physical Education and Recreation
• Political Science
• Psychology
• Religion and Philosophy
• Science (popular)
• Science, Technology, and Medicine (STM)
• Society
• Sports
• Technology
• Visual and Performing Arts
• US Industrial Codes (NAICS)
• US Zip Codes and Places

Go to TaxoBank for more!

Page 28: Data Harmony Version 3.9 Features Update

NewsIndexer

Automatic indexing of newspapers
8 topical areas
Maps to IPTC, NAICS, ICB, and GICS codes
Popular, automatic, and fast
Remote submission / ASP
13 levels
Filter to 3
License and augment
www.newsindexer.com

Page 29: Data Harmony Version 3.9 Features Update

National Information Center for Educational Media - NICEM

667,000 records for non-print educational media
23,000 producers and distributors
Based on school curriculum needs
Online and CD-ROMs
MARC cataloging
Thesaurus
Print
www.nicem.com

Page 30: Data Harmony Version 3.9 Features Update

MediaSleuth

Online ordering of media from NICEM Search Harmony implementation Full e-commerce platform for ordering Educational and popular materials

www.mediasleuth.com

Page 31: Data Harmony Version 3.9 Features Update

Access Integrity, Inc. (AI2)
Medical Claims Compliance
Automatic ICD-9 suggestions
CPT rule base
HCPCS rule base
ICD-9 V 3 Hospitals
ICD-10
Accurate, deep, consistent coding
Making medical billing efficient

Page 32: Data Harmony Version 3.9 Features Update

Corporate Information

Closely held
Financed by
Sweat and Persistence
Good Cash Flow and Management
Since 1978 - 35 years in business
Marjorie M.K. Hlava
Jay Ven Eman
Joanna Ginter

www.accessinn.com

Woman Owned Small Business

Page 33: Data Harmony Version 3.9 Features Update

UPDATE

Data Harmony Users Group Meeting

February 10-14, 2014

Page 34: Data Harmony Version 3.9 Features Update

The 15 modules + extensions: What’s new

Admin Module
Author Submission System
Author / Name Disambiguation
Inline Tagging
Metadata Extractor
M.A.I.
MAIBatch GUI
MAIChem
Ontology Master
Thesaurus Master
Search Harmony
SharePoint
Recommender
Web Start
XIS

Page 35: Data Harmony Version 3.9 Features Update

Data Harmony 2013 Stack

[Diagram. Components shown: TCP/IP (Transmission Control Protocol / Internet Protocol); Java Virtual Machine; XML (Extensible Markup Language) - Unicode; Thesaurus Master; TermKeyRecord; Taxonomy, Authority files, All terms, Alphabetic, and Permuted views; OWL, Zthes, SKOS, XML, MARC, etc.; Native XML Content Creation Repository; M.A.I.; Rule Base; Concept Extractor; Statistics Module; Administrative modules; DH Extensions; XIS; Search Harmony; NavTree; Auto Completion; Narrow Search - NT; Expanded Search - RT; Auto Sum; Metadata Extractor; MAI Chem.]

Page 36: Data Harmony Version 3.9 Features Update

Data Harmony 2014 Stack

[Diagram. Components shown: TCP/IP (Transmission Control Protocol / Internet Protocol); Java Virtual Machine; XML (Extensible Markup Language) - Unicode; Thesaurus Master; TermKeyRecord; Taxonomy, Authority files, All terms, Alphabetic, and Permuted views; OWL, Zthes, SKOS, XML, MARC, etc.; Native XML Content Creation Repository; M.A.I.; Rule Base; Concept Extractor; Statistics Module; MAIBatch; Administrative modules; Web Start, APIs, Web services and connectors; XIS; Search Harmony; NavTree; Auto Completion; Narrow Search - NT; Expanded Search - RT; Metadata Extractor; MAIChem; Inline Tagging; Author Disambiguation; Recommender; Automatic Summarizer; Author Submission System; SharePoint Connector; Ontology Master.]

Page 37: Data Harmony Version 3.9 Features Update

Admin Module

Configuration of Thesaurus Master, M.A.I., MAIstro
Separate Admin Module for XIS
MAIBatch added to MAIstro Admin Module

Page 38: Data Harmony Version 3.9 Features Update

Author Submission Module

The author pastes the data into the document template, attaching images, graphs, etc. as necessary:

Copyright © 2013 Access Innovations, Inc.

Page 39: Data Harmony Version 3.9 Features Update

Author Submission Module

Copyright © 2013 Access Innovations, Inc.

The author fills in the data to the document template, attaching images and graphs as necessary.

An API calls Data Harmony and generates a list of indexing terms based on the content.

Page 40: Data Harmony Version 3.9 Features Update

Authors review the indexing and may change it.

Content is stored into a data repository as HTML, XML, etc.

Author Submission Module

Copyright © 2013 Access Innovations, Inc.

Page 41: Data Harmony Version 3.9 Features Update
Page 42: Data Harmony Version 3.9 Features Update

DH Author Submission System

Leveraging Records Management with Documentum, Author Submission, and MAIstro
Marjorie M.K. Hlava and Leland Yates, Access Innovations, Inc.

Page 43: Data Harmony Version 3.9 Features Update

Admin Module

Page 44: Data Harmony Version 3.9 Features Update

DH Author Submission System

Page 45: Data Harmony Version 3.9 Features Update

Author Submission System Configuration Module

Configure any field
Index on any field
XML or XHTML
Link to the CMS

Page 46: Data Harmony Version 3.9 Features Update

Author Disambiguation

Build a file of authors
Name: first, second, surname
DOIs published
Publication rank (first author, etc.)
Keywords for those DOIs
Affiliation(s)
Location(s): city, state, country, etc.
Co-authors (inferred by DOI)
Etc.
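The author file described on this slide can be pictured as one record per author. The following is a minimal Python sketch, not Data Harmony's actual data model: the class name `AuthorRecord`, its field names, and the helper `shared_evidence` are all hypothetical, and the overlap count is only an illustrative stand-in for whatever matching the product performs.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorRecord:
    """One entry in an author file; field names are illustrative only."""
    first: str
    surname: str
    second: str = ""
    dois: list = field(default_factory=list)
    keywords: set = field(default_factory=set)
    affiliations: set = field(default_factory=set)
    coauthors: set = field(default_factory=set)


def shared_evidence(a, b):
    """Count keywords, affiliations, and co-authors that two records share.

    Returns 0 when first name or surname differ (a simplifying assumption).
    """
    if (a.first, a.surname) != (b.first, b.surname):
        return 0
    return (len(a.keywords & b.keywords)
            + len(a.affiliations & b.affiliations)
            + len(a.coauthors & b.coauthors))


known = AuthorRecord("Jane", "Doe",
                     keywords={"indexing", "taxonomies"},
                     affiliations={"Example University"})
incoming = AuthorRecord("Jane", "Doe",
                        keywords={"indexing"},
                        affiliations={"Example University"})
print(shared_evidence(known, incoming))
```

A higher count suggests the two records describe the same person; records that share only a name would still need review.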

Page 47: Data Harmony Version 3.9 Features Update

Affiliation Disambiguation

Build a file of affiliations
Name
Lab, institute, etc. name
DOI
Location
Full address
Keywords
Etc.

Page 48: Data Harmony Version 3.9 Features Update

Author Disambiguation

Link the two databases
Build a web service to accept files
Auto-disambiguate incoming files
Review new or non-match to ensure accuracy

Leveraging Semantic Fingerprinting for Building Author Networks
Bob Kasenchak, Wednesday @ 9:30 AM

Page 49: Data Harmony Version 3.9 Features Update

Inline Tagging

Full text tagging
Send search query directly to the place in the document where the concept is mentioned.
Flexible in XML and HTML views

Inline Tagging and Dictionary Connection
Gena San Nicolas, Wednesday @ 2:15

Page 50: Data Harmony Version 3.9 Features Update

Inline tagging Web service

Use M.A.I. to put terms in context for high-precision indexing

Page 51: Data Harmony Version 3.9 Features Update

Inline Tagging

Shows the exact point where the concept is mentioned

Mouse over to view the term record

Statistical summary, showing the number of times each term is mentioned in the article

Page 52: Data Harmony Version 3.9 Features Update

XML View for Inline Tagging

Copyright © 2013 Access Innovations, Inc.

Page 53: Data Harmony Version 3.9 Features Update

Metadata Extractor

Automatic creation from PDF digital layer
Position training needed
Dublin Core metadata
Bibliographic citation created
Automatic summarization added
Uses M.A.I. on full text
Can be linked to Author Disambiguation

Page 54: Data Harmony Version 3.9 Features Update

Input file

Page 55: Data Harmony Version 3.9 Features Update

Source file PDF digital layer

Page 56: Data Harmony Version 3.9 Features Update

Metadata Extractor Full Record Display

Page 57: Data Harmony Version 3.9 Features Update

Output in XML

Page 58: Data Harmony Version 3.9 Features Update

Or use with HTML Pages

<document>
  <title>Access Innovations - Knowledge Management Professionals</title>
  <document-type>Web Page</document-type>
  <copyright>© 2007 Access Innovations, Inc.</copyright>
  <address>
    <street>131 Adams NE</street>
    <city>Albuquerque</city>
    <state>New Mexico</state>
  </address>
  <subject-terms>
    <term>Data Harmony</term>
    <term>Indexing</term>
    <term>Taxonomies</term>
  </subject-terms>
</document>
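A record like the one on this slide can be read with any XML parser. The following is a minimal Python sketch using only the standard library; the element names come from the sample record, while the function name `read_record` is a hypothetical helper and not part of Data Harmony.

```python
import xml.etree.ElementTree as ET

# Sample record, copied from the slide.
RECORD = """<document>
  <title>Access Innovations - Knowledge Management Professionals</title>
  <document-type>Web Page</document-type>
  <copyright>© 2007 Access Innovations, Inc.</copyright>
  <address>
    <street>131 Adams NE</street>
    <city>Albuquerque</city>
    <state>New Mexico</state>
  </address>
  <subject-terms>
    <term>Data Harmony</term>
    <term>Indexing</term>
    <term>Taxonomies</term>
  </subject-terms>
</document>"""


def read_record(xml_text):
    """Return the title, type, city, and subject terms of one record."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "type": root.findtext("document-type"),
        "city": root.findtext("address/city"),
        "terms": [t.text for t in root.findall("subject-terms/term")],
    }


record = read_record(RECORD)
print(record["terms"])
```

The subject terms come back as a plain list, ready to load into a search index or a content management system field.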

Page 59: Data Harmony Version 3.9 Features Update

M.A.I.

M.A.I. is used to describe or categorize items by matching text to controlled vocabulary terms
Rule Builder
Concept Extractor
Statistics Collector
Test MAI

Page 60: Data Harmony Version 3.9 Features Update

M.A.I. 2014

Find in Test MAI
Export Fields function
Expanded warning and information labels
Expanded print functions
Rule error details
Emphasis tags
MAIBatch GUI

Page 61: Data Harmony Version 3.9 Features Update

Find Function In Test MAI

Page 62: Data Harmony Version 3.9 Features Update

Export with fields selection

Page 63: Data Harmony Version 3.9 Features Update

Expanded warning and information labels

Delete term warning

Page 64: Data Harmony Version 3.9 Features Update

Term warnings

Term with multiple Broader Terms warning Remove relationship warning message

Page 65: Data Harmony Version 3.9 Features Update

Move term functions

Move a single term

Page 66: Data Harmony Version 3.9 Features Update

Expanded print functions

Page 67: Data Harmony Version 3.9 Features Update

Test the syntax of a rule

Page 68: Data Harmony Version 3.9 Features Update

View information about a thesaurus term

Page 69: Data Harmony Version 3.9 Features Update

MAIBatch GUI

Page 70: Data Harmony Version 3.9 Features Update

MAIBatch input format

PDF
XML, nXML
Web content (HTML, HTM)
Plain text (TXT), rich text (RTF)
MS Word documents (DOC, DOCX)

Page 71: Data Harmony Version 3.9 Features Update

Full window with suggested AND used terms

Page 72: Data Harmony Version 3.9 Features Update

Select all or just some files to process

Page 73: Data Harmony Version 3.9 Features Update

MAIBatch XML

Add Custom tags
Click on “XML tags” in the Settings menu.

Page 74: Data Harmony Version 3.9 Features Update

MAIBatch - Adding files, viewing results

Upload File/Directory

Row of asterisks separates each document

file path of a document

suggested thesaurus terms
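The results layout described on this slide (a row of asterisks between documents, then a file path, then suggested terms) is simple to post-process. The following Python sketch assumes a layout that the slide does not fully specify: the path on the first line of each block and one term per line. The sample text and the function name `parse_results` are hypothetical.

```python
import re

# Hypothetical results text: the exact layout of a real MAIBatch export
# is an assumption here (path on the first line, one term per line).
SAMPLE = """\
/docs/article1.xml
Indexing
Taxonomies
********************
/docs/article2.pdf
Thesauri
********************
"""


def parse_results(text):
    """Map each document path to its list of suggested terms."""
    results = {}
    # A line made up only of asterisks ends one document's block.
    for block in re.split(r"^\*+[ \t]*$", text, flags=re.MULTILINE):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if lines:
            results[lines[0]] = lines[1:]
    return results


parsed = parse_results(SAMPLE)
for path, terms in parsed.items():
    print(path, terms)
```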

Page 75: Data Harmony Version 3.9 Features Update

Log Statistics
From source data to compare accuracy
By human editors assigning values
HIT
MISS
NOISE

From source file data

<USEDTERMS><TERM>Term 1</TERM><TERM>Term 2</TERM></USEDTERMS>
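The comparison behind these statistics can be sketched as set arithmetic. The Python example below reads the `<USEDTERMS>` block shown on this slide and compares it with a list of suggested terms. The definitions used are assumptions, since the slide does not spell them out: a hit is a suggested term that was also used, a miss is a used term that was not suggested, and noise is a suggested term that was not used. The function name `score` and the input list are hypothetical.

```python
import xml.etree.ElementTree as ET

# USEDTERMS block as shown on the slide: terms already in the source file.
SOURCE = "<USEDTERMS><TERM>Term 1</TERM><TERM>Term 2</TERM></USEDTERMS>"


def score(suggested, usedterms_xml):
    """Classify terms as hits, misses, or noise (definitions assumed)."""
    used = {t.text for t in ET.fromstring(usedterms_xml).iter("TERM")}
    suggested = set(suggested)
    return {
        "hit": sorted(suggested & used),     # suggested and used
        "miss": sorted(used - suggested),    # used but not suggested
        "noise": sorted(suggested - used),   # suggested but not used
    }


stats = score(["Term 1", "Term 3"], SOURCE)
print(stats)
```

Totals of each category across a batch give a rough picture of how well a rule base agrees with the terms editors actually applied.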

Page 76: Data Harmony Version 3.9 Features Update

M.A.I. Statistics Module

Page 77: Data Harmony Version 3.9 Features Update

Exporting MAIBatch results

Save as .txt file through export menu

Save to Log Spreadsheet .xls

Page 78: Data Harmony Version 3.9 Features Update

MAIChem

Dictionaries
Full terms
Beginners
Enders
M.A.I. Concept Extractor
Links to graphical displays

Page 79: Data Harmony Version 3.9 Features Update

Ontology Master

Sneak Peek
Built on Thesaurus Master
Full OWL and SKOS exports
Full directional relationships
Same extensive functionality
Bob Kasenchak, Wednesday @ 1:15 PM

Page 80: Data Harmony Version 3.9 Features Update

Recommender

Page 81: Data Harmony Version 3.9 Features Update
Page 82: Data Harmony Version 3.9 Features Update

More Like This - Recommender

Page 83: Data Harmony Version 3.9 Features Update

Search Harmony

Built to leverage semantically enriched text

Uses the thesaurus sections
BT-NT relationships for taxonomy tree
Type ahead from tab, permuted index
Related terms
Narrower terms

Page 84: Data Harmony Version 3.9 Features Update

Copyright © 2005 - Access Innovations, Inc.

Taxonomy view

Thesaurus Term Record view

Page 85: Data Harmony Version 3.9 Features Update

Search Presentation Layer

Automatic completion and type ahead from thesaurus

Page 86: Data Harmony Version 3.9 Features Update

Search Presentation Layer

Related

Narrower

Page 87: Data Harmony Version 3.9 Features Update

Search Presentation Layer

The Hierarchical view of the thesaurus is also a browseable view of the content.

The numbers include the number of hits:
1. For the term
2. For the branch
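The two counts can be illustrated with a small recursive sum over narrower terms. This is a Python sketch under stated assumptions, not Search Harmony's implementation: the thesaurus fragment, the per-term hit counts, and the function name `branch_hits` are all made up for illustration.

```python
# Hypothetical thesaurus: each term maps to its narrower terms (NT).
NARROWER = {
    "Science": ["Physics", "Chemistry"],
    "Physics": ["Optics"],
    "Chemistry": [],
    "Optics": [],
}
# Hypothetical number of documents indexed directly with each term.
TERM_HITS = {"Science": 2, "Physics": 5, "Chemistry": 3, "Optics": 1}


def branch_hits(term):
    """Hits for a term plus all of its narrower terms, recursively."""
    return TERM_HITS.get(term, 0) + sum(
        branch_hits(nt) for nt in NARROWER.get(term, [])
    )


for term in NARROWER:
    print(f"{term}: {TERM_HITS[term]} for the term, "
          f"{branch_hits(term)} for the branch")
```

Summing counts this way is a simplification: a document indexed with both a term and one of its narrower terms is counted twice, so a production system would more likely count distinct documents per branch.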

Page 88: Data Harmony Version 3.9 Features Update

Semantic Fingerprinting

People / Authors
Articles
Medical records
Organizations and affiliations
Point ads to users
Related to author disambiguation

Page 89: Data Harmony Version 3.9 Features Update

Fully integrated SharePoint

[Diagram: Data Harmony fully integrated with MOSS. Elements shown: Client Data (Full Text, HTML, PDF, Data Feeds, etc.); Client Taxonomy; Thesaurus Master; Machine Aided Indexer (M.A.I.™); Inline Tagging; Metadata and Entity Extractor; Automatic Summarization; Repository; Search Software; Search Presentation (Browse by Subject, Auto-completion, Broader Terms, Narrower Terms, Related Terms). Label on diagram: 90% accuracy.]

Copyright © 2013 Access Innovations, Inc.

Page 90: Data Harmony Version 3.9 Features Update

Select term store management located under Site Administration.

Edit term sets to accurately reflect your document libraries and content types. Term sets can be individual taxonomies or flat controlled vocabulary lists.

Page 91: Data Harmony Version 3.9 Features Update

Thesaurus Master - 2014

Built for vocabulary control
Taxonomy
Thesaurus
Entities
Full standards compliance
ISO 25964 Parts 1 and 2
NISO Z39.19 – 2010

Page 92: Data Harmony Version 3.9 Features Update

Emphasis Is Available for Preferred Terms

bold, italics, or underline Term with emphasized words

Term with enriched words

Change Term dialog with enhancement buttons

Page 93: Data Harmony Version 3.9 Features Update

XML Emphasis Export

Page 94: Data Harmony Version 3.9 Features Update

Full Path Export

Data Harmony Custom Features as Implemented for Triumph Learning

Kirk Sanders Wednesday @ 11:00

Emphasis Full path export

Page 95: Data Harmony Version 3.9 Features Update

Thesaurus Master 2014

Emphasis tags: more Wednesday @ 11:00
Data Harmony Custom Features as Implemented for Triumph Learning
Kirk Sanders, Access Innovations, Inc.

Page 96: Data Harmony Version 3.9 Features Update

Pattern analysis: Domain associations

Page 97: Data Harmony Version 3.9 Features Update

Pattern analysis: Component gaps

Page 98: Data Harmony Version 3.9 Features Update

Web Start

Replacing WebThes and ThesViewer
Allows auto-start from the browser
Full featured
Password access control
Everything from view only to full access

Page 99: Data Harmony Version 3.9 Features Update
Page 100: Data Harmony Version 3.9 Features Update
Page 101: Data Harmony Version 3.9 Features Update


Page 102: Data Harmony Version 3.9 Features Update
Page 103: Data Harmony Version 3.9 Features Update

XIS

A XIS project consists of the following:
Folders that XIS uses. These are the “project folders.”
A schema (configuration file) called projects.MyProject.xml.
A XIS DTD, called “projects.dtd.”

Page 104: Data Harmony Version 3.9 Features Update

XIS links to Thesaurus Master and M.A.I.

Page 105: Data Harmony Version 3.9 Features Update

XIS and Lucene

Search within a search (recursive search)

New Lucene search

Using Lucene for Search within XIS
Allexander Lyons, Wednesday @ 11:45

Page 106: Data Harmony Version 3.9 Features Update

DHUG 2015

Albuquerque
February 16 – 20
Call for papers is now open
Ideas for what to do better and differently VERY welcome

Page 107: Data Harmony Version 3.9 Features Update

We Apply Imagination
Keep the System Flexible
Make the Applications Fun

Thank you!

Marjorie M.K. Hlava, President,

Access Innovations

505-998-0800

[email protected]