Challenges in Closing Information and Records Management Capability Gaps in SharePoint


Upload: conceptsearching

Post on 01-Nov-2014


Page 1: Challenges in closing information and records management capability gaps in share point

Challenges in Closing Information & Records Management Capability Gaps

Page 2: Challenges in closing information and records management capability gaps in share point

• Welcome and Introductions
• Dave Sanchez of Concept Searching
• Juan Celaya of COMPU-DATA
• Case Study
• Questions and Wrap Up

Page 3: Challenges in closing information and records management capability gaps in share point

• Company founded in 2002
• Product launched in 2003
• Focus on management of structured and unstructured information
• Privately held and profitable – no funding
• Growth rate of 35% in 2008 and in excess of 100% for 2009
• Founders and management team with company since inception

Technology
• Automatic concept identification, content tagging, auto-classification, taxonomy management
• Only statistical vendor that can extract conceptual metadata

2009 and 2010 ‘100 Companies that Matter in KM’ (KM World Magazine)

KMWorld ‘Trend Setting Product’ of 2009

Locations: US, UK, & South Africa

Client base: Fortune 500/1000 organizations

Managed Partner under Microsoft global ISV Program - “go to partner” for Microsoft for auto-classification and taxonomy management

Microsoft Enterprise Search ISV, FAST Partner

Concept Searching • Don Miller • (408) 828-3400 • [email protected]

Concept Searching, Inc.

David Sanchez * [email protected] * 1 (713) 893-1743

Page 4: Challenges in closing information and records management capability gaps in share point

Information & Records Management Capability Gaps that Increase Costs

Lack of Information Transparency: e-Discovery and FOIA
• Government and Private Sector directives to tag content for retrieval
• Untagged Data Assets = Untapped Resources
• Time Gap between Information Requests and Discovery is Directly Proportional to Volume of Data Assets

Non-Compliance with Records Management Policies
• Sarbanes-Oxley and Government RM Retention Schedules
• Record Declaration process is manual
• Data Stored in Wrong Location & Information not Preserved in Accordance with Regulatory Guidelines

Increasing Volume of Unplanned Data Exposure Events
• Privacy Act Program (PII), Protected Health Information (PHI), HIPAA, Payment Card Industry (PCI), etc.
• Organizational Confidential and Sensitive Information

Problems

Page 5: Challenges in closing information and records management capability gaps in share point

Data Privacy and Security Exposure Events

By Sector
• Business: 48%
• Education: 21%
• Government: 19%
• Medicine: 12%

Source: Open Security Foundation

Page 6: Challenges in closing information and records management capability gaps in share point

Data Privacy and Security Exposure Events

By Type
• Disposal: 6%
• Email: 4%
• Fraud: 8%
• Hack: 16%
• Lost/Stolen Computers and Documents: 45%
• Unknown: 3%
• Snail Mail: 4%
• Virus: 1%
• Web: 13%

Source: Open Security Foundation

Page 7: Challenges in closing information and records management capability gaps in share point

Data Privacy and Security Exposure Events

Government
• Disposal: 6%
• Email: 4%
• Fraud: 6%
• Hack: 8%
• Lost/Stolen Computers and Documents: 49%
• Unknown: 4%
• Snail Mail: 7%
• Virus: 0%
• Web: 16%

Source: Open Security Foundation

Page 8: Challenges in closing information and records management capability gaps in share point

Why is this Difficult?

Human Factors: Physical or Cognitive Properties of an Individual or Human Social Behavior which Influence Functioning of Technological Systems

[Diagram labels: Metadata; Tagging; Records Retention Code; Access Rights; Templates; Document Library 1 through 4; Server Content with Appropriate Metadata, Retention Codes, and Rights Management]

Page 9: Challenges in closing information and records management capability gaps in share point

Why is this Difficult?

Human Factors: Physical or Cognitive Properties of an Individual or Human Social Behavior which Influence Functioning of Technological Systems

Limiting Factor = Human Behavior

[Diagram labels: Metadata; Tagging; Records Retention Code; Access Rights; Templates; Document Library 1 through 4; Server Content with Appropriate Metadata, Retention Codes, and Rights Management]

Page 10: Challenges in closing information and records management capability gaps in share point

How Do Organizations Typically Address These Capability Gaps?

Customize system interface to force manual application of metadata
• Pros: data assets now have metadata
• Cons: high customization costs, increase in end-user labor costs, less end-user productivity, non-standardized application of metadata across enterprise

Hire temporary staff to add metadata to data assets
• Pros: data assets now have metadata
• Cons: temporary staff = $$$$$ and results in non-standardized tagging

Acknowledge that it is a problem and do nothing

Alternatives

Page 11: Challenges in closing information and records management capability gaps in share point

Solution: conceptClassifier for SharePoint

[Diagram labels: Concept Classifier for SharePoint; Semantic Metadata Tagging; Records Retention Code Tagging; Automatic Content Type Updating; Document Library 1 through 4; SharePoint Security Services & Windows Rights Management; Appropriate Storage & Preservation; Increase Information Retrieval Precision for e-Discovery]

Concept Searching: Addressing the Technology Gap not the Behavior

Page 12: Challenges in closing information and records management capability gaps in share point

Taxonomy Management & Automatic Metadata Tagging in SharePoint

e-Discovery & FOIA (moss.conceptsearching.com)
• Auto-classification to multiple vocabularies
• Faceted Searching
• Taxonomy Browsing

Records Management
• Aligning Vocabulary to Records Retention Codes
• Record Declaration Process – tagging documents with retention codes

Information Management – Data Privacy & Security Compliance
• PII, PHI, and PCI tagging
• Sensitive content (FOUO, Secret, Internal Use Only – contracts, labor rates, etc.)

Live Demonstration
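To make the idea of PII and PCI tagging concrete, here is a minimal Python sketch. It is not how conceptClassifier works: the slides describe statistical concept extraction, while this example swaps in plain pattern matching (a U.S. Social Security number pattern and a 16-digit card pattern checked with the Luhn checksum). The function names, tag names, and patterns are assumptions chosen for illustration only.

```python
import re

# Assumed patterns for illustration; real deployments need broader rules.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Lookahead lets candidate card numbers be tested at every start position.
CARD_RE = re.compile(r"(?=(\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b))")


def luhn_ok(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def tag_sensitive(text):
    """Return the set of hypothetical sensitivity tags found in the text."""
    tags = set()
    if SSN_RE.search(text):
        tags.add("PII")
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group(1))
        if luhn_ok(digits):
            tags.add("PCI")
    return tags


print(tag_sensitive("SSN 123-45-6789, card 4111-1111-1111-1111"))
```

A tag set produced this way could then feed the content type and routing decisions described elsewhere in the deck; pattern matching alone will miss sensitive content that has no fixed format, such as contracts or labor rates.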

Page 13: Challenges in closing information and records management capability gaps in share point

We Make Metadata Work For You

Automatic Conceptual Metadata Generation

Automated Classification

Taxonomy Development & Management
• Proven to reduce taxonomy development by 80%

Microsoft Integration
• Runs natively in SharePoint 2007 and SharePoint 2010, Microsoft Office Applications, SharePoint Search and FAST, Windows Server 2008 R2 FCI
• Fully integrated with SharePoint Content Types

Content Type Updater
• Automatically changes the Content Type based on presence of organizationally defined metadata found within the document
• Identification of confidential/privacy data
• Ability to identify records based on the records retention schedule and route to the records center

Technology
• Downloadable in 30 minutes – no programming required
• Fully SOA compliant, delivered as Web Parts, based on open standards
• Highly scalable

conceptClassifier

Page 14: Challenges in closing information and records management capability gaps in share point

Concept Classifier for SharePoint


Page 15: Challenges in closing information and records management capability gaps in share point

Closing Information & Records Management Capability Gaps

Uses Taxonomy Manager to create and manage organizational taxonomies, ontologies, and metadata environment;

Employs conceptClassifier for SharePoint as an Automated Metadata Population Service;

Applies content types based on metadata;

Uses content types derived from metadata to drive individual and group access to data assets using inherent SharePoint Security;

Uses content types derived from metadata to drive migration of data assets to proper document libraries where RMS templates are automatically applied to restrict data asset usage.
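The flow described above (metadata determines the content type, and the content type determines access and the target library) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule table, tag names, library names, and group names are hypothetical and are not part of conceptClassifier or the SharePoint API.

```python
from dataclasses import dataclass, field

# Hypothetical rules: metadata tag -> (content type, target library, allowed groups).
# Order matters: the first matching rule wins, so stricter rules come first.
RULES = [
    ("PII", "Privacy Record", "Restricted Library", {"Privacy Officers"}),
    ("Contract", "Contract Record", "Contracts Library", {"Legal", "Procurement"}),
]
DEFAULT = ("Document", "General Library", {"All Staff"})


@dataclass
class Document:
    name: str
    tags: set
    content_type: str = ""
    library: str = ""
    allowed_groups: set = field(default_factory=set)


def apply_rules(doc):
    """Set content type, library, and access groups from the first matching tag rule."""
    for tag, content_type, library, groups in RULES:
        if tag in doc.tags:
            doc.content_type = content_type
            doc.library = library
            doc.allowed_groups = set(groups)
            return doc
    doc.content_type, doc.library = DEFAULT[0], DEFAULT[1]
    doc.allowed_groups = set(DEFAULT[2])
    return doc


doc = apply_rules(Document("offer.docx", {"Contract", "2010"}))
print(doc.content_type, "->", doc.library)
```

Keeping the rules in one ordered table means a change in policy is a change to data rather than to code, which mirrors the slide's point that metadata, not end-user behavior, drives access and storage.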

Leveraging Metadata as an Enabling Asset


Page 16: Challenges in closing information and records management capability gaps in share point

Preserving the Worlds Knowledge - Available Anytime AnywhereSM

©2010 COMPU-DATA International, LLC, All Rights Reserved

COMPU-DATA International, LLC

Juan J. Celaya

President/CEO
Senior Business & IT Consultant

[email protected]

Office: 281.292.1333

www.cdlac.com

blog.cdlac.com

Page 17: Challenges in closing information and records management capability gaps in share point

Company Overview

Who are we?
CDI is a successful information management integrator based in Spring, Texas (north of Houston) with offices in Miami, FL and Stafford, VA. We have been in business for over 22 years, with 18 of those focused on Content and Data Integration (CADI™), enterprise search, classification, capture, and data management. We are a small business and designated as a certified Texas HUB contractor.

What do we do?
Integration, software development, and resale of best-of-breed products for ECM solutions focused on Search, Automatic Classification, Capture, and Business Automation (Workflows). We work with government and private industry customers in delivering successful departmental and enterprise solutions.

Who do we serve?
Medium to large organizations in government, health care, manufacturing, and oil industries.

Page 18: Challenges in closing information and records management capability gaps in share point

Presentation Overview

During this presentation we will:

For the case study:
1. Summarize the issues facing U.S. Army researchers and records managers.
2. Describe our approach in resolving those issues within the constraints of a DoD environment and discuss the software tools that comprise the solutions.
3. Discuss the challenges in identifying and managing millions of documents.
4. Review how automatic classification and metadata tagging enhance search in this environment.
5. Address business outcomes and benefits in automating processes.

For conceptClassifier:
6. Describe how conceptClassifier is being applied as part of the JSRRC project.
7. Present Concept Searching’s technologies also working outside of the SharePoint® environment.

Page 19: Challenges in closing information and records management capability gaps in share point

U.S. Army Challenges: Records Management

Army Records Management
• Provide oversight and program management for the Army's Records Management Program.
• Establish programs for records collection and preservation from garrison, training, contingency, and wartime operations.
• Operate and sustain the Army Electronic Archive and provide the means to identify, collect, index, and retrieve important Army records, in hard copy and electronic media.
• Management Information Control (AR 335-15)

Records Schedule
• Hundreds of Record Series with around 4,000 individual record instructions.

End users faced with myriad choices when categorizing records.
• Results in improper classification.
• Neglect to use schedule at all.
• Affects retention durations.
• Reduces impetus to retain record materials.
• Reduced consistency in tagging records to schedules.
• New rules and procedural training.
• Hundreds of locations and data environments.

Page 20: Challenges in closing information and records management capability gaps in share point

U.S. Army Challenges: JSRRC

Joint Service Records Research Center (JSRRC)
• Validates veterans’ war-related claims for the Veterans Administration.
• Primarily on Post-Traumatic Stress Disorder (PTSD), but also Agent Orange exposure and others.
• Reviews cases for ALL services from WWII to present day.
• Requires research, among others, of DoD field documents that relate to the specific individual and event.
• Literally tens of millions of documents.
• No categorization or indexing of documents.
• Plethora of data sources.
• Today – millions of electronic files in multiple formats are being generated daily.
• Usefulness of data – Not determined.
• Manual identification – Not feasible.
• For JSRRC – Finding a needle in the haystack!
• Goal – Standardize and consolidate field & internally generated data, providing a common research interface.

Page 21: Challenges in closing information and records management capability gaps in share point

CDI’s Solution Philosophy

For JSRRC
• Develop ability to integrate documents and data from myriad disparate sources utilizing CADI™ framework.
• Utilize conceptClassifier to classify Army documents into discrete, searchable segments.
• Leverage the classification implementation to enhance search, allowing for better results for the end users.
• Implement the infrastructure that can be leveraged to move forward with Records Management, FOIA and Declassification organizations at RMDA.

For Records Management
• Combination of Army process changes and implementation of technology tools.
• Streamline Records into fewer functional series.
• End user has minimal or no role in categorizing record.
• Utilize Army’s ARIMS & SharePoint to attribute initial metadata.
• Utilize conceptClassifier & conceptTaxonomyManager to correctly identify appropriate disposition based on content and metadata.

Page 22: Challenges in closing information and records management capability gaps in share point

Primary Solution Components

Base Infrastructure:
• conceptClassifier
• conceptTaxonomyManager
• conceptSearch

Application Infrastructure:
• DigitalAsset Finder™

Professional Services for integration and implementation of solution.

Page 23: Challenges in closing information and records management capability gaps in share point

JSRRC Solution

Consolidate existing data sources:
• Access databases
• Applications
• Network shared drives

Prepare for future data sources:
• Identify possible origins
• Volume and formats
• No standards in data delivery
• Support special security needs
• Stored in different locations

Identify types of metadata & documents that:
• Must be standardized
• Derive concepts & content
• Used to identify data to information relationships
• Create taxonomies

Page 24: Challenges in closing information and records management capability gaps in share point

JSRRC Solution

Infrastructure built to support initial two environments:

Environment #1
• Windows-based 3-server group.
• DigitalAsset Finder™, conceptSearch with Distributed Query Server, conceptClassifier & conceptTaxonomyManager.
• Initial configuration for support of 200 terabytes of indexable data.
• Microsoft Office & other text-based files.
• PDFs and searchable PDFs.
• Image files (TIFF, JPG, and others).

Environment #2
• Windows-based server.
• DigitalAsset Finder™, conceptSearch, conceptClassifier & conceptTaxonomyManager.
• Currently supporting over 5 million records and growing.
• Microsoft Office & other text-based files.
• PDFs and searchable PDFs.
• Image files (TIFF, JPG, and others).
• Structured data with no file reference.

Page 25: Challenges in closing information and records management capability gaps in share point

JSRRC Solution

Creation of taxonomies used to:
• Enhance Search
• Categorize or Identify documents

Some of the taxonomies created include:
• Unit Names
• Dates
• Document Types (Names & Content)
• Locations

Results include:
• Consolidation of information into distinct groups, allowing a focused approach to the required research.
• Controlled vocabulary that can be applied to the data sets as requirements evolve.
• Access to information that previously was impossible to reach due to the resource requirements needed to collate the raw data.
• Collaboration among researchers increases as they share information by contributing their knowledge to existing data for future reference and retrieval.

Page 26: Challenges in closing information and records management capability gaps in share point

Data Process Pipeline

[Diagram labels: File Filter Process (file type, file size, folder name, archive content processing); File Synchronizer; PDF Generation; conceptClassifier; conceptSearch; Classification db; Search Indexes; DigitalAsset Finder™]

Classification Metadata Assigned

Search Indexes Independent of the data location
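The file filter stage of the pipeline names four criteria: file type, file size, folder name, and archive content. A minimal Python sketch of such a filter follows. The slide gives no thresholds or lists, so every value and name here (allowed suffixes, size limit, excluded folders, the `filter_file` function) is an assumption for illustration, not the DigitalAsset Finder™ implementation.

```python
from pathlib import PurePosixPath

# Hypothetical filter settings; the slide names the criteria but not their values.
ALLOWED_SUFFIXES = {".doc", ".docx", ".pdf", ".txt", ".tif", ".jpg"}
ARCHIVE_SUFFIXES = {".zip"}
EXCLUDED_FOLDERS = {"temp", "backup"}
MAX_BYTES = 50 * 1024 * 1024


def filter_file(path, size_bytes):
    """Return 'accept', 'unpack', or 'reject' for one candidate file."""
    p = PurePosixPath(path)
    # Folder name check: reject anything under an excluded folder.
    folders = {part.lower() for part in p.parts[:-1]}
    if folders & EXCLUDED_FOLDERS:
        return "reject"
    # File size check.
    if size_bytes > MAX_BYTES:
        return "reject"
    suffix = p.suffix.lower()
    # Archive content: flag for unpacking so members can be filtered individually.
    if suffix in ARCHIVE_SUFFIXES:
        return "unpack"
    # File type check.
    return "accept" if suffix in ALLOWED_SUFFIXES else "reject"


print(filter_file("units/1968/report.PDF", 1000))
```

Files that pass would move on to the later stages named in the diagram (synchronization, PDF generation, classification, and indexing); those stages are not modeled here.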