Intelligent Classifier

STI Innsbruck  &  Excogito

User-friendly Semi-Automatic Product Classification System

People

1. Supervision: Marcus Spies

2. People: Sigurd Harand, Christian Leibold

3. Contact person: Christian Leibold, [email protected]

4. Industrial cooperation with Excogito, Maksym Korotkiy

© 2002 - 2007 STI Innsbruck  &  Excogito. All Rights Reserved.

Outline

1. Context: Product Classification Problem

2. Project intro, positioning and objectives

3. Workflow driven approach

4. GoldenBullet shooting market

   a) Improved software architecture

      • Java XML Registries

      • User taxonomies

   b) Improved (re-)usability and quality

5. Conclusions and Future

6. Online Demo

Product Classification Problem

1. E-Catalogs contain thousands of cryptic product descriptions

   a) CAREPAQ BUREAU PROSIGNIA3YRS/SITE/J+1/TEL

   b) TRAINING ACT/ASEEXCEPT TRU64UNIX and OPENVMS

   c) …

2. Businesses have to deal with thousands of e-catalogs

3. Classification standards have tens of thousands of product categories (21192 in UNSPSC 8.04)

4. The result: high manual classification effort is required

GB IC Positioning and Objectives

• Many classification standards (e.g. UNSPSC, eCl@ss, ebXML, GPC, …)

– ~20,000 classes

– millions of products

• Current state of the art: outsourcing to low-wage countries, or using (counterproductive) low-quality software tools with 25% failure rates

• The GoldenBullet 2 research prototype offered exclusive "semi-automatic" functionality: it supports classification through manual intervention and, by "learning", reaches a classification accuracy of 95% and speeds up the process by up to 60 times

• Developing GB IC into a marketable product will create added value and help reduce the outsourcing of labor.

Project intro

1. Project won ProIT funding (cooperation between transIT and CAST)

2. Duration: 1st September 2007 - 31st August 2008

3. Objectives:

• Submission of a debugged, robust and marketable GB IC Prototype

• Extended Usability and Robustness

• Extended Reusability

4. Completed tasks & Status:

• Worked out a contract for handling IPR between the stakeholders (UIBK, Excogito NL, BvW Global Pty), including basic regulations for marketing and selling

• 1st report with deliverable of the technical specification accepted by CAST and transIT

• Cooperation with industrial partner Excogito

Workflow Driven Approach

1. GoldenBullet semi-automatically classifies product descriptions into a standard (e.g. UNSPSC) by employing

   a) NLP techniques to preprocess descriptions (stemming)

   b) Clustering methods to generate representative sub-sets of the e-catalog (currently k-means)

   c) Machine learning techniques to train the system and automatically generate ranked classification options (currently Naïve Bayes)

2. The user approves or corrects the proposed classification

3. GoldenBullet constantly learns from the user choices and updates the classification options
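
The classify, approve and learn loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the GoldenBullet implementation: the suffix-stripping `stem` function is a crude stand-in for a real stemmer, the class and function names are hypothetical, and the category labels are made-up examples rather than real UNSPSC codes.

```python
import math
import re
from collections import Counter, defaultdict


def stem(token: str) -> str:
    """Very crude suffix stripping; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(description: str) -> list:
    """Lower-case, split on non-alphanumeric characters, and stem each token."""
    return [stem(t) for t in re.split(r"[^a-z0-9]+", description.lower()) if t]


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one smoothing, trainable one example at a time."""

    def __init__(self) -> None:
        self.class_counts = Counter()              # category -> number of examples
        self.token_counts = defaultdict(Counter)   # category -> token frequencies
        self.vocabulary = set()

    def learn(self, description: str, category: str) -> None:
        """Add one approved (description, category) pair to the model."""
        tokens = preprocess(description)
        self.class_counts[category] += 1
        self.token_counts[category].update(tokens)
        self.vocabulary.update(tokens)

    def rank(self, description: str) -> list:
        """Return (category, log-score) pairs, best candidate first."""
        tokens = preprocess(description)
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary) + 1  # +1 reserves mass for unseen tokens
        scores = []
        for category, count in self.class_counts.items():
            counts = self.token_counts[category]
            denominator = sum(counts.values()) + vocab_size
            score = math.log(count / total)
            for token in tokens:
                score += math.log((counts[token] + 1) / denominator)
            scores.append((category, score))
        return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical categories, not real UNSPSC entries.
clf = NaiveBayesClassifier()
clf.learn("laser printer toner cartridge", "Printer supplies")
clf.learn("unix training course", "Training services")

options = clf.rank("TRAINING ACT TRU64UNIX")   # ranked proposals for the user
approved = options[0][0]                        # the user approves (or corrects) one
clf.learn("TRAINING ACT TRU64UNIX", approved)   # the system learns from that choice
```

Because the model only keeps counts, each user decision can be folded in immediately, which is one simple way to realise the "constantly learns" step.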

Architecture

Mapping the workflow to functional modules:

• Separation of concerns

• Workflow support to be implemented in the GUI

Architecture

Enhanced Usability and Robustness:

- Provide sort and search functions for catalogue AND classification schema

- Multi-language GUI and context-sensitive help system

- Support for catalogues of up to 10^6 entries

- Action logging enables undo / redo for classification and user workflow

- Implementation of strategies for the avoidance of over-fitting
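
To illustrate how action logging can enable undo and redo, here is a minimal sketch using two stacks of logged actions. It assumes that the only loggable action is assigning a category to a product; the names `Assignment` and `ClassificationLog` are hypothetical and do not come from the GoldenBullet code base.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Assignment:
    """One logged action: a product's category changed from `old` to `new`."""
    product_id: str
    old: Optional[str]
    new: Optional[str]


@dataclass
class ClassificationLog:
    """Current assignments plus undo/redo stacks of logged actions."""
    assignments: Dict[str, str] = field(default_factory=dict)
    undo_stack: List[Assignment] = field(default_factory=list)
    redo_stack: List[Assignment] = field(default_factory=list)

    def _apply(self, product_id: str, category: Optional[str]) -> None:
        if category is None:
            self.assignments.pop(product_id, None)  # back to "unclassified"
        else:
            self.assignments[product_id] = category

    def assign(self, product_id: str, category: str) -> None:
        """Record a new user action; this invalidates any pending redo."""
        action = Assignment(product_id, self.assignments.get(product_id), category)
        self._apply(product_id, category)
        self.undo_stack.append(action)
        self.redo_stack.clear()

    def undo(self) -> bool:
        """Revert the most recent action; return False if there is none."""
        if not self.undo_stack:
            return False
        action = self.undo_stack.pop()
        self._apply(action.product_id, action.old)
        self.redo_stack.append(action)
        return True

    def redo(self) -> bool:
        """Re-apply the most recently undone action; return False if there is none."""
        if not self.redo_stack:
            return False
        action = self.redo_stack.pop()
        self._apply(action.product_id, action.new)
        self.undo_stack.append(action)
        return True
```

Storing both the old and the new value in each log entry keeps undo and redo symmetric, and the same log could also serve as an audit trail of user decisions.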

Architecture

Enhanced reusability:

- Software can be deployed in a Java Enterprise Edition Application Server (e.g. Tomcat, all major vendors)

- The Java EE XML Registry is used to store and access classification schema data

- Customer catalogue taxonomies can be stored and exchanged in a common format

- Documentation (SW Design, User guide, Feature list), JUnit, JavaDoc
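
As a rough illustration of exchanging a customer taxonomy in a common format, the sketch below round-trips a nested taxonomy through XML. It does not use the Java EE XML Registry named on the slide; the element and attribute names (`taxonomy`, `category`, `code`, `label`) and the sample codes are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET


def taxonomy_to_xml(name: str, tree: dict) -> str:
    """Serialize a nested {code: (label, children)} mapping to an XML string."""
    root = ET.Element("taxonomy", {"name": name})

    def add(parent, nodes):
        for code, (label, children) in nodes.items():
            element = ET.SubElement(parent, "category", {"code": code, "label": label})
            add(element, children)

    add(root, tree)
    return ET.tostring(root, encoding="unicode")


def taxonomy_from_xml(text: str):
    """Parse the XML produced by taxonomy_to_xml back into (name, nested mapping)."""
    root = ET.fromstring(text)

    def read(parent):
        return {
            child.get("code"): (child.get("label"), read(child))
            for child in parent.findall("category")
        }

    return root.get("name"), read(root)


# Hypothetical customer taxonomy with made-up codes.
shop = {"10": ("Office", {"1010": ("Paper", {})}), "20": ("IT", {})}
document = taxonomy_to_xml("shop", shop)
restored_name, restored_tree = taxonomy_from_xml(document)
```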

Conclusions and Future

1. GoldenBullet is a semi-automatic product classification system that offers significant reduction of e-catalog classification effort

2. GoldenBullet IC considerably improves the (re-)usability and robustness of the system

3. In the future we aim at:

   a) Implementation and validation of the technical specification

   b) Generation of awareness (transIT)

   c) Evaluation of further (possibly new) options for commercial exploitation

Online Demo

- Questions so far?

- http://www.gbclass.com

Thank you!

Further Questions?

Backup

The following slides are provided in case no internet connection is available or the demo is not reachable.

GoldenBullet IC GUI Outline

1. Wizards

   a) Data Import/Export

   b) Simple and Expert Training

   c) Classification

2. E-Catalog and UNSPSC Browsers

“CI” Style

GoldenBullet IC has an integrated GUI style with a consistently designed, brand-like interface.

- Recognition as a product

- Usability through commonly used symbols

Data Import/Export Wizards

E-Catalog Browser

Expert Training

An automatically created representative sub-catalog is provided to the user for semi-automatic classification
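
One way such a representative sub-catalog could be derived with k-means is sketched below: cluster the descriptions and keep, per cluster, the description closest to the centroid. The bag-of-words vectors, the deterministic choice of the first k descriptions as initial centroids, and all function names are assumptions for illustration; the slides do not say how GoldenBullet selects the subset.

```python
import re
from collections import Counter


def vectorize(descriptions):
    """Turn each description into a bag-of-words count vector over a shared vocabulary."""
    bags = [Counter(re.findall(r"[a-z0-9]+", d.lower())) for d in descriptions]
    vocabulary = sorted(set().union(*bags))
    return [[bag[word] for word in vocabulary] for bag in bags]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k vectors (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach every vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster runs empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def representative_subset(descriptions, k):
    """Return, per non-empty cluster, the description closest to the cluster centroid."""
    vectors = vectorize(descriptions)
    labels, centroids = k_means(vectors, k)
    chosen = []
    for c in range(k):
        members = [i for i, label in enumerate(labels) if label == c]
        if members:
            best = min(members, key=lambda i: squared_distance(vectors[i], centroids[c]))
            chosen.append(descriptions[best])
    return chosen


# Made-up catalog entries; the user would then classify only the chosen subset.
catalog = [
    "laser printer toner",
    "inkjet printer toner",
    "unix training course",
    "linux training course",
]
subset = representative_subset(catalog, k=2)
```

Training on one representative per cluster lets the user's manual effort cover the variety in the catalog without labelling near-duplicates.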

Classification

Automatically created classification options are proposed to the user for approval

UNSPSC Browser

The Browser allows the user to locate an appropriate UNSPSC category and manually assign it to a product description
