Range Similarity Measures between Buyers and Sellers in e-Marketplaces

Lu Yang, Biplab Sarker, Virendrakumar C. Bhavsar and Harold Boley
bhavsar@unb.ca
Faculty of Computer Science, University of New Brunswick (UNB), Fredericton, Canada
IICAI, December 20, 2005



Page 2: Agenda

• Motivation
• Partonomy Tree Similarity Algorithm
  • Tree representation
  • Partonomy similarity
  • Non-semantic matching on nodes
• Semantic Matching
  • Inner nodes vs. leaf nodes
  • Global similarity measure (for inner nodes)
    • Taxonomic class similarity
    • Encoding subtaxonomies into partonomy trees
  • Local similarity measures (for leaf nodes)
• Conclusion

Page 3: Motivation

[Figure: e-Market architecture with the components User, Web Browser, Main Server, User Info, User Profiles, User Agents, Agents (Matcher 1 ... Matcher n), and a link to other sites (network)]

• e-business, e-learning …
• Buyer-Seller matching
• Metadata for buyers and sellers
  • Keywords/keyphrases
  • Trees
• Tree similarity

Page 4: Partonomy Tree Similarity Algorithm: Tree Representation

• Tree representation for product/service descriptions [Bhavsar et al. 2004]
• Characteristics of our trees
  • Node-labeled, arc-labeled and arc-weighted
  • Sibling arcs are labeled in lexicographical order
  • Sibling arc weights sum to 1.0

A simple example “Car” tree:

[Figure: root “Car” with arcs Color (leaf Black), Make (leaf Ford) and Year (leaf 2002); arc weights 0.3, 0.2 and 0.5]
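The tree characteristics above can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the `Node` class and its method names are hypothetical, and the pairing of the weights 0.3, 0.2 and 0.5 with particular arcs is an assumption, since the figure's layout is not recoverable here.

```python
from dataclasses import dataclass, field
import math


@dataclass
class Node:
    """A labelled node whose outgoing arcs are labelled and weighted."""
    label: str
    arcs: dict = field(default_factory=dict)  # arc label -> (weight, child Node)

    def add_arc(self, arc_label, weight, child):
        self.arcs[arc_label] = (weight, child)
        return self

    def sorted_arcs(self):
        """Sibling arcs in lexicographical order of their labels."""
        return sorted(self.arcs.items())

    def is_valid(self):
        """True if sibling arc weights sum to 1.0 at every inner node."""
        if not self.arcs:
            return True  # a leaf has no arcs to check
        total = sum(weight for weight, _ in self.arcs.values())
        return math.isclose(total, 1.0) and all(
            child.is_valid() for _, child in self.arcs.values()
        )


# Assumed pairing of weights to arcs (hypothetical; only the values are given).
car = Node("Car")
car.add_arc("Year", 0.5, Node("2002"))
car.add_arc("Color", 0.3, Node("Black"))
car.add_arc("Make", 0.2, Node("Ford"))

print([label for label, _ in car.sorted_arcs()])  # arcs listed lexicographically
print(car.is_valid())
```

Storing arcs in a dictionary and sorting on demand keeps insertion order irrelevant, which matches the requirement that sibling arcs are always visited in lexicographical order.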

Page 5: Partonomy Tree Similarity Algorithm: Similarity Algorithm

• Partonomy similarity [Bhavsar et al. 2004]

$$\sum_i s_i \, \frac{w_i + w'_i}{2}$$

$$\sum_i A(s_i) \, \frac{w_i + w'_i}{2}, \qquad A(s_i) \ge s_i$$

[Figure: fragments of two learning object trees t and t´ [Boley et al. 2005] for learning object matching (http://www.cs.unb.ca/agentmatcher). Tree t: root “lom” with arcs educational, general and technical (weights 0.3334, 0.3333 and 0.3333) leading to edu-set, gen-set and tec-set; gen-set has arcs language (en) and title (Introduction to Oracle); tec-set has arcs format (HTML) and platform (WinXP); each of these four arcs has weight 0.5. Tree t´: root “lom” with arcs general and technical (weights 0.7 and 0.3) leading to gen-set and tec-set; gen-set has arcs language (en) and title (Basic Oracle); tec-set has arcs format (*) and platform (WinXP); lower arc weights 0.1, 0.9, 0.8 and 0.2.]

* : Don’t Care
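A rough sketch of the two weighted sums on this slide, under stated assumptions: the function name `partonomy_sum` is hypothetical, the subtree similarities and weights below are invented for illustration, and `math.sqrt` is used only as one example of an adjustment function A with A(s) ≥ s on [0, 1]; the slide does not say which function the authors use.

```python
import math


def partonomy_sum(children, adjust=lambda s: s):
    """Sum A(s_i) * (w_i + w'_i) / 2 over matching sibling arcs.

    children: iterable of (s_i, w_i, w_prime_i) triples, where s_i is the
    similarity of the i-th pair of subtrees and w_i, w'_i are the arc
    weights in the two trees.
    """
    return sum(adjust(s) * (w + w_prime) / 2 for s, w, w_prime in children)


# Hypothetical subtree similarities and arc weights, for illustration only.
children = [(1.0, 0.5, 0.7), (0.25, 0.5, 0.3)]

plain = partonomy_sum(children)                       # uses s_i directly
boosted = partonomy_sum(children, adjust=math.sqrt)   # uses A(s_i) = sqrt(s_i)
print(plain, boosted)
```

Because A(s_i) ≥ s_i, the adjusted sum can never fall below the plain one; partial subtree matches contribute more to the parent's similarity.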

Page 6: Partonomy Tree Similarity Algorithm: Non-Semantic Matching

• Non-semantic matching on both inner and leaf nodes
  • Exact string matching: binary result 0.0 or 1.0
  • Permutation of strings, e.g. “Java Programming” vs “Programming in Java”

    similarity = (number of identical words) / (maximum length of the two strings, in words)

Example 1:

For two node labels “a b c” and “a b d e”, their similarity is 2/4 = 0.5
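The word-overlap measure above can be sketched in a few lines. The function name is hypothetical, and two details are assumptions not stated on the slide: words are compared case-sensitively, and a word repeated within one label is counted once.

```python
def word_overlap_similarity(label_a, label_b):
    """Number of identical words divided by the longer label's word count."""
    words_a, words_b = label_a.split(), label_b.split()
    if not words_a or not words_b:
        return 0.0  # assumption: an empty label matches nothing
    identical = len(set(words_a) & set(words_b))
    return identical / max(len(words_a), len(words_b))


print(word_overlap_similarity("a b c", "a b d e"))  # Example 1 on this slide
print(word_overlap_similarity("Java Programming", "Programming in Java"))
```

The measure ignores word order, which is what lets permuted labels such as the two Java titles score above zero where exact string matching would return 0.0.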

Page 7: Partonomy Tree Similarity Algorithm: Non-Semantic Matching

Example 2:

Node labels “electric chair” and “committee chair”: similarity 1/2 = 0.5. Meaningful?

• Semantic matching techniques are needed for the above problems

Page 8: Semantic Matching

• Inner nodes vs. leaf nodes
  • Inner nodes: class-oriented
    • Inner node labels can be classes
    • Classes are located in a taxonomy tree
    • Taxonomic class similarity measure (global similarity measure)
  • Leaf nodes: type-oriented
    • Address, currency, date, price and so on
    • Type similarity measures (local similarity measures)

Page 9: Semantic Matching (Cont'd)

• Non-Semantic Matching
  • Exact String Matching (both inner and leaf nodes)
  • String Permutation (both inner and leaf nodes)
• Semantic Matching
  • Taxonomic Class Similarity (inner nodes)
  • Type Similarity (leaf nodes)

Page 10: Semantic Matching: Global Similarity

• Global similarity measure (for inner nodes) [Yang et al. 2005]

[Figure: two partonomy trees. t1: root “Distributed Programming” with arcs Credit (3), Duration (2months), Textbook (“Introduction to Distributed Programming”) and Tuition ($800); arc weights 0.2, 0.1, 0.3 and 0.4. t2: root “Object-Oriented Programming” with arcs Credit (3), Duration (3months), Textbook (“Object-Oriented Programming Essentials”) and Tuition ($1000); arc weights 0.1, 0.5, 0.2 and 0.2.]

Page 11: Semantic Matching: A Taxonomy Tree

• The taxonomy tree of “Programming Techniques” according to the ACM Computing Classification System (http://www.acm.org/class/1998/ccs98.txt)

[Figure: taxonomy tree with root “Programming Techniques” and the classes General, Applicative Programming, Automatic Programming, Concurrent Programming, Sequential Programming, Object-Oriented Programming, Distributed Programming and Parallel Programming; arc weights 0.6, 0.5, 0.8, 0.5, 0.9, 0.7, 0.7 and 0.5]

Page 12: Semantic Matching: Taxonomic Class Similarity

• The arc weights can be determined by human experts or machine learning algorithms [Singh 2005]
• Sibling arc weights do not need to add up to 1
• Three factors that affect the taxonomic class similarity
  • The shortest path length between two classes
  • Arc weights on the shortest path
  • Level difference of two classes

Page 13: Semantic Matching: Taxonomic Class Similarity

• Taxonomic class similarity computation [Yang et al. 2005]

$$TS(c_1, c_2) = \left(1 - \frac{N_s}{N_t}\right) \cdot M \cdot G^{\,|d_{c_1} - d_{c_2}|}$$

where

TS(c1, c2) is the taxonomic class similarity of classes c1 and c2

Ns: the number of edges of the shortest path

Nt: the number of edges of the whole tree

M: the product of the arc weights on the shortest path

$G^{\,|d_{c_1} - d_{c_2}|}$: the level difference factor, where G’s value is in (0.0, 1.0) and $|d_{c_1} - d_{c_2}|$ is the absolute difference of the depths of classes c1 and c2 (we assume G = 0.5 here)

Page 14: Semantic Matching: Taxonomic Class Similarity

Example

[Figure: the “Programming Techniques” taxonomy tree of Page 11, with red arrows from Distributed Programming and Object-Oriented Programming]

$$TS(\text{Distributed Programming}, \text{Object-Oriented Programming}) = \left(1 - \frac{3}{8}\right) \cdot (0.7 \cdot 0.5 \cdot 0.7) \cdot 0.5^{\,|2 - 1|} = 0.0766$$

• red arrows stop at their nearest common ancestor
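The computation in this example can be checked with a short sketch. The function name and parameter names are hypothetical; the inputs are the values from the slide (three edges on the shortest path with weights 0.7, 0.5 and 0.7, eight edges in the whole tree, G = 0.5), and the depths 2 and 1 are read from the taxonomy tree with the root at depth 0.

```python
import math


def taxonomic_similarity(path_weights, total_edges, depth_1, depth_2, g=0.5):
    """TS = (1 - Ns/Nt) * M * G ** |d1 - d2|.

    path_weights: arc weights along the shortest path between the classes
    total_edges:  number of edges in the whole taxonomy tree (Nt)
    depth_1/2:    depths of the two classes in the taxonomy tree
    g:            level difference base, expected to lie in (0.0, 1.0)
    """
    n_s = len(path_weights)          # Ns: edges on the shortest path
    m = math.prod(path_weights)      # M: product of the arc weights
    return (1 - n_s / total_edges) * m * g ** abs(depth_1 - depth_2)


ts = taxonomic_similarity([0.7, 0.5, 0.7], total_edges=8, depth_1=2, depth_2=1)
print(round(ts, 4))
```

Each of the three factors lowers the score independently: a longer path, lighter arcs, or a larger depth gap all push the similarity toward 0.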

Page 15: Semantic Matching: Encoding Subtaxonomies

• Encoding subtaxonomy trees into partonomy trees
  • A converse task: computing the similarity of pairs of taxonomies, e.g. subtaxonomies of the background taxonomy, as required in our Teclantic project (http://teclantic.cs.unb.ca)
  • Allows the direct reuse of our partonomy similarity algorithm and permits weighted (or ‘fuzzy’) taxonomic subsumption with no added effort

Page 16: Semantic Matching: Encoding Subtaxonomies

[Figure: background taxonomy tree of “Programming Techniques” for encoding. The classes General, Applicative Programming, Automatic Programming, Concurrent Programming, Sequential Programming, Object-Oriented Programming, Distributed Programming and Parallel Programming appear as arc labels; every node below the root is labeled *; arc weights 0.1, 0.15, 0.3, 0.1, 0.15 and 0.2 on one level, 0.6 and 0.4 on the level below]

• Sibling arc weights must sum up to 1.0
• Classes are represented as arc labels (lexicographically ordered)
• All node labels except the root node label are changed into “Don’t Care”

Page 17: Semantic Matching: Encoding Subtaxonomies

[Figure: two course trees with encoded subtaxonomy trees. Tree 1: root “course” with arcs Classification (weight 0.65), Credit (3), Duration (2months), Title (DistributedProgramming) and Tuition ($800); the remaining arc weights are 0.05, 0.1, 0.15 and 0.05. Its Classification branch leads to “taxonomy”, then via Programming Techniques (weight 1.0) to a subtaxonomy with the arc labels ConcurrentProgramming, SequentialProgramming, DistributedProgramming and ParallelProgramming, “Don’t Care” (*) nodes, and arc weights 0.7, 0.3, 0.6 and 0.4. Tree 2: root “course” with arcs Classification (weight 0.65), Credit (3), Duration (3months), Title (Object-OrientedProgramming) and Tuition ($1000); the remaining arc weights are 0.2, 0.05, 0.05 and 0.05. Its Classification branch leads to “taxonomy”, then via Programming Techniques (weight 1.0) to a subtaxonomy with the arc labels Object-OrientedProgramming and SequentialProgramming, “Don’t Care” (*) nodes, and arc weights 0.8 and 0.2.]

• Weight assignment in the "Classification" branch (two options)
  • By human expert
  • By machine learning
  • Normalizing the corresponding weights in the background taxonomy
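One possible reading of the normalization option, sketched under loud assumptions: the slide does not spell out the procedure, so this sketch simply restricts the background weights to the classes kept in the subtaxonomy and rescales them so the siblings sum to 1.0 again. The function name and the background weights below are hypothetical.

```python
def normalize_siblings(background_weights, selected):
    """Rescale the weights of the selected sibling classes to sum to 1.0."""
    chosen = {name: background_weights[name] for name in selected}
    total = sum(chosen.values())
    if total == 0:
        raise ValueError("selected classes carry no weight to normalize")
    return {name: weight / total for name, weight in sorted(chosen.items())}


# Hypothetical background weights, not the values from the slides.
background = {
    "ConcurrentProgramming": 0.3,
    "Object-OrientedProgramming": 0.2,
    "SequentialProgramming": 0.1,
}
normalized = normalize_siblings(
    background, ["ConcurrentProgramming", "SequentialProgramming"]
)
print(normalized)
```

Rescaling preserves the relative emphasis of the background taxonomy while restoring the partonomy-tree requirement that sibling arc weights sum to 1.0.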

Page 18: Semantic Matching: Local Similarity

• Local similarity measures (for leaf nodes): special-purpose similarity measures for various data types, realizing the semantics to be invoked when computing the similarity of any two of their instances
  • “Price” type
  • “Date” type [Yang et al. 2005]
  • . . .

Page 19: Semantic Matching: Price Matching

• Price
  • Price is the omnipresent factor that determines buyers’ and sellers’ decision-making
  • Price similarity seems to be asymmetric for buyers and sellers, e.g.
    buyer asks $800 and seller asks $1000: unsuccessful
    buyer asks $1000 and seller asks $800: successful
    The similarity of $800 and $1000 is different for the above cases

Page 20: Semantic Matching: Price Matching

• Transform the asymmetry to symmetry
• Buyers and sellers always have price ranges in their minds: [Bpref, Bmax] and [Smin, Spref]
  Bpref : buyer’s preferred price
  Bmax : buyer’s maximum acceptable price
  Smin : seller’s minimum acceptable price
  Spref : seller’s preferred price
• Our price-range similarity measure is based on the intuition that the greater the overlap between the buyer’s and seller’s price ranges, the higher their similarity value

Page 21: Semantic Matching: Price Matching Algorithm

• Pseudo code of the price-range similarity algorithm

PriceRangeSim([Bpref, Bmax], [Smin, Spref])
Begin
  If Spref <= Bpref
    similarity = 1.0
  else if Bmax < Smin
    similarity = 0.0
  else if Bmax = Smin
    similarity = 0.005 / (MAX - MIN)
  else {
    MIN = min{MIN, Smin}
    MAX = max{MAX, Bmax}
    similarity = (Bmax - Smin) / (MAX - MIN)
  }
  return similarity
End.

• This algorithm can be easily adapted to other “price”-typed attributes, e.g. “salary range” in a job seeking and recruiting e-Market
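A runnable sketch of the price-range algorithm, with one important assumption: the slide does not say how MIN and MAX are initialized, so this sketch starts them at Bpref and Spref, which makes MAX - MIN the full span covered by both ranges. The function and parameter names are hypothetical, and the sketch expects Bpref ≤ Bmax and Smin ≤ Spref.

```python
def price_range_similarity(b_pref, b_max, s_min, s_pref, epsilon=0.005):
    """Similarity of a buyer range [b_pref, b_max] and a seller range
    [s_min, s_pref], following the branches of the slide's pseudocode."""
    if s_pref <= b_pref:
        return 1.0  # seller's preferred price already satisfies the buyer
    if b_max < s_min:
        return 0.0  # the ranges do not overlap at all
    # Assumption: MIN and MAX start at b_pref and s_pref (not given on the slide).
    lowest = min(b_pref, s_min)
    highest = max(s_pref, b_max)
    if b_max == s_min:
        return epsilon / (highest - lowest)  # ranges touch at a single price
    return (b_max - s_min) / (highest - lowest)  # overlap relative to full span


print(price_range_similarity(800, 1000, 900, 1200))  # overlapping ranges
print(price_range_similarity(800, 900, 1000, 1200))  # disjoint ranges
```

Under this reading the result is the overlap divided by the combined span, so it grows with the overlap and stays within [0, 1], in line with the intuition stated on the previous slide.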

Page 22: Semantic Matching: Date Matching

• “Date”-typed leaf node similarity measure

$$DS(d_1, d_2) = \begin{cases} 0.0 & \text{if } |d_1 - d_2| \ge 365 \\ 1 - \dfrac{|d_1 - d_2|}{365} & \text{otherwise} \end{cases}$$

where DS(d1, d2) is the date similarity of two dates d1 and d2

[Figure: two “Project” trees, each with arcs end_date and start_date of weight 0.5. t1: start_date May 3, 2004 and end_date Nov 3, 2004. t2: start_date Jan 20, 2004 and end_date Feb 18, 2005. Similarity value shown: 0.74]
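The date measure can be sketched directly with the standard library, assuming |d1 - d2| is measured in days (the constant 365 suggests this, but the slide does not state the unit). The function name is hypothetical, and the sketch covers only the leaf-level measure, not the tree similarity shown in the figure.

```python
from datetime import date


def date_similarity(d1, d2):
    """DS(d1, d2): 0.0 if the dates are a year or more apart,
    otherwise 1 minus the gap in days divided by 365."""
    gap_days = abs((d1 - d2).days)
    if gap_days >= 365:
        return 0.0
    return 1 - gap_days / 365


# Start dates and end dates of the two "Project" trees on this slide.
start_sim = date_similarity(date(2004, 5, 3), date(2004, 1, 20))
end_sim = date_similarity(date(2004, 11, 3), date(2005, 2, 18))
print(start_sim, end_sim)
```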

Page 23: Conclusion

• Weighted trees for product/service descriptions
• Partonomy tree similarity algorithm
  • Synchronously traverses trees top-down
  • Aggregates intermediate similarity values bottom-up
• Semantic Global and Local Matching
  • Taxonomic Class Similarity
  • Encoding Subtaxonomies into Partonomies
  • Leaf-Node Type Similarity Measures
• Future Work
  • Improvement of Taxonomic Class Similarity
  • Generalization of Local Similarity Measures

Page 24: References

[1] Yang, L., Ball, M., Bhavsar, V.C., and Boley, H. Weighted Partonomy-Taxonomy Trees with Local Similarity Measures for Semantic Buyer-Seller Match-Making. Journal of Business and Technology (to appear).
[2] Boley, H., Bhavsar, V.C., Hirtle, D., Singh, A., Sun, Z., and Yang, L. A Match-Making System for Learners and Learning Objects. International Journal of Interactive Technology and Smart Education, August 2005, 2(3):171-178.
[3] Bhavsar, V.C., Boley, H., and Yang, L. A Weighted-Tree Similarity Algorithm for Multi-Agent Systems in e-Business Environments. Computational Intelligence, 2004, 20(4):584-602.
[4] Singh, A. LOMGenIE: A Weighted Tree Metadata Extraction Tool. Master Thesis, Faculty of Computer Science, University of New Brunswick, Fredericton, Canada, September 2005.

Page 25: Thank you!

Page 26: Seller Weights

• Advertisements on TV, Internet, and in newspapers
  Sellers always emphasize specific product/service attributes to attract buyers
• Our match-making system is buyer-seller-centric
  Sellers also seek buyers having close preferences

Page 27: Seller Weights (Cont’d)

• Suppose sellers do not have weights

[Figure: buyer tree and seller tree. Buyer tree: root “Car” with arcs Color (White), Make (Ford) and Year (2002); arc weights 0.1, 0.1 and 0.8. Seller tree: root “Car” with arcs Color (Red), Make (Ford) and Year (2002); arc weights 0.0, 0.0 and 0.0.]

Similarity = 1/2 (0.1 + 0.0) × 1.0    // for “Make”
           + 1/2 (0.8 + 0.0) × 1.0    // for “Year”
           = 0.45

Page 28: Seller Weights (Cont’d)

• Suppose sellers have identical weights

[Figure: buyer tree and seller tree. Buyer tree: root “Car” with arcs Color (White), Make (Ford) and Year (2002); arc weights 0.1, 0.1 and 0.8. Seller tree: root “Car” with arcs Color (Red), Make (Ford) and Year (2002); arc weights 0.3333, 0.3333 and 0.3334. Similarity value shown: 0.7834]

Page 29: Seller Weights (Cont’d)

• Sellers have arbitrary weights

[Figure: one buyer tree and three seller trees, all with root “Car” and arcs Color, Make (Ford) and Year (2002). Buyer tree: Color White; arc weights 0.1, 0.1 and 0.8. Seller tree 1: Color Red; arc weights 0.05, 0.05 and 0.9; similarity 0.925. Seller tree 2: Color Red; arc weights 0.2, 0.2 and 0.6; similarity 0.85. Seller tree 3: Color Red; arc weights 0.1, 0.6 and 0.3; similarity 0.65.]

• All the seller trees above are identical except for the arc weights
• The buyer prefers to negotiate with seller 1 because they have closer preferences on the car attributes
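The three similarity values can be reproduced with a one-level sketch of the weighted sum used on Page 27, where an arc contributes (w + w')/2 only if its two leaves match exactly. The function name is hypothetical, and the pairing of weights with arcs is an assumption: it follows the Page 27 formula (Make 0.1, Year 0.8 for the buyer) and is chosen so that the slide's values 0.925, 0.85 and 0.65 come out.

```python
def one_level_similarity(buyer, seller):
    """Sum (w_buyer + w_seller) / 2 over arcs whose leaf values match exactly.

    buyer, seller: dict mapping arc label -> (weight, leaf value).
    """
    total = 0.0
    for arc in sorted(buyer.keys() & seller.keys()):
        w_buyer, value_buyer = buyer[arc]
        w_seller, value_seller = seller[arc]
        leaf_sim = 1.0 if value_buyer == value_seller else 0.0
        total += leaf_sim * (w_buyer + w_seller) / 2
    return total


buyer = {"Color": (0.1, "White"), "Make": (0.1, "Ford"), "Year": (0.8, "2002")}
# Assumed weight-to-arc pairing for the three seller trees.
sellers = {
    "seller 1": {"Color": (0.05, "Red"), "Make": (0.05, "Ford"), "Year": (0.9, "2002")},
    "seller 2": {"Color": (0.2, "Red"), "Make": (0.2, "Ford"), "Year": (0.6, "2002")},
    "seller 3": {"Color": (0.6, "Red"), "Make": (0.1, "Ford"), "Year": (0.3, "2002")},
}

scores = {name: one_level_similarity(buyer, tree) for name, tree in sellers.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Since the Color leaves never match, only Make and Year contribute, and the seller whose Year weight is closest to the buyer's emphasis ranks first.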

Page 30: Seller Weights (Cont’d)

• Sellers can always select the averaged weights if they do not want to emphasize any attributes of their products/services
• Using seller weights, both buyers and sellers can find the most promising trading partners
• The negotiation space is decreased

Page 31: Publications

[1] Lu Yang, Marcel Ball, Virendrakumar C. Bhavsar, and Harold Boley, "Weighted Partonomy-Taxonomy Trees with Local Similarity Measures for Semantic Buyer-Seller Match-Making", Journal of Business and Technology (to appear).
[2] Harold Boley, Virendrakumar C. Bhavsar, David Hirtle, Anurag Singh, Zhongwei Sun, and Lu Yang, "A Match-Making System for Learners and Learning Objects", International Journal of Interactive Technology and Smart Education, August 2005, 2(3):171-178.
[3] Jing Jin, Biplab K. Sarker, Virendrakumar C. Bhavsar, Harold Boley, and Lu Yang, "Towards a Weighted-Tree Similarity Algorithm for RNA Secondary Structure Comparison", In Proceedings of the 8th International Conference on High Performance Computing in Asia Pacific Region, IEEE Computer Society, December 2005.
[4] Lu Yang, Marcel Ball, Virendrakumar C. Bhavsar, and Harold Boley, "Weighted Partonomy-Taxonomy Trees with Local Similarity Measures for Semantic Buyer-Seller Match-Making", In Proceedings of Workshop of Business Agents and the Semantic Web (BASeWEB'05), May 8, 2005, Victoria, British Columbia, Canada.
[5] Lu Yang, Biplab K. Sarker, Virendrakumar C. Bhavsar, and Harold Boley, "A Weighted-Tree Simplicity Algorithm for Similarity Matching of Partial Product Descriptions", In Proceedings of ISCA 14th International Conference on Intelligent and Adaptive Systems and Software Engineering, Toronto 2005, pp.55-60.
[6] Virendrakumar C. Bhavsar, Harold Boley, and Lu Yang, "A Weighted-Tree Similarity Algorithm for Multi-Agent Systems in e-Business Environments", Computational Intelligence, 2004, 20(4), pp.584-602.
[7] Riyanarto Sarno, Lu Yang, Virendrakumar C. Bhavsar, and Harold Boley, "The AgentMatcher Architecture Applied to Power Grid Transactions", In Proceedings of the First International Workshop on Knowledge Grid and Grid Intelligence, Halifax, 2003, pp.92-99.
[8] Virendrakumar C. Bhavsar, Harold Boley, and Lu Yang, "A Weighted-Tree Similarity Algorithm for Multi-Agent Systems in e-Business Environments", In Proceedings of 2003 Business Agents and the Semantic Web (BASeWEB'03) Workshop, Halifax, Canada, June 14, 2003.