Internet Searching and Browsing in a Multilingual World An Experiment on the Chinese Business Intelligence Portal Acknowledgment: NSF/NIJ Grant

Post on 20-Dec-2015


Internet Searching and Browsing in a Multilingual World

An Experiment on the Chinese Business Intelligence Portal

Acknowledgment: NSF/NIJ Grant

Outline

• Motivation

• The Chinese Business Intelligence Portal
  – System Description
  – Results of Usability Study

• Conclusions

Introduction

Motivation

• As the Internet grows in popularity worldwide, more users want to access Web content in their native languages

  – The majority of the global online population (63.5%) lives in non-English-speaking areas (Global-Reach, 2002)

  – This population is estimated to grow much faster than the English-speaking population

• However, existing search engines may not serve their needs, because most technologies have been developed for English-speaking users

This Presentation

• The following slides present our efforts in creating and evaluating intelligent Web portals that address the above needs

  – Chinese business information serves as our research testbed

• Through these studies, we aim to achieve a better understanding of human interaction and analysis with automated systems developed for Internet searching and browsing in a multilingual world

The Chinese Business Intelligence Portal (CBizPort)

CBizPort

• The Chinese Business Intelligence Portal (CBizPort)

  – Two versions of the user interface: Simplified Chinese and Traditional Chinese

  – URLs
    • Introduction: http://ai.bpa.arizona.edu/go/dl/cbizport.html
    • Portal: http://ai17.bpa.arizona.edu:8080/big5biz/index.html

  – Each version has the same user interface and provides the same functions
    • Encoding conversion
    • Meta searching major Chinese information sources
    • Summarization and categorization
    • Links to major Chinese business Web resources

  – The following slides show the system architecture and screen shots of CBizPort
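The slides do not say how encoding conversion is implemented. As a minimal sketch, assuming the two interface versions are served in GB2312 (Simplified) and Big5 (Traditional), conversion involves both re-encoding the bytes and mapping characters between the two scripts. The function name `gb_to_big5` and the three-entry mapping table below are hypothetical and for illustration only.

```python
# Hypothetical, tiny Simplified-to-Traditional mapping for illustration;
# a usable table would cover thousands of characters.
SIMP_TO_TRAD = {"国": "國", "经": "經", "济": "濟"}


def gb_to_big5(data: bytes) -> bytes:
    """Decode GB2312 bytes, map characters to Traditional forms, encode as Big5."""
    text = data.decode("gb2312")
    # Characters without a mapping entry (e.g. those shared by both scripts) pass through.
    converted = "".join(SIMP_TO_TRAD.get(ch, ch) for ch in text)
    return converted.encode("big5")


gb_bytes = "中国经济".encode("gb2312")
big5_bytes = gb_to_big5(gb_bytes)
print(big5_bytes.decode("big5"))
```

Re-encoding alone is not enough, because a Simplified character often has no code point in Big5; the character mapping step is what makes the text representable in the target encoding.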


Keywords:

Meta searches 8 major information sources of Mainland China, Hong Kong, and Taiwan

Provides links to major Chinese business Web sites and resources

Provides both Simplified and Traditional Chinese versions of user interface

Allows input of multiple key terms

Screenshots: Search Page, Result Page, Summarizer, Categorizer

Summarizer: a two-sentence summary on the left, the original page on the right

Categorizer: Web pages grouped by key phrases extracted by a mutual information algorithm (non-exclusive categorization)
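The slides do not give the formula behind the mutual information algorithm. The sketch below illustrates the general idea under stated assumptions: it scores adjacent token pairs by pointwise mutual information (PMI), which may differ from the measure CBizPort uses, and then groups pages non-exclusively, so a page joins every phrase group whose phrase it contains. The function names, the `min_count` threshold, and the pre-tokenized English sample pages are all hypothetical; Chinese text would first need segmentation or character n-grams.

```python
import math
from collections import Counter, defaultdict


def pmi_bigrams(docs, min_count=2):
    """Score adjacent token pairs by pointwise mutual information."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in docs.values():
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (a, b), count in bigrams.items():
        if count < min_count:  # skip pairs too rare to score reliably
            continue
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


def group_pages(docs, phrases):
    """Non-exclusive grouping: a page is listed under every phrase it contains."""
    groups = defaultdict(list)
    for page, tokens in docs.items():
        pairs = set(zip(tokens, tokens[1:]))
        for phrase in phrases:
            if phrase in pairs:
                groups[phrase].append(page)
    return dict(groups)


# Hypothetical, pre-tokenized sample pages.
docs = {
    "page1": ["trade", "barrier", "removal", "helps", "hong", "kong"],
    "page2": ["hong", "kong", "exports", "face", "trade", "barrier"],
    "page3": ["taiwan", "firms", "invest", "in", "hong", "kong"],
}
scores = pmi_bigrams(docs)
phrases = sorted(scores, key=scores.get, reverse=True)
groups = group_pages(docs, phrases)
for phrase, pages in groups.items():
    print(" ".join(phrase), pages)
```

In this sample, page1 and page2 appear under both extracted phrases, which is what non-exclusive categorization means in practice.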

Evaluation of CBizPort

Objectives

1. To evaluate the performance of summarizer as a preview function and categorizer as an overview function

2. To compare CBizPort with regional Chinese search engines to study its effectiveness and usability

3. To evaluate, in comparison with existing regional Chinese search engines, the information quality obtained from CBizPort and its capability of searching for cross-regional business information

Experimental Design

• Searching and browsing were studied

• Scenario-based, culturally oriented tasks, e.g.,

  – A search task (4 min): “Find two cities in mainland China that Motorola has set up its manufacturing operations”

  – A browse task (5 min): “Describe, in a number of distinct themes, the economic impacts of removing trade barriers between mainland China and Taiwan towards Hong Kong”

• Theme identification method (Chen et al., 2001)

  – Pilot test: 3 subjects used up all the allotted time in most tasks, so the study focused on effectiveness rather than efficiency

10 Tasks in the Experiment (1 hour)

Tasks by tool, setting, and subject’s origin:

| Tool | Setting | Hong Kong | Taiwan | China |
| CBizPort | Basic searching (with neither summarizer nor analyzer) | SO1, BO1 | SO2, BO2 | SO3, BO3 |
| CBizPort | Basic searching + summarizer only | SM1, BM1 | SM1, BM1 | SM1, BM1 |
| CBizPort | Basic searching + categorizer only | SA1, BA1 | SA1, BA1 | SA1, BA1 |
| Regional Chinese SE | General searching and browsing | SG1, BG1 | SG1, BG1 | SG1, BG1 |
| Regional Chinese SE | Cross-regional searching and browsing | SC1, BC1 | SC2, BC2 | SC3, BC3 |

S = search task; B = browse task; O = basic searching (with neither summarizer nor analyzer); M = basic searching + summarizer only; A = basic searching + categorizer only; G = general searching and browsing; C = cross-regional searching and browsing. The same number signals the same question across different regions.

(Random assignment of tasks is used for different settings)

Comparisons

[Diagram: search and browse tasks on CBizPort, each with or without the summarizer and with or without the categorizer, are compared with search and browse tasks on a benchmark search engine (Openfind, YahooHK, or Sina.com).]

Subjects

• 30 subjects, 10 from each region, were recruited

  – Rationale: equal influence of regional impacts

• Each subject used CBizPort and another search tool according to his/her origin

| Subject’s origin | Search tool | CBizPort version |
| Hong Kong | YahooHK | Traditional Chinese |
| Taiwan | OpenFind | Traditional Chinese |
| Mainland China | Sina.com | Simplified Chinese |

Experts

• Three experts, one from each region, were recruited to provide answers to all browse tasks

  – First, the experts identified the set of relevant answers (organized into themes) to a browse task

  – Then, they modified the answers by adding those subjects’ responses that they judged relevant

  – These two steps were repeated for all the other browse tasks

Hypotheses

• Three sets of hypotheses were tested

  – CBizPort’s enhanced analysis capabilities
    • Searching and browsing
    • With or without summarizer/categorizer

  – SE performance comparison
    • Searching and browsing capabilities
    • Individual settings and combination*

  – Users’ subjective evaluation
    • Information quality
    • Cross-regional searching capability
    • Overall satisfaction

  – Auxiliary hypothesis: the performance of the three regions is not significantly different

* We tried to mimic a situation in which each subject was allowed to use both CBizPort and the benchmark search engine together to solve the same problem

[Diagram labels: CBizPort, Experts’ answers, Benchmark SE]

Performance Measures

• Accuracy = percentage of correct answers

• Precision = number of correct themes identified by users / total number of themes identified by users

• Recall = number of correct themes identified by users / total number of themes identified by an expert

• F value = 2 * Recall * Precision / (Precision + Recall)

• Information quality: accessibility, appropriateness of amount, believability, completeness, …, etc. (Wang & Strong, 2002)

• Subjective evaluation: cross-regional searching capability, overall satisfaction, protocol analysis, post-hoc test (to study whether the three SEs yield significantly different results)
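The measures above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's scoring procedure: themes are compared by exact string match (in the experiment, experts judged which responses were correct), and the function names and sample themes are hypothetical.

```python
def theme_scores(user_themes, expert_themes):
    """Precision, recall, and F value for one subject on one browse task."""
    user = set(user_themes)
    expert = set(expert_themes)
    correct = user & expert  # themes the user identified that the expert also lists
    precision = len(correct) / len(user) if user else 0.0
    recall = len(correct) / len(expert) if expert else 0.0
    denom = precision + recall
    f_value = 2 * recall * precision / denom if denom else 0.0
    return precision, recall, f_value


def accuracy(answers_correct):
    """Share of search-task answers judged correct (True/False per answer)."""
    answers_correct = list(answers_correct)
    if not answers_correct:
        return 0.0
    return sum(answers_correct) / len(answers_correct)


# Hypothetical themes for a single browse task.
expert = {"trade growth", "logistics hub", "capital flows", "job losses"}
user = {"trade growth", "job losses", "tourism"}

p, r, f = theme_scores(user, expert)
print(f"precision={p:.2%} recall={r:.2%} F={f:.2%}")
print(f"accuracy={accuracy([True, False, True, True]):.2%}")
```

Here the user names three themes, two of which the expert also lists out of four, so precision is 2/3, recall is 1/2, and the F value is 4/7. Note that a mean F value averaged over subjects generally differs from the F value computed from mean precision and mean recall.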

Accuracy of search tasks

[Bar chart: Accuracy. Bars: CBiz, CBiz+Summ, CBiz+Categ, Bench (gen), Bench (cross), Combined. Data labels as extracted: 36.67%, 25.00%, 35.00%, 40.00%, 28.33%, 65.00%.]

Precision of browse tasks

[Bar chart: Precision. Bars: CBiz, CBiz+Summ, CBiz+Categ, Bench (gen), Bench (cross), Combined. Data labels as extracted: 58.65%, 51.05%, 53.33%, 55.67%, 66.37%, 76.80%.]

Recall of browse tasks

[Bar chart: Recall. Bars: CBiz, CBiz+Summ, CBiz+Categ, Bench (gen), Bench (cross), Combined. Data labels as extracted: 22.86%, 26.06%, 26.56%, 21.83%, 25.78%, 43.08%.]

F value of browse tasks

[Bar chart: Mean F value. Bars: CBiz, CBiz+Summ, CBiz+Categ, Bench (gen), Bench (cross), Combined. Data labels as extracted: 31.39%, 33.10%, 32.60%, 29.02%, 34.31%, 52.32%.]

Information Quality

[Bar chart: Information Quality. Bars: CBiz(Present), CBiz(Coverage), CBiz(Usability), CBiz(all), Bench(Present), Bench(Cov), Bench(Usab), Bench(all). Data labels as extracted: 4.55, 4.49, 4.39, 4.47, 4.40, 4.28, 4.37, 4.35.]

Users’ Subjective Evaluation

[Bar chart: Users’ Subjective Evaluation. Bars: CBiz(cross), CBiz(satis), Bench(cross), Bench(satis). Data labels as extracted: 4.45, 4.37, 4.14, 4.03.]

Subjects’ Verbal Comments

• Subjects liked summarizer and categorizer

  – Subj. #15: “… good performance in summarization and categorization, more focused results can be found”; #26: “… very handy”; #6: “…useful tools to enhance the searching ability” (11 subjects)

• CBizPort provides a wide coverage and variety of searching options

  – Subj. #2: “… Yahoo Search Engine is more limited when search certain term in a specific region … While CBizport can fulfill what Yahoo couldn’t do.”; #4: “… more search engines to choose from” (4 subjects)

Subjects’ Verbal Comments (2)

• Subjects are familiar with benchmark SEs

  – Subj. #27: “I am familiar with the format of Openfind. So that's the reason that I am more satisfied with it than CBizPort.” (4 subjects)

• Benchmark SEs are not good at cross-regional information searching

  – Subj. #15: “Sina gives many results but they are not focused, and is poor at searching HK and Taiwan results”; #5: “provide more accurate regional searching”

• CBizPort is user friendly but slow

  – #3: “Yahoo not as precise as CBizPort”; #28: “… easier to search” (7 subjects); “slow” (3 subjects)

Conclusions

• CBizPort’s summarizer and categorizer provide helpful analysis capabilities for users’ search and browse tasks

  – CBizPort’s searching and browsing performance is comparable to that of regional Chinese search engines

• CBizPort can significantly augment the searching and browsing ability of regional Chinese search engines, thus improving human integration of regional information and analysis

  – Information quality, cross-regional searching capability and overall satisfaction of CBizPort are comparable to those of regional Chinese search engines

• CBizPort is better than regional Chinese search engines in terms of analysis functions, cross-regional searching capabilities and user-friendliness