Post on 02-Jan-2016
Katherine W. McCain*, June M. Verner, Gregory W. Hislop, William Evanco, & Vera Cole.
College of Information Science & Technology, Drexel University
Combining Bibliometric and Knowledge Elicitation Techniques to Map a
Knowledge Domain
PHILADELPHIA brand Bibliometrics
Organizations
ISI: Gene Garfield, Henry Small
Drexel: Belver Griffith, Howard White, Chaomei Chen, Xia Lin, Carl Drott, Jackie Mancall, and a host of grad students
Center for Research Planning: Dick Klavans, Len Simon
Major themes: citation analysis/core literatures; aging of scholarly literatures; single period and longitudinal studies of scholarly literatures and fields; real-time, on-the-fly mapping of literatures, fields, paradigm shifts, vocabulary structures, etc.; bibliometric applications in collection management, competitive intelligence, institutional evaluation, etc.
AGENDA
Introduction: Domain analysis & software engineering
Mapping methods:
Author Cocitation Analysis
Knowledge Elicitation – card sorting
Results
ACA clusters & map
PFNet author network
Card sorting clusters & map
Comparisons of ACA and KE results
Conclusions
DOMAIN ANALYSIS
SYSTEMS ANALYSIS: the task of identifying the operations and objects needed to specify information processing in a particular application domain
INFORMATION SCIENCE: the study of the field (knowledge domain) as a thought or discourse community. It focuses on such topics as knowledge organization, structure, cooperation patterns, language and communication forms, information systems, and relevance criteria as a way of understanding these communities (Hjørland, B., & Albrechtsen, H., 1995).
An Aside On DISCOURSE COMMUNITY
A group (likely to be geographically dispersed) who share:
a common public goal or goals
a body of specialized knowledge
mechanisms of intercommunication and participation
a genre (e.g. scholarly journal)
a specialized vocabulary
Adapted from John Swales, Genre Analysis (1990 Cambridge)
SOFTWARE ENGINEERING
The establishment and use of sound engineering principles in order to obtain economically software that is reliable and works efficiently on real machines.
The technological and managerial discipline concerned with systematic production and maintenance of software products that are developed and modified on time and within cost estimates.
DOMAIN ANALYSIS OF SOFTWARE ENGINEERING
a study of the journal literature of software engineering, based on both author referencing patterns and index term assignments
a study of the factors that affect the “visibility” of software engineering authors
an INSPEC-based co-descriptor mapping of software engineering
a conjoint study of the intellectual and cognitive structure of software engineering
a citation content analysis of Brooks’ Mythical Man-Month
TWO APPROACHES TO MAPPING SE
BIBLIOMETRICS: Cocited author mapping uses the patterns of co-occurrence of authors’ names in reference lists to examine the intellectual structure of scholarly literatures and, by extension, the fields that produce those literatures.
KNOWLEDGE ELICITATION: the process of collecting from a human source of knowledge, information that is thought to be relevant to that knowledge. [Cooke]
Card sorting: structural analysis of mental models elicited via sorting named cards into piles
AUTHOR COCITATION ANALYSIS
AUTHOR SELECTION: authors highly cited in texts and in the core SE literature = 60 authors selected for study
COCITATION DATA GATHERED: cocitation counts retrieved from SCISEARCH, 1990 – 1997
ANALYSIS:
Raw cocitation counts – PFNets
Correlation matrix – cluster analysis & multidimensional scaling
60 AUTHORS
Abdel-Hamid, Tarek K.
Albrecht, Allan J.
Basili, Victor R.
Beizer, Boris
Biggerstaff, Ted J.
Boehm, Barry W.
Booch, Grady
Brooks, Frederick P., Jr.
Card, David N.
Clarke, Lori A.
Coad, Peter
Curtis, Bill
Davis, Alan M.
DeMarco, Tom
Dijkstra, Edsger W.
Fagan, M. E.
Fenton, Norman E.
Garlan, David
Ghezzi, Carlo
Gilb, Tom
Glass, Robert L.
Goldberg, Adele
Gomaa, Hassan
Grady, Robert B.
Harrison, W.
Hoare, C. A. R.
Humphrey, Watts S.
Jackson, Michael A.
Jacobson, Ivar
Jones, T. Capers
Kaiser, G. E.
Kemerer, C. F.
Kernighan, Brian W.
Kitchenham, Barbara A.
Lehman, M. M.
McCabe, Thomas J.
Meyer, Bertrand
Mills, Harlan D.
Musa, John D.
Myers, Glenford J.
Parnas, David L.
Pfleeger, Shari L.
Pressman, Roger S.
Prieto-Diaz, R.
Ramamoorthy, C. V.
Rombach, H. D.
Rumbaugh, James
Selby, R. W.
Shaw, Mary
Shepperd, M.
Shneiderman, Ben
Sommerville, Ian
Tichy, W. F.
Tracz, Will
Wasserman, A. I.
Weiser, M.
Weyuker, Elaine J.
Wing, Jeannette M.
Yourdon, Edward
Zave, Pamela
Data Gathering for ACA
[Figure: cited references in source papers (e.g. 1982 Brooks FP, 1987 Brooks FP, 1981 Jones TC, 1973 Pfleeger S, 1984 Weyuker E) are searched pairwise to obtain cocitation counts, which fill the raw cocitation matrix.]
Retrieval strategy *: CA = BROOKS FP AND CW = JONES TC; CA = BROOKS FP AND CA = PFLEEGER S
* Multiple forms of authors' names were used in the search strategies
[Figure: excerpt of the raw cocitation matrix for BASILI V, BROOKS FP, DIJKSTRA E, GLASS RL, JONES TC, HUMPHREY W, WEYUKER E and ALBRECHT; cell values are not recoverable from the transcript.]
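The counting behind a cocitation matrix can be illustrated with a minimal sketch. It assumes the reference lists of the source papers are held locally as lists of cited-author strings, which differs from the study itself (counts were retrieved from SCISEARCH by query). The function name `cocitation_counts` and the sample papers are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists, authors):
    """Count, for each author pair, the source papers that cite both authors."""
    wanted = set(authors)
    counts = Counter()
    for refs in reference_lists:
        # An author is counted once per source paper, however often cited in it.
        cited = sorted(wanted & set(refs))
        for a, b in combinations(cited, 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical source papers, each reduced to the authors it cites.
papers = [
    ["BROOKS FP", "JONES TC", "BOEHM BW"],
    ["BROOKS FP", "PFLEEGER S", "JONES TC"],
    ["BROOKS FP", "PFLEEGER S"],
]
counts = cocitation_counts(papers, ["BROOKS FP", "JONES TC", "PFLEEGER S"])
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Each pair key is stored in alphabetical order, so one half of the symmetric matrix is enough to hold all counts.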
ACA ANALYSES
Raw Cocitation Matrix
PFNet: links nodes (authors) based on their single highest co-occurrence counts. The result is generally a network structure with some authors appearing as major foci (many links to others) representing specialties.
Correlation Matrix
Hierarchical cluster analysis: 8 cluster solution identifies major subject clusters
Multidimensional scaling: 2 dimensional map shows overall structure and major themes
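The step from raw cocitation counts to a correlation matrix can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: each author's row of counts is treated as a profile, profiles are compared with Pearson's r, and the cells involving either author of the pair are left out (one common way of handling the undefined diagonal; the slides do not say which was used). All counts and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx and syy else 0.0


def profile_correlations(matrix):
    """Correlate every pair of rows, skipping the two authors' own columns."""
    n = len(matrix)
    corr = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            keep = [k for k in range(n) if k not in (i, j)]
            r = pearson([matrix[i][k] for k in keep],
                        [matrix[j][k] for k in keep])
            corr[i][j] = corr[j][i] = r
    return corr


# Hypothetical symmetric raw cocitation counts for five authors A..E.
raw = [
    [0, 30, 2, 4, 1],
    [30, 0, 3, 5, 2],
    [2, 3, 0, 25, 20],
    [4, 5, 25, 0, 22],
    [1, 2, 20, 22, 0],
]
corr = profile_correlations(raw)
```

Authors with similar cocitation profiles (here A and B, or C and D) get correlations near 1 even when their direct cocitation count is not used, which is why the correlation matrix rather than the raw matrix feeds the clustering and scaling steps.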
Knowledge Elicitation Methods
Interviews and observation
Process tracing (e.g. protocol analysis)
Conceptual techniques
Card sorting is a conceptual technique that can be done alone or combined with semi-structured interviews.
Card Sorting
Software engineers contacted via e-mail, invited to participate in study
Task: sort cards bearing authors’ names into piles, label piles, complete short questionnaire
As many piles as desired
Piles with single authors
Pile of “don’t know” or “aren’t software engineers”
46 respondents participated in postal mail study (a few interviews)
Card Sorting Procedure
Stack of cards with authors' names (e.g. "Brooks, F.") sent to respondents with instructions
Cards were sorted into piles and labeled, based on respondents' perceptions (example pile labels: Don't Know, Metrics, Formal Methods)
RAW "CO-PILE" COUNTS
             ABDEL-HAMID  BASILI  BOOCH  DIJKSTRA  HOARE  JACOBSON  PFLEEGER
BASILI                 7
BOOCH                  0       1
DIJKSTRA               0       1      8
HOARE                  1       2      5        37
JACOBSON               0       0     30         4      3
PFLEEGER               7      28      0         0      1         0
SOMMERVILLE            0       0      3         2      1         1         2
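A minimal sketch of how raw "co-pile" counts could be tallied from the returned sorts: for each respondent, every pair of authors placed in the same pile adds one to that pair's count. The data layout (one dictionary of labeled piles per respondent), the function name `co_pile_counts`, and the choice to skip "Don't Know" piles are assumptions made for illustration; the slides do not state how such piles were treated.

```python
from collections import Counter
from itertools import combinations


def co_pile_counts(sorts, excluded=("Don't Know",)):
    """Count how many respondents put each pair of authors in the same pile."""
    counts = Counter()
    for piles in sorts:  # one dict per respondent: pile label -> author names
        for label, names in piles.items():
            if label in excluded:
                continue  # assumption: unfamiliar authors carry no similarity signal
            for a, b in combinations(sorted(names), 2):
                counts[(a, b)] += 1
    return counts


# Two hypothetical respondents.
sorts = [
    {"Formal Methods": ["DIJKSTRA", "HOARE"],
     "OO": ["BOOCH", "JACOBSON"],
     "Don't Know": ["PFLEEGER", "BASILI"]},
    {"Theory": ["HOARE", "DIJKSTRA", "BOOCH"],
     "Metrics": ["BASILI", "PFLEEGER"]},
]
co_piles = co_pile_counts(sorts)
```

With 46 respondents, each count can range from 0 to 46, so the resulting matrix plays the same role for card sorting that the raw cocitation matrix plays for ACA.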
CARD SORTING ANALYSES (correlation matrix)
Hierarchical cluster analysis: 8 cluster level
Multidimensional scaling: 2 dimensional map
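Hierarchical cluster analysis of a correlation matrix can be sketched in a few lines. The example below is illustrative only: it assumes complete linkage and a distance of 1 - r, neither of which is specified in the slides, and it stops merging when a requested number of clusters remains (the study reports an 8 cluster level; the toy data uses 3). The correlation values are hypothetical.

```python
def complete_linkage(dist, k):
    """Agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [{i} for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance is the farthest member pair.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] |= clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


names = ["DIJKSTRA", "HOARE", "BOOCH", "JACOBSON", "BASILI"]
# Hypothetical correlations between the authors' profiles.
corr = [
    [1.0, 0.9, 0.2, 0.1, -0.3],
    [0.9, 1.0, 0.3, 0.2, -0.2],
    [0.2, 0.3, 1.0, 0.8, -0.1],
    [0.1, 0.2, 0.8, 1.0, 0.0],
    [-0.3, -0.2, -0.1, 0.0, 1.0],
]
dist = [[1.0 - r for r in row] for row in corr]
groups = [sorted(names[i] for i in c) for c in complete_linkage(dist, 3)]
print(groups)
```

The cluster labels (for example "Formal Methods") are not produced by the algorithm; in the study they come from the analysts for ACA and from the respondents' pile labels for card sorting.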
Cocitation Map of 60 Highly Cited Authors in Software Engineering, 1990 - 1997
[Figure: two-dimensional MDS map of the 60 authors. Cluster labels: SW ARCHITECTURE/ SW REUSE; OBJECT-ORIENTED ANALYSIS & DESIGN/ PROGRAMMING; FORMAL APPROACHES TO DEVELOPMENT/ FORMAL METHODS; SYSTEMS ANALYSIS & DESIGN; SW TESTING/ RELIABILITY; SW METRICS; SW PERFORMANCE; SW PROJECT MGT. Axis labels: MICRO LEVEL, MACRO LEVEL, LOW FORMAL.]
PFNet of Raw Cocitation Counts for 60 Software Engineering Authors, 1992 - 1997
[Figure: Pathfinder network with the 60 authors as nodes; the links are not recoverable from the transcript.]
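The description of the PFNet given earlier (each author linked on the basis of the single highest co-occurrence count) can be illustrated with a simplified sketch. It keeps, for every author, only the link to the partner or partners with the highest raw cocitation count. This is a strongest-link approximation of that description, not the full Pathfinder network scaling algorithm, and the counts and function name are hypothetical.

```python
from collections import Counter


def strongest_links(names, matrix):
    """Link each author to the partner(s) sharing their highest cocitation count."""
    links = set()
    for i, row in enumerate(matrix):
        others = [(row[j], j) for j in range(len(row)) if j != i]
        top = max(count for count, _ in others)
        for count, j in others:
            if count == top:
                links.add(tuple(sorted((names[i], names[j]))))
    return links


names = ["BASILI", "BOEHM", "BOOCH", "RUMBAUGH", "SELBY"]
# Hypothetical symmetric raw cocitation counts.
matrix = [
    [0, 40, 5, 3, 60],
    [40, 0, 8, 6, 20],
    [5, 8, 0, 70, 2],
    [3, 6, 70, 0, 1],
    [60, 20, 2, 1, 0],
]
links = strongest_links(names, matrix)
# Number of links per author; authors with many links act as foci.
degree = Counter(name for link in links for name in link)
```

In this toy network BASILI attracts links from both BOEHM and SELBY, which is the pattern the slides describe as an author appearing as a major focus of a specialty.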
Comparisons: ACA and KE
Cluster similarity – most authors fall in similar clusters in terms of membership, with some differences in labeling.
There are differences between the way authors’ works are cited and the way the authors are perceived in terms of labels (known for textbook writing, cited for specific textbook content).
CARD SORTING CLUSTERS | COCITATION CLUSTERS
Authors named once below appear in only one of the two columns on the slide; which column is not recoverable from the transcript.

SW METRICS | SW METRICS
In both: BASILI, PFLEEGER, ROMBACH, CARD, MCCABE, GRADY, FENTON, KITCHENHAM, HARRISON, SELBY, SHEPPERD, KEMERER, ALBRECHT
Named once: JONES, WEYUKER

SE MANAGEMENT/ PROCESS MODELING | SE PROJECT MANAGEMENT
In both: BOEHM, GILB, CURTIS, HUMPHREY, ABDEL-HAMID, LEHMAN
Named once: BROOKS

FORMAL METHODS/ SW ARCHITECTURE | FORMAL METHODS/ FORMAL APPROACHES
In both: DIJKSTRA, HOARE, PARNAS, SHAW, WING, ZAVE, GHEZZI, KERNIGHAN
Named once: GARLAN, JONES, RAMAMOORTHY, DAVIS

OBJECT ORIENTED PROGRAMMING & DESIGN | OO ANALYSIS & DESIGN/ PROGRAMMING
In both: BOOCH, RUMBAUGH, JACOBSON, MEYER, COAD, GOLDBERG
Named once: SHNEIDERMAN

SE METHODOLOGIES/ SE TEXTS | SYSTEMS ANALYSIS & DESIGN
In both: PRESSMAN, SOMMERVILLE, DEMARCO, YOURDON, WASSERMAN, GOMAA, JACKSON
Named once: BROOKS, GLASS, MILLS, MYERS, DAVIS

SW REUSE; SW TOOLS & ENVIRONMENTS | SW ARCHITECTURE/ SW REUSE
In both: BIGGERSTAFF, TRACZ, PRIETO-DIAZ, KAISER, TICHY
Named once: GARLAN
Comparisons: ACA and KE
Map similarity – similar distribution of authors and clusters along the X-axis (r=0.73) but not along the Y-axis (r=-0.08).
The most important structural theme in Software Engineering, the “micro-macro” dimension, exists in both citation patterns and in perceptions of the field by citing authors.
Along the Y-axis, citing patterns focus on the content of authors’ work while general perceptions include more aspects of the authors’ personae.
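The axis-by-axis comparison of the two maps can be sketched as a pair of Pearson correlations: one between the authors' X coordinates on the ACA map and on the card sorting map, and one between their Y coordinates. The coordinates below are hypothetical, and the sketch assumes the two maps are already oriented the same way; the slides do not describe any alignment step.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def axis_correlations(map_a, map_b):
    """Correlate X with X and Y with Y over the authors present on both maps."""
    shared = sorted(set(map_a) & set(map_b))
    r_x = pearson([map_a[n][0] for n in shared], [map_b[n][0] for n in shared])
    r_y = pearson([map_a[n][1] for n in shared], [map_b[n][1] for n in shared])
    return r_x, r_y


# Hypothetical (x, y) positions on the two MDS maps.
aca_map = {"BOOCH": (-1.0, 0.5), "HOARE": (-0.6, -0.9),
           "BOEHM": (0.7, 0.4), "BASILI": (1.1, -0.2)}
ke_map = {"BOOCH": (-0.9, -0.3), "HOARE": (-0.8, 0.6),
          "BOEHM": (0.5, 0.7), "BASILI": (1.2, -0.8)}
r_x, r_y = axis_correlations(aca_map, ke_map)
print(round(r_x, 2), round(r_y, 2))
```

A high r on one axis and a near-zero r on the other, as reported for the study, indicates that the two methods agree on one organizing dimension but order the authors differently on the second.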
Conclusions
Boehm, Basili, Booch, and Hoare are central figures in the Software Engineering R&D literature; we can identify other authors as probable linkers between research specialties. The main organizing principle in SE is a continuum of activities related to the process of software design, development, and evaluation. Key specialties in Software Engineering (in the decade of the 1990s) included Object-Oriented Programming, Analysis & Design, Formal Methods, Software Reuse, Software Testing & Reliability, Software Process Management, and Software Metrics.
Conclusions
ACA (mapping, PFNets) and KE (card sorting) provide complementary views of software engineering.
KE methods increase our understanding of the domain by capturing subjects’ mental models of the domain and providing additional information about mapped entities.
ACA and KE provide useful cross-validation. The structure of the literature as seen through networks of author indebtedness (citation of previous work) is a good reflection of software engineers’ mental models of the field, the place of the (cited) authors, and the relationships among their contributions.