mining sociotechnical information from software repositories

38
Mining Sociotechnical Information From Software Repositories UNIVERSITY OF SÃO PAULO, BRAZIL University of California, Irvine Informatics Seminar March/2014 Marco Aurélio Gerosa

Upload: marco-aurelio-gerosa

Post on 01-Nov-2014

991 views

Category:

Technology


0 download

DESCRIPTION

A large amount of data is produced during collaborative software development. The analysis of such data sets a great opportunity to better understand Software Engineering from the perspective of evidence-based research. Mining software repositories studies have explored both the technical and social aspects of software development contributed to the discovery of important information about how software development evolves and how developers collaborate. Several repositories store data regarding source code production (version control systems), communication between developers and users (forums and mailing lists), and coordination of activities (issue tracker, task managers, etc.). In the open source world, such data is available in large ecosystems of software development. Platforms such as GitHub host millions of repositories, which receive contributions from millions of developers worldwide. Some project repositories register data from more than a decade of development, enabling the analysis of projects from a historical perspective. In this talk, I will discuss some of the uses and challenges of mining software repositories, focusing on some works conducted in our group, such as: identification of change dependencies, evaluation of architectural degradation from commit meta-data, core-periphery analysis of developers participation, change-proneness prediction, analysis of the impact of refactoring on code quality, and relations between quality attributes of the test and the code being tested.

TRANSCRIPT

Page 1: Mining Sociotechnical Information From Software Repositories

Mining Sociotechnical Information From Software

Repositories

UNIVERSITY OF SÃO PAULO, BRAZIL

University of California, IrvineInformatics Seminar

March/2014

Marco Aurélio Gerosa

Page 2: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 2

Software repositories

March/2014

Programmer

Tester

UserClient

Programmer

ManagerSW Architect

Source Code

MessageArchives

BugReportsIssues Etc.

Current and historical artifacts and interactions are registered in software repositories

Computer Mediated Tools

Page 3: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 3

Repositories of repositories

March/2014

https://github.com/about/presshttp://octoverse.github.com/

11.3 millions repositories5.4 millions usersIn 2013:• 3 millions new users• 152 millions pushes• 25 millions comments• 14 millions issue• 7 millions pull requests

36K projectshttp://en.wikipedia.org/wiki/CodePlex

30K projectshttps://launchpad.net

324K projects3.4 millions developershttp://sourceforge.net/apps/trac/sourceforge/wiki/What%20is%20SourceForge.net

250K projects

http://en.wikipedia.org/wiki/Comparison_of_open_source_software_hosting_facilities

93K projects1 million users

200 projectshttp://projects.apache.org/indexes/alpha.html

661K projects29 billions of lines of codes3 millions users

33K projects

http://www.ohloh.net/

Page 4: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 4

Mining software repositories

March/2014

Mining

http://2014.msrconf.org/

“The Mining Software Repositories (MSR) field analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects.”

Information about a project

Information about an ecosystem

Information about Software Engineering

Decision making

Software understanding

Support maintenance

Empirical validation of ideas & techniques

Collaboration and software production

Practitioner Researcher

*3C Model: Fuks, H., Raposo, A., Gerosa, M.A., Pimentel, M. & Lucena, C.J.P. (2007) “The 3C Collaboration Model” in: The Encyclopedia of E-Collaboration, Ned Kock (org), ISBN 978-1-59904-000-4, pp. 637-644.

Com m unica tion

Coordina tionCoope ra tion

3C M odel*

Discussion listsComments on issuesCode commentsUser reportsQ&A sitesSocial mediaSource code

and artifacts Issue trackersProject management systemsReputation systems

Applications

Tag cloud from MSR 2014 CFP

Page 5: Mining Sociotechnical Information From Software Repositories

Examples

Page 6: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 6

Sentiment analysis on commits

March/2014

http://geeksta.net/geeklog/exploring-expressions-emotions-github-commit-messages/http://www.igvita.com/slides/2012/bigquery-github-strata.pdf

GitHub Data Challenge 2nd place

Page 7: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 7

Programming language relations

March/2014

https://github.com/mjwillson/ProgLangVisualise

http://www.igvita.com/slides/2012/bigquery-github-strata.pdf

Page 8: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 8

Mining discussion lists

March/2014

Christian Bird, Alex Gourley, Prem Devanbu, Michael Gertz, and Anand Swaminathan. 2006. Mining email social networks. MSR 2006

Anja Guzzi, Alberto Bacchelli, Michele Lanza, Martin Pinzger, and Arie van Deursen. 2013. Communication in open source software development mailing lists. MSR 2013

What is the discussion list for?

Page 9: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 9

Coordination requirements

March/2014

Cataldo, M., Dependencies in geographically distributed software development: Overcoming the limits of modularity, PhD Thesis

Santana, F. et a. “XFlow: An Extensible Tool for Empirical Analysis of Software Systems Evolution”. ESELAW 2011

Page 10: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 10

Programmers who changed this function also changed …

March/2014

Thomas Zimmermann, Peter Weissgerber, Stephan Diehl, and Andreas Zeller. 2005. Mining Version Histories to Guide Software Changes. IEEE Trans. Software Eng. 31, 6 (June 2005), 429-445. DOI=10.1109/TSE.2005.72 http://dx.doi.org/10.1109/TSE.2005.72

Page 11: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 11

Don’t program on Fridays

March/2014

http://www.slideshare.net/taoxiease/software-analytics-towards-software-mining-that-matters

Page 12: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 12

Which files are more buggy?

March/2014

Page 13: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 13

Lack of documentation

March/2014

Deficient Documentation Detection: A Methodology to Locate Deficient Project Documentation using Topic Analysis, Joshua Charles Campbell, Chenlei Zhang, Zhen Xu, Abram Hindle, and James Miller, MSR 2013

Page 14: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 14

Some questions from MSR 2013• How to recommend developers to bug reports?

• How to filter update notifications?

• What can apps’ permissions reveal?

• How to extract feature requests from apps online reviews?

• How to analyze peer code-review data?

• How to find API usage patterns?

• Is programming knowledge related to age?

• How do patches reach the Linux kernel?

• How do code clones evolve?

• How to process crash reports?

• How to predict defects and effort?

• Etc.March/2014

http://2013.msrconf.org/program.php

Page 15: Mining Sociotechnical Information From Software Repositories

Some of our work

Page 16: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 16

Some of our work1. Change dependencies identification

2. Design degradation identification

3. Key developers characterization

4. Unit tests’ feedback for code quality

5. Refactoring

6. Change prediction

7. Support for OSS projects newcomers

March/2014

Page 17: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 17

1) Change dependencies identification Dependency?

• “A dependency means that a client element has knowledge of the supplier element and a change in the supplier may affect the client” [Larman, 2004]

• A dependency relation means that the semantics of the depending elements is semantically or structurally dependent on the definition of the supplier element [UML Formal Specification]

• Unrecognized dependencies result in a higher number of defects [Herbsleb et al., 2006]

• Structural (or static), dynamic, or logical dependency?

• Dependencies may be hard to identify• Publisher/subscriber, polymorphism, clones, crosscutting concerns, semantic relations etc.

March/2014

A BC l as s A C l as s Bd o T h isF o rM e()

D ep en d en cy C h an g e

Page 18: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 18

Change dependencies Files frequently changed together share some sort of dependency [Gall et al. 1998]

The concept has proven useful in several different studies:◦ Software quality analysis [Cataldo & Nambiar, 2010]◦ Bugs prediction [D’Ambros et al. , 2009a]◦ Change prediction and change impact analysis [Zimmermann et al., 2005]◦ Uncover cross-cutting concerns [Adams et al., 2010]◦ Uncover design flaws and opportunities for refactoring [D’Ambros et al., 2009b]◦ Understand and evaluate software architecture [Zimmermann et al., 2003]◦ Requirements traceability [Ali et al., 2013]◦ Maintain documentation [Kagdi et al., 2006]

March/2014

Strong change dependency from B to A (the opposite is a much weaker dependency)

A logical dependency denotes an implicit and evolutionary relationship between software artifacts

Page 19: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 19

1.1) Structural x Change Dependencies

March/2014

Analysis of 150K commits of the ASF showed that:- 93% of the change dependencies did not involve structural dependencies- 95% of the structural dependencies did not imply in a change dependency

Oliva, G.A., Gerosa, M. A. (2011) “On the Interplay between Structural and Logical Dependencies in Free Software”. Brazilian Symposium on Software Engineering (SBES 2011)

Overlap between change dependencies and structural dependencies

Gustavo Oliva, PhD candidate

S truc tura lD e pe nde nc y C o-ch an g es ?

Page 20: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 20

1.2) Change dependencies originsManual classification of commits to understand the origins of change dependencies

March/2014

Category Joint-changes Total %Refactoring elements that belong to

a same semantic class80 19.6%

Structural dependencies on a changing semantic class

9 2.2%

Cross-cutting concerns 165 40.4%Overloaded revision 60 14.7%

Repository operations 21 5.1%Structural dependencies on specific

elements66 16.2%

Other reasons 7 1.7%Total 408  

Oliva, G.A., Santana, F., Gerosa, M. A., Souza, C. (2011) “Towards a Classification of Logical Dependencies Origins: A Case Study”. Proceedings of the 12th International Workshop on Principles of Software Evolution and the 7th annual ERCIM Workshop on Software Evolution (IWPSE-EVOL '11)

Gustavo Oliva, PhD candidate

Page 21: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 21

1.3) Preprocessing commits

March/2014

N Sum Mean StDev Skewness Kurtosis

Before 479,794 3,206,900 6.68 37.84 33.80 1,844.00

After 453,865 3,174,051 6.99 40.79 39.56 2,829.94

Evaluation in the Apache code repository showed that the produced grouping corresponded to 4.6% of the number of commits

What about commit habits/practices/policies? Social aspects matter!

Oliva, G. A., Santana, F., Gerosa, M. A., Souza, C. (2012), “Preprocessing Change-Sets to Improve Logical Dependencies Identification”, 6th Int. Workshop on Software Quality and Maintainability (SQM 2012)

Commit = change?Using the sliding time window approach [Zimmermann & Weißgerber, 2004] to group SVN commits

Gustavo Oliva, PhD candidate

Page 22: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 22

2) Design degradation identification

March/2014

Rigidity and fragility [Martin & Martin, 2006] identification based on commit metadata

Oliva, G., Steinmacher, I., Wiese, I.S., Gerosa, M.A. “What Can Commit Metadata Tell Us About Design Degradation?”, In: 13th International Workshop on Principles on Software Evolution (IWPSE 2013), Saint Petersburg.

Gustavo Oliva, PhD candidate

5.3

Rigidity => designs difficult to change due to ripple effects => commit density (number of changed files per commit) Fragility => designs break in different areas when a change is performed => commit dispersion (distance in the directory tree among file paths included in a commit)

Page 23: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 23

3) Key developers characterization

March/2014

Key developers participation

Oliva, G., Santana, F.W., da Silva, J. T., Oliveira, K.C.M., Werner, C.M.L., Souza, C.R.B. & Gerosa, M.A., “Evolving the System’s Core: A Case Study on the Identification and Characterization of Key Developers in Apache Ant”, Computing and Informatics [to appear].

Page 24: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 24

4) Unit tests’ feedback for code quality

March/2014

Number of asserts indicate - Cyclomatic complexity?- LOC ?- Method calls ?

22 ASF projects 3 industry projects

- “Asserted Objects” metric presents better results than “number of asserts”- Statistically difference in 20% of the projects

Mauricio Aniche, PhD candidate

Aniche, M., Oliva, G.A., Gerosa, M.A., “What Do the Asserts in a Unit Test Tell Us about Code Quality? A Study on Open Source and Industrial Projects”, 17th European Conference on Software Maintenance and Reengineering (CSMR 2013).

Page 25: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 25

5) Refactoring

March/2014

Most part of the documented refactoring does not reduce cyclomatic complexity. However, 23% of the documented refactoring reduce cyclomatic complexity while 12% of the other commits have the same effect.

Sokol, F., Aniche, M.F., Gerosa, M.A., “Does the Act of Refactoring Really Make Code Simpler? A Preliminary Study”,. In: IV Brazilian Workshop of Agile Methods (WBMA 2013).

Francisco Sokol, MSc candidate

www.metricminer.org.br

Page 26: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 26

6) Change prediction

March/2014

Using social, process, and architectural metrics to improve change proneness prediction

13 projects, 6 classifiers, 11 metrics

Social and architectural metrics improved the prediction model

Similar results when considering projects grouped by change ratio

Igor Wiese, PhD candidate

Dimensions Metric Name Reference

Process

HCM Hassan [13] WHCM D’Ambros et al. [8] Churn D’Ambros et al. [8]

WChurn Nagappan and Ball [23]

Social

MFD – Modification File-Developer

New

DevParticipation Rahman & Devanbu [28] WDevParticipation New

Architecture

Rigidity Oliva et al. [26] Fragility Oliva et al. [26]

WCo-Changes New UD – Unworked

Dependencies New

Wiese, I. S, Nassif Jr, Steinmacher I, Re Reginaldo, Gerosa, M.A “Comparing communication and development networks for predicting file change proneness: An exploratory study considering process and social metrics”,. In: International Workshop on Software Quality and Maintainability (SQM 2014).

Page 27: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 27

7) Supporting Newcomers to OSS projects

Analysis of newcomers dropout reasons

Use of MSR to identify newcomers

March/2014

COMUNICAÇÃO

COORDENAÇÃOCOOPERAÇÃO

gera com prom issos gerenciados pela

organiza as tarefas para

dem andaAwareness

com um + açãoAção de tornar com um

co + ordem + açãoAção de organizar

em conjunto

co + operar + açãoAção de operar

em conjunto

Steinmacher, I., Wiese, I.S., Chaves, A.P., Gerosa, M.A., Why do newcomers abandon open source software projects?, 6th Int. Workshop on Cooperative and Human Aspects of Software Engineering (CHASE 2013)

Igor Steinmacher, PhD candidate

Systematic review on awareness in DSD [JCSCW 2013]

Page 28: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 28

7) Supporting Newcomers to OSS projects◦ Systematic Literature Review on barriers for newcomers to OSS◦ Qualitative analysis of interviews

March/2014

Igor Steinmacher, PhD candidate

Steinmacher, I., Wiese, I., Conte, T., Gerosa, M.A., Redmiles, D.F. The Hard Life of Newcomers to OSS Projects, 7th International Workshop on Cooperative and Human Aspects of Software Engineering Steinmacher, I.; Graciotto Silva, M. A. ; GEROSA, M.A., Barriers faced by newcomers to open source projects: a systematic review. 10th International Conference on Open Source Systems (OSS 2014)

Page 29: Mining Sociotechnical Information From Software Repositories

And now?

Page 30: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 30

Some MSR challenges• Years of sociotechnical data of millions of projects• BigData & large-scale computing

• Natural Language processing• Hao Zhong, H., Zhang, L., Xie, T. & Mei, H. “Inferring Resource Specifications from Natural

Language API Documentation”, ASE 2009.

• Incomplete and inaccurate information (e.g., commit comments)

• Multiple identities and commits on behalf of others

• Traceability and provenance• Davies, German, Godfrey & Hindle. “Software bertillonage: finding the provenance of an entity”,

MSR 2011

• Tools usage change over time• Guzzi et al. “Communication in open source software development mailing lists”, MSR 2013

March/2014

Page 31: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 31

What’s next? (1/2) Towards theories and consolidation

Sharing insights

Sharing patterns ◦ DAPSE http://dapse.unbox.org ◦ MSR Cookbook

http://swag.cs.uwaterloo.ca/~okononen/msr /

Sharing data◦ http://www.githubarchive.org/◦ http://www.ohloh.net / ◦ Dataset papers published at MSR◦ MSR Challenges dataset◦ Promisedata.org Promisedata.org◦ Flossmetrics.org◦ Flossmole.org◦ MetricMiner

Replicating studies◦ Ghezzi, G. & Gall, H. “Replicating Mining Studies

with SOFAS”, MSR 2013 March/2014

http://thomas-zimmermann.com/2013/10/software-analytics-sharing-information/

[Gaines, 1999]

Page 32: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 32

What’s next? (2/2) Actionable information in the Software Environments (new IDEs)

Software Analytics◦ “The use of analysis, data, and systematic

reasoning to make decisions”

March/2014

http://thomas-zimmermann.com/2013/10/software-analytics-sharing-information/

New sources◦ A lot of information is lost between commits => IDE &

desktop instrumentation and live data collection ◦ Singer, J. “Navtracks: Supporting navigation in

software maintenance”, ICSM'05◦ Blincoe, K., Valetto, G. & Goggins, S. “Proximity: a

measure to quantify the need for developers' coordination. CSCW'12

◦ Software execution logs◦ Social Media

◦ Storey, M., Treude, C., Deursen, A. & Cheng, L., “The impact of social media on software engineering practices and tools” FoSER'10

◦ Bougie, G., Starke, J., Storey, M. & German, D., “Towards understanding twitter use in software engineering: preliminary findings, ongoing challenges and future questions”, Web2SE'11

◦ Q&A Sites◦ “Over 92% of Stack Overflow questions about expert

topics are answered - in a median time of 11 minutes” [Mamykina et al., CHI'11]

◦ Campbell et al., “Deficient Documentation Detection: A Methodology to Locate Deficient Project Documentation using Topic Analysis”, MSR'13

Page 33: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 33

Conferences◦ MSR: Working Conference on Mining Software Repositories

◦ http://msrconf.org◦ ICSM: International Conference on Software Maintenance

◦ http://conferences.computer.org/icsm◦ CSMR-WCRE: Software Evolution Week (joins CSMR and WCRE)

◦ http://ansymo.ua.ac.be/csmr-wcre◦ ICSE: International Conference on Software Engineering

◦ http://www.icse-conferences.org◦ SCAM: International Working Conference on Source Code Analysis and Manipulation

◦ http://www.ieee-scam.org/◦ ESEM: International Symposium on Empirical Software Engineering and Measurement

◦ http://www.esem-conferences.org◦ PROMISE

March/2014

Page 34: Mining Sociotechnical Information From Software Repositories

MARCO AUREL IO GEROS A

(GEROS [email protected]) – OFF ICE 5228@DBH

@GEROS A_MARCO

HTTP: / / L APESSC. IME .USP.BR/ (ALL PUBL ICAT IONS ARE AVA IL ABLE AT TH IS S ITE )

HTTP : / /WWW. IME .USP.BR/~GEROS A

HTTP : / /NAP.USP.BR/NAWEB

Thank you!

Page 35: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 35

Other projects

March/2014

http://www.arquigrafia.org.br

Smart Audio City Guide

82,000 images of Brazilian architecture

A social mobile system based on geo-referenced audio information to support the mobility of blind people

Page 36: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 36

Opportunities for Research in Brazil More than 38k scholarships implemented in 2013 for sending students abroad

Visiting researcher (PVE Ciencia sem Fronteiras) - http://bit.ly/1fJPfLG ◦ Projects for 2 or 3 years, with a stay of 30 to 90 days per year in Brazil, continuous or not ◦ ≈ US$ 6400 (R$ 14,000)/month stayed in Brazil◦ ≈ US$ 23,000 (R$ 50,000)/year for funding + scholarships◦ Transportation assistance, among other benefits

Young Talent Attraction (BJT) - http://bit.ly/17T8ycr ◦ 1 to 3 years◦ Min ≈ US$ 3200 (R$ 50,000)/month◦ Min ≈ US$ 9200 (R$ 20,000)/year for funding◦ Transportation assistance, support for accommodation

Visiting scholar (CAPES PVE) http://capes.gov.br/cooperacao-internacional/multinacional/pve ◦ Visits from 15 days to 1 year◦ ≈ US$ 3200 or US$ 4100 (R$ 6900 or R$ 8900)/month depending on the level ◦ Support for accommodation and transportation

FAPESP (Sao Paulo state) - http://fapesp.br/147 ◦ Visiting professor: ≈US$ 6300, 5000, or 4200 (R$ 13,653, R$ 10,950, R$ 9184)/month

depending on the level, visits of any length up to 12 months + support for transportation and health insurance

◦ Post-doc: ≈US$ 2700 (R$ 5908) + support for accommodation. 6 to 36 months. USP (University of Sao Paulo) - http://www.usp.br/prp/pagina.php?menu=3&pagina=17

◦ Specific grants for receiving retired researchers or faculty on sabbatical Bilateral agreements (with several universities)

March/2014

1996

1997

1998

1999

2000

2001

2002

2003

2004

2005

2006

2007

2008

2009

2010

2011

2012

0500

1,0001,5002,000

CNPq Investment on Research (in BRL mil-

lions)

1996

1998

2000

2002

2004

2006

2008

2010

2012

0

50,000

100,000

150,000

200,000

250,000

Scholarships abroad (in thousands BRL)

Page 37: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 37

References Change Dependencies

[Larman, 2004] LARMAN, CRAIG: Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development. third. ed. : Prentice Hall, 2004

[UML Formal Specification] OMG: Object Management Group: Unified Modeling Language (UML). http://www.omg.org/spec/UML/

[Ball et al, 1997] BALL, THOMAS ; ADAM, JUNG-MIN KIM ; HARVEY, A. PORTER ; SIY, P.: If Your Version Control System Could Talk... In: ICSE Workshop on Process Modeling and Empirical Studies of Software Engineering, 1997

[Gall et al., 1998] GALL, HARALD ; HAJEK, KARIN ; JAZAYERI, MEHDI: Detection of Logical Coupling Based on Product Release History. In: Proceedings of the International Conference on Software Maintenance, ICSM ’98. Washington, DC, USA : IEEE Computer Society, 1998 — ISBN 0-8186-8779-7, p. 190–

Change Dependencies and Its Use

[Cataldo & Nambiar, 2010] CATALDO, MARCELO ; NAMBIAR, SANGEETH: The impact of geographic distribution and the nature of technical coupling on the quality of global software development projects. In: Journal of Software Maintenance and Evolution: Research and Practice, John Wiley & Sons, Ltd. (2010)

[D’Ambros et al., 2009a] D’AMBROS, MARCO ; LANZA, MICHELE ; ROBBES, ROMAIN: On the Relationship Between Change Coupling and Software Defects. In: ZAIDMAN, A. ; ANTONIOL, G. ; DUCASSE, S. (eds.): 16th Working Conference on Reverse Engineering, WCRE 2009, 13-16 October 2009, Lille, France : IEEE Computer Society, 2009 — ISBN 978-0-7695-3867-9, pp. 135–144

[Zimmermann et al., 2005] ZIMMERMANN, THOMAS ; WEISSGERBER, PETER ; DIEHL, STEPHAN ; ZELLER, ANDREAS: Mining Version Histories to Guide Software Changes. In: IEEE Trans. Softw. Eng. vol. 31. Piscataway, NJ, USA, IEEE Press (2005), Nr. 6, pp. 429–445

March/2014

Page 38: Mining Sociotechnical Information From Software Repositories

Marco Aurélio Gerosa ([email protected]) 38

References Change dependencies and its use (continued)

[Adams et al., 2010] ADAMS, BRAM ; JIANG, ZHEN MING ; HASSAN, AHMED E.: Identifying crosscutting concerns using historical code changes. In: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, ICSE ’10. Cape Town, South Africa : ACM, 2010 — ISBN 978-1-60558-719-6, pp. 305–314

[D’Ambros et al., 2009b] D’AMBROS, MARCO ; LANZA, MICHELE ; LUNGU, MIRCEA: Visualizing Co-Change Information with the Evolution Radar. In: IEEE Trans. Software Eng vol. 35 (2009), Nr. 5, pp. 720–735

[Zimmermann et al., 2003] ZIMMERMANN, T. ; DIEHL, S. ; ZELLER, A.: How history justifies system architecture (or not). In: Software Evolution, 2003. Proceedings. Sixth International Workshop on Principles of, 2003, pp. 73–83

[Ali et al., 2013] ALI, N. ; JAAFAR, F. ; HASSAN, A.E.: Leveraging historical co-change information for requirements traceability. In: Reverse Engineering (WCRE), 2013 20th Working Conference on, 2013, pp. 361–370

[Kagdi et al., 2006] KAGDI, H. ; MALETIC, J.I.: Mining for Co-Changes in the Context of Web Localization. In: Web Site Evolution, 2006. WSE ’06. Eighth IEEE International Symposium on, 2006, pp. 50–57

Improving change dependencies identification

[Zimmermann et al., 2004] ZIMMERMANN, THOMAS ; WEIẞGERBER, PETER: Preprocessing CVS Data for Fine-Grained Analysis. In: Proceedings 1st International Workshop on Mining Software Repositories (MSR 2004). Los Alamitos CA : IEEE Computer Society Press, 2004, pp. 2–6

Design Degradation

[Martin & Martin, 2006] MARTIN, ROBERT C. ; MARTIN, MICAH: Agile Principles, Patterns, and Practices in C#. first. ed. : Prentice Hall, 2006

March/2014