research in software engineering
DESCRIPTION
This presentation is about a lecture I gave within the "Software systems and services" immigration course at the Gran Sasso Science Institute, L'Aquila (Italy): http://cs.gssi.infn.it/. http://www.ivanomalavolta.comTRANSCRIPT
Ivano Malavolta
RESEARCH in software engineering
Roadmap
Software engineering research
Empirical strategies
Writing good research papers
Homework
Software engineering research
Some contents of this part of lecture extracted from Ivica Crnkovic’s lecture on software engineering research at Mälardalen University (Sweden)
What makes good research?
is it HARD? is it USEFUL?
is it ELEGANT?
These are all orthogonal and equally respectful
Very little chances that you will excel in all three axes
We are young researchers, don’t refuse usefulness, why limit your impact to dusty publications?
http://goo.gl/d1YM9v
My vision about research
Research
Theory Industrial projectsProgramming Experimentation
Ivano Malavolta. Research Statement. November 2013. http://goo.gl/99N5AS
The basic characteristic of SE
Real world practical PROBLEM
Real world practical SOLUTION
?
Research objectives
Key objectives
• Quality àutility as well as functional correctness
• Cost à both of development and of use
• Timeliness à good-enough result, when it’s needed
Address problems that affect practical software
Real world practical PROBLEM
Real world practical SOLUTION
Research objectives: example
Example
Research strategy
Real world practical PROBLEM
Real world practical SOLUTION
Research setting IDEALIZED PROBLEM
Research setting SOLUTION to
IDEALIZED PROBLEM
Research product (technique, method, model, system, …)
Research product: example
Validation of the results
Real world practical PROBLEM
Real world practical SOLUTION
Research setting IDEALIZED PROBLEM
Research setting SOLUTION to
IDEALIZED PROBLEM
Research product (technique, method, model, system, …)
Validation task 1 Does the product solve the idealized problem?
Validation of the results
Real world practical PROBLEM
Real world practical SOLUTION
Research setting IDEALIZED PROBLEM
Research setting SOLUTION to
IDEALIZED PROBLEM
Research product (technique, method, model, system, …)
Validation task 1 Does the product solve the idealized problem?
Validation task 2 Does the product help to solve the practical problem?
Validation of the results: example
SE research process
Research questions
Research validation
Research results
Types of research questions
FEASIBILITY
CHARACTERIZATION
METHOD/MEANS
GENERALIZATION
DISCRIMINATION
Does X exist, and what is it? Is it possible to do X at all? What are the characteristics of X? What exactly do we mean by X? What are the varieties of X, and how are they related?
How can we do X? What is a better way to do X? How can we automate doing X?
Is X always true of Y? Given X, what will Y be?
How do I decide whether X or Y?
Example: software architecture
The software architecture of a program or computing system is the structure or structures of the system, which comprise software components, the externally visible properties of those components and the relationships among them
L. Bass, P. Clements, R. Kazman, Software Architecture In Practise, Addison Wesley, 1998
System
subsystem Subsystem
component component component
Example: SA research questions
FEASIBILITY
CHARACTERIZATION
METHOD/MEANS
GENERALIZATION
DISCRIMINATION
Is it possible to automatically generate code from an architectural specification?
What are the important concepts for modeling software architectures?
How can we exploit domain knowledge to improve software development?
What patterns capture and explain a significant set of architectural constructs?
How can a designer make tradeoff choices among architectural alternatives?
SE research process
Research questions
Research results
Research validation
Research results
Real world practical PROBLEM
Real world practical SOLUTION
Research setting IDEALIZED PROBLEM
Research product (technique, method, model, system, …)
Types of research results QUALITATIVE & DESCRIPTIVE MODELS
TECHNIQUES
SYSTEM EMPIRICAL MODELS ANALYTIC MODELS
Report interesting observations Generalize from (real-life) examples Structure a problem area; ask good questions
Invent new ways to do some tasks, including implementation techniques Develop ways to select from alternatives
Embody result in a system, using the system both for insight and as carrier of results
Develop empirical predictive models from observed data
Develop structural models that permit formal analysis
Example: SA research results QUALITATIVE & DESCRIPTIVE MODELS
TECHNIQUES
SYSTEM EMPIRICAL MODELS ANALYTIC MODELS
Early architectural models Architectural patterns Domain-specific software architectures UML to support object-oriented design
Architectural languages Communication metrics as indicator of impact on project complexity Formal specification of higher-level architecture for simulation
SE research process
Research questions
Research results
Research validation
Research validation
Real world practical PROBLEM
Real world practical SOLUTION
Research setting IDEALIZED PROBLEM
Research setting SOLUTION to
IDEALIZED PROBLEM
Research product (technique, method, model, system, …)
Validation task 1 Does the product solve the idealized problem?
Validation task 2 Does the result help to solve the practical problem?
Types of research validation PERSUASION
IMPLEMENTATION
EVALUATION ANALYSIS
Formal model Empirical model
EXPERIENCE
Qualitative model Decision criteria Empirical model
I thought hard about this, and I believe…
Here is a prototype of a system that…
Given these criteria, the object rates as…
Given the facts, here are consequences…
Rigorous derivation and proof Data on use in controlled situation
Report on use in practice Narrative Comparison of systems in actual use Data, usually statistical, on practice
Example: SA research validation PERSUASION
IMPLEMENTATION
EVALUATION ANALYSIS
Formal model Empirical model
EXPERIENCE
Qualitative model Decision criteria Empirical model
Early architectural models
Early architectural languages
Taxonomies, performance improvement
Formal schedulability analysis User interface structure
Architectural patterns Domain-specific architectures Communication and project complexity
“NO-NO”s for software engineering research
• Assume that a result demonstrated fro a 10K-line system will scale to a 500K-line system
• Expect everyone to do things “my way”
• Believe functional correctness is sufficient
• Assume the existence of a complete, consistent specification
• Just build things without extracting enduring lessons
• Devise a solution in ignorance of how the world really works
Building blocks for research
Feasibility
Characterization
Method/means
Generalization
Selection
Qualitative model
Technique
System
Empirical model
Analytic model
Persuasion
Implementation
Evaluation
Analysis
Experience
Question Result Validation
Is this a good plan?
Feasibility
Characterization
Method/means
Generalization
Selection
Qualitative model
Technique
System
Empirical model
Analytic model
Persuasion
Implementation
Evaluation
Analysis
Experience
Question Result Validation
A common good plan
Feasibility
Characterization
Can X be done better?
Generalization
Selection
Qualitative model
Technique
Build Y
Empirical model
Analytic model
Persuasion
Implementation
Measure Y, compare to X
Analysis
Experience
Question Result Validation
Is this a good plan?
Feasibility
Characterization
Method/means
Generalization
Selection
Qualitative model
Technique
System
Empirical model
Analytic model
Persuasion
Implementation
Evaluation
Analysis
Experience
Question Result Validation
A common, but bad, plan
Feasibility
Characterization
Method/means
Generalization
Selection
Qualitative model
Technique
System
Empirical model
Analytic model
Persuasion
Implementation
Evaluation
Analysis
Experience
Question Result Validation
Two other good plans
Can X be done at all?
Characterization
Is X always true of Y?
Selection
Qualitative model
Technique
Build a Y that does X
Empirical model
Formally model Y, prove X
“Look it works!”
Implementation
Check proof
Experience
Question Result Validation
Method/means Evaluation
How do you trust a research then?
1. What are the problems from the real world? – Are they general? – What are the elements of them?
2. Are the solutions general? What are their limits?
Real world practical PROBLEM
Real world practical SOLUTION
?
EMPIRICAL SOFTWARE ENGINEERING
Some contents of this part of lecture extracted from Matthias Galster ‘s tutorial titled “Introduction to Empirical Research Methodologies” at ECSA 2014
Empirical strategies*
*We will have a dedicated course on this topic
Empirical software engineering
Scientific use of quantitative and qualitative data to – understand and – improve
software products and software development processes Data is central to address any research question Issues related to validity addressed continuously
[Victor Basili]
Why empirical studies?
Anecdotal evidence or “common-sense” often not good enough
– Anecdotes often insufficient to support decisions in the industry – Practitioners need better advice on how and when to use
methodologies
Evidence important for successful technology transfer
– systematic gathering of evidence – wide dissemination of evidence
Dimensions of empirical studies
“In the lab” versus “in the wild” studies Qualitative versus quantitative studies Primary versus secondary studies
“In the lab” versus “in the wild” studies
Common “in the lab” methods – Controlled experiments – Literature reviews – Simulations
Common “in the wild” methods
– Quasi-experiments – Case studies – Survey research – Ethnographies – Action research
Examples
Qualitative versus quantitative studies
Qualitative research studying objects in their natural setting and letting the findings emerge from the observations
– inductive process – the subject is the person
Quantitative research quantifying a relationship or to compare two or more groups with the aim to identify a cause-effect relationship
– fixed implied factors – focus on collected quantitative data à promotes comparison and
statistical analyses
They are complementary
Primary versus secondary studies
Primary studies empirical studies in which we directly make measurements or observations about the objects of interest, whether by surveys, experiments, case studies, etc. Secondary studies empirical studies that do not generate any data from direct measurements, but:
– analyze a set of primary studies – usually seek to aggregate the results from these in order to
provide stronger forms of evidence about a phenomenon
Examples
…and what about this?
Types of empirical studies
• Survey • Case study • Experiment
Survey
Def: a system for collecting information from or about people to describe, compare or explain their knowledge, attitudes and behavior Often an investigation performed in retrospect Interviews and questionnaires are the primary means of gathering qualitative or quantitative data These are done through taking a sample which is representative from the population to be studied
Example: our survey on arch. languages 1. ALs Identification
– Definition of a preliminary set of ALs – Systematic search
2. Planning the Survey
3. Designing the survey
4. Analyzing the Data – vertical analysis (and coding) + horizontal analysis
Case study
Def: an empirical enquiry to investigate one instance (or a small number of instances) of a contemporary software engineering phenomenon within its real-life context, especially when the boundary between phenomenon and context cannot be clearly specified Observational study Data collected to track a specific attribute or establishing relationships between different attributes Multivariate statistical analysis is often applied
Example
Experiment
Def: an empirical enquiry that manipulates one factor or variable of the studied setting.
1. Identify and understand the variables that play a role in software development, and the connections between variables
2. Learn cause-effect relationships between the development process and the obtained products
3. Establish laws and theories about software construction that explain development behaviour
Experiment process
Example
http://dl.acm.org/citation.cfm?id=2491411.2491428
What to choose?
How to have an impact in reality?
This is called technology transfer
Writing good software engineering papers
Contents of this part of lecture extracted from Ivica Crnkovic’s lecture on software engineering research papers writing at Mälardalen University (Sweden)
Research Papers
The basic and most important activity of the research
• Visible results, quality stamp • Means for communications with other researchers
What, precisely, was your contribution? – What question did you answer? – Why should the reader care? – What larger question does this address?
What is your new result? – What new knowledge have you contributed that the reader can use
elsewhere? – What previous work (yours or someone else’s) do you build on? What do
you provide a superior alternative to? – How is your result different from and better than this prior work? – What, precisely and in detail, is your new result?
Why should the reader believe your result? – What standard should be used to evaluate your claim? – What concrete evidence shows that your result satisfies your claim?
If you answer these questions clearly, you’ll probably communicate your result well
A good research paper should answer a number of questions
Let’s reconsider our SE research process…
Research questions
Research results
Research validation
What do program committees look for? The program committee looks for
– a clear statement of the specific problem you solved – the question about software development you answered – an explanation of how the answer will help solve an important
software engineering problem
You'll devote most of your paper to describing your result, but you should begin by explaining what question you're answering and why the answer matters
Research questions
Research results
Explain precisely – what you have contributed to the store of software engineering
knowledge – how this is useful beyond your own project
What do program committees look for? The program committee looks for
– interesting, novel, exciting results that significantly enhance our ability
• to develop and maintain software • to know the quality of the software we develop • to recognize general principles about software • or to analyze properties of software
You should explain your result in such a way that someone else could use your ideas
What do program committees look for? What’s new here?
Use verbs that shows RESULTS, not only efforts
Philosophical moment
• What existing technology does your research build on?
• What existing technology or prior research does your research provide a superior alternative to?
• What’s new here compared to your own previous work?
• What alternatives have other researchers pursued?
• How is your work different or better?
What has been done before? How is your work different or better?
70
Explain the relation to other work clearly
What, precisely, is the result?
• Explain what your result is and how it works. Be concrete and specific. Use examples. – Example: system implementation
• If the implementation demonstrates an implementation technique, how does it help the reader use the technique in another setting?
• If the implementation demonstrates a capability or
performance improvement, what concrete evidence does it offer to support the claim?
• If the system is itself the result, in what way is it a contribution to knowledge? Does it, for example, show you can do something that no one has done before?
Why should the reader believe your result? Show evidence that your result is valid—that it actually helps to solve the problem you set out to solve
73
What do program committees look for? Why should the reader believe your result?
• If you claim to improve on prior art, compare your result objectively to the prior art
• If you used an analysis technique, follow the rules of that analysis technique
• If you offer practical experience as evidence for your result, establish the effect your research has. If at all possible, compare similar situations with and without your result
• If you performed a controlled experiment, explain the experimental design. What is the hypothesis? What is the treatment? What is being controlled?
• If you performed an empirical study, explain what you measured, how you analyzed it, and what you concluded
A couple of words on the abstract of a paper People judge papers by their abstracts and read the abstract in order to decide whether to read the whole paper.
It's important for the abstract to tell the whole story
Don't assume, though, that simply adding a sentence about analysis or experience to your abstract is sufficient; the paper must deliver what the abstract promises
Example of an abstract structure:
1. Two or three sentences about the current state of the art, identifying a particular problem
2. One or two sentences about what this paper contributes to improving the situation
3. One or two sentences about the specific result of the paper and the main idea behind it
4. A sentence about how the result is demonstrated or defended
Coming back to the initial example…
State of the art
Overall contribution
Specific results Validation
✓✗ ✓ ✓ ✗
Second try…
State of the art
Overall contribution
Specific results Validation
Homework
Homework
ICSE 2014 features a "Future of Software Engineering" track, which provides delegates with a unique opportunity to assess the current status of software engineering and to indicate where the field is heading in the future. FOSE is an invitation-only ICSE track that is held (approx.) every 7 or more years at ICSE An international group of leading experts has been invited to report on different topics, to provide a broad and in-depth view of the evolution of the field.
http://2014.icse-conferences.org/fose
Homework GOALS: 1. to have the chance to study a specific area of software
engineering that may be of interest to you 2. to be exposed to recurrent and important problems in
software engineering TASKS: 1. Pick an article from the FOSE 2014 proceedings 2. Carefully read it and analyse it in terms of:
– its research domain, its evolution over time, and its future challenges – [where possible] understand which research strategies have been
applied either in the paper or in the research area in general
3. give a presentation (max 25 slides) to the classroom – other post-docs and students will attend the presentations
What this lecture means to you?
You now know how to carry on research in SE Don’t focus on the “size” of the problem, but on
– the relevance (the practical, but also the theoretical!) – the accuracy in the investigation (problem and evaluation research)
When conducting empirical research, don’t make claims you cannot eventually measure Finally, don’t think in black and white only
– don’t divide the world in methods, analyses, case study, etc. – don’t be afraid to look also at other disciplines à we are software
engineers in any case J
Suggested readings
1. Checking App Behavior Against App Descriptions (Alessandra Gorla, Ilaria Tavecchia, Florian Gross, Andreas Zeller), In Proceedings of the 36th International Conference on Software Engineering, ACM, 2014.
2. Linares-Vásquez, M., Bavota, G., Bernal-Cárdenas, C., Oliveto, R., Di Penta, M., and Poshyvanyk, D., "Mining Energy-Greedy API Usage Patterns in Android Apps: an Empirical Study", in Proceedings of 11th IEEE Working Conference on Mining Software Repositories (MSR'14), Hyderabad, India, May 31- June 1, 2014, pp. 2-11
3. Shaw, M. (2003), Writing Good Software Engineering Research Paper., in Lori A. Clarke; Laurie Dillon & Walter F. Tichy, ed., 'ICSE' , IEEE Computer Society, , pp. 726-737 .
4. Shaw, M. (2002), 'What makes good research in software engineering?', STTT 4 (1) , 1-7 .
References
http://link.springer.com/book/10.1007%2F978-3-642-29044-2
Contact Ivano Malavolta |
Post-doc researcher Gran Sasso Science Institute
iivanoo
www.ivanomalavolta.com