Mining Logical Clones in Software: Revealing High-Level Business & Programming Rules

Wenyi Qian 1, Xin Peng 1, Zhenchang Xing 2, Stan Jarzabek 3, Wenyun Zhao 1
1 Fudan University, China
2 Nanyang Technological University, Singapore
3 National University of Singapore, Singapore

Post on 25-Feb-2016



Page 2

Logical Clones

• May not be well documented
• Reveal high-level rules

Page 3

Logical Clones

• Logical clones consist of:
  – Similar methods
  – Similar code fragments
  – Similar entity classes
  – Persistent data objects

Page 4

Logical Clones

• Today’s techniques for clone/similarity detection:
  – Simple clones (text, token, AST, …)
  – Structural clones (based on simple clones)
  – Similar design structures (similarity metrics, machine learning)

• They are not enough to detect high-level clones:
  – They lack high-level information
  – They need predefined templates, such as a particular design pattern

Page 5

Approach Overview

[Figure: overview of the approach, with stages labeled input, abstraction, and output]

Page 6

Program Model

• Methods & functional clusters
• Entity classes
• Code clones
• Persistent data objects
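The four kinds of elements listed above can be pictured as typed nodes in a graph. The following is a minimal sketch under that assumption; the names `NodeKind`, `Node`, `ProgramModel` and the relation label "accesses" are hypothetical and not taken from the slides. The sample labels are borrowed from the example figure only as illustration, and treating them as a method and an entity class is also an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    # The four element kinds named on the Program Model slide.
    METHOD = "method"
    ENTITY_CLASS = "entity class"
    CODE_CLONE = "code clone"
    PERSISTENT_DATA = "persistent data object"


@dataclass(frozen=True)
class Node:
    kind: NodeKind
    label: str  # e.g. a method name or a class name


@dataclass
class ProgramModel:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (source, relation, target) triples

    def relate(self, source, relation, target):
        """Record a directed, labeled relation between two nodes."""
        self.nodes.update((source, target))
        self.edges.add((source, relation, target))

    def neighbours(self, node):
        """Return the nodes directly reachable from `node`."""
        return {target for source, _, target in self.edges if source == node}


# Sample data (assumed roles, for illustration only).
model = ProgramModel()
pay = Node(NodeKind.METHOD, "processPay")
screen = Node(NodeKind.ENTITY_CLASS, "PosScreen")
model.relate(pay, "accesses", screen)
```

A graph of this shape keeps the node type next to each label, which is what lets later steps compare structures rather than raw code.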

Page 7

Program Model

• Methods & functional clusters
  – Semantic clustering
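The slide names semantic clustering but does not say which technique is used. As a rough stand-in, the sketch below groups methods whose names share enough word tokens, using Jaccard similarity and a greedy pass. The functions `tokens`, `jaccard`, `cluster`, the threshold of 0.3 and the sample method names are all assumptions made for illustration.

```python
import re


def tokens(name):
    """Split a camelCase identifier into a set of lowercase word tokens."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)
    return frozenset(p.lower() for p in parts)


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(names, threshold=0.3):
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters = []
    for name in names:
        for group in clusters:
            if jaccard(tokens(name), tokens(group[0])) >= threshold:
                group.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical method names, loosely modeled on the point-of-sale example.
names = ["processPayCheck", "processPayGiftCard", "clearPayment", "processPayCash"]
clusters = cluster(names)
```

Here the three `processPay…` methods land in one functional cluster and `clearPayment` in another. A real implementation would likely use richer text (comments, identifiers in the body) than method names alone.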

Page 8

Program Model

• Entity classes
  – Encapsulating information with getters/setters
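One simple way to recognize such classes is to check how many of their methods are accessors. The sketch below is a heuristic under that assumption, not the rule used by the authors; the function name `is_entity_class`, the `get`/`set`/`is` prefixes and the 0.8 ratio are illustrative choices.

```python
import re

# Accessor names: get/set/is followed by an uppercase letter, digit or underscore,
# so that ordinary words such as "settle" or "issue" are not counted.
ACCESSOR = re.compile(r"^(get|set|is)[A-Z0-9_]")


def is_entity_class(method_names, min_ratio=0.8):
    """Treat a class as an entity class if most of its methods are accessors."""
    if not method_names:
        return False
    accessors = sum(1 for name in method_names if ACCESSOR.match(name))
    return accessors / len(method_names) >= min_ratio


# Hypothetical method lists for two classes.
payment_info = ["getId", "setId", "getAmount", "setAmount", "isRefunded"]
pos_logic = ["processPay", "getTotal", "settle", "clear"]
```

With these inputs, `is_entity_class(payment_info)` holds and `is_entity_class(pos_logic)` does not, since only one of its four methods is an accessor.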

Page 9

Program Model

• Code clones
  – Simple clones in different methods
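To make "simple clones in different methods" concrete, here is a small token-based sketch: identifiers and numbers are normalized, and any window of consecutive tokens that appears in more than one method is reported. It only illustrates the general token-based idea mentioned earlier in the slides; the function names, the keyword list, the window size and the sample method bodies are assumptions, and no particular clone detector is implied.

```python
import re
from collections import defaultdict

KEYWORDS = {"if", "else", "for", "while", "return", "new", "null"}
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(code):
    """Replace identifiers with ID and numbers with NUM; keep keywords and symbols."""
    out = []
    for tok in TOKEN.findall(code):
        if tok in KEYWORDS:
            out.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            out.append("ID")
        elif tok[0].isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return out


def shared_windows(methods, window=6):
    """Map each normalized token window to the methods containing it, keeping shared ones."""
    index = defaultdict(set)
    for name, code in methods.items():
        toks = normalize(code)
        for i in range(len(toks) - window + 1):
            index[tuple(toks[i:i + window])].add(name)
    return {w: names for w, names in index.items() if len(names) > 1}


# Hypothetical method bodies.
methods = {
    "payCheck": "if (amount > 0) { total = total + amount; }",
    "payGiftCard": "if (value > 0) { sum = sum + value; }",
    "clear": "items.clear(); total = 0;",
}
clones = shared_windows(methods)
```

In this sample, `payCheck` and `payGiftCard` differ only in variable names, so every shared window links exactly those two methods, while `clear` shares none.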

Page 10

Program Model

• Persistent data objects
  – Data tables in a DB or data entries in files

Page 11

Mining Process

[Figure: example graph for the mining process. Node labels: PosScreen processPay (twice), PosPayCheck, PosPayGiftCard, PosClearPayment, PosScreen. Node types: five <Method> nodes and three <Entity class> nodes.]

Page 18

Tool: MiLoCo

Page 19

Case Study

• Project: Opentaps 1.4.0
  – 14,351 classes & interfaces
  – 253,743 methods

• 1,690 logical clones mined
  – Each with at least 3 nodes & 2 instances
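The size thresholds above amount to a filter over mined candidates. The sketch below shows that filter under the assumption that each candidate records its abstract nodes and its concrete instances; the `LogicalClone` class, the `keep` function and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LogicalClone:
    nodes: list      # abstract nodes of the pattern (methods, entity classes, ...)
    instances: list  # concrete occurrences of the pattern in the code base


def keep(candidates, min_nodes=3, min_instances=2):
    """Keep candidates that are large enough and occur often enough."""
    return [
        c for c in candidates
        if len(c.nodes) >= min_nodes and len(c.instances) >= min_instances
    ]


# Hypothetical candidates: only the first meets both thresholds.
candidates = [
    LogicalClone(nodes=["m1", "m2", "e1"], instances=["i1", "i2"]),
    LogicalClone(nodes=["m1", "e1"], instances=["i1", "i2", "i3"]),
    LogicalClone(nodes=["m1", "m2", "e1", "e2"], instances=["i1"]),
]
kept = keep(candidates)
```

The second candidate is dropped for having too few nodes and the third for having a single instance, which by definition cannot be a clone.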

Page 20

Case Study

Page 21

Categories of Logical Clones

• Categories of mined logical clones (classified manually)
  – Programming Convention (37%)
  – Design Structure (24%)
  – Business Task (23%)
  – Business Process (16%)

Page 22

Categories of Logical Clones

• Programming Convention
  – Similar ways to implement similar functions

Page 23

Categories of Logical Clones

• Design Structure
  – Similar interaction structures

Page 24

Categories of Logical Clones

• Business Task
  – Similar ways to implement similar business tasks

Page 25

Categories of Logical Clones

• Business Process
  – Similar business processes or sub-processes

Page 26

Human Study

• 5 senior graduate students, 2 questions:
  – Helpful for program understanding?
  – Helpful for reuse/evolution?

Page 27

Human Study

Page 28

Human Study

• 5 senior graduate students, 2 questions:
  – Helpful for program understanding? YES
  – Helpful for reuse/evolution? YES

Page 29

Discussion

• Helpful for reuse, without knowledge of code details

• Developers with good domain knowledge will make better use of logical clones

• Integrating MiLoCo with IDEs will make logical clones more useful

Page 30

Conclusion

• The concept of logical clones
• The approach for mining logical clones
• The tool: MiLoCo
• A case study, showing that logical clones are helpful in software understanding, reuse and maintenance

Page 31

Thanks for your attention!