
Machine Learning Techniques for Code Smell Detection: A Systematic Literature Review and Meta-Analysis (2019)

Ivan Veselov

20 February 2019

Code smell

A term for code that shows signs of problems in the system.

Feature envy

Finding potential bugs

Identification of Move Method Refactoring Opportunities (2009)


Existing problems

1. A code smell does not always indicate a real problem

2. Existing detectors find different code smells

3. Detectors often require parameter tuning





F. A. Fontana, J. Dietrich, B. Walter, A. Yamashita, M. Zanoni, Antipattern and code smell false positives: Preliminary conceptualization and classification, in: Software Analysis, Evolution, and Reengineering (SANER), 2016 IEEE 23rd International Conference on, Vol. 1, IEEE, 2016, pp. 609–613.


Subtype knowledge anti-pattern






F. A. Fontana, P. Braione, M. Zanoni, Automatic detection of bad smells in code: An experimental assessment, Journal of Object Technology 11 (2) (2012) 5–1.

Naive agreement estimate

Finding feature envy

Project method   A.foo()   A.getValue(int)   ...   Z.destroy()
Detector 1       smelly    smelly            ...   OK
Detector 2       OK        smelly            ...   smelly

Cohen's kappa
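
Cohen's kappa corrects the naive agreement rate between two detectors for the agreement they would reach by chance: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected from each detector's label frequencies. The sketch below is illustrative only; the detector verdicts are made-up data, not results from the talk or the reviewed studies.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters that labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: the naive share of items with identical labels.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters gave one identical label to every item
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical verdicts of two detectors on six methods (made-up data).
detector_1 = ["smelly", "smelly", "OK", "OK", "OK", "OK"]
detector_2 = ["OK", "smelly", "OK", "OK", "OK", "smelly"]

naive = sum(a == b for a, b in zip(detector_1, detector_2)) / len(detector_1)
kappa = cohens_kappa(detector_1, detector_2)
print(f"naive agreement = {naive:.2f}, Cohen's kappa = {kappa:.2f}")
```

On this made-up input the detectors agree on four of six methods, yet kappa is much lower because most of that agreement comes from both detectors labelling the majority of methods as OK.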


Fleiss' kappa
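
Fleiss' kappa generalizes chance-corrected agreement to more than two raters, which fits the case of comparing several detectors at once. A minimal sketch, assuming every method is rated by the same number of detectors; the vote table is hypothetical.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i][j] = number of raters that put item i in category j."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number (>= 2) of raters")
    n_categories = len(ratings[0])
    # Share of all assignments that went to each category.
    p_category = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_mean = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_category)
    if p_expected == 1.0:
        return 1.0  # all raters used a single category for everything
    return (p_mean - p_expected) / (1 - p_expected)


# Hypothetical votes of three detectors on four methods; columns are [smelly, OK].
votes = [[3, 0], [0, 3], [2, 1], [1, 2]]
fleiss = fleiss_kappa(votes)
print(f"Fleiss' kappa = {fleiss:.3f}")
```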






E. Fernandes, J. Oliveira, G. Vale, T. Paiva, E. Figueiredo, A review-based comparative study of bad smell detection tools, in: Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering, EASE ’16, ACM, New York, NY, USA, 2016, pp. 18:1–18:12. doi:10.1145/2915970.2915984

ML can help

Machine Learning Techniques for Code Smell Detection: A Systematic Literature Review and Meta-Analysis

Paper search

Research questions

● RQ1: Code smells considered

● RQ2: Machine Learning Setup

● RQ3: Evaluation Setup

● RQ4: Performance Meta-Analysis


Code smells considered

Dangerous code smells

ML is not needed everywhere

Machine Learning Setup

● RQ2.1: What independent variables have been considered?

● RQ2.2: What classification types have been considered?

● RQ2.3: What machine learning algorithms have been used?

● RQ2.4: What training strategies have been proposed?


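Chidamber and Kemerer (CK) metrics

● WMC (Weighted Methods per Class)

● DIT (Depth of Inheritance Tree)

● NOC (Number of Children)

● CBO (Coupling Between Object classes)

● RFC (Response For a Class)

● LCOM1 (Lack of Cohesion in Methods)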

Depth of Inheritance Tree

● DIT (C0) = 0

● DIT (C0’) = 0

● DIT (C1) = 1

● DIT (C2) = 2

● DIT (C3) = 3

● DIT (C4) = 4
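
DIT is the length of the longest path from a class up to a root of the inheritance hierarchy. The sketch below reproduces the values listed above under an assumed hierarchy: the slide's diagram is not available in the text, so the `parents` mapping (a chain C0 to C4 plus a second root C0') is a hypothetical reconstruction.

```python
def depth_of_inheritance(cls, parents):
    """Length of the longest path from `cls` up to a root class."""
    supers = parents.get(cls, [])
    if not supers:
        return 0  # a root class has depth 0
    return 1 + max(depth_of_inheritance(p, parents) for p in supers)


# Assumed hierarchy (hypothetical): class name -> list of direct superclasses.
parents = {
    "C0": [],
    "C0'": [],
    "C1": ["C0"],
    "C2": ["C1", "C0'"],  # multiple inheritance: the longest path wins
    "C3": ["C2"],
    "C4": ["C3"],
}

dit = {name: depth_of_inheritance(name, parents) for name in parents}
for name, depth in dit.items():
    print(f"DIT({name}) = {depth}")
```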

Response For a Class

Process metrics

● Number of changes

● File age

● File name similarity

Textual information

Alternatives

● Developer-oriented metrics

● A. Perez, R. Abreu, Framing program comprehension as fault localization, J. Softw. 28 (10) (2016) 840–862.

● J. Padilha, J. Pereira, E. Figueiredo, J. Almeida, A. Garcia, C. Santanna, On the effectiveness of concern metrics to detect code smells: an empirical study, in: International Conference on Advanced Information Systems Engineering, Springer, 2014, pp. 656–671.

Scattering and Tangling


Classification types

● No smell

● Non-critical smell present

● Ordinary smell present

● Critical smell present

Algorithms used

● Hyperparameter tuning deserves more careful attention

● Ensembles are under-studied

Dataset splitting approaches

Evaluation Setup

● RQ3.1: What types of validation techniques have been exploited?

● RQ3.2: Which have been the evaluation metrics?

● RQ3.3: Which have been the datasets considered?


Validation approaches

Recommendations

Hall, T., Beecham, S., Bowes, D., Gray, D., & Counsell, S. (2012). A Systematic Literature Review on Fault Prediction Performance in Software Engineering. IEEE Transactions on Software Engineering, 38(6), 1276–1304. doi:10.1109/tse.2011.103


Evaluation metrics

Metrics that depend on a threshold value should be used less
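
To illustrate the point about thresholds: precision, recall and F1 change when the decision threshold moves, while ROC AUC is computed from the ranking of scores alone. The sketch below uses made-up classifier scores; the function names are illustrative and not taken from any of the reviewed studies.

```python
def precision_recall_f1(scores, labels, threshold):
    """Threshold-dependent metrics: an item is predicted smelly if score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum((not p) and y for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def roc_auc(scores, labels):
    """Threshold-free metric: chance that a random smelly item outscores a random
    clean one (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one smelly and one clean item")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up classifier scores and ground truth (True = smelly).
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]

for threshold in (0.5, 0.35):
    p, r, f1 = precision_recall_f1(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
print(f"ROC AUC = {roc_auc(scores, labels):.3f}")
```

The same scores yield different precision, recall and F1 at the two thresholds, while the AUC value stays fixed.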


Datasets

● Most studies use small datasets (up to 8 projects)

● Only one study labeled its dataset manually

● F. Palomba, D. Di Nucci, M. Tufano, G. Bavota, R. Oliveto, D. Poshyvanyk, A. De Lucia, Landfill: an open dataset of code smells with public evaluation, in: Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on, IEEE, 2015, pp. 482–485.

● F. Palomba, G. Bavota, M. Di Penta, F. Fasano, R. Oliveto, A. De Lucia, On the diffuseness and the impact on maintainability of code smells: a large scale empirical investigation, Empir. Softw. Eng. (2017) 1–34.




Datasets

● 395 releases of 30 open-source projects

● 13 types of code smell

● 17,350 validated instances

● Real code smells are not that common


Meta-analysis

● Effect Size

● Hedges’ g

● Inverse-variance method
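
The three ingredients above fit together as follows: each study contributes an effect size, Hedges' g is a standardized mean difference with a small-sample correction, and the inverse-variance method pools the per-study values with weights 1/variance. A minimal sketch using the textbook formulas and a fixed-effect pool; the reviewed paper may have used a different model, and all numbers are made up.

```python
import math


def hedges_g(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Return (g, variance of g) for two independent groups."""
    df = n_1 + n_2 - 2
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df)
    d = (mean_1 - mean_2) / pooled_sd  # Cohen's d
    correction = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n_1 + n_2) / (n_1 * n_2) + d ** 2 / (2 * (n_1 + n_2))
    return correction * d, correction ** 2 * var_d


def inverse_variance_pool(effects):
    """Fixed-effect pooled estimate from (effect, variance) pairs."""
    weights = [1 / variance for _, variance in effects]
    pooled = sum(w * g for w, (g, _) in zip(weights, effects)) / sum(weights)
    return pooled, 1 / sum(weights)


# Made-up example: F-measure of two configurations in two hypothetical studies,
# given as (mean, standard deviation, sample size) per configuration.
study_effects = [
    hedges_g(0.80, 0.10, 10, 0.70, 0.10, 10),
    hedges_g(0.75, 0.08, 20, 0.72, 0.09, 20),
]
pooled_g, pooled_var = inverse_variance_pool(study_effects)
print(f"pooled g = {pooled_g:.3f}, standard error = {math.sqrt(pooled_var):.3f}")
```

Studies with smaller variance (typically the larger ones) receive more weight in the pooled estimate.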


Impact of features

● Features should not be chosen arbitrarily

● Features should be tailored to the problem being solved

Impact of algorithms

Choosing an unsuitable algorithm can lead to poor results

Impact of dataset splitting

● Within-project works better

● Possibly due to overfitting
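
The difference between the two splitting strategies can be sketched as follows. In a within-project split the training and test sets come from the same projects, so a model may benefit from project-specific patterns; in a cross-project split the test project is never seen during training. The sample records and function names are hypothetical.

```python
import random


def within_project_split(samples, test_fraction=0.3, seed=0):
    """Split every project's samples; train and test share the same projects."""
    rng = random.Random(seed)
    by_project = {}
    for sample in samples:
        by_project.setdefault(sample["project"], []).append(sample)
    train, test = [], []
    for items in by_project.values():
        items = items[:]  # copy so the caller's list order is untouched
        rng.shuffle(items)
        n_test = round(len(items) * test_fraction)
        test += items[:n_test]
        train += items[n_test:]
    return train, test


def cross_project_split(samples, held_out_project):
    """Train on all other projects, test on the held-out one."""
    train = [s for s in samples if s["project"] != held_out_project]
    test = [s for s in samples if s["project"] == held_out_project]
    return train, test


# Hypothetical dataset: three projects with ten methods each.
samples = [
    {"project": project, "method": f"{project}.m{i}"}
    for project in ("A", "B", "C")
    for i in range(10)
]

wp_train, wp_test = within_project_split(samples)
cp_train, cp_test = cross_project_split(samples, held_out_project="C")
print(len(wp_train), len(wp_test), len(cp_train), len(cp_test))
```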
