kernel recipes 2013 - automating source code evolutions using coccinelle

Post on 24-May-2015

733 Views

Category:

Technology

2 Downloads

Preview:

Click to see full reader

DESCRIPTION

New APIs are continually added to the Linux kernel, to improve functionality, reliability, or performance. Nevertheless, it is a challenge to update old code to use these new APIs. Coccinelle is a program matching and transformation tool for C code that has been extensively been applied to the Linux kernel. This talk will introduce Coccinelle and illustrate how it can be used for this kind of code renovation.

TRANSCRIPT

Automating Source Code Evolutions usingCoccinelle

Julia Lawall (Inria/LIP6)

Joint work withGilles Muller, René Rydhof Hansen,

Nicolas Palix

September 24, 2013

1

Legacy software:Changing priorities, changing requirements

Linux kernel examples:I Booleans:

– Use 0 / 1?– Use true / false?

I Managed memory:– kmalloc requires kfree, request_irq requiresfree_irq

– Devm interface implicitly cleans up allocated resources.

Issues:I These changes require pervasive, scattered modifications.I Nothing forces the changes to be made.I Developers don’t pick up on new coding strategies.

2

The use of booleans over time

Lin

ux 2

.6.2

0

Lin

ux 2

.6.2

2

Lin

ux 2

.6.2

4

Lin

ux 2

.6.2

6

Lin

ux 2

.6.2

8

Lin

ux 2

.6.3

0

Lin

ux 2

.6.3

2

Lin

ux 2

.6.3

4

Lin

ux 2

.6.3

6

Lin

ux 3

.0

Lin

ux 3

.2

Lin

ux 3

.4

Lin

ux 3

.6

Lin

ux 3

.8

Lin

ux 3

.10

0

10000

20000

30000

Occurrences boolean values

boolean variables

ints in bools

(Roughly every 6 months, February 2007 - June 2013).

3

The use of booleans over time - rate of bad code

Lin

ux 2

.6.2

0

Lin

ux 2

.6.2

2

Lin

ux 2

.6.2

4

Lin

ux 2

.6.2

6

Lin

ux 2

.6.2

8

Lin

ux 2

.6.3

0

Lin

ux 2

.6.3

2

Lin

ux 2

.6.3

4

Lin

ux 2

.6.3

6

Lin

ux 3

.0

Lin

ux 3

.2

Lin

ux 3

.4

Lin

ux 3

.6

Lin

ux 3

.8

Lin

ux 3

.10

0

5

10

15

20

%

int values in bool variables

4

Booleans: A concrete example

Desired property: bool variables should be true or false.

Code fragment:

static bool overlapping_resync_write(struct drbd_conf *mdev, ...) {struct drbd_peer_request *rs_req;bool rv = 0;spin_lock_irq(&mdev->tconn->req_lock);list_for_each_entry(rs_req, &mdev->sync_ee, w.list) {

if (overlaps(peer_req->i.sector, peer_req->i.size,rs_req->i.sector, rs_req->i.size)) {

rv = 1;break;

}}spin_unlock_irq(&mdev->tconn->req_lock);return rv;

}

5

The use of devm functions over time

Linux 2.6.20

Linux 2.6.22

Linux 2.6.24

Linux 2.6.26

Linux 2.6.28

Linux 2.6.30

Linux 2.6.32

Linux 2.6.34

Linux 2.6.36

Linux 3.0

Linux 3.2

Linux 3.4

Linux 3.6

Linux 3.8

Linux 3.10

0

200

400

600

Occurrences

platform kzalloc

platform devm_kzalloc

i2c kzalloc

i2c devm_kzalloc

6

Devm functions: A concrete example

Desired property (simpler than introducing devm functions):I A common pattern is platform_get_resource followed

by devm_ioremap_resource.

I devm_ioremap_resource does error handling forthe platform_get_resource value.

I Separate error handling is not needed.

Code fragment:

res = platform_get_resource(pdev, IORESOURCE_MEM, 0);if (!res)

return -ENODEV;ahb->regs = devm_ioremap_resource(&pdev->dev, res);

7

How to make these changes automatically andreliably?

Requirements:I Find relevant code.

– grep does this...

I Make changes.– sed does this...

Problem: Grep and sed don’t know about code structure orsemantics.

8

Our approach: Coccinelle

I Static analysis to find patterns in C code.

I Automatic transformation.

I User scriptable, based on patch notation(semantic patches).

http://coccinelle.lip6.fr/

http://coccinellery.org/

9

Boolean example, part 1

@@ -1932,14 +1932,14 @@static bool overlapping_resync_write(struct drbd_conf *mdev, ...) {

struct drbd_peer_request *rs_req;- bool rv = 0;+ bool rv = false;

spin_lock_irq(&mdev->tconn->req_lock);list_for_each_entry(rs_req, &mdev->sync_ee, w.list) {

if (overlaps(peer_req->i.sector, peer_req->i.size,rs_req->i.sector, rs_req->i.size)) {

- rv = 1;+ rv = true;

break;}

}spin_unlock_irq(&mdev->tconn->req_lock);return rv;

}

First step:I Replace bool b = 0; by bool b = false;

I Replace bool b = 1; by bool b = true;10

Semantic patch: booleans, part 1

@@identifier b;@@

bool b =- 0+ false;

@@identifier b;@@

bool b =- 1+ true;

11

Semantic patch application: booleans, part 1

Result:

@@ -1932,14 +1932,14 @@static bool overlapping_resync_write(struct drbd_conf *mdev, ...) {

struct drbd_peer_request *rs_req;- bool rv = 0;+ bool rv = false;

spin_lock_irq(&mdev->tconn->req_lock);list_for_each_entry(rs_req, &mdev->sync_ee, w.list) {

if (overlaps(peer_req->i.sector, peer_req->i.size,rs_req->i.sector, rs_req->i.size)) {

rv = 1;break;

}}spin_unlock_irq(&mdev->tconn->req_lock);return rv;

}

Affects 187 lines, 124 files in Linux 3.10.I Fixes the declaration, but not the subsequent assignment...

12

Boolean example: part 2

Problem:I We cannot replace every 0 by false, and 1 by true.I The assignment must involve a boolean variable.

Solution: Type constraints on metavariables.

@@ bool b; @@b =- 0+ false

@@ bool b; @@b =- 1+ true

13

Semantic patch application: booleans, part 2

Result:

@@ -1932,14 +1932,14 @@static bool overlapping_resync_write(struct drbd_conf *mdev, ...) {

struct drbd_peer_request *rs_req;- bool rv = 0;+ bool rv = false;

spin_lock_irq(&mdev->tconn->req_lock);list_for_each_entry(rs_req, &mdev->sync_ee, w.list) {

if (overlaps(peer_req->i.sector, peer_req->i.size,rs_req->i.sector, rs_req->i.size)) {

- rv = 1;+ rv = true;

break;}

}spin_unlock_irq(&mdev->tconn->req_lock);return rv;

}

Affects 657 lines, 236 files in Linux 3.10.

14

Remaining issues

I Function arguments.

I Function return values.

I Booleans used with non-bool variables.

Beyond the scope of this talk...

15

platform_get_resource example, part 1

mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);if (!mem) {

dev_err(&pdev->dev, "no mem resource?");return -ENODEV;

}irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0);if (!irq) {

dev_err(&pdev->dev, "no irq resource?");return -ENODEV;

}... // 35 lines of codedev->base = devm_ioremap_resource(&pdev->dev, mem);if (IS_ERR(dev->base)) {

r = PTR_ERR(dev->base);goto err_unuse_clocks;

}

16

platform_get_resource example, part 1

- mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);- if (!mem) {- dev_err(&pdev->dev, "no mem resource?");- return -ENODEV;- }

...+ mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);

dev->base = devm_ioremap_resource(&pdev->dev, mem);if (IS_ERR(dev->base)) {

r = PTR_ERR(dev->base);goto err_unuse_clocks;

}

Problem: Need both platform_get_resource anddevm_ioremap_resource.

I Separated by arbitrary code fragments.

17

Semantic patch: platform_get_resource, part 1

Copy the code fragment, from the first line to the last line

@@expression res, pdev, n, e, x;

@@

mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);if (!mem) {

dev_err(&pdev->dev, "no mem resource?");return -ENODEV;

}irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0);if (!irq) {

dev_err(&pdev->dev, "no irq resource?");return -ENODEV;

}...dev->base = devm_ioremap_resource(&pdev->dev, mem);

18

Semantic patch: platform_get_resource, part 1

Drop the irrelevant parts

@@expression res, pdev, n, e, x;

@@

mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);if (!mem) {

dev_err(&pdev->dev, "no mem resource?");return -ENODEV;

}

...dev->base = devm_ioremap_resource(&pdev->dev, mem);

19

Semantic patch: platform_get_resource, part 1

Abstract over irrelevant detailsy

@@expression mem, pdev, n, e;statement S;@@

mem = platform_get_resource(pdev, IORESOURCE_MEM, n);if (!mem) S

...e = devm_ioremap_resource(&pdev->dev, mem);

20

Semantic patch: platform_get_resource, part 1

Introduce transformationsy

@@expression mem, pdev, n, e;statement S;@@

- mem = platform_get_resource(pdev, IORESOURCE_MEM, n);- if (!mem) S

...+ mem = platform_get_resource(pdev, IORESOURCE_MEM, n);

e = devm_ioremap_resource(&pdev->dev, mem);

21

Semantic patch: platform_get_resource, part 1

Add sanity checks

@@expression mem, pdev, n, e;statement S;@@

- mem = platform_get_resource(pdev, IORESOURCE_MEM, n);- if (!mem) S

... when != mem+ mem = platform_get_resource(pdev, IORESOURCE_MEM, n);

e = devm_ioremap_resource(&pdev->dev, mem);

22

Semantic patch application:platform_get_resource, part 1

Result:

- mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);- if (!mem) {- dev_err(&pdev->dev, "no mem resource?");- return -ENODEV;- }

irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0);if (!irq) {

dev_err(&pdev->dev, "no irq resource?");return -ENODEV;

}...

+ mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);dev->base = devm_ioremap_resource(&pdev->dev, mem);if (IS_ERR(dev->base)) {

r = PTR_ERR(dev->base);goto err_unuse_clocks;

}

Affects 42 files in Linux 3.10.

23

platform_get_resource example, part 2

Problem: Some calls are missed because &pdev->dev isrenamed.

Code fragment:

struct device *dev = &pdev->dev;void __iomem *addr = NULL;struct stmmac_priv *priv = NULL;struct plat_stmmacenet_data *plat_dat = NULL;const char *mac = NULL;

res = platform_get_resource(pdev, IORESOURCE_MEM, 0);if (!res)

return -ENODEV;

addr = devm_ioremap_resource(dev, res);if (IS_ERR(addr))

return PTR_ERR(addr);

24

Semantic patch: platform_get_resource, part 2

@@expression res, pdev, n, e, x, e1, e2;statement S;@@

x = &pdev->dev... when != x = e1

- res = platform_get_resource(pdev, IORESOURCE_MEM, n);- if (!res) S

... when != memwhen != x = e2

+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);e = devm_ioremap_resource(x,res);

Result: Updates 4 more files in Linux-3.10.

25

Summary

We have seen the need for:

I Search and replace for atomic code fragments.– Declaration and initialization of boolean variables.

I Search and replace using type information.– Assignments of boolean variables.

I Search and replace for scattered code fragments– platform_get_resource somewhere followed bydevm_iomap_resource.

I Taking renamings into account.– &pdev->dev vs dev.

26

Conclusion

CoccinelleI Define and perform arbitrary transformations across a

code base.

Reasonable performance in most cases.I Boolean example takes 7 minutes on a 4 core laptop.

Impact:I Over 1000 patches in Linux based on Coccinelle.I Over 40 semantic patches in linux/scripts/coccinelle.

http://coccinelle.lip6.fr/http://coccinellery.org/

27

top related