Coverity Scan Project Spotlight: LibreOffice



Coverity Scan Service

The Coverity Scan™ service began as the largest public-private sector research project in the world focused on open source software quality and security. Initiated in 2006 with the U.S. Department of Homeland Security, Coverity now manages the project, providing our development testing technology as a free service to the open source community to help developers build quality and security into their software development process. With almost 700 projects participating, the Coverity Scan service enables open source developers to scan, or test, their Java, C and C++ code as it is written; flag critical quality and security defects that are difficult (if not impossible) to identify with other methods and manual reviews; and receive actionable information to help them quickly and efficiently fix the identified defects. More than 20,000 defects identified by the Coverity Scan service were fixed by open source developers in 2012 alone.

We’ve expanded beyond our annual Coverity Scan Report to create a series of open source project spotlights. This spotlight features LibreOffice, an open source office suite developed by The Document Foundation, and a member of the Scan service since October of 2012.

Introduction to LibreOffice

LibreOffice got its start in 2010 as an offshoot of the OpenOffice open source collaboration suite. OpenOffice had its roots in StarOffice, which Sun Microsystems acquired in 1999. In starting LibreOffice, The Document Foundation set out to build a better open source software suite by eschewing strict design rules and restrictions in favor of a more "people-centered" philosophy.

LibreOffice embodies the collaborative philosophy often associated with open source projects. One of its committers, Norbert Thiebaud, described that philosophy as "believe in the people": people are willing to do the right thing, and you must trust them. The community is focused on mentoring and supporting new members. Long-time members lead by example to communicate the values of the project, and new members emulate that behavior. At the core of these principles is the promise of better quality and higher reliability software that gives greater flexibility at zero cost.

Today, the project has the backing of all the major Linux distributions, including Red Hat, Novell and Ubuntu, as well as the support of the Free Software Foundation (FSF), Google, Intel and AMD. It is the default office suite of the most popular Linux distributions, and is available in more than 112 languages and for a variety of computing platforms, including Microsoft Windows, Mac OS X and Linux.


LibreOffice joined Coverity Scan in October of 2012. When the LibreOffice team first started using the Scan service, the project had a defect density of 1.11. The team has been working diligently to reduce the number of defects in the code, fixing more than 2,000 defects so far, and improving their defect density by almost 25%.

LIBREOFFICE ANALYSIS: 2013

Year          Version   Lines of Code Analyzed   Outstanding Defects   Defects Fixed
2013 to date  4.1       9,016,902                4,717                 2,150

DEFECT DENSITY BY PROJECT SIZE

Lines of Code         LibreOffice   Open Source   Proprietary
More than 1 million   .84           .75           .66


The following are the specific types of defects outstanding in LibreOffice:

OUTSTANDING DEFECTS AS OF OCTOBER 4, 2013

API usage errors 23

Class hierarchy inconsistencies 3

Code maintainability issues 79

Concurrent data access violations 2

Control flow issues 447

Error handling issues 1628

Incorrect expression 49

Insecure data handling 11

Integer handling issues 252

Memory - corruptions 56

Memory - illegal accesses 42

Null pointer dereferences 886

Parse warnings 1

Performance inefficiencies 3

Program hangs 4

Resource leaks 74

Security best practices violations 26

Uninitialized members 1090

Uninitialized variables 40

Various 1

Grand Total 4,717


As of October 4, 2013, they had fixed the following high- and medium-impact defects:

DEFECTS FIXED AS OF OCT. 4, 2013

API usage errors 14

Class hierarchy inconsistencies 1

Code maintainability issues 7

Concurrent data access violations 2

Control flow issues 133

Error handling issues 1386

Incorrect expression 26

Insecure data handling 3

Integer handling issues 39

Memory - corruptions 17

Memory - illegal accesses 34

Null pointer dereferences 122

Performance inefficiencies 4

Program hangs 6

Resource leaks 133

Security best practices violations 9

Uninitialized members 188

Uninitialized variables 26

Grand Total 2,150


Coverity Scan uncovered a buffer overrun in the LibreOffice code that leads to illegal memory access. Specifically, the code defines a static array of 12 8-byte structures (shown starting on line 225 of a code screenshot in the original report).

While iterating over that array, the code uses the size of the array in bytes (96) as the upper bound for the index, rather than the number of elements (12). Each element is 8 bytes, so the final iteration (index 95) reads bytes 760 through 767; the code on line 245 thus accesses memory all the way up to 768 bytes past the beginning of the array. Since the array only occupies 96 bytes, the loop will read and operate on 672 bytes of illegally accessed data!


To correct this issue, the developers now divide the size of the array by the size of each element, giving the number of elements in the array to loop over (the fixed code appears in a screenshot in the original report).


Q&A: Norbert Thiebaud, a Committer to LibreOffice

Q: Can you tell us a little bit about the philosophy of the LibreOffice project?

A: We have a very "people centered" philosophy that can be summarized as, "believe in the people." People are willing to do the right thing. The core group of engineers who started the project created a community that is accepting and helpful. When people want to help with the project, be nice. There is a willful effort to be accepting. The experienced developers recognize and value people who are trying to contribute and make a special effort to help them be successful. Long-time members of the project lead by example to communicate the values of the project, and new members try to emulate those values.

Q: Why do you think you are able to achieve high levels of quality?

A: LibreOffice is a very large project, with a very, very long history, and has more than seven million net lines of code, some of which are more than 20 years old. Automation and tools are a must to be able to cope with the scale, to catch potential problems as early as possible. For the ones that get away, we have a great, dedicated group of QA volunteers that help triage bugs reported by our fantastic and diverse user base, to help channel issues to developers.

We’ve also developed tools like bibisect, which allows testers to bisect between binary versions of the product to identify when a particular bug or regression appeared. We have an ever-growing stack of unit tests and integration tests. We regularly use dynamic analysis tools like Valgrind to identify performance and/or memory issues. And of course we benefit from static analysis tools like Coverity that can do what no single developer can: review the entirety of the code base, every time.

Q: What is it about the developers on your project that you think enables them to create high-quality code?

A: LibreOffice has many very senior developers that have been with the codebase for a decade or so... and they are quite accessible to the less seasoned developers. They mentor new volunteers and expose them to the tools and methods we use to avoid pitfalls. We are also growing a gerrit-based infrastructure to foster code review; it is integrated with buildbots to be able to build and test patches as early as possible in the process, at the patch level.

Q: What happens if you have a developer working on the project who submits code that doesn’t meet your quality expectations?

A: LibreOffice has essentially two levels of access to the code. The first level of access is Committers, who can push code, and the second is Contributors, who can propose patches via gerrit. For the latter, the patch has to be reviewed and approved by a Committer. At that stage, if there are issues with the patch, the work is commented on and a new version of the patch can be uploaded in gerrit based on these comments.

Committers can still use gerrit to seek the opinion of their peers, but they can push their work out directly. That work generates activity on a commit-list mailing list, which can trigger some post-facto review comments. In general, there is little problem with Committers not meeting quality expectations, and certainly nothing that is not solved quickly by peer pressure and cooperation with other developers to address any concerns.


Over time, Contributors who prove they are doing a good job can become Committers. It’s very organic when this happens; there is no set limit of patches or set amount of time.

Q: What sort of process do you follow to ensure high-quality code?

A: I would say an 'all-of-the-above' process. That is, there is not a rigid, formal process, but a wide variety of tools combined with motivated developers, QA and users that are all aiming for a quality product. In other words, use all the technology help you can get and trust your extended teams to each do their part to maintain the highest level of quality. A rigid process can, at times, be used as an excuse: "But I followed the process, so it must be good." Process, or more exactly guidelines, are there to guide and help achieve the goals, but are no substitute for personal involvement and responsibility. In LibreOffice, we believe that having people who are involved and take ownership of their work is much more important to achieving quality than any process.

Q: Do you have people in formal roles to ensure the quality of the code?

A: No. We do have tools, and people who manage these tools and who can and do raise attention if and when things slip, but quality is everybody's concern and priority. In general, the project tries to keep the structure as flat as possible. From the development perspective, there are essentially only two formal levels: Contributor and Committer. In LibreOffice, as in any open source project, some people by virtue of their skills and experience become informal, de-facto leads, but these are not formal roles and arise organically.

Q: Can you describe how development testing and the Coverity Scan service fit into your release process?

A: We have set up a buildbot that regularly uses Coverity Scan. We are still ramping up that infrastructure, but due to the size of the codebase (and therefore the size of the resulting scan), we are doing these builds every two to three weeks right now. Eventually, I think we will get to a weekly build.

Some developers are reviewing the output of Coverity Scan, triaging and fixing the bugs that need fixing. The long history, the complexity of the LibreOffice code base and the fact that the Coverity scans are relatively recent tools in our arsenal mean that we are still quite a bit behind with regard to triaging all the defects reported by Coverity, but we are catching up!

Q: What tools do you use, beside Coverity, and how do they impact your ability to deliver high-quality code?

A: We use anything and everything we can get our hands on and find the man-power to run. The latter means that the frequency of use of some of these tools can be variable, but that includes things like other static code analysis tools, code coverage, unit tests, automated integration tests, and of course good old elbow-grease by a small army of LibreOffice volunteers.

Q: What challenges do you face with regard to maintaining high-quality code that are unique to open source and how do you overcome these challenges?

A: The best tool in the world can only get you so far. The magic ingredient to a successful project is people. So the greatest challenge is to make sure we retain and continue to attract great people, and help them with all the tooling and support we can to leverage their enthusiasm and dedication.


Q: Do you have any last comments you would like to make?

A: LibreOffice started its new adventure, under the "umbrella" of The Document Foundation, about three years ago, building on the very large existing codebase inherited from OpenOffice.org. We have been making great progress at re-structuring, cleaning up and improving the quality of LibreOffice, and Coverity makes an important contribution there. Thank you!

Conclusion and Next Steps for Coverity Scan

LibreOffice has done an excellent job of addressing key defects in their code in the short time they have been part of the Coverity Scan service. We would like to thank the LibreOffice team for their participation in the service.

Register your open source project with Coverity Scan or sign up for a free trial to get visibility into the quality and security of your software code.


Software Integrity Report

Project Name: LibreOffice

Version: 4.1

Project Description: Scan project LibreOffice

Project Details:

Lines of Code Inspected: 9,016,902
Target Level 1: ACHIEVED

Project Defect Density: 0.51

High-Impact and Medium-Impact Defects: 4608

Company Name: LibreOffice
Point of Contact: Coverity Scan
Client email: [email protected]
Date: Oct 3, 2013 12:21:30 PM
Report ID: 7346e342-5dfe-4271-bde8-010a74ed206f

Coverity Product: Static Analysis
Product Version: 6.6.1
Coverity Point of Contact: Integrity Report
Coverity email: [email protected]

The Coverity Integrity Rating Program provides a standard way to objectively measure the integrity of your own software as well as software you integrate from suppliers and the open source community. Coverity Integrity Ratings are established based on the number of defects found by Coverity® Static Analysis when properly configured, as well as the potential impact of defects found. Coverity Integrity Ratings are indicators of software integrity, but do not guarantee that certain kinds of defects do not exist in rated software releases or that a release is free of defects. Coverity Integrity Ratings do not evaluate any aspect of the software development process used to create the software.

A Coverity customer interested in certifying their ratings can submit this report and the associated XML file to [email protected]. All report data will be assessed and, if the Coverity Integrity Rating Program Requirements are met, Coverity will certify the integrity level achieved for that code base, project, or product.


High-Risk Defects

High-impact defects that cause crashes, program instability, and performance problems.

Medium-Risk Defects

Medium-impact defects that cause incorrect results, concurrency problems, and system freezes.

Defect Risk by Component

Component   Owner   Defect Density
Other       -       0.63
writer      -       1.52
calc        -       1.19
draw        -       0.95
filters     -       0.26


Defects by Assigned Severity

High-severity defects have been tagged by developers as a clear threat to the program's stability and/or security.

Defect Severities by Component

Component   Owner   Defect Density
Other       -       0.63
writer      -       1.52
calc        -       1.19
draw        -       0.95
filters     -       0.26

Defects by Triage State


Coverity Software Integrity Report

The Coverity Software Integrity Rating is an objective standard used by developers, management, and business executives to assess the software integrity level of the code they are shipping in their products and systems.

Coverity rating requirements are based on an assessment of several factors:

• Defect density: For a given component or code base, the number of high-risk and medium-risk defects found by static analysis divided by the lines of code analyzed. Defect density excludes fixed defects and defects dismissed as false positives or intentional. For example, if there are 100 high-risk and medium-risk defects found by static analysis in a code base of 100,000 lines of code, the defect density would be 100/100,000 = 1 defect per thousand lines of code.

• Major severity defects: Developers can assess the severity of defects by marking them as Major, Moderate, or Minor (customizations might affect these labels). We consider all defects assigned a severity rating of Major to be worth reporting in the Integrity Report regardless of their risk level, because the severity rating is manually assigned by a developer who has reviewed the defect.

• False positive rate: Developers can mark defect reports as false positives if they are not real defects. We consider a false positive rate of less than 20% to be normal for Coverity Static Analysis. A false positive rate above 20% indicates possible misconfiguration, incorrect inspection, use of unusual idioms in the code, or a flaw in our analysis.
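The defect-density arithmetic used throughout this report can be sketched in a few lines; the function name below is illustrative, and the second call plugs in the figures from the LibreOffice Integrity Report above (4,608 high- and medium-impact defects across 9,016,902 lines analyzed).

```python
def defect_density(defects, lines_of_code):
    """Defects per thousand lines of code (KLOC), per the report's definition."""
    return defects / (lines_of_code / 1000)

# Worked example from the text: 100 defects in 100,000 lines.
print(defect_density(100, 100_000))  # 1.0 defect per KLOC

# LibreOffice's Integrity Report figures reproduce the reported
# density of 0.51 defects per KLOC.
print(round(defect_density(4608, 9_016_902), 2))  # 0.51
```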

Coverity Integrity Level 1 requires that the software has less than or equal to 1 defect per thousand lines of code, which is approximately the average defect density for the software industry.

Coverity Integrity Level 2 requires that the software has less than or equal to 0.1 defect per thousand lines of code, which is approximately at the 90th percentile for the software industry. This is a much higher bar to satisfy than Level 1. A million-line code base would have to have 100 or fewer defects to qualify for Level 2.

Coverity Integrity Level 3 is the highest bar in the rating system today. All three of the following criteria need to be met:

• Defect density less than or equal to 0.01 defect per thousand lines of code, which is approximately in the 99th percentile for the software industry. This means that a million-line code base must have ten or fewer defects remaining. The requirement does not specify zero defects because this might force the delay of a release for a few stray static analysis defects that are not in a critical component (or else giving up on achieving a target Level 3 for the release).

• False positive rate less than 20%. If the rate is higher, the results need to be audited by Coverity to qualify for this integrity rating level. A higher false positive rate indicates misconfiguration, usage of unusual idioms, or incorrect diagnosis of a large number of defects. Coverity Static Analysis has less than 20% false positives for most code bases, so we reserve the right to audit false positives when they exceed this threshold.

• Zero defects marked as Major severity by the user. The severity of each defect can be set to Major, Moderate, or Minor. This requirement ensures that all defects marked as Major by the user are fixed, because we believe that once human judgment has been applied, all Major defects must be fixed to achieve Level 3.

Level Not Achieved indicates that the target level criteria are not met. This means that the software has too many unresolved static analysis defects in it to qualify for the desired target integrity level. To achieve the target integrity level rating, more defects should be reviewed and fixed.

How to Use Your Software Integrity Rating

Set software integrity standards for your projects, products, and teams. It is often difficult for developers and development management to objectively compare the integrity of code bases, projects, and products. The Coverity Software Integrity Rating is a way to create "apples-to-apples" comparisons and promote the success of development teams that consistently deliver highly-rated software code and products. Development teams can also use these ratings as objective evidence to satisfy requirements for quality and safety standards.

Audit your software supply chain. It is challenging for companies to assess the integrity of software code from suppliers and partners that they integrate with their offerings. The Coverity Software Integrity Rating is a way to help companies create a common measurement of software integrity across their entire software supply chain.

Promote your commitment to software integrity. The integrity of your software has a direct impact on the integrity of your brand. Showcasing your commitment to software integrity is a valuable way to boost your brand value. It indicates that you are committed to delivering software that is safe, secure, and performs as expected.