Software Test & Performance issue Apr 2009


Page 1: Software Test & Performance issue Apr 2009

VOLUME 6 • ISSUE 4 • APRIL 2009 • $8.95 • www.stpcollaborative.com

Turn the Hackers' Tool Into Your App's Security page 12

BEST PRACTICES:

Testing .NET Apps

Speak in Tongues: The CERT C Spec

Sniff Out Security Flaws Without A Faustian Bargain

Don't Get Burned With Two Impossible Choices

Page 2: Software Test & Performance issue Apr 2009
Page 3: Software Test & Performance issue Apr 2009

Contents
VOLUME 6 • ISSUE 4 • APRIL 2009

12 COVER STORY
Making a Career of Evil: Using a Hacker's Tool to Secure Your Apps

Fuzz testing turns the tables on those that would do harm. Learn about this negative testing technique that takes penetration to a whole new level.

By Ari Takanen

16 Speak Security Lingua Franca
Make CERT C your native tongue and build secure applications from the start. Developed by Carnegie Mellon University, the specification helps translate ordinary C-language code into safe and reliable code. By Paul Humphreys

22 Sniff Out Security Flaws Without Breaking the Skin
Like a pack of wild dogs, hackers are always poking around. Build a cage around your app with dynamic taint propagation. By Brian Chess and Jacob West

29 Bad Choice vs. Worse Choice
Delay release or deploy with bugs? When your only two options are bad and worse, there's sometimes another way to go. By Matt Love

Departments

4 • Editorial
The results are in. Here's what testers told us about how you use the Web.

6 • Contributors
Get to know this month's experts and the best practices they preach.

7 • Out of the Box
New products for testers.

9 • ST&Pedia
Industry lingo that gets you up to speed.

10 • The Conference Report
Here's what you missed at February's FutureTest conference in NYC.

33 • Best Practices
Such fragile creatures, those .NET applications. By Joel Shore

34 • Future Test
The difference between traditional and milestone consulting. By Phil Simon

APRIL 2009 www.stpcollaborative.com • 3

A Publication

Software Test & Performance (ISSN #1548-3460) is published monthly by Redwood Collaborative Media, 105 Maxess Avenue, Suite 207, Melville, NY, 11747. Periodicals postage paid at Huntington, NY and additional mailing offices. Software Test & Performance is a registered trademark of Redwood Collaborative Media. All contents copyrighted © 2009 Redwood Collaborative Media. All rights reserved. The price of a one year subscription is US $49.95, $69.95 in Canada, $99.95 elsewhere. POSTMASTER: Send changes of address to Software Test & Performance, 105 Maxess Road, Suite 207, Melville, NY 11747. Software Test & Performance Subscribers Services may be reached at [email protected] or by calling 1-847-763-1958.

Page 4: Software Test & Performance issue Apr 2009

We recently launched an online survey to ask about your professional education needs and goals. To those of you who took a few minutes to complete the survey, thank you. If you haven't completed it yet, we invite you to do so now. A link is below.

In addition to questions about the topics, tools, and technologies you most want to learn about, we asked about some of the Web services you use and the social networking Web sites you visit most.

Not surprisingly, more than half of you (55 percent) regularly visit LinkedIn, the business and social networking site. LinkedIn counts among its 36 million members "executives from all Fortune 500 companies," according to its About page. When I first started receiving invitations to join social networks years ago, I resisted, even though most were from people I knew and trusted. My thinking was that I had enough on my plate just to maintain my own contact database (now 4,500 strong). Why should I help maintain someone else's professional contact list? But Ted Bahr, my boss at the time, urged me to give it a try.

What I realized after joining was that these networks provide a great way to stay in touch with contacts long since forgotten. (I also learned that one should not simply invite everyone in one's database, as many have forgotten me). The main takeaway is that my 548 direct connections link me to more than five million other people, any of whom I can communicate with through a number of channels.

Back to our recent survey: Nearly two in five (38 percent) of you use Facebook, which in my experience is used for personal contact far more than for business. Its interface appears to be aimed at "friending" other people, sharing photos and words and joining and interacting via common interest groups. Sure, Facebook has groups for testers. I belong to several, including the Software Testing Club, Software Testing Services and Software Testing and Quality Assurance/Control, which is by far the largest with more than 2,000 members. Still, there were only a couple hundred one-off posts, mainly from job seekers.

One third of you visit QAforums and/or Stickyminds. Of the two, QAforums is by far the busier, with tens of thousands of posts and views per subject compared with just a relative handful in SQE's forum. A few of you indicated that other Web sites were part of your routine: ITToolbox (12 percent), Open Source Testing (11 percent), DZone (5 percent) and Daniweb (2 percent). But 41 percent replied that none of those listed were "online forums you visited most."

Of this latter group, the standout was opensourcetesting.org, a Web site devoted to software testing tools. It was created and is run by Mark Aberdour in his spare time. Aberdour is CEO of Kineo Open Source, a solution and service provider based in the U.K. The not-for-profit Web site does offer a discussion forum, but it's lightly trafficked; the site's main thrust is as a repository. As such, it does the job well; it's informative, well organized, and extremely well stocked.

With the dearth of busy forums out there, clearly software test and QA professionals need more places to interact. Which ones do you use and what would you like to see done differently? If you haven't taken our one-page survey yet, please take two minutes now and visit tinyurl.com/cmtkqt. We look forward to hearing from you. ý

Testers' Web 2.0 Usage Snapshot

VOLUME 6 • ISSUE 4 • APRIL 2009

Ed Notes

Edward J. Correia

• Most testers use LinkedIn regularly; about a third visit Facebook, QAforums or Stickyminds.


President Andrew Muns

Chairman Ron Muns

105 Maxess Road, Suite 207
Melville, NY 11747
+1-631-393-6051
fax +1-631-393-6057
www.stpcollaborative.com

Cover Illustration by Misha

Editor

Edward J. Correia

[email protected]

Contributing Editors

Joel Shore

Matt Heusser

Chris McMahon

Art Director

LuAnn T. Palazzo

[email protected]

Publisher

Andrew Muns

[email protected]

Associate Publisher

David Karp

[email protected]

Director of Events

Donna Esposito

[email protected]

Director of Operations

Kristin Muns

[email protected]

Reprints

Lisa Abelson

[email protected]

(516) 379-7097

Subscriptions/Customer Service

[email protected]

847-763-1958

Circulation and List Services

Lisa Fiske

[email protected]

Page 5: Software Test & Performance issue Apr 2009
Page 6: Software Test & Performance issue Apr 2009

You've heard of fuzzy math. If you turn to page 12, you'll learn about fuzz testing, a practice with roots in the world of hackers. ARI TAKANEN, chief technical officer at Codenomicon, tackles the subject of our lead feature, explaining fuzz testing's usage scenarios beyond that of penetration testing and security auditing. Ari is a noted speaker and author on software testing and security. He conducted extensive research on fuzz testing with the Oulu University Secure Programming Group, and also was involved in the pioneering work done by the PROTOS project (1998 to 2001).

PAUL HUMPHREYS is a software engineer with LDRA Ltd., and responsible for ongoing enhancement of LDRA's static code analyzer. LDRA provides solutions and services for safety-critical systems in aerospace, defense and other industries. A veteran of software development for nearly two decades, Paul has been with companies such as British Aerospace and GEC Marconi. He holds a master's degree in Computing for Commerce and Industry. Beginning on page 16, Paul explains best practices for producing reliable and secure software systems using CERT C, a secure coding standard developed by Carnegie Mellon's Software Engineering Institute.

As the chief scientist at Fortify Software, BRIAN CHESS is focused on developing practical solutions for securing software systems. Brian also is co-founder of the company, which makes software security analysis tools. He holds a Ph.D. in computer engineering from the University of California at Santa Cruz, where he studied the application of static analysis for finding security-relevant source code defects. His article is co-authored by JACOB WEST, who manages Fortify's Security Research Group. Turn to page 22 to learn how dynamic taint propagation can be used to find input validation bugs with less effort and technical savvy than typical security testing.

One key problem with security code audits is that they tend to cause more problems than they solve. Beginning on page 29, MATT LOVE, a software development manager at test tools maker Parasoft, helps you solve the "one size fits all" problem of having to decide between delaying the project or going to market as-is. Matt has been a Java developer since 1997. He holds a bachelor's degree in computer engineering from the University of California at San Diego.

Contributors

TO CONTACT AN AUTHOR, please send e-mail to [email protected].


Advertiser URL Page Number

Hewlett-Packard www.hp.com/go/alm 36

Klocwork www.klocwork.com 28

Software Test & Performance www.stpcollaborative.com 5

Test & QA Newsletter www.stpmag.com/tqa 35

Wildbit www.beanstalkapp.com 2

Index to Advertisers

Page 7: Software Test & Performance issue Apr 2009

Hewlett-Packard in late February released Performance Center 9.5, including an enterprise-grade version of the LoadRunner platform with support for more protocols and simplified performance tracking over time.

"There's a drive toward refreshing existing apps to make them more interactive and engaging," said Subbu Iyer, senior director of products for HP Software and Solutions, of the move to so-called Web 2.0 standards. "That introduces a slew of performance issues. For Ajax, there are several frameworks," he said, for example, and referred to the asynchronous nature of the technique. "Ajax-related architecture [also] introduces performance issues."

LoadRunner now supports Adobe's Action Message Format (AMF) and the RTMP format for streaming media. "Because [developers] want [to build] interactive Flex apps with backend data, they are reaching out to backend systems in a high-performance way."

For situations in which the tester might not know which protocols are in use, Performance Center 9.5 introduces Protocol Advisor. "Developers often miss out on telling testers all the protocols or technology that's embedded in the application." Testers simply run the application and Protocol Advisor presents a list of all the protocols being used. "Testers can [see] that there's some Ajax, some Flex, HTTP…They get a better picture of the app to make sure the scripting covers all these technologies. It's a complex picture from a performance testing perspective, and validating that is a large challenge. Now our customers can do that testing from a single tool."

Of particular interest to agile shops might be new reporting capabilities that permit quick spotting of application performance trends. "For Agile, I can now say I ran tests on Monday, Tuesday and Wednesday and I can quickly build a graph to show how the app is improving or degrading in performance over [those] days of the week." You can do the same thing in Excel now, of course, but Iyer points out that "we provide it online, in real time and provide for several different metrics," including transactions per second, and response time of a particular component, such as a login transaction. "We actually provide the output in tabular form as well as in a graphical display."

SaaSy Services
HP has modified its support and consultancy policies to better suit short-term projects, now offering 1- and 3-month engagements in addition to the previous minimum of one year. "If you need resources, want to leverage our skills for testing Oracle apps, let's say, or your workload got higher but you don't want to hire, we have the ability to do the testing for you for a short term." Performance Center 9.5 and the short-term services became available on Feb. 24.

Out of the Box

Among Performance Center 9.5's latest features is Trending, which automates the job of analyzing performance data from one release to the next, presenting stats graphically in a browser.

Performance Center 9.5 Ready for ‘Web 2.0’

Maybe it's not like Hanes' Inspector 12, but Borland's TeamInspector applies a series of metrics for code analysis, test coverage, standards compliance and build trends before it gives applications its stamp of approval for "application readiness." The company in late February unveiled the new tool, which is part of the Borland Management Solutions suite for application tracking, measurement and quality analysis.

According to the company, TeamInspector "helps minimize the risk of release failure by continuously monitoring the code and core assets of any software system, across a multi-project portfolio" and presents all code-related metrics of a release dashboard-style. In its current version, the product includes inspectors for Ant, Nant, Checkstyle, Emma, JUnit and NUnit. Pricing was not disclosed.

TeamInspector Gives Apps A 'Stamp of Approval'


Page 8: Software Test & Performance issue Apr 2009

Regression Test In A Browser Sandbox
When testing requires closer proximity to a browser than provided by some far-flung cloud, there's a way you can keep it virtual and still have it running on your desktop. Xenocode on Feb. 23 unveiled Browser Sandbox, a free tool that permits any number of different browsers or browser versions to be executed on a single Windows machine at the same time. Regression testing never had it so good.

Regular readers might recall coverage on these pages of Virtual Application Studio, Xenocode's US$40-per-seat tool that turns any application into a self-contained executable, able to be e-mailed or transported on a USB drive to run on any modern Windows PC without so much as touching the registry. Browser Sandbox lets you do the same thing with browsers, and download and launch them right from the Web.

Browser Sandbox is available now at www.xenocode.com/browsers, where you'll also find "sandboxed" versions of IE 6, 7 and 8, Firefox 2 and 3, Chrome, Opera and Safari.

Real-Time Klocwork
Klocwork, which makes automation tools for source code analysis, has joined with real-time operating system developer ENEA Embedded Technology to create a version of Klocwork Insight intended to simplify the validation of software for aviation and other safety-critical systems.

Insight is Klocwork's flagship source code analysis tool for C/C++, C# and Java that works from within many integrated development environments, permitting "developers to check in bug-free code," according to the company. If you're responsible for command and control systems such as those in avionics, you'll also need to conform to the FAA's DO-178B specification, which defines coding standards for fail-safe systems. Thanks to the partnership, the tool, combined with ENEA's DO-178B expertise, will allow organizations to "achieve credit for numerous objectives of DO-178B certification."

Send product announcements to [email protected]


Don't Lose Sleep Over Software Assets
A service launched in early March by software governance solutions company Protecode might help testers sleep better at night.

The Software IP Audit is a service that the company says can be particularly helpful to companies that, for example, are in the process of a merger or acquisition, product introduction, demand for IP indemnity or are about to commercialize a research project. It serves to identify licensing and copyright attributes of open source and other software assets of an enterprise, and reports relevant obligations, similarities between two code sets and other attributes of binaries or source code. Pricing was not disclosed.

Whether you're using multiple cloud-service providers or just one, BrowserFarm adds real-browser loads to their virtual ones.

SOASTA Keeps 'Em Down On the BrowserFarm

Cloud testing company SOASTA on March 11 introduced BrowserFarm, which combines the virtual load generated by its CloudTest On-Demand service with that of real browsers. According to the company, the feature enables more realistic validation of the "last mile client-side experience of Web systems." Introductory pricing through June 11 is set at US$500 for a 500-browser performance test.

The advantage of using real browsers for testing, according to SOASTA CEO Tom Lounibos, is that it allows testers "to understand what could lead to abandoned shopping carts, poor user experience or frozen transactions…" The system can simulate geographically dispersed loads from thousands of instances of Linux- and Windows-based browsers running various application technologies, combined with hundreds of thousands of virtual instances. This gives testers "the flexibility to measure end-user experience regardless of a user's location or browser choice," he said.

The BrowserFarm release comes on the heels of the Feb. 18 launch of CloudTest Global Platform, a cloud-based load and performance testing and analysis tool that works atop cloud platforms from 3Tera, Amazon, Enomaly and Rackspace and streams performance data to user dashboards in real time. Pricing for the service starts at $1,000 per test-hour, including all underlying platform costs, tools and results analysis.

Page 9: Software Test & Performance issue Apr 2009

Paul Melson is information security officer at Priority Health, an insurance company in Grand Rapids, MI. He has been in IT for 13 years, focusing exclusively on security for the last seven. During his career Paul has also consulted on matters of incident response and compliance for government, financial, higher education and manufacturing industries.

ST&Pedia: What's the difference between penetration and vulnerability testing?
Paul Melson: Vulnerability testing is a subset of penetration testing, in which the tester attempts to identify the presence of vulnerabilities, typically publicly known vulnerabilities, in a system. Penetration takes this to the next level in several ways. The main difference is that penetration testers attempt to exploit vulnerabilities that they find and attempt to escalate privileges as far as they can go. This serves to simulate a targeted attack by an intelligent hacker and provide the client with a realistic understanding of the risk each vulnerability poses. Many penetration testers will also attempt to identify and exploit previously unknown [industry lingo: "zero day" or "0-day"] vulnerabilities.

What tools do you use?
Different testers use different tools, and there is some controversy about the use of automated scanners among pen testers. I believe, based on my experience both as a consultant and a client, that the use of network and Web scanning tools is prevalent even among pen testers. But there are those pen testers that vehemently deny that they use scanners. And to their point, a skilled tester will always do better than a scanner.

In many cases, a pen tester will write a custom tool to work on a tricky or new vulnerability, and these tools get reused in later tests as well. Exploit frameworks and fuzzers are also frequently used in finding and exploiting zero-day vulnerabilities.

But I think the one that would maybe surprise a few folks is that password guessers/crackers are still widely used. And they're still widely used because they still work.

What are some common vulnerabilities?
Default or weak passwords, SQL injection, cross-site scripting. And of course the first two are heavily used by the botnet/malware distributors to compromise a Web site and use it to attack Web browsers. There's still a lot of work for developers, especially Web application developers, to do around input handling.

How do we decide how much time and effort to invest in security?
Risk assessment, vulnerability scoring, threat modeling: whatever you do and whatever you call it, at the end of the day you have to have a system to prioritize where you focus security resources.

Can you tell us a little more about that threat modeling thing?
Threat modeling is the current ideal way to prioritize security spend. I should first explain what the progression is, because in the real world you can't always opt to do threat modeling.

Risk assessment prioritizes security spend based on the value of the assets you are trying to protect. This is the old school, but sometimes it's all you have to go on.

Vulnerability scoring prioritizes security spend against known vulnerabilities/deficiencies by quantifying how likely a vulnerable or deficient system is to be attacked, the relative difficulty of a successful attack, and the consequence of a successful attack. Or, put more simply, a system's Vulnerability Score (or Risk Score) is the product of its Likelihood, Exploitability, and Impact.
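That product is simple enough to sketch in a few lines of Python. The 1-5 rating scale and the example ratings below are assumptions made for illustration; the interview does not prescribe a particular scale.

```python
def risk_score(likelihood, exploitability, impact):
    """Vulnerability (risk) score as the product Melson describes.

    The 1-5 rating scale is an assumption for illustration only;
    any consistent scale works, since only the product is compared.
    """
    for factor in (likelihood, exploitability, impact):
        if not 1 <= factor <= 5:
            raise ValueError("rate each factor from 1 (low) to 5 (high)")
    return likelihood * exploitability * impact

# An easy-to-exploit, likely-to-be-attacked flaw with modest impact
# outranks a hard-to-reach one with moderate impact:
print(risk_score(5, 4, 2))  # 40
print(risk_score(1, 2, 3))  # 6
```

Because the factors multiply rather than add, one very low factor drags the whole score down, which is exactly the prioritization behavior the scoring approach is after.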

Threat modeling takes this goal to an additional level by analyzing a system, its environment, the transactional processes that it runs, and [develops] a detailed list of all of the potential threats against that system. Ideally this happens in the design stages of a project, and the designers can plan to address each of the identified threats before work begins.

Peter Torr at Microsoft wrote a blog post about 'Guerilla Threat Modeling' that is probably the single best short piece on threat modeling.

How, exactly, can we 'insert security throughout the life cycle'? What does that even mean?
First, I think it means making security a concern that is addressed at every step of the design/build/maintain process.

As far as how to bake this in to the process, I wish I had an easy answer. In its simplest form, someone comes to the discussion asking, "What are our security risks and how do we address them?" Some good ideas that are practical include requiring security testing of systems before they can be put into production, using standards for commodity functions like authentication and logging, and using some tried and true security concepts like "least privilege" and "defense in depth" at the design phase. As much as security industry pundits have trashed it, defense in depth (building, documenting, and using mitigating controls at strategic points throughout a system) still works very well.

When are we finally going to have this 'security' testing thing locked down into a checklist and standardized?
There is already some excellent work out there in the area of codifying and standardizing Web security testing. OWASP has developed an excellent set of guides on secure development, code review, and security testing. I also believe that we won't be "done" any time soon. Fifteen years ago nobody was doing code review and looking at buffer overflows. Ten years ago nobody was looking at input checking and SQL injection. Security testing will continue to evolve and progress for the foreseeable future. And now that there's real money to be made on both the defensive and offensive sides of Web security, you can rest assured that new classes of vulnerabilities are right around the corner. ý

Q&A: Paul Melson

ST&Pedia
Translating the jargon of testing into plain English

Matt Heusser and Chris McMahon are career software developers, testers and bloggers. They're colleagues at Socialtext, where they perform testing and quality assurance for the company's Web-based collaboration software.


Matt Heusser and Chris McMahon

Page 10: Software Test & Performance issue Apr 2009

More than 100 people were gathered at the Roosevelt Hotel in New York City in February for FutureTest, a single-track conference on Web testing for high-level test managers and executives. The historic mid-town location was an ideal backdrop for this intimate management summit, and a successful first event organized by this magazine's new management, Redwood Collaborative Media.

Redwood president and CEO Andy Muns and his father, chairman Ron Muns, were on hand to kick off the conference, introduce the company and state some of its goals. "In the coming months, we'll be introducing more educational content, news and information and more great networking opportunities," said the chairman. He reiterated Redwood's intention to focus intently on the software testing community. "Our goal is to give you the tools you need to help you achieve results faster and more easily. And we thank you for being part of this transformation."

Also on hand to give presentations were more than a dozen industry notables, including test-team management expert Judy McKay, Scrum guru Robert Sabourin and Selenium core developer Patrick Lightbody. Other experts in their field were sent by Amazon, Bank of America, Cigital, eBay, Hewlett-Packard, IBM, Resource Interactive, Time and uTest.

Opening the program was Jeff Johnson, an expert on human-computer interface and author of the successful GUI Bloopers book series. He posited that much of the Web is not yet of commercial quality, and illustrated, with real examples taken from the Internet, that many Web sites contain serious UI mistakes. Pointless choices, unhelpful descriptions and conflicting, outdated and useless content are just a few examples. Some were downright hilarious, such as an airline booking app that lists airports alphabetically (regardless of city) and a dialog box requesting that the user "Please choose your gender," as if that were possible by clicking.

Presenting one of a number of security-focused talks was Ryan Townsend, lead security engineer of Time, Inc. In addition to discussing methods for embedding security into the QA process, Townsend discussed privacy issues, cookie stealing, phishing, defacement and ways to protect e-commerce functions. He also touched on "Web hacking 2.0," and pointed out that new rich interfaces based on Flash and Ajax

Industry's Best Gather in NYC For Web-Test Confab

Conference Report


Above, test-industry veteran and author Judy McKay; clockwise from top left, the introduction of "The Cyber Tester," from Cigital's Paco Hope; Redwood chairman Ron Muns; Hope and BrowserMob's Patrick Lightbody; Jinesh Varia of Amazon Web Services; and Edward Correia smiling for the camera with Rob Sabourin.

By Edward J. Correia

Page 11: Software Test & Performance issue Apr 2009

open new doors to hackers.

In this information-rich presentation, Townsend detailed the importance of vendor/partner due diligence when incorporating third-party elements into a Web site. He also covered SLAs and ultimately the importance of risk-based testing. "At Time we like the test/QA department to be involved with all stages of application implementation," he said, including the planning, analysis, design, implementation and maintenance, because remediation costs increase over time. His presentation also included a warning against relying too heavily on automated testing.

Jinesh Varia, a technology evangelist at Amazon Web Services, was next with a demonstration of how cloud computing would change the face of testing. Central to the technology's usefulness to testers, he asserted, is its flexibility and efficiency: configure servers when needed and strip them down when done. Such efficiencies will become ever more important to business. "As Web performance becomes more crucial for our use, it's critical to test the performance of your Web site," he said. Varia followed with a demonstration of testing "in the cloud," and the creation of virtual test labs. He was joined by Selenium core developer Patrick Lightbody, who demonstrated BrowserMob, a cloud-based Web-application testing tool and Lightbody's latest venture.

Perhaps one of the most polished presentations was "Testing RIAs in a Flash," by Kristopher Schultz of Resource Interactive. That's the company that does Web-site development for Hewlett-Packard, Sherwin Williams and Victoria's Secret, among others. He described practices for the design and testing of complex rich-internet applications, from the creation of static-design and motion comps, behind-the-scenes technical design, graphic production and coding, to the skinning and tweaking stages. Text-based documentation used in the process includes a UI guide, wireframes, use cases and the software requirements specification. He also covered a process for validating the targeted Flash player and how to test across browsers and on "weak" machines. ý


• "Our goal is to give you the tools you need to help you achieve results faster and more easily."
– Ron Muns, Redwood Chairman

Photographs by Joel Shore

Page 12: Software Test & Performance issue Apr 2009
Page 13: Software Test & Performance issue Apr 2009

a program, device or system with malformed or otherwise unexpected input datawith the intention of finding critical crash-level defects. I’ve found it useful for iden-tifying critical security problems in communication software.

The tests are targeted at remote interfaces. That means that fuzzing is able tocover the most exposed and critical attack surfaces in a system relatively well, and canidentify many common errors and potential vulnerabilities quickly and cost-effec-tively. Only a year ago, it was mostly an unknown hacking technique that few qualityassurance specialists knew about. Today, QA engineers and security auditors alike areturning the hacker tool against its creators. Fuzzing has become a mainstream test-ing technique used by major companies building software and devices for criticalcommunication infrastructure.

Negative RequirementsTo understand the principles behind fuzzing, it’s helpful to look at how it fits into theentire software lifecycle. Since the software development process starts from require-ments gathering, let’s first look at how the requirements for security and fuzzing canbe mapped together. A software requirement specification often consists of two dif-ferent types of requirements. First there’s a set of positive requirements that definehow software should function. Then there’s the negative requirements that definewhat software should not do. The actual resulting software is a cross-section of both.

Acquired features and conformance flaws map against the positive requirements.Fatal features and unwanted features map into the negative requirements. The unde-

fined grey area between the posi-tive and negative requirementsleave room for the innovative fea-tures that never made it to therequirements specifications or tothe design specifications but wereimplemented as later decisionsduring the development. These areoften difficult to test, and mightnot make it to the test plans at all.The main focus of fuzzing is not tovalidate correct behavior of thesoftware but to explore the nega-tive requirements.

Two Types of Fuzzers
Two automation techniques are commonly used with fuzzing. The major difference between the two lies in where the “model” of the interface is acquired. The easiest method of building a fuzzer starts by reusing a test case from feature testing or performance testing—be it a test script or a captured message sequence—and then augmenting that piece of data with mutations or anomalies.

In its simplest form, mutation fuzzing can be accomplished with bit flipping, data insertion or other random data modifications. The idea is to try unexpected inputs. The other fuzzing method involves building a model from communication protocol specifications and state-diagrams.

Ari Takanen is chief technical officer at Codenomicon, which makes tools for testing software security.

By Ari Takanen

Fuzzing is a relative newcomer to the test and automation scene. It’s a negative software testing method that feeds malformed and unexpected inputs to the software under test.


www.stpcollaborative.com • 13



Mutation-based fuzzers break down the structures used in the message exchanges and tag those building blocks with meta-data that helps the mutation process. Similarly, in full model-based fuzzers, each data element needs to be identified, but that process also can be automated. The information needed is often already given in the specifications that are used to generate the models (Figure 1).

Besides information on the data structures, the added meta-data also can include details such as the boundary limits for the data elements. In model-based fuzzing, the test generation is often systematic, and involves no randomness at all. Although many mutation and block-based fuzzers often claim to be model-based, a true model-based fuzzer is based on a dynamic model that is “executed” either at runtime or offline. In PROTOS research papers, this approach of running a model during the test generation or test execution was called Mini-Simulation. The resulting executable model is basically a full implementation of one of the endpoints in the communication.

Fuzzing Among Other Techniques
Looking at different types of black-box testing, we can identify three main categories of testing techniques. These are feature testing, performance testing and robustness testing. Feature testing is the traditional approach of validating and verifying functionality. Performance testing looks at the efficiency of the built system. Both exercise the system using valid inputs.

Introduced by PROTOS protocol-security researchers in 1999, robustness testing, on the other hand, looks at the system under invalid inputs, and focuses on system stability, security and reliability. By comparing these three testing categories, we can note that most feature tests map one-to-one against use-cases in the software specifications. Performance testing, however, uses just one use-case but loops that in either a fast loop or in multiple parallel executions. In robustness testing, you build thousands or sometimes millions of misuse-cases for each use-case. Fuzzing is one form of robustness testing, focusing on the communication interfaces and discovery of security-related problems such as overflows and boundary value conditions, in order to test more intelligently the practically infinite input space that robustness testing would otherwise require.

Fuzz Buzz
The purpose of fuzzing is to find security-critical flaws. The timing of such tests will have a heavy impact on the total cost of the software. Therefore the most common view in analyzing fuzzing benefits is to look at costs related to identification and repair of security-related bugs. Software security has a special additional attribute to it, as most of the costs are actually borne by the end user in the form of maintenance, patch deployment and damages from incidents.

Security compromises or denial of service attacks impact the users of the software, not the developers. This is why the cost metrics often include the repair costs for the developers as well as the costs from damages to end users. These are often the very same metrics that you might have developed for analyzing the need for static analysis tools. The cost per bug will vary depending on which phase of the software lifecycle your testing efforts take place in (the earlier the better). This type of analysis is not easy for static analysis tools due to the rate of false positives that do not have any significance for security. A metric collected early in the process might not give any indication of the real cost savings.

It’s different for fuzzing. While a static analysis tool often delivers a poor success rate based on analyzing the real security impact of the found flaws, with fuzz testing there are no false positives. All found issues are real and will provide a solid metric for product security improvements.

Fuzz-Test Automation
Fuzzing maps nicely to various test

FUZZ TESTING

FIG. 1: FUZZING PROCESS MODEL

WASN’T FUZZY, WAS HE?

The term ‘fuzzing’ or ‘fuzz testing’ emerged around 1990, but in its original meaning fuzzing was just another name for random testing, with very little use in QA beyond some limited ad-hoc testing. Still, the transition to integrating the approach into software development was evident even back then. From 1998 to 2001 the PROTOS project (at University of Oulu) conducted research that had a focus on new model-based test automation techniques as well as other next-generation fuzzing techniques. The purpose was to enable the software industry itself to find security-critical problems in a wide range of communication products, and not to just depend on vulnerability disclosures from third parties.


automation techniques. While different levels of test automation are used in all testing organizations, fuzzing can be added just about anywhere in the domain. In fact, test automation experts are often the first people to familiarize themselves with fuzzing and other related test generation techniques. Test automation often focuses only on the repeatability of tests. But automation has led to significant improvements in test design and efficiency.

The more advanced your tools, the less work will be required to integrate fuzzing in your testing cycles. Not all fuzzing tools are model-based, but fuzzing techniques are always automated with almost zero human involvement. Tests are automatically generated and executed, and reports are also typically generated automatically. Most of the work can be focused on analyzing and fixing the found issues.

Fuzzy Tools
Comparing fuzzing tools is difficult, and there is no accepted method. The easiest way might be to enumerate the interface requirements. One toolkit might support about 20 or so protocol interfaces where another will cover more than 100 protocols.

Testing a Web application requires a different set of fuzzers than testing a voice over IP (VoIP) application. Some fuzzing frameworks are adept at testing simple text-based protocols but provide no help for testing complex structures such as ASN.1 or XML. Other fuzz tests come in prepackaged suites with common protocols such as SSL/TLS, HTTP, and UPnP. Still others might require you to build the tests yourself.

The test direction and physical interfaces also can impact the usability of some tools, and some test only server-side implementations in a client-server infrastructure, for example. In a study conducted by Charlie Miller, which appears in Fuzzing for Software Security and Quality Assurance (Artech House, 2008), fuzzers were compared by running them against software intentionally planted with security vulnerabilities. Based on that sample, fuzzer efficiency ranged from 0 percent to 80 percent. Random testing provided inefficient test results, and model-based tests peaked at higher efficiency. The tool with the most test cases rarely was the most efficient one. Looking at the number of test cases will often lead to selection of a tool that has the least intelligence in the test generation. Pleasantly surprising, all planted bugs were found by at least one fuzzer. So in critical environments, it might be good to employ a few solutions, rather than entrusting all your efforts to a single fuzzing tool or technology.

Fuzzing as a security-testing technique seems to have a future. And if you don’t plan on using it yourself, someone else—quite possibly a hacker—surely will. So it’s best to fight fire with fire and beat them at their own game.

Fuzzing tools are easily accessible as free open source tools as well as in commercial products. Fuzzing is an efficient method of finding remotely exploitable holes in critical systems, and the return on time and effort placed in negative testing is immediate. Finding just a single flaw prior to release can save enormous costs and time resources for internal crisis management, not to mention the compromise to a deployed system and damage to reputation. No bug can stay hidden if correct tools are used correctly.

Still, there is always room for advancement, and fuzzing research and development are ongoing.

REFERENCES
This article was based on Fuzzing for Software Security Testing and Quality Assurance (Artech House, 2008)
• Web site: http://www.fuzz-test.com
• PROTOS project: http://www.ee.oulu.fi/research/ouspg/protos/



FIG. 2: REQUIREMENTS OF FUZZING

WHERE'S THE FUZZ?

While fuzzing was originally intended as a tool mainly for penetration testers and security auditors, today its usage is more widespread and diverse. Soon after the exposure caused by PROTOS, fuzzing quickly became adopted by network equipment manufacturers for their quality assurance processes. From that, fuzzing technologies evolved into quality metrics for monitoring the product lifecycle and product maturity.

Perhaps because of the rapid quality improvements in network products, fuzzing soon also became a recommended purchase criterion for enterprises, pushed by vendors who were already conducting fuzzing and thought that it would give them a competitive edge. As a result, service providers and large enterprises started to require fuzzing and similar testing techniques from all their vendors, further increasing the usage of fuzzing.

Today fuzzing is used in three phases of the software lifecycle:

• QA usage of fuzzing in software development
• Regression testing and product comparisons using fuzzing at test laboratories
• Penetration testing use in IT operations

As the usage scenarios range from one end to another, so does the profile of the actual users of the tools. Different people look for different aspects in fuzzers. Some users prefer random fuzzers, whereas others look for intelligent fuzzing. Other environments require appliance-based testing solutions, and still other test environments dictate software-based generators. Fortunately, all of these are readily available today.


By Paul Humphries

Concerns over security have become part of the ever-increasing demands placed on software developers and testers. Fortunately, there are programming standards that can help identify some of the most common software defects that can deprive software systems of integrity.

Defects, bugs or mistakes in a software artifact represent a deviation from what is required for correct behavior. For the purposes of this article, a software vulnerability is defined as a defect that affects security when it is present in an application or information system.

Although the defect may be minor and not affect the performance or results produced by the software, it might still be exploitable by an attacker and result in a significant breach of security.

What follows is a look at some of these defects and how they affect business applications. Also covered are the common types of verification and validation (V&V) techniques and how they can help achieve defect-free software at the outset of development.

Paul Humphries is software engineer with LDRA, which makes test tools for safety-critical systems.



Examples will be provided for each step of the V&V with real-life methods to help you fully understand the problems and how to best implement the solutions. There’s also a focus on regulatory conformance and programming standards, with particular reference to the CERT C secure coding standard and how it addresses security problems.

Exploitable Software Defects
Problems such as denial of service or computer resource depletion can often be traced to software malfunction or failure due to program errors. Such errors are a result of poor


The CERT C Standard: Lessons In Etiquette and Protocol For Building Secure Applications From the Start


CERT C SPEC

analysis, design or coding defects, which in turn are exploited by hackers on networked or Internet-facing systems. To guard against security vulnerabilities, software development projects need to incorporate security testing and verification into all phases of a project plan.

Historically, software validation has been weighted toward system and acceptance testing and performed in the latter stages of development. As such, it consumes significant resources. In this tradition, code verification has been performed manually, usually adhering to an in-house coding style guide. Not only is manual inspection slow and inefficient, it’s also not sufficiently consistent or rigorous to uncover the variety of defects that can result in errors and serious faults in large, complex software applications.

Safety-critical systems such as those developed for aerospace and automotive industries incorporate an increasing amount of computer-aided navigation and management systems, and are therefore prime candidates for careful verification. But general-use applications also deserve scrutiny, as the cost of failure mounts along with their importance to the bottom line.

Defect Types
High-level languages such as C and C++ are commonly used for diverse and far-reaching types of applications, due to their inherent flexibility and powerful capabilities. However, the very flexibility that appeals to developers also opens the door for various defect types, some of which are extremely difficult to recognize without careful analysis and appropriate guidelines.

Table 1 provides one possible list of defect types, which, when represented as a set of programming standards, may be checked to uncover software errors. These defect types are taken from the first tier of a generic tree structure of defects based on a list of the seven “sins” originally expounded by B. Meyer for requirements capture.

At the top level there are inappropriate representations, missing constructs, unnecessary constructs, incorrect constructs, organization, and inadequate constructs. Incorrect constructs, in particular, may be decomposed to a number of levels to consider the type of defects introduced by the misuse of constructs such as aggregation, modularisation, and branching statements.

Identifying Defects, Removing Errors, and Avoiding Failures
V&V techniques help identify if, when, and how the development process drifted from what was intended or required by the user.

Validation focuses on producing the right software system while verification ensures that the software is built the right way. V&V should be evident at each stage of development and conducted with reference to the outputs from previous stages. Verification is at the hub of a quality process, evaluating whether or not a product, service, or system complies with a regulation, specification, or conditions imposed at the start of a development phase.

There are numerous V&V techniques that may be applied at one or more phases of development, notably: formal methods, static analysis, dynamic analysis, modified condition/deci-

TABLE 1: DEFECT TYPES

Defect type: Missing construct (Silence)
Example coding problem: Omission of a statement, e.g.:
    {
        int x, y;
        /* no value assigned to x */
        y = x;
    }
Identified error: Detected as a dataflow anomaly – highlighting either redundant variables or a potential bug.

Defect type: Unnecessary construct (Noise)
Example coding problem: Control flow with unreachable or infeasible code, e.g.:
    unsigned int x;
    if( x < 0 ) {…} /* infeasible */
Identified error: May be detected by checking each condition of the control flow graph – highlighting either redundant logic or a potential bug.

Defect type: Incorrect constructs
Example coding problem: Array (aggregation) initialization has insufficient items, e.g.:
    {
        int iarr[3] = { 1, 2 }; /* insufficient initialisers */
    }
Identified error: Unpopulated array items can be left with garbage, which can lead to program failure.


BOOK 'EM, DANO

The U.S. Department of Homeland Security (DHS) has sponsored projects to identify the source of software vulnerabilities to help understand the significance of computer security. Notably, the National Vulnerability Database in 2004 found that 64 percent of the identified vulnerabilities are due to programming errors that could have been prevented by adhering to automated security checking.

The CERT Coordination Centre (CERT C Center) at Carnegie Mellon’s Software Engineering Institute (SEI) has gathered evidence on the causes of security breaches. This research has led to the formation of CERT C, a new breed of software development guidelines driven by organizations and institutions keen to protect their systems from attack. With this new compliance standard available, organizations are now demanding that developers not only produce reliable and safe software, but also ensure their software systems are impenetrable.

Detecting defects at the point of injection, rather than later in the development process, also greatly reduces the cost of remediation and ensures that software quality is not degraded with excessive maintenance.



sion coverage and unit testing.

Redundancy, in the form of diversity, can also be practiced by the use of two independent techniques, one being static analysis and another, dynamic analysis. Both can bring considerable benefits.

Formal methods for tracking requirements are based on a mathematical approach to specification, development and verification of software and hardware systems. Formal methods can vary from using commonly accepted notation to the full formality of theorem proving. A degree of formal specification reaps benefits at later stages in the process for any software application.

As in all project management, a cost-benefit analysis must be used to determine where, when, and how to apply formal methods to achieve project goals within budget. Successful use of formal methods invariably relies on a sharp focus: choosing the right techniques and tools, and applying them in the right way to just the right parts of the system.

Formal methods allow defects in requirements and designs to be detected earlier in development, greatly reducing the incidence of mistakes in interpreting and implementing correct requirements and designs. There is much to gain by ensuring requirements are captured in full, are well understood, and are specified completely and unambiguously.

A popular formalism adopted by many developers is that of use case specification. Use cases are employed to describe a system’s behaviour as it responds to a request that originates from outside of that system. The use case technique captures a system’s behavioural requirements by detailing scenario-driven threads through the functional requirements.

So, for example, if a user (the actor) requests to write to a file, possible scenario preconditions may be:

a. the file does not exist
b. the file does exist

Considerations for the interaction include user privileges, filtering the input and available file space.

Static Analysis involves the analysis of a program without actually executing the program. A variety of different analyses may be performed, but perhaps one of the most significant is dataflow analysis.

Dataflow Analysis is a process in which the control-flow graph is annotated with operations on variables. This form of analysis is able to reveal data anomalies such as the use of uninitialized variables, or variables that have been assigned a value but are never referenced.

Within the overall process of static analysis, there is an initial (main) part of analysis that facilitates all further analysis. Specifically, it extracts details

about the structure of software and provides various textual and graphical representations of the code. A static call graph, as shown in Figure 2, is one example. This is used to convey structure in terms of procedure calls. An upside-down tree of nodes (procedures) linked by edges (calls to procedures) has a root, or main procedure, that fans out to increasingly lower-level called procedures, until at the lowest level it reaches the leaf nodes.

The main purpose of static analysis, however, is to obtain software metrics and highlight possible coding errors. If we consider the “write to file” example above, and the software fulfilling the action of opening the file, it is possible to apply static checks to the code to uncover common defects.

Using the same example, a file open should not follow a previous open of the same file without an intervening file close, as this can lead to dangerous race conditions resulting in abnormal program termination or data integrity violations.

Dynamic Analysis involves executing a program with test data and monitoring the process. Many aspects of test


FIG. 1: CERT C CALLING

• A file-open should not follow a previous open of the same file without an intervening close.




execution can then be subjected to subsequent analysis. With control-flow tracing, the analysis determines the precise path taken as control flows through the program during execution, using techniques such as array bound checking and storage allocation and deallocation. Such information may be presented in a variety of formats. Most often, however, dynamic analysis simply determines the coverage of various program elements.

Clearly, it is desirable to establish that all executable code has been exercised by the test data. If not all code has been covered at least once by some input data, then further datasets are required or code exists that is redundant, unreachable or infeasible. Coverage metrics may be obtained for statements, branches or decisions, and jumps to or within procedures. Desk-checking code, by manually stepping through each combination of data inputs, rapidly becomes unmanageable as the number of possible paths increases. An automated process, using tools that support dynamic analysis, is therefore essential when attempting to achieve full coverage.

Modified condition/decision coverage (known as MC/DC) is a technique whereby a logical decision (expression) having n conditions is executed with data such that altering the value of each condition from ‘true’ to ‘false’, with the other conditions held constant, produces a change in the result of the whole decision. A minimum of (n+1) data items is needed to achieve full MC/DC.

This extra coverage means that possible errors will be hit and there is a greater confidence level in the code when conditions are tested.

Unit Testing checks that the outputs of a unit of code are appropriate to the requirements of the unit and that it responds in a known way under all input states. The sensitivity to any general fault is enhanced because the outputs are examined close to the point of generation, rather than in a complete system where they can be masked by other activities.

Regulatory Conformance And Programming Standards
Conformance with standards stipulated by governing bodies, such as the Federal Aviation Administration (FAA), generally requires the application of a number of different V&V techniques. For instance, MC/DC coverage is essential in the U.S. for certifying software in avionics using the DO-178B guidelines.

However, guidelines from organizations such as the Motor Industry Software Reliability Association (MISRA) in the U.K. and CERT in the U.S. focus on eliminating defects introduced at the coding phase using precisely defined programming standards.

The code checker provided by most tool vendors will normally be integrated into a static analyzer, and involves lexical and syntactic analysis of the source code for a single file or potentially a complete system. Lexical analysis is the process of converting a sequence of characters into tokens or lexemes, which can then be parsed in context with the syntax of the programming language.

Programming standards, which may be either rules or advisory guidelines, are applied to the code, and any violations of those standards are reported. Often the violations carry a severity, and may be classified or filtered to provide focus upon certain types of defects in the software.

CERT C Center and The Common Weakness Enumeration
CERT was created by the Defense

THE COST OF DEFECTS

Motivating the move to defect tracking by general-market software companies is the cost of defects. Recall Barry Boehm’s groundbreaking work in software economics, in which he quantified the relative expense to fix a bug at different times in the development lifecycle. Although his work was based on the waterfall model, and not the now commonly used iterative development model, the underlying principle remains the same: it’s a lot less expensive to correct defects during development than to correct them after deployment.

The figure shows that costs should ideally track as close to the preferred trend analysis (solid red) line as possible, as opposed to letting this slide over to the less desirable but often typical (dashed purple) line. In the latter scenario, developers defer all software application checking to the quality and assurance phase of development, which results in a much greater cost (black solid line).

Boehm found that if automated software checking is applied at the implementation stage, the cost is 1.6 hours per bug. However, if automated checking is delayed to after the software is in service, the cost is 14 hours per bug. By checking code as soon as it exists and making it an integral part of a developer’s day-to-day work, software checking reduces costs by raising the quality level of code. Similarly, if the quality of software entering testing is higher, there are fewer test failures, fewer defects, the time for testing is reduced, and costs are saved.




Advanced Research Projects Agency (DARPA) in November 1988 to deal with Internet security problems following the Morris Worm strike. (See C'mon Worm.)

Again with reference to the write-to-file example, the CERT C Secure Coding Standard provides a number of guidelines aimed at removing potential insecurities related to file input/output.

Essentially, file handling defects may allow an attacker to misuse an application through unchecked or unfiltered user input, i.e., the program assumes that all user input is safe. Programs that do not check user input can allow unintended direct execution of commands or SQL statements (known as buffer overflows, SQL injection or other non-validated inputs).

One example of this is where the user is required to provide a file name, for the purpose of storing further input, which is then created. However, if a pathname is entered together with an unchecked file name, this may lead to a system file being overwritten.

The guidelines in CERT C are spread across thirteen distinct chapters and begin by covering language-independent preprocessor directives, followed by C language specifics: declarations and initialization through to error handling and miscellaneous items.

Of course, this approach to the C language is not uncommon, but it is the emphasis upon security issues that sets CERT C apart from other coding standards.

The CERT C rule MEM31-C states that developers should “[f]ree dynamically allocated memory exactly once.” This rule can be regarded as highlighting redundant code, which may be confusing to the reader or make the code more difficult to understand and maintain.

However, double-free vulnerabilities are viewed by CERT as something that may be exploited to execute arbitrary code on a system. Dynamic memory management is generally treated with caution due to the effect a mistake by a developer may have on the results obtained from a program. From a security viewpoint, resource depletion and denial of service are the underlying rationale for careful checking of memory management code.

char *ptr = (char *)malloc(SIZE);
...
if (abrt) {
    free(ptr);
}
...
free(ptr);

The Bottom Line is Security
To achieve a secure and reliable software system, there are a number of well-defined steps and corresponding V&V techniques that should be applied. The initial focus in any project should be on capturing and specifying complete, unambiguous requirements.

However, developers should also apply diverse V&V techniques at all stages of software development. In particular, the automated verification of design and implementation artifacts, namely code, leads to greater confidence in the quality of software. Static analysis, through the enforcement of appropriate programming standards, provides a reliable means of removing the majority of defects prior to testing.

Common coding mistakes are typically the source of security vulnerabilities in today’s software systems. CERT C can help tackle security-related issues for C-language programming. Many real-world attacks on software systems have been identified as the result of exploited vulnerabilities which are traceable to preventable defects. Indeed, relevant CERT C guidelines are now referenced by MITRE’s Common Weakness Enumeration (CWE) database for newly discovered and disclosed vulnerabilities, so that developers can explicitly see the association. Visit cwe.mitre.org to find out more.

C'MON WORM

Although intended purely as an academic exercise to gauge the size of the Internet, the effect of the Morris Worm had repercussions throughout the worldwide Internet community, infecting thousands of machines. Many organizations with systems attached to the Internet suffered damaging denial of service attacks. Consequently, software vulnerabilities came under the microscope of the U.S. government.

The CERT C Center is located at Carnegie Mellon University’s Software Engineering Institute (SEI). The center was primarily established to deal with Internet security problems in response to the poor perception of security and reliability of the Internet. For a number of years prior to tackling programming guidelines and other security-related activities, the CERT C Center studied and compiled cases of software vulnerabilities. The Secure Coding Initiative, launched in 2005, used the database of catalogued vulnerabilities, built up over a period of 12-15 years, to develop secure coding practices in C and C++.

SEI is also working very closely with sponsors, such as the U.S. Department of Homeland Security (DHS) and other defense agencies, to correlate vulnerabilities with coding errors. DHS also sponsors MITRE’s Common Weakness Enumeration (CWE), which classifies software weaknesses that lead to vulnerabilities. The CWE now contains references to CERT C, and vice-versa, with the intention that weaknesses may be eliminated by following the secure coding standard.

The philosophy that underpins the work of the CERT C Center and CWE is that the majority of vulnerabilities can be traced back to a relatively small number of common defects. If these defects can be eradicated using suitable automated V&V techniques then, as a consequence, a much higher level of software security can be attained.

• The automated verification of design and implementation artifacts leads to greater confidence in software quality.


22 • Software Test & Performance APRIL 2009


TAINT SECURITY

Sniff Out Vulnerabilities Without Attacking

Dynamic Taint Propagation Can Shepherd QA People Into Security Testing

By Brian Chess and Jacob West

Brian Chess is chief scientist and co-founder of security tool maker Fortify Software; Jacob West manages the company's Security Research Group.

Software bugs can lead to security failures. Bugs related to input validation and representation are of particular interest because they lead to today's most commonly reported security vulnerabilities: SQL injection, cross-site scripting, and remote file inclusion. These vulnerabilities share the same fundamental pattern: the absence of adequate input validation allows an attacker to supply malicious input to a program and cause the program to misbehave.

Unfortunately, crafting malicious inputs to reveal security vulnerabilities is a skill that few quality assurance engineers possess. Addressing this need is dynamic taint propagation, a technique that allows QA engineers to find vulnerabilities by reusing existing functional tests.

The approach described here introduces taint propagation logic as a program is loaded at runtime, without changing the program's source code or binary on disk. This platform-level integration allows us to introduce security testing with very little process change and provides a logical collaboration point for enterprise security teams and quality assurance organizations.

The accuracy of our analysis depends on rules that govern the areas of the program we instrument. Users can customize the rules to describe important details about the program under test, such as the nature of proprietary input validation mechanisms, to avoid false positives and false negatives. For even greater accuracy, we guide users to craft attacks against reported vulnerabilities and verify that the attacks succeed at runtime, which increases confidence in the bugs we report.

Motivation

The most widely used approach to finding injection vulnerabilities is to exercise the target program in the same manner an attacker would: provide unexpected input and look for feedback that indicates the program has gone wrong. This technique is a form of fault injection, a common approach taken by security testers and security testing tools. The advantage of the approach is that when the program misbehaves because it has received unusual input, the tester has strong evidence that a bug exists.

Fault injection has a major disadvantage, too. A test that includes intentionally bad input often disrupts the typical behavior of the program. Even if no bug exists, the program will likely enter an error state or simply fail to progress toward its intended outcome. If the program accepts an input format with a large number of interdependencies or implements a state machine with deeply nested states, fault injection can require a tremendous number of test cases in order to achieve good test coverage. The problem is multiplied by the fact that different types of bad input are required to test for different kinds of vulnerabilities.

Consider a Web application that implements an online shopping cart and checkout system with three phases (see Figure 1). The process begins when an item is added to the cart using addItemToCart(). Next, the program accepts the customer's information and validates it in enterCustomerInfo(). After the program receives valid customer information, the program processes the customer's credit card in processCCard() and completes the transaction. However, if the customer information fails to pass basic validation checks, such as a check to ensure that the postal code for the billing address is valid, control will not proceed and processCCard() will never be exercised. Without focused test data, fault injection techniques will not spend as much time exercising processCCard(), and so are more likely to miss bugs in the program logic found there.

In many cases this means that fault injection requires much more time and effort than functional testing to achieve the same level of test coverage. Our experience is that many organizations either omit security testing entirely or give it only a fraction of the resources devoted to functional testing. The result is that many input validation problems are overlooked.

Dynamic taint propagation works by monitoring the target program as it runs and associating a taint marker with user-controlled input. The taint marker propagates through the program with the input data. If a taint marker reaches a sensitive function before it encounters appropriate input validation, a vulnerability is reported.

Implementation for Java

We focus our efforts on tracking taint through string variables because dangerous input in Java programs often arrives as a string. In the Java Runtime Environment (JRE), we modify the java.lang.String class to include additional values that store the taint status of each string object. We also modify classes used to alter and combine strings, such as StringBuffer and StringBuilder, to allow us to propagate taint between string values.

In order to identify sources (methods that introduce untrusted data) and sinks (methods that untrusted data should never reach) for tainted values, we instrument the program to set the taint-storage values added to the String class in cases where values are read from outside the program and could be influenced by an attacker. We also instrument a variety of security-relevant methods whose arguments should not be controlled by an attacker to check that their sensitive string arguments are not tainted. If a security-relevant method is invoked with a tainted string, a warning is raised.

To better understand how taint propagation can be used to identify a vulnerability, consider the code in Listing 1, which demonstrates a classic SQL injection vulnerability. In the code, the program constructs and executes a dynamic SQL query that includes a value read from an HTTP request parameter. If an attacker supplies a value such as "' OR 1=1" for the parameter name, then the WHERE clause of the query will match every row in the users table, giving the attacker access to the entire user database.

LISTING 1

List getUser(HttpServletRequest request) {
  ...
  String user = request.getParameter("user");
  try {
    String sql = "SELECT * FROM users WHERE id='" + user + "'";
    stmt.executeQuery(sql);
  } ...
}

Listing 2 shows the code from Listing 1 modified to include representative dynamic taint propagation logic around program points that introduce, propagate, or potentially misuse taint. The code added at runtime to permit taint propagation is shown in boxes. When a String is created or updated with untrusted input, a call to setTaintMarker() is inserted. When taint is propagated from one string to another, a similar call is used to transfer the taint status to the new string. Finally, before a call to a security-relevant operation, such as executeQuery(), a call to checkTaint() is inserted to check if the argument to the sensitive operation can be controlled by an attacker.

LISTING 2

List getUser(HttpServletRequest request) {
  ...

TABLE 1: SQL INJECTION BITES

Source: Web Input
  File: org.apache.coyote.tomcat5.CoyoteRequestFacade:295
  Method: String[] org.apache.coyote.tomcat5.CoyoteRequest.getParameterValues(String)
  Method Arguments: bean.quantity
  Stack Trace: ...
  HTTP Request: ...

Sink: Database
  File: com.order.splc.ItemService:201
  Method: ResultSet java.sql.Statement.executeQuery(String)
  Method Arguments: ...
  Stack Trace: ...
  HTTP Request: ...

SQL Injection: A SQL injection issue where external taint reached a database sink.
URL: http://localhost/splc/listMyItems.do

FIG. 1: DIDN'T BREAK THE SKIN (checkout phases addItemToCart(), enterCustomerInfo(), processCCard())

Page 25: Software Test & Performance issue Apr 2009

  String user = request.getParameter("user");
  TaintUtil.setTaintMarker(user, 1);
  try {
    String sql = "SELECT * FROM users WHERE id='" + user + "'";
    TaintUtil.setTaintMarker(sql, user.getTaintMarker());
    TaintUtil.checkTaint(sql);
    stmt.executeQuery(sql);
  } ...
}
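The setTaintMarker() and checkTaint() calls injected in Listing 2 come from a helper class the instrumentation provides. Below is a minimal, self-contained sketch of what such a helper might look like. It is an illustrative assumption, not the authors' implementation: since this sketch cannot modify java.lang.String the way their bytecode rewriting does, it keeps taint markers in a side table keyed by object identity.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Toy stand-in for the instrumentation helper injected in Listing 2.
// The real implementation stores taint inside java.lang.String itself;
// this sketch keys a side table by object identity instead.
public class TaintUtil {
    private static final Map<String, Integer> taint = new IdentityHashMap<>();

    // Mark a string with a taint marker (e.g., 1 = web input); 0 clears nothing.
    public static void setTaintMarker(String s, int marker) {
        if (marker != 0) taint.put(s, marker);
    }

    // Read a string's marker; 0 means untainted.
    public static int getTaintMarker(String s) {
        return taint.getOrDefault(s, 0);
    }

    // Called before a security-relevant sink such as executeQuery().
    public static void checkTaint(String s) {
        if (getTaintMarker(s) != 0) {
            System.err.println("WARNING: tainted value reached a sensitive sink");
        }
    }

    public static void main(String[] args) {
        String user = "' OR 1=1--";           // pretend this came from getParameter()
        TaintUtil.setTaintMarker(user, 1);     // source: web input
        String sql = "SELECT * FROM users WHERE id='" + user + "'";
        // propagation: the derived string inherits the source string's marker
        TaintUtil.setTaintMarker(sql, TaintUtil.getTaintMarker(user));
        TaintUtil.checkTaint(sql);             // sink check fires a warning
    }
}
```

A real implementation would also hook StringBuilder and friends so that concatenation propagates the marker automatically; here the propagation call is explicit.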

To make dynamic taint propagation effortless for testers, we modify the bytecode for the core Java Runtime Environment (JRE) classes, the program's bytecode, and the bytecode of any external libraries the program employs. We perform the instrumentation at runtime by replacing the application server's class loader with one designed to rewrite classes targeted for instrumentation as they are loaded. Performing instrumentation at load time avoids changes to the program's source code or binary on disk and makes it easy to analyze multiple programs loaded in the same application server. This means the program's build and deployment processes do not have to change in order to use dynamic taint propagation. Rewriting a class at runtime roughly doubles the amount of time required for loading the class, so programs are noticeably slower to start. But once a class has been loaded, the additional code required for dynamic taint propagation adds little overhead to the program's execution time.

Beyond tracking taint as a binary property of a string, it is often desirable to differentiate multiple sources of taint and track them independently. To address this demand, our taint tracking mechanism supports taint flags, which associate information about the sources that introduce taint with the tainted values they impact. Armed with detailed information about the source of a tainted value when it causes a vulnerability to be reported, we can report vulnerabilities more accurately and include more useful information with the vulnerabilities we report.

When taint reaches a security-sensitive sink, we must decide what, if any, vulnerability to report. Our taint propagation implementation is capable of fine-grained decisions about the type and priority of error to report depending on which source and sink are involved. For example, a SQL query that contains a value read from an HTTP request parameter would receive a higher priority than the same vulnerability caused by a value read from a local properties file. When an error is reported, it includes details about not only the type of vulnerability, but also the specific source and sink involved and the line numbers where they are located in the original program source code.

Table 1 shows an overview of a vulnerability report for a SQL injection issue detected with runtime taint propagation. Notice the vulnerability report contains the URL, as well as code-level details about the source and sink involved in the vulnerability.

Writing Rules

The choice of which classes and methods to instrument has a clear impact on the effectiveness of our dynamic taint propagation approach. Instrument too broadly, and the analysis will produce false positives (false alarms). Instrument too narrowly, and the analysis will suffer false negatives (missed real vulnerabilities). We derived the set of classes and methods to instrument from the rule set we use for SCA, our static analysis tool. SCA performs taint propagation on source code without running it, so converting the rule set for use with dynamic taint propagation was a fast way to create rules for thousands of packages and methods. Because rules can refer to interfaces or parent classes in an inheritance hierarchy, in some cases we are able to instrument code even though we have not explicitly written a rule with it in mind.

Sources of Inaccuracy

Here we discuss ways to combat both false positives and false negatives and maximize the accuracy of results produced by dynamic taint propagation.

In programs where security was addressed during development, many false positives are caused by unrecognized input validation, because we cannot automatically determine whether an input validation mechanism is sufficient to mitigate a vulnerability. Doing so would require that we keep track of which specific characters and substrings can make their way through the validation logic and relate this information to the types of attacks possible on each sink. Listing 3 shows the SQL injection from Listing 1 mitigated with whitelist validation that ensures the untrusted input contains only upper- and lowercase characters from the English alphabet. Without knowledge of the constraints InputUtil.alphaOnly() places on the input, we will report a false positive on the subsequent call to executeQuery().

LISTING 3

List getUser(HttpServletRequest request) {
  ...
  String user = request.getParameter("user");
  if (!InputUtil.alphaOnly(user)) { // ensure user matches a-zA-Z


TABLE 2: A DIFFERENT BREED

Source: Web Input
  File: org.apache.coyote.tomcat5.CoyoteRequestFacade:295
  Method: String[] org.apache.coyote.tomcat5.CoyoteRequest.getParameterValues(String)
  Method Arguments: bean.quantity
  Return Value: ' OR 1=1--
  Stack Trace: ...
  HTTP Request: ...

Sink: Database
  File: com.order.splc.ItemService:201
  Method: ResultSet java.sql.Statement.executeQuery(String)
  Method Arguments: select id, account, sku, quantity, price, ccno, description from item where account = 'gary' and quantity = '' OR 1=1--'
  Stack Trace: ...
  HTTP Request: ...

SQL Injection: A SQL injection issue where external taint reached a database sink.
URL: http://localhost/splc/listMyItems.do  Verified: 3


    log.error("Invalid username specified");
    return null;
  }
  try {
    String sql = "SELECT * FROM users WHERE id='" + user + "'";
    stmt.executeQuery(sql);
  } ...
}

Our approach relies on the user to convey knowledge of input validation mechanisms to the tool using cleanse rules, which specify how to adjust the taint status of values that undergo validation. Cleanse rules can stipulate that a value is no longer tainted after validation, or they can selectively adjust the set of taint flags associated with the value based on the nature of the validation logic. In Listing 3, the user could specify that InputUtil.alphaOnly() prevents meta-character attacks such as SQL injection and eliminate the false positive SQL injection vulnerability that would otherwise be reported.
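To make the cleanse-rule idea concrete, here is a toy sketch of whitelist validation acting as a cleanse step, in the spirit of the article's InputUtil.alphaOnly(). The side-table taint tracking and every name except alphaOnly() are assumptions for illustration, not the authors' code.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Sketch of whitelist validation acting as a "cleanse" step:
// a value that passes the a-zA-Z check has its taint cleared.
public class CleanseDemo {
    private static final Map<String, Boolean> tainted = new IdentityHashMap<>();

    static void markTainted(String s) { tainted.put(s, true); }
    static boolean isTainted(String s) { return tainted.getOrDefault(s, false); }

    // Hypothetical InputUtil.alphaOnly(): true only for purely alphabetic input.
    static boolean alphaOnly(String s) {
        boolean ok = s.matches("[a-zA-Z]+");
        if (ok) tainted.remove(s); // cleanse rule: a validated value is no longer tainted
        return ok;
    }

    public static void main(String[] args) {
        String good = "gary";
        String bad = "' OR 1=1--";
        markTainted(good);
        markTainted(bad);
        System.out.println(alphaOnly(good) + " " + isTainted(good)); // prints: true false
        System.out.println(alphaOnly(bad) + " " + isTainted(bad));   // prints: false true
    }
}
```

In a full implementation the cleanse rule would instead clear only the taint flags the validator actually mitigates, as the surrounding text describes.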

Listing 3 also demonstrates another scenario that often leads to false positives: a control flow path that is exercised by a functional test but cannot occur when attack data are present. In this case, validation logic is built into the control flow structure of the program so that when an attack is identified, a benign log entry is recorded and the transaction is aborted. Since normal test data are not likely to contain attacks, the transaction will complete as expected. However, from an uninformed taint propagation perspective, the value used in the SQL query is untrusted, and therefore a vulnerability will be reported.

The best way to weed out false positives caused by control flow paths that cannot occur is to verify reported vulnerabilities with real attack data. Users can create attacks that build on the context-specific advice reported with each vulnerability to verify their feasibility. When the user mounts an attack, our implementation checks whether the attack makes its way to the sink identified with taint propagation. If the attack makes it through to the sink, then it's likely that the reported vulnerability is a real bug. This technique for verifying bugs is much easier than mounting arbitrary attacks against the program, because it provides users with contextual help constructing the attack and allows them to easily verify whether the attack was successful. It is particularly useful for situations in which an attacker could mount a so-called "blind" injection attack, wherein it is not obvious from the program output that the attack has succeeded.

Table 2 shows an error report for the same SQL injection vulnerability shown in Table 1, but this time the test data also included the attack string "' OR 1=1." When we witness the attack string reach the vulnerable sink, we are able to report the issue as verified.

The potential sources of false negatives are even more diverse than those of false positives. The taint propagation implementation might be missing rules; taint is not tracked through native code or when strings are decomposed into individual characters; and cleanse rules might mistakenly remove taint from a value that has received insufficient validation. Careful rule writing and an understanding of security mechanisms can help mitigate many of these challenges, but even if taint propagation is working properly, poor functional test coverage allows bugs to go unnoticed. The risk of false negatives suggests that dynamic taint propagation should not be the sole means of assuring that a program is free from injection vulnerabilities.

Integrating with Quality Assurance

Dynamic taint propagation can be deployed with varying degrees of involvement from a central security team. The degree to which a security team participates depends on many organization-specific factors, but in general boils down to the ability of the security team and the quality assurance team to conduct each of the phases of the security testing process shown in Figure 2. Save functional testing, which is already conducted by the quality assurance team, any of these steps can be performed by a central security team, a quality assurance team, or some combination of the two. The division of effort depends on the level of security knowledge and familiarity with the program under test, both of which play an integral role in the ability of a given group to complete each phase effectively.

To ensure that the proper rules are being applied during instrumentation, the team must understand both the kinds of sensitive operations the program performs and the security ramifications of those operations. During verification, the ability to construct an effective attack string or to pinpoint real vulnerabilities at the source-code level must be combined with an understanding of how the program operates, in order to exercise the potentially vulnerable constructs and decipher sometimes cryptic program logic. When it comes to reporting bugs, strong remediation advice must include the appropriate security countermeasures and maintain the necessary program functionality.

FIG. 2: WELL-HEALED SECURITY

• Dynamic taint propagation should not be the sole means for assuring that a program is free from injection vulnerabilities.

Depending on the level of involvement the central security team has in development, it may or may not possess the necessary understanding of the program under test to function autonomously. Likewise, only quality assurance teams with above-average security knowledge will be capable of identifying, verifying, and driving the remediation of security vulnerabilities. Based on experience integrating vulnerability identification into the development phase, we anticipate most deployments of dynamic taint propagation to begin with heavy involvement from the central security team, because security knowledge will initially be the gating factor. Gradually, as the quality assurance team builds a foundation of security knowledge, the process can mature to a point where it conducts most activities with only targeted support from the security team.

Related Work and Tools

Taint propagation has long been recognized as a valuable security mechanism, and it has been employed in numerous forms. The most widely used taint propagation system belongs to the Perl programming language. Perl taints user input to ensure that user-supplied commands are not executed in scripts that run with root privileges. Although Perl uses taint in an effort to prevent successful attacks whereas our purpose is to find bugs, our implementation is similar to Perl's in that we taint whole objects rather than individual characters. Also like Perl, we remove taint when a string passes through functions that are typically used for input validation.

Google's Vivek Haldar and others describe a taint propagation system for Java that is much like our own but involves adding instrumentation to the program before it is run. They describe the utility of taint flags, but have not implemented them. Like Perl, their stated goal is to prevent successful attacks rather than to find bugs.

Taint tracking can also be used to find buffer overflow vulnerabilities in programs written in C. System architects Eric Larson and Todd Austin instrument C programs to track the potential size of user input. They update the potential size of an input buffer if the program performs bounds checking. They have applied their technique to find multiple buffer overflow vulnerabilities in OpenSSH.

Purdue associate professor of computer science Dongyan Xu and others track which bytes in a C program come from user input by reserving a portion of the program's address space for taint tracking. Every memory location in the program has an associated entry in the taint map. As user input propagates through the program, instrumentation added to the program updates the taint map. The implementation uses static analysis to eliminate instrumentation in portions of the code that will never carry taint. The advantage of this low-level and highly precise approach is that it can be applied not only to programs written in C, but also to programs written in interpreted languages such as PHP when the interpreter is written in C.

PHP has been the target of numerous taint propagation projects, undoubtedly because PHP has a poor reputation for security and is widely used in applications that accept user input over a network. PHP does not yet have a built-in taint propagation mechanism, but there is a version of the PHP interpreter by Core Security Technologies (grasp.coresecurity.com) that includes taint tracking with character-level precision.

All of the tools mentioned thus far perform taint propagation at runtime. They all associate some shadow state with user input and update that state according to the instructions the program executes. However, taint propagation does not have to wait until runtime. A taint propagation analysis can also be performed statically. A static analysis tool can explore many more possible execution paths than would be practical to exercise during program testing. The disadvantage of static taint propagation is that less information is available about the true state of the program, so information about possible execution paths is necessarily less precise.

Broader QA Role

Dynamic taint propagation does not rely on fault injection and does not disrupt the normal behavior of the application. For this reason, it requires no effort beyond standard functional testing. By harnessing the energy already devoted to functional testing, dynamic taint propagation often finds more input validation bugs than other security testing approaches. And because the technique integrates well with existing QA practices, it is an effective way for QA organizations to contribute to the security process.

REFERENCES
1. http://cwe.mitre.org/documents/vuln-trends

• The disadvantage of static taint propagation is that less information is available about program state.



ROCK-HARD SECURITY

Stuck With Two Impossible Choices

When It Comes To Security Auditing, One Size Does Not Fit All

By Matt Love

Matt Love is a software development manager at Parasoft.

One key problem with security code audits is that they tend to cause more problems than they solve. "One size fits all" audit scans tend to overwhelm developers, ultimately leaving the team with a long list of known problems but little actual improvement. In fact, when an audit tool is used near the end of an application development cycle and it produces a significant number of potential issues, a project manager is put in the uncomfortable position of having to decide whether to delay the project and remediate the code, or send it out into the market as-is.

Trying to inject security into an application through testing is a fool's errand. The number of paths through an application is nearly infinite, and you can't guarantee that all those paths are free of vulnerabilities. It's simply not feasible to identify and test each and every path for vulnerabilities. Moreover, errors would be difficult to fix, considering that the effort, cost, and time required to fix each bug increase exponentially as the development process progresses. Most importantly, the bug-finding approach to security fails to address the root cause of the problem. Security, like quality, must be built into the application.

Building security into an application involves designing and implementing the application according to a policy for reducing the risk of security attacks, then verifying that the policy is implemented and operating correctly. In other words, security requirements should be defined, implemented, and verified just like other requirements.

For example, establishing a policy to apply user input validation immediately after the

input values are received guarantees that all inputs are cleaned before they are passed down through the infinite paths of the code and allowed to wreak havoc (see Figure 1). If this requirement is defined in the security policy, then verified to be implemented in the code, the team does not need to spend countless resources finding every bug and testing every possible user input.

One of the best strategies for building security into the application is to define how code needs to be written to protect it from attacks, then use static analysis to verify that the policy is implemented in the code. This article provides an overview of how this can be accomplished.

Establishing a Security Policy

Writing code without heed for security and later trying to identify and remove all of the application's security vulnerabilities is not only resource-intensive, it's also largely ineffective. To have any chance of exposing all of the security vulnerabilities that may be nested throughout the application, you would need to identify every single path through the application, and then rigorously test each and every one. A policy-based approach helps alleviate that problem.

Security policies are espoused by security experts, such as the Open Web Application Security Project (OWASP), and are mandated for compliance with many regulations, such as Sarbanes-Oxley, that require organizations to demonstrate they have taken "due diligence" in safeguarding application security and information privacy. Yet, although the term is mentioned frequently, it is not often defined.

A security policy is a specification document that defines how code needs to be written to protect it from attacks. Security policies typically include custom security requirements, privacy requirements, security coding best practices, security application design rules, and security testing benchmarks.

What do you do if your team does not already have a well-defined security policy? If the organization has designated security experts, they should be writing these requirements. If not, security consultants could be brought in to help develop appropriate requirements for the specific application under development. Obviously, this would require considerable interaction with the internal team members most familiar with the application.

The security policy should describe what types of resources require privileged access, what kinds of actions should be logged, what kinds of inputs should be validated, and other security concerns specific to the application. To be sure key requirements are not overlooked, I recommend listing all the important assets that a given application interacts with, then prioritizing them based on the importance of protecting each asset.

Applying the Security Policy

Having an effective security policy defined on paper will not translate to a secure application unless the security policy is followed during development. Static analysis can be used to automatically verify whether most security policy requirements are actually implemented in the code and to identify code that requires rework. Verifying the remaining security policy requirements might require unit testing, component testing, peer code review, or other techniques.

Using static analysis to automatically verify the code's compliance with application-specific security policy requirements (for instance, for authentication, authorization, logging, and input validation) requires expressing those requirements as custom static analysis rules, then configuring the tool to check those custom rules. Often, developing such custom rules is simply a matter of tailoring the static analysis tool's available security policy rule templates to suit your own policy. For instance, custom SOA security policy rules can be created from templates such as:

• Do not import WSDLs outside a certain domain
• Do not import schemas outside a certain domain

Custom Java security policy rules can be created from templates such as:

• Ensure all sensitive method invocations are logged
• Allow only certain providers to be specified for the 'Security.addProvider()' method
• Keep all access control methods centralized to enforce consistency
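As a toy illustration of how a custom rule such as the Security.addProvider() restriction might be checked mechanically, the sketch below scans source lines with a regular expression. Real static analysis tools work on parsed code rather than raw text, and the ApprovedProvider/ShadyProvider names are invented, so treat this purely as a sketch of the policy-as-rule idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy "custom rule" checker: flag Security.addProvider() calls whose
// argument is not one of the providers the security policy allows.
public class ProviderRule {
    private static final Pattern CALL =
        Pattern.compile("Security\\.addProvider\\(\\s*new\\s+(\\w+)");
    private static final List<String> ALLOWED = List.of("ApprovedProvider");

    // Return one human-readable violation per offending line.
    public static List<String> violations(List<String> sourceLines) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < sourceLines.size(); i++) {
            Matcher m = CALL.matcher(sourceLines.get(i));
            if (m.find() && !ALLOWED.contains(m.group(1))) {
                out.add("line " + (i + 1) + ": disallowed provider " + m.group(1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = List.of(
            "Security.addProvider(new ApprovedProvider());",
            "Security.addProvider(new ShadyProvider());");
        violations(code).forEach(System.out::println); // flags only ShadyProvider
    }
}
```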

• An effective security policy on paper will not translate to a secure application unless it's followed.

FIG. 1: ONE CODE BRANCH, MULTIPLE INPUTS (multiple inputs flowing through if/switch branches into SQL statements and an XPath query)

Static analysis can also be used to

check whether code complies with industry-standard security best practices developed for the applicable language and technologies. Many available static analysis tools can check compliance with such standards "out of the box," with no special configuration.

If you are developing in Java, you would want to perform static analysis to check industry-standard Java security rules such as:

• Validate an 'HttpServletRequest' object when extracting data from it
• Use JAAS in a single, centralized authentication mechanism
• Do not cause deadlocks by calling a synchronized method from a synchronized method
• Use only strong cryptographic algorithms
• Session tokens should expire
• Do not pass mutable objects to 'DataOutputStream' in the 'writeObject()' method
• Do not set custom security managers outside of the 'main' method

For SOA, you would want to check industry-standard rules such as:

• Avoid unbounded schema sequence types
• Avoid xsd:any, xsd:anyType and xsd:anySimpleType
• Avoid xsd:list types
• Avoid complex types with mixed content
• Restrict xsd simple types
• Use SSL (HTTPS) in WSDL service ports
• Avoid large messages
• Use nonce and timestamp values in UsernameToken headers

To illustrate how following such industry-standard rules can prevent security vulnerabilities, consider the rule "Validate an 'HttpServletRequest' object when extracting data from it." Following this rule is important because HttpServletRequest objects contain user-modifiable data that, if left unvalidated and passed to sensitive methods, could allow serious security attacks such as SQL injection and cross-site scripting. Because it allows unvalidated user data to be passed on to sensitive methods, static analysis would report a violation of this rule for the following code:

String name = req.getParameter("name");

To comply with this rule, the code would need to be modified as follows:

try {
    String name = ISOValidator.validate(req.getParameter("name"));
} catch (ISOValidationException e) {
    ISOStandardLogger.log(e);
}

XML is no safe haven either. For SOA applications, applying industry-standard static analysis rules can expose common security vulnerabilities that manifest themselves in XML. For example, static analysis could be used to parse the document type definitions (DTDs) that define XML files and check for recursive entity declarations that, when parsed, can quickly explode exponentially to a large number of XML elements. If such "XML bombs" are left undetected, they can consume the XML parser and constitute a denial-of-service attack. For instance, static analysis could be used to identify the following DTD that, when processed, explodes to a series of 2^100 "Bang!" elements and will cause a denial of service:

<?xml version="1.0" ?>
<!DOCTYPE foobar [
<!ENTITY x0 "Bang!">
<!ENTITY x1 "&x0;&x0;">
<!ENTITY x2 "&x1;&x1;">
...
<!ENTITY x99 "&x98;&x98;">
<!ENTITY x100 "&x99;&x99;">
]>
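A checker for such recursive entity declarations does not need to expand anything; it can compute expansion counts symbolically. The following Python sketch is an illustration rather than a production parser: the regexes assume simple internal entity declarations like those above, and the bomb threshold is an arbitrary assumption.

```python
import re
from functools import lru_cache

# Match simple internal entity declarations and entity references.
ENTITY = re.compile(r'<!ENTITY\s+(\w+)\s+"([^"]*)"\s*>')
REFERENCE = re.compile(r'&(\w+);')

def max_expansion(dtd):
    """Largest number of text fragments any declared entity expands to,
    computed symbolically (nothing is actually expanded)."""
    entities = dict(ENTITY.findall(dtd))

    @lru_cache(maxsize=None)
    def size(name):
        refs = REFERENCE.findall(entities.get(name, ""))
        # A leaf entity counts as one fragment; otherwise sum the
        # expansion counts of every referenced entity.
        return sum(size(r) for r in refs) if refs else 1

    return max((size(n) for n in entities), default=0)

def is_xml_bomb(dtd, limit=10_000):
    return max_expansion(dtd) > limit

# Rebuild the shape of the DTD above: 100 chained doubling entities.
decls = ['<!ENTITY x0 "Bang!">']
for i in range(1, 101):
    decls.append(f'<!ENTITY x{i} "&x{i-1};&x{i-1};">')
bomb = "<!DOCTYPE foobar [" + "".join(decls) + "]>"
print(max_expansion(bomb) == 2 ** 100)  # exponential blow-up detected
```

Because the count is computed from the declaration graph, the check stays fast no matter how large the expansion would have been.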

Go with the Flow?

Data flow analysis is often hailed as a panacea for detecting security vulnerabilities. It is certainly valuable for quickly exposing vulnerabilities in large code bases without requiring you to ever write a test case or even run the application (see Figure 2). However, there are some notable shortcomings:

• A complex application has a virtually infinite number of paths, but data flow analysis can traverse only a finite number of paths using a finite set of data. As a result, it finds only a finite number of vulnerabilities.

• It identifies symptoms (where the vulnerability manifests itself) rather than root causes (the code that creates the vulnerability).

Rules-based static analysis exposes root causes rather than symptoms, and can reliably target every single instance of that root cause. If you use flow analysis, it will probably find a few instances of SQL injection vulnerabilities, but it cannot find them all. However, if you enforce an input validation rule through rules-based static analysis—finding and fixing every instance where inputs are not properly validated—you can guarantee that SQL injection vulnerabilities will not occur.
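As a sketch of what enforcing such an input validation rule might look like, the following Python fragment flags every 'req.getParameter()' call that is not wrapped in a validator. It is illustrative only: the ISOValidator wrapper is borrowed from the earlier Java example, and a real checker would parse the code rather than pattern-match individual lines.

```python
import re

# The rule: every req.getParameter(...) call must be wrapped in
# ISOValidator.validate(...). Regex matching is a simplification.
GET_PARAM = re.compile(r'req\.getParameter\(')
WRAPPED = re.compile(r'ISOValidator\.validate\(\s*req\.getParameter\(')

def unvalidated_inputs(java_source):
    """Return the line numbers where inputs are read but not validated."""
    violations = []
    for lineno, line in enumerate(java_source.splitlines(), 1):
        if GET_PARAM.search(line) and not WRAPPED.search(line):
            violations.append(lineno)
    return violations

code = '''String name = req.getParameter("name");
String safe = ISOValidator.validate(req.getParameter("age"));'''
print(unvalidated_inputs(code))  # only the first line violates the rule
```

Run over the whole code base, a check of this kind targets the root cause at every call site, which is exactly what flow analysis cannot guarantee.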

ROCK-HARD SECURITY


Page 32: Software Test & Performance issue Apr 2009


I recommend using rule-based static analysis to prevent vulnerabilities and then employing flow analysis to verify that you implemented the appropriate preventative measures and that these measures are being applied properly. No problems should be identified at this point. Issues found at this phase usually indicate process problems that should be addressed immediately. If flow analysis does find a problem, identify its root cause, then enable or create a rule that flags the root cause. By integrating this rule into your regular enforcement process, you expose other instances of the same problem and can prevent it from re-entering the code base in the future.

Policy Implementation Workflow

As new vulnerabilities are found, isolate them and find the root cause for the issue. Once the root cause is identified, a policy is implemented around it. A fix for the vulnerability is determined, and then your static analysis tool is configured to check whether code is written according to the new rule. This checking is then added to your regularly scheduled static analysis tests so that, moving forward, you know that the vulnerability remains fixed. The policy is then applied across the application and organization, ensuring that every instance of that vulnerability is fixed.

Penetration Testing

Once you're confident that the security policy is implemented in the code, a smoke test can help you verify that the security mechanisms operate correctly. This is done through penetration testing, which involves manually or automatically trying to mimic an attacker's actions and checking if any tested scenarios result in security breaches. When penetration testing is performed in this manner, it can provide reasonable assurance of the application's security after it has verified just a few paths through each security-related function.

If the security policy was enforced using static analysis, the penetration testing should reveal two things:

1. Problems are related to security policy requirements that cannot be enforced through static analysis (for instance, requirements involving Perl): If problems are identified, either the security policy must be refined or the code is not functioning correctly and needs to be corrected. In the latter case, locating the source of the problem will be simplified if the code's security operations are centralized (as required by the recommended security policy).

2. Requirements are missing: For example, consider a Web application that requires users to register. The registration form takes in a variety of fields, one of which is the e-mail address. If the e-mail field is known to take any input, the application is missing a requirement to verify that a valid e-mail address is input into the field.
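A requirement test for that missing check might look like the following Python sketch. The regex here is deliberately minimal and is an assumption on my part; real e-mail validation per RFC 5322 is considerably more involved, so treat this as the shape of the test rather than a complete validator.

```python
import re

# Minimal plausibility check: something@something.something,
# with no whitespace and exactly one "@".
EMAIL = re.compile(r'^[^@\s]+@[^@\s]+\.[^@\s]+$')

def is_valid_email(value):
    return bool(EMAIL.match(value))

# Requirement-style checks a penetration test might automate:
for bad in ["", "not-an-email", "a@b", "x y@example.com"]:
    assert not is_valid_email(bad)
assert is_valid_email("user@example.com")
print("e-mail requirement checks passed")
```

Feeding such deliberately invalid values into the registration form, and asserting that each one is rejected, turns the missing requirement into a repeatable test.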

Moreover, to ensure that code remains secure as the application evolves, all security-related tests (including penetration tests, static analysis tests, and other requirements tests) should be added to a regression test suite, and this test suite should be run on a regularly scheduled basis (preferably nightly). Tests are then performed consistently, without disrupting the existing development process. If no problems are identified, no team intervention is required. If tests find that code modifications reintroduce previously-corrected security vulnerabilities or introduce new ones, the team is alerted immediately. This automated testing ensures that applications remain secure over time and also provides documented proof that the application security policy has been enforced.

Don't get stuck with Sophie's Choice. To avoid the dilemma of having to choose between delaying a project to fix errors and deploying a product with known vulnerabilities, incorporate security from the start—at the requirements phase. ý

FIG. 2: SECURE WORKFLOW (diagram: Start here → Test for Security → Vulnerabilities Found? If yes: Isolate Vulnerabilities → Implement Policy → Fix Vulnerabilities → Regression Test → Test for Security again. If no: Security Test Succeeded.)

Page 33: Software Test & Performance issue Apr 2009

When it comes to working with .NET, protection is the key. Protection of the actual application from decompiling will keep your intellectual property from being stolen. And development of far-reaching test cases should protect against performance woes resulting from poorly written code.

Though .NET offers an efficient framework for developing and deploying Windows applications, when deployed they provide hackers with easy access to the source code and the embedded intellectual property. At V.i. Labs, the mission is to determine the "crack risk level" when those apps are run.

"We've seen a lot of organizations quickly migrate to .NET, and what becomes apparent when we review their code is that by going to .NET they exposed their innovation or intellectual property to be easily reverse engineered, or are subjected to piracy issues because they had licensing routines that could be easily discovered or disabled," says product vice president Victor DeMarines. "Of all the development platforms we see, .NET apps are by far the easiest to crack."

Microsoft in the past has bundled third-party obfuscation tools that can be used on portions of an application deemed most sensitive. It provides only a first line of defense against the occasional curious user or IT person looking at the code with a reflection tool, DeMarines says. For serious hackers it's not a deterrent.

Obfuscation, he says, is not nearly enough. "You may want to put in tamper-detection code, or anti-debugging code, or implement third-party products to encrypt the intermediate language within .NET. This provides further defense against an advanced threat, especially if you are worried about piracy or the algorithms in the software." Most at risk, he says, are developers creating financial, e-voting, and even casino software that must deal with governmental regulatory or compliance standards.

Protecting code may require changing not the overall architecture, but rather the way algorithms and DLLs are packaged and put into executables, or how they are distributed on a server. Says DeMarines, "Once you start thinking about what's sensitive in your software, you will start to implement a bit differently."

On the testing side, if test cases are not complete and fail to test every aspect of an API, then the entire API is potentially flawed when you start building services and clients on top of that logic, says Michele Leroux Bustamante, chief architect at IDesign and a Microsoft MVP for Connected Systems.

Working with a customer deploying a system built with Windows Communication Foundation (WCF) services, response times for each request were critical. After some hand-written load testing, significant issues with performance were discovered, "not to mention that [the] number of concurrent calls when testing on a laptop was dropping very short once 30 users were added to the load test," she says. Though the IT department believed WCF to be the issue, the IDesign team pushed for further testing.

Using Redgate's ANTS profiler, the team discovered bottlenecks in the code. Also, there was a problem with the way the load test sent requests, such that concurrent calls were throttled earlier than expected. "Once these issues were addressed, the performance time of a single request was improved and the concurrent calls throttle was no longer lower than expected." The moral, says Bustamante, is that a problem may not be what you think it is, "so it is important to look a little deeper with appropriate tools and a little common sense."

When exercising each tier, Bustamante says sometimes the best test clients are those built by hand. "You can also build simple load tests this way, but using a load testing tool such as that built in to Visual Studio Team System is usually better because with custom load tests it is sometimes difficult to be sure if problems were introduced in the load testing code or if it is indeed the application." As for profiling, she says ANTS profiler for .NET is a must-have "to see where your code is spending the most time."
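A hand-built load test of the kind Bustamante describes can be quite small. The sketch below, in Python with a stand-in function simulating the service call (no real WCF endpoint is assumed), fires concurrent requests and reports average and 95th-percentile latency:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def call_service(i):
    """Stand-in for a real service request; in practice this would be
    an actual HTTP/SOAP call to the system under test."""
    time.sleep(0.01)  # simulated service latency
    return "ok"

def load_test(users=30, calls_per_user=5):
    """Fire concurrent requests and report simple latency stats."""
    latencies = []

    def one_call(i):
        start = time.perf_counter()
        call_service(i)
        latencies.append(time.perf_counter() - start)

    with ThreadPoolExecutor(max_workers=users) as pool:
        list(pool.map(one_call, range(users * calls_per_user)))
    latencies.sort()
    return {
        "requests": len(latencies),
        "avg_s": sum(latencies) / len(latencies),
        "p95_s": latencies[int(0.95 * len(latencies))],
    }

stats = load_test()
print(stats)
```

A harness like this is enough to reproduce the 30-user scenario from the article; the caveat Bustamante raises still applies, since a bug in the harness itself can masquerade as an application problem.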

Although with a simple load test and ANTS it's relatively easy to discover any performance issues the application will have in a production environment, she cautions this doesn't necessarily scale for extremely large loads, but says "the issues you see with simple load tests of up to 30 users uncovers 90 percent of problems without spending money on expensive third-party load testing companies." Those companies, Bustamante says, often can't do much more than you could have uncovered with your own tests. ý

Without Protection, .NET Apps are Easily Cracked

Best Practices

Joel Shore

Joel Shore is a 20-year industry veteran and has authored numerous books on personal computing. He owns and operates Reference Guide, a technical product reviewing and documentation consultancy in Southborough, Mass.


Page 34: Software Test & Performance issue Apr 2009

Organizations might recognize the need for consultants but remain unsure about how to use them. In a traditional consulting arrangement, a firm might deploy a team of individuals full-time at a client site for forty hours a week, typically four days at ten hours per day per consultant. Under milestone consulting, a client engages the firm to check in with them on a regular basis, ensuring that the project is meeting its goals. Then there's the hybrid consultant—offering equal parts project manager, techie, and application expert—with on-site visits every two weeks or so.

Consultancies typically prefer the traditional consulting arrangement, primarily because it maximizes billable time and revenue. Second, consultants on the ground can better steer clients in the right direction throughout the project, manage issues and ensure an overall smoother implementation. On the downside, traditional consulting tends to be the most expensive option for clients. Also, many organizations face end-user availability issues. Client end-users are often overworked and too busy to spend time with consultants. Remember, end-users on implementation teams have day jobs, while consultants are there to implement the new system exclusively.

Consultants on site are billing regardless of whether or not their skills are being used efficiently. In the rare event that a project is ahead of schedule, rare is the consulting company that attempts to move dates up or suggests that its consultants do not need to be on site for several weeks.

Among the most obvious benefits of the milestone consulting approach is minimal cost. To the extent that the consultant's arrival is known well in advance, end-users can focus on their day jobs during the week, knowing that they will devote certain days to the new system, coinciding with the arrival of the consultant. In theory, this can be more efficient.

But the milestone method should be used judiciously; it is rife with potential disadvantages. For example, there may be no one keeping an eye on the implementation on a daily basis, allowing goals and dates to fall by the wayside. Issues may not be broached in time to address them without impacting a go-live date. Also, the implementation's flow may suffer. Projects that constantly start and stop often lose momentum. Projects with more interruptions have a greater chance of failure, and milestone-based approaches tend to have this limitation.

Given the cost of consultants, many clients might question the need to have a team of three or more highly paid hourly resources on staff for forty hours per week. As a general rule, the quality and number of required external resources varies inversely with the quality and number of available and experienced internal end-users. In other words, an organization with extensive internal resources and expertise needs fewer external consultants. Organizations cannot expect to successfully implement major systems exclusively with either consultants or end-users. Almost always, a combination of each is required.

Another consideration is end-user availability. Regular employees still have to do their day jobs in order for the organization to conduct business. For example, a payroll manager cannot set up, test, and document a new payroll system at the expense of paying current employees. If an organization wants to minimize the number of external consultants on an implementation, it must ensure that the end-users on its implementation team are significantly devoted to, and have sufficient expertise in, that system. A project's timeframe, complexity and scope are also critical factors. All else being equal, consultants called in to solve a discrete task with no particular due date may not need to show up for months at a time. Assuming an organization's documentation is sufficient, a consulting firm may be able to perform the work required using the milestone approach. Conversely, consider a client with a bevy of complex issues, poor internal documentation, and a "drop-dead" date of two months to resolve an issue. It's very unlikely that the client will be able to use consultants in a limited capacity. Minimal consultant input and resources does not mean zero. On just about every new system implementation or upgrade, organizations must use external application experts, technical resources adept at installing the application, and seasoned project managers who have dealt with many of the issues likely to face the client.

Many organizations lack the expertise that consultancies provide. As such, organizations benefit from having knowledgeable, on-site consultants who ensure that the project stays on course, issues are reported and resolved, and individual objectives are met. Before hiring external consultants, senior management should consider budget, the state of its internal documentation, end-user availability, and the timeframe, scope, and complexity of the issue or project. A complex but poorly-documented issue that needs to be resolved yesterday cannot be accomplished under a milestone approach. At the other extreme, a simple but less urgent issue probably doesn't require a full-time team of consultants to solve it.

Most real-world scenarios fall somewhere in between. ý

Future Test


Traditional vs. Milestone

Phil Simon


Philip Simon is an independent consultant serving manufacturing, health care and retail industries.

Page 35: Software Test & Performance issue Apr 2009

On Another Issue of The Test & QA Report eNewsletter!

Each FREE biweekly issue includes original articles that interview top thought leaders in software testing and quality trends, best practices and Test/QA methodologies. Get must-read articles that appear only in this eNewsletter!

Subscribe today at www.stpmag.com/tqa

To advertise in the Test & QA Report eNewsletter, please call +631-393-6054

[email protected]

Don't Miss Out

Page 36: Software Test & Performance issue Apr 2009