SSS Validation and Testing September 11, 2003 Rockville, MD William McLendon Neil Pundit Erik DeBenedictis Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.

SSS Validation and Testing

September 11, 2003

Rockville, MD

William McLendon

Neil Pundit

Erik DeBenedictis

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.

Overview

• APItest
• Release Testing Experiences at Sandia
• Status daemon

Distributed Runtime System Testing

• Complex system of interactions
• Approach to testing
  – Component Testing
  – Benchmarks
    • Performance / Functionality
  – Operational Profile
  – Stress Testing
• Users expect a high degree of quality in today’s high-end systems!

APItest

APITEST - Overview

• Unit-testing tool for network components
  – Targeted for networked applications
  – Extensible framework
  – Dependency calculus for inter-test relationships
• Scriptable Tests (XML Schema Grammar)
• Multi-Protocol Support
  – TCP/IP, SSSLib, Portals, HTTP

Accomplishments Since Last Meeting

• Spent a week at Argonne (July)
  – Major rework of framework of APItest
    • Individual tests are atomic.
    • Framework handles the hard work of checking tests, dependencies, and aggregate results.
  – Extensibility
    • New test types are easy to create
• Dependency System
  – Define relationships as a DAG encoded in XML.
  – Boolean dependencies on edges.

Supported Test Types

• sssTest
  – uses ssslib to communicate with ssslib-enabled components
• shellTest
  – executes a command
• httpTest
  – for testing applications' web interfaces (à la Globus, etc.)
• tcpipTest
  – raw socket transmission via TCP/IP

Creating New Test Types is Easy

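A simple test that will always pass:

    # Test and TODependencies are assumed to come from the APItest framework.
    class passTest(Test):
        __attrfields__ = ['name']
        typemap = {'dependencies': TODependencies}

        def setup(self):
            pass

        def execute(self, scratch):
            # Expected pattern and response agree, so the test always matches.
            self.expect['foo'] = [('REGEXP', 'a')]
            self.response['foo'] = 'a'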
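To make the expect/response contract above concrete, here is a minimal, self-contained sketch. It is not the APItest framework: `MiniTest`, `matched()`, and `run_once()` are hypothetical stand-ins invented for illustration, and the assumption that a framework compares `self.expect` against `self.response` after `execute()` returns is inferred from the example, not confirmed by the slides.

```python
import re


class MiniTest:
    """Hypothetical stand-in for a framework base class (not the real Test)."""

    def __init__(self, name):
        self.name = name
        self.expect = {}    # key -> list of (format, pattern) pairs
        self.response = {}  # key -> actual string produced by execute()

    def setup(self):
        pass

    def execute(self, scratch):
        raise NotImplementedError

    def matched(self):
        """Return True if every expected pattern agrees with the response."""
        for key, patterns in self.expect.items():
            actual = self.response.get(key)
            if actual is None:
                return False
            for fmt, pattern in patterns:
                if fmt == 'REGEXP':
                    ok = re.search(pattern, actual) is not None
                else:
                    # Assumption: any other format means exact string match.
                    ok = actual == pattern
                if not ok:
                    return False
        return True


class PassTest(MiniTest):
    def execute(self, scratch):
        self.expect['foo'] = [('REGEXP', 'a')]
        self.response['foo'] = 'a'


class FailTest(MiniTest):
    def execute(self, scratch):
        self.expect['foo'] = [('REGEXP', '^b$')]
        self.response['foo'] = 'a'


def run_once(test):
    """Run one repetition and report whether it matched."""
    test.setup()
    test.execute(scratch={})
    return test.matched()
```

Under these assumptions a new test type only has to fill in `expect` and `response`; the comparison logic stays in one place.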

Matching and Aggregation

• An individual test can be executed many times in a sequence.
  – PASS/FAIL can be determined based on the percent of runs that matched.
  – Percent match can be specified as a range as well.
• Expected result is specified as a regular expression (REGEX) or a string for exact matching (TXTSTR)
• Notation:
  – M[min:max] - Percent matching.
    • min/max = bounds on % of tests where actual and expected results match.
    • If the actual percentage of matching runs is within the specified range, the test will PASS; otherwise it will FAIL.
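The M[min:max] rule can be sketched in a few lines of Python. The function names `percent_matched` and `passes` are illustrative, not part of APItest; the defaults of 0 and 100 for an omitted bound are assumed to mirror the open-ended notation such as M[90:] and M[:].

```python
def percent_matched(outcomes):
    """Percent of repetitions that matched; outcomes is a list of booleans."""
    if not outcomes:
        raise ValueError("at least one repetition is required")
    return 100.0 * sum(outcomes) / len(outcomes)


def passes(outcomes, min_pct=0.0, max_pct=100.0):
    """Apply M[min_pct:max_pct]: PASS when the match percent is in range."""
    pct = percent_matched(outcomes)
    return min_pct <= pct <= max_pct
```

For example, three matches out of five repetitions give 60%, which satisfies M[40:90], while five out of five give 100% and fail the same range. A range such as M[0:0] passes only when no repetition matched, which is how a test can be required to fail.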

Test Dependencies

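[Figure: two dependency graphs, each with a test T that depends on other tests through labeled edges.]

Graph 1: T depends on A through edge M[40:90] and on B through edge M[0:0].
  T iff A[40:90] OR B[0:0]

Graph 2: T depends on A through edge M[100:], on B through edge M[90:], and on C through edge M[:].
  T iff (A[100:] B[90:]) C[:]

M[40:90] : >= 40% and <= 90% of test runs matched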

An Example Dependency

(A[100:] AND ((B[100:100] AND C) OR D[:0]))

<dependencies>
  <AND>
    <dependency name='A' minPctMatch='100'/>
    <OR>
      <AND>
        <dependency name='B' minPctMatch='100'/>
        <dependency name='C'/>
      </AND>
      <dependency name='D' maxPctMatch='0'/>
    </OR>
  </AND>
</dependencies>
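One way such a dependency tree could be evaluated is sketched below with Python's standard `xml.etree.ElementTree`. This is an illustration, not APItest's implementation: the function names are hypothetical, the `pct_matched` dictionary of measured percentages is an assumed input, and treating the outer `<dependencies>` wrapper as an implicit AND is an assumption.

```python
import xml.etree.ElementTree as ET

DEPENDENCY_XML = """
<dependencies>
  <AND>
    <dependency name='A' minPctMatch='100'/>
    <OR>
      <AND>
        <dependency name='B' minPctMatch='100'/>
        <dependency name='C'/>
      </AND>
      <dependency name='D' maxPctMatch='0'/>
    </OR>
  </AND>
</dependencies>
"""


def evaluate(node, pct_matched):
    """Recursively evaluate one node against measured match percentages."""
    if node.tag == 'dependency':
        low = float(node.get('minPctMatch', 0.0))
        high = float(node.get('maxPctMatch', 100.0))
        return low <= pct_matched[node.get('name')] <= high
    children = [evaluate(child, pct_matched) for child in node]
    if node.tag == 'OR':
        return any(children)
    # <AND> and the outer <dependencies> wrapper: every child must hold.
    return all(children)


def dependencies_satisfied(xml_text, pct_matched):
    """Return True if the dependency expression in xml_text is satisfied."""
    return evaluate(ET.fromstring(xml_text), pct_matched)
```

With A at 100%, the expression holds either when B is at 100% (C may be anything) or when D never matched; if A falls below 100%, nothing else can rescue it.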

An Example Test Sequence

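[Figure: a test sequence of three tests, "reset daemon", "test daemon", and "test other stuff", connected by dependency edges labeled M[:], M[:], and M[:30].]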

Standard Test Metadata Attributes

Attribute    Type     Req'd  Default   Description
-----------  -------  -----  --------  -----------
name         string   YES              name of test
numReps      integer  NO     1         number of times to execute test
minPctMatch  float    NO     0.0       min % of repetitions that must match for test to pass
maxPctMatch  float    NO     100.0     max % of repetitions that must match for test to pass
preDelay     float    NO     0.0       delay in seconds prior to executing test
postDelay    float    NO     0.0       delay in seconds after executing test before continuing to next test
iterDelay    float    NO     0.0       delay in seconds between tests during loop (only used if numReps > 1)
onMismatch   string   NO     CONTINUE  what to do if a test fails to match: CONTINUE, BREAK, or HALT
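The attribute table maps naturally onto a small record type with defaults. The sketch below is one possible representation, not APItest code: the class name `StandardAttributes` and the helper `from_attributes` are hypothetical, and silently ignoring attributes outside the table (such as a test-type-specific `command`) is an assumption.

```python
from dataclasses import dataclass

ON_MISMATCH_ACTIONS = ('CONTINUE', 'BREAK', 'HALT')


@dataclass
class StandardAttributes:
    name: str                     # required
    numReps: int = 1
    minPctMatch: float = 0.0
    maxPctMatch: float = 100.0
    preDelay: float = 0.0
    postDelay: float = 0.0
    iterDelay: float = 0.0
    onMismatch: str = 'CONTINUE'

    @classmethod
    def from_attributes(cls, attrs):
        """Build the record from a dict of XML attribute strings."""
        if 'name' not in attrs:
            raise ValueError("'name' is required")
        converters = {'name': str, 'numReps': int, 'onMismatch': str}
        kwargs = {}
        for key, value in attrs.items():
            # Assumption: attributes outside the standard table are skipped.
            if key in cls.__dataclass_fields__:
                kwargs[key] = converters.get(key, float)(value)
        meta = cls(**kwargs)
        if meta.onMismatch not in ON_MISMATCH_ACTIONS:
            raise ValueError(f"unknown onMismatch action: {meta.onMismatch}")
        return meta
```

Keeping the defaults in one place means a script only has to state what differs from them, which matches how the example scripts omit most attributes.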

Example Scripts

• A simple shell execution test:

<shellTest name='test1' numReps='1' preDelay='5' postDelay='5.3' command='ls -ltr'/>

• Test with a dependency and stdout matching:

<shellTest name='test2' command='apitest.py --test'>
  <output format='REGEXP' type='stdout'>.*stdout.*</output>
  <dependencies>
    <dependency name='test1' minPctMatch='100.0'/>
  </dependencies>
</shellTest>
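What a shell test with stdout matching boils down to can be sketched with the standard `subprocess` and `re` modules. The function `run_shell_test` is hypothetical and not taken from APItest; using `re.search` for REGEXP and a verbatim comparison for any other format are assumptions about the matching semantics.

```python
import re
import subprocess


def run_shell_test(command, pattern, fmt='REGEXP'):
    """Run command in a shell and compare its stdout with pattern."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    if fmt == 'REGEXP':
        return re.search(pattern, result.stdout) is not None
    # Assumption: any other format means exact comparison of the full output.
    return result.stdout == pattern
```

A call such as `run_shell_test('echo hello stdout world', '.*stdout.*')` would count as one matching repetition; the framework-level PASS/FAIL decision would still depend on the percent-match range and on the test's dependencies.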

APItest Output

iterations test name % matched Pass/Fail message
---------- --------- --------- --------- ----------
[1 of 1]   A         100.00%   PASS
[1 of 1]   K         100.00%   FAIL      m[0.0% : 0.0%]
[1 of 1]   J         0.00%     FAIL      m[90.0% : 90.0%]
[5 of 5]   M         100.00%   PASS
[1 of 1]   L         100.00%   FAIL      m[0.0% : 0.0%]
[1 of 1]   N         100.00%   PASS
[0 of 1]   T         DEPENDENCY FAILURE(S)
                     F expected [0.0% : 90.0%], got 100.0
                     J expected [90.0% : 100.0%], got 0.0
[0 of 1]   R         DEPENDENCY FAILURE(S)
                     J expected [90.0% : 100.0%], got 0.0
                     K expected [0.0% : 90.0%], got 100.0
[1 of 1]   S         100.00%   PASS
[1 of 1]   U1        100.00%   PASS
[1 of 1]   U2        100.00%   PASS
[0 of 1]   U3        DEPENDENCY FAILURE(S)
                     N expected [0.0% : 90.0%], got 100.0
                     S expected [0.0% : 90.0%], got 100.0
[1 of 1]   U4        100.00%   PASS

sssTest outputs from Chiba City

iterations test name        % matched Pass/Fail message
---------- ---------        --------- --------- ----------
[1 of 1]   add-location     100.00%   PASS
[1 of 1]   QuerySDComps     100.00%   PASS
[1 of 1]   QuerySDHost      100.00%   PASS
[1 of 1]   QuerySDProtocol  100.00%   PASS
[1 of 1]   QuerySDPort      100.00%   PASS
[1 of 1]   del-location     100.00%   PASS
[1 of 1]   val-removal      100.00%   PASS

iterations test name        % matched Pass/Fail message
---------- ---------        --------- --------- ----------
[1 of 1]   sss-getproto     100.00%   PASS
[1 of 1]   sss-getport      100.00%   PASS
[1 of 1]   sss-gethost      100.00%   PASS
[1 of 1]   sss-getcomp      100.00%   PASS
[1 of 1]   sss-getproto     100.00%   PASS
[1 of 1]   sss-getport      100.00%   PASS
[1 of 1]   sss-gethost      100.00%   PASS
[1 of 1]   sss-getcomp      100.00%   PASS

Release Testing…

Tales from Cplant Release Testing

• Methodical execution of production jobs and third-party benchmarks to identify system instabilities so that they can be resolved, e.g.:
  – Rapid job turnover rate (caused mismatches between scheduler and allocator)
  – Heavy I/O (I/O that passes through the launch node process instead of going directly to ENFS, “yod-io”)
• Wrapping the above codes into the Ctest framework to enable portable compile, launch, and analysis of synthetic workloads

Ctest

• Extension of Mike Carifio’s work

– Presented at the SciDAC meeting in Houston during Fall of 2002

– Make structure that holds a suite of independent applications.

– Tools to launch as a reproducible workload.

– Goal: 30 users and 60 concurrent apps

Sample Load Profile on CPlant

Issue Tracking

• SNL uses a program called RT
  – Centralized repository for issue tracking helps give an overall picture of what problems are.
  – Helps give summary of progress.
• Bugzilla is on the SciDAC SSS website
  – http://bugzilla.mcs.anl.gov/scidac-sss/
  – Who’s using it?

Status Daemon …

Status Daemon

• Highly configurable monitoring infrastructure for clusters.
  – Does not need to run daemon on the node you are monitoring.
  – XML configurable
  – Web interface
• “Cluster Aware”
• Used on CPlant production clusters
• James Laros ([email protected])

Status Daemon Communication

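[Figure: Status Daemon communication diagram. The admin node is shown with a daemon, an XML config file, local tests, remote tests, status and status update, and disk. Two leader nodes are shown, each with its own daemon, local tests, and remote tests for its compute nodes; XML data flows between the leader daemons and the admin node.]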

Summary

• New Hire
  – Ron Oldfield

• APItest functionality and flexibility increases

• Release testing experience

• Status Daemon

Plans

• APItest
  – User / Programmer Manuals
  – User Interface
    • GUI? HTTP?
  – Daemon mode for parallel testing
  – DB Connectivity
  – Test Development
    • ssslib event tests
    • HTTPtest work
    • ptlTest (SNL)
• SWP Integration
  – Port SWP to Chiba for SC2003?

Questions?