libMesh: Lessons in Distributed Collaborative Design and Development
Roy H. Stogner (1), John W. Peterson (2)
(1) The University of Texas at Austin
(2) Idaho National Laboratory
Feb 26, 2013
Roy H. Stogner, John W. Peterson Distributed Development Feb 26, 2013 1 / 31
Outline
1 Introduction
2 Collaboration Strategies
3 API Development
4 Source Code Control
  Attention Deficit Development
  Linear and Nonlinear History
5 Development and Testing
6 Build Systems
Introduction
libMesh Finite Element Library
Scope
• Open source, free to download
  - LGPL
• 13 Ph.D. theses, 186 papers (30 in 2012)
• ∼10 current developers
• O(100) current users?
Challenges
• Widely dispersed core developers
  - INL, UT-Austin, JSC, MIT, Harvard, Argonne
• ITAR, commercial applications common
• Radically different application types
Collaboration Strategies
Communication
• Face to face, instant messaging, teleconference
• Email lists
  - [email protected], [email protected]
• Trac tickets, Redmine issues
• SourceForge, GitHub issues
Code
• Email attachments
• Ticket attachments
  - Repository forks!
  - Pull requests!
API Development
Tracking API Changes
API versions easily proliferate...
  #if PETSC_VERSION_LESS_THAN(3,1,0)
    // PETSc older than 3.1.0: the call takes an extra PETSC_DECIDE argument
    ierr = MatGetSubMatrix(matrix->mat(),
                           _restrict_to_is,_restrict_to_is_complement,
                           PETSC_DECIDE,MAT_INITIAL_MATRIX,&submat1);
    CHKERRABORT(libMesh::COMM_WORLD,ierr);
  #else
    // PETSc 3.1.0 and later: same call without that argument
    ierr = MatGetSubMatrix(matrix->mat(),
                           _restrict_to_is,_restrict_to_is_complement,
                           MAT_INITIAL_MATRIX,&submat1);
    CHKERRABORT(libMesh::COMM_WORLD,ierr);
  #endif
• Maintain a wide range of external compatibility
• Limit libMesh API changes
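One way to keep such version checks from spreading is to confine them to a single wrapper, so that callers see one stable signature. The following is a minimal sketch of that idea, not libMesh's actual code: "extlib", its version macros, and both function names are hypothetical stand-ins for an external library whose call signature changed between releases.

```cpp
// Sketch only: "extlib" and every name below are hypothetical stand-ins,
// not PETSc or libMesh identifiers.
#define EXTLIB_VERSION_MAJOR 3
#define EXTLIB_VERSION_MINOR 1

// True when the stand-in library is older than major.minor.
#define EXTLIB_VERSION_LESS_THAN(major, minor)                       \
  ((EXTLIB_VERSION_MAJOR < (major)) ||                               \
   (EXTLIB_VERSION_MAJOR == (major) && EXTLIB_VERSION_MINOR < (minor)))

#if EXTLIB_VERSION_LESS_THAN(3, 1)
// Older releases: the caller also passes a size hint (ignored here).
inline int extlib_submatrix_entries(int rows, int cols, int /*size_hint*/)
{
  return rows * cols;
}
#else
// Newer releases: the size hint argument is gone.
inline int extlib_submatrix_entries(int rows, int cols)
{
  return rows * cols;
}
#endif

// The one function the rest of the code calls. Its signature stays the
// same whichever version of the external library is installed.
inline int submatrix_entries(int rows, int cols)
{
#if EXTLIB_VERSION_LESS_THAN(3, 1)
  return extlib_submatrix_entries(rows, cols, -1);
#else
  return extlib_submatrix_entries(rows, cols);
#endif
}
```

With this layout, application code calls submatrix_entries(rows, cols) everywhere, and only the wrapper needs editing when another external release changes the signature again.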
Signaling API Changes
Development practices
• Old, new APIs overlap
• Easier with C++ function overloading, default arguments
  - Adding f(a,b) does not preclude keeping f(a)
  - Adding f(a,b=default) can replace f(a)
Runtime warnings
• libmesh_experimental() (in-flux APIs)
• libmesh_deprecated() (1 year, 1-2 releases)
Examples
• OStringStream workaround class
• Parallel:: global functions
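The overlap between old and new APIs can be sketched in a few lines of C++. This is an illustration under stated assumptions, not libMesh source: sketch_deprecated() is a hypothetical stand-in for a runtime deprecation warning, and scale() and shift() are made-up functions playing the roles of f(a), f(a,b), and f(a,b=default).

```cpp
#include <iostream>

// Hypothetical stand-in for a runtime deprecation warning.
// It prints one message per call site, the first time that site is reached.
#define sketch_deprecated()                                          \
  do {                                                               \
    static bool warned = false;                                      \
    if (!warned) {                                                   \
      warned = true;                                                 \
      std::cerr << "*** Warning: deprecated code reached in "        \
                << __func__ << " (" << __FILE__ << ":" << __LINE__   \
                << ")\n";                                            \
    }                                                                \
  } while (0)

// New API: adding f(a,b) ...
inline double scale(double a, double b)
{
  return a * b;
}

// ... does not preclude keeping f(a). The old overload stays for a
// while, warns at run time, and forwards to the new one.
inline double scale(double a)
{
  sketch_deprecated();
  return scale(a, 2.0);
}

// Alternative: f(a,b=default) replaces f(a) outright, so existing
// one-argument calls keep compiling with no separate overload.
inline double shift(double a, double b = 1.0)
{
  return a + b;
}
```

Existing callers of scale(a) keep working and see a warning on standard error, which gives them time to migrate before the old overload is removed.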
Source Code Control
• When discussing SCC software, the distinction between “distributed” and “centralized” is often stressed, perhaps unnecessarily.
• Distributed SCC software, like git, is very frequently used in a semi-centralized manner.
• The libMesh library is now distributed from GitHub [1], and therefore we focus on git in this talk, but the discussion should apply to other SCC software as well.
[1] https://github.com/libMesh/libmesh
Source Code Control Attention Deficit Development
• A more intrinsic difference between various flavors of SCC software is rather the ability to make “local commits.”
• git and other “distributed” SCC software packages (hg) have this feature.
• SVN lacks this feature, and therefore makes work interruptions (which can be rather frequent in collaborative development) difficult to handle.
• Consider the following scenario:
  - You are working on a new feature and have several locally-modified files (“A”, “D”, or “M” state in svn status)
  - You receive email from a collaborator about a bug fix he’d like you to test ASAP. His patch may or may not conflict with your current set of changes.
• What do you do?
• In SCC software without local commits, your choices are:
  1 Make a patch of your local changes (e.g. svn diff), revert them, and hope to come back to them later.
  2 See if your collaborator’s patch applies cleanly on top of what you are already doing.
  3 Create a fresh checkout, apply the patch, recompile everything, and test.
• The choices aren’t pretty:
  1 This is manual source code control, something tools should help you avoid!
  2 If the patch program fails, the results can be cryptic; if patch succeeds, it may be hard to revert later.
  3 This approach clearly doesn’t scale in disk space or CPU cycles.
• In SCC software with local commits, specifically git, and assuming you are working on my-branch, you:
  - git ci your work (ci and co are user-defined aliases for commit and checkout, not built-in git commands).
  - Create a new branch, probably from master.
  - Apply your collaborator’s patch, let him know what you find.
  - git co my-branch
• Once you are back on my-branch, you can do a “soft reset” to get back to exactly where you were before the interruption.
• If you don’t want to mess with extra branches, you can instead git stash what you’re currently doing, try out your collaborator’s patch, and git stash pop to return to your original state.
Source Code Control Linear and Nonlinear History
• The first question a git-based development team [2] should debate is whether maintaining a “linear” history is desirable/important.
• There are pros and cons to both linear and nonlinear development histories.
• The answer probably lies somewhere between “rigidly-enforced linearity” and “merges gone wild.”
[2] Especially teams transitioning from SVN.
Example - Useful Nonlinearity
* 4df7f73 Adding list of bibtex templates.
* e04db6d Merge pull request #45 from benkirk/eigen
|\
| * e3bd55d get contributed Eigen into build system
| * 13fa33d add eigen-3.1.2 unsupported API
| * d03f946 adding eigen-3.1.2 to contrib
|/
* 1249c5d more fine-grained fallback for --disable-mpi
* e15fef7 use <rpc/xdr.h> when it is there
• A (short-lived) feature branch is created, committed to, and merged back into master.
• Preserves the context in which development took place. Useful!
Example - Rigid Linearity
* bc56be9 Fixes for our Epetra vector interface
* 4f2b016 Making reading work, adding support ...
* 243753e Again, don’t degrade to single precision ...
* b277d0a We are using a vtkDoubleArray, so don’t ...
* 6bac31a Hoist function calls out of loop conditionals.
* fe85fae Standardizing spacing, formatting, indentation, etc.
* ce703f9 Use VTK_LEGACY_REMOVE. Thanks, cato-, for the idea.
* 84da4b4 prevent netcdf from running most of the ...
• Commits fe85fae — 4f2b016 are a group of logically-connected changes.
• This information is lost because the author did his development directly on master instead of branching.
Example - Misleading Nonlinearity
* cfd23fa Merge branch master
|\
| * 285ebaa Adding citations and webpage generation script.
* | 1aa5d5f trump --enable-petsc with --disable-mpi
* | 9644c5f fallback to rpc/xdr.h when looking for xdr.
|/
* 070515a LibMeshInit can accept more argument constness
• The three “middle” commits are unrelated to one another.
• The author of 9644c5f and 1aa5d5f ran git pull, bringing down an unrelated change, and producing a merge commit.
• Branch does not preserve any particular development context.
Example - Merges Gone Wild
* 9f639a6 Merge branch master
|\
| * 936e197 Merge branch master
| |\
| | * 2b80c18 Remove uninitialized dphi warnings ...
| * | 0309eac is_adjoint() bugfixes
| |/
* | 514052e UnsteadySolver fixes and optimization
* | e5aac7b Merge branch master
|\ \
| |/
| * b6155a5 Changes in quadrature_simpson_3D.C for -Wshadow.
• Three “merge” commits and four “real” commits.
• Low signal-to-noise ratio: three of the seven commits are merges.
• git bisect often stops at merge commits.
Current Guidelines
• Strive for “useful nonlinearity.”
• Develop separate feature sets on separate branches; merge them back to master when complete.
• Minimize or eliminate periodic/unnecessary merge commits.
• Instead, rebase feature branches on top of master before merging and pushing.
• Rebasing public (aka shared) branches is bad™, so wait until you are ready to push, branch from the shared branch locally, rebase it on top of master, and then merge it.
• git can be complicated, but it is not inherently so.
• Forget cats, there is more than one way to skin every type of animal in git.
• Teams should find the approach that works best for them.
• http://nvie.com/posts/a-successful-git-branching-model
Development and Testing
Software Tracking
• Trac, Redmine - Wiki and issue tracking systems for software development projects
  - http://trac.edgewall.org, http://www.redmine.org
  - Interface to your VCS of choice
  - Issue tracking (aka tickets) can reference commits and vice versa
  - Open source: (BSD, GPL2)
• Bitten, BuildBot - Continuous Integration
  - http://bitten.edgewall.org, http://trac.buildbot.net
  - Build recipes in XML, Python formats
  - Can send build failure notifications directly to relevant parties
Issue tracking (tickets)
Build Status
Regression Testing
Diagnosing Failed Builds
Build Systems
Autotools, Pros and Cons
Autoconf
• Manages feature selection
  - 50+ --enable-foo options
• Portability tests, workarounds
• POSIX shell dependence
Libtool
• Easily used via automake
• Broader shared library support
• DLL management in install
• More difficult in-place debugging
Automake
• dist, check, install targets
• Out-of-source builds
• Standardized conventions
• More difficult METHOD support
• “bootstrap” process
  - Do users have autotools?
  - Custom scripts for libMesh
Questions?