lpw 2007 - perl plumbing
Perl Plumbing
Mike Astle
LPW 1 Dec 2007
I'm Mike. I work at Nestoria. We are pleased to again be sponsors of the LPW.
I'm going to talk about the things programmers build that aren't products themselves but help them develop better products.
I'll draw on our experiences at Nestoria and discuss everything in the possibly familiar context of a LAMP(erl) system.
Some of these things you may know intimately, but more likely you have heard of them and never really gotten around to using them.
Perl Plumbing Toolbox
Pipe::Flush
Pipe::Leak
Pipe::Crack
Bundle::Pipe
Crack::Expose
HTML::Spanner
Crypt::Drip
Data::Bucket
Talking Pointz
Test::Harness
Devel::Cover
HTTPD::Bench::ApacheBench
Devel::DProf/Apache::DProf
Plumbing == Infrastructure
Why spend the time?
More effective unit tests
Better benchmarking
Faster optimization
Ever so unsexy
Because you should
Automated Testing
Test::Harness
And His Super Friends
Testing
We've got both kinds of testing - unit >and< blackbox
Test::Harness, Test::Simple, Test::More, Test::WWW::Mechanize, ...
Mature family of modules with >lots< of features
A simple example...
Test Example - The Test
astle@nerdcore:~$ cat test.t
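#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 3;
BEGIN { use_ok('Acme::Test'); }           # test 1: the module loads
my $obj = Acme::Test->new();              # direct method call instead of indirect-object syntax
ok $obj, 'new object';                    # test 2: the constructor returned something true
ok $obj->doSomething(), 'doSomething';    # test 3: the method returned a true value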
Test Example - The Output
astle@nerdcore:~$ perl test.t
1..3
ok 1 - use Acme::Test;
ok 2 - new object
ok 3 - doSomething
Test Example - TAP
Test output is in the Test Anything Protocol (TAP) format
Easily digested by Test::Harness by way of the prove utility
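Because TAP is plain text, it is also easy to consume from your own scripts. A minimal sketch, assuming TAP::Parser (which ships with Test::Harness 3.x) is installed; the helper name summarize_tap is made up for illustration and is not part of any module:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use TAP::Parser;

# The TAP produced by the example test, as plain text.
my $tap = <<'END_TAP';
1..3
ok 1 - use Acme::Test;
ok 2 - new object
ok 3 - doSomething
END_TAP

# Count passing and failing test lines in a TAP stream.
# (summarize_tap is a hypothetical helper written for this sketch.)
sub summarize_tap {
    my ($tap_text) = @_;
    my $parser = TAP::Parser->new( { tap => $tap_text } );
    my ( $passed, $failed ) = ( 0, 0 );
    while ( my $result = $parser->next ) {
        next unless $result->is_test;    # skip the plan line, comments, etc.
        if ( $result->is_ok ) { $passed++ }
        else                  { $failed++ }
    }
    return ( $passed, $failed );
}

my ( $passed, $failed ) = summarize_tap($tap);
print "passed=$passed failed=$failed\n";
```

This is roughly what the harness does on your behalf, so in day-to-day use prove is all you need.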
Test Example - prove
astle@nerdcore:~$ prove test.t
test....ok
All tests successful.
Files=1, Tests=3, 0 wallclock secs ( 0.02 cusr + 0.00 csys = 0.02 CPU)
t/base/config/basics...................................................ok
t/base/config/dirs.....................................................ok
t/base/config/merging..................................................ok
t/base/config/warnings.................................................ok
t/base/crypt...........................................................ok
t/base/datadumper/basics...............................................ok
t/base/dates...........................................................ok
t/base/grid_search.....................................................ok
t/base/instructions....................................................ok
t/base/langutils.......................................................ok
t/base/logging/diag....................................................ok
t/base/logging.........................................................ok
t/base/obfuscate.......................................................ok
t/base/ranges..........................................................ok
t/base/report..........................................................ok
t/base/utils/basics....................................................ok
t/blackbox/api/es_realestate/create_url................................ok
t/blackbox/api/es_realestate/echo_es...................................ok
t/blackbox/api/es_realestate/keywords..................................ok
t/blackbox/api/es_realestate/metadata..................................ok
t/blackbox/api/es_realestate/search_listings...........................ok
t/blackbox/api/uk_realestate/create_url................................ok
t/blackbox/api/uk_realestate/echo_uk...................................ok
t/blackbox/api/uk_realestate/keywords..................................ok
t/blackbox/api/uk_realestate/metadata..................................ok
t/blackbox/api/uk_realestate/metadata_coord...........................ok
t/blackbox/api/uk_realestate/search_listings...........................ok
t/blackbox/frontend/es_realestate/302_or_html..........................ok
t/blackbox/frontend/es_realestate/apache...............................ok
t/blackbox/frontend/es_realestate/autosuggest..........................ok
t/blackbox/frontend/es_realestate/cb...................................ok
t/blackbox/frontend/es_realestate/check_404_not_looping................ok
t/blackbox/frontend/es_realestate/cookies..............................ok
t/blackbox/frontend/es_realestate/demo.................................ok
t/blackbox/frontend/es_realestate/dimg.................................ok
t/blackbox/frontend/es_realestate/fbml.................................ok
t/blackbox/frontend/es_realestate/keep_query_for_redirect..............ok
t/blackbox/frontend/es_realestate/kml..................................ok
t/blackbox/frontend/es_realestate/map..................................ok
t/blackbox/frontend/es_realestate/metrics..............................ok
t/blackbox/frontend/es_realestate/redirect_to_www......................ok
t/blackbox/frontend/es_realestate/rss..................................ok
t/blackbox/frontend/es_realestate/share................................ok
t/blackbox/frontend/es_realestate/static_file..........................ok
t/blackbox/frontend/es_realestate/timer................................ok
t/blackbox/frontend/es_realestate/widget...............................ok
t/blackbox/frontend/es_realestate/www_cgi..............................ok
t/blackbox/frontend/uk_realestate/302_or_html..........................ok
t/blackbox/frontend/uk_realestate/acookie..............................ok
t/blackbox/frontend/uk_realestate/apache...............................ok
t/blackbox/frontend/uk_realestate/autosuggest..........................ok
t/blackbox/frontend/uk_realestate/cb...................................ok
t/blackbox/frontend/uk_realestate/check_404_not_looping................ok
t/blackbox/frontend/uk_realestate/cluster_image/end_to_end.............ok
t/blackbox/frontend/uk_realestate/cookies..............................ok
t/blackbox/frontend/uk_realestate/correct_let_to_rent..................ok
t/blackbox/frontend/uk_realestate/correct_location.....................ok
t/blackbox/frontend/uk_realestate/demo.................................ok
t/blackbox/frontend/uk_realestate/dimg.................................ok
t/blackbox/frontend/uk_realestate/botnet...............................ok
t/blackbox/frontend/uk_realestate/fbml.................................ok
t/blackbox/frontend/uk_realestate/keep_query_for_redirect..............ok
t/blackbox/frontend/uk_realestate/kml..................................ok
t/blackbox/frontend/uk_realestate/map..................................ok
t/blackbox/frontend/uk_realestate/metrics..............................ok
t/blackbox/frontend/uk_realestate/read_from_apache_log.................ok
t/blackbox/frontend/uk_realestate/redirect_to_www......................ok
t/blackbox/frontend/uk_realestate/rss..................................ok
t/blackbox/frontend/uk_realestate/share................................ok
t/blackbox/frontend/uk_realestate/static_file..........................ok
t/blackbox/frontend/uk_realestate/timer................................ok
t/blackbox/frontend/uk_realestate/widget...............................ok
t/blackbox/frontend/uk_realestate/www_cgi..............................ok
t/blackbox/search-api/es_realestate....................................ok
t/blackbox/search-api/uk_realestate....................................ok
t/cache/basics.........................................................ok
t/cache/list_of_all_caches.............................................ok
t/compiles_and_use_strict..............................................ok
t/database/base/arcgrid_datapoint......................................ok
t/database/base/geoid_datapoint........................................ok
t/database/base/geoid_summary..........................................ok
t/database/base........................................................ok
t/database/census......................................................ok
t/database/coverage....................................................ok
t/database/ctax........................................................ok
t/database/get_random_listings.........................................ok
t/database/listingdb...................................................ok
t/database/localdb.....................................................ok
t/database/metricsdb...................................................ok
t/database/searchindex/base............................................ok
t/database/searchindex/set_comment.....................................ok
t/database/users.......................................................ok
t/dependencies.........................................................ok
t/etl/attrhash.........................................................ok
t/etl/convert..........................................................ok
t/etl/dropbox..........................................................ok
t/etl/get_summaries_since..............................................ok
t/etl/images...........................................................ok
t/etl/import...........................................................ok
t/etl/instance/cleanup.................................................ok
t/etl/instance/convert.................................................ok
t/etl/instance/import..................................................ok
t/etl/instance/process.................................................ok
t/etl/instance/receive.................................................ok
t/etl/instance.........................................................ok
t/etl/object_cache.....................................................ok
t/listings/scorer/quality_score_v2.....................................ok
t/listings/scorer/random_scorer........................................ok
t/listings/scorer/simple_scorer........................................ok
t/listings/searchindex.................................................ok
t/listings/slice.......................................................ok
t/listings/stripe......................................................ok
t/listings/striper.....................................................ok
t/listings/transformer.................................................ok
t/localization/basics..................................................ok
t/etl/weather/control..................................................ok
t/etl/process/realestate/es............................................ok
t/etl/process/realestate/uk............................................ok
t/etl/process..........................................................ok
t/etl/product/realestate/es............................................ok
t/etl/product/realestate/realestate....................................ok
t/etl/product/realestate/uk............................................ok
t/etl/receive..........................................................ok
t/etl/servers..........................................................ok
t/etl/summary..........................................................ok
t/etl/thumbs...........................................................ok
t/external_apps/lockrun................................................ok
t/geo/cleangeoid.......................................................ok
t/geo/clusterutils.....................................................ok
t/geo/coord_location...................................................ok
t/geo/creategeoid......................................................ok
t/geo/find_locations/base..............................................ok
t/geo/find_locations/coordinates.......................................ok
t/geo/find_locations/es/postcodes......................................ok
t/geo/find_locations/es/specials.......................................ok
t/geo/find_locations/es/spellcheck.....................................ok
t/geo/find_locations/geoids............................................ok
t/geo/find_locations/uk/postcodes......................................ok
t/geo/find_locations/uk/specials.......................................ok
t/geo/find_locations/uk/spellcheck.....................................ok
t/geo/find_locations/wordindex.........................................ok
t/geo/geocoder/common..................................................ok
t/geo/geocoder/coordinates.............................................ok
t/geo/geocoder/es/addresses............................................ok
t/geo/geocoder/es/cleansing............................................ok
t/geo/geocoder/es/hierarchy............................................ok
t/geo/geocoder/es/streets..............................................ok
t/geo/geocoder/uk/addresses............................................ok
t/geo/geocoder/uk/cleansing............................................ok
t/geo/geocoder/uk/postcode_attributes..................................ok
t/geo/geogrid..........................................................ok
t/geo/geoidindex.......................................................ok
t/geo/geometry.........................................................ok
t/geo/inputformat......................................................ok
t/geo/lookupfiles/es...................................................ok
t/geo/lookupfiles/uk...................................................ok
t/geo/postcodes/es.....................................................ok
t/geo/postcodes/uk.....................................................ok
t/geo/thirdparty/arcgrid/base..........................................ok
t/geo/thirdparty/arcgrid/create........................................ok
t/geo/thirdparty/arcgrid/dbfile........................................ok
t/geo/thirdparty/arcgrid/parse.........................................ok
t/geo/thirdparty/arcgrid/serialize.....................................ok
t/geo/thirdparty/arcgrid/testfile......................................ok
t/geo/tileoverlay......................................................ok
t/geo/utils............................................................ok
t/geo/utils_es.........................................................ok
t/geo/wordindex........................................................ok
t/io/writer/multi/file.................................................ok
t/io/writer/multi/memory...............................................ok
t/io/writer/single/file................................................ok
t/io/writer/single/memory..............................................ok
t/link_checker/test_plugins............................................ok
And so on...
Failed Test                            Stat Wstat Total Fail  List of Failed
------------------------------------------------------------------------------
t/blackbox/api/es_realestate/echo_es.t  255 65280     1    1  1
t/blackbox/api/uk_realestate/echo_uk.t  255 65280     1    1  1
t/listings/scorer/quality_score_v2.t      2   512    57    2  28-29
t/localization/check_po_files.t           4  1024   506    4  3-4 425-426
t/pod-names.t                             1   256   294    1  163
t/test-template.t                       255 65280     1    2  1
51 tests and 5 subtests skipped.
Failed 6/348 test scripts. 10/9515 subtests failed.
Files=348, Tests=9515, 845 wallclock secs (624.34 cusr + 18.95 csys = 643.29 CPU)
Failed 6/348 test programs. 10/9515 subtests failed.
Testing/Staging Server/Cluster
This is >not< a development server: no local work
As similar as possible to the production environment; could even be on production hardware, running on non-conflicting ports
Enough oomph to run test suite in a reasonable amount of time
Nightly Build
Get the HEAD
We use svn up via IPC::Run3
You may prefer SVN::Agent or SVN::Client
Check for local changes: no tomfoolery
Clean build
Update dependencies
Rebuild configuration
Restart services (e.g. Apache, MySQL)
What about test data?
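The "check for local changes" step can be sketched as a filter over the text that svn status prints. This is only an illustration: local_changes is a hypothetical helper, the sample text is invented, and in a real nightly script the status text would come from running svn status (for example via IPC::Run3, as mentioned above).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the `svn status` lines that indicate a local change.
# Column 1 is the item status (A added, C conflicted, D deleted, M modified,
# R replaced, ! missing, ~ obstructed); column 2 flags property changes.
# Unversioned ('?') and external ('X') entries are ignored here, which is a
# policy assumption made for this sketch.
sub local_changes {
    my ($status_output) = @_;
    return grep { /^(?:[ACDMR!~]|.[CM])/ } split /\n/, $status_output;
}

# Invented sample output, standing in for a real `svn status` run.
my $sample = "?       notes.txt\nM       lib/Lokku/Cache.pm\n";

my @dirty = local_changes($sample);
if (@dirty) {
    print "Local changes found, refusing to build:\n";
    print "  $_\n" for @dirty;
}
```

Failing loudly here keeps hand edits on the staging box from quietly ending up in the nightly results.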
Nightly Test Run
Only test each revision once: nobody likes to hear the same thing twice
We use prove -r (look forward to prove -r -j 4)
Send output in a useful format: we do some simple filtering of the prove output
Fix problems right away: squeaky wheels eventually get ignored
Congratulate yourself when things are good
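Two of the points above lend themselves to a few lines of Perl: remembering which revision was tested last, and trimming the prove output down to what needs attention. A rough sketch, not our actual script; the helper names and the state-file idea are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# True if $revision is the one recorded in the state file by the last run.
sub already_tested {
    my ( $state_file, $revision ) = @_;
    return 0 unless -e $state_file;
    open my $fh, '<', $state_file or die "Cannot read $state_file: $!";
    my $last = <$fh>;
    close $fh;
    return 0 unless defined $last;
    chomp $last;
    return $last eq $revision ? 1 : 0;
}

# Remember the revision that has just been tested.
sub record_tested {
    my ( $state_file, $revision ) = @_;
    open my $fh, '>', $state_file or die "Cannot write $state_file: $!";
    print {$fh} "$revision\n";
    close $fh or die "Cannot close $state_file: $!";
    return;
}

# Drop the plain "name.....ok" lines and keep everything else
# (failures, the summary table, the totals).
sub filter_prove_output {
    my (@lines) = @_;
    return grep { !/\.+ok\s*$/ } @lines;
}

# Example with two lines in the style of the prove output shown earlier.
print filter_prove_output(
    "t/base/crypt...........................................................ok\n",
    "Failed 6/348 test scripts. 10/9515 subtests failed.\n",
);
```

The nightly job would then skip the run when already_tested is true and mail out only the filtered lines otherwise.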
Code Coverage
Devel::Cover Your Ass
Code Coverage Tools
How do you know if you have enough tests?
Coverage tools tell you how much of your code is actually executed
Devel::Cover is pretty bad ass
Still alpha after all these years?
Can also be used under mod_perl (we don't do this...yet)
Run For Cover
Call cover on every .t file, with output going into a single database
Ignore list can be a bit trying: expect some trial and error
Generate summary stats from a single database
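A minimal sketch of a driver for this, assuming the tests live under t/ and that every run writes to Devel::Cover's default cover_db directory so the results accumulate. The ignore options are copied from our command lines; find_test_files and cover_command are hypothetical helper names.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Devel::Cover options: ignore the tests themselves, ext/ and system libraries.
my $COVER_OPTS = '-MDevel::Cover=-ignore,^t/,+ignore,^ext/,+ignore,^/usr/';

# Collect every .t file below a directory, in sorted order.
sub find_test_files {
    my ($dir) = @_;
    my @tests;
    find(
        {
            no_chdir => 1,    # $_ then holds the full path
            wanted   => sub { push @tests, $_ if /\.t\z/ && -f $_ },
        },
        $dir
    );
    return sort @tests;
}

# Build the command (as a list, so no shell quoting is needed) for one test.
sub cover_command {
    my ($test_file) = @_;
    return ( $^X, $COVER_OPTS, $test_file );
}

# Show the command for one test without running it.
print 'Running command: ', join( ' ', cover_command('t/etl/images.t') ), "\n";

# A real run would loop over the tests and then build the report:
#   for my $t ( find_test_files('t') ) {
#       system( cover_command($t) ) == 0 or warn "$t exited non-zero\n";
#   }
#   system('cover');    # summary from the accumulated cover_db
```

Passing the command as a list to system keeps the regex characters in the ignore patterns away from the shell.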
Running for Cover
Running command: /usr/bin/perl -MDevel::Cover=-ignore,^t/,+ignore,^ext/,+ignore,^/usr/ /home/lokku/code/perl/t/etl/images.t
Running command: /usr/bin/perl -MDevel::Cover=-ignore,^t/,+ignore,^ext/,+ignore,^/usr/ /home/lokku/code/perl/t/database/users.t
Running command: /usr/bin/perl -MDevel::Cover=-ignore,^t/,+ignore,^ext/,+ignore,^/usr/ /home/lokku/code/perl/t/geo/coord_location.t
Ran for Cover
---------------------------- ------ ------ ------ ------ ------ ------ ------
File                           stmt   bran   cond    sub    pod   time  total
---------------------------- ------ ------ ------ ------ ------ ------ ------
lib/Lokku/Base/Config.pm       92.9   85.0   66.7   94.7    n/a    5.4   90.8
lib/Lokku/Base/Crypt.pm        96.6   85.3   47.8  100.0    n/a    4.7   87.3
lib/Lokku/Base/DataDumper.pm   80.0   68.8   66.7   83.3   40.0    1.4   75.0
lib/Lokku/Base/Dates.pm        80.6   70.1   63.2   91.3    n/a   10.9   75.2
lib/Lokku/Base/GridSearch.pm  100.0   83.3   66.7  100.0    n/a    0.7   95.6
...okku/Base/Instructions.pm  100.0    n/a    n/a  100.0    n/a    0.0  100.0
lib/Lokku/Base/LangUtils.pm   100.0    n/a    n/a  100.0    n/a    0.0  100.0
lib/Lokku/Base/Logging.pm      86.6   37.5   38.1  100.0    0.0    0.3   68.9
...se/Logging/DiagLogging.pm  100.0    n/a    n/a  100.0   20.0    0.1   89.2
...Base/Logging/NoLogging.pm   66.7    n/a    n/a   45.5   12.5    0.0   48.6
.../Logging/SimpleLogging.pm   39.3    n/a    n/a   36.4   12.5    0.0   34.0
lib/Lokku/Base/Obfuscate.pm    98.7   92.3   50.0  100.0    n/a    3.6   96.5
lib/Lokku/Base/Ranges.pm       74.4   50.0   40.7   91.7  100.0    1.1   62.4
lib/Lokku/Base/Report.pm       82.7   69.4   69.6   94.4  100.0    2.7   80.1
lib/Lokku/Base/Utils.pm        68.2   50.0   40.0   85.2  100.0   58.4   63.2
lib/Lokku/Cache.pm             26.1    0.0    0.0   33.3    n/a    0.0   23.5
lib/Lokku/DB/Base.pm            8.5    0.0    0.0   21.2   24.3    0.1    7.5
lib/Lokku/DB/UsersDB.pm        11.6    0.0    0.0   31.9    0.0    0.1    7.4
...okku/Geo/CoordLocation.pm   21.1    3.3    4.3   50.0    n/a    0.0   16.5
lib/Lokku/Geo/CreateGeoId.pm   16.6    3.2    0.0   33.3   37.0    0.2   16.3
lib/Lokku/Geo/GeoCoder.pm      20.8    0.0    0.0   48.4   87.5    0.1   17.7
lib/Lokku/Geo/GeoIdIndex.pm    34.1    9.7    5.9   57.7   90.9    0.2   29.5
...
Benchmarking
How much do ya HTTPD::Bench::ApacheBench?
Benchmarking
Running tests against your own code (or services) to measure performance
Emphasize comparability
Name and version benchmarks
Generate benchmarks from real query logs or from your staging server for new features
Record context (name, version, hardware, number of clients) in output
Benchmarking - Tools
We use HTTPD::Bench::ApacheBench
A few shortcomings:
No per-request headers (e.g. cookies)
Could be easier to pick out the slow pokes
Benchmarking - Output
---------- Variables ----------
Start Time: 1195660491
Stop Time: 1195662141
Benchmark: frontend-profile
Version: version1
---------- Statistics ----------
Total Correct Responses: 2931
Average: 0.2432
Standard Deviation: 0.3950
Percentiles:
p95: 0.913
p98: 1.451
p99: 2.075
---------- Errors ----------
Benchmarking - Analysis
Ignore the average
Look at percentiles
All SLAs are in nines
Good average can be bad
Bad average can be good
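To make the point about averages concrete, here is a small sketch that computes the mean and nearest-rank percentiles from a list of response times. The sample data is invented for illustration (96 fast requests and 4 very slow ones), and the nearest-rank definition is one common choice rather than necessarily the one behind the numbers above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(ceil);
use List::Util qw(sum);

# Nearest-rank percentile: the smallest sample such that at least
# $p percent of all samples are less than or equal to it.
sub percentile {
    my ( $p, @samples ) = @_;
    die "no samples given\n" unless @samples;
    my @sorted = sort { $a <=> $b } @samples;
    my $rank = ceil( $p * scalar(@sorted) / 100 );
    $rank = 1 if $rank < 1;
    return $sorted[ $rank - 1 ];
}

# Arithmetic mean of a non-empty list.
sub mean {
    my (@samples) = @_;
    return sum(@samples) / scalar(@samples);
}

# Invented response times in seconds: 96 quick requests, 4 slow pokes.
my @times = ( (0.1) x 96, (4.0) x 4 );

printf "Average: %.4f\n", mean(@times);
printf "p%d: %.3f\n", $_, percentile( $_, @times ) for 95, 98, 99;
```

With this data the average stays around a quarter of a second, which looks healthy, while p98 and p99 sit at four seconds. That is the "good average can be bad" case.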
Profiling
Not Just For Java Anymore
Profiling
Profilers are tools for analyzing where your code is spending all of its time
My personal fetish
Sometimes complicated to set up but definitely worth it
Profiling - Tools
We use Devel::DProf
Best of Breed?
There are others in CPAN
Hook into mod_perl with Apache::DProf
Profiling - Output
DProf File: /home/lokku/common/conf/apache/logs/dprof/10284/tmon.out
Total Elapsed Time = 792.3461 Seconds
User+System Time = 13.22611 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
24.0 3.178 3.178 804 0.0040 0.0040 Lokku::URL::SERP::process
17.9 2.371 13.116 172 0.0138 0.0763 Lokku::Website::Base::handler
9.78 1.294 9.353 2506 0.0005 0.0037 HTML::Mason::Request::comp
8.95 1.184 1.292 5806 0.0002 0.0002 Lokku::Template::loc
7.75 1.025 1.129 3783 0.0003 0.0003 Lokku::Website::SERP::__ANON__
3.15 0.417 9.645 4090 0.0001 0.0024 HTML::Mason::Commands::__ANON__
2.80 0.370 5.025 1468 0.0003 0.0034 HTML::Mason::Request::scomp
2.56 0.339 0.339 220 0.0015 0.0015 Lokku::URL::SERP::as_path
1.71 0.226 0.226 1743 0.0001 0.0001 Lokku::Template::url_for
1.70 0.225 0.225 29819 0.0000 0.0000 BerkeleyDB::_tiedHash::NEXTKEY
1.29 0.170 0.170 4 0.0425 0.0425 Lokku::NLP::UK::EN::RealEstate::Analyse::new
1.14 0.151 0.143 260 0.0006 0.0005 Text::Balanced::_match_codeblock
1.00 0.132 0.149 1652 0.0001 0.0001 Text::Balanced::_match_variable
0.73 0.096 0.096 1691 0.0001 0.0001 CGI::Util::rearrange
0.69 0.091 0.091 7748 0.0000 0.0000 URI::_query::query
0.68 0.090 0.090 2 0.0450 0.0450 Lokku::NLP::new
0.65 0.086 0.086 1493 0.0001 0.0001 Lokku::URL::SERP::set
0.60 0.079 0.079 8434 0.0000 0.0000 HTML::Mason::Interp::apply_escapes
0.57 0.075 0.075 5806 0.0000 0.0000 Locale::Maketext::Lexicon::EXISTS
0.57 0.075 0.081 497 0.0002 0.0002 Lokku::NLP::sort_and_convert
0.49 0.065 0.065 1812 0.0000 0.0000 Text::Balanced::_match_quotelike
Profiling - Analysis
One output file per Apache process
There is code out there to combine them, but it looks dicey
Just run one child
Look for targets of opportunity
Inclusive and Exclusive display modes
Some reason to believe that output is not always accurate
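One way to hunt for targets of opportunity is to pull the rows for your own namespace out of the profiler table and rank them by exclusive time. A rough sketch, assuming rows in the seven-column layout shown in the output slide; both helper names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one row of the table:
#   %Time ExclSec CumulS #Calls sec/call Csec/c Name
# Returns a hash reference, or nothing for headers and other lines.
sub parse_dprof_row {
    my ($line) = @_;
    my @f = split ' ', $line;
    return unless @f == 7 && $f[0] =~ /^\d+(?:\.\d+)?$/;
    my %row;
    @row{qw(pct excl_sec cum_sec calls sec_per_call csec_per_call name)} = @f;
    return \%row;
}

# Rows whose sub name starts with $prefix, largest exclusive time first.
sub targets_of_opportunity {
    my ( $prefix, @lines ) = @_;
    my @rows = grep { index( $_->{name}, $prefix ) == 0 }
               map  { parse_dprof_row($_) } @lines;
    return sort { $b->{excl_sec} <=> $a->{excl_sec} } @rows;
}

# Three rows taken from the output shown earlier, plus the header line.
my @lines = (
    '%Time ExclSec CumulS #Calls sec/call Csec/c Name',
    '9.78 1.294 9.353 2506 0.0005 0.0037 HTML::Mason::Request::comp',
    '17.9 2.371 13.116 172 0.0138 0.0763 Lokku::Website::Base::handler',
    '24.0 3.178 3.178 804 0.0040 0.0040 Lokku::URL::SERP::process',
);

printf "%-32s %6.3f s exclusive\n", $_->{name}, $_->{excl_sec}
    for targets_of_opportunity( 'Lokku::', @lines );
```

Filtering on your own prefix is just one reading of "targets of opportunity"; time spent in third-party code such as HTML::Mason can be worth a look too.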
Is it all worth it?
Hells yes
Small teams need good tools: take advantage of the fine work done by others
Low one-time cost, low maintenance, big long-term gain
Run tests daily; review other data monthly
Many ways to do things, but the lazy way is often more efficient: avoid rolling your own
Thanks!