Top 10 Perl Performance Tips

Perrin Harkins, We Also Walk Dogs

Upload: perrin-harkins

Post on 10-May-2015

DESCRIPTION

This talk was presented at YAPC::NA 2010 and OSCON 2010.

TRANSCRIPT

Page 1: Top 10 Perl Performance Tips

Top 10 Perl Performance Tips

Perrin Harkins
We Also Walk Dogs

Page 2: Top 10 Perl Performance Tips

Devel::NYTProf

Page 3: Top 10 Perl Performance Tips

Ground Rules

● Make a repeatable test to measure progress with
  ○ Sometimes turns up surprises
● Use a profiler (Devel::NYTProf) to find where the time is going
  ○ Don't flail and waste time optimizing the wrong things!
● Try to weigh the cost of developer time vs buying more hardware
  ○ Optimization is crack for developers, hard to know when to stop
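One way to build the repeatable test is the core Benchmark module. The sketch below is only an illustration: the two subroutines and the fixed test data are hypothetical stand-ins for whatever code is being tuned. It checks that both versions agree before timing them, since a faster wrong answer is not progress.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Fixed input so every run measures the same work (hypothetical test data).
my @lines = map { "line $_ of fixed test data\n" } 1 .. 5_000;

# Two candidate implementations of the same job (illustrative only).
sub build_with_concat {
    my $out = '';
    $out .= $_ for @lines;
    return $out;
}

sub build_with_join {
    return join '', @lines;
}

# Verify the candidates agree before comparing their speed.
die "implementations disagree\n"
    unless build_with_concat() eq build_with_join();

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    concat => \&build_with_concat,
    join   => \&build_with_join,
});
```

To see where the time goes in a whole program, Devel::NYTProf is typically run as `perl -d:NYTProf script.pl` followed by `nytprofhtml` to generate the report.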

Page 4: Top 10 Perl Performance Tips

1. The Big Picture

● The biggest gains usually come from changing your high-level approach
  ○ Is there a more efficient algorithm?
  ○ Can you restructure to reduce duplicated effort?
● Sometimes you just need to tune your SQL
● A boatload of RAM hides a multitude of sins
● The bottleneck is usually I/O
  ○ Files
  ○ Database
  ○ Network
  ○ Batch I/O often makes a huge difference

Page 5: Top 10 Perl Performance Tips

2. Use DBI Efficiently

● Can make a huge difference in tight loops with many small queries
● connect_cached() avoids connection overhead
  ○ Or use your favorite connection cache, but beware overuse of ping()
● prepare_cached() avoids object creation and server-side prepare overhead
● Use bind parameters to reuse SQL statements instead of creating new ones

Page 6: Top 10 Perl Performance Tips

2. Use DBI Efficiently

● Use bind_columns() in a fetch() loop for the most efficient retrieval
  ○ Less copying is faster
  ○ Alternatively, fetchrow_arrayref()
● Calling prepare() once and then execute() many times is faster than repeated do() calls
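The DBI calls named on these slides can be sketched together as follows. This is a minimal illustration, not the speaker's code: it assumes DBD::SQLite is installed, and the in-memory database, the `users` table, and the helper names `dbh` and `names_from` are all hypothetical. DBI's method for binding every output column at once is `bind_columns()`.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; the in-memory database is for illustration.
my @connect_args = (
    'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 },
);

# connect_cached() hands back the same handle for the same arguments.
sub dbh { return DBI->connect_cached(@connect_args) }

dbh()->do('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

# prepare once, execute many times, with bind parameters instead of new SQL.
my $insert = dbh()->prepare_cached('INSERT INTO users (id, name) VALUES (?, ?)');
$insert->execute($_, "user$_") for 1 .. 3;

sub names_from {
    my ($min_id) = @_;
    my $sth = dbh()->prepare_cached(
        'SELECT id, name FROM users WHERE id >= ? ORDER BY id'
    );
    $sth->execute($min_id);

    # Bind output columns to variables so each fetch() fills them in place.
    $sth->bind_columns(\my ($id, $name));

    my @found;
    while ($sth->fetch) {
        push @found, "$id:$name";
    }
    return @found;
}

print join(', ', names_from(2)), "\n";
```

The second call to `prepare_cached()` with the same SQL returns the already prepared statement handle, which is where the savings in a tight loop come from.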

Page 7: Top 10 Perl Performance Tips

2. Use DBI Efficiently

● Turn off AutoCommit for batch changes
  ○ Committing every thousand rows or so saves work for your database
● Use your database's bulk loader when possible
  ○ Writing rows to CSV and using MySQL's LOAD DATA INFILE crushes the fastest DBI code
  ○ 10X speedup is not unusual
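Batched commits might look like the sketch below. It assumes DBD::SQLite is installed; the `log` table and the helper `insert_in_batches` are hypothetical names chosen for illustration, and the batch size of 1000 simply follows the slide's rule of thumb.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available. AutoCommit is off, so we choose when to commit.
my $dbh = DBI->connect(
    'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 0 },
);
$dbh->do('CREATE TABLE log (n INTEGER)');
$dbh->commit;

# Insert values, committing once per batch instead of once per row.
# Returns the number of commits issued.
sub insert_in_batches {
    my ($dbh, $values, $batch_size) = @_;
    my $sth     = $dbh->prepare('INSERT INTO log (n) VALUES (?)');
    my $pending = 0;
    my $commits = 0;

    for my $value (@$values) {
        $sth->execute($value);
        if (++$pending >= $batch_size) {
            $dbh->commit;
            $commits++;
            $pending = 0;
        }
    }

    # Commit the final partial batch, if any.
    if ($pending) {
        $dbh->commit;
        $commits++;
    }
    return $commits;
}

my $commits = insert_in_batches($dbh, [ 1 .. 2500 ], 1000);
my ($rows)  = $dbh->selectrow_array('SELECT COUNT(*) FROM log');
print "$rows rows in $commits commits\n";

$dbh->rollback;      # nothing is pending; this just ends the open transaction
$dbh->disconnect;
```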

Page 8: Top 10 Perl Performance Tips

2. Use DBI Efficiently

● Use ORMs wisely
  ○ Consider using straight DBI for the most performance-sensitive sections
    ■ Removing a layer means fewer method calls and faster code
  ○ Write report queries by hand if they seem slow
    ■ Optimizer hints and choices about SQL variations are beyond the scope of ORMs but make a huge difference for this kind of query

Page 9: Top 10 Perl Performance Tips

3. Choose the Fastest Hash Storage

● memcached is not the fastest option for a local cache
  ○ BerkeleyDB (not DB_File!) and Cache::FastMmap are about twice as fast
● CHI abstracts the storage layer
  ○ Useful if you think network strategy may change later

Page 10: Top 10 Perl Performance Tips

3. Choose the Fastest Hash Storage

Cache                      Get time   Set time   Run time
CHI::Driver::Memory        0.03ms     0.05ms     0.35s
BerkeleyDB                 0.05ms     0.17ms     0.57s
Cache::FastMmap            0.06ms     0.09ms     0.62s
CHI::Driver::File          0.10ms     0.26ms     1.11s
Cache::Memcached::Fast     0.12ms     0.15ms     1.23s
Memcached::libmemcached    0.14ms     0.16ms     1.40s
CHI::Driver::DBI SQLite    0.11ms     1.94ms     2.05s
Cache::Memcached           0.29ms     0.21ms     2.88s
CHI::Driver::DBI MySQL     0.45ms     0.33ms     4.41s

Page 11: Top 10 Perl Performance Tips

4. Generate Code and Compile to a Subroutine

● This is how most templating tools work
● Remove the cost of things that won't change for a while
  ○ Skip re-parsing templates
  ○ Skip large groups of conditionals
  ○ Choose architecture-specific code

    my %subs;
    my $thing = 'world';    # example value; interpolated once, when the code is generated
    my $code  = qq{print "Hello $thing\n";};
    $subs{'hello'} = eval "sub { $code }" or die $@;
    $subs{'hello'}->();
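To make the "skip re-parsing templates" point concrete, here is a sketch of a toy template compiler. The `[% name %]` placeholder syntax and the function name `compile_template` are assumptions made for this example, not the API of any real templating module. The template is parsed once into Perl source, compiled with a string eval, and the resulting subroutine is reused for every render.

```perl
use strict;
use warnings;

# Parse a template once and compile it into a subroutine.
# The returned sub takes a hashref of values and returns the rendered text.
sub compile_template {
    my ($template) = @_;
    my @parts;

    # split with a capturing group keeps the placeholders in the result list.
    for my $chunk (split /(\[%\s*\w+\s*%\])/, $template) {
        next if $chunk eq '';
        if ($chunk =~ /^\[%\s*(\w+)\s*%\]$/) {
            # Placeholder: becomes a hash lookup in the generated code.
            push @parts, "\$_[0]{'$1'}";
        }
        else {
            # Literal text: escape backslashes and quotes, emit as a string.
            (my $literal = $chunk) =~ s/([\\'])/\\$1/g;
            push @parts, "'$literal'";
        }
    }

    my $source = 'sub { join q{}, ' . join(', ', @parts) . ' }';
    my $sub = eval $source or die "template failed to compile: $@";
    return $sub;
}

# Compile once, render many times without touching the parser again.
my $hello = compile_template("Hello [% name %], welcome to [% place %]!\n");
print $hello->({ name => 'world', place => 'YAPC' });
print $hello->({ name => 'Perl',  place => 'OSCON' });
```

Because placeholder names are restricted to `\w+` and literal text is escaped before being embedded, the generated source stays well-formed; code generation from less constrained input would need more care.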

Page 12: Top 10 Perl Performance Tips

5. Sling Text Efficiently

● Slurp files when possible.

    my $text = do { local $/; <$fh> };

● Seems obvious, but I still see people doing this:

    my @lines = <$fh>;
    my $text  = join('', @lines);

● Consider memory with huge files.

Page 13: Top 10 Perl Performance Tips

5. Sling Text Efficiently

● Use a "sliding window" to search very large files.
  ○ Too big to slurp, but line-by-line is slow.
  ○ Chunks of 8K or 16K are much faster, but require bookkeeping code.
  ○ http://www.perlmonks.org/?node_id=128925
● Use the cheapest string tests you can get away with.
  ○ index() beats a regex when you just want to know if a string contains another string
● Use a fast CSV parser
  ○ Text::CSV_XS is much faster than the regexes you copied from that web page.
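The bookkeeping for a sliding-window search can be sketched as below. This is one possible approach, not the code from the linked PerlMonks node; `find_in_file` is a hypothetical helper. It reads fixed-size chunks, uses `index()` as the cheap string test, and keeps the last `length($needle) - 1` bytes of each chunk so a match that straddles a chunk boundary is still found.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the byte offset of the first occurrence of $needle in the file,
# or -1 if it is not present. Reads the file in chunks instead of slurping.
sub find_in_file {
    my ($path, $needle, $chunk_size) = @_;
    $chunk_size ||= 16 * 1024;
    die "needle must not be empty\n" unless length $needle;

    my $overlap = length($needle) - 1;   # bytes carried over between chunks
    my $buffer  = '';
    my $base    = 0;                     # file offset of the first byte in $buffer

    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    while (1) {
        # Append the next chunk to whatever tail we kept from the last one.
        my $got = read($fh, $buffer, $chunk_size, length $buffer);
        die "read failed: $!" unless defined $got;
        last if $got == 0;

        my $pos = index($buffer, $needle);
        if ($pos >= 0) {
            close $fh;
            return $base + $pos;
        }

        # Discard everything except the tail that could start a match.
        if (length($buffer) > $overlap) {
            my $drop = length($buffer) - $overlap;
            substr($buffer, 0, $drop, '');
            $base += $drop;
        }
    }
    close $fh;
    return -1;
}

# Demonstration with a temporary file (test data is made up).
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} ('a' x 20_000) . 'NEEDLE' . ('b' x 100);
close $out;

print find_in_file($path, 'NEEDLE', 8 * 1024), "\n";   # prints 20000
```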

Page 14: Top 10 Perl Performance Tips

6. Replace LWP With Something Faster

● LWP is amazing, but modules built on C libraries tend to be faster.
  ○ LWP::Curl
  ○ HTTP::Lite
  ○ Maybe HTTP::Async for parallel

LWP            32.8/s
HTTP::Async    64.5/s
HTTP::Lite     200/s
LWP::Curl      1000/s

Page 15: Top 10 Perl Performance Tips

7. Use a Fast Serializer

● Data::Dumper is great for debugging, but slow for serialization.

● JSON::XS is the new speed king, and is human-readable and cross-language.

● Storable handles more and is second-best in speed.
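A round trip through both recommended serializers might look like this sketch. Storable ships with Perl. For JSON, the code prefers JSON::XS but falls back to the core JSON::PP module, which offers the same object interface; that fallback is an assumption added here so the example runs anywhere, and it does not have JSON::XS's speed. The sample data is made up.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Prefer JSON::XS; fall back to core JSON::PP (same interface, much slower).
my $json_class = eval { require JSON::XS; 'JSON::XS' }
    || do { require JSON::PP; 'JSON::PP' };

# canonical(1) sorts hash keys so the output is stable between runs.
my $json = $json_class->new->canonical(1);

my $data = {
    id     => 42,
    tags   => [ 'perl', 'speed' ],
    nested => { ok => 1 },
};

# JSON: human-readable and cross-language.
my $json_text = $json->encode($data);
my $from_json = $json->decode($json_text);

# Storable: binary and Perl-specific, but handles more kinds of data.
my $frozen        = nfreeze($data);
my $from_storable = thaw($frozen);

print "$json_class produced: $json_text\n";
print "Storable produced ", length($frozen), " bytes\n";
```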

Page 16: Top 10 Perl Performance Tips

7. Use a Fast Serializer

YAML 84.7/s

XML::Simple 800/s

Data::Dumper 2143/s

FreezeThaw 2635/s

YAML::Syck 4307/s

JSON::Syck 4654/s

Storable 9774/s

JSON::XS 41473/s

Page 17: Top 10 Perl Performance Tips

8. Avoid Startup Costs

● Use a daemon to run code persistently
  ○ Skip the costs of compiling
  ○ Cache data
  ○ Open connections ahead of time
● mod_perl, FastCGI, Plack, etc. for web
● PPerl for command-line
  ○ Or hit your web server with lwp-get
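The shape of a persistent handler can be sketched with a bare PSGI-style application, which is just a code reference and needs no modules to demonstrate. Everything here is hypothetical: the lookup table, `load_lookup_table`, and the `code=` query parameter exist only for illustration. The point is that the expensive setup runs once when the process starts, and each request only pays for the lookup.

```perl
use strict;
use warnings;

# Counts how often the expensive setup runs (for demonstration only).
my $load_count = 0;

# Stand-in for slow startup work: parsing config, loading data, connecting.
sub load_lookup_table {
    $load_count++;
    return (
        us => 'United States',
        ca => 'Canada',
        jp => 'Japan',
    );
}

# Paid once, when the persistent process loads this file.
my %country_name = load_lookup_table();

# PSGI-style application: takes the request environment hashref and
# returns [status, headers, body]. Runs once per request.
my $app = sub {
    my ($env) = @_;
    my ($code) = ($env->{QUERY_STRING} // '') =~ /^code=(\w+)$/;
    my $name = defined $code ? $country_name{ lc $code } : undef;

    return [ 404, [ 'Content-Type' => 'text/plain' ], ["unknown\n"] ]
        unless defined $name;
    return [ 200, [ 'Content-Type' => 'text/plain' ], ["$name\n"] ];
};

# Simulate several requests hitting the same long-lived process.
for my $query ('code=US', 'code=jp', 'code=zz') {
    my $response = $app->({ QUERY_STRING => $query });
    print "$query => $response->[0] $response->[2][0]";
}
print "setup ran $load_count time(s)\n";
```

In a real `.psgi` file the code reference would be the file's final value and a server such as one of those listed above would call it for each request.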

Page 18: Top 10 Perl Performance Tips

9. Sometimes You Have to Get Crazy

● Use the @_ array directly to avoid copying

    sub add_to_sql {
        my $sqlbase = shift;    # hashref
        my ($name, $value) = @_;
        if ($value) {
            push(@{ $sqlbase->{'names'} }, $name);
            push(@{ $sqlbase->{'values'} }, $value);
        }
        return $sqlbase;
    }

Page 19: Top 10 Perl Performance Tips

9. Sometimes You Have to Get Crazy

    sub add_to_sql {
        # takes 3 params: hashref, name, and value
        return if not $_[2];

        push(@{ $_[0]->{'names'} }, $_[1]);
        push(@{ $_[0]->{'values'} }, $_[2]);
    }

● 40% faster than original
● More than 40% harder to read

Page 20: Top 10 Perl Performance Tips

10. Consider Compiling Your Own Perl

● Compiling without threads can be good for a free 15% or so.
● No code changes needed!
● Has maintenance costs.

Page 21: Top 10 Perl Performance Tips

Resources

Tim Bunce's Advanced DBI slides:
http://www.slideshare.net/Tim.Bunce/dbi-advanced-tutorial-2007

Also see Tim's NYTProf slides:
http://www.slideshare.net/Tim.Bunce/develnytprof-v4-at-oscon-201007

man perlperf

Programming Perl appendix on performance

Page 22: Top 10 Perl Performance Tips

Thank you!

Slides will be available on the conference website

Page 23: Top 10 Perl Performance Tips

Avoid tie()

● Slower than method calls!
● PITA to debug too.

Page 24: Top 10 Perl Performance Tips

Use a Fast Sort

● For sorting on derived keys, consider a GRT sort.
  ○ Faster than Schwartzian Transform
  ○ Use Sort::Maker to build it.
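For comparison, here are hand-written versions of both techniques, sorting words by length with ties broken alphabetically. This is a sketch with made-up data and hypothetical function names; Sort::Maker would generate the equivalent code for you. The GRT (Guttman-Rosler Transform) packs the derived key into a fixed-width, byte-sortable prefix so Perl's default `sort` can run with no comparison block at all.

```perl
use strict;
use warnings;

# Schwartzian Transform: compute the key once, sort with a comparison block.
sub sort_by_length_st {
    my @words = @_;
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] }
           map  { [ length($_), $_ ] } @words;
}

# GRT: prepend a packed key, sort as plain strings, then strip the key.
sub sort_by_length_grt {
    my @words = @_;

    # 'N' is a 32-bit big-endian integer, so byte order matches numeric order.
    my @packed = map { pack('N', length($_)) . $_ } @words;

    # Default sort: no Perl-level comparison callback is invoked.
    my @sorted = sort @packed;

    # Remove the 4-byte key prefix to recover the original strings.
    return map { substr($_, 4) } @sorted;
}

my @words = qw(pear fig banana kiwi);
print join(' ', sort_by_length_st(@words)),  "\n";   # prints: fig kiwi pear banana
print join(' ', sort_by_length_grt(@words)), "\n";   # prints: fig kiwi pear banana
```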