perl dist::surveyor 2011

Dist::Surveyor “what’s in that lib directory?” Tim Bunce - Nov 2011 Creative Commons BY-NC-SA 3.0

Upload: tim-bunce

Post on 10-May-2015


DESCRIPTION

Slides on my lightning talk at the London Perl Workshop, November 2011.

TRANSCRIPT

Page 1: Perl Dist::Surveyor 2011

Dist::Surveyor

“what’s in that lib directory?”

Tim Bunce - Nov 2011

Creative Commons BY-NC-SA 3.0

Page 2: Perl Dist::Surveyor 2011

The Context

Perl 5.8

CPAN modules

Applications

Business modules

Page 3: Perl Dist::Surveyor 2011

The Context

• A large library of CPAN distributions

- In a local::lib style dir .../cpan-5.008/{man,bin,lib}/

- Installed over many years

- No external record of what has been installed

- Almost 5000 modules

- In production in many systems on many machines

Page 4: Perl Dist::Surveyor 2011

The Itch

• Want to upgrade from perl 5.8

- so need to clone our local library of CPAN modules

- to .../cpan-5.012/{man,bin,lib}/

- with recompiled perl extensions

• Want the exact set of distribution versions

- so when testing “nothing but perl changed”

Page 5: Perl Dist::Surveyor 2011

“What’s in that lib directory?”

Page 6: Perl Dist::Surveyor 2011

Innocence and Hope

• Vague memory of something called ‘packlists’

• Vague memory of perllocal.pod install log

• Vague memory of some work by brian d foy

• Usual hope that someone’s already done this

• “How hard can it be?”

Page 7: Perl Dist::Surveyor 2011

/.packlist

• Records only what files were installed

• Doesn’t record the origin distribution

• Useless for my needs

Page 8: Perl Dist::Surveyor 2011

what_dists.pl

• Chris Williams’s github.com/bingos/throwaway

• Matches installed modules to distributions

• Only matches to the latest distributions

• Looked like a good place to start

• I hacked it to use perllocal.pod data and a bunch of heuristics.

• It worked, mostly. Annoying edge cases.

• Lots of hacks, heuristics, and blind luck.

Page 9: Perl Dist::Surveyor 2011

perllocal.pod

• Records a “name” and “version”

• Name is the Makefile.PL NAME

- can be the module or distribution name

- or something else entirely

• Version is the Makefile.PL VERSION

- not always the version in the distribution filename

• Incomplete!

- Not written by Module::Build based distributions
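To make the “name” and “version” point concrete, here is a minimal Python sketch that pulls those two fields out of perllocal.pod. It assumes the entry layout that ExtUtils::MakeMaker usually appends (a `=head2` line carrying the name, followed by a `C<VERSION: ...>` item). The sample entry and the `parse_perllocal` helper are illustrative and are not part of Dist::Surveyor.

```python
import re

# Hypothetical sample entry, in the layout MakeMaker usually appends.
SAMPLE = """\
=head2 Wed Apr  6 10:14:58 2011: C<Module> L<DBI|DBI>

=over 4

=item *

C<installed into: /opt/cpan-5.008/lib/perl5>

=item *

C<LINKTYPE: dynamic>

=item *

C<VERSION: 1.616>

=item *

C<EXE_FILES: dbiproxy dbiprof>

=back

"""

# The name is whatever Makefile.PL passed as NAME; it is not validated here.
HEAD_RE = re.compile(r"^=head2 .*?: C<Module> L<([^|>]+)\|")
VERSION_RE = re.compile(r"^C<VERSION: (.*)>$")


def parse_perllocal(text):
    """Return (name, version) pairs in the order they were logged."""
    entries = []
    name = None
    for line in text.splitlines():
        head = HEAD_RE.match(line)
        if head:
            name = head.group(1)
            continue
        ver = VERSION_RE.match(line)
        if ver and name is not None:
            entries.append((name, ver.group(1)))
            name = None
    return entries


print(parse_perllocal(SAMPLE))
```

The result is only a list of (NAME, VERSION) pairs, which is why the log alone cannot identify a distribution tarball.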

Page 10: Perl Dist::Surveyor 2011

BackPAN::Version::Discover

• “Figure out exactly which dist versions you have installed”

• Based on BackPAN::Index

• Incomplete and “very alpha”

• Matching logic not very robust

• Just doesn’t work very well for us

Page 11: Perl Dist::Surveyor 2011

DPAN

• “start with an existing Perl distribution and work backward to the MiniCPAN that would re-install the same thing” - brian d foy

• Indexes MD5 and other metadata for all BackPAN modules and scripts

• Incomplete: doesn’t yet work out what distribution versions are installed.

Page 12: Perl Dist::Surveyor 2011

GitPAN

• Git repo for every distribution on CPAN

• Includes all distro versions on BackPAN

• Pondered using git hashes and the github API

• But GitPAN isn’t being maintained

Page 13: Perl Dist::Surveyor 2011

Page 14: Perl Dist::Surveyor 2011

MetaCPAN

Page 15: Perl Dist::Surveyor 2011

MetaCPAN

• Repository for CPAN metadata

- ElasticSearch distributed database (Lucene)

- RESTful API

• CPAN and entire BackPAN fully indexed

• Very detailed metadata

• Full Of Awesome

Page 16: Perl Dist::Surveyor 2011

MetaCPAN

• Find all releases that contain a particular version of a module:

curl -XPOST api.metacpan.org/v0/file/_search -d '{
  "query": { "filtered": {
    "query": { "match_all": {} },
    "filter": { "and": [
      { "term": { "file.module.name": "DBI::Profile" } },
      { "term": { "file.module.version": "2.014123" } }
    ]}
  }},
  "fields": ["release"]
}'
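The same request body can be built programmatically for any module and version. The Python sketch below only constructs the JSON and makes no network call. The endpoint and query syntax are those shown on the 2011 slide and may not work against the current MetaCPAN API. `release_query` is a hypothetical helper name.

```python
import json


def release_query(module_name, module_version):
    """Build the search body from the slide for one module name and version."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "and": [
                        {"term": {"file.module.name": module_name}},
                        {"term": {"file.module.version": module_version}},
                    ]
                },
            }
        },
        # Only the release name is needed to identify candidate distributions.
        "fields": ["release"],
    }


body = json.dumps(release_query("DBI::Profile", "2.014123"))
print(body)
```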

Page 17: Perl Dist::Surveyor 2011

Page 18: Perl Dist::Surveyor 2011

The Method

• Get installed module names, versions, file sizes

• For every module:

- find “candidate distributions” that included that module version, ideally also matching the file size.

• For every candidate distribution:

- get all modules and versions shipped in that distro

- score each candidate by the proportion of its modules and versions which match what’s installed
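The scoring step above can be sketched in a few lines of Python. This illustrates the idea only and is not the script's actual code: the data is made up, versions are compared as plain strings, and the file-size check is left out. `score_candidate` and `best_candidates` are hypothetical names.

```python
def score_candidate(dist_modules, installed):
    """Fraction of a distribution's modules whose versions match what is installed.

    dist_modules: {module_name: version} shipped by the candidate release.
    installed:    {module_name: version} found in the library directory.
    """
    if not dist_modules:
        return 0.0
    matched = sum(
        1
        for name, version in dist_modules.items()
        if installed.get(name) == version
    )
    return matched / len(dist_modules)


def best_candidates(candidates, installed):
    """Return (score, release_name) pairs, highest score first."""
    scored = [
        (score_candidate(mods, installed), name)
        for name, mods in candidates.items()
    ]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical data: two releases both shipped Foo::Util 0.07,
# so both are candidates for that installed module.
installed = {"Foo": "1.02", "Foo::Util": "0.07", "Bar": "2.00"}
candidates = {
    "Foo-1.02": {"Foo": "1.02", "Foo::Util": "0.07"},
    "Foo-1.03": {"Foo": "1.03", "Foo::Util": "0.07"},
}

print(best_candidates(candidates, installed))
```

Here Foo-1.02 scores 1.0 because every module it ships matches the installed versions, while Foo-1.03 scores 0.5, so the first is the likelier origin.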

Page 19: Perl Dist::Surveyor 2011

An Example

Page 20: Perl Dist::Surveyor 2011

Cloning From The List

Page 21: Perl Dist::Surveyor 2011

Cloning From The List

• Can’t simply feed results to cpanm

- It’ll fetch the latest version of any prereqs

• Tried to put the list in dependency order

• Tried to use CPAN::Mini::Inject

• Finally added a --makecpan dir option

- Fetches distro tarballs and writes index

- can be used as CPAN repo by cpanm
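The slides do not say which index file gets written. A reasonable assumption is the standard `modules/02packages.details.txt.gz` package index that CPAN clients read from a mirror. The Python sketch below writes a minimal index of that shape. The `write_packages_index` helper is hypothetical, the sample entry is illustrative, and the header is cut down to a few fields, so a real client may expect more.

```python
import gzip
import os
import tempfile


def write_packages_index(repo_dir, entries):
    """Write modules/02packages.details.txt.gz under repo_dir.

    entries: iterable of (module_name, version, author_path) tuples, where
    author_path looks like 'T/TI/TIMB/DBI-1.616.tar.gz'.
    """
    rows = sorted(entries)
    # Reduced header (assumption): real indexes carry additional fields.
    header = [
        "File:         02packages.details.txt",
        "Description:  Package names found in directory $CPAN/authors/id/",
        "Columns:      package name, version, path",
        "Line-Count:   %d" % len(rows),
        "",  # a blank line separates the header from the package rows
    ]
    body = ["%-30s %8s  %s" % row for row in rows]
    modules_dir = os.path.join(repo_dir, "modules")
    os.makedirs(modules_dir, exist_ok=True)
    index_path = os.path.join(modules_dir, "02packages.details.txt.gz")
    with gzip.open(index_path, "wt", encoding="utf-8") as fh:
        fh.write("\n".join(header + body) + "\n")
    return index_path


# Illustrative entry written to a throwaway directory.
repo = tempfile.mkdtemp()
path = write_packages_index(repo, [("DBI", "1.616", "T/TI/TIMB/DBI-1.616.tar.gz")])
with gzip.open(path, "rt", encoding="utf-8") as fh:
    print(fh.read())
```

In a full repository the tarballs themselves would sit under `authors/id/` at the paths named in the third column.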

Page 22: Perl Dist::Surveyor 2011

Typical Usage

Survey what distributions are installed in a library:

$ dist_surveyor.pl --makecpan my_cpan \
    /a/perl/lib/dir > installed_dists.txt

Install exactly those distributions in a new library:

$ cpanm --mirror file:$PWD/my_cpan --mirror-only \
    -l new_lib < installed_dists.txt

Bonus: re-tests all distros with current prereqs

Page 23: Perl Dist::Surveyor 2011

Status

• Currently a single script

• Ought to be turned into a module

• Looking for a maintainer