How I manage new perls with perlbrew

Perl v5.19.0 was released this morning and I already have it installed as my default perl. This post explains how I do it.

First, I manage my perls with perlbrew. I install that, then use it to install some tools I need globally available:

$ perlbrew install-patchperl
$ perlbrew install-cpanm

You must install cpanm with perlbrew -- if you don't, weird things can happen when you switch perls and try to install stuff.

I keep my perls installed read-only and add a local::lib based library called "@std". (I stole this technique from Ricardo Signes.) That way, I can always get back to a clean, stock perl if I need to test something that way.

(There are still some weird warnings that get thrown doing things this way when I switch perls, but everything seems to work.)

I also install perls with an alias, so "19.0" is short for "5.19.0".

Then I have a little program that builds new perls, sets things up the way I want, and installs my usual modules. All I have to do is type this:

$ newperl 19.0

And then I've got a brand new perl I can make into my default perl.

Here's that program. Feel free to adapt it to your own needs:

#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;
use autodie qw/:all/;

my $as = shift
  or die "Usage: $0 <perl-version>";
my @args = @ARGV;

# trailing "t" means do threads
my @threads = ( $as =~ /t$/ ) ? (qw/-D usethreads/) : ();

$as =~ s/^5\.//;
my $perl = "5.$as";
$perl =~ s/t$//; # strip trailing "t" if any
my $lib = $as . '@std';

my @problem_modules = qw(
  JSON::XS
);

my @to_install = qw(
  Task::BeLike::DAGOLDEN
);

my @no_man = qw/-D man1dir=none -D man3dir=none/;

# install perl and lock it down
system( qw/perlbrew install -j 9 --as/, $as, $perl, @threads, @no_man, @args );
system( qw/chmod -R a-w/, "$ENV{HOME}/perl5/perlbrew/perls/$as" );

# give us a local::lib for installing things
system( qw/perlbrew lib create/, $lib );

# let's avoid any pod tests when we try to install stuff
system( qw/perlbrew exec --with/, $lib, qw/cpanm TAP::Harness::Restricted/ );
local $ENV{HARNESS_SUBCLASS} = "TAP::Harness::Restricted";

# some things need forcing
system( qw/perlbrew exec --with/, $lib, qw/cpanm -f/, @problem_modules );

# now install the rest
system( qw/perlbrew exec --with/, $lib, qw/cpanm/, @to_install );

# repeat to catch any circularity problems
system( qw/perlbrew exec --with/, $lib, qw/cpanm/, @to_install );

Yes, that takes a while. I kicked it off right before going to get lunch. When I got back, I was ready to switch:

$ perlbrew switch 19.0@std
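
To make the naming convention concrete, here is a small sketch of how the script turns its single argument into an alias, a perl version and a library name. The names_for helper is hypothetical (it isn't part of newperl); it only repeats the same substitutions so they can be tried in isolation:

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;

# Hypothetical helper (not part of the original script): derive the
# perlbrew alias, the real perl version and the library name from one
# command-line argument, using the same substitutions as newperl.
sub names_for {
    my ($as) = @_;
    my $threads = $as =~ /t$/ ? 1 : 0;    # trailing "t" requests threads
    $as =~ s/^5\.//;                      # accept "5.19.0" or "19.0"
    ( my $perl = "5.$as" ) =~ s/t$//;     # the real version has no "t"
    return {
        as      => $as,
        perl    => $perl,
        lib     => $as . '@std',
        threads => $threads,
    };
}

for my $arg (qw/19.0 5.19.0 16.3t/) {
    my $n = names_for($arg);
    say "$arg => alias=$n->{as} perl=$n->{perl} lib=$n->{lib} threads=$n->{threads}";
}
```

So "16.3t" builds perl 5.16.3 with threads, installs it under the alias "16.3t" and creates the library "16.3t@std".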

I also have a couple bash aliases/functions that I use for easy, temporary toggling between perls:

alias wp="perlbrew list | grep \@"
up () {
  local perl=$1
  if [ -n "$perl" ]; then
    perlbrew use "$perl@std"
  fi
  local current=$(perlbrew list | grep \* | sed -e 's/\* //' )
  echo "Current perl is $current"
}

I use them like this (notice that I don't need to type my @std library for this fast switching):

$ up
Current perl is 18.0@std

$ wp
  10.0@std
  10.0-32@std
  10.1@std
  12.5@std
  14.4@std
  16.3@std
  16.3@test
  16.3t@std
* 18.0@std
  19.0@std
  8.5@std
  8.9@std

$ up 19.0
Use of uninitialized value in split at /loader/0x7fa639030cd8/local/lib.pm line 8.
Use of uninitialized value in split at /loader/0x7fa639030cd8/local/lib.pm line 8.
Current perl is 19.0@std

(There's that warning I mentioned.)

I hope this guide helps people keep multiple perls for development and testing. In particular, I'd love to see more people doing development work and testing using 5.19.X so it can get some real-world testing.

See you June 21 for v5.19.1...

Posted in perl programming

Anyone want vanillaperl.com?

I'm tired of paying the domain bill for vanillaperl.com (which currently just redirects to strawberryperl.com).

It will lapse at the beginning of July unless someone wants to take it.

If you're interested, leave a comment below and explain what you want to do with it. I'll award it to the best proposal received by June 1.

Posted in perl programming

OODA vs technical debt

This post is a response to Ovid's series about agility without testing:

I started to respond to the last and realized that my comment was long enough to be a blog post of my own.

First, let me say that I'm enjoying this series. Ovid and Abigail are both challenging conventional wisdom around technical debt and I think that's really healthy.

However, I note that Ovid's evidence in favor of emergent behavior is anecdotal, which is probably inevitable for this sort of thing, but dangerous. "It worked these handful of times I remember it" has confirmation bias and no statistical significance.

We can't run a real experiment, but we can run a thought experiment: 100 teams of strict TDD vs 100 teams of the Ovid approach [which he really ought to brand somehow] from the same starting point (perhaps in parallel universes) for a few months of development.

What could we expect? Certainly, the TDD teams will spend more of their time on testing than the Ovid teams. So the Ovid teams will deliver more features and fix more bugs in the same period of time.

If one believes even a little of the Lean Startup hype, the Ovid teams will have more opportunities to see customer reactions — they will have a shorter OODA loop.

On the flip side, the TDD team has less technical debt and a lower risk profile. I disagree with the idea that technical debt is an option. I believe it does have an ongoing cost: future development is less efficient and more time consuming to at least some degree.

I call this "servicing" technical debt, which is just like paying only the interest on your credit card. You might never pay down any of the technical debt, but as you accumulate more, you'll pay more to service it.

It seems clear to me that which result you prefer depends quite a lot on the maturity of the product (possibly expressed in terms of expected growth rate) and the overall risk level.

For a brand-new startup, the risk of failure is already pretty high regardless of coding style. A faster OODA loop probably reduces risk more than improved tests do, because the bigger risk is building something customers don't want. And with such a high risk of failure, there's a chance that you'll simply be able to default on technical debt.

If I can riff on the financial crisis, a startup has subprime technical debt. It's either successful — in which case there will be growth sufficient to pay off technical debt (if the risk/reward tradeoff justifies it) — or it fails, in which case the debt is irrelevant. Rapid growth deflates technical debt.

For a mature business, however, it might well go the other way. Risk to an existing profit stream is more meaningful, and technical debt has to be paid off or serviced (rather than defaulted on). That reduces future profitability, and growth might not be enough to offset it.

The quandary will be businesses — or products (if part of an established business) — that are in between infancy and maturity. There the "right" approach will depend heavily on the risk tolerance and growth prospects.

Regardless, I tend to strongly favor TDD at the unit-test level, where I find TDD helps me better define what I want out of a particular piece of code and then be sure that subsequent changes don't break that. At the unit test level, the effort put into testing can be pretty low and the benefits to my future code productivity fairly high.

But as the effort of testing some piece of behavior increases — due to external dependencies or other interaction effects — it's more tempting to me to let go of TDD and rely on human observation of the emergent behaviors because I'd rather spend my time coding features than tests.

I think that puts me a little closer to the Ovid camp than the strict TDD camp, but not all the way.

Posted in coding

Why the latest File::Temp might surprise you

There was a subtle API change in File::Temp 0.23 that improves consistency, but might break old, buggy code.

Prior to 0.23, here were the calling signatures for the functional and object oriented interfaces for File::Temp (with some creative spacing to show the problem):

# functional
my ( $fh, $filename ) = tempfile( $template, %options );
my $tempdir           = tempdir ( $template, %options );

# object oriented
my $tmp = File::Temp->new       (            %options );
my $dir = File::Temp->newdir    ( $template, %options );

Notice how new() doesn't take a template argument. Instead, you're supposed to pass it as an option in the %options hash: TEMPLATE => 'tempXXXXX'.

Frankly, this interface sucks. There are too many ways to get confused or do it wrong:

  • What happens if you pass a leading template to new()?
  • What happens if you leave off the leading template for newdir()?
  • What happens if you pass a TEMPLATE option to newdir(), tempfile() or tempdir()?
  • What happens if you call tempfile() or tempdir() as methods?

A test program

Here's a little test program to try out some variations. Notice that a leading template argument is 'arg_XXXX' and a TEMPLATE option is 'opt_XXXX', so we can see which takes precedence if we try with both:

#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;
use File::Temp qw/tempfile tempdir/;

my @cases = (
    # documented API
    q{tempfile            ('arg_XXXX'                        )},
    q{tempdir             ('arg_XXXX'                        )},
    q{File::Temp->new     (            TEMPLATE => 'opt_XXXX')},
    q{File::Temp->newdir  ('arg_XXXX'                        )},

    # variations with both arg and TEMPLATE
    q{tempfile            ('arg_XXXX', TEMPLATE => 'opt_XXXX')},
    q{tempdir             ('arg_XXXX', TEMPLATE => 'opt_XXXX')},
    q{File::Temp->new     ('arg_XXXX', TEMPLATE => 'opt_XXXX')},
    q{File::Temp->newdir  ('arg_XXXX', TEMPLATE => 'opt_XXXX')},

    # newdir called like new
    q{File::Temp->newdir  (            TEMPLATE => 'opt_XXXX')},

    # functions called as methods
    q{File::Temp->tempfile('arg_XXXX'                        )},
    q{File::Temp->tempdir ('arg_XXXX'                        )},
    q{File::Temp->tempfile('arg_XXXX', TEMPLATE => 'opt_XXXX')},
    q{File::Temp->tempdir ('arg_XXXX', TEMPLATE => 'opt_XXXX')},
    q{File::Temp->tempfile(            TEMPLATE => 'opt_XXXX')},
    q{File::Temp->tempdir (            TEMPLATE => 'opt_XXXX')},
);

for my $c ( @cases ) {
    my @result = eval $c;
    my $err = $@;
    $err =~ s/\n.*//ms;
    say $c;
    say "    " . ( $result[-1] ? "Got $result[-1]" : $err ) . "\n";
}

Results with File::Temp 0.22

Here are the results of running under File::Temp 0.22 for the documented API:

tempfile            ('arg_XXXX'                        )
    Got arg_Y9B5

tempdir             ('arg_XXXX'                        )
    Got arg_Joq0

File::Temp->new     (            TEMPLATE => 'opt_XXXX')
    Got opt_p9I5

File::Temp->newdir  ('arg_XXXX'                        )
    Got arg_PmNf

That's just as we expect.

Now, let's try those odd cases. First, calling everything with both a leading template and a TEMPLATE option:

tempfile            ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got arg_gIL3

tempdir             ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got arg_xPXg

File::Temp->new     ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/AYeB74PT0K

File::Temp->newdir  ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got arg_GfwP

For everything except new(), the TEMPLATE argument is ignored and the leading argument works just like in the documented API. But how about new()? You see what's happening, don't you? Here's what it thinks you did:

File::Temp->new( arg_XXXX => 'TEMPLATE', opt_XXXX => undef );

Since none of those keys are known, it uses the default directory and template.
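
If that pairing isn't obvious, here is a minimal sketch of the mechanism, assuming new() simply assigns its argument list to a hash (that assumption matches the output above, but this is an illustration, not the module's source):

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;

# What new() effectively received in the "both" case above
my @args = ( 'arg_XXXX', TEMPLATE => 'opt_XXXX' );

my %options;
{
    no warnings 'misc';    # silence "Odd number of elements in hash assignment"
    %options = @args;      # pairs up from the left; the last key gets undef
}

for my $key ( sort keys %options ) {
    say "$key => ", $options{$key} // 'undef';
}

say exists $options{TEMPLATE}
  ? "TEMPLATE option seen"
  : "no TEMPLATE key, so the defaults apply";
```

The leading argument shifts every pair by one, so 'TEMPLATE' ends up as a value instead of a key.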

What about more wrong variations:

File::Temp->newdir  (            TEMPLATE => 'opt_XXXX')
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/AkI6pFjyq_

File::Temp->tempfile('arg_XXXX'                        )
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/3F2V8UPIbx

File::Temp->tempdir ('arg_XXXX'                        )
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/aSljGO6feU

File::Temp->tempfile('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/MGCo_TSXX5

File::Temp->tempdir ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/TEAXNoECbB

We get more weird behavior. The newdir method doesn't know about TEMPLATE. And calling functions as methods is like doing this:

tempfile( 'File::Temp' => 'arg_XXXX', TEMPLATE => 'opt_XXXX' );

Again, it can't find the template and the default is used.

And finally, there's this:

File::Temp->tempfile(            TEMPLATE => 'opt_XXXX')
    Error in tempfile() using File::Temp: The template must end with at least 4 'X' characters

File::Temp->tempdir (            TEMPLATE => 'opt_XXXX')
    Error in tempdir() using File::Temp: The template must end with at least 4 'X' characters

Why is that an error when the previous method calls weren't? Because it looks like this:

tempfile( 'File::Temp', TEMPLATE => 'opt_XXXX' );

Since there are an odd number of arguments, it thinks it was given a (bad) leading template and some arguments.
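
Here is a rough sketch of that odd/even decision. The split_args function is a hypothetical stand-in written for this post, not code copied from File::Temp, but it reproduces the three method-call shapes seen above:

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;

# Hypothetical stand-in for how a function like tempfile() might split
# its arguments: an odd count means "leading template, then options".
sub split_args {
    my @args     = @_;
    my $template = @args % 2 ? shift @args : undef;
    my %options  = @args;
    return ( $template, \%options );
}

my @calls = (
    ['arg_XXXX'],                                # tempfile('arg_XXXX')
    [ 'File::Temp', 'arg_XXXX' ],                # File::Temp->tempfile('arg_XXXX')
    [ 'File::Temp', TEMPLATE => 'opt_XXXX' ],    # File::Temp->tempfile(TEMPLATE => ...)
);

for my $call (@calls) {
    my ( $template, $options ) = split_args(@$call);
    say "args: @$call";
    say "    template: ", $template // '(none, use the default)';
    say "    options:  ",
      join( ', ', map { "$_ => $options->{$_}" } sort keys %$options ) || '(none)';
}
```

In the last case the class name 'File::Temp' lands in the template slot, and since it doesn't end in four X characters, the template check fails.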

If you're ready to facepalm, go right ahead.

What about File::Temp 0.23?

In 0.23, sanity (of a sort) returns. All the functions and methods now respect both ways of specifying a template.

tempfile            ('arg_XXXX', TEMPLATE => 'opt_XXXX'); # fine
File::Temp->newdir  (            TEMPLATE => 'opt_XXXX'); # fine

If you specify both, the last one wins, just as if you gave multiple TEMPLATE arguments.

But there is a catch.

Calling the functions as methods is now an error. In 0.22, you could call functions as methods and File::Temp would (usually) just quietly give you a tempfile where you didn't expect it. That was a bug and now it's a fatal error.

Here's the same test program under 0.2301:

tempfile            ('arg_XXXX'                        )
    Got arg_l2TB

tempdir             ('arg_XXXX'                        )
    Got arg_5y15

File::Temp->new     (            TEMPLATE => 'opt_XXXX')
    Got opt_sziU

File::Temp->newdir  ('arg_XXXX'                        )
    Got arg_3imY

tempfile            ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got opt_NTAn

tempdir             ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got opt_TZzT

File::Temp->new     ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got opt_CFPu

File::Temp->newdir  ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got opt_ueeQ

File::Temp->newdir  (            TEMPLATE => 'opt_XXXX')
    Got opt_vkNh

File::Temp->tempfile('arg_XXXX'                        )
    'tempfile' can't be called as a method at (eval 19) line 1.

File::Temp->tempdir ('arg_XXXX'                        )
    'tempdir' can't be called as a method at (eval 20) line 1.

File::Temp->tempfile('arg_XXXX', TEMPLATE => 'opt_XXXX')
    'tempfile' can't be called as a method at (eval 21) line 1.

File::Temp->tempdir ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    'tempdir' can't be called as a method at (eval 22) line 1.

File::Temp->tempfile(            TEMPLATE => 'opt_XXXX')
    'tempfile' can't be called as a method at (eval 23) line 1.

File::Temp->tempdir (            TEMPLATE => 'opt_XXXX')
    'tempdir' can't be called as a method at (eval 24) line 1.

If you are calling functions as methods, your code will break. This is sensible because functions and methods have very different scope implications.

  • Functions are "global": files and directories get cleaned up at the end of the program
  • Methods are "lexical": they return objects that clean up when the object is destroyed

If you are calling a function as a method, File::Temp has no way to know which way you want it and so it can't DWIM. So BOOM! It dies.
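
To see the difference in lifetime, here is a small sketch using only the documented interfaces. Note that the function call asks for cleanup explicitly with UNLINK => 1; the variable names are just for illustration:

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;
use File::Temp qw/tempfile/;

# Object interface: the file lives as long as the object does
my $obj_path;
{
    my $tmp = File::Temp->new;
    $obj_path = $tmp->filename;
    say "inside scope:  ", ( -e $obj_path ? "exists" : "gone" );
}
say "after scope:   ", ( -e $obj_path ? "exists" : "gone" );

# Functional interface: with UNLINK => 1, removal waits for program exit
my ( $fh, $func_path ) = tempfile( UNLINK => 1 );
say "function file: ", ( -e $func_path ? "exists" : "gone" ),
  " (until the program ends)";
```

The object's file disappears as soon as $tmp goes out of scope, while the function's file sticks around until the program exits.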

Now go fix your code.

Posted in perl programming

The Annotated Lancaster Consensus

The official Lancaster Consensus document is on GitHub. This blog post is an annotated review of it.

The Lancaster Consensus

At the first Perl QA Hackathon in 2008 in Oslo, a number of QA and
toolchain authors, maintainers and experts came together to agree on some
common standards and practices. This became known as
"The Oslo Consensus".

Five years later, at the 2013 Perl QA Hackathon, a similar brain trust came
together to address new issues requiring consensus.

These decisions provide direction, but, as always, the speed of
implementation will depend on the interests and availability of volunteers
to do the work.

Toolchain and testing

Minimum-supported Perl

Going forward, the Perl toolchain will target Perl 5.8.1, released
September 2003. This will allow toolchain modules to reliably use PerlIO
and improved Unicode support.

Because of the many Unicode bug-fixes in early 5.8 releases,
toolchain maintainers reserve the right to later bump the
minimum to 5.8.4 (which ships with Solaris 10).

There was huge agreement about 5.8.1, and the 5.8.4 discussion hinged on the amount of work to avoid early 5.8 bugs compared to the number of people affected, particularly given that there is CPXXXAN to support older Perls.

Specifying pure-perl builds

Some distributions offer an "XS" version or a "Pure Perl" version that can
be selected during configuration. Currently, each of these has their own
way for users to indicate this, which makes it impossible for CPAN clients
or other build tools to help users select automatically.

For example, the version.pm module checks both the PERL_ONLY environment variable and three command line flags: --perl-only, --perl_only and --xs. The Sentinel module, however, checks only for the command line flag --pp. Other modules do it differently again, with different environment variables or command line options. It's chaos.

Going forward, the "spec" for Makefile.PL and Build.PL will include command
line options to request a "pure Perl only" build. These will be:

  • PUREPERL_ONLY=1 (for Makefile.PL)
  • --pureperl-only (for Build.PL)

These may be set in the PERL_MM_OPT or PERL_MB_OPT environment
variables just like any other command line option.

If present, distribution authors must ensure that the installed modules do
not require loading XS (whether directly or via Inline) or dynamically
generate any platform-specific code. The installed files must be able to
run correctly if copied to another machine with the same Perl version but a
different architecture (e.g. "fatpacking" an application). If this
condition can not be met, configuration must exit with an error.

Fatpacking is the most common use case for this, so module authors should think about that explicitly when deciding if they can be "pure perl only" or not.
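
As a rough illustration, a Makefile.PL could detect the request along these lines. The wants_pureperl helper is hypothetical, and letting command line arguments override PERL_MM_OPT is my assumption about the sensible precedence, so treat this as a sketch rather than how any particular build tool does it:

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;
use Text::ParseWords qw/shellwords/;

# Hypothetical helper for a Makefile.PL: was a pure-Perl build requested,
# either via PERL_MM_OPT or on the command line? Later settings win.
sub wants_pureperl {
    my ( $mm_opt, @argv ) = @_;
    my $pureperl = 0;
    for my $arg ( shellwords( $mm_opt // '' ), @argv ) {
        $pureperl = $1 ? 1 : 0 if $arg =~ /^PUREPERL_ONLY=(.*)$/;
    }
    return $pureperl;
}

if ( wants_pureperl( $ENV{PERL_MM_OPT}, @ARGV ) ) {
    say "Configuring without XS";
    # If the distribution cannot work without XS, this is where
    # configuration must exit with an error instead.
}
else {
    say "Configuring with XS if a compiler is available";
}
```

Run as "perl Makefile.PL PUREPERL_ONLY=1", this would take the first branch.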

Environment variables for testing contexts

The Oslo Consensus defined two testing contexts: AUTOMATED_TESTING and
RELEASE_TESTING. Of these, AUTOMATED_TESTING has been the most
confusing, as it sometimes was used to mean "don't interact with a user"
and sometimes "run lengthy tests".

I've also used it for tests which depended on some external website working correctly. I wouldn't want to stop someone from installing the module if it failed, but I did want to see what CPAN smokers experienced.

We also (briefly) discussed how some tools like Dist::Zilla are using
AUTHOR_TESTING distinct from RELEASE_TESTING.

Distribution authors should now follow these semantics:

  • AUTOMATED_TESTING: if true, tests are being run by an automated testing
    facility and not as part of the installation of a module; CPAN smokers
    must set this to true; CPAN clients must not set this

  • NONINTERACTIVE_TESTING: if true, tests should not attempt to interact
    with a user; output may not be seen and prompts will not be answered

  • EXTENDED_TESTING: if true, the user or process running tests is willing
    to run optional tests that may take extra time or resources to complete.
    Such tests must not include any development or QA tests. Only tests of
    runtime functionality should be included.

  • RELEASE_TESTING: if true, tests are being run as part of a release QA
    process; CPAN clients must not set this variable

  • AUTHOR_TESTING: if true, tests are being run as part of an author's
    personal development process; such tests may or may not be run prior to
    release. CPAN clients must not set this variable. Distribution
    packagers (ppm, deb, rpm, etc.) should not set this variable.

AUTHOR_TESTING was not really discussed, but I included it in the writeup for completeness. It was discouraged in the Oslo Consensus, but some people seem to have tests they want run throughout development and others they want run only at release time, so it still gets used. For example, Dist::Zilla sets it for "dzil test" since that command is only run by authors, not by end users.
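
Here's how a test file might honor these semantics using nothing but Test::More. The slow_check function and the placeholder tests are made up for illustration:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More;

# Stand-in for an expensive runtime check (hypothetical)
sub slow_check { return 1 }

# Ordinary runtime tests always run
ok( 1, 'basic functionality' );

SKIP: {
    skip 'set EXTENDED_TESTING=1 to run slow runtime tests', 1
      unless $ENV{EXTENDED_TESTING};
    ok( slow_check(), 'slow runtime check' );
}

SKIP: {
    skip 'no user to prompt under NONINTERACTIVE_TESTING', 1
      if $ENV{NONINTERACTIVE_TESTING};
    ok( 1, 'placeholder for a test that prompts the user' );
}

SKIP: {
    skip 'release QA only; set RELEASE_TESTING=1 to run', 1
      unless $ENV{RELEASE_TESTING};
    ok( 1, 'placeholder for a release-time QA check' );
}

done_testing;
```

Note that the slow test is still a test of runtime functionality; QA checks belong behind RELEASE_TESTING or AUTHOR_TESTING, not EXTENDED_TESTING.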

There are already two libraries on CPAN to make it easier to set these
variables correctly:

CPAN smokers and integration testers must indicate automated,
non-interactive testing and may request extended testing, depending on
their resources.

For example, a CPAN tester may decide not to run extended testing on old, slow hardware.

CPAN clients are free to request non-interactive or extended testing
depending on their needs or configuration.

CPAN smokers and clients that "must not set" a variable also must not clear
it if it is already set externally.

Amendments to the Build.PL spec

David Golden and Leon Timmermans have been working on a
Build.PL spec
to describe how any Perl build tool using Build.PL must behave. It is
necessarily based on Module::Build, but does not need to follow its
behaviors exactly.

The group agreed that the use and semantics of .modulebuildrc should
be excluded from the specification.

Installed distributions database

One of the QA hackathon projects was the creation of a replacement
for packlists. An installed-distribution database would facilitate
easy inventory of installed distributions, uninstall tools and tracking of
the dependency graph of installed modules.

The consensus discussions were explicitly not designing the system; the brief was to answer questions about the various ways/places modules can be installed so people doing the actual design work didn't paint themselves into a corner.

The group agreed that because modules can be installed into many different
locations, any such database would need to be "per @INC" and that it would
need to stack in the same way that @INC itself does. That means that
adding paths to @INC could change what the database sees as installed.
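
To illustrate the stacking, here is a sketch with two throwaway library directories. The installed_in function is hypothetical; it just mimics the first-match-wins search that @INC uses:

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;
use File::Temp qw/tempdir/;
use File::Spec;

# Hypothetical lookup: return the first directory in a list of library
# paths that contains the module, the way @INC is searched.
sub installed_in {
    my ( $module, @inc ) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    for my $dir ( grep { !ref $_ } @inc ) {    # skip any @INC hooks
        return $dir if -e File::Spec->catfile( $dir, $file );
    }
    return;
}

# Two throwaway library directories that both "install" Foo.pm
my ( $site, $local ) = map { tempdir( CLEANUP => 1 ) } 1 .. 2;
for my $dir ( $site, $local ) {
    open my $fh, '>', File::Spec->catfile( $dir, 'Foo.pm' ) or die $!;
    print {$fh} "package Foo; 1;\n";
    close $fh;
}

say "site only:      ",
  installed_in( 'Foo', $site ) eq $site ? 'site copy' : 'other';
say "local in front: ",
  installed_in( 'Foo', $local, $site ) eq $local ? 'local copy' : 'other';
```

Putting a new path in front changes which copy counts as "installed", which is exactly why any database has to be consulted per @INC entry, in order.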

Such a database system must not require any non-core dependencies, but
could offer enhanced capabilities if recommended CPAN modules are
installed.

Other implementation details are left to anyone designing such a system.

Post-installation testing

Several people at the hackathon have been interested in a system for
running module tests after installation, for example to ensure that
upgraded dependencies don't break a module or to test overall integrity.

The group agreed that any such testing must make all distribution files
available during testing -- tests must be run from within a distribution
tarball directory. Any such tests must be run using new make or
Build targets: make test-installed or Build test-installed. These
should be equivalent to make test or Build test but without adding
blib to @INC. The prove application must not be used.

These targets don't exist and would have to be created in each tool. But conceptually they should work just like "make test" would, except they should run against the installed modules, not the ones built into "blib".

The group also agreed that any such tests need to respect how modules can
be shadowed in @INC. Setting PERL5LIB could change which is the
"installed" distribution and thus which tests should run. Coordination
with an installed distribution database was encouraged.

Other implementation details, including whether the distribution directory
is saved from the initial installation or retrieved fresh from CPAN/BackPAN,
are left to anyone designing such a system.

META file specification

The 'provides' field

The 'provides' field of the
CPAN::Meta::Spec requires a 'file'
sub-key, but the meaning was unclear for dynamically-generated packages.
We agreed that the 'file' key must refer to the actual file within the
distribution directory that originates the package, whether that is a .pm
file or a .PL or other dynamic generator.

The group also agreed that having a required 'file' sub-key didn't make sense, but I realized afterwards that changing the spec would break any existing validators and the chaos that could cause wouldn't be worth the benefit. But it's absolutely worth making it optional whenever we get to v3 of the spec.

Improving on 'conflicts'

We briefly discussed some of the known problems with the 'conflicts' key
within prerequisite data.

What most developers seem to want is a way to indicate that installing a
particular module is known to break other modules of particular versions.
E.g. upgrading Foo to 2.0 breaks any Bar before 3.14.

We encouraged anyone interested in improvements to prototype it using an
x_breaks or similar custom key and getting patches to support it into
CPAN clients. Once battle tested, it could be a candidate for a future v3
of the spec.

This discussion had a huge risk of turning into a design discussion, so we declared that people should prototype with a custom key rather than get into a spec discussion prematurely.

PAUSE and CPAN

Long-term goal for distribution-level data on PAUSE

Several of the PAUSE issues discussed highlighted the need for PAUSE to
maintain not just package (namespace) level index and permission data, but
also "distribution" level data. This would allow, for example,
transferring permissions for a distribution as a unit instead of needing
to transfer permissions on all packages.

We agreed that this is the right long-term goal, but that other proposals
would be implemented in the near-term to solve current issues.

This was a classic "good", "fast" and "cheap" tradeoff. With volunteer labor, we are "cheap". The long-term idea was "good", but we agreed that we wanted something "fast" more than we wanted something "good" so the rest of the proposals represent what could be done quickly.

Case insensitive package permissions

While not discussed directly, it should be noted that PAUSE package
permissions will shortly become case-insensitive, but case-preserving
to ensure that indexed modules would be unique even if installed on a
case-insensitive file system.

For example, there was a File::Stat on CPAN. Installing it into sitelib on a case-insensitive system (like Mac OS X) meant that use File::stat would actually load File::Stat. The core module would be completely hidden. Ouch!
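
A case-insensitive but case-preserving check could look something like this. The %registered table and check_name function are hypothetical and only sketch the idea; this is not PAUSE's code:

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;

# Hypothetical registry: keys are lower-cased for comparison, values keep
# the spelling that was registered first (case-preserving).
my %registered = map { lc($_) => $_ } qw/File::stat File::Temp/;

sub check_name {
    my ($package) = @_;
    my $existing = $registered{ lc $package };
    return "new"  if !defined $existing;
    return "same" if $existing eq $package;
    return "conflicts with $existing";
}

say "$_ -> ", check_name($_) for qw/File::stat File::Stat File::Slurp/;
```

File::Stat is flagged because it differs from the registered File::stat only by case.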

Rules for distribution naming

Many CPAN ecosystem websites and tools treat a "distribution name" as
a unique identifier, even though nothing has enforced uniqueness to date.
Allowing non-uniqueness is confusing at best and a security risk at worst.

Gory details are in this email to modules@perl.org: "Distribution names are not unique..."

Going forward, distributions uploaded to PAUSE must have a name that
"matches" the name of an indexed package within the distribution and the
uploader must have permissions for that package or else the entire
distribution will not be indexed.

For example, if DAGOLDEN uploads Foo-Bar-1.23.tar.gz, the distribution name
is "Foo-Bar" and there must be an indexable "Foo::Bar" package within the
distribution.

There are about 1000 distributions on CPAN that do not follow this rule and
they will be grandfathered, though they are encouraged to conform to the
standard either by renaming the distribution, adding a new .pm file or by
introducing a properly named package internally.

For example, LWP ships as libwww-perl-6.05.tar.gz. If it included
"package libwww::perl;" in one of its .pm files, that package would be
indexed and would conform with the standard.

Technically, the correct package could also be declared only in the
META.json file using a 'provides' field. In such a case the 'file' sub-key
must be 'META.json' to indicate that 'META.json' is the file responsible
for declaring the package.
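
The name matching can be sketched like this. The expected_package function and its regular expression are a simplified, hypothetical stand-in, not the parser PAUSE actually uses:

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;

# Simplified, hypothetical parser: split "Name-Version.ext" and turn the
# distribution name into the package name that would have to be indexed.
sub expected_package {
    my ($tarball) = @_;
    my ($dist) = $tarball =~ /^(.+)-v?\d[\d._]*\.(?:tar\.gz|tgz|zip)$/
      or return;
    ( my $package = $dist ) =~ s/-/::/g;
    return ( $dist, $package );
}

for my $file (qw/Foo-Bar-1.23.tar.gz libwww-perl-6.05.tar.gz/) {
    my ( $dist, $package ) = expected_package($file);
    say "$file => distribution '$dist' needs package '$package'";
}
```

That reproduces both examples above: Foo-Bar needs Foo::Bar, and libwww-perl would need libwww::perl.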

Flagging abandoned modules and modules requesting help

Currently, when a CPAN author passes away, his or her module permissions
are transferred to a fake author called 'ADOPTME'. Volunteers can step
up to request a takeover if they wish to maintain them.

We agreed that in the short-term, a similar mechanism should be used to
signal abandonment or that an author is looking for someone to share
responsibility. Unlike the case where an author is deceased, these will
use co-maint privileges as a signaling mechanism so that the original
author may remove them as needed.

(In the long-term, the group hopes that a distribution-level data model for
PAUSE will be able to address these needs more directly.)

CPAN search engines and other community sites may use these permissions
markers and associated meanings to communicate the status of distributions.

  • ADOPTME as primary: this generally indicates a deceased author.
    Volunteers can request a takeover via modules@perl.org.

  • ADOPTME as comaint: this indicates a verified, non-responsive author.
    The community may propose that a package be so marked following the same
    rules as for a take-over (i.e. multiple attempts to contact the author
    and a request via modules@perl.org). Volunteers can request a takeover
    of an ADOPTME module via modules@perl.org without an additional waiting
    period.

  • HANDOFF as comaint: this indicates that an author wishes to
    permanently give up the primary maintainer role to someone else

  • NEEDHELP as comaint: this indicates that an author seeks people to
    help maintain the module, but plans to continue as primary maintainer

It's very important that CPAN search engines treat ADOPTME differently from HANDOFF or NEEDHELP. Flagging one's module as "NEEDHELP" shouldn't result in a big red "Abandoned module!" warning.
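
For a search engine, the distinction might boil down to something like the following. The status_for function, the wording of the messages and the SOMEAUTHOR and OTHERDEV IDs are all invented for this sketch:

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;

# Hypothetical display text for each co-maint marker
my %comaint_meaning = (
    ADOPTME  => 'author is non-responsive; takeover can be requested',
    HANDOFF  => 'author wants to hand over the primary maintainer role',
    NEEDHELP => 'author is looking for help but remains primary maintainer',
);

# Hypothetical helper: describe a package from its primary maintainer
# and its list of co-maintainers.
sub status_for {
    my ( $primary, @comaint ) = @_;
    return 'author is generally deceased; takeover can be requested'
      if $primary eq 'ADOPTME';
    my @notes = map { $comaint_meaning{$_} }
      grep { exists $comaint_meaning{$_} } @comaint;
    return @notes ? join( '; ', @notes ) : 'no status flags';
}

say "ADOPTME primary:  ", status_for('ADOPTME');
say "NEEDHELP comaint: ", status_for( 'SOMEAUTHOR', 'NEEDHELP' );
say "no markers:       ", status_for( 'SOMEAUTHOR', 'OTHERDEV' );
```

Only the ADOPTME cases should ever produce anything resembling an "abandoned" notice.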

Matt S. Trout has volunteered to administer requests for modules to be flagged as co-maint ADOPTME. Proposals must follow the normal rules for takeover. You must make several public, documented attempts to contact the author before appealing to modules@perl.org for ADOPTME to get comaint.

With the exception of a 'takeover' from ADOPTME (which must go through
modules@perl.org), CPAN authors must manage these comaint privileges using
the regular PAUSE interface.

A "takeover" from ADOPTME can be immediate because PAUSE admins already know that the author is non-responsive for whatever reason.

An author may also voluntarily transfer primary or co-maint to ADOPTME to
indicate that PAUSE admins may transfer permissions immediately to anyone
who requests it.

Automating PAUSE ID registration

Historically, PAUSE IDs have been manually approved, often with a
substantial delay. We agreed that assuming appropriate protections against
bots/spam are in place, PAUSE should move to an automated approval system.
This would bring it in line with other programming language repositories
and open source community sites.

Additionally, we agreed that unused, inactive PAUSE IDs should be deleted
and made available for reuse after a period of time. Specifically, any
PAUSE ID that ever uploaded anything must not be deleted (because the files
exist on BackPAN under that PAUSE ID). A login to PAUSE (or via a proxy
like rt.cpan.org) is sufficient to indicate activity. Inactive IDs will
not be deleted without a warning message about logging in to PAUSE.

Automating CPAN directory cleanup

Approximately half the files on CPAN are older than 5 years. Many authors
never clean up old distributions. In order to keep the size of CPAN down,
we agreed that under certain conditions, old distributions will be
automatically scheduled for deletion (and will thereafter only exist on
BackPAN).

For a distribution to be selected for deletion, there must be at least 3
stable releases. Anything older than the oldest of those 3 will be
scheduled for deletion if it is older than 5 years and is not indexed in
the 02packages file.

This is a bit confusing, but is intended to be really conservative. For example, if I have Foo-Bar-1.24, Foo-Bar-1.23_03, Foo-Bar-1.23_02, Foo-Bar-1.23_01, Foo-Bar-1.22, Foo-Bar-1.21_01, Foo-Bar-1.20 and Foo-Bar-1.18, Foo-Bar-1.20 is the third most recent stable release, so only Foo-Bar-1.18 would be considered for deletion if more than 5 years old and not indexed in 02packages. The 1.23_XX and 1.21_XX dev releases will be kept.
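
Here's the rule as a sketch in code. The deletion_candidates function, the data layout and the ages are all made up for illustration; I'm assuming releases are listed newest first and that a version containing "_" is a dev release:

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;

# Hypothetical sketch of the rule. Input is one distribution's releases,
# newest first; a version containing "_" is treated as a dev release.
sub deletion_candidates {
    my @releases    = @_;
    my $stable_seen = 0;
    my @candidates;
    for my $r (@releases) {
        if ( $stable_seen >= 3 ) {
            # older than the three most recent stable releases
            push @candidates, $r->{version}
              if $r->{years_old} > 5 && !$r->{indexed};
        }
        elsif ( $r->{version} !~ /_/ ) {
            $stable_seen++;
        }
    }
    return @candidates;
}

# The Foo-Bar example with invented ages; only 1.24 is indexed
my @foo_bar = (
    { version => '1.24',    years_old => 1, indexed => 1 },
    { version => '1.23_03', years_old => 2, indexed => 0 },
    { version => '1.23_02', years_old => 2, indexed => 0 },
    { version => '1.23_01', years_old => 3, indexed => 0 },
    { version => '1.22',    years_old => 6, indexed => 0 },
    { version => '1.21_01', years_old => 6, indexed => 0 },
    { version => '1.20',    years_old => 7, indexed => 0 },
    { version => '1.18',    years_old => 8, indexed => 0 },
);

say "Candidates: @{[ deletion_candidates(@foo_bar) ]}";
```

Even though 1.22 and 1.20 are older than 5 years here, they are among the three most recent stable releases, so only 1.18 comes back as a candidate.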

All perl tarballs will be excluded from deletion, of course.

Scheduled deletion will notify the author as usual and they will have the
usual period of time to cancel the scheduled deletion.

Cleanup will be implemented on some sort of rolling basis by author ID to
avoid bothering authors with frequent deletion notices.

Module registration

The group agreed that the PAUSE module registration has largely outlived its
usefulness. Because only a fraction of CPAN modules are registered,
registration does not provide a comprehensive source of metadata (e.g.
"DSLIP") and much of the information registration covers is more widely
available via META files.

One benefit of module registration is that the data can be changed without requiring a new release the way META files do. On reflection, about the only field that matters for is the "support level".

The group acknowledged the remaining benefit has been that new CPAN authors
often attempt to register their first module and benefit from feedback, but
felt that other venues, such as PrePAN, would offer a
better new author experience. In particular, PrePAN offers community
participation beyond one or two PAUSE admins and a wealth of examples to
learn from (without having to search through a mailing list archive).

brian d foy has been the module registration hero, tirelessly responding to requests for years. PrePAN will help share the burden and new authors will benefit from different points of view.

Therefore, we agreed that existing PAUSE documentation will be changed to
direct new (and experienced) authors to PrePAN for guidance.

Soon, PAUSE will stop publishing the module registration database to CPAN
mirrors. (The index file will exist but be empty to avoid breaking CPAN
clients that expect it.) After an assessment period, module registration
will likely be closed and this feature will be retired from PAUSE.

Participants in the Lancaster Consensus discussions

Discussions lasted over 3 days, participants came and went, but each day
had about 20 people. Thank you to the following participants:

Andreas König, Barbie, Breno Oliveira, Chris Williams, Christian Walde,
David Golden, Daniel Perrett, Gordon Banner, H. Merijn Brand, James
Mastros, Jens Rehsack, Jess Robinson, Joakim Tormoen, Kenichi Ishigaki,
Leon Timmermans, Liz Mattijsen, Matthew Horsfall, Michael Schwern, Olivier
Mengué, Paul Johnson, Peter Rabbitson, Philippe Bruhat, Piers Cawley,
Ricardo Signes, Salve J. Nilsen and Wendy van Dijk

(Apologies to anyone present who was left off the list. Email dagolden at
cpan dot org or send a pull request to be added.)

Posted in cpan, perl programming, toolchain

© 2009-2014 David Golden All Rights Reserved