Paying respect to Module::Build

Earlier this week, I proposed that Module::Build be deprecated from the Perl 5 core. After some discussion, this proposal has been accepted by the Pumpking.

I want to take a moment to discuss what this means, why I suggested it, and what trails I think Module::Build blazed for Perl 5.

Deprecation means a warning in v5.19 and removal in v5.22

Soon, in Perl 5 version 19 (the current development series), the Module::Build included in the core library path will issue a warning when used. If Module::Build is then installed from CPAN, it will be installed into the "sitelib" path and the deprecation warning will stop.

Sometime in Perl 5 version 21 (starting late Q2 2014), Module::Build will most likely no longer ship with the core Perl and will need to be installed from CPAN.

Fortunately, CPAN clients already recognize the configure_requires directive in CPAN distribution meta files (META.yml or META.json) and can bootstrap Module::Build before Build.PL runs.

So... the biggest impact on end-users will be that when Perl 5 version 20 is released in Q2 2014, people will have to install Module::Build from CPAN to squelch the deprecation warning.

Update: Miyagawa reminded me on IRC that the major CPAN clients will install from CPAN automatically if they see a deprecated module in a prereq list, so actually, most people won't even notice. Shiny!

Module::Build was good, but not good enough, and it plateaued

Before Module::Build, we only had ExtUtils::MakeMaker and it sucked. Michael Schwern, its long-time maintainer, even wrote a presentation called MakeMaker is DOOMED! that encouraged people to switch to Module::Build. In hindsight, this was premature.

For all the many problems that Module::Build fixed, it introduced some of its own, built up its own technical debt, and suffered a crisis of maintenance.

I tried to roughly estimate the amount of effort going into Module::Build during four phases of its life. I used lines of Changes file as my metric (though I think "git diff --stat" would be fairly similar):

  • 2001-2007: Ken Williams author and maintainer → 2397 lines of Changes
  • 2007-2009: Ken and Eric Wilhelm tag-team → 310 lines of Changes
  • 2009-2011: David Golden maintainer → 1033 lines of Changes
  • 2011-now: Leon Timmermans maintainer → 95 lines of Changes

After my announcement that I was stepping down as Module::Build maintainer, no one volunteered for seven months until Leon kindly offered to be a "caretaker" and shepherd some patches and releases -- partly as a side effect of his work on a Module::Build replacement called Module::Build::Tiny, which itself was a serious spin off of a half-joke of my own called Acme::Module::Build::Tiny.

Module::Build innovated things now taken for granted

The best thing that Module::Build did was define a de facto specification for using a Build.PL to drive a perl-based (rather than Makefile-based) install program. That work has been formalized into a Build.PL Spec, so other Perl-based builders can be developed.

Module::Build also introduced the META.yml file that evolved into the CPAN::Meta::Spec that is in widespread use today. The META.yml file also helped solve a tricky bootstrapping problem: by specifying configure_requires dependencies within the META file, CPAN clients could install whatever modules were necessary to run Build.PL.
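As a rough sketch of what a client looks for, the snippet below decodes a made-up META.json fragment and lists its configure-phase requirements. The distribution name and version numbers are invented for illustration. The layout follows version 2 of the CPAN::Meta::Spec, where the directive lives under prereqs/configure/requires; in the older META.yml format it is a top-level configure_requires key. Real clients do considerably more than this.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::PP;    # in core since Perl v5.14

# A made-up META.json fragment (CPAN::Meta::Spec version 2 layout)
my $meta_json = <<'JSON';
{
   "name" : "Hypothetical-Dist",
   "version" : "0.001",
   "prereqs" : {
      "configure" : {
         "requires" : { "Module::Build" : "0.40" }
      },
      "runtime" : {
         "requires" : { "perl" : "5.006" }
      }
   }
}
JSON

my $meta = decode_json($meta_json);

# Modules that must be installed before Build.PL can even run
my $configure = $meta->{prereqs}{configure}{requires} || {};

for my $module ( sort keys %$configure ) {
    print "$module >= $configure->{$module}\n";
}
```

Because the requirement is read from the meta file rather than discovered by running Build.PL, the builder itself can be one of the things installed first.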

With the release of Perl v5.10, both CPAN and CPANPLUS supported configure_requires, meaning that the groundwork for future Build.PL-based alternatives was already in place!

Module::Build also introduced the install_base parameter as a way to specify a custom install location. It was much easier to understand than PREFIX from ExtUtils::MakeMaker, and was subsequently adopted by ExtUtils::MakeMaker as INSTALL_BASE. This is a critical part of the magic behind tools like local::lib and Carton.
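To make the layout concrete, here is a minimal sketch of where things land under a given install_base, following the directory scheme described in the Module::Build documentation. The helper name install_base_layout and the example base path are hypothetical, and only a few of the install types are shown.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Config;
use File::Spec;

# Hypothetical helper: map a few install types to directories
# under a given install_base.
sub install_base_layout {
    my ($base) = @_;
    return (
        lib    => File::Spec->catdir( $base, qw/lib perl5/ ),
        arch   => File::Spec->catdir( $base, qw/lib perl5/, $Config{archname} ),
        script => File::Spec->catdir( $base, 'bin' ),
        libdoc => File::Spec->catdir( $base, qw/man man3/ ),
    );
}

# Example base path, chosen only for illustration
my %layout = install_base_layout('/home/user/perl5');
printf "%-6s %s\n", $_, $layout{$_} for sort keys %layout;
```

The predictable lib/perl5 directory is what lets a tool like local::lib point PERL5LIB at one place and have everything installed under that base be found.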

The other crucial innovation was that — for the first time — customizing the build, test and install process could be done by writing only Perl code rather than writing Perl code to spit out Makefile fragments. It made building complex modules much easier — particularly Alien modules like Alien::wxWidgets. That then made projects like Padre possible.

Module::Build also spawned a counter-reaction in the form of Module::Install, which tried to make the easy Perl customization possible, while shielding users from pure ExtUtils::MakeMaker and avoiding the bootstrap problems of Module::Build by bundling itself in inc/. Module::Install then triggered a counter-reaction in the form of Dist::Zilla, which then led to Dist::Milla and Minilla.

Module::Build was the trail blazer for the tools that came after.

Module::Build made its own unique mistakes

People have complained that Module::Build was bloated. In lines of code, it's actually comparable to ExtUtils::MakeMaker. The bigger problem is that it puts 4,200 of its 5,800 lines of code in just one file: Module::Build::Base. (ExtUtils::MakeMaker split similar functionality across three mega files.)

More than just size, Module::Build is complex. The Build.PL file runs configuration and serializes the results into some files, which the Build file uses to reconstruct the original Module::Build object. Arguments can modify properties at any stage. And since Build.PL might really be a subclass, there's a lot of meta object stuff going on just to manage the configuration before ever getting around to the real business of building and installing modules.

It also suffered from feature creep. Instead of just being an install tool, it became a swiss-army-knife author's tool, with release-time features never needed by end-users, but which forced end-users to upgrade Module::Build just to run Build.PL without error. It added new concepts, like "optional features", which were poorly specified and have never achieved much traction.

One of the big, valid complaints is that it never incorporated a proper dependency system. Actions (build, test, etc.) could depend on each other, but there was nothing like Makefile's ability to detect that since file "A" changed, then action "B" had to run.
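For readers who haven't thought about what make does here, the core of it is a timestamp comparison between a target and its sources. The sketch below shows that check in plain Perl. The helper is_stale is hypothetical and the files "A" and "B" are throwaway temp files; it illustrates the idea, not anything Module::Build ships.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw/tempdir/;

# Hypothetical helper: true if $target is missing or older than any source
sub is_stale {
    my ( $target, @sources ) = @_;
    return 1 unless -e $target;
    my $target_mtime = ( stat $target )[9];
    for my $src (@sources) {
        my $mtime = ( stat $src )[9];
        return 1 if defined $mtime && $mtime > $target_mtime;
    }
    return 0;
}

# Set up a source "A" and a derived file "B" in a scratch directory
my $dir = tempdir( CLEANUP => 1 );
my ( $source, $target ) = map { File::Spec->catfile( $dir, $_ ) } qw/A B/;
for my $file ( $source, $target ) {
    open my $fh, '>', $file or die "Can't create $file: $!";
    close $fh;
}

my $now = time;
utime $now, $now,      $source;    # "A" was just changed
utime $now, $now - 60, $target;    # "B" was built a minute ago

print is_stale( $target, $source ) ? "B must be rebuilt\n" : "B is up to date\n";
```

A full dependency system would attach checks like this to every action and walk the resulting graph, which is the part that was missing.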

My personal pet peeve — possibly one of the big reasons I got discouraged doing maintenance — was that it also included Module::Build::Compat, which was used to generate a Makefile.PL from the Build.PL. While this seemed like a benefit to ease transition, it meant that Module::Build needs to maintain feature-compatibility — and in many cases bug-compatibility — with ExtUtils::MakeMaker effectively forever.

Module::Build promised easy subclassing and this was mostly true. But re-use and sharing was nearly impossible. If you had a subclass to do one thing and I had a subclass that did something else and you wanted to combine them, you pretty much had to copy and paste code. Contrast that with Dist::Zilla's incredible plugin ecosystem — where just about anything you want to do has been written up into a plugin that you can just drop in.

Module::Build will live on as a CPAN distribution

Module::Build never became the uncontested successor to ExtUtils::MakeMaker. It's not used as part of the Perl 5 core build process. It originally went into the core at least in part to ease adoption, but now all CPAN clients can bootstrap it on demand.

It's not a bad module, but it has no reason to live in the Perl 5 core any more. Removing it means one less thing for the already-stretched Perl 5 porters to maintain, update and support.

Module::Build helped us through a critical transition away from purely Makefile based installers. It will continue to live on CPAN and will continue to support the thousands of distributions that rely on it. If a motivated maintainer came along, it might even start to innovate again, or pay down its technical debt.

I give it — and its creator, Ken Williams — my respect for what it accomplished, even while I bid it farewell from the core.

Posted in p5p, perl programming, toolchain

How I manage new perls with perlbrew

Perl v5.19.0 was released this morning and I already have it installed as my default perl. This post explains how I do it.

First, I manage my perls with perlbrew. I install that, then use it to install some tools I need globally available:

$ perlbrew install-patchperl
$ perlbrew install-cpanm

You must install cpanm with perlbrew -- if you don't, weird things can happen when you switch perls and try to install stuff.

I keep my perls installed read-only and add a local::lib based library called "@std". (I stole this technique from Ricardo Signes.) That way, I can always get back to a clean, stock perl if I need to test something that way.

(There are still some weird warnings that get thrown doing things this way when I switch perls, but everything seems to work.)

I also install perls with an alias, so "19.0" is short for "5.19.0".

Then I have a little program that builds new perls, sets things up the way I want, and installs my usual modules. All I have to do is type this:

$ newperl 19.0

And then I've got a brand new perl I can make into my default perl.

Here's that program. Feel free to adapt it to your own needs:

#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;
use autodie qw/:all/;

my $as = shift
  or die "Usage: $0 <perl-version>";
my @args = @ARGV;

# trailing "t" means do threads
my @threads = ( $as =~ /t$/ ) ? (qw/-D usethreads/) : ();

$as =~ s/^5\.//;
my $perl = "5.$as";
$perl =~ s/t$//; # strip trailing "t" if any
my $lib = $as . '@std';

my @problem_modules = qw(
  JSON::XS
);

my @to_install = qw(
  Task::BeLike::DAGOLDEN
);

my @no_man = qw/-D man1dir=none -D man3dir=none/;

# install perl and lock it down
system( qw/perlbrew install -j 9 --as/, $as, $perl, @threads, @no_man, @args );
system( qw/chmod -R a-w/, "$ENV{HOME}/perl5/perlbrew/perls/$as" );

# give us a local::lib for installing things
system( qw/perlbrew lib create/, $lib );

# let's avoid any pod tests when we try to install stuff
system( qw/perlbrew exec --with/, $lib, qw/cpanm TAP::Harness::Restricted/ );
local $ENV{HARNESS_SUBCLASS} = "TAP::Harness::Restricted";

# some things need forcing
system( qw/perlbrew exec --with/, $lib, qw/cpanm -f/, @problem_modules );

# now install the rest
system( qw/perlbrew exec --with/, $lib, qw/cpanm/, @to_install );

# repeat to catch any circularity problems
system( qw/perlbrew exec --with/, $lib, qw/cpanm/, @to_install );
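The name handling near the top of that script is easy to misread, so here it is pulled out into a standalone sketch. The helper name parse_perl_name is hypothetical; the logic mirrors the regular expressions in the script, and nothing here calls perlbrew.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper mirroring the script's name handling
sub parse_perl_name {
    my ($as) = @_;
    my $threads = $as =~ /t$/ ? 1 : 0;    # trailing "t" means a threaded build
    $as =~ s/^5\.//;                      # accept "5.19.0" as well as "19.0"
    ( my $perl = "5.$as" ) =~ s/t$//;     # real version never carries the "t"
    return {
        as      => $as,
        perl    => $perl,
        lib     => $as . '@std',
        threads => $threads,
    };
}

for my $name (qw/19.0 16.3t 5.18.0/) {
    my $p = parse_perl_name($name);
    printf "%-7s alias=%s perl=%s lib=%s threads=%s\n",
        $name, $p->{as}, $p->{perl}, $p->{lib}, $p->{threads} ? 'yes' : 'no';
}
```

So "16.3t" builds perl 5.16.3 with threads under the alias "16.3t", with a library named "16.3t@std".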

Yes, that takes a while. I kicked it off right before going to get lunch. When I got back, I was ready to switch:

$ perlbrew switch 19.0@std

I also have a couple bash aliases/functions that I use for easy, temporary toggling between perls:

alias wp="perlbrew list | grep \@"
up () {
  local perl=$1
  if [ -n "$perl" ]; then
    perlbrew use "${perl}@std"
  fi
  local current=$(perlbrew list | grep '\*' | sed -e 's/\* //')
  echo "Current perl is $current"
}

I use them like this (notice that I don't need to type my @std library for this fast switching):

$ up
Current perl is 18.0@std

$ wp
  10.0@std
  10.0-32@std
  10.1@std
  12.5@std
  14.4@std
  16.3@std
  16.3@test
  16.3t@std
* 18.0@std
  19.0@std
  8.5@std
  8.9@std

$ up 19.0
Use of uninitialized value in split at /loader/0x7fa639030cd8/local/lib.pm line 8.
Use of uninitialized value in split at /loader/0x7fa639030cd8/local/lib.pm line 8.
Current perl is 19.0@std

(there's that warning I mentioned)

I hope this guide helps people keep multiple perls for development and testing. In particular, I'd love to see more people doing development work and testing using 5.19.X so it can get some real-world testing.

See you June 21 for v5.19.1...

Posted in perl programming

Anyone want vanillaperl.com?

I'm tired of paying the domain bill for vanillaperl.com (which currently just redirects to strawberryperl.com).

It will lapse at the beginning of July unless someone wants to take it.

If you're interested, leave a comment below and explain what you want to do with it. I'll award it to the best proposal received by June 1.

Posted in perl programming

OODA vs technical debt

This post is a response to Ovid's series about agility without testing.

I started to respond to the last post and realized that my comment was long enough to be a blog post of my own.

First, let me say that I'm enjoying this series. Ovid and Abigail are both challenging conventional wisdom around technical debt and I think that's really healthy.

However, I note that Ovid's evidence in favor of emergent behavior is anecdotal, which is probably inevitable for this sort of thing, but dangerous. "It worked these handful of times I remember it" has confirmation bias and no statistical significance.

We can't run a real experiment, but we can run a thought experiment: 100 teams of strict TDD vs 100 teams of the Ovid approach [which he really ought to brand somehow] from the same starting point (perhaps in parallel universes) for a few months of development.

What could we expect? Certainly, the TDD teams will spend more of their time on testing than the Ovid teams. So the Ovid teams will deliver more features and fix more bugs in the same period of time.

If one believes even a little of the Lean Startup hype, the Ovid teams will have more opportunities to see customer reactions — they will have a shorter OODA loop.

On the flip side, the TDD teams have less technical debt and a lower risk profile. I disagree with the idea that technical debt is an option. I believe it does have an ongoing cost: future development is less efficient and more time consuming, at least to some degree.

I call this "servicing" technical debt, which is just like paying only the interest on your credit card. You might never pay down any of the technical debt, but as you accumulate more, you'll pay more to service it.

It seems clear to me that which result you prefer depends quite a lot on the maturity of the product (possibly expressed in terms of expected growth rate) and the overall risk level.

For a brand-new startup, the risk of failure is already pretty high regardless of coding style. A faster OODA loop probably reduces risk more than improved tests do, because the bigger risk is building something customers don't want. And with such a high risk of failure, there's a chance that you'll simply be able to default on technical debt.

If I can riff on the financial crisis, a startup has subprime technical debt. It's either successful — in which case there will be growth sufficient to pay off technical debt (if the risk/reward tradeoff justifies it) — or it fails, in which case the debt is irrelevant. Rapid growth deflates technical debt.

For a mature business, however, it might well go the other way. Risk to an existing profit stream is more meaningful, and technical debt has to be paid off or serviced (rather than defaulted on), which reduces future profitability in a way that growth might not sufficiently offset.

The quandary will be businesses — or products (if part of an established business) — that are in between infancy and maturity. There the "right" approach will depend heavily on the risk tolerance and growth prospects.

Regardless, I tend to strongly favor TDD at the unit-test level, where I find TDD helps me better define what I want out of a particular piece of code and then be sure that subsequent changes don't break that. At the unit test level, the effort put into testing can be pretty low and the benefits to my future code productivity fairly high.

But as the effort of testing some piece of behavior increases — due to external dependencies or other interaction effects — it's more tempting to me to let go of TDD and rely on human observation of the emergent behaviors because I'd rather spend my time coding features than tests.

I think that puts me a little closer to the Ovid camp than the strict TDD camp, but not all the way.

Posted in coding

Why the latest File::Temp might surprise you

There was a subtle API change in File::Temp 0.23 that improves consistency, but might break old, buggy code.

Prior to 0.23, here was the calling signature for the functional and object oriented interfaces for File::Temp (with some creative spacing to show the problem):

# functional
my ( $fh, $filename ) = tempfile( $template, %options );
my $tempdir           = tempdir ( $template, %options );

# object oriented
my $tmp = File::Temp->new       (            %options );
my $dir = File::Temp->newdir    ( $template, %options );

Notice how new() doesn't take a template argument. Instead, you're supposed to pass it as an option in the %options hash: TEMPLATE => 'tempXXXXX'.

Frankly, this interface sucks. There are too many ways to get confused or do it wrong:

  • What happens if you pass a leading template to new()?
  • What happens if you leave off the leading template for newdir()?
  • What happens if you pass a TEMPLATE option to newdir(), tempfile() or tempdir()?
  • What happens if you call tempfile() or tempdir() as methods?

A test program

Here's a little test program to try out some variations. Notice that a leading template argument is 'arg_XXXX' and a TEMPLATE option is 'opt_XXXX', so we can see which takes precedence if we try with both:

#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;
use File::Temp qw/tempfile tempdir/;

my @cases = (
    # documented API
    q{tempfile            ('arg_XXXX'                        )},
    q{tempdir             ('arg_XXXX'                        )},
    q{File::Temp->new     (            TEMPLATE => 'opt_XXXX')},
    q{File::Temp->newdir  ('arg_XXXX'                        )},

    # variations with both arg and TEMPLATE
    q{tempfile            ('arg_XXXX', TEMPLATE => 'opt_XXXX')},
    q{tempdir             ('arg_XXXX', TEMPLATE => 'opt_XXXX')},
    q{File::Temp->new     ('arg_XXXX', TEMPLATE => 'opt_XXXX')},
    q{File::Temp->newdir  ('arg_XXXX', TEMPLATE => 'opt_XXXX')},

    # newdir called like new
    q{File::Temp->newdir  (            TEMPLATE => 'opt_XXXX')},

    # functions called as methods
    q{File::Temp->tempfile('arg_XXXX'                        )},
    q{File::Temp->tempdir ('arg_XXXX'                        )},
    q{File::Temp->tempfile('arg_XXXX', TEMPLATE => 'opt_XXXX')},
    q{File::Temp->tempdir ('arg_XXXX', TEMPLATE => 'opt_XXXX')},
    q{File::Temp->tempfile(            TEMPLATE => 'opt_XXXX')},
    q{File::Temp->tempdir (            TEMPLATE => 'opt_XXXX')},
);

for my $c ( @cases ) {
    my @result = eval $c;
    my $err = $@;
    $err =~ s/\n.*//ms;
    say $c;
    say "    " . ( $result[-1] ? "Got $result[-1]" : $err ) . "\n";
}

Results with File::Temp 0.22

Here are the results of running under File::Temp 0.22 for the documented API:

tempfile            ('arg_XXXX'                        )
    Got arg_Y9B5

tempdir             ('arg_XXXX'                        )
    Got arg_Joq0

File::Temp->new     (            TEMPLATE => 'opt_XXXX')
    Got opt_p9I5

File::Temp->newdir  ('arg_XXXX'                        )
    Got arg_PmNf

That's just as we expect.

Now, let's try those odd cases. First, calling everything with both a leading template and a TEMPLATE option:

tempfile            ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got arg_gIL3

tempdir             ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got arg_xPXg

File::Temp->new     ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/AYeB74PT0K

File::Temp->newdir  ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got arg_GfwP

For everything except new(), the TEMPLATE argument is ignored and the leading argument works just like in the documented API. But how about new()? You see what's happening, don't you? Here's what it thinks you did:

File::Temp->new( arg_XXXX => 'TEMPLATE', opt_XXXX => undef );

Since none of those keys are known, it uses the default directory and template.
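The pairing-up can be reproduced with a plain hash assignment. This is only an illustration of how Perl treats an odd-length list assigned to a hash, not File::Temp's actual code.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
no warnings 'misc';    # silence "Odd number of elements in hash assignment"

# Three arguments: a leading template followed by one key/value pair
my @args = ( 'arg_XXXX', TEMPLATE => 'opt_XXXX' );

# Perl pairs them up from the left, so the keys and values slide by one
my %opt = @args;

for my $key ( sort keys %opt ) {
    printf "%s => %s\n", $key, defined $opt{$key} ? "'$opt{$key}'" : 'undef';
}
print exists $opt{TEMPLATE} ? "TEMPLATE found\n" : "no TEMPLATE key\n";

# Prints:
#   arg_XXXX => 'TEMPLATE'
#   opt_XXXX => undef
#   no TEMPLATE key
```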

What about more wrong variations?

File::Temp->newdir  (            TEMPLATE => 'opt_XXXX')
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/AkI6pFjyq_

File::Temp->tempfile('arg_XXXX'                        )
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/3F2V8UPIbx

File::Temp->tempdir ('arg_XXXX'                        )
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/aSljGO6feU

File::Temp->tempfile('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/MGCo_TSXX5

File::Temp->tempdir ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got /var/folders/5t/sy1gxkwj2l1gfd20s2g470200000gn/T/TEAXNoECbB

We get more weird behavior. The newdir method doesn't know about TEMPLATE. And calling functions as methods is like doing this:

tempfile( 'File::Temp' => 'arg_XXXX', TEMPLATE => 'opt_XXXX' );

Again, it can't find the template and the default is used.

And finally, there's this:

File::Temp->tempfile(            TEMPLATE => 'opt_XXXX')
    Error in tempfile() using File::Temp: The template must end with at least 4 'X' characters

File::Temp->tempdir (            TEMPLATE => 'opt_XXXX')
    Error in tempdir() using File::Temp: The template must end with at least 4 'X' characters

Why is that an error when the previous method calls weren't? Because it looks like this:

tempfile( 'File::Temp', TEMPLATE => 'opt_XXXX' );

Since there are an odd number of arguments, it thinks it was given a (bad) leading template and some arguments.

If you're ready to facepalm, go right ahead.

What about File::Temp 0.23?

In 0.23, sanity (of a sort) returns. All the functions and methods now respect both ways of specifying a template.

tempfile            ('arg_XXXX', TEMPLATE => 'opt_XXXX'); # fine
File::Temp->newdir  (            TEMPLATE => 'opt_XXXX'); # fine

If you specify both, the last one wins, just as if you gave multiple TEMPLATE arguments.
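One way to get that behavior is to treat a leading template as if it were an earlier TEMPLATE pair and let ordinary hash assignment pick the winner. The sketch below assumes exactly that; normalize_template_args is a hypothetical helper, not File::Temp's real implementation.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: fold an optional leading template into the options
sub normalize_template_args {
    my @args = @_;

    # An odd count means a leading template; turn it into TEMPLATE => ...
    unshift @args, 'TEMPLATE' if @args % 2;

    # Later duplicate keys overwrite earlier ones, so the last TEMPLATE wins
    my %opt = @args;
    return \%opt;
}

for my $case (
    [ 'arg_XXXX' ],
    [ TEMPLATE => 'opt_XXXX' ],
    [ 'arg_XXXX', TEMPLATE => 'opt_XXXX' ],
  )
{
    my $opt = normalize_template_args(@$case);
    print "(@$case) uses template $opt->{TEMPLATE}\n";
}
```

With this scheme the third case resolves to opt_XXXX, which matches the "last one wins" rule described above.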

But there is a catch.

Calling the functions as methods is now an error. In 0.22, you could call functions as methods and File::Temp would (usually) just quietly give you a tempfile where you didn't expect it. That was a bug and now it's a fatal error.

Here's the same test program under 0.2301:

tempfile            ('arg_XXXX'                        )
    Got arg_l2TB

tempdir             ('arg_XXXX'                        )
    Got arg_5y15

File::Temp->new     (            TEMPLATE => 'opt_XXXX')
    Got opt_sziU

File::Temp->newdir  ('arg_XXXX'                        )
    Got arg_3imY

tempfile            ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got opt_NTAn

tempdir             ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got opt_TZzT

File::Temp->new     ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got opt_CFPu

File::Temp->newdir  ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    Got opt_ueeQ

File::Temp->newdir  (            TEMPLATE => 'opt_XXXX')
    Got opt_vkNh

File::Temp->tempfile('arg_XXXX'                        )
    'tempfile' can't be called as a method at (eval 19) line 1.

File::Temp->tempdir ('arg_XXXX'                        )
    'tempdir' can't be called as a method at (eval 20) line 1.

File::Temp->tempfile('arg_XXXX', TEMPLATE => 'opt_XXXX')
    'tempfile' can't be called as a method at (eval 21) line 1.

File::Temp->tempdir ('arg_XXXX', TEMPLATE => 'opt_XXXX')
    'tempdir' can't be called as a method at (eval 22) line 1.

File::Temp->tempfile(            TEMPLATE => 'opt_XXXX')
    'tempfile' can't be called as a method at (eval 23) line 1.

File::Temp->tempdir (            TEMPLATE => 'opt_XXXX')
    'tempdir' can't be called as a method at (eval 24) line 1.

If you are calling functions as methods, your code will break. This is sensible because functions and methods have very different scope implications.

  • Functions are "global": files and directories get cleaned up at the end of the program
  • Methods are "lexical": they return objects that clean up when the object is destroyed
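The difference is easy to see in a small sketch. It creates one temp file through the object interface and one through the functional interface inside the same block, then checks which one survives the end of the block. The template names are arbitrary, and UNLINK is passed to tempfile() explicitly so that the file is removed at program exit.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw/tempfile/;

my ( $object_file, $function_file );

{
    # Object interface: the file is tied to the lifetime of $tmp
    my $tmp = File::Temp->new( TEMPLATE => 'obj_XXXX', TMPDIR => 1 );
    $object_file = $tmp->filename;

    # Functional interface: cleanup is deferred until the program exits
    ( my $fh, $function_file ) =
      tempfile( 'func_XXXX', TMPDIR => 1, UNLINK => 1 );
    close $fh;
}

# $tmp has been destroyed by now; the function's file is still around
printf "object file exists:   %s\n", -e $object_file   ? 'yes' : 'no';
printf "function file exists: %s\n", -e $function_file ? 'yes' : 'no';

# Prints:
#   object file exists:   no
#   function file exists: yes
```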

If you are calling a function as a method, File::Temp has no way to know which way you want it and so it can't DWIM. So BOOM! It dies.

Now go fix your code.

Posted in perl programming

© 2009-2014 David Golden All Rights Reserved