Dependencies don't matter -- but stability does

There has been a flurry of recent posts on the "dependency problem", kicked off by this thread about Moose on Perl Monks.  Chris Prather defended Moose dependencies and Dave Rolsky added a rant and a meditation about dependencies in general.  In the latter article, Rolsky makes a comparison to Debian and contrasts the "binary-only" installation of Debian packages with the "source plus tests" approach of CPAN.  I think the parallel to Debian is apt, but for the wrong reason.

As I see it, the problem with CPAN dependencies is simply the implication of a long dependency chain on stability.  What Debian has that CPAN does not is a clear delineation between stable, testing and unstable repositories. In this paradigm, CPAN is an unstable repository.

Uploaded distributions become available globally as fast as the CPAN mirrors can replicate.  Unless an author chooses a "dev release" version number (1.23_01), the new distribution also becomes the default version for anyone installing one of the modules in the distribution.  Even though dev releases are possible, on CPAN the author alone chooses what level of stability to signal, and that choice is arbitrary, with no connection to real-world results.

With Debian, the end-users get to pick the stability they want to trade off against frequency of bug fixes and new features.  On CPAN, end-users have to work a lot harder to accomplish the same thing.  This is why long dependency chains make people nervous: there are many more things that could suddenly, unexpectedly, become unstable.

Dave Cantrell's CP5.6.2AN project is a step in the right direction, providing a limited CPAN that only indexes modules that have passing test results on a particular version of Perl.  It introduces a new CPAN paradigm much more like Debian, with a CPAN repository containing distributions with some known degree of stability.

This approach could be extended or made more strict: perhaps a distribution only enters a repository if it passes all tests using dependencies already in the repository and if all things in the repository that depend on it pass their tests with the new distribution as well.  Going further, distributions might specify dependencies with an exact version, not a minimum version.  (That might also imply a change from specifying prerequisites as modules ("Foo::Bar") to specifying them as distributions ("Foo-Bar-1.23"), since module version numbers need not change, but distribution version numbers do.)
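
A minimal sketch of that admission rule, written in Python only for brevity.  Every name here is hypothetical: `can_admit`, the `repo` and `prereqs` mappings, and the `tests_pass` callback (a stand-in for whatever would actually run a test suite or look up CPAN Testers results) are assumptions for illustration, not part of any existing CPAN tool.  It checks only direct dependents; a stricter version would follow the chain further.

```python
from typing import Callable, Dict, Set

Repo = Dict[str, str]  # distribution name -> version currently in the repository


def can_admit(name: str, version: str, repo: Repo,
              prereqs: Dict[str, Set[str]],
              tests_pass: Callable[[str, Repo], bool]) -> bool:
    """Decide whether a candidate distribution may enter the repository."""
    # Rule 1: every prerequisite must already be in the repository.
    if not prereqs.get(name, set()).issubset(repo):
        return False
    # Trial environment: the repository with the candidate swapped in.
    trial = {**repo, name: version}
    # Rule 2: the candidate passes its own tests in that environment.
    if not tests_pass(name, trial):
        return False
    # Rule 3: everything that directly depends on it still passes too.
    dependents = [d for d in repo if name in prereqs.get(d, set())]
    return all(tests_pass(d, trial) for d in dependents)


# Made-up data: Baz depends on Foo-Bar and breaks with Foo-Bar 1.23.
repo = {"Foo-Bar": "1.22", "Baz": "0.10"}
prereqs = {"Foo-Bar": set(), "Baz": {"Foo-Bar"}}


def fake_tests_pass(dist: str, env: Repo) -> bool:
    # Stand-in for real test results.
    return not (dist == "Baz" and env.get("Foo-Bar") == "1.23")


print(can_admit("Foo-Bar", "1.23", repo, prereqs, fake_tests_pass))  # False
print(can_admit("Foo-Bar", "1.24", repo, prereqs, fake_tests_pass))  # True
```

The point of the third rule is that a release is judged by what it does to its dependents, not only by its own test suite.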

Bringing this full circle, Rolsky suggested that maybe Perl's culture of testing is part of the problem, but I think Perl's testing culture is part of the solution.  Things like CPAN Testers and cpandeps give us hard data on what works and what breaks.  And if we know that, we can tackle the stability problem, and then the dependency problem will go away.


6 Comments

  1. Posted May 7, 2009 at 11:48 am

    I think you may have distorted my point for dramatic effect.

    I said that the key to Debian's success at handling deps is that installs always work. You made a point about how they achieve that, which is a good one, but I'm not sure what you think I got wrong.

    If Debian packages all came with a test suite, and that test suite always ran on install, that would make doing what Debian does infinitely harder.

    Even something like CP5.6.2AN will "suffer" from the inclusion of tests. There are still a bazillion platforms on which 5.6.2 could run, and some modules will fail their tests on some of those platforms, or with some particular version of a dependency.

    You mention specifying a specific version, but I think Debian's way is more powerful (and more complex, of course).

    With Debian, a package maintainer can say "I depend on X version 1.11, 1.12, or 1.14". If you have X 1.10, 1.13 or 1.15 the package won't install. CPAN only allows for declaring a minimum version, which really doesn't give authors much control. If a module works with every version of a dep from 0.5 - 1.0, except 0.84, there's no good way to handle that.

    Module::Build does let you write version specs along the lines of Debian, but the rest of the toolchain doesn't do anything useful with the information.
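
    As an illustration of what honoring such a spec would involve, here is a rough sketch in Python. The spec string follows the comma-separated form that Module::Build accepts (for example ">= 0.5, <= 1.0, != 0.84"); the `satisfies` function is hypothetical, and comparing versions as plain decimal numbers is a simplifying assumption that ignores dotted and underscore versions.

```python
import operator
from decimal import Decimal

# Comparison operators allowed in a version clause.
OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
       "<": operator.lt, "==": operator.eq, "!=": operator.ne}


def satisfies(version: str, spec: str) -> bool:
    """Return True if `version` meets every comma-separated clause in `spec`."""
    have = Decimal(version)
    for clause in spec.split(","):
        parts = clause.split()
        # A bare version with no operator means "at least this version".
        op, wanted = parts if len(parts) == 2 else (">=", parts[0])
        if not OPS[op](have, Decimal(wanted)):
            return False
    return True


spec = ">= 0.5, <= 1.0, != 0.84"
print(satisfies("0.9", spec))   # True
print(satisfies("0.84", spec))  # False: explicitly excluded
print(satisfies("1.1", spec))   # False: above the upper bound
```

    A client that evaluated specs this way could refuse a known-bad 0.84 while still accepting 0.83 or 0.85.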

    Also, it's "Rolsky", not "Rolky".

  2. david
    Posted May 7, 2009 at 12:33 pm

    I think we're in broad agreement, but I'm emphasizing the stability point over testing/no-testing. You're right that CP5.6.2AN may still give people issues during testing -- but it's also a much better starting point to build a binary repository from than ordinary CPAN. Then we can offer the best of both approaches.

    (Sorry about the name typo. Why aren't you in my dictionary? :-)

  3. Posted May 7, 2009 at 3:07 pm

    Schwern had a grant for that a while back, but nothing ever happened (I think it was postponed or something? can't remember).

    So far as I can tell the amount of work required for this to happen is not that big:

    1. support more than one 02packages.details.txt.gz in both PAUSE and the CPAN clients. This might be a PITA to work in but shouldn't be too hard.

    2. implement a UI for PAUSE authors to mark a distribution as stable, explicitly upgrading it into the stable index.

    3. implement configuration in the CPAN clients to choose which index they point to.

    The current index is sort of like testing. The unstable index would include distributions with underscores in their version numbers.

    This could also pave the way towards additional alternative indexes, either as DarkPAN type things, or public indexes (for instance a CPAN index for known good versions of dependencies for a large project with many dependencies).

    • Posted May 8, 2009 at 12:02 pm

      For what it's worth, old CPAN/CPANPLUS versions can also use alternative indexes by just using symlinks to create an alternative 02packages.details.txt.gz with the dists shared.

      Obviously suboptimal from a clarity POV, but o conf urllist = $stable could work today.

  4. Posted May 7, 2009 at 3:48 pm

    autarch, I am pretty sure xdg is right on this one. (I’m using handles to avoid “Dave and David”.)

    Even if CP5.6.2AN suffers compared to Debian, I am pretty sure that something like it but stricter would be so much more stable than the “open” CPAN that it would basically not matter whether we attain a Debian level of installation smoothness. What’s more, in such a case you start to get “herd immunity”: users expect things to not break, and so the few cases that do break are isolated cases, so stand out like a sore thumb, and are likely to get reported and then acted upon swiftly.

    Furthermore, on at least one count, this would work far better than Debian: there would be a continuously-updated, rolling “stable” that is never too far from the bleeding edge. This is how the testing culture would be part of the solution: the effort of making packages fit for inclusion would be outsourced in tiny increments to their authors, and that of measuring their fitness outsourced wholesale to computers – rather than being a big-bang integration effort among packagers, with a stop-the-world release every couple of years.

  5. François Perrad
    Posted May 8, 2009 at 12:52 pm

    I want the best of both worlds (Debian & CPAN). I wrote a small script cpan2apt (available at http://github.com/fperrad/misc/tree/master).

    This script retrieves module dependencies from deps.cpantesters.org,
    and tries to find a Debian package for each module.

    For example, I want to install the latest Padre (still in development) on the latest Ubuntu 9.04.
    So, I need two commands:

    $ cpan2apt Padre
    $ cpan Padre

    In this case, Padre depends on 88 non-core Perl modules, but 69 of them have a Debian package.

    A long chain of dependencies is not a problem when the majority of them have a stable version.
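
    The matching step can be sketched roughly as below, in Python. This is not cpan2apt's actual code: it assumes the usual Debian naming convention for Perl libraries (Foo::Bar becomes libfoo-bar-perl), which has exceptions, and the function names and the sample `available` set are made up for illustration.

```python
from typing import Iterable, List, Set, Tuple


def debian_package_name(module: str) -> str:
    """Guess the Debian package name for a Perl module (Foo::Bar -> libfoo-bar-perl)."""
    return "lib" + module.lower().replace("::", "-") + "-perl"


def split_by_packaging(modules: Iterable[str],
                       available: Set[str]) -> Tuple[List[str], List[str]]:
    """Split modules into Debian packages to install and modules left for cpan."""
    via_apt, via_cpan = [], []
    for module in modules:
        package = debian_package_name(module)
        if package in available:
            via_apt.append(package)   # a packaged version exists
        else:
            via_cpan.append(module)   # fall back to building from CPAN
    return via_apt, via_cpan


# Illustrative sample only; a real tool would query the apt cache.
available = {"libfile-which-perl", "libyaml-tiny-perl"}
deps = ["File::Which", "YAML::Tiny", "Some::Unpackaged::Module"]
print(split_by_packaging(deps, available))
```

    Everything in the first list comes from the distribution's package repository; only the remainder is built and tested from CPAN.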

© 2009-2014 David Golden All Rights Reserved