Why you should use getcwd and not cwd

The Cwd module provides several functions for finding the current directory. The most similar-seeming are cwd and getcwd. Have you ever wondered why you should pick one or the other?

I always use getcwd and I'll show you why.

My ~/.dzil directory is a symlink to a git repository elsewhere. If I'm in that directory in my terminal, here is a look at three ways to get the current path:

$ perl -MCwd=cwd,getcwd -MFile::Spec -wE 'say for cwd(), getcwd(), File::Spec->rel2abs(".")'

The cwd call returns the symlink path — the way the shell sees it. The getcwd call returns the real path with the symlink resolved.

Now look at the third. That's from File::Spec. A LOT of code uses File::Spec to manipulate paths. If you ever want to compare the current directory against a path made absolute by File::Spec, you need to use getcwd.
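As a minimal sketch of that comparison (the helper name is_current_dir is hypothetical and not part of Cwd or File::Spec), something like this should work. Both sides go through canonpath so that differences in separator style don't cause a false mismatch:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;

# Hypothetical helper: true if $path, once made absolute by File::Spec,
# names the directory that getcwd reports.
sub is_current_dir {
    my ($path) = @_;
    my $absolute = File::Spec->canonpath( File::Spec->rel2abs($path) );
    my $current  = File::Spec->canonpath( getcwd() );
    return $absolute eq $current ? 1 : 0;
}

print is_current_dir('.') ? "same directory\n" : "different directory\n";
```

Swapping getcwd() for cwd() in the helper is the easy way to see the mismatch when the current directory was reached through a symlink.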

I've found that getcwd is more consistent across platforms, whereas cwd can be implemented differently depending on your platform and on whether you have the XS or the pure-Perl implementation.

I like consistency.

Sure, there are cases where the "shell view" of the current directory is more important and you might want to use cwd, but I find that is the exception, not the rule.

Consistency matters. Use getcwd.


Why installing Dist::Zilla is slow and what you can do about it

Despite my previous rant about Dist::Zilla haters and why you don't need Dist::Zilla to contribute, I recognize that there is one thing that does require Dist::Zilla: installing from a patched repo without waiting for a CPAN release.

Leaving aside whether that's really wise or not, I think it's the real frustration people are having with distributions that use Dist::Zilla.

That inspired me to explore why Dist::Zilla is slow to install and what could be done to improve it.

First and foremost, Dist::Zilla just has a lot of dependencies — over 170 of them. Downloading, untarring, building, testing and installing those takes time. Starting from a fresh Perl, if every distribution took only a second to install, it would still take nearly 3 minutes. Unfortunately, distributions aren't that quick to install. Some are damn slow.

My first experiment was finding out how long it took to install Dist::Zilla in the worst-case situation: a brand new perl installation.

I started with two cases:

  1. Installing with cpanminus, but using TAP::Harness::Restricted to avoid pod-related tests (which might otherwise cause non-functional test failures and prevent installation)
  2. Installing with cpanminus, but using the "-n" flag to skip all tests

In each case, starting from a clean perlbrew, I set up a local library to install modules. Then I bootstrapped cpanminus and (for #1), TAP::Harness::Restricted:

$ perlbrew lib create 18.2@case1
$ perlbrew use 18.2@case1
$ cpan App::cpanminus
$ cpanm TAP::Harness::Restricted

I created a similar, empty local library for case #2.

Installing TAP::Harness::Restricted in case #1 installs some distributions that Dist::Zilla deps also need, but I didn't include that time in my analysis. The majority of it is installing Capture::Tiny, which I timed separately at ~40 seconds because of the heavy testing it does.

Testing was done like this:

# case #1
$ HARNESS_SUBCLASS=TAP::Harness::Restricted time cpanm Dist::Zilla

# case #2
$ time cpanm -n Dist::Zilla

One thing I realized later (but will describe here) is that cpanminus installs META file information into the archlib path. I was curious how much overhead that added, so I added a third case (also with a clean local library): installing using CPAN.pm with TAP::Harness::Restricted.

To keep that from hanging in the middle of the run, I had to run it enabling default answers to prompts:

# case #3
$ PERL_MM_USE_DEFAULT=1 HARNESS_SUBCLASS=TAP::Harness::Restricted time cpan Dist::Zilla

The results:

  • Case 1: ~16 minutes (cpanminus + TAP::Harness::Restricted)
  • Case 2: ~11 minutes (cpanminus without running tests)
  • Case 3: ~12 minutes (CPAN.pm + TAP::Harness::Restricted)

That was surprising! Comparing #1 and #3, cpanminus writing META files looks like it has about the same overhead as running tests in the first place. If cpanminus didn't do that, then case #2 might drop down to maybe 7 or 8 minutes. That would average around 3 seconds over the 170 dependencies, which seems plausible.
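The back-of-the-envelope arithmetic behind that estimate can be sketched like this. The inputs are the rounded timings from the results list, so the outputs are only as precise as those:

```perl
use strict;
use warnings;

# Approximate wall-clock minutes from the three cases above.
my %minutes = ( case1 => 16, case2 => 11, case3 => 12 );
my $deps    = 170;    # rough dependency count quoted earlier

# Same tests, different client: the gap is attributed (as an assumption)
# to cpanminus writing META files.
my $meta_overhead = $minutes{case1} - $minutes{case3};

# Same client, tests on versus off.
my $test_overhead = $minutes{case1} - $minutes{case2};

# Case 2 with the assumed META overhead removed.
my $no_tests_no_meta = $minutes{case2} - $meta_overhead;
my $secs_per_dist    = $no_tests_no_meta * 60 / $deps;

printf "META overhead:  ~%d min\n", $meta_overhead;
printf "Test overhead:  ~%d min\n", $test_overhead;
printf "Case 2 w/o META: ~%d min (~%.1f s per distribution)\n",
    $no_tests_no_meta, $secs_per_dist;
```

With these inputs the low end of the estimate comes out at 7 minutes, or about 2.5 seconds per distribution; the 8 minute figure gives a little under 3.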

[Update: Miyagawa pointed out that I'm assuming that writing META is the cause of the slowdown and he's right. I suspect that it is a large part of it (it hits disk and executes a separate process), but there might be other reasons as well.]

That was the macro picture. Next I wanted to see how long individual distributions took to install so that I could see which ones were causing the biggest delay.

To profile installation timings, I hacked some timing output into cpanminus and then re-ran case #1. Not surprisingly, a handful of distributions were a huge chunk of the installation time.

The number after the distribution in the list below is the number of exclusive seconds required to download, unpack, configure, build, test and install (cpanminus' writing of META is excluded):

Moose-2.1202: 123
Module-Build-0.4204: 63
Dist-Zilla-5.012: 51
IO-Socket-SSL-1.966: 39
Capture-Tiny-0.23: 39
PPI-1.215: 26
DateTime-TimeZone-1.63: 24
File-Temp-0.2304: 21
DateTime-1.06: 21
Test-Harness-3.30: 16
DateTime-Locale-0.45: 16
MooseX-Role-Parameterized-1.02: 9
Net-SSLeay-1.58: 9
Test-Warn-0.24: 9
libwww-perl-6.05: 9
Test-Simple-1.001002: 7
Config-MVP-2.200006: 7
JSON-2.90: 7
Moose-Autobox-0.15: 6
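
To get a feel for how concentrated the time is, here is a small sketch that totals the list above and reports the share taken by the slowest few entries. It only knows about the distributions listed here, so the percentages are relative to the listed seconds, not to the whole installation; the helper name top_seconds is made up for the example:

```perl
use strict;
use warnings;

# The timings listed above, copied verbatim.
my $listing = <<'END';
Moose-2.1202: 123
Module-Build-0.4204: 63
Dist-Zilla-5.012: 51
IO-Socket-SSL-1.966: 39
Capture-Tiny-0.23: 39
PPI-1.215: 26
DateTime-TimeZone-1.63: 24
File-Temp-0.2304: 21
DateTime-1.06: 21
Test-Harness-3.30: 16
DateTime-Locale-0.45: 16
MooseX-Role-Parameterized-1.02: 9
Net-SSLeay-1.58: 9
Test-Warn-0.24: 9
libwww-perl-6.05: 9
Test-Simple-1.001002: 7
Config-MVP-2.200006: 7
JSON-2.90: 7
Moose-Autobox-0.15: 6
END

# Parse "Name-Version: seconds" lines into [name, seconds] pairs.
my @rows = map { /^(\S+):\s*(\d+)$/ ? [ $1, $2 ] : () } split /\n/, $listing;

my $total = 0;
$total += $_->[1] for @rows;

# Hypothetical helper: seconds spent on the $n slowest listed distributions.
sub top_seconds {
    my ($n) = @_;
    my @sorted = sort { $b->[1] <=> $a->[1] } @rows;
    my $sum = 0;
    $sum += $_->[1] for @sorted[ 0 .. $n - 1 ];
    return $sum;
}

for my $n ( 1, 3, 5 ) {
    printf "Top %d: %d of %d listed seconds (%.0f%%)\n",
        $n, top_seconds($n), $total, 100 * top_seconds($n) / $total;
}
```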

In some cases, it looks like newer versions of dual-life core distributions are being pulled in when they might not need to be.

For example, Test::File::ShareDir requires a newer Module::Build than ships with Perl v5.18.2 for configuration, but doesn't seem (at first glance) to use any of its features. Switching to ExtUtils::MakeMaker would shave 8% or so off Dist::Zilla's worst-case installation time (assuming tests are run).

Likewise, Tree::DAG_Node requires a very new File::Temp for testing. Is that really necessary? Maybe not.

Of course, these are worst case results. In many real-world cases, you might already have Moose, LWP, DateTime and other modules installed and the installation burden will be less.

So what should you do if you need to install Dist::Zilla?

If you like tests, install TAP::Harness::Restricted and use CPAN.pm like this:

$ cpan TAP::Harness::Restricted
$ PERL_MM_USE_DEFAULT=1 HARNESS_SUBCLASS=TAP::Harness::Restricted cpan Dist::Zilla

If you don't mind installing things without tests, use cpanminus like this:

$ cpanm -n Dist::Zilla

In either case, it's probably going to take about 10 minutes.

Go for a walk, go get a cup of your favorite beverage, take a bathroom break, or whatever. When you get back, Dist::Zilla should be ready for you.

If you really can't wait because $job depends on the fix, you can always just patch a tarball from CPAN instead of the repo.
Despite the complaint that Dist::Zilla requires "half of CPAN", that's actually only about 0.6% of the nearly 30k distributions on CPAN.
Because capturing output portably can break in so many ways.

Dist::Zilla haters, stop your whining

Some people just love to hate. And some of them love to blog their hate.

Dist::Zilla seems to rub some people the wrong way. Here are some of the typical complaints I've seen or heard:

  • It's good for authors but not contributors
  • I have to install half of CPAN to contribute
  • There's no Makefile.PL or Build.PL in the code repository
  • I can't install it from github

Well, sure. It is good for authors.

It was written by Ricardo Signes (RJBS), who is possibly the most prolific CPAN author to date. According to the CPAN Report, Ricardo released 230 distributions in 2013. Oh, and did I mention that he is the Perl Pumpking, too?

If you look at heavy Dist::Zilla users, you'll find a who's who of very active and involved CPAN contributors. These are people who spend a lot of time publishing code for the benefit of the broader Perl community.

So here's my problem with whining about how their use of Dist::Zilla makes it hard to contribute:

You're telling some extremely prolific CPAN contributors to be less productive for your convenience.

That's asinine!

You ought to be thanking them for finding a tool that lets them give so much of their time to the Perl community. You ought to be bending over backwards to do it their way, even if that means a few extra minutes of your time.

You sure as hell shouldn't be wasting any of their time or morale complaining about how they manage their code.

That said, there are ways to mitigate Dist::Zilla contributor-shock and I've been encouraging Dist::Zilla users to make such changes. One huge help is providing better documentation for how to contribute.

Here's all it takes for most of my own distributions (note, no Dist::Zilla required):

    $ git clone git://github.com/dagolden/...whatever...
    $ cd whatever
    $ cpanm --installdeps .
    # hack, hack, hack
    $ prove -l

If that's too hard for you, I'm not sure I want your contributions anyway.

Maybe bitching about Dist::Zilla will make some potential new adopters think twice. Or maybe not.

Do you think people would rather listen to the guy releasing 230 distributions a year to CPAN or to the guy complaining about how he did it?


Help test IO::Socket::IP for Perl v5.20

Do you want good IPv6 support in the Perl core?

The Perl 5 Porters intend to add IO::Socket::IP to the Perl 5 core for v5.20, coming later this year. IO::Socket::IP makes IPv4/IPv6 transparent networking easy.

It aims to be a drop-in replacement for IO::Socket::INET (with some caveats), so that most existing code merely needs to do s/IO::Socket::INET/IO::Socket::IP/ to gain IPv6 support.
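
Here is a minimal sketch of what "drop-in" means in practice. The helper loopback_roundtrip is hypothetical; it takes the socket class as a parameter so the same code can be run against either module, and it assumes a working loopback interface on 127.0.0.1:

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Socket::IP;

# Hypothetical helper: open a listener on an ephemeral loopback port,
# connect to it, and send one line through, using whichever class is given.
sub loopback_roundtrip {
    my ($class) = @_;

    my $server = $class->new(
        LocalAddr => '127.0.0.1',
        LocalPort => 0,          # let the OS pick a free port
        Listen    => 1,
        Proto     => 'tcp',
    ) or die "listen failed: $@";

    my $client = $class->new(
        PeerAddr => '127.0.0.1',
        PeerPort => $server->sockport,
        Proto    => 'tcp',
    ) or die "connect failed: $@";

    my $conn = $server->accept or die "accept failed: $!";

    print {$client} "ping\n";
    $client->flush;
    return scalar <$conn>;
}

for my $class (qw(IO::Socket::INET IO::Socket::IP)) {
    my $line = loopback_roundtrip($class);
    print "$class received: $line";
}
```

The constructor arguments are the ones IO::Socket::INET code typically uses; the point of the exercise is that nothing but the class name changes.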

Preliminary tests have been favorable, but P5P would like more testing to see how well it works as a drop-in replacement in real-world situations. You can help in one of two ways:

The hard, but good way

Take some networking code you've written and replace IO::Socket::INET with IO::Socket::IP.
If you find any problems, report them to the IO::Socket::IP bug queue.

This is the best test, but requires the most work from you.

The easy, but risky way

Install Acme::Override::INET from CPAN. This replaces your IO::Socket::INET with a thin wrapper around IO::Socket::IP.

THIS IS RISKY, because it affects every Perl program you run, so be sure you're willing to take the risk.

I've been running it for a while on my day-to-day Perl and haven't had any problems so far. Other Porters, including Ricardo Signes and Nicholas Clark, are also using it.

If you find any problems, report them to the IO::Socket::IP bug queue.

This is super easy and fairly comprehensive since it affects everything you do. But you have to accept the risk of breakage.

[If you want to remove the override, you should be able to delete the modified IO::Socket::INET from your sitelib path and Perl will resume using IO::Socket::INET in your core library path.]
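
If you're not sure which copy of IO::Socket::INET your perl is picking up, a quick check along these lines may help. It is only a sketch: module_origin is a made-up helper, and it assumes that an override would live under the directories that Config reports as sitelibexp or sitearchexp:

```perl
use strict;
use warnings;
use Config;
use IO::Socket::INET;

# Hypothetical helper: return the file a loaded module came from and
# whether that file sits under one of the site library directories.
sub module_origin {
    my ($module) = @_;
    ( my $key = "$module.pm" ) =~ s{::}{/}g;
    my $path = $INC{$key};
    return unless defined $path;    # module not loaded

    for my $name (qw(sitearchexp sitelibexp)) {
        my $dir = $Config{$name};
        next unless defined $dir && length $dir;
        return ( $path, 'site' ) if index( $path, $dir ) == 0;
    }
    return ( $path, 'other' );      # core or some other @INC directory
}

my ( $path, $where ) = module_origin('IO::Socket::INET');
print "IO::Socket::INET loaded from $path ($where)\n";
```

A result of "site" suggests the override is still in place; "other" suggests Perl is back to the copy in the core library path (or is finding the module somewhere else in @INC).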

Mention that you're helping

If no one reports any bugs, does that mean that lots of people tried it and no one had problems? Or does it mean that no one bothered to try?

If you test IO::Socket::IP (either way above), then please add yourself to this ticket.

Thank you!


The xdg channel — Thanksgiving missive

While I haven't been blogging much, I have been busy coding. To riff from Damian's "Conway Channel" talks, this blog post summarizes the various (mostly new) CPAN modules I've been working on.

::Tiny and not so ::Tiny

I appear to be one of the leading proponents of "::Tiny" modules. I love the Unix-like small-tools philosophy. Sometimes, though, they can be too tiny, and need extension for situations that need extra features and/or can handle more dependencies.

  • Class::Tiny is my response to the excessive minimalism of Object::Tiny. When you just need read-write accessors with lazy defaults and maybe BUILD/DEMOLISH, Class::Tiny gives it to you in about 120 lines of code.
  • HTTP::Tiny::UA extends HTTP::Tiny. HTTP::Tiny is in the Perl core and Christian and I consider it nearly feature-complete. I hope HTTP::Tiny::UA can become common ground for user-agent extensions that are consistent with the HTTP::Tiny philosophy and use HTTP::Tiny as the underlying transport.
  • Path::Tiny is not new, but it gets steady improvements. Lately, I've been sorting out Windows and volumes. One of these days, I hope to get around to tackling some big changes to file moving, copying and renaming (maybe by the QA hackathon next year).

Embellishing the Moose

Roles are one of the best features of Moose and Moo. I wrote two roles I thought worth sharing.

  • MooseX::Role::Logger provides a Log::Any-based logger. I think Log::Any is a great idea and underappreciated. I've taken over maintenance and hope to someday soon ship a new release that is even more flexible than it is today.
  • MooseX::Role::MongoDB provides an API for using MongoDB::MongoClient and associated databases/collections. It provides lazy-instantiation, caching and fork-safety.

A MongoDB Framework

You either love MongoDB or you hate it. Or both at the same time. MongoDB's document-centric data model is different from what you're used to, and everything I found on CPAN was too complex or was doing it wrong.

  • Meerkat is a framework that uses Moose objects as projections of the document state maintained in the database. I think it makes it easy to use the right conceptual model in a Perl-ish way. Of course, it uses MooseX::Role::MongoDB under the hood.

Living with failure

Perl's poor excuse for an exception system is painful, so it falls to CPAN to provide improvements. Here are my latest two attempts to provide better tools.

  • failures makes creating and using exception classes extremely easy. Other than relying on Class::Tiny, it's implemented in about 70 lines of code.
  • Try::Tiny::Retry extends Try::Tiny to make it easy to retry a code block on error. It defaults to exponential-backoff, but is easily customizable.

CPAN minus archive equals index

Without an index, CPAN is just a distributed file store.

  • CPAN::Common::Index is a common library for accessing several types of CPAN indexes. I hope someday it will be something that CPAN clients will use.


If I didn't use Dist::Zilla, I couldn't possibly be as prolific as I am. So some fraction of my time is spent adding to the Dist::Zilla ecosystem. In addition to helping make Dist::Zilla safe for encodings, I churned out a few new plugins.

Pod::Spell gets used by my Dist::Zilla spell checking plugins. I merged in the word list from Pod::Wordlist::Hanekomu, improved wordlist matching with Lingua::EN::Inflect and made some other algorithm improvements.

More for the core

I kept pushing some core modules forward in various ways, mostly just applying patches or fixing bugs.

  • HTTP::Tiny got some minor bug fixes
  • File::Temp got some dependency management and Travis CI smoking
  • CPAN::Meta got some fixes to validation and a couple new features

YAML::Tiny isn't really core, but it is the basis for CPAN::Meta::YAML, so I count it in the same category. Working with Ingy, Karen Etheridge and Jim Keenan, we fixed encoding, overhauled the test suite and added test coverage.

Code review

Inspired by rjbs's code-review practices, I've started gradually cleaning up and re-releasing old distributions of mine.

I for Incomplete

There are a number of other projects that I've started or just conceived that I haven't finished. They may yet see the light of day in the future.

  • A "tiny" URI module
  • A better benchmarking library, with statistical rigor for non-parametric timing distributions with unequal variance
  • Some extensions for Data::Faker
  • A module providing a standard way to safely evaluate $VERSION lines parsed from modules

What you can do

First, if any of these are interesting to you, please try them out and let me know what you think.

Second, if you're not in the habit of releasing code to CPAN, consider starting. When you write some library, take an extra second or two to think about how it could be generalized for others and ship it.

Give thanks for CPAN by giving back.


© 2009-2015 David Golden All Rights Reserved