Perl Toolchain Summit 2018 report

I'm back from another incredible Perl Toolchain Summit – my first in a couple years. As usual, it was an amazing experience: getting dedicated time to work with incredible contributors on code at the heart of Perl's community and ecosystem.

This year, we were back where we started ten years ago: Oslo, Norway. Oslo is a beautiful city, and I spent a couple days wandering around recovering from jet lag while getting ready for the summit.

My main goal for the summit was to follow through on a decision we'd reached at a toolchain summit several years ago: automatically approving PAUSE ID requests instead of holding them for manual moderation. My plan was to implement reCAPTCHA v2 on the ID request page and automatically approve applications if it validated.

Making that happen required me to shave some yaks. A lot of yaks.

Day 1: Thursday

Back in 2015, Kenichi Ishigaki (charsbar) converted PAUSE to run on Plack instead of mod_perl, which made it much easier to run PAUSE locally on a laptop for development testing. Unfortunately, I discovered that the PAUSE README describing how to get a working local installation was out of date. So my first order of business was working out how to do it and updating the README. I parked myself at a table next to Andreas Koenig and Kenichi and had two experts ready to help.

My quest involved learning how to install mysqld and nginx via macports and configure them. After beating them into submission over a couple hours, I had a self-signed TLS reverse proxy running against the Plack-ified PAUSE code. I later found an obscure config option that let PAUSE run locally without TLS and was able to ditch nginx and test PAUSE directly with Plack.

Once I had PAUSE running locally it was straightforward to get the reCAPTCHA rendering on the page. Andreas asked me to protect it with a feature flag so we could turn it off if we had concerns, so I did that, too. In the waning bit of the day, I wrote the backend code for verifying reCAPTCHAs. But actually wiring it up into the website was going to have to wait for Friday.
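
For anyone curious what server-side verification involves, here's a minimal sketch using only core modules. It is not the code that went into PAUSE: the flag variable and subroutine names are made up for illustration, and the only outside fact it relies on is Google's documented siteverify endpoint, which takes "secret" and "response" form fields and returns JSON with a "success" key.

```perl
use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP qw(decode_json);

# Hypothetical feature flag; the real PAUSE config key isn't shown in this post.
our $RECAPTCHA_ENABLED = 1;

# Decide whether a siteverify response body reports success.
# Anything that isn't a JSON object counts as a failure.
sub recaptcha_response_ok {
    my ($body) = @_;
    my $data = eval { decode_json($body) };
    return 0 unless ref $data eq 'HASH';
    return $data->{success} ? 1 : 0;
}

# Ask the siteverify endpoint whether the token submitted with the form is valid.
# HTTP::Tiny needs IO::Socket::SSL installed to make HTTPS requests.
sub verify_recaptcha {
    my ( $secret, $token ) = @_;
    return 0 unless $RECAPTCHA_ENABLED && defined $token && length $token;
    my $res = HTTP::Tiny->new( timeout => 10 )->post_form(
        'https://www.google.com/recaptcha/api/siteverify',
        { secret => $secret, response => $token },
    );
    return 0 unless $res->{success};    # transport or HTTP-level failure
    return recaptcha_response_ok( $res->{content} );
}
```

Keeping the response parsing separate from the HTTP call means the decision logic can be tested without touching the network.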

Day 2: Friday

The problem with working on PAUSE code is that it's old... really old. It's so old, the code I needed to work on was in a directory called "pause_1999". Many of the pages are rendered, validated, and do post-form processing in single subroutines, each often hundreds of lines long. The HTML generation is not templated -- snippets of HTML are pushed onto an array to be joined later. The HTML generating code is frequently interspersed with database SQL calls.

I didn't want to try to wire up reCAPTCHA without refactoring the existing user registration into distinct, reusable units of work, so that took much of Friday. Take "Render HTML for submitted ID request"... and put it in a subroutine. Take "Send one time password email"... and put it in a subroutine. And so on, and so on. Eventually, just before the end of the day, I had all the pieces I needed and reCAPTCHA-validated user registration was running on my local PAUSE! I cleaned up my work in some rebases, got Ricardo Signes to code-review it and sent Andreas a pull request.

Towards the end of the day, Merijn Brand (Tux) said he had some available time to help out anyone who needed it, so I asked him to be a fresh set of eyes and try my README for setting up a local PAUSE web server. He promptly found several typos and thinkos, which I fixed up on Saturday.

As the day wound down, Ricardo and I discussed ideas for consolidating the business logic code for PAUSE module permissions management -- a project that would wind up being my second major deliverable from the toolchain summit.

Day 3: Saturday

While I was working on PAUSE reCAPTCHA, charsbar was nearing completion of a more ambitious project he started in 2017: converting PAUSE to run on top of the Mojolicious web framework. On Saturday, he and I discussed how to get my work into his branch... which largely turned out to be him taking my PR and just splicing it by hand into his work. Thank you, charsbar!

Andreas had some concerns about reCAPTCHA abuse, so I implemented a simple, server-side rate limiter. After a pre-set number of user registrations in a day, reCAPTCHA would be disabled and the legacy, manual moderation process would be used instead.
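
The idea is simple enough to sketch in a few lines. This is an illustration rather than the PAUSE implementation: the cap of 50 and the subroutine name are invented, and the counter lives in memory here, whereas a multi-process web server would have to keep it somewhere shared, such as the database.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical daily cap; the actual number used by PAUSE isn't given here.
our $MAX_AUTO_APPROVALS_PER_DAY = 50;

# In-memory counter, for illustration only.
my %approvals_on;    # "YYYY-MM-DD" (UTC) => approvals so far

# Returns true (and counts the approval) while under today's cap.
# Returns false to signal "fall back to manual moderation".
sub allow_auto_approval {
    my ($now) = @_;
    $now = time unless defined $now;
    my $day = strftime( '%Y-%m-%d', gmtime $now );
    return 0 if ( $approvals_on{$day} || 0 ) >= $MAX_AUTO_APPROVALS_PER_DAY;
    $approvals_on{$day}++;
    return 1;
}
```

Keying the counter by calendar day means it resets itself at midnight UTC without any cleanup job.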

In the existing PAUSE code, the approving PAUSE admin's ID was recorded in new user records. In an auto-approval world, that doesn't apply, so I created a dummy 'RECAPTCHA' PAUSE account to serve as the "approver" for such accounts.

At this point on Saturday, we were entering into the home stretch and everyone was hard at work trying to ensure they could finish what they'd started.

For me, I was ready to start one last project: the PAUSE permissions manager. The problem I was trying to solve was that the database code for module permissions checks and modification was in SQL statements scattered throughout the code base. We wanted to centralize that logic -- initially as a pure lift-out refactoring, so I created a class for it and began the painstaking process of lifting out each piece and testing each change.

Along the way, I discovered some subtle expectations around database handle management and localization of error handling. I based my code on a branch that Ricardo was working on to refactor state management across various PAUSE modules. So, in addition to the lift-out, I made sure that every database call was using the same, centralized handle management that Ricardo had put in place. That made some unexpected test failures go away.

I also made a startling discovery about PAUSE permissions error handling: in order to effect an idempotent insert (i.e. upsert-like logic), inserts were run with exceptions turned off and errors ignored, so that unique key constraint errors could be ignored. Of course, that silently ignores any other errors, too! While I preserved that logic in the lift-out, I've bookmarked it as an area for future work.
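
One possible direction for that future work, sketched here with no claim that it's what PAUSE will end up doing: keep exceptions on and swallow only the duplicate-key failure. The subroutine name is hypothetical, and the sketch assumes the insert dies on error (as a DBI handle with RaiseError does) with MySQL's "Duplicate entry ..." wording in the message.

```perl
use strict;
use warnings;

# Run an insert callback, ignoring only duplicate-key failures.
# Returns 1 if the row was inserted, 0 if it already existed,
# and rethrows any other error.
sub insert_unless_exists {
    my ($do_insert) = @_;
    my $ok = eval { $do_insert->(); 1 };
    return 1 if $ok;
    my $err = $@ || 'unknown error';
    return 0 if $err =~ /Duplicate entry/;    # assumed MySQL message text
    die $err;
}
```

Matching on message text is fragile; checking the driver's numeric error code would be sturdier, but that depends on the driver in use.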

Saturday night, the summit local organizer team, Salve Nilsen and Stig Palmquist, invited us to hang out at their heavily-graffitied hacker-space, hackeriet.no.

Day 4: Sunday

Most of Sunday was spent finishing up the PAUSE permissions refactor, which was uneventful, if dull. But the code was much more DRY afterwards, and I was happy to see it merged the same day.

Throughout the toolchain summit, I'd been applying some pull requests from my long backlog. On Sunday, since I didn't want to start any new major work for PAUSE, I tackled the backlog with intensity, shipping over half a dozen minor updates.

Over the whole summit, I shipped new versions of twelve modules: Capture::Tiny, Data::GUID::Any, DateTime::Tiny, Dist::Zilla::Plugin::BumpVersionAfterRelease, Dist::Zilla::Plugin::OSPrereqs, HTTP::Tiny::UA, Session::Storage::Secure, TAP::Harness::Restricted, Task::BeLike::DAGOLDEN, Tie::Handle::Offset, Time::Tiny, and Types::Path::Tiny.

Closing thoughts and thanks

I hadn't been to a toolchain summit in a couple years and being back was a great reminder of why it's so valuable, both to the community and to me personally.

For the community, having so many high-caliber people able to spend dedicated time on the infrastructure of Perl is a hugely effective way of getting things done and making the most of volunteer time. Having the right people in the room means that almost no question is too obscure to get an answer from at least one of the attendees.

For big projects, like PAUSE or MetaCPAN, having the key developers face-to-face also helps with high-bandwidth discussions about change. Changing these sites is risky, and being able to talk and plan F2F means decisions happen much faster than on email and IRC the rest of the year.

For me, personally, I felt much more energized about the Perl ecosystem and came out of the summit with renewed interest in contributing to the modernization of PAUSE.

Such a wonderful event would be impossible without the help of the organizers and the support of the Perl Toolchain Sponsors. Thank you very much to Salve, Stig, Philippe, Laurent, Neil, and NUUG Foundation, Teknologihuset, Booking.com, cPanel, FastMail, Elastic, ZipRecruiter, MaxMind, MongoDB, SureVoIP, Campus Explorer, Bytemark, Infinity Interactive, OpusVL, Eligo, Perl Services, and Oetiker+Partner.

Posted in perl programming, toolchain

Response to "Our Adventures in Logging"

This is a quick response to Our Adventures in Logging on blogs.perl.org, because the comment system there is broken and hateful.

The author had three proposals, and I'll comment on each one.

LAEP 1: Pass hashref as last argument of log functions on to supporting adapters

The idea of Log::Any is that it ought to work with any backend, without having to interrogate backends for capabilities, etc., because putting that kind of logic into logging slows things down and logging needs to be lightweight.

As you discovered, customizing the proxy is the right way to turn structured data into a string to send to the backend. The docs for Log::Any::Proxy give this example:

# format with String::Flogger instead of the default
use String::Flogger;
use Log::Any '$log', formatter => sub {
    my ($cat, $lvl, @args) = @_;
    String::Flogger::flog( @args );
};

If you look at the capabilities of String::Flogger, you'll see that you can throw hashrefs at it and get JSON out. If you don't like String::Flogger, you can put in your own formatter. Plus, by serializing during the formatting step, you remain compatible with all backends, whether terminal, file, or something more custom that throws data at ElasticSearch or whatever.
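
If you'd rather roll your own, a formatter along these lines would do it. This is a rough sketch using only core JSON::PP; it takes the same ($cat, $lvl, @args) arguments as the example above and is written as a standalone code reference so it can be tried without Log::Any installed.

```perl
use strict;
use warnings;
use JSON::PP;

# canonical() sorts hash keys so the output is stable from run to run.
my $json = JSON::PP->new->canonical;

# Plain strings pass through untouched; hash and array references
# are serialized to JSON. Undefined arguments become the string "undef".
my $formatter = sub {
    my ( $cat, $lvl, @args ) = @_;
    return join ' ', map {
          !defined $_                            ? 'undef'
        : ( ref $_ eq 'HASH' || ref $_ eq 'ARRAY' ) ? $json->encode($_)
        :                                          "$_"
    } @args;
};

# The same anonymous sub can be given inline as the "formatter" argument
# when loading Log::Any, as in the String::Flogger example.
print $formatter->( 'My::App', 6, 'user login', { id => 42, name => 'jane' } ), "\n";
```

Because the result is a plain string, every adapter sees the same thing, whichever backend is configured.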

LAEP 2: Expose an API to Log::Any for modules to add (localized) context data

I think this is a good idea, though perhaps not quite in the way described. The 'prefix' feature is conceptually similar. It's hard to do right now with a formatter, but not impossible.

Something along these lines might work, but dealing properly with scope is tricky.

our $context = {};

use JSON::MaybeXS;
use Log::Any '$log', formatter => sub {
    my ($cat, $lvl, $msg, $data) = @_;
    # $data is undef when only a message is logged, so fall back to {}
    return "$msg " . encode_json( { %$context, %{ $data || {} } } );
};

Getting better, general, context tracking into the proxy would be a good thing.
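
To illustrate the scoping point, here's one way a caller could add context for the duration of a single call using "local". It's a sketch under assumptions: format_with_context is a hypothetical stand-in for the formatter above (using core JSON::PP so it runs anywhere), and handle_request is an invented caller.

```perl
use strict;
use warnings;
use JSON::PP;

our $context = {};
my $json = JSON::PP->new->canonical;

# Stand-in for the formatter: merges the current context with per-call data.
sub format_with_context {
    my ( $msg, $data ) = @_;
    return "$msg " . $json->encode( { %$context, %{ $data || {} } } );
}

sub handle_request {
    my ($request_id) = @_;

    # "local" puts the old $context back when this sub returns or dies,
    # so anything called from here sees request_id and outer code does not.
    local $context = { %$context, request_id => $request_id };
    return format_with_context( 'handling', { step => 'start' } );
}

print handle_request(7), "\n";
print format_with_context('idle'), "\n";
```

The tricky part is that "local" follows the dynamic call stack. A callback that runs later, after handle_request has returned, sees the outer context again, and separate modules have to agree on which package variable holds the context.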

LAEP 3: Change the default Adapter from Null to Stderr

This already exists. See "Setting an alternate default logger":

use Log::Any '$log', default_adapter => 'Stderr';

Summary

I hope you find this useful feedback. I'm pleased to see people using Log::Any and happy to discuss. Please feel free to email if you'd like to get into more specific details.

Posted in perl programming

A discussion of DBIx-Class governance and future development

Acting in my capacity as an administrator for PAUSE, I've been mediating a dispute over the future disposition of primary permissions for the DBIx::Class namespace on CPAN. I recently posted a message to the mailing list for DBIx::Class titled "IMPORTANT: A discussion of DBIC governance and future development".

I am reprinting it in full below in the hope that doing so will help this message reach DBIx::Class users who are not on the mailing list. I encourage such users to read the message and join the mailing list to participate in the conversation and express their interests.

Subject: IMPORTANT: A discussion of DBIC governance and future development

Hello, DBIC community.

I apologize in advance for the length of this email, but I urge everyone that uses DBIC to read it fully as it relates to the future of this important module.

For those who don't know me, I'm DAGOLDEN on CPAN and I've joined this list in my capacity as a PAUSE [1] administrator.

For those on the list who aren't familiar with CPAN administration, PAUSE is the service that authors use to upload modules to CPAN. Among other functions, it generates the index that maps module names to downloadable tarballs – e.g. "DBIx::Class" to "RIBASUSHI/DBIx-Class-0.082840.tar.gz" on a CPAN mirror.

PAUSE also maintains a permissions model [2] for each module namespace with two levels: "primary maintainer" (also called "first come") and "co-maintainer" (aka "co-maint"). Primary maintainers can grant and revoke co-maint permissions. Both levels can upload tarballs to PAUSE, triggering an update to the index.

Over the past several weeks, I've been the PAUSE administrator selected to mediate a dispute over future disposition of primary permissions for the DBIx::Class namespace.

The dispute was triggered by Peter Rabbitson's "Traffic pattern changes ahead" [3] email to this list on September 6, in which he said:

I have also firmly selected who will be getting the DBIx::Class
namespace first-come, the transfer of which will also happen
somewhere around the end of September.

Because the identity of the new primary maintainer was neither disclosed to nor discussed with Matt Trout (the founder of the DBIC project, current co-maintainer and also PAUSE administrator) or other co-maintainers, several private conversations ensued between Matt, Peter and others about this declaration.

On September 15, Peter notified PAUSE administrators via the modules@perl.org mailing list of an "Upcoming PAUSE permissions dispute" [4]. Separately, Matt notified PAUSE administrators privately with his own concerns about a possible dispute (his email was later disclosed and I'll link to it later).

On September 21, I privately emailed all DBIC maintainers (CPAN authors ABRAXXA, ARODLAND, FREW, ILMARI, JROBINSON, MSTROUT, and RIBASUSHI) on behalf of PAUSE administrators with our collective view of how this dispute would be best resolved. Peter asked that any discussion be public, so I reposted it to the modules@perl.org mailing list as "Message from PAUSE Admins to DBIx::Class maintainers [resend]" [5]

I urge everyone to read that thread in full as well. For reference, it includes a copy [6] of Matt's previously private email to PAUSE administrators.

Importantly, the thread summarizes PAUSE administrators' position on the dispute, which I repost verbatim here:

  1. Given the importance of DBIC to the broader Perl community (i.e. way "upriver" <http://neilb.org/2015/04/20/river-of-cpan.html>), we’d like to see a more open discussion between existing maintainers about what happens next in terms of DBIC leadership and control of primary permissions.
  2. The best outcome from our perspective would be for a successor to be decided by consensus of existing maintainers.
  3. Given a dispute among maintainers, the only outcome that is absolutely unacceptable to PAUSE admins would be a unilateral permissions transfer decision.
  4. We really hope the DBIC maintainers and/or community can resolve this internally.

In the ensuing discussion, Peter disclosed additional details about his plans for the future of DBIC in the "Future plans" section of this email [7]:

I am still planning to wrap up the remaining pieces, including some
unannounced initiatives to get the project into the best shape possible
to survive its leaderlessness.

I am still planning to remove all co-maint perms and handover the
first-come to a yet-undisclosed person. Given no clear line of
succession, and the incredibly high stakes wrt compatibility, the only
responsible thing to do is to select a single spot of responsibility and
provide all possible support and infrastructure for a proper
project-freeze.

In another email [8], Peter suggested raising these issues explicitly on the DBIC mailing list:

As suggested in an earlier email: the PAUSE admins (as the only
legitimate concerned party at this point) would likely benefit having
this question asked in a wider forum ( the DBIC mailing list and/or
other channels ). Essentially someone has to trigger a "vote of no
confidence", otherwise this entire exchange is just a time consuming farce.

On behalf of the PAUSE administrators, we would therefore like to invite Peter to describe in more detail his plans for a "project freeze" and the role he envisions for a successor maintainer. We invite Matt, other co-maintainers, and the DBIC community at large to add their thoughts about the specifics of the plan or about the situation in general.

Given public and private discussions to date, we believe the DBIC community should consider questions such as:

  • How should the future governance of the DBIC project be decided?
  • Who should or shouldn't be involved in future governance?
  • Should the project be "frozen" or should development continue?
  • If "frozen", what specifically would a "freeze" entail? Would there be exceptions?
  • If not "frozen", what principles should govern development? (Cathedral vs Bazaar [9] and/or New Jersey Style vs MIT Style [10])

We believe these discussions, if had openly, honestly and constructively, will lead to the best resolution of this dispute for the DBIC community.

Thank you for reading this far, and I look forward to reading the community's views on these matters.

Sincerely,
David Golden, PAUSE Administrator

[1] http://pause.perl.org/
[2] http://perladvent.org/2013/2013-12-08.html
[3] http://lists.scsys.co.uk/pipermail/dbix-class/2016-September/012187.html
[4] http://www.nntp.perl.org/group/perl.modules/2016/09/msg96115.html
[5] http://www.nntp.perl.org/group/perl.modules/2016/09/msg96139.html
[6] http://www.nntp.perl.org/group/perl.modules/2016/10/msg96178.html
[7] http://www.nntp.perl.org/group/perl.modules/2016/10/msg96174.html
[8] http://www.nntp.perl.org/group/perl.modules/2016/10/msg96182.html
[9] https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar
[10] https://en.wikipedia.org/wiki/Worse_is_better

Posted in perl programming

Comparison of Class::Tiny and Object::Simple

Yuki Kimoto recently posted about the latest release of Object::Simple, billed as "the simplest class builder". Since I've also written a "simple" OO framework called Class::Tiny, I thought I'd point out similarities and differences.

(I'm not going to address Object::Simple's origins in, or differences from, Mojo::Base.)

Similarities

Single file, minimal dependencies

Both Object::Simple and Class::Tiny are single-file OO frameworks with no non-core dependencies on recent perls. According to the "sloccount" tool, Object::Simple is 98 lines. Class::Tiny is 135.

Class::Tiny does require some dependencies on older Perls for deep @ISA introspection and global destruction detection.

Accessor generation with lazy defaults

Both frameworks allow you to specify accessors and provide either scalar or code-reference defaults for them. Defaults are evaluated on first use. The underlying generated code is extraordinarily similar and accessor speeds are generally comparable (at least with Class-Tiny-1.05 which has some optimizations to remove scopes).
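
For readers who haven't looked inside either module, a lazy-default accessor boils down to something like the following. This is a hand-written approximation, not the code either framework actually generates; the Sketch::Lazy package and its attributes are invented for the example.

```perl
package Sketch::Lazy;
use strict;
use warnings;

our $default_calls = 0;    # counts how often the code default runs

my %DEFAULT = (
    name => 'anonymous',                            # scalar default
    tags => sub { $default_calls++; return [] },    # code-reference default
);

# Install one read-write accessor per attribute.
for my $attr ( keys %DEFAULT ) {
    my $default = $DEFAULT{$attr};
    no strict 'refs';
    *{ __PACKAGE__ . "::$attr" } = sub {
        my $self = shift;
        return $self->{$attr} = shift if @_;    # called with a value: set it
        # Lazy part: the default is only computed on the first read.
        $self->{$attr} = ref $default eq 'CODE' ? $default->($self) : $default
            unless exists $self->{$attr};
        return $self->{$attr};
    };
}

sub new { my $class = shift; return bless { @_ }, $class }

package main;

my $obj = Sketch::Lazy->new;    # no defaults evaluated yet
my $tags = $obj->tags;          # code default runs here, once
```

Nothing is stored in the object until an attribute is first read or set, which is what "evaluated on first use" means in practice.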

Read-write accessors

Both offer read-write accessors, which I think is the only sensible choice when providing only a single style.

Differences

Mutator return style

Class::Tiny mutators return the value just set, which is consistent with the values returned by accessors. Object::Simple mutators return the invocant, which allows chaining.
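
The difference is easiest to see side by side. The class below is hand-rolled purely to show the two return conventions; it uses neither module, and Sketch::Point is an invented name.

```perl
package Sketch::Point;
use strict;
use warnings;

sub new { return bless {}, shift }

# Class::Tiny style: the mutator returns the value just set.
sub x {
    my $self = shift;
    return $self->{x} = shift if @_;
    return $self->{x};
}

# Object::Simple style: the mutator returns the invocant, so calls chain.
sub y {
    my $self = shift;
    if (@_) { $self->{y} = shift; return $self }
    return $self->{y};
}

package main;

my $p    = Sketch::Point->new;
my $set  = $p->x(3);           # $set holds the value that was stored
my $same = $p->y(4)->y(5);     # each call hands back $p, so they chain
```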

BUILD/BUILDARGS/DEMOLISH

Class::Tiny supports the BUILD/BUILDARGS/DEMOLISH methods just like Moose and Moo do. Object::Simple does not.

Notably, Class::Tiny supports an interoperability convention that allows Moo or Moose classes to inherit from a Class::Tiny class without calling BUILD methods more than once.

Constructor speed

Because Class::Tiny does some extra validation, plus provides BUILD/BUILDARGS support, its constructor is about 3x slower than Object::Simple, which has a two-line constructor.

Extraneous methods in @ISA

Class::Tiny classes inherit from Class::Tiny::Object, which provides only new, BUILDALL and DESTROY methods. Object::Simple classes typically inherit from Object::Simple, which provides import, new, attr, class_attr and dual_attr methods.

Unknown constructor arguments

Class::Tiny ignores unknown attributes in constructor arguments (without error, just like Moose/Moo). Object::Simple will include them in the constructed object.

Subclassing

Class::Tiny relies on users to set inheritance with @ISA or base/super/parent pragmas. Object::Simple additionally offers an import flag "-base" which sets the superclass. If the superclass is not Object::Simple, the superclass is loaded.

Introspection

Class::Tiny provides a mechanism for getting a list of class attributes and default values for attributes. Object::Simple does not.

strict/warnings export

Object::Simple turns on strict and warnings in the caller when the "-base" flag is used. Class::Tiny does not.

Closing thoughts

Object::Simple is, indeed, simple. It's not much more than syntactic sugar for generating accessors with defaults.

That said, I think it's too simple. If you really need minimal overhead and maximum speed just bless a hash reference into a class and directly access the members. If you want minimalism and default values, you can get there with eager defaults like this:

sub new {
    my $class = shift;
    return bless { name => "Jane", data => {}, @_ }, $class;
}

Once you start subclassing, I think you'll want BUILD/DEMOLISH support to properly order construction and teardown and Object::Simple doesn't give it to you.

Even if you don't plan to subclass, is that something your downstream users might want to do? Providing BUILD/DEMOLISH support makes it easy for downstream users to have well-structured construction and teardown.

Yes, you can create custom constructors, but that defeats the syntactic simplicity of Object::Simple. Plus, if you have custom constructors, you'll need custom destructors and a mechanism for ensuring they get called in order. Very soon, you'll have re-invented the semantics of BUILD/DEMOLISH. So why not start with a framework that already provides that for you?
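
To make the ordering concern concrete, here's roughly what a hand-rolled base constructor has to do to call each class's BUILD from parent to child. It's a simplified sketch of the idea, not how Class::Tiny, Moo, or Moose implement it, and the Sketch::* packages are invented for the example.

```perl
package Sketch::Object;
use strict;
use warnings;
use mro ();

our @LOG;    # records the call order, for demonstration only

sub new {
    my ( $class, %args ) = @_;
    my $self = bless {%args}, $class;

    # Walk the inheritance chain from the least-derived class to the most,
    # calling the BUILD defined directly in each package, if there is one.
    no strict 'refs';
    for my $pkg ( reverse @{ mro::get_linear_isa($class) } ) {
        my $build = *{"${pkg}::BUILD"}{CODE} or next;
        $self->$build( \%args );
    }
    return $self;
}

package Sketch::Parent;
our @ISA = ('Sketch::Object');
sub BUILD { push @Sketch::Object::LOG, 'Parent' }

package Sketch::Child;
our @ISA = ('Sketch::Parent');
sub BUILD { push @Sketch::Object::LOG, 'Child' }

package main;

Sketch::Child->new;
print "@Sketch::Object::LOG\n";
```

Teardown needs the mirror image: a DESTROY that walks the same list in the opposite direction, calling each DEMOLISH from child to parent. That's already a fair amount of machinery to maintain by hand.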

I think Object::Simple fits a very narrow use-case: people who want lazy defaults, don't want to subclass and are willing to add a dependency to avoid some typing.

For general use, I still think Moo is the best all-around choice unless you know for sure that you need the introspection and meta-class hackery that Moose offers.

If Moo is too "heavy" for me for some project, I'll use Class::Tiny. If Class::Tiny is too "heavy" (?!?), then I'll just roll my own class and avoid the dependency entirely.

Admittedly, I'm biased, but I can't think of a situation where I'd actually use Object::Simple as it stands.

If Object::Simple added BUILD/DEMOLISH support, then it might be a decent alternative – a different flavor of simple class builder for those who like its particular API choices (e.g. mutator chaining). Until then, I think it's too niche to put in my toolbox.

Posted in perl programming

Stand up and be counted: Annual MongoDB Developer Survey

If you use Perl and MongoDB, I need your help. Every year, we put out a survey to developers to find out what languages they use, what features they need, what problems they have, and so on.

We have very few Perl responses. ☹️

Be an ally! Take the MongoDB Developer Experience Survey.

Posted in perl programming