Goodbye Path::Class, hello Path::Tiny

I like Path::Class, but it's clunky and slow. So I wrote Path::Tiny to scratch my itch.

It's smaller (roughly half the lines of code), comes in a single file, and is generally faster. Among other things, it has lots of handy UTF-8 input and output methods.

The downside is that it's less portable and less extensible, but let's be honest, most of us are developing only for Unix or Windows anyway. And when was the last time you subclassed Path::Class for something? I'll bet never. YAGNI.

Here's the synopsis:

use Path::Tiny;
 
# creating Path::Tiny objects
 
$dir = path("/tmp");
$foo = path("foo.txt");
 
$subdir = $dir->child("foo");
$bar = $subdir->child("bar.txt");
 
# reading files
 
$guts = $file->slurp;
$guts = $file->slurp_utf8;
 
@lines = $file->lines;
@lines = $file->lines_utf8;
 
$head = $file->lines( {count => 1} );
 
# writing files
 
$bar->spew( @data );
$bar->spew_utf8( @data );
 
# reading directories
 
for ( $dir->children ) { ... }
 
$iter = $dir->iterator;
while ( my $next = $iter->() ) { ... }

It does require a very new File::Spec that fixes some ugly, tricky bugs, but, otherwise, it's core only for any recent Perl.

Check it out!

Update: If you use Moose, there is also MooseX::Types::Path::Tiny.

Update 2: I didn't mention it before, but note that stringifying the Path::Tiny returns a (possibly cleaned up) copy of the original path.

Posted in perl programming

My second week of Dancer, now with queues and transactional email

A couple weeks ago, I wrote about my initial efforts with Dancer, Xslate and Bootstrap. Last week, I added the ability to send password reset emails. In the process, I've learned how to write Dancer plugins.

Designing a password reset system

Since I've never done it before, I decided to reinvent the wheel and make it work the way I think it should.

I broke the problem down into three big chunks:

  1. generate and store a random token
  2. send the token to a user's registered email address
  3. receive the token and prompt a password change

Then I made a couple additional architectural decisions.

First, I decided to make the token system generic, as I expect there will be other things I'll use tokens for, such as email address confirmation. Regardless of the action that happens when the token is received, the first two steps above are nearly identical. All I need to do is make sure that tokens have a type and can store whatever additional data is needed to complete the action.

Second, I decided to break email sending out of the web app itself. Instead, an email is generated within the app and dropped into an asynchronous message queue. A separate program monitors the queue and sends the emails. In addition to insulating the web app from latency from sending email, a message queue makes unit testing a lot easier. All I need to do is see if the web app dropped the right message into the queue. I don't have to mock up an email delivery system.
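To make the testing point concrete, here is a minimal sketch in plain Perl. The InMemoryQueue class and the request_password_reset function are hypothetical stand-ins invented for illustration; they are not part of the app or of any queue module. The idea is only that a test can inspect what landed in the queue rather than mock mail delivery.

```perl
use strict;
use warnings;

# Hypothetical in-memory queue; a real app would use a persistent backend
package InMemoryQueue;

sub new     { my ($class) = @_; return bless { msgs => [] }, $class }
sub add_msg { my ( $self, $msg ) = @_; push @{ $self->{msgs} }, $msg; return }
sub waiting { my ($self) = @_; return scalar @{ $self->{msgs} } }
sub peek    { my ($self) = @_; return $self->{msgs}[0] }

package main;

# Hypothetical app code: it only builds the message and enqueues it
sub request_password_reset {
    my ( $queue, $email, $token ) = @_;
    $queue->add_msg(
        {
            to      => $email,
            subject => 'Did you forget your password?',
            body    => "Reset link: /confirm/$token",
        }
    );
    return;
}

my $queue = InMemoryQueue->new;
request_password_reset( $queue, 'user@example.com', 'abc123' );

# A test only has to look at what landed in the queue
printf "%d message(s) queued for %s\n", $queue->waiting, $queue->peek->{to};
```

Swapping the in-memory object for the real queue should not change the app code, as long as both offer the same add_msg method.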

Creating the token model

I decided it was sufficient to have a token be a URL-friendly, base64-encoded random value associated with a username, a "type" (e.g. password reset), an expiration, and an arbitrary "value" field.

For what it's worth, here's the trivial code for generating a random value:

use MIME::Base64 qw/encode_base64url/;
use Data::Entropy::Algorithms qw/rand_bits/;

my $token = encode_base64url( rand_bits(192) );
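As a quick check of the size arithmetic: 192 bits is 24 octets, which base64url-encodes to exactly 32 characters with no padding. The sketch below uses 24 fixed octets rather than random ones (an assumption made so it runs with core modules only), so it shows the shape of a token, not a usable one.

```perl
use strict;
use warnings;
use MIME::Base64 qw/encode_base64url/;

# NOT random: 24 fixed octets, used only to check the size arithmetic
my $octets = pack "C*", 0 .. 23;
my $token  = encode_base64url($octets);

printf "%d octets -> %d characters\n", length($octets), length($token);
print "URL-safe\n" if $token =~ /\A[A-Za-z0-9_-]{32}\z/;
```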

One of the reasons for having a generic "value" field is that I'm using MongoDB for my data store, so any JSON-serializable data can go in there "for free". If I wasn't using a document-based data store, I'd have to think more carefully about the value field semantics, or I'd have to serialize to/from the field with JSON or Sereal or something like that.

Speaking of MongoDB, I've been pretty pleased with Mongoose as a MongoDB->Moose mapper and the related Dancer::Plugin::Mongoose. I can specify my model classes and their associated database connection parameters directly in the Dancer config.yml file. If I were using a relational database, I'd probably look into Dancer::Plugin::DBIC, instead.

Loose coupling

Since I didn't want to send the email directly from the application, I needed a message queue. Since I'm using MongoDB and have other (backend) reasons for using it for other message queues, I decided to use it here as well. Otherwise, I might have explored Amazon SQS, Gearman, or Redis. Generally, my approach is to use a small number of versatile tools that I can develop expertise in rather than spread my expertise across a giant toolbox. This, of course, is why Perl is my #1 tool.

I had already written MongoDBx::Queue, so that was done. What I needed was to get Dancer to use it. A Dancer plugin for it would need to do a few useful things:

  1. Gather config data for (one or more) queues
  2. Instantiate a singleton for the life of the app
  3. Extend the Dancer DSL to provide access.

Again, I prefer loose coupling and frameworks, so instead of writing Dancer::Plugin::MongoDBx::Queue, I wrote Dancer::Plugin::Queue, which is a generic queue interface. Then I wrote Dancer::Plugin::Queue::MongoDB to implement the generic mechanism using MongoDBx::Queue.

Other Dancers who might favor other message queue systems just need to write similar implementation plugins and then message queues become an interchangeable component, just like template systems and session management. Loose coupling for the win!

Here is a slightly simplified version of the resulting code:

# generate reset token
my $token = schema("token")->new(
  user => $user->username,
  type => 'p',            # password reset type
);
$token->save;

# queue the reset email
queue("mx_out")->add_msg(
  {
    to      => $user->email,
    from    => 'support@example.com',
    subject => 'Did you forget your password?',
    body    => template(
      'emails/password_reset',
      {
        username    => $user->username,
        token_url   => uri_for( '/confirm/' . $token->token ),
      },
    ),
  }
);

Once the reset email data goes into the queue, it waits for a separate worker process to retrieve the message and send it. At the moment, I'm using the Postmark email service to send my transactional emails. The worker is a pretty short Perl program that polls the message queue, retrieves new messages and hands them off via WWW::Postmark.

While I was at it...

When working on Dancer::Plugin::Queue, I realized what I was doing was similar to a lot of other plugins I had examined. In the case of D::P::Queue, I had the added step of creating a role to define the generic interface, but leaving that aside, a lot of plugins are doing this:

  1. Loading a class
  2. Loading some config options
  3. Creating a singleton

It's ridiculous to do that for any particular CPAN module you want to use within Dancer, so I wrote it generically as Dancer::Plugin::Adapter.

If I weren't committed to using WWW::Postmark via a message queue, this is how I could use it directly within a Dancer app with Dancer::Plugin::Adapter:

In the config.yml:

plugins:
  Adapter:
    postmark:
      class: WWW::Postmark
      options: POSTMARK_API_TEST

In the application:

use Dancer::Plugin::Adapter;

get '/send_email' => sub {
  eval {
    service("postmark")->send(
      from    => 'me@domain.tld',
      to      => 'you@domain.tld, them@domain.tld',
      subject => 'an email message',
      body    => "hi guys, what's up?"
    );
  };
  return $@ ? "Error: $@" : "Mail sent";
};

As long as the module needs only static data to initialize, Dancer::Plugin::Adapter does all the repetitive work. In my (not so) humble opinion, that makes a lot of useful CPAN modules trivial to use as singletons within Dancer. Enjoy!
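For the curious, the load/configure/cache pattern can be sketched in plain Perl. This is not Dancer::Plugin::Adapter's actual implementation; the %config hash is a hypothetical stand-in for the plugin section of config.yml, and Digest::SHA is used only because it ships with Perl.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the plugin section of config.yml
my %config = (
    sha => { class => 'Digest::SHA', options => 256 },
);

my %singletons;

sub service {
    my ($name) = @_;
    return $singletons{$name} ||= do {
        my $conf = $config{$name} or die "No adapter configured for '$name'\n";

        # 1. load the class
        ( my $file = "$conf->{class}.pm" ) =~ s{::}{/}g;
        require $file;

        # 2. turn the configured options into constructor arguments
        my $opts = $conf->{options};
        my @args =
            ref($opts) eq 'HASH'  ? %$opts
          : ref($opts) eq 'ARRAY' ? @$opts
          : defined $opts         ? ($opts)
          :                         ();

        # 3. create the object once; later calls reuse it
        $conf->{class}->new(@args);
    };
}

print service("sha")->add("hello")->hexdigest, "\n";
```

Every call to service("sha") returns the same object, which is the point of caching it.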

Pulling it together

Once I could generate and send the reset email, I needed a handler to respond to someone clicking on the reset link. Most of the work was doing some rudimentary validation on the submitted token -- ensuring it was valid, not expired, and so on.

I decided to treat a password reset token as a one-shot, password-equivalent-login and existing-password-revocation. I also track that the login happened via token, so that the password change form and logic can skip requiring the current password.

Here is a simplified version of that code:

get '/confirm/:token' => sub {
  unless ( params->{token} =~ /^[a-zA-Z0-9_=-]{32}$/ ) {
    return template 'error' => { error => "Invalid token" };
  }

  my $token = schema("token")->find_token( params->{token} );
  if ( $token ) {
    $token->delete; # one-shot token, so delete from database
  }
  else {
    return template 'error' => { error => "Token not found" };
  }
  
  my $user = schema("user")->find_user( $token->user );
  unless ($user) {
    return template 'error' => { error => "Token has invalid user" };
  }

  if ( time() > $token->expiration ) {
    return template 'error' => { error => "Token has expired" };
  }

  if ( $token->type eq 'p' ) { # password reset
    session user  => $user->username; # treat as logged in
    session token => 1;               # note they arrived via token
    $user->scramble_password;
    $user->save;
    redirect '/change_password';
  }
  else {
    return template 'error' => { error => "Token not recognized" };
  }
};

You can see how the token confirmation logic is nearly completely generic. I can add additional token types as needed, just with different logic for each.
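One way to keep that extension point tidy might be a dispatch table keyed on token type. The sketch below is plain Perl, not the app's code: tokens are simple hashes, the 'e' (email confirmation) type is hypothetical, and the handlers return strings where the real ones would set session data and redirect.

```perl
use strict;
use warnings;

# Map each token type to the action taken once the token is validated
my %on_confirm = (
    p => sub { my ($token) = @_; return "reset password for $token->{user}" },
    e => sub { my ($token) = @_; return "confirm email for $token->{user}" },
);

sub confirm_token {
    my ($token) = @_;
    return "Token has expired" if time() > $token->{expiration};
    my $handler = $on_confirm{ $token->{type} }
      or return "Token not recognized";
    return $handler->($token);
}

my $token = { user => 'alice', type => 'p', expiration => time() + 3600 };
print confirm_token($token), "\n";
```

Adding a token type then means adding one entry to the table, while the validation steps stay shared.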

Summary

After another week of work, the application continues to take shape. Here's what I got done:

  • Wrote and shipped a message queue plugin system and MongoDB implementation
  • Wrote and shipped a generic CPAN module adapter plugin
  • Added a password reset feature that generates reset tokens and emails users
  • Added a password reset token handler

It's still only the rough outline of an application, but Dancer feels less foreign and I'm about ready to get past basic user-account housekeeping and into the real feature set of the application.

My first week of Dancer, Xslate and Bootstrap

As I mentioned last week, I've started working with Dancer in earnest. This week, I climbed three learning curves at the same time: Dancer, Xslate and Bootstrap.

Skeleton before template

I started off with the incredibly-handy "Dancer Cowbell" template by A. Gordon, which is a complete skeleton app that brings together Dancer, Template::Toolkit, Bootstrap, and Font Awesome.

It comes with scripts to download Bootstrap, etc., from various sources. Unfortunately, the day I tried it, GitHub (the source of Bootstrap's custom downloads) was f*cked, so I wound up cobbling together the various bits from other Bootstrap experiments I'd downloaded in the past. Not fun, but not A. Gordon's fault.

Xslate is xcellent

My next challenge was converting it from Template::Toolkit to Xslate. I'd been intrigued by Xslate's design, particularly its speed. Xslate is fast, at least if you have a C compiler and a persistent application. It also has some interesting template composition capabilities, more akin to "roles" than just bare includes.

For example, here's what one of my view templates looks like:

:cascade wrapper with macros, header, footer

: around title -> {
  About
:}

: around pagestyle -> {
  : stylesheet("about")
:}

: around content -> {
    <h1>About my project</h1>
    <p>Blah blah blah...</p>
:}

The interesting thing is that the "cascade" command combines four other templates, the HTML wrapper, some header HTML, some footer HTML, and macro definitions (the "stylesheet()" function). Then my page template just defines things to replace at various places in the cascade. The "title" and "pagestyle" blocks get inserted into the HTML header section. The "pagestyle" is defined using a macro from the macro sheet (and maps to a stylesheet link with a static path to "about.css" for page-specific layout). And the "content" block gets dropped into the HTML body section in the wrapper.

What about that "header" -- where does that go? Check out the header template:

: before content -> {
  <div class="masthead row">
    <!-- Bootstrap navigation would go here -->
  </div>
  <hr>
:}

See the "before content" directive? It does what you would think: it composes the header before the content block in the wrapper. The footer uses an "after" directive the same way.

That's a long digression on Xslate, but I'm really happy with it so far. My view templates are stripped down to the bare minimum of information related to that view, yet can stuff information into any other spot in the composed template. It took a while to wrap my head around what that means.

I think about it like this: the wrapper template defines where things go, but doesn't actually pull them in the way a template would "include" another template traditionally. It's purely a framework of placeholders. Then, other templates customize those placeholders, saying if they go before, after or replace them.

If I want a different header for some page, I don't need a different wrapper to include it (which would duplicate things in the wrapper I might need to then keep in sync), I just need to compose a different header template in the page template cascade. Cool!

But what about Dancer?

Even before I fully wrapped my head around Xslate, I got down to business with Dancer. I spent a while reading various Dancer tutorials.

I decided to start with the static pages, and created simple routes with matching template names. This was trivially easy, the "hello world" of web development.

get '/' => sub {
  template 'index';
};

for my $p (qw/about faq tos privacy help/) {
  get "/$p" => eval qq|sub { template '$p' }|;
}
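A plain closure would probably do the same job as the string eval here, since each closure captures the $p of its own loop iteration. The sketch below shows that behaviour in isolation; the get and template subs are hypothetical stand-ins for Dancer's keywords so that it runs without Dancer installed.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for Dancer's `get` and `template` keywords
my %routes;
sub get      { my ( $path, $handler ) = @_; $routes{$path} = $handler; return }
sub template { my ($name) = @_; return "rendered $name" }

for my $p (qw/about faq tos privacy help/) {
    # Each closure captures the $p of its own iteration
    get "/$p" => sub { template($p) };
}

print $routes{'/faq'}->(), "\n";
```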

I intentionally did not use Dancer's 'auto_page' config option, which will match simple routes to matching templates in the view directory, because I don't want all the decomposed Xslate templates to become valid routes.

After copying and pasting some examples from the Bootstrap site, I finally got the static pages looking loosely like what I envisioned (albeit with boilerplate text), and I could click around the navigation and the whole thing started looking like a real site. Woohoo!

Users and passwords and hashes, oh my!

With the static shell done, I turned to the next logical task: user registration and login/logout. I already had a half-done user model from some backend prototyping, but it had no concept of password (hashed, of course), so that had to be added.

I spent a little while getting up to speed on bcrypt options on CPAN. There was an interesting Dancer plugin, Dancer::Plugin::Passphrase, that I considered, but ultimately I decided that the controller really shouldn't be handling password hashing and that it should live within the model.

I considered Authen::Passphrase and loved everything about it except for all its dependencies -- it gives you tools for every reasonable (and unreasonable) password hashing scheme, even if you only need one. Ugh.

So then I looked at how Dancer::Plugin::Passphrase was using Crypt::Eksblowfish::Bcrypt to see if I could just crib that... and then I decided I was spending way too much time in the weeds and chose to use Authen::Passphrase (BlowfishCrypt algorithm) after all.

Fold, spindle, mutilate

Forms and form building modules looked like another tarpit and I decided to go around. I worked up simple registration and login forms by hand using some Bootstrap sample code, skipped all validation for now (naughty!), and created some POST handlers. Eventually, I'll go back and look to form builders and validators, but it's easy to lose a day browsing CPAN and I didn't have a day to waste.

In case any other HTML newbie is out there doing things by hand, make sure your inputs have a "name" property if you want them submitted. For whatever reason the code samples omitted them and I spent a long while wishing I had hair to tear out in frustration when my form handling code didn't work. (Yes, a decent form builder would have saved me this embarrassment. Oh, well.)

With that fix, user registration worked.

Doing sessions the not-so-easy way

By definition, "logging in" means a change in state, so I needed sessions to manage state. And of course, Dancer had plenty of session managers to choose from. Dancer::Session::Cookie was the first I tried. I've been intrigued by the concept of maintaining state with clients instead of a server and wanted to give it a try. (No, I won't get drawn into arguments over that here. Trust that I've read the relevant papers about it and have a sense of pros and cons and caveats.)

It worked great, right up until I wanted to log out. I couldn't log out without manually deleting the cookie! WTF? Turns out it's a bug in how Dancer::Session::Cookie does session destruction. I filed a pull request with a trivial fix and then switched to the other quick-and-dirty developer-friendly option, keeping state with files. I picked Dancer::Session::JSON. Boom! Login and out working.

[After spelunking in the Dancer codebase, I also took some time to explore what the not-yet-released Dancer 2 code is doing for sessions, filed half a dozen issues to address problems I saw, and also filed a pull request to improve the abstraction model.]

Now that I had logged-in and logged-out states, I started tweaking the header templates to change the navigation depending on the user's logged-in or logged-out status, e.g. toggling between "login" and "logout" links and so on.

After that, it's still not much of a site, but it's definitely not "Hello World" anymore.

What's next

I'm now starting to work on email verification and password reset. Both of those have a similar process model requiring a token to be emailed to a user and taking action based on the resulting click. I signed up with Postmark and tested it out with WWW::Postmark. I'll use that for emailing the tokens. At first, it will probably be direct from the app, but eventually, I'll move that into an offline process.

Once that's done, I'll be turning to the "logged in" experience, connecting the site up to the backend data.

Summary

I'm feeling really good about how quickly things came together. After starting from almost zero, here's what I've got after my first week with Dancer, Xslate and Bootstrap:

  • Static pages
  • User registration
  • User login/logout (with navigation changes based on state)
  • Password change
  • Well-factored templates
  • Half a dozen or so issues filed or patches sent for things discovered along the way

See you on #dancer...

Wallflower no more

I've done a lot of things with Perl. I started with some log file analysis, went on to data munging and numerical analysis, learned to write modules, started contributing to the toolchain, packaged Strawberry Perl, got into CPAN Testers, and eventually started hacking on the Perl core.

One thing I've never done much of is web development.

That ends now.

I know a lot of the theory, but haven't really practiced.

I wrote a photo album site for family pictures years and years ago and it still works, but it's like "baby Perl" web dev -- not much more than a tutorial. Now I have real reasons to learn web development seriously for something substantial.

As implied in the title of this post, I've decided to use Dancer. Of the various frameworks I've looked at, it fits my brain best at first glance.

I tend to like minimalism, like my addiction to ::Tiny modules, and Dancer seems to be (while not Tiny) at least fairly minimalist. It also has a strong community, which is another thing I really value.

I expect future posts will be at least in part about my experiences with it.

See you on the dance floor...

Why PERL_UNICODE makes me SAD

When I first got a bug report that Capture::Tiny was breaking under PERL_UNICODE=SAD, I thought it would be an easy thing to fix. I was so wrong... I had no idea what a rabbit hole I was in for.

What the heck is PERL_UNICODE?

Unless you're American, you've probably heard of Unicode. Even if you're American, hopefully by now you've realized that a lot of the world uses languages that require more than the ASCII character set. And if you use Perl, you might be aware that Perl has remarkably good Unicode support. (See the Unicode Support Shootout slides.)

The PERL_UNICODE environment variable provides a default for the -C command line argument to the Perl interpreter, which can set UTF-8 translation layers on various filehandles (and command line arguments).

Specifically, PERL_UNICODE=SAD means that Perl should add the :utf8 layer to the Standard I/O handles, treat the Argument list as UTF-8, and make :utf8 the Default layer for any other handles opened as well.
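The letters are shorthand for a bitmask. As a small worked example, the sketch below adds up the flag values listed for -C in perlrun and also prints the ${^UNICODE} variable, which reflects whatever setting the current process was started with.

```perl
use strict;
use warnings;

# Flag values as documented for -C in perlrun
my %flag = ( I => 1, O => 2, E => 4, S => 7, i => 8, o => 16, D => 24, A => 32 );

my $mask = 0;
$mask |= $flag{$_} for split //, 'SAD';

printf "SAD is equivalent to -C%d\n", $mask;
printf "this process is running with \${^UNICODE} = %d\n", ${^UNICODE};
```

The first line prints 63, i.e. 7 (S) + 32 (A) + 24 (D).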

Is PERL_UNICODE a good idea?

Maybe. On the one hand, if you work in a world that is exclusively ASCII or Unicode I/O, then you can make a lot of input and output "just work".

That strength is also the weakness. PERL_UNICODE has a global effect!

Can you be sure that every module you use is ready to have :utf8 on any handles they open? Are you sure that any modules that reopen standard handles set them back correctly later? Turning on :utf8 globally is a huge bet, with odds that get worse the larger your dependency chain is.

[I can tell you from experience that almost no code on CPAN properly understands how to record the layers on a handle and reapply them to another. Capture::Tiny does, except when it's actually impossible, since tied handles can't report layers correctly.]
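To give a flavor of what recording and reapplying layers involves, here is a simplified sketch built on the core PerlIO::get_layers function. It is not Capture::Tiny's actual code; the copy_layers helper is hypothetical, and it ignores the hard cases such as tied handles.

```perl
use strict;
use warnings;
use File::Temp qw/tempfile/;

# Reapply the layers found on one handle to another (simplified sketch)
sub copy_layers {
    my ( $from, $to ) = @_;
    my %skip = ( unix => 1, perlio => 1 );    # base layers; also drops duplicates
    my @layers = grep { !$skip{$_}++ } PerlIO::get_layers( $from, output => 1 );
    binmode( $to, join( ":", ":raw", @layers ) ) or die "binmode failed: $!";
    return @layers;
}

my ( $src, $src_name ) = tempfile( UNLINK => 1 );
my ( $dst, $dst_name ) = tempfile( UNLINK => 1 );

binmode( $src, ":utf8" );    # the layer we want to carry over
copy_layers( $src, $dst );

print join( " ", PerlIO::get_layers( $dst, output => 1 ) ), "\n";
```

The exact list printed depends on the platform's default layers, but it should now include utf8.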

Capture::Tiny and PERL_UNICODE walk into a bar...

The bug report I got for Capture::Tiny regarded a failure in one particular test file, when PERL_UNICODE=SAD was set globally in the environment. As I dug into the bug report, it became clear that the bug was being triggered only under these conditions:

  • Perl prior to v5.12
  • PERL_UNICODE=D
  • STDIN closed
  • Capture::Tiny trying to tee() output

The good news was that newer Perls were unaffected. The bad news was that I couldn't figure out why it was happening.

Not only was it breaking under those conditions, it was weird.

Down the rabbit hole

One of the strange things happening was that a "no output" capture test was capturing the contents of the utf8.pm file in the Perl core. WTF? Something about PERL_UNICODE was loading utf8.pm, which winds up on file descriptor 0, confusing Capture::Tiny. Sticking require utf8; early in the test code "fixed" that problem.

Even after that fix, it looked like the test was leaking a filehandle. Something else was grabbing file descriptor 0 in the middle of a tee() and not letting go.

Given that leak, it wasn't just a matter of taking into account the global presence of :utf8 layers -- something more fundamental was going wrong.

Knowing when to punt

Reading Perl release notes and grepping through Perl core commit logs wasn't giving me any insight into what changed. Git bisection of the core turned into a huge headache. I quickly got to the point where I decided I was spending more time on this than the problem was worth.

Since the issue was a real corner case and only on very old Perls, I decided to document it as a known issue, bypass the failing tests under the triggering condition, and ship a new release to CPAN. Oh, well.

Lessons to learn from this

Be careful with global effects! It might seem like an easy fix, but you put your entire codebase at risk. It's much smarter to fix your code locally where you do I/O. Even the open pragma is a better choice than PERL_UNICODE, since you can limit the scope of change to the parts of your code that are actually doing I/O.
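As a sketch of what "locally" can look like, the example below puts an encoding layer on just the handles that need it and leaves everything else alone. The temp file and sample text are only for illustration.

```perl
use strict;
use warnings;
use File::Temp qw/tempfile/;

my ( $tmp_fh, $filename ) = tempfile( UNLINK => 1 );
close $tmp_fh;

my $text = "caf\x{e9}";    # 4 characters, one of them non-ASCII

# Encode on the way out, for this handle only
open my $out, '>:encoding(UTF-8)', $filename or die "Can't write $filename: $!";
print {$out} $text;
close $out;

# Decode on the way in, for this handle only
open my $in, '<:encoding(UTF-8)', $filename or die "Can't read $filename: $!";
my $decoded = <$in>;
close $in;

# A handle opened without the layer still sees raw octets
open my $raw, '<:raw', $filename or die "Can't read $filename: $!";
my $octets = <$raw>;
close $raw;

printf "%d characters, %d octets\n", length($decoded), length($octets);
```

Nothing outside these three open calls is affected, so modules that open their own handles keep whatever defaults they expect.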

The real insight I got from this is how important it is to test under production conditions. If you do use PERL_UNICODE=SAD in production, it's a very good idea to do your development and testing with that set as well. It will help you find modules that aren't happy with it.

Finally, this is a great example of why upgrading Perl is a good idea. Hundreds (thousands?) of bugs have been fixed since 5.8 or 5.10. The longer you wait to upgrade, the longer you'll have to suffer them.

Summary

  • PERL_UNICODE has a global effect, applying :utf8 layers to handles automatically
  • Global effects can have unexpected side effects
  • Avoid global effects if you can
  • If you must use global effects, test your dependencies under the same conditions
  • Upgrade your Perl

© 2009-2013 David Golden All Rights Reserved