How to trim PDF margins and edit metadata

I often download academic articles as PDFs to read later, and I keep running into two annoying problems:

  1. Huge margins make the PDF nearly unreadable on my Kindle Fire
  2. Title and authors are missing from PDF metadata, making them harder to find later via search

Today, I found an answer to the first and wrote an answer to the second.

For fixing huge margins, I found the free briss tool, which shows a composite image of all odd and even pages and lets you set a crop box around them.

briss in action cropping PDF margins

To fix the metadata, I did a bit of Googling for tools, then realized I could whip up a tool with Perl and PDF::API2 faster than hunting for an existing one.

This program reads the PDF's metadata, opens it in your editor as JSON, and then writes the edited result to a new PDF:

#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;

use JSON::MaybeXS;
use PDF::API2;
use Path::Tiny;

die "Usage: $0 <infile> <outfile>\n" unless @ARGV == 2;

my ( $infile, $outfile ) = @ARGV;

# Bail out early if the input file is missing or unreadable
unless ( -r $infile ) {
    die "Input file '$infile' can't be read\n";
}

my $pdf = PDF::API2->open($infile);
my $json = JSON::MaybeXS->new( utf8 => 1, pretty => 1 );

# Dump the current metadata to a temp file as pretty-printed JSON
my $temp = Path::Tiny->tempfile;
$temp->spew( $json->encode( { $pdf->info } ) );

if ( $ENV{EDITOR} ) {
    # system() returns 0 on success; the exit status is in $?, not $!
    system( $ENV{EDITOR}, $temp ) == 0
      or die "Error editing temp file (exit status $?)\n";
}
else {
    die "No EDITOR environment variable set.\n";
}

# Read the edited JSON back, apply it as the new metadata, and save a copy
$pdf->info( %{ $json->decode( $temp->slurp ) } );
$pdf->saveas($outfile);
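
To see the JSON round trip on its own, without a PDF or an editor, here is a minimal sketch. It makes several assumptions: the metadata values are made up for illustration, the core JSON::PP module stands in for JSON::MaybeXS so it runs on a stock Perl, and a pair of regex substitutions plays the part of the edit you would make by hand.

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;

use JSON::PP;    # core module; stands in for JSON::MaybeXS in this sketch

# Hypothetical metadata, shaped like the key/value list the script
# gets back from $pdf->info. Title and Author are empty, as in the
# problem described above.
my %info = (
    Title   => '',
    Author  => '',
    Creator => 'LaTeX',
);

# canonical sorts the keys so the output is stable between runs
my $json = JSON::PP->new->utf8->pretty->canonical;
my $text = $json->encode( \%info );

# This is roughly what would show up in the editor
print $text;

# Simulate the manual edit by filling in the two empty fields
$text =~ s/("Title"\s*:\s*)""/$1"An Example Paper"/;
$text =~ s/("Author"\s*:\s*)""/$1"A. Author"/;

# Decode the edited text back into a hash, as the script does before
# passing it to $pdf->info
my %edited = %{ $json->decode($text) };
say "$_ = $edited{$_}" for sort keys %edited;
```

If the edited text is not valid JSON (a missing comma or quote, say), decode dies with a parse error, so the real script would stop before saving anything.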

© 2009-2017 David Golden All Rights Reserved