The last time I wrote about bundling vendored modules in Perl, I was a little annoyed that the tools I was using did not let me bundle a specific dependency, even though they made it very easy to bundle all of them. But every time I’ve needed to do this, I’ve only needed to bundle some dependencies (the ones that I’m vendoring for whatever reason), and I’ve been happy to fetch all the rest from upstream public sources.

I said as much that time:

“If you only want to bundle some dependencies with your project you’ll have to manually remove all the ones you don’t care about (from both vendor/cache and your 02packages.details.txt.gz).”

As it turns out, doing this is much simpler than I thought.

Honorable mentions

Considering the TIMTOWTDI Perl ethos, it should come as no surprise that there are a bunch of tools already available on CPAN to do more or less what I was looking for. So before going through what I actually ended up using, let’s go through some of the alternatives I considered… and why I didn’t use them.

Carton

This was the tool I initially used in my previous post. It is great for what it does, but does not allow you to bundle specific distributions. As far as I’m concerned, it’s the best tool for this sort of thing… unless you’re trying to do what I wanted.

Pinto

I’ve heard about Pinto in a couple of places, and I really wanted to like it. However, it’s not only massive, it also, unfortunately, seems to be rather unmaintained.

I tried installing Pinto on a fresh Perl install and it meant installing a whopping 175 distributions. And looking through the list of dependencies, that hardly comes as a surprise: the list includes heavy hitters like Moose, LWP::UserAgent, DBIx::Class, Plack, and Starman.

It might be that it’s the right tool for some task, but not for this.

CPAN::Mini

CPAN::Mini is a fairly minimal distribution that provides a minicpan command line utility to create a local clone of CPAN. Pair this with something like CPAN::Mini::Inject and its mcpani and you can create either a CPAN mirror or a superset of CPAN.

If I wanted to create a DarkPAN that served all the code in CPAN as well as some extra distributions, these two are probably the tools I’d go with.

Still, this was a little more than what I had bargained for, since I only wanted those extra non-CPAN modules, skipping what could already be obtained upstream.

And the winner is…

CPAN::Repository

In the end, CPAN::Repository did everything I wanted, and with a fairly small number of dependencies [1]. The absolute minimal way to do this is with something like the following:

perl -MCPAN::Repository -E '
    CPAN::Repository->new( dir => "vendor/cache" )
        ->add_author_distribution(@ARGV)
' SOMEAUTHOR Some-Distro-1.337.tar.gz

This will generate the following file structure:

vendor/
└── cache
    ├── authors
    │  ├── 01mailrc.txt
    │  ├── 01mailrc.txt.gz
    │  └── id
    │      └── S
    │          └── SO
    │              └── SOMEAUTHOR
    │                  └── Some-Distro-1.337.tar.gz
    └── modules
        ├── 02packages.details.txt
        ├── 02packages.details.txt.gz
        └── 02STAMP

which can then be used following the same instructions as in my initial post.
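
For reference, the path under authors/id in that tree follows the usual CPAN convention: the first letter of the author ID, then its first two letters, then the full ID. A minimal sketch of that mapping is below. Note that author_path is a hypothetical helper written for illustration, not part of CPAN::Repository, which takes care of this on its own.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CPAN::Repository): build the relative
# path at which a distribution tarball sits inside a CPAN-style tree.
sub author_path {
    my ( $author, $tarball ) = @_;

    # CPAN author IDs are upper case
    $author = uc $author;

    return join '/',
        'authors', 'id',
        substr( $author, 0, 1 ),    # e.g. "S"
        substr( $author, 0, 2 ),    # e.g. "SO"
        $author,                    # e.g. "SOMEAUTHOR"
        $tarball;
}

print author_path( 'SOMEAUTHOR', 'Some-Distro-1.337.tar.gz' ), "\n";
# prints "authors/id/S/SO/SOMEAUTHOR/Some-Distro-1.337.tar.gz"
```

This is the same path that shows up in the index in 02packages.details.txt, minus the leading authors/id/ part.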

The use case

What motivated this was work on a codebase that depends on some CPAN distributions that are currently undermaintained.

When possible, the first choice is of course to move away from those dependencies to other alternatives. Maybe even alternatives maintained by yourself or the organisation you work for.

But in some cases, moving away from a dependency can be too much work, or you might be unwilling or unable to commit to maintaining a new distribution (and having two unmaintained distributions instead of one helps no one).

In this case, the solution I was investigating was whether we could create a fork of the distributions in question, and go through the regular release cycle… except without a publicly indexed release.

This has all the good things that we know come with a proper release: version numbers that can be tracked and maintained using your regular cpanfile; auditable change logs; etc. But all of that without the commitment to third parties.

To this end, I ended up writing a small script that would wrap around those CPAN::Repository calls and make things a little more extensible and robust.

The script takes a mandatory list of possible URIs (which can either point to external repositories on providers like GitHub or GitLab, or directly to distribution tarballs) and some options to define how that distribution should be indexed. It adds some checks to encourage the use of proper releases, but these are designed to be easily skippable in case you feel like your foot could use a bullet-hole.
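
To give a rough idea of the shape of such a wrapper, here is a minimal sketch of the option parsing and release checks only. This is not the actual script: the --author and --force options, the LOCAL default author, the sub names, and the rule that anything other than a versioned tarball needs --force are all assumptions made for illustration. Fetching the URIs and handing the tarballs to CPAN::Repository is left out.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical: decide whether a URI points at a tarball or a repository
sub classify_uri {
    my ($uri) = @_;
    return 'tarball'    if $uri =~ m{ \. (?: tar\.gz | tgz | zip ) \z }x;
    return 'repository' if $uri =~ m{ \A (?: https?:// | git@ ) }x;
    return 'unknown';
}

# Hypothetical: split "Name-1.337.tar.gz" into a name and a version.
# Returns nothing if the file name carries no version number.
sub parse_release {
    my ($uri) = @_;
    my ($file) = $uri =~ m{ ( [^/]+ ) \z }x or return;
    my ( $name, $version ) = $file =~ m{
        \A (.+) - v? ( \d+ (?: \. \d+ )* ) \. (?: tar\.gz | tgz | zip ) \z
    }x or return;
    return { file => $file, name => $name, version => $version };
}

# Parse options and check every URI, dying on anything that does not
# look like a proper release unless --force was given.
sub plan {
    my (@args) = @_;
    my %opt = ( author => 'LOCAL', force => 0 );    # assumed defaults
    GetOptionsFromArray( \@args, \%opt, 'author=s', 'force!' )
        or die "Bad options\n";
    die "Usage: $0 [--author ID] [--force] URI...\n" unless @args;

    my @plan;
    for my $uri (@args) {
        my $kind    = classify_uri($uri);
        my $release = $kind eq 'tarball' ? parse_release($uri) : undef;

        die "$uri does not look like a versioned release (--force to skip)\n"
            unless $release || $opt{force};

        push @plan, {
            uri     => $uri,
            kind    => $kind,
            author  => uc $opt{author},
            release => $release,
        };
    }
    return @plan;
}

# Fall back to a demo invocation when run without arguments
my @input = @ARGV
    ? @ARGV
    : ( '--author', 'someauthor', 'https://example.com/Some-Distro-1.337.tar.gz' );

for my $item ( plan(@input) ) {
    printf "%s: %s (%s)\n", $item->{author}, $item->{uri}, $item->{kind};
    # A real script would now fetch the URI and pass the resulting
    # tarball to CPAN::Repository's add_author_distribution.
}
```

Keeping the checks in one place, behind a single flag, is what makes them easy to skip when you really need to.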

And that’s pretty much it! All I need now is to convince the rest of my team that this is a reasonable way to go. But even if we end up not using this, I’m sure this will come in handy.

All in all, I’m glad I got a chance to take another stab at this problem, and that the solution was as simple as it was.

  1. Installed on a fresh Perl, as I did with Pinto, it pulled in 56 distributions in total. This is not tiny, but it’s also not the end of the world.