Two weeks ago, I was invited to meta::hack v2, the second annual
MetaCPAN hackathon. As the primary maintainer of CPAN Testers, I went to
continue improving the integration of CPAN Testers data with MetaCPAN
and generally improve the performance of CPAN Testers to the benefit of
the entire Perl ecosystem.
Continue reading CPAN Testers at meta::hack v2...
Would you like to help CPAN Testers during meta::hack v2? Join us on IRC
in #cpantesters-discuss on irc.perl.org, join our mailing list on
lists.perl.org, or e-mail me directly at doug@preaction.me.
With meta::hack v2 only two weeks away, I’ve written down my todo list
for the hackathon. With another brand-new machine graciously provided by
ByteMark, who have been hosting CPAN Testers for years, this year’s
hackathon will involve more devops tasks to improve reliability and
stability of the various parts of the project.
The new server will host the CPAN Testers backend processes, which turn
the raw incoming data into the various reports used by the websites and
downstream systems. It will also be the new home for the CPAN and
BackPAN mirrors that CPAN Testers uses for data and provides to external
users as part of CPAN’s mirrors list.
Continue reading Help CPAN Testers During meta::hack v2...
This year I had one goal for CPAN Testers: Replace the current Metabase
API with a new API that does not write to Amazon SimpleDB. The current
high-availability database that raw incoming test reports are written to
is Amazon SimpleDB, behind an API called Metabase.
Metabase is a highly flexible data storage API designed to work with
massive, unstructured data sets and still allow for sane organization
and storage of data. Unfortunately, Amazon SimpleDB is as it says on the
tin: Simple. Worse, it's expensive: Like most Amazon services, it
charges for usage, so there's a huge incentive for CPAN Testers to use
it as little as possible (which made some of the code quite obscure).
So, I made a plan to excise the Metabase. Since we already cached every
raw test report locally in the CPAN Testers MySQL database, I planned to
write a new Metabase API that wrote directly to the cache, and then
adjust the backend processing to read from the cache. I spent the better
part of a month working through all the Metabase APIs, how the data was
stored in the database, and how to translate between a simple JSON
format and the serialized Metabase objects. However, some proper schema
design prevented me from finishing this project: A single NOT NULL
column could not be changed to allow nulls very easily, it being a 600GB
table. The one time where a well-designed schema was a bad thing!
But then Garu, author of cpanm-reporter and
CPAN::Testers::Common::Client, came up with an idea to make a new test
report format. These new reports would have to be stored in a new place,
and I discovered that MySQL had recently started building some rich JSON
tooling. Making a new JSON test report format and storing it in our new
high-availability MySQL cluster seemed like a perfect solution for
storing our raw test reports.
After a few weeks of discussion, I finally realized that it would be an
easier task to make a backwards-compatible Metabase API write to the new
test report MySQL table, even though it increased the amount of work
that needed to be done:
- Complete the new test report format schema (Garu)
- Write the new backwards-compatibility Metabase API (Me)
- Write a new test report processor that writes to the old Metabase
  cache tables (Joel Berger)
- Write a migration script from the old Metabase cache tables to the new
  test report JSON object (?)
With that plan, I headed for Lyon.
Continue reading 2017 Perl Toolchain Summit...
At tonight's Chicago Perl Mongers Office Hours, Ray came up with an
interesting problem. While testing all of CPAN for CPAN Testers, how do
you detect when a test is hanging and kill it before it takes down the
entire machine? How do you simply kill a test that is taking too long?
And how do you do it without having a wholly separate watchdog program?
Ray's using Parallel::ForkManager to execute testing jobs in parallel
across multiple Perl installs. There are a few ways we could implement
timeouts, including IPC::Run's timeout function, or the alarm Perl
built-in, but these must all be implemented in the child process. It'd
be nicer if we could use the parent process to watch its own children.
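The idea of a parent watching its own children can be sketched with
nothing but core Perl. This is a minimal illustration, not necessarily
the approach the full post takes, and it uses plain fork, waitpid with
WNOHANG, and kill rather than Parallel::ForkManager itself. The helper
name run_with_timeout and the job names are hypothetical.

```perl
use strict;
use warnings;
use POSIX qw( WNOHANG );
use Time::HiRes qw( sleep time );

# Hypothetical helper: run each job (a code ref) in a forked child and
# kill any child still running after $timeout seconds.
# Returns a hash ref of job name => 'pass', 'fail', or 'timeout'.
sub run_with_timeout {
    my ( $timeout, %jobs ) = @_;
    my ( %running, %result );

    for my $name ( sort keys %jobs ) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {
            # Child: run the job, then leave without running END blocks
            # or flushing buffers inherited from the parent
            POSIX::_exit( $jobs{$name}->() ? 0 : 1 );
        }
        # Parent: remember when this child must be finished
        $running{$pid} = { name => $name, deadline => time + $timeout };
    }

    # Parent: poll the children instead of blocking on any one of them
    while (%running) {
        for my $pid ( keys %running ) {
            my $job = $running{$pid};
            if ( waitpid( $pid, WNOHANG ) == $pid ) {
                # The child exited on its own; $? holds its status
                $result{ $job->{name} } = $? == 0 ? 'pass' : 'fail';
                delete $running{$pid};
            }
            elsif ( time > $job->{deadline} ) {
                # The child ran too long: kill it and reap it
                kill 'KILL', $pid;
                waitpid( $pid, 0 );
                $result{ $job->{name} } = 'timeout';
                delete $running{$pid};
            }
        }
        sleep 0.1 if %running;
    }
    return \%result;
}

my $result = run_with_timeout(
    1,
    quick => sub { 1 },
    hang  => sub { sleep 30; 1 },
);
print "$_: $result->{$_}\n" for sort keys %$result;
```

Run as a script, this prints "hang: timeout" and then "quick: pass"
after about a second. The polling loop trades a little latency (the
0.1 second sleep) for simplicity: no signal handlers and no separate
watchdog program, but the parent does have to keep its own table of
child PIDs and deadlines.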
Continue reading Timeout for Parallel::ForkManager...
As part of the MetaCPAN hackathon, meta::hack, I was invited to work on
the CPAN Testers integration. CPAN Testers is a community of CPAN users
who send in test reports for CPAN modules as they are uploaded. MetaCPAN
adds a summary of those test reports to every CPAN distribution to help
you determine which module you'd most like to use. For quite a few
months, this integration was broken, and the nature of the current
integration (a SQLite database) means it is not as generally useful as
it could be.
So, I decided that the best way to improve the CPAN Testers / MetaCPAN
integration was to build a new CPAN Testers API. This API uses the CPAN
Testers schema to expose CPAN Testers data using a JSON API. This API is
built using the Mojolicious web framework, and an OpenAPI specification
(using Mojolicious::Plugin::OpenAPI).
Continue reading CPAN Testers Has a New API...