2017 Perl Toolchain Summit

This year I had one goal for CPAN Testers: Replace the current Metabase API with a new API that did not write to Amazon SimpleDB. The current high-availability database that raw incoming test reports are written to is Amazon SimpleDB, behind an API called Metabase. Metabase is a highly flexible data storage API designed to work with massive, unstructured data sets and still allow for sane organization and storage of data. Unfortunately, Amazon SimpleDB is as it says on the tin: Simple. Worse, it's expensive: Like most Amazon services, it charges for usage, so there's a huge incentive for CPAN Testers to use it as little as possible (which made some of the code quite convoluted).

So, I made a plan to excise the Metabase. Since we already cached every raw test report locally in the CPAN Testers MySQL database, I planned to write a new Metabase API that wrote directly to the cache, and then adjust the backend processing to read from the cache. I spent the better part of a month working through all the Metabase APIs, how the data was stored in the database, and how to translate between a simple JSON format and the serialized Metabase objects. However, some proper schema design prevented me from finishing this project: A single NOT NULL column could not be changed to allow nulls very easily, it being a 600GB table. The one time where a well-designed schema was a bad thing!

But then Garu, author of cpanm-reporter and CPAN::Testers::Common::Client, came up with an idea to make a new test report format. These new reports would have to be stored in a new place, and I discovered that MySQL had recently started building some rich JSON tooling. Making a new JSON test report format and storing it in our new high-availability MySQL cluster seemed like a perfect solution for storing our raw test reports.

After a few weeks of discussion, I finally realized that it would be an easier task to make a backwards-compatible Metabase API write to the new test report MySQL table, even though it increased the amount of work that needed to be done:

  • Complete the new test report format schema (Garu)
  • Write the new backwards-compatibility Metabase API (Me)
  • Write a new test report processor that writes to the old Metabase cache tables (Joel Berger)
  • Write a migration script from the old Metabase cache tables to the new test report JSON object (?)

With that plan, I headed for Lyon.

Continue reading 2017 Perl Toolchain Summit...

Timeout for Parallel::ForkManager

At tonight's Chicago Perl Mongers Office Hours, Ray came up with an interesting problem. While testing all of CPAN for CPAN Testers, how do you detect when a test is hanging and kill it before it takes down the entire machine? How do you simply kill a test that is taking too long? And how do you do it without having a wholly separate watchdog program?

Ray's using Parallel::ForkManager to execute testing jobs in parallel across multiple Perl installs. There are a few ways we could implement timeouts, including IPC::Run's timeout function, or the alarm Perl built-in, but these must all be implemented in the child process. It'd be nicer if we could use the parent process to watch its own children.
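
As a rough illustration of the child-side approach, here is a minimal sketch using only core Perl: plain fork and the alarm built-in, not Parallel::ForkManager or IPC::Run. The helper name run_with_timeout is hypothetical. It assumes a POSIX system, where a pending alarm survives exec and an unhandled SIGALRM terminates the process.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run @cmd in a child process and report whether
# it finished on its own or was killed by its own alarm timer.
sub run_with_timeout {
    my ( $timeout, @cmd ) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ( $pid == 0 ) {
        # Child: set the timer, then replace this process with the command.
        # The pending alarm carries over to the new program.
        alarm $timeout;
        exec { $cmd[0] } @cmd or POSIX::_exit(127);
    }

    # Parent: wait for the child and check which signal (if any) ended it.
    waitpid $pid, 0;
    my $signal = $? & 127;
    return $signal == POSIX::SIGALRM() ? 'timeout' : 'finished';
}

# Example: a command that sleeps longer than its one-second budget.
print run_with_timeout( 1, $^X, '-e', 'sleep 10' ), "\n";
```

This only works if the command does not install its own SIGALRM handler or reset the timer, which is one more reason a parent-side watchdog is attractive.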

Continue reading Timeout for Parallel::ForkManager...

CPAN Testers Has a New API

As part of the MetaCPAN hackathon, meta::hack, I was invited to work on the CPAN Testers integration. CPAN Testers is a community of CPAN users who send in test reports for CPAN modules as they are uploaded. MetaCPAN adds a summary of those test reports to every CPAN distribution to help you determine which module you'd most like to use. For quite a few months, this integration was broken, and the nature of the current integration (a SQLite database) means it is not as generally useful as it could be.

So, I decided that the best way to improve the CPAN Testers / MetaCPAN integration was to build a new CPAN Testers API. This API uses the CPAN Testers schema to expose CPAN Testers data as JSON. It is built using the Mojolicious web framework and an OpenAPI specification (using Mojolicious::Plugin::OpenAPI).

Continue reading CPAN Testers Has a New API...

Yuletide Logging

'Twas a night before Christmas and on the ops floor
All the servers were humming behind the closed door
The app was deployed to the servers with care
In hopes that the customers soon would be there
When from out of the phone there arose such a clatter
I sprang out of my chair to see what was the matter
"The website is down!" said the boss with a shout
"We need to make money, so figure it out!"
I logged in to the server and looked all around
But the app had no logging; no reason was found
With no other choice, I called the developer
Who said "just restart it, I'm sure that'll fix 'er"
I ran the right service, up the app came
Only to come down again and again
If there but was a way to know what was wrong
I could fix it for sure, but no logging was found

Good logging is crucial for applications in production. In an emergency, you will want it to be as easy as possible to track down problems when they happen. With good logs you can ensure that minor bugs don't cause major downtime and data loss problems. Good logs can help track down security issues and can provide an auditable trail of changes to track down who did what and when.

Log::Any is a lightweight, generic API built for interoperable logging for CPAN modules. Much like DBI allows interoperable database interfaces, CHI allows interoperable caching interfaces, and PSGI allows interoperable web applications, Log::Any allows a CPAN module to produce logs that fit into your environment whether you just want to see logs on your terminal, you're using Log4perl to directly send e-mail alerts to your operations team, or you're using a local rsyslog daemon to transmit logs to an ElasticSearch instance via Logstash.

Continue reading Yuletide Logging...

meta::hack log

Last week, I attended meta::hack, the MetaCPAN hackathon in Chicago. I'm the maintainer for CPAN Testers, the central database for CPAN users to send in test reports on CPAN distributions and one of MetaCPAN's data sources. I asked to join them so I could improve how MetaCPAN consumes CPAN Testers data, and ensure the stability and reliability of that consumption.

Here's a detailed log of what I was able to accomplish, and information on the new development of CPAN Testers.

Continue reading meta::hack log...