This year I had one goal for CPAN Testers:
Replace the current Metabase API with
a new API that did not write to Amazon SimpleDB. The current
high-availability database that raw incoming test reports are written to is
Amazon SimpleDB, behind an API called Metabase.
Metabase is a highly-flexible data
storage API designed to work with massive, unstructured data sets and
still allow for sane organization and storage of data. Unfortunately,
Amazon SimpleDB is as it says on the tin: Simple. Worse, it's expensive:
Like most Amazon services, it charges for usage, so there's a huge
incentive for CPAN Testers to use it as little as possible (which made
some of the code quite obscure).
So, I made a plan to excise the Metabase. Since we already cached every
raw test report locally in the CPAN Testers MySQL database, I planned to
write a new Metabase API that wrote directly to the cache, and then
adjust the backend processing to read from the cache. I spent the better
part of a month working through all the Metabase APIs, how the data was
stored in the database, and how to translate between a simple JSON
format and the serialized Metabase objects. However, some proper schema
design prevented me from finishing this project: A single
column could not easily be changed to allow nulls, since it sat in a 600GB
table. The one time where a well-designed schema was a bad thing!
But then Garu, author of
came up with an idea to make a new test report format. These new reports
would have to be stored in a new place, and I discovered that MySQL had
recently started building some rich JSON features. Creating
a new JSON test report format and storing it in our new
high-availability MySQL cluster seemed like a perfect solution for
storing our raw test reports.
After a few weeks of discussion, I finally realized that it would be an
easier task to make a backwards-compatible Metabase API write to the new
test report MySQL table, even though it increased the amount of work
that needed to be done:
- Complete the new test report format schema (Garu)
- Write the new backwards-compatibility Metabase API (Me)
- Write a new test report processor that writes to the old Metabase
cache tables (Joel Berger)
- Write a migration script from the old Metabase cache tables to the new
test report JSON object (?)
With that plan, I headed for Lyon.
Continue reading 2017 Perl Toolchain Summit...
At tonight's Chicago Perl Mongers Office
Hours, Ray came up
with an interesting problem. While testing all of CPAN for CPAN
Testers, how do you detect when a test is
hanging and kill it before it takes down the entire machine? How do you
simply kill a test that is taking too long? And how do you do it without
having a wholly separate watchdog program?
Ray is using Parallel::ForkManager
to execute testing jobs in parallel across multiple Perl installs. There
are a few ways we could implement timeouts, including
timeout function, or
built-in, but these must all be implemented in the child process. It'd
be nicer if we could use the parent process to watch its own children.
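To make the "parent watches its own children" idea concrete, here is a rough sketch. It is written in Python rather than Perl and does not use Parallel::ForkManager; the function name `run_with_timeout`, its parameters, and the polling interval are all made up for illustration. The parent records when each child started, polls without blocking, and kills any child that outlives its deadline.

```python
import subprocess
import time


def run_with_timeout(commands, timeout, poll_interval=0.05):
    """Start every command, then let the parent kill slow children.

    Hypothetical helper: returns a dict mapping each command's index to
    its exit code, or to the string "timeout" if the parent killed it.
    """
    running = {}
    for index, command in enumerate(commands):
        # Remember when each child started so the parent can time it.
        running[index] = (subprocess.Popen(command), time.monotonic())

    results = {}
    while running:
        for index, (child, started_at) in list(running.items()):
            if child.poll() is not None:
                # Child finished on its own; record its exit code.
                results[index] = child.returncode
                del running[index]
            elif time.monotonic() - started_at > timeout:
                # Child took too long: kill it and reap it.
                child.kill()
                child.wait()
                results[index] = "timeout"
                del running[index]
        if running:
            time.sleep(poll_interval)
    return results
```

Calling `run_with_timeout` with a list of argument lists and a timeout in seconds returns once every child has either exited or been killed. The point of the design is that the child processes need no timeout code of their own and no separate watchdog program is involved. Polling is the simplest way to show this; it is an assumption of the sketch, not necessarily what a real solution would do.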
Continue reading Timeout for Parallel::ForkManager...
As part of the MetaCPAN hackathon,
meta::hack, I was invited to work
on the CPAN Testers integration. CPAN Testers
is a community of CPAN users who send in test reports
for CPAN modules as they are uploaded. MetaCPAN
adds a summary of those test reports to every CPAN distribution to help
you determine which module you'd most like to use. For quite a few
months, this integration was broken, and the nature of the current
integration (a SQLite database) means it is not as generally useful as
it could be.
So, I decided that the best way to improve the CPAN Testers / MetaCPAN
integration was to build a new CPAN Testers
API. This API uses the CPAN Testers
schema to expose CPAN
Testers data using a JSON API. This API is built using the Mojolicious
web framework, and an OpenAPI
Continue reading CPAN Testers Has a New API...
'Twas a night before Christmas and on the ops floor
All the servers were humming behind the closed door
The app was deployed to the servers with care
In hopes that the customers soon would be there
When from out of the phone there arose such a clatter
I sprang out of my chair to see what was the matter
"The website is down!" said the boss with a shout
"We need to make money, so figure it out!"
I logged in to the server and looked all around
But the app had no logging; no reason was found
With no other choice, I called the developer
Who said "just restart it, I'm sure that'll fix 'er"
I ran the right service, up the app came
Only to come down again and again
If there but was a way to know what was wrong
I could fix it for sure, but no logging was found
Good logging is crucial for applications in production. In an emergency,
you will want it to be as easy as possible to track down problems when
they happen. With good logs you can ensure that minor bugs don't cause
major downtime and data loss problems. Good logs can help track down
security issues and can provide an auditable trail of changes to track
down who did what and when.
Log::Any is a lightweight, generic API built for interoperable
logging for CPAN modules. Much like
DBI allows interoperable database interfaces,
CHI allows interoperable caching
interfaces, and PSGI allows interoperable web
applications, Log::Any allows a CPAN module to produce logs that fit
into your environment whether you just want to see logs on your
terminal, you're using Log4perl
to directly send e-mail alerts to your operations team, or you're using
a local rsyslog daemon to transmit logs to an
Continue reading Yuletide Logging...
Last week, I attended meta::hack, the MetaCPAN hackathon
in Chicago. I'm the maintainer for CPAN Testers, the
central database for CPAN users to send in test reports on CPAN distributions
and one of MetaCPAN's data sources. I asked to join them so I could improve how
MetaCPAN consumes CPAN Testers data, and ensure the stability and reliability
of that consumption.
Here's a detailed log of what I was able to accomplish, and information on the
new development of CPAN Testers.
Continue reading meta::hack log...