This year I had one goal for CPAN Testers:
replace the current Metabase API with
a new API that did not write to Amazon SimpleDB. The current
high-availability database that raw incoming test reports are written to is
Amazon SimpleDB, behind an API called Metabase.
Metabase is a highly-flexible data
storage API designed to work with massive, unstructured data sets while
still allowing for sane organization and storage of the data. Unfortunately,
Amazon SimpleDB is exactly what it says on the tin: simple. Worse, it's
expensive: like most Amazon services, it charges for usage, so there's a huge
incentive for CPAN Testers to use it as little as possible (which made
some of the code quite obtuse).
So, I made a plan to excise the Metabase. Since we already cached every
raw test report locally in the CPAN Testers MySQL database, I planned to
write a new Metabase API that wrote directly to the cache, and then
adjust the backend processing to read from the cache. I spent the better
part of a month working through all the Metabase APIs, how the data was
stored in the database, and how to translate between a simple JSON
format and the serialized Metabase objects. However, proper schema
design prevented me from finishing this project: a single column in
a 600GB table could not easily be changed to allow NULLs.
The one time a well-designed schema was a bad thing!
But then Garu
came up with an idea for a new test report format. These new reports
would have to be stored in a new place, and I discovered that MySQL had
recently started building some rich JSON features. Creating
a new JSON test report format and storing it in our new
high-availability MySQL cluster seemed like a perfect solution for
our raw test reports.
After a few weeks of discussion, I finally realized that it would be an
easier task to make a backwards-compatible Metabase API write to the new
test report MySQL table, even though it increased the amount of work
that needed to be done:
- Complete the new test report format schema (Garu)
- Write the new backwards-compatibility Metabase API (Me)
- Write a new test report processor that writes to the old Metabase
cache tables (Joel Berger)
- Write a migration script from the old Metabase cache tables to the new
test report JSON object (?)
With that plan, I headed for Lyon.
Continue reading 2017 Perl Toolchain Summit...
I was linked to this article after a discussion that was triggered by
a Tweet: https://twitter.com/shadowcat_mst/status/852265380156510214
In this article, the author describes a group called "weird nerds",
later renamed "hackers", and goes through some of the reasons why this
group is rejecting new members of their community (namely "brogrammers"
and "geek feminists", a false equivalence if ever there was one).
As someone who fits the author's idea of a hacker (the classical
definition of hacker, not someone who breaks into computers), and yet
has never felt like part of the hacker community, I find a lot
in this article objectionable, but for now I'll comment on a couple of points.
Continue reading Nerds Rejecting Nerds...
At tonight's Chicago Perl Mongers Office
Hours, Ray came up
with an interesting problem. While testing all of CPAN for CPAN
Testers, how do you detect when a test is
hanging and kill it before it takes down the entire machine? How do you
simply kill a test that is taking too long? And how do you do it without
having a wholly separate watchdog program?
Ray uses Parallel::ForkManager
to execute testing jobs in parallel across multiple Perl installs. There
are a few ways we could implement timeouts, including a module's
timeout function, or Perl's alarm()
built-in, but these must all be implemented in the child process. It'd
be nicer if we could use the parent process to watch its own children.
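One way to sketch that parent-side watchdog is with plain fork and a non-blocking waitpid loop (a minimal illustration under my own assumptions, not necessarily the approach taken in the full post; the 2-second timeout and job list are invented for the example):

```perl
use strict;
use warnings;
use POSIX qw( WNOHANG );
use Time::HiRes qw( sleep time );

my $TIMEOUT = 2;           # made-up limit, in seconds
my @jobs    = ( 1, 10 );   # sleep times: the second job will exceed the limit

# Fork one child per job and record its start time in the parent
my %started;
for my $secs ( @jobs ) {
    my $pid = fork // die "fork failed: $!";
    if ( !$pid ) { sleep $secs; exit 0 }    # child: simulate a test run
    $started{ $pid } = time;
}

# Parent-side watchdog: reap finished children without blocking, and
# kill any child that has run longer than the timeout
my %killed;
while ( %started ) {
    for my $pid ( keys %started ) {
        if ( waitpid( $pid, WNOHANG ) > 0 ) {
            delete $started{ $pid };        # finished (or killed and reaped)
        }
        elsif ( time - $started{ $pid } > $TIMEOUT ) {
            kill KILL => $pid;              # too slow: terminate it
            $killed{ $pid } = 1;
        }
    }
    sleep 0.25;
}

my $num_killed = scalar keys %killed;
printf "killed %d of %d children\n", $num_killed, scalar @jobs;
```

The key design point is that the timeout lives entirely in the parent: the children need no alarm handlers or wrappers, so a hung test that ignores nothing but SIGKILL still gets cleaned up.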
Continue reading Timeout for Parallel::ForkManager...
Like all subjective decisions in technology, which log level to use is
the cause of much angry debate. Worse, different logging systems use
different levels: Log4j has 6 severity
levels, while Syslog has 8 severity
levels. While both
lists of log levels come with guidance as to which level to use when,
there's still enough ambiguity to cause confusion.
When choosing a log level, it's important to know how visible you want
the message to be, how big of a problem it is, and what you want the
user to do about it. With that in mind, this is the decision tree
I follow when choosing a log level:
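As a rough illustration of what such a decision tree can look like in code (my own generic sketch, not the article's actual tree; the criteria names are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical criteria: each question maps a property of the message
# to a common severity level, ordered from most to least severe.
sub choose_level {
    my ( %msg ) = @_;
    return 'fatal' if $msg{program_must_stop};   # unrecoverable; maximum visibility
    return 'error' if $msg{operation_failed};    # the user must act on it
    return 'warn'  if $msg{possible_problem};    # the user should investigate
    return 'info'  if $msg{normal_progress};     # visible by default, no action needed
    return 'debug';                              # for developers only
}

print choose_level( operation_failed => 1 ), "\n";    # prints "error"
```

Encoding the choice as a function like this keeps the decision in one place, so the whole codebase answers the "how visible, how severe, what should the user do" questions consistently.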
Continue reading Choosing a Log Level...
As part of the MetaCPAN hackathon,
meta::hack, I was invited to work
on the CPAN Testers integration. CPAN Testers
is a community of CPAN users who send in test reports
for CPAN modules as they are uploaded. MetaCPAN
adds a summary of those test reports to every CPAN distribution to help
you determine which module you'd most like to use. For quite a few
months, this integration had been broken, and the nature of the current
integration (a SQLite database) meant it was not as generally useful as
it could be.
So, I decided that the best way to improve the CPAN Testers / MetaCPAN
integration was to build a new CPAN Testers
API. This API uses the CPAN Testers
schema to expose CPAN
Testers data using a JSON API. The API is built using the Mojolicious
web framework and an OpenAPI specification.
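Consuming a JSON API like this from Perl is straightforward; here is a minimal sketch using only the core JSON::PP module (the payload fields below are invented for illustration, not the real API's schema):

```perl
use strict;
use warnings;
use JSON::PP qw( decode_json );

# A made-up example response body; the real API's fields may differ.
my $body   = '{"dist":"Example-Dist","version":"1.001","pass":10,"fail":1}';
my $report = decode_json( $body );

printf "%s %s: %d pass / %d fail\n",
    @{ $report }{qw( dist version pass fail )};
# prints "Example-Dist 1.001: 10 pass / 1 fail"
```

In a real client the body would come from an HTTP request to the API, but the decoding step is the same: a JSON object becomes a plain Perl hash reference.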
Continue reading CPAN Testers Has a New API...