Performance tuning of lintian, take 2

The other day, I wrote about our recent performance tuning in lintian.  Among other things, we reduced the memory usage by ~33%.  The effect was also reproducible on libreoffice (4.2.5-1 plus its 170-ish binaries, arch amd64), which started at ~515 MB and was reduced to ~342 MB.  So this is pretty great in its own right…

But at this point, I have seen what was in “Pandora’s box”.  By which I mean the two magical numbers: 1.7kB per file and 2.2kB per directory in the package (add 250-300 bytes per entry in binary packages).  This is before even looking at data from file(1), readelf, etc.  Just the raw index of the package.

Depending on your point of view, 1.7-2.2kB might not sound like a lot.  But for the lintian source with ~1 500 directories and ~3 300 non-directories, this sums up to about 6.57MB out of the (then) usage at 12.53MB.  With the recent changes, it dropped to about 1.05kB for files and 1.5kB for dirs.  But even then, the index is still 4.92MB (out of 8.48MB).

This raises the question: what do you get for 1.05kB in Perl?  The following is a dump of the fields and their sizes in Perl for a given entry:

lintian/vendors/ubuntu/main/data/changes-file/known-dists: 1077.00 B
  _path_info: 24.00 B
  date: 44.00 B
  group: 42.00 B
  name: 123.00 B
  owner: 42.00 B
  parent_dir: 24.00 B
  size: 42.00 B
  time: 42.00 B
  (overhead): 694.00 B

Here time, date, owner and group are fixed-size strings (at most 15 characters).  The size and _path_info fields are integers, and parent_dir is a reference (nulled).  Finally, name is a variable-length string.  Summed, the values take less than half of the total object size.  The remainder of ~700 bytes is just “overhead”.
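
As a quick sanity check of the dump, the split between field values and overhead can be recomputed.  This is only a sketch in Python (lintian itself is written in Perl); the numbers are copied from the dump and the variable names are made up for illustration.

```python
# Field sizes in bytes, copied from the dump above.
fields = {
    "_path_info": 24,
    "date": 44,
    "group": 42,
    "name": 123,
    "owner": 42,
    "parent_dir": 24,
    "size": 42,
    "time": 42,
}
total = 1077  # reported size of the whole entry

payload = sum(fields.values())  # bytes used by the values themselves
overhead = total - payload      # everything that is not a field value

print(f"payload:  {payload} B ({payload / total:.1%})")
print(f"overhead: {overhead} B ({overhead / total:.1%})")
```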

Time for another clean up:

  • The ownership fields are almost always “root/root” (0/0).  So let’s just omit them when they satisfy said assumption. [f627ef8]
    • This is especially true for source packages where lintian ignores the actual value and just uses “root/root”.
  • The Lintian::Path API has always had a “cop-out” on the size field for non-files and it happens to be 0 for these.  Let’s omit the field if the value was zero and save 0.17MB on lintian. [5cd2c2b]
    • Bonus: Turns out we can save 18 bytes per non-zero “size” by insisting on the value being an int.
  • Unsurprisingly, the date and time fields can trivially be merged into one.  In fact, that makes “time” redundant as nothing outside Lintian::Path used its value.  So say goodbye to “time” and good day to 0.36MB more memory. [f1a7826]
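
The idea behind these clean-ups can be sketched as follows.  This is a Python illustration, not the Perl code from the commits: the function names and the dict layout are assumptions made for the example.  Fields are only stored when they differ from the common default, and the accessors fill in the default on the way out.

```python
DEFAULT_OWNER = "root"
DEFAULT_GROUP = "root"


def make_entry(name, owner, group, size, date_time):
    """Build an index entry, storing only fields that differ from the defaults."""
    entry = {"name": name, "date_time": date_time}  # date and time merged into one field
    if owner != DEFAULT_OWNER:
        entry["owner"] = owner
    if group != DEFAULT_GROUP:
        entry["group"] = group
    size = int(size)  # insist on an integer rather than a string
    if size != 0:
        entry["size"] = size
    return entry


def owner_of(entry):
    """Return the owner, falling back to the default when it was omitted."""
    return entry.get("owner", DEFAULT_OWNER)


def group_of(entry):
    """Return the group, falling back to the default when it was omitted."""
    return entry.get("group", DEFAULT_GROUP)


def size_of(entry):
    """Return the size, which is 0 whenever the field was omitted."""
    return entry.get("size", 0)
```

The memory is saved because the common case (a root/root entry, or a non-file with size 0) never allocates the field at all, while callers see the same values as before.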

Which leaves us now with:

lintian/vendors/ubuntu/main/data/changes-file/known-dists: 698.00 B
  _path_info: 24.00 B
  date_time: 56.00 B
  name: 123.00 B
  parent_dir: 24.00 B
  size: 24.00 B
  (overhead): 447.00 B

Still a ~64% overhead, but at least we reduced the total size by 380 bytes (585 bytes for entries in binary packages).  With these changes, the memory used for the lintian source index is now down to 3.62MB.  This brings the total usage down to 7.01MB, which is a reduction to 56% of the original usage (a.k.a. “the-almost-but-not-quite-50%-reduction”).
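
For reference, the figures in this paragraph follow from the two dumps and the totals quoted earlier.  A small Python sketch of the arithmetic (variable names are illustrative only; the exact per-entry saving is 379 bytes, which the text rounds to 380):

```python
before_entry, after_entry = 1077, 698        # bytes per entry, from the two dumps
after_overhead = 447                         # bytes of overhead in the new dump
before_total_mb, after_total_mb = 12.53, 7.01

saved_per_entry = before_entry - after_entry      # bytes saved per entry
overhead_share = after_overhead / after_entry     # fraction of the entry that is overhead
remaining = after_total_mb / before_total_mb      # fraction of the original memory still used

print(f"saved per entry: {saved_per_entry} B")
print(f"overhead share:  {overhead_share:.0%}")
print(f"remaining usage: {remaining:.0%}")
```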

But at least the results also carried over to libreoffice, which is now down to 284.83 MB (55% of original).  The chromium-browser (source-only, version 32.0.1700.123-2) is down to 111.22MB from 179.44MB (62% of original, better results expected if processed with binaries).

 

In closing, Lintian 2.5.34 will use slightly less memory than 2.5.33.

 

Posted in Debian, Lintian | 1 Comment

Performance tuning of lintian

For quite a while, Lintian has been able to create performance logs (--perf-debug --perf-output perf.log) that help diagnose where lintian spends most of its runtime.  I decided to make lintian output these logs on lintian.debian.org to help us spot performance issues, though I have not been very good at analysing them regularly.

At the beginning of the month, I finally got around to looking into one of them.  My findings on IRC prompted Mattia Rizzolo to create a graph showing the accumulated runtime of each check/collection, measured in seconds.  With these findings, we set out to solve some of the issues.  This led to the following changes in 2.5.33 (in no particular order):

  • Increased buffer size in check/cruft.pm to reduce overhead [bc8b3e5] (S)
  • Reduced overhead in strings(1) extraction [b058fef] (P)
  • Reduced overhead in spell-checking [b824170] (S)
    • Also improves the performance of spellintian!
  • Removed a high overhead check that did not work [2c7b922] (P)

Legend: S: run single threaded (1:1 performance improvement).  P: run in parallel.

Overall, I doubt the changes will give a revolutionary change in speed, but they should improve the 3rd, 4th and 5th slowest parts in Lintian.

Beyond runtime performance, we have a few memory optimisations in the pipeline for Lintian 2.5.34:

  • Remove member from “non-dir” nodes in the Lintian path graph (2%) [6365635]
  • Remove two fields at the price of computing them as needed (~5%) [a696197 + 8dacc8e]
  • Merge 4 fields into 1 (~8%) [5d49cd2 + fb074e4]
  • Share some memory between package-based caches (18%) [ffc7174]

Combined, these 6 commits reduce memory consumption in caches by ~33% compared to 2.5.33, when lintian processes itself.  In absolute numbers, we are talking about a drop from 12.53MB to 8.48MB.  Mileage can certainly vary depending on the package (mscgen “only” saw an ~25% improvement).  Nevertheless, I was happy to list #715035 as being closed in 2.5.34. :)
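
The “remove fields at the price of computing them as needed” trade-off can be illustrated with a small Python sketch.  The post does not say which two fields were removed, so the basename and dirname properties below are purely hypothetical, as is the PathEntry class itself.

```python
import posixpath


class PathEntry:
    """Index entry that stores only the full name (hypothetical example)."""

    __slots__ = ("name",)  # no per-instance dict, only the one stored field

    def __init__(self, name):
        self.name = name

    @property
    def basename(self):
        # Derived on every access instead of being stored per entry.
        return posixpath.basename(self.name)

    @property
    def dirname(self):
        # Derived on every access instead of being stored per entry.
        return posixpath.dirname(self.name)
```

Each access costs a little CPU time, but the entry no longer pays for two extra strings, which adds up over thousands of entries.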

Posted in Debian, Lintian | 1 Comment

Introducing dak auto-decruft

Debian now has over 22 000 source packages and 45 500 binary packages.  To counter that, the FTP masters and I have created a dak tool to automatically remove packages from unstable!  This is also much more efficient than only removing them from testing! :)

 

The primary goal of the auto-decrufter is to relieve the FTP masters of a regular manual workflow.  Namely, the removal of the common cases of cruft, such as “Not Built from Source” (NBS) and “Newer Version In Unstable” (NVIU).  With the auto-decrufter in place, such cruft will be automatically removed when there are no reverse dependencies left on any architecture and nothing Build-Depends on it any more.
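
The removal condition can be sketched as a small predicate.  This is not dak’s actual code: the function name and the two lookup tables are assumptions made for this illustration.

```python
def is_auto_removable(package, reverse_deps, reverse_build_deps):
    """Return True if a piece of cruft may be removed automatically.

    reverse_deps: dict mapping architecture -> dict mapping package name ->
        set of packages that depend on it on that architecture (hypothetical layout).
    reverse_build_deps: dict mapping package name -> set of source packages
        that Build-Depend on it (hypothetical layout).
    """
    # Any remaining reverse dependency on any architecture blocks the removal.
    for rdeps in reverse_deps.values():
        if rdeps.get(package):
            return False
    # So does any remaining Build-Depends on the package.
    return not reverse_build_deps.get(package)
```

For example, a library left over from a transition stays in unstable as long as a single architecture still has a package depending on it.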

Despite the implication in the “opening” of this post, this will in fact not substantially reduce the number of packages in unstable. :) Nevertheless, it is still very useful for the FTP masters, the release team and packaging Debian contributors.

The reason the release team benefits greatly from this tool is that almost every transition generates one piece of “NBS”-cruft.  Said piece of cruft currently must be removed from unstable before the transition can progress into its final phase.  Until recently that removal has been 100% manual and done by the FTP masters.

The restrictions on the auto-decrufter mean that we will still need manual decrufts.  Notably, the release team will often complete transitions even when some reverse dependencies remain on non-release architectures.  Nevertheless, it is definitely an improvement.

 

Omelettes and eggs: As the old saying goes, “You cannot make an omelette without breaking eggs”.  Even less so when the only “test suite” is production.  So here are some of the “broken eggs” caused by the implementation of the auto-decrufter:

  • For about 30 minutes, “dak rm” (without --no-action) would unconditionally crash.
  • A broken dinstall when “dak auto-decruft” was run without “--dry-run” for the first time.
  • A boolean condition inversion causing removals to remove the “override” for partial removals (and retain it for “full” removals).
    • Side-effect, this broke Britney a couple of times because dak now produced some “unexpected” Packages files for unstable.
  • Not to mention the “single digit bug closure” bug.

Of these, the boolean inversion was no doubt the worst.  By the time we had it fixed, at least 50 (unique) binary packages had lost their “override”.  Fortunately, it was possible to locate these issues using a database query and they have now been fixed.

Before I write any more non-trivial patches for dak, I will probably invest some time in setting up a basic test framework for it.
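
To give an idea of what even a very basic test could have caught, here is a Python sketch of the override condition together with a regression test for it.  The function name is hypothetical and this is not dak’s real code; it only mirrors the behaviour described above (drop the override on full removals, keep it on partial ones).

```python
def should_remove_override(is_partial_removal):
    """Decide whether the override entry goes away together with the package."""
    # Intended behaviour: only a full removal drops the override.
    # The bug described above amounted to returning the opposite value.
    return not is_partial_removal


def test_override_handling():
    # A full removal must also drop the override...
    assert should_remove_override(is_partial_removal=False)
    # ...while a partial removal must keep it.
    assert not should_remove_override(is_partial_removal=True)


test_override_handling()
```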

 

Posted in Debian, Release-Team | Leave a comment

Intermission on the day of the Jessie release

During the “intermission” on the day of the Jessie release, Julien, Ivo, AJ and I spent some time improving Britney2.  Due to said intermission, we can proudly say that from the very first run for Stretch, Britney2 was:

  • running on Python3k.  Kudos to Julien and Ivo.
  • doing consistency checks of identical packages present in two or more suites.  Kudos to AJ.
    • Said checking is mostly useful for catching silly mistakes in our test suite.  A very welcome change, as data inconsistency plus hash randomisation (default in Python3k) caused some of the tests to fail sporadically.
    • Also, many thanks to AJ for providing a patch for #709460.  Sadly, I do not remember if we managed to merge that prior to the first Stretch run.
  • outputting some statistics about the “package graph” and some performance counters from its installability tester.
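
The consistency check can be sketched roughly as below.  This is an illustration only, not Britney2’s implementation: the function name and the data layout are assumptions.  Iterating in sorted order keeps the result stable regardless of hash randomisation.

```python
def find_inconsistencies(suites):
    """Report packages that appear in several suites with differing data.

    suites: dict mapping suite name -> dict mapping
        (package, version, architecture) -> metadata dict (hypothetical layout).
    Returns a list of (key, first_suite, other_suite) tuples.
    """
    seen = {}      # key -> (suite where first seen, metadata)
    problems = []
    # Sorted iteration gives a deterministic result, whatever the hash seed is.
    for suite_name in sorted(suites):
        for key in sorted(suites[suite_name]):
            metadata = suites[suite_name][key]
            if key not in seen:
                seen[key] = (suite_name, metadata)
                continue
            first_suite, first_metadata = seen[key]
            if first_metadata != metadata:
                problems.append((key, first_suite, suite_name))
    return problems
```

An identical (package, version, architecture) triple with different dependency fields in testing and unstable would show up as one entry in the returned list.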

The performance counters are mostly interesting for me when I mess with the installability tester.  A couple of backtrack-related numbers from the Britney run earlier today:

  • 77 078 times, Britney would create a “full restore point” and recurse.
    • In 10 of those cases, she would reject the guess, backtrack back to the restore point and move on to the next guess.
    • In the remaining 77 068 times, she would accept the candidate (and thereby solve the query).
      • NB: This number is not directly visible and has to be computed manually.  It is possible for Britney to do multiple “accept”-recursions for the same query.
  • 52 times, she would have exhausted all but one option.  In this case, she simply goes “all-in” and skips the restore point.
  • 54 618 times, she would accept the guess using a partial restore point without needing to recurse.
  • A (sadly) uncounted number of times, she would reject the guess using a partial restore point without needing to recurse.

Furthermore, in about 82% of the ~577 000 times Britney called “is_installable” in this run, the installability tester answered with a cached result.  I guess it was a trivial run. :)
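
A cache-hit counter of this kind can be sketched in a few lines of Python.  The class below is hypothetical and much simpler than Britney2’s real installability tester; in particular, it never invalidates cached answers.

```python
class CachingTester:
    """Wrap an installability check with a result cache and counters."""

    def __init__(self, check):
        self._check = check   # the (expensive) function doing the real work
        self._cache = {}      # package -> cached answer
        self.calls = 0
        self.cache_hits = 0

    def is_installable(self, package):
        self.calls += 1
        if package in self._cache:
            self.cache_hits += 1
            return self._cache[package]
        result = self._check(package)
        self._cache[package] = result
        return result

    @property
    def hit_rate(self):
        """Fraction of calls answered from the cache."""
        return self.cache_hits / self.calls if self.calls else 0.0
```

With the figures above, 82% of ~577 000 calls means roughly 473 000 answers came straight from the cache.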

Posted in Uncategorized | Leave a comment

The release of Debian Jessie from an RM’s PoV

It was quite an experience to partake in the Jessie release – and also a rather long “Saturday”.  This post is mostly a timeline of how I spent my release day doing the actual release.  I have glossed over some details – the post is long enough without these. :)

We started out at 8 (UTC) with a final “dinstall” run, which took nearly 2 hours.  It was going to take longer, but we decided to skip the synchronisation to “coccia.debian.org” (the server hosting the DD-accessible mirror of release.debian.org).

The release itself started with the FTP masters renaming the aliases of Squeeze, Wheezy and Jessie to oldoldstable, oldstable and stable respectively.   While they worked, the release team reviewed and double checked their work.  After an hour (~11), the FTP masters reported that the stable releases were ready for the final review and the SRMs signed the relevant “Release” files.

Then the FTP masters pushed the stable releases to our CD build server, where Steve McIntyre started building the installation images.  While Steve started with the CDs, the FTP masters and the release team continued with creating a suite for Stretch.  On the FTP/release side, we finished shortly before 12:30.  At this point, our last ETA from Steve suggested that the installation media would take another 11 and a half hours to complete.  We could have opened for mirror synchronisation then, but we decided to wait for the installation media.

At 12:30, there was a long “intermission” for the release team in the release process.  That was an excellent time to improve some of our tools, but that is for another post. :)

We slowly started to resume around 22:20, when we tried to figure out when to open for the mirror synchronisation to time it with the installation media.  We agreed to start the mirror sync at 23:00 despite the installation media not being completely done then.  They followed half an hour later, when Steve reported that the last CD was complete.

At this point, “all” that was left was to update the website and send out the press announcement.  Sadly, we were hit by some (minor) issues then.  First, I had underestimated the work involved in updating the website.  Secondly, we had no one online at the time to trigger an “out of band” rebuild of the website.  Steve and I spent an hour and a half solving website issues (like arm64 and ppc64el not being listed as a part of the release).  Unsurprisingly, I decided to expand our “release checklist” to be slightly more verbose on this particular topic.

My “Saturday” had passed its 16th hour, when I thought we had fixed all the website issues (of course, I would be wrong) and we would now just be waiting for an automatic rebuild.  I was tempted to just punt it and go to bed, when Paul Wise rejoined us at about 01:25.  He quickly got up to speed and offered to take care of the rest.  An offer I gratefully accepted, and I checked out 15 minutes later at 01:40 UTC.

That more or less covers the Jessie release day from my PoV.  After a bit of reflection inside the release team, we have found several points where we can improve the process.  This part certainly deserves its own post as well, which will also give us some time to flesh out some of the ideas a bit more. :)

Posted in Debian, Release-Team | 2 Comments

Jessie is coming on 2015-04-25

Indeed, we settled on a release date for Jessie – and pretty quickly too.  I sent out a poll on the 28th of March and yesterday, it was clear that the 25th of April was our release date. :)

With that said, we still have some items left that need to be done.

  • Finishing the release notes.  This is mostly pending myself and a few others.
  • Translation of the release-notes.  I sent out a heads up earlier today about what sections I believe to be done.
  • The d-i team has another release planned as well.
  • All the RC bugs you can manage to fix before the 18th of April. :)

Posted in Debian, Release-Team | 7 Comments

Imminent steep decline in RC bugs affecting Jessie – need more RC bug fixes

Earlier today, I posted a mail to debian-devel about how approximately 25 RC bugs affecting Jessie have been unblocked.  As mentioned, I planned to age some of them.  The expected result is that about 18 of them will migrate tonight and the remaining 7 of them will migrate tomorrow night.

After that, there are no more RC bugs waiting for the RT to unblock them!  The only remaining item on the list is cgmanager, for which we are requesting a t-p-u (maintainer already contacted about it).  If you want a release sooner, please have a look at the list of remaining RC bugs and/or start testing upgrades.

In other news, the glibc regression got fixed.  The new version of glibc has already been approved by us.  It is now waiting for the debian-installer team to test and approve it.

Posted in Debian, Release-Team | 1 Comment