Introducing dak auto-decruft

Debian now has over 22 000 source packages and 45 500 binary packages.  To counter that, the FTP masters and I have created a dak tool to automatically remove packages from unstable!  This is also much more efficient than only removing them from testing! :)

 

The primary goal of the auto-decrufter is to take a regular manual workflow off the FTP masters’ hands: the removal of the common cases of cruft, such as “Not Built from Source” (NBS) and “Newer Version In Unstable” (NVIU).  With the auto-decrufter in place, such cruft will be automatically removed when there are no reverse dependencies left on any architecture and nothing Build-Depends on it any more.
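The removal condition can be sketched roughly as follows.  This is a minimal Python illustration and not dak's actual code: the "Package" record and the "can_auto_remove" function are hypothetical, build dependencies are attached to the binary package record for brevity, and the real tool works against the archive database.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Package:
    """Hypothetical record for a package in unstable (not dak's data model)."""
    name: str
    # architecture -> names of the packages this one depends on there
    depends: dict[str, set[str]] = field(default_factory=dict)
    # names of the packages needed to build this package (simplification)
    build_depends: set[str] = field(default_factory=set)


def can_auto_remove(cruft: str, archive: list[Package]) -> bool:
    """Return True if nothing depends or build-depends on `cruft` any more."""
    for pkg in archive:
        if pkg.name == cruft:
            continue
        # A reverse dependency on any single architecture blocks the removal.
        if any(cruft in deps for deps in pkg.depends.values()):
            return False
        if cruft in pkg.build_depends:
            return False
    return True


archive = [
    Package("libfoo1"),  # NBS cruft left behind by a transition
    Package("libfoo2"),
    Package("app", depends={"amd64": {"libfoo2"}, "hurd-i386": {"libfoo1"}}),
]
print(can_auto_remove("libfoo1", archive))  # False: app still needs it on hurd-i386
```

In this toy archive, a single leftover reverse dependency on one architecture is enough to keep the cruft around, which is the kind of case that still needs a manual decision.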

Despite the implication in the “opening” of this post, this will in fact not substantially reduce the number of packages in unstable. :) Nevertheless, it is still very useful for the FTP masters, the release team and packaging Debian contributors.

The release team benefits greatly from this tool because almost every transition generates one piece of “NBS”-cruft.  Said piece of cruft currently must be removed from unstable before the transition can progress into its final phase.  Until recently that removal has been 100% manual and done by the FTP masters.

The restrictions on the auto-decrufter mean that we will still need manual decrufts. Notably, the release team will often complete transitions even when some reverse dependencies remain on non-release architectures.  Nevertheless, it is definitely an improvement.

 

Omelettes and eggs: As an old saying goes “You cannot make an omelette without breaking eggs”.  Less so when the only “test suite” is production.  So here are some of the “broken eggs” caused by implementation of the auto-decrufter:

  • For about 30 minutes, “dak rm” (without --no-action) would unconditionally crash.
  • A broken dinstall when “dak auto-decruft” was run without “--dry-run” for the first time.
  • A boolean condition inversion causing removals to remove the “override” for partial removals (and retain it for “full” removals).
    • As a side effect, this broke Britney a couple of times because dak now produced some “unexpected” Packages files for unstable.
  • Not to mention the “single digit bug closure” bug.

Of these, the boolean inversion was no doubt the worst.  By the time we had it fixed, at least 50 (unique) binary packages had lost their “override”.  Fortunately, it was possible to locate these issues using a database query and they have now been fixed.

Before I write any more non-trivial patches for dak, I will probably invest some time in setting up a basic test framework for it.

 


Intermission on the day of the Jessie release

During the “intermission” on the day of the Jessie release, Julien, Ivo, AJ and I spent some time improving Britney2.  Due to said intermission, we can proudly say that from the very first run for Stretch, Britney2 was:

  • running on Python3k.  Kudos to Julien and Ivo.
  • doing consistency checks of identical packages present in two or more suites.  Kudos to AJ.
    • Said checking is mostly useful for catching silly mistakes in our test suite.  A very welcome change, as data inconsistency plus hash randomisation (default in Python3k) caused some of the tests to fail sporadically.
    • Also, many thanks to AJ for providing a patch for #709460.  Sadly, I do not remember if we managed to merge that prior to the first Stretch run.
  • outputting some statistics about the “package graph” and some performance counters from its installability tester.

The performance counters are mostly interesting for me when I mess with the installability tester.  A couple of backtrack-related numbers from the Britney run earlier today:

  • 77 078 times, Britney would create a “full restore point” and recurse.
    • In 10 of those cases, she would reject the guess, backtrack back to the restore point and move on to the next guess.
    • In the remaining 77 068 times, she would accept the candidate (and thereby solve the query).
      • NB: This number is not directly visible and has to be computed manually.  It is possible for Britney to do multiple “accept”-recursions for the same query.
  • 52 times, she would have exhausted all but one option.  In this case, she simply goes “all-in” and skips the restore point.
  • 54 618 times, she would accept the guess using a partial restore point without needing to recurse.
  • A (sadly) uncounted number of times, she would reject the guess using a partial restore point without needing to recurse.
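To make the counters a bit more concrete, here is a toy Python sketch of guessing with full restore points.  It is not Britney's installability tester: the "solve" and "compatible" functions, the counter names and the example packages are all hypothetical, and partial restore points are not modelled at all.

```python
from collections import Counter

counters = Counter()


def compatible(guess, picked, conflicts):
    """True if `guess` conflicts with none of the packages picked so far."""
    return all(frozenset((guess, other)) not in conflicts for other in picked)


def solve(choices, conflicts, picked):
    """Pick one alternative per choice without conflicts; mutates `picked`."""
    if not choices:
        return True
    candidates, rest = choices[0], choices[1:]
    for index, guess in enumerate(candidates):
        if index == len(candidates) - 1:
            # All but one option exhausted: go "all-in" and skip the restore point.
            counters["all_in"] += 1
            restore_point = None
        else:
            counters["full_restore_point"] += 1
            restore_point = set(picked)
        if compatible(guess, picked, conflicts):
            picked.add(guess)
            if solve(rest, conflicts, picked):
                return True
        if restore_point is not None:
            # Reject the guess and backtrack to the restore point.
            counters["rejected"] += 1
            picked.clear()
            picked.update(restore_point)
    return False


choices = [["mta-a", "mta-b"], ["web-x", "web-y"]]
conflicts = {frozenset(("mta-a", "web-x")), frozenset(("mta-a", "web-y"))}
picked = set()
solved = solve(choices, conflicts, picked)

# As in the post, the number of accepted guesses is not counted directly;
# it is the number of restore points minus the number of rejections.
accepted = counters["full_restore_point"] - counters["rejected"]
print(solved, sorted(picked), dict(counters), accepted)
```

Note that the toy version leaves "picked" in a dirty state when an "all-in" guess fails; it relies on a restore point further up the recursion to clean up.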

Furthermore, in about 82% of the ~577 000 times Britney called “is_installable” in this run, the installability tester answered with a cached result.  I guess it was a trivial run. :)
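For illustration, a cache hit rate like this can be measured with a memoised check.  The sketch below is a toy and not Britney's code: only the name “is_installable” comes from the run above, while the "DEPENDS" table and the check itself are made up (and would not cope with dependency cycles).

```python
from functools import lru_cache

# Toy dependency data (hypothetical): package -> packages it depends on.
DEPENDS = {"app": ("libfoo2",), "libfoo2": (), "tool": ("libmissing",)}


@lru_cache(maxsize=None)
def is_installable(package: str) -> bool:
    """Toy check: installable if the package and all its dependencies exist."""
    if package not in DEPENDS:
        return False
    return all(is_installable(dep) for dep in DEPENDS[package])


for query in ["app", "tool", "app", "libfoo2", "app"]:
    is_installable(query)

info = is_installable.cache_info()
total = info.hits + info.misses
print(f"{info.hits} of {total} calls were answered from the cache ({info.hits / total:.0%})")

# The same arithmetic for the run above: 82% of roughly 577 000 calls.
cached_calls = round(0.82 * 577_000)
print(cached_calls)  # 473140, i.e. about 473 000 cached answers
```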


The release of Debian Jessie from an RM’s PoV

It was quite an experience to partake in the Jessie release – and also a rather long “Saturday”.  This post is mostly a timeline of how I spent my release day doing the actual release.  I have glossed over some details – the post is long enough without these. :)

We started out at 8 (UTC) with a final “dinstall” run, which took nearly 2 hours.  It was going to take longer, but we decided to skip the synchronisation to “coccia.debian.org” (the server hosting the DD-accessible mirror of release.debian.org).

The release itself started with the FTP masters renaming the aliases of Squeeze, Wheezy and Jessie to oldoldstable, oldstable and stable respectively.  While they worked, the release team reviewed and double-checked their work.  After an hour (around 11:00), the FTP masters reported that the stable releases were ready for the final review and the SRMs signed the relevant “Release” files.

Then the FTP masters pushed the stable releases to our CD build server, where Steve McIntyre started building the installation images.  While Steve started with the CDs, the FTP masters and the release team continued with creating a suite for Stretch.  On the FTP/release side, we finished shortly before 12:30.  At this point, our last ETA from Steve suggested that the installation media would take another 11 and a half hours to complete.  We could have opened for mirror synchronisation then, but we decided to wait for the installation media.

At 12:30, there was a long “intermission” for the release team in the release process.  That was an excellent time to improve some of our tools, but that is for another post. :)

We slowly started to resume around 22:20, when we tried to figure out when to open for the mirror synchronisation so that it would coincide with the installation media.  We agreed to start the mirror sync at 23:00 despite the installation media not being completely done then.  The media followed half an hour later, when Steve reported that the last CD was complete.

At this point, “all” that was left was to update the website and send out the press announcement.  Sadly, we were hit by some (minor) issues then.  First, I had underestimated the work involved in updating the website. Secondly, we had no one online at the time to trigger an “out of band” rebuild of the website.  Steve and I spent an hour and a half solving website issues (like arm64 and ppc64el not being listed as a part of the release).  Unsurprisingly, I decided to expand our “release checklist” to be slightly more verbose on this particular topic.

My “Saturday” had passed its 16th hour, when I thought we had fixed all the website issues (of course, I would be wrong) and we would now just be waiting for an automatic rebuild.  I was tempted to just punt it and go to bed, when Paul Wise rejoined us at about 01:25.  He quickly got up to speed and offered to take care of the rest.  I gratefully accepted that offer and checked out 15 minutes later at 01:40 UTC.

That more or less covers the Jessie release day from my PoV.  After a bit of reflection inside the release team, we have found several points where we can improve the process.  This part certainly deserves its own post as well, which will also give us some time to flesh out some of the ideas a bit more. :)


Jessie is coming on 2015-04-25

Indeed, we settled on a release date for Jessie – and pretty quickly too.  I sent out a poll on the 28th of March and yesterday, it was clear that the 25th of April was our release date. :)

With that said, we still have some items left that need to be done.

  • Finishing the release notes.  This is mostly pending myself and a few others.
  • Translation of the release-notes.  I sent out a heads up earlier today about what sections I believe to be done.
  • The d-i team has another release planned as well.
  • All the RC bugs you can manage to fix before the 18th of April. :)

Imminent steep decline in RC bugs affecting Jessie – need more RC bug fixes

Earlier today, I posted a mail to debian-devel about how approximately 25 RC bugs affecting Jessie have been unblocked.  As mentioned, I planned to age some of them.  The expected result is that about 18 of them will migrate tonight and the remaining 7 of them will migrate tomorrow night.

After that, there are no more RC bugs waiting for the RT to unblock them!  The only remaining item on the list is cgmanager, for which we are requesting a t-p-u (maintainer already contacted about it).  If you want a release sooner, please have a look at the list of remaining RC bugs and/or start testing upgrades.

In other news, the glibc regression got fixed.  The new version of glibc has already been approved by us.  It is now waiting for the debian-installer team to test and approve it.


Partial rewrite of lintian’s reporting setup

I had the mixed pleasure of doing a partial rewrite of lintian’s reporting framework.  It started as a problem with generating the graphs, which turned out to be “not enough memory”. On the plus side, I am actually quite pleased with the end result.  I managed to “scope-creep” myself quite a bit and I ended up getting rid of a lot of old issues.

The major changes in summary:

  • A lot of logic was moved out of harness, meaning it is now closer to becoming a simple “dumb” task scheduler.  With the logic being moved out in separate processes, harness now hogs vastly less memory that I cannot convince perl to release to the OS.  On lilburn.debian.org “vastly less” is on the order of reducing “700ish MB” to “32 MB”.
  • All important metadata was moved into the “harness state-cache”, which is a simple YAML file. This means that “Lintian laboratory” is no longer a data store. This change causes a lot of very positive side effects.
  • With all metadata now stored in a single file, we can now do atomic updates of the data store. That said, this change itself does not enable us to run multiple lintian instances in parallel.
  • As the lintian laboratory is no longer a data store, we can now do our processing in “throw away laboratories” like the regular lintian user does.  As the permanent laboratory is the primary source of failure, this removes an entire class of possible problems.
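The atomic update of a single-file data store can be sketched as “write a temporary file, then rename it over the old one”.  The Python below is only an illustration under stated assumptions: the real harness is written in Perl and stores YAML, whereas this sketch uses JSON to stay within the standard library, and the "write_state_cache" function and the entry layout are hypothetical.

```python
import json
import os
import tempfile


def write_state_cache(path: str, state: dict) -> None:
    """Replace the state file at `path` so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory, so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".state-cache-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        os.unlink(tmp_path)  # do not leave the temporary file behind
        raise


with tempfile.TemporaryDirectory() as workdir:
    cache = os.path.join(workdir, "state-cache")
    write_state_cache(cache, {"foo_1.0": {"out-of-date": True}})
    write_state_cache(cache, {"foo_1.0": {"out-of-date": False}})
    with open(cache, encoding="utf-8") as handle:
        final_state = json.load(handle)
    leftovers = os.listdir(workdir)

print(final_state, leftovers)
```

As noted in the list above, this only protects a single writer against leaving a broken file; two writers running in parallel could still overwrite each other's changes.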

There are also some nice minor “features”:

  • Packages can now be “up to date” in the generated reports.  Previously, they would always be listed as “out of date” even if they were up to date.  This is the only end user/website-visitor visible change in all of this (besides the graphs now working again \o/).
  • The size of the harness work list is no longer based on the number of changes to the archive.
  • The size of the harness work list can now be changed with a command line option and is no longer hard coded to 1024.  However, the “time limit” remains hard coded for now.
  • The “full run” (and “clean run”) now simply marks everything “out-of-date” and processes its (new) backlog over the next (many) harness runs.  Accordingly, a full-run no longer causes lintian to run 5-7 days on lilburn.d.o before getting an update to the website.  Instead we now get incremental updates.
  • The “harness.log” now features status updates from lintian as they happen with “processed X successfully” or “error processing Y” plus a little wall time benchmark.  With this little feature I filed no less than 3 bugs against lintian – 2 of which are fixed in git.  The last remains unfixed but can only be triggered in Debian stable.
  • It is now possible with throw-away labs to terminate the lintian part of a reporting run early with minimal lost processing.  Since the lintian-harness is regularly fed status updates from lintian, we can now mark successfully completed entries as done even if lintian does not complete its work list.  Caveat: There may be minor inaccuracies in the generated report for the particular package lintian was processing when it was interrupted.  This will fix itself when the package is reprocessed.
  • It is now vastly easier to collect new metadata to be used in the reports.  Previously, they had to be included in the laboratory and extracted from there.  Now, we just have to fit it into a YAML file.  In fact, I have been considering adding the “wall time” and making a “top X slowest” page.
  • It is now possible to generate the html pages with only a “state-cache” and the “lintian.log” file.  Previously, it also required a populated lintian laboratory.
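The interplay between the bounded work list and the status updates can be sketched like this.  It is a Python illustration only (the harness itself is Perl): the message wording is taken from the “harness.log” examples above, but the "pick_work_list" and "apply_status_updates" functions and the state layout are hypothetical.

```python
def pick_work_list(state: dict, limit: int) -> list:
    """Select at most `limit` out-of-date entries for this harness run."""
    backlog = sorted(name for name, entry in state.items() if entry["out-of-date"])
    return backlog[:limit]


def apply_status_updates(state: dict, lines) -> None:
    """Update the state as status lines arrive, one entry at a time."""
    for line in lines:
        line = line.strip()
        if line.startswith("processed ") and line.endswith(" successfully"):
            name = line[len("processed "):-len(" successfully")]
            state[name]["out-of-date"] = False
        elif line.startswith("error processing "):
            name = line[len("error processing "):]
            state[name]["last-error"] = True  # stays out-of-date for a retry


# A "full run" simply marks everything out-of-date.
state = {name: {"out-of-date": True} for name in ["a", "b", "c", "d", "e"]}

first_run = pick_work_list(state, limit=3)
# Simulate an interrupted run: "c" was on the work list but never reported.
apply_status_updates(state, ["processed a successfully", "error processing b"])

second_run = pick_work_list(state, limit=3)
print(first_run, second_run)
```

Because entries are marked as done one at a time, an interrupted run only loses the entry that was in flight; everything else is picked up again from the backlog by a later run.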

As you can probably tell, I am quite pleased with the end result.  The reporting framework lags behind in development, since it just “sits there and takes care of itself”.  With the complete lack of testing, it also suffers from the “if it is not broken, then do not fix it” paradigm (because we will not notice if we broke it until it is too late).

Of course, I managed to break the setup a couple of times in the process.  However, a bonus feature of the reporting setup is that if you break it, it simply leaves an outdated report on the website.

Anyway, enjoy. :)


Status on Jessie (December 2014)

Here is a slightly overdue status on Jessie.

  • We are not ready to set a date on a Jessie release yet.  Even if we were, it would be unlikely that said date would be in January.  Accordingly, it is safe to assume that the rapid automatic removals clause of the freeze policy will be applied before our release.  If your package (or a package you use) depends on any package in this list, then it is at risk.
  • It is unclear to me whether our CD1s are able to contain all the necessary packages.  This was a major issue for Wheezy.  So far I have only done some minor prodding here.  However, I really want this item done soon as it can easily take a month or more to resolve this.
  • While the release notes have been improved quite a bit, I am certain that there are more cases we can cover.  As an example, I am looking for input on #773386.
  • Despite the declining number of RC bugs, we still have some particularly unhealthy ones left.  E.g. #759530.  There is also a view of some of the known unfixed Jessie blockers.
  • We have had a number of issues with APT and dpkg (e.g. trigger cycles, APT breaking under puppet) that caused upgrade failures or were a severe regression.  Most of these have been resolved now.  There are some trigger issues left and I have pushed for a fixed dpkg at the cost of possibly removing some packages. See #772866 (among others).

Stricter freeze policy per January 5th

The next timed change of the freeze policy will apply per January 5th.  After that date, we will only accept RC bug fixes, which means this is the final chance for translation updates.

More on RC bugs

In absolute numbers, the RC bugs have declined quite well.  We are below 150 now.  We lost quite a bit of traction in December compared to November, although November was an extremely efficient month.  Either way, we still need the final push here.

Debian installer release pending

Yesterday, we received a list of packages that needed to be unblocked for d-i with a remark that a release of d-i might follow.  Based on what we have unblocked previously, it will likely contain some (improved?) UEFI support.

Pending Debian 7.8 release

While not directly relevant to Jessie, we also have a Wheezy release (7.8) planned for the 10th of January. The window for getting changes into the 7.8 release closes this weekend.

Want to help?

  • File bugs against release-notes (source at [RN source]) and installation-guide for missing or outdated documentation.  Patches and drafts especially welcome.
  • Fix RC bugs – especially the known Jessie blockers.
  • Test upgrade paths and installation media.  Although in both cases, you may want to wait a bit (for the new dpkg to migrate and for the new debian-installer release respectively).
  • Consider offering your help to teams such as the CD team or debian-installer team.

Thank you,

[RN source]:

https://anonscm.debian.org/viewvc/ddp/manuals/trunk/release-notes/

svn co https://anonscm.debian.org/viewvc/ddp/manuals/trunk/release-notes/

Git Repo: http://anonscm.debian.org/cgit/users/jcristau/release-notes.git/
