There is nothing like (missing) iptables (rules) to make you use tor

I have been fiddling with setting up both iptables and tor on my local machine.  Most of it was fairly easy to do, once I dedicated the time to actually do it. Configuring both “at the same time” also made things easier for me, but YMMV.  Regardless, it did take quite a while researching, tweaking and testing – most of that time was spent on the iptables front for me.

I ended up doing this incrementally.  The 5 major steps I went through were:

  1. Created a basic incoming (INPUT) firewall – enforcing
  2. Installed tor + torsocks and aliased a few commands to run with torsocks
  3. Created a basic outgoing (OUTPUT) firewall – permissive
  4. Made the outgoing firewall enforcing
  5. Migrated the majority of programs and services to use tor.

Some of these overlapped time-wise and I certainly revisited the configuration a couple of times.  A couple of things that I learned:

  • You probably want to have a look at “netstat --listen -put --numeric” when you write your INPUT firewall.
  • The tor developers have tried a lot to make things easy.  It is scary how often “torsocks program [args]” just works(tm).
    • That said, it does not always work.
  • Tor and iptables (OUTPUT) can have a synergy effect on each other.
    • Notably, when it is easier to just “torsocks” a program than adding the necessary iptables rules.
  • Writing iptables rules becomes a lot easier once:
    • You learn how to use iptables’s LOG rule
    • You use sensible-editor + iptables-restore or something like puppet’s firewall module
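
As a rough illustration of the "permissive first, enforcing later" approach, here is a minimal Python sketch that generates a ruleset in iptables-restore format.  It is not the ruleset from this post: the function name, the log prefixes and the "debian-tor" user name are assumptions made for the example.  In permissive mode the OUTPUT policy stays ACCEPT and the final LOG rule only reports the packets that no earlier rule matched, which are the ones an enforcing policy would drop.

```python
def build_ruleset(output_enforcing=False, tor_user="debian-tor"):
    """Return an iptables-restore ruleset as text (illustrative only)."""
    # Permissive: log unmatched packets but let them through.
    # Enforcing: same rules, but the chain policy drops what is left.
    output_policy = "DROP" if output_enforcing else "ACCEPT"
    lines = [
        "*filter",
        ":INPUT DROP [0:0]",
        ":FORWARD DROP [0:0]",
        f":OUTPUT {output_policy} [0:0]",
        # INPUT: loopback and replies to our own connections only.
        "-A INPUT -i lo -j ACCEPT",
        "-A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
        '-A INPUT -j LOG --log-prefix "fw-input: "',
        # OUTPUT: loopback (this covers the local tor SOCKS port).
        "-A OUTPUT -o lo -j ACCEPT",
        "-A OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
        # Assumption: the tor daemon runs as this user.
        f"-A OUTPUT -m owner --uid-owner {tor_user} -j ACCEPT",
        # Everything that reaches this point is logged.
        '-A OUTPUT -j LOG --log-prefix "fw-output: "',
        "COMMIT",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_ruleset(output_enforcing=False), end="")
```

The output could be saved to a file, reviewed in an editor and then loaded with iptables-restore.  Anything that is "torsocks"'ed only talks to tor over loopback, so it needs no OUTPUT rule of its own.
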
Posted in Debian | 1 Comment

With 3 months of automatic decrufting in unstable

Three months after installing an automatic decrufter in dak, it:

  • has removed 689 cruft items from unstable and experimental
    • average removal rate being just shy of 230 cruft items/month
  • has become the “top 11th remover”.
  • is expected to become top 10 in 6 days from now and top 9 in 10 days.
    • This is assuming a continued average removal rate of 7.6 cruft items per day

On a related note, the FTP masters have removed 28861 items between 2001 and now.  The average being 2061 items a year (not accounting for the current year still being open). Though, intriguingly, in 2013 and 2014 the FTP masters removed 3394 and 3342 items.  With the (albeit limited) stats from the auto-decrufter, we can estimate that about 2700 of those were cruft items.
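
For the curious, the estimates above can be rechecked with a few lines of Python.  This is only a back-of-the-envelope sketch: the counts come from this post, while the number of complete years (14, i.e. 2001 through 2014) is my assumption about how the yearly average was computed.

```python
# Figures quoted in the post.
DECRUFTER_REMOVALS = 689      # cruft items removed by the auto-decrufter
DECRUFTER_MONTHS = 3          # time the auto-decrufter has been active
FTP_MASTER_REMOVALS = 28861   # manual removals since 2001
RECENT_YEARS = {2013: 3394, 2014: 3342}

# Assumption: the yearly average covers 2001..2014 (14 complete years).
COMPLETE_YEARS = 14

per_month = DECRUFTER_REMOVALS / DECRUFTER_MONTHS
per_year = DECRUFTER_REMOVALS * 12 / DECRUFTER_MONTHS
manual_per_year = FTP_MASTER_REMOVALS / COMPLETE_YEARS

print(f"auto-decrufter: {per_month:.1f} items/month, {per_year:.0f} items/year")
print(f"FTP masters:    {manual_per_year:.1f} items/year on average")
for year, removed in RECENT_YEARS.items():
    share = per_year / removed
    print(f"{year}: cruft would be roughly {share:.0%} of {removed} removals")
```

Extrapolating three months to a whole year gives a little over 2700 items, which is where the estimate comes from.
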

One could certainly also check the removal messages and check for the common “tags” used in cruft removals.  I leave that as an exercise to the curious readers, who are not satisfied with my estimate. :)

Posted in Debian, Release-Team | 1 Comment

The gcc-5 transition is coming to testing tonight

Thanks to the hard work of Adam, Julien, Jonathan, Matthias, Scott, Simon and many others, the GCC-5/libstdc++ transition has progressed to a state where we are ready to migrate the bulk of it to testing.

It should be a mostly smooth ride.  However, there will be a few packages that are going to be uninstallable in testing for a few days and some packages will be temporarily removed from testing.  If APT is unable to provide you with an upgrade for all of your packages, please try again in a few days.  We apologise for the inconvenience.

Currently, we expect about 36 binary packages to become temporarily uninstallable on amd64 and 34 on i386.  This involves Britney accepting at least 4800 “change items” on testing (some of these are removals).  Many thanks to Julien for providing a proposed set of hints and to Adam for extending them.

Update: We now have a list of the packages being removed and a list of packages becoming uninstallable.  It will be available on debian-devel within 20 minutes from now.

Posted in Debian | 1 Comment

I accidentally dak

So, yesterday, I “unbroke” dak – twice even! It is of course slightly less awesome that one of the broken parts was in code written by yours truly.  Anyhow:


Unbreaking the dak auto-decrufter

You may remember the auto-decrufter, which I added to dak.  As a safety measure, it bails out when in doubt about which removal breaks what package.  Turns out it was often in doubt, because the code had a bug.  Of course, nothing that could not be solved with a patch.

Thanks to Ansgar for merging this. :)

Unbreaking dak generate-releases

As a part of migrating apt-file to use APT’s new acquire system (from experimental), I learned APT really likes having checksums for everything.  Now including checksums for both the compressed file and the uncompressed file.

Sadly, dak had optimised out the uncompressed checksums for Contents files, but even after removing that optimisation (and Ganneff unbreaking my dinstall breakage) some Contents files still did not have a checksum for the uncompressed Contents file.  After some sophisticated debugging (read: “printf-debugging”), I finally discovered the issue and submitted a patch.
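
To illustrate what “checksums for both” means, here is a small Python sketch that produces one checksum entry for an uncompressed file and one for its gzip-compressed variant.  It is not dak's code: the function name, the example path and the simplified “checksum size name” line layout are assumptions made for the example.

```python
import gzip
import hashlib


def checksum_entries(path, data):
    """Return checksum lines for a file and its .gz variant.

    Illustrative only: the line layout is a simplification.
    """
    # mtime=0 keeps the compressed bytes (and their checksum) reproducible.
    compressed = gzip.compress(data, mtime=0)
    entries = []
    for name, blob in ((path, data), (path + ".gz", compressed)):
        digest = hashlib.sha256(blob).hexdigest()
        entries.append(f" {digest} {len(blob)} {name}")
    return entries


if __name__ == "__main__":
    # Hypothetical Contents data: one "file  section/package" line.
    sample = b"usr/bin/example  utils/example\n"
    for line in checksum_entries("main/Contents-amd64", sample):
        print(line)
```

The point is simply that a client which stores the file uncompressed can only verify it if the entry for the uncompressed name is present as well.
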

Thanks to Ansgar and Ganneff for merging (and fixing my dinstall breakage).


Posted in Debian | 1 Comment

Performance tuning of lintian, take 2

The other day, I wrote about our recent performance tuning in lintian.  Among other things, we reduced the memory usage by ~33%.  The effect was also reproducible on libreoffice (4.2.5-1 plus its 170-ish binaries, arch amd64), which started at ~515 MB and was reduced to ~342 MB.  So this is pretty great in its own right…

But at this point, I have seen what was in “Pandora’s box”. By which, I mean the two magical numbers 1.7kB per file and 2.2kB per directory in the package (add +250-300 bytes per entry in binary packages).  This is before even looking at data from file(1), readelf, etc.  Just the raw index of the package.

Depending on your point of view, 1.7-2.2kB might not sound like a lot.  But for the lintian source with ~1 500 directories and ~3 300 non-directories, this sums up to about 6.57MB out of the (then) usage at 12.53MB.  With the recent changes, it dropped to about 1.05kB for files and 1.5kB for dirs.  But even then, the index is still 4.92MB (out of 8.48MB).

This begs the question, what do you get for 1.05kB in perl? The following is a dump of the fields and their size in perl for a given entry:

lintian/vendors/ubuntu/main/data/changes-file/known-dists: 1077.00 B
  _path_info: 24.00 B
  date: 44.00 B
  group: 42.00 B
  name: 123.00 B
  owner: 42.00 B
  parent_dir: 24.00 B
  size: 42.00 B
  time: 42.00 B
  (overhead): 694.00 B

The time, date, owner and group fields are fixed-size strings (at most 15 characters).  The size and _path_info fields are integers, and parent_dir is a reference (nulled).  Finally, the name is a variable-length string.  Summed, the values take less than half of the total object size.  The remainder of ~700 bytes is just “overhead”.

Time for another clean up:

  • The ownership fields are almost always “root/root” (0/0).  So let’s just omit them when they satisfy said assumption. [f627ef8]
    • This is especially true for source packages where lintian ignores the actual value and just uses “root/root”.
  • The Lintian::Path API has always had a “cop-out” on the size field for non-files and it happens to be 0 for these.  Let’s omit the field if the value was zero and save 0.17MB on lintian. [5cd2c2b]
    • Bonus: Turns out we can save 18 bytes per non-zero “size” by insisting on the value being an int.
  • Unsurprisingly, the date and time fields can trivially be merged into one.  In fact, that makes “time” redundant as nothing outside Lintian::Path used its value.  So say goodbye to “time” and good day to 0.36MB more memory. [f1a7826]

Which leaves us now with:

lintian/vendors/ubuntu/main/data/changes-file/known-dists: 698.00 B
  _path_info: 24.00 B
  date_time: 56.00 B
  name: 123.00 B
  parent_dir: 24.00 B
  size: 24.00 B
  (overhead): 447.00 B

Still a ~64% overhead, but at least we reduced the total size by 380 bytes (585 bytes for entries in binary packages).  With these changes, the memory used for the lintian source index is now down to 3.62MB.  This brings the total usage down to 7.01MB, which is a reduction to 56% of the original usage (a.k.a. “the-almost-but-not-quite-50%-reduction”).
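
The overhead figures can be recomputed from the two dumps above.  The following Python sketch only redoes the arithmetic; the helper name is mine, the byte counts are the ones listed in the dumps.

```python
def overhead(total_bytes, field_sizes):
    """Return (overhead in bytes, overhead as a fraction of the total)."""
    payload = sum(field_sizes.values())
    extra = total_bytes - payload
    return extra, extra / total_bytes


# Field sizes in bytes, copied from the first dump (1077 B entry).
before = {
    "_path_info": 24, "date": 44, "group": 42, "name": 123,
    "owner": 42, "parent_dir": 24, "size": 42, "time": 42,
}
# Field sizes in bytes, copied from the second dump (698 B entry).
after = {
    "_path_info": 24, "date_time": 56, "name": 123,
    "parent_dir": 24, "size": 24,
}

for label, total, fields in (("before", 1077, before), ("after", 698, after)):
    extra, share = overhead(total, fields)
    print(f"{label}: {extra} B overhead ({share:.0%} of {total} B)")
```

Both entries end up with roughly the same relative overhead, so the savings come from having fewer and smaller fields rather than from a cheaper container.
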

But at least the results also carried over to libreoffice, which is now down to 284.83 MB (55% of original).  The chromium-browser (source-only, version 32.0.1700.123-2) is down to 111.22MB from 179.44MB (61% of original, better results expected if processed with binaries).


In closing, Lintian 2.5.34 will use slightly less memory than 2.5.33.


Posted in Debian, Lintian | 1 Comment

Performance tuning of lintian

For quite a while, Lintian has been able to create performance logs (--perf-debug --perf-output perf.log) that help diagnose where lintian spends most of its runtime.  I decided to make lintian output these logs on to help us spot performance issues, though I have not been very good at analysing them regularly.

At the beginning of the month, I finally got around to looking a bit into one of them.  My findings on IRC triggered Mattia Rizzolo to create a graph of the accumulated runtime of each check/collection, measured in seconds.  With these findings, we set out to solve some of the issues.  This led to the following changes in 2.5.33 (in no particular order):

  • Increased buffer size in check/ to reduce overhead [bc8b3e5] (S)
  • Reduced overhead in strings(1) extraction [b058fef] (P)
  • Reduced overhead in spell-checking [b824170] (S)
    • Also improves the performance of spellintian!
  • Removed a high overhead check that did not work [2c7b922] (P)

Legend: S: run single threaded (1:1 performance improvement).  P: run in parallel.

Overall, I doubt the changes will give a revolutionary change in speed, but they should improve the 3rd, 4th and 5th slowest parts in Lintian.

Beyond runtime performance, we have a few memory optimisations in the pipeline for Lintian 2.5.34:

  • Remove member from “non-dir” nodes in the Lintian path graph (2%) [6365635]
  • Remove two fields at the price of computing them as needed (~5%) [a696197 + 8dacc8e]
  • Merge 4 fields into 1 (~8%) [5d49cd2 + fb074e4]
  • Share some memory between package-based caches (18%) [ffc7174]

Combined, these 6 commits reduce memory consumption in caches by ~33% compared to 2.5.33, when lintian processes itself.  In absolute numbers, we are talking about a drop from 12.53MB to 8.48MB.  Mileage can certainly vary depending on the package (mscgen “only” saw an ~25% improvement).  Nevertheless, I was happy to list #715035 as being closed in 2.5.34. :)

Posted in Debian, Lintian | 1 Comment

Introducing dak auto-decruft

Debian now has over 22 000 source packages and 45 500 binary packages.  To counter that, the FTP masters and I have created a dak tool to automatically remove packages from unstable!  This is also much more efficient than only removing them from testing! :)


The primary goal of the auto-decrufter is to remove a regular manual work flow from the FTP masters.  Namely, the removal of the common cases of cruft, such as “Not Built from Source” (NBS) and “Newer Version In Unstable” (NVIU).  With the auto-decrufter in place, such cruft will be automatically removed when there are no reverse dependencies left on any architecture and nothing Build-Depends on it any more.
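
The removal condition described above can be sketched as a small Python predicate.  This is not the code in dak: the class, the field names and the example package are hypothetical and only model the rule as stated here.

```python
from dataclasses import dataclass, field


@dataclass
class CruftCandidate:
    """Hypothetical model of one piece of cruft (not dak's data model)."""
    name: str
    reason: str                                        # e.g. "NBS" or "NVIU"
    rdeps_by_arch: dict = field(default_factory=dict)  # arch -> set of rdeps
    build_rdeps: set = field(default_factory=set)      # sources Build-Depending on it


AUTO_REMOVABLE_REASONS = {"NBS", "NVIU"}


def can_auto_remove(candidate):
    """True if the candidate matches the rule described in the text."""
    if candidate.reason not in AUTO_REMOVABLE_REASONS:
        return False
    # A single remaining reverse dependency on any architecture blocks it.
    if any(candidate.rdeps_by_arch.values()):
        return False
    # So does anything that still Build-Depends on it.
    return not candidate.build_rdeps


if __name__ == "__main__":
    old_lib = CruftCandidate("libexample1", "NBS", {"amd64": set(), "i386": set()})
    print(old_lib.name, can_auto_remove(old_lib))
```

Note that the check does not distinguish release architectures from the rest: one leftover reverse dependency anywhere is enough to leave the removal to a human.
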

Despite the implication in the “opening” of this post, this will in fact not substantially reduce the numbers of packages in unstable. :) Nevertheless, it is still very useful for the FTP masters, the release team and packaging Debian contributors.

The reason why the release team benefits greatly from this tool is that almost every transition generates one piece of “NBS”-cruft.  Said piece of cruft currently must be removed from unstable before the transition can progress into its final phase.  Until recently that removal has been 100% manual and done by the FTP masters.

The restrictions on the auto-decrufter mean that we will still need manual decrufts. Notably, the release team will often complete transitions even when some reverse dependencies remain on non-release architectures.  Nevertheless, it is definitely an improvement.


Omelettes and eggs: As an old saying goes “You cannot make an omelette without breaking eggs”.  Less so when the only “test suite” is production.  So here are some of the “broken eggs” caused by implementation of the auto-decrufter:

  • For about 30 minutes, “dak rm” (without --no-action) would unconditionally crash.
  • A broken dinstall when “dak auto-decruft” was run without “--dry-run” for the first time.
  • A boolean condition inversion causing removals to remove the “override” for partial removals (and retain it for “full” removals).
    • Side-effect: this broke Britney a couple of times because dak now produced some “unexpected” Packages files for unstable.
  • Not to mention the “single digit bug closure” bug.

Of these, the boolean inversion was no doubt the worst.  By the time we had it fixed, at least 50 (unique) binary packages had lost their “override”.  Fortunately, it was possible to locate these issues using a database query and they have now been fixed.

Before I write any more non-trivial patches for dak, I will probably invest some time in setting up a basic test framework for it.


Posted in Debian, Release-Team | 2 Comments