For quite a while, Lintian has been able to create performance logs (--perf-debug --perf-output perf.log) that help diagnose where lintian spends most of its runtime. I decided to make lintian output these logs on lintian.debian.org to help us spot performance issues, though I have not been very good at analysing them regularly.
At the beginning of the month, I finally got around to looking into one of them. My findings on IRC prompted Mattia Rizzolo to create the following graph. It shows the accumulated runtime of each check/collection, measured in seconds. Based on these findings, we set out to solve some of the issues. This led to the following changes in 2.5.33 (in no particular order):
- Increased buffer size in check/cruft.pm to reduce overhead [bc8b3e5] (S)
- Reduced overhead in strings(1) extraction [b058fef] (P)
- Reduced overhead in spell-checking [b824170] (S)
  - Also improves the performance of spellintian!
- Removed a high overhead check that did not work [2c7b922] (P)
Legend: S = runs single-threaded (1:1 performance improvement); P = runs in parallel.
Overall, I doubt the changes will give a revolutionary change in speed, but they should speed up the 3rd, 4th and 5th slowest parts of Lintian.
Beyond runtime performance, we have a few memory optimisations in the pipeline for Lintian 2.5.34:
- Remove member from “non-dir” nodes in the Lintian path graph (2%) 
- Remove two fields at the price of computing them as needed (~5%) [a696197 + 8dacc8e]
- Merge 4 fields into 1 (~8%) [5d49cd2 + fb074e4]
- Share some memory between package-based caches (18%) [ffc7174]
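To illustrate the idea behind "remove two fields at the price of computing them as needed", here is a minimal sketch in Python. It is not Lintian's code (Lintian is written in Perl), and the class names and the choice of `basename`/`dirname` as the derived values are assumptions made up for the example. The point is only the trade-off: the second class stores less per node and pays a small cost each time a derived value is read.

```python
import posixpath


class StoredNode:
    """Hypothetical node that stores the derived values as fields."""

    __slots__ = ("name", "basename", "dirname")

    def __init__(self, name):
        self.name = name
        # Computed once, then kept in memory for the lifetime of the node.
        self.basename = posixpath.basename(name)
        self.dirname = posixpath.dirname(name)


class ComputedNode:
    """Hypothetical node that stores only the name and derives the rest."""

    __slots__ = ("name",)

    def __init__(self, name):
        self.name = name

    @property
    def basename(self):
        # Recomputed on every access instead of being stored.
        return posixpath.basename(self.name)

    @property
    def dirname(self):
        # Recomputed on every access instead of being stored.
        return posixpath.dirname(self.name)


stored = StoredNode("usr/share/doc/lintian/changelog.gz")
computed = ComputedNode("usr/share/doc/lintian/changelog.gz")

# Both variants answer the same questions; only the storage differs.
print(stored.basename, computed.basename)
print(stored.dirname, computed.dirname)
```

With many thousands of nodes in a cache, dropping two slots per node adds up, which is why this kind of change can be worth a little extra computation.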
Combined, these 6 commits reduce memory consumption in caches by ~33% compared to 2.5.33 when lintian processes itself. In absolute numbers, we are talking about a drop from 12.53MB to 8.48MB. Mileage can certainly vary depending on the package (mscgen “only” saw a ~25% improvement). Nevertheless, I was happy to list #715035 as being closed in 2.5.34. 🙂
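As a quick sanity check on those numbers, the small Python snippet below recomputes the reduction from the absolute figures and adds up the per-change estimates from the list. It assumes the per-change percentages are all relative to the same 2.5.33 baseline, which the list does not state explicitly.

```python
# Figures quoted above for lintian processing itself.
before_mb = 12.53  # cache memory with 2.5.33
after_mb = 8.48    # cache memory with the 2.5.34 changes

saved_mb = before_mb - after_mb
reduction = saved_mb / before_mb

print(f"saved {saved_mb:.2f} MB ({reduction:.1%} of the 2.5.33 figure)")

# Per-change estimates from the list above, in percent.
per_change = [2, 5, 8, 18]
print(f"sum of per-change estimates: {sum(per_change)}%")
```

The absolute figures work out to a bit over 32%, and the rounded per-change estimates add up to 33%, so the two are consistent with "roughly a third".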