Optimizing autodie

After converting Lintian to using autodie, I was a bit troubled by the effect it had on Lintian’s (start-up) performance.  This overhead added up to at least 1s when checking packages.  Indeed, I am not the first to complain about this problem.

For those who have not heard of it, autodie is a Perl “pragma” that alters subs and built-ins from the “check the return value for errors” style (as in C) to the “throw an exception on error” style (as in Python).  So you can replace “open(…) or die” with “open(…)” when autodie is in effect.
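To make the difference concrete, here is a minimal sketch of both styles side by side.  The path is a made-up one that is assumed not to exist, and the variable names are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on the system running this.
my $missing = '/nonexistent/path/for/autodie/demo';

# Without autodie: the caller has to check the return value of open.
my $manual_error;
if (!open(my $fh, '<', $missing)) {
    $manual_error = "open failed: $!";
}

# With autodie: the same failure is raised as an exception instead.
# The pragma is lexical, so it only applies inside this block.
my $exception;
{
    use autodie qw(open);
    eval {
        open(my $fh, '<', $missing);
        1;
    } or $exception = $@;
}

print "manual:  $manual_error\n";
print "autodie: ", (ref($exception) || 'plain string'), "\n";
```

Note that the second open has no “or die” at all; the error still surfaces, just via eval and $@ rather than the return value.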

autodie achieves this by exporting a little sub that wraps the target sub (or built-in).  This little wrapper is compiled[1] into the importing package.  The problem with this solution is that it ends up compiling these subs once per importing package.

Secondly, all of these wrappers are wrapped in something called a “leak guard”.  As autodie is a “lexical pragma” it is not permitted to leak “across” file boundaries.  The leak guards take care of that by doing a runtime check of the caller and reverting to the original sub/built-in if a leak occurred.  These leak guards were also compiled into the importing package.

It is probably not much of a surprise to anyone that autodie spent quite a bit of its runtime compiling these wrappers and their leak guards.  Together with upstream, I have addressed some of these problems.

 

The first optimization was to allow wrappers to be reused between importing packages for those built-ins that do not take “glob” or bareword arguments[2].  Secondly, we started using closures to generate the leak guards.  Combined, these two changes effectively cut the load time in half for subsequent loads of autodie, though they did have a negative effect on the call overhead of these wrappers (~1s for 1M calls).  That landed in autodie v2.18.
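As a rough illustration of the closure idea, the sketch below builds a leak guard without any string eval.  It is a simplified stand-in and not autodie’s actual code: make_leak_guard, $wrapped and $original are hypothetical names, and the “leak” is simulated by calling the guard from a string eval, where the call site’s file name is “(eval N)” rather than this file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a guard that only applies $wrapped when
# it is called from $allowed_file and otherwise reverts to $original.
sub make_leak_guard {
    my ($allowed_file, $wrapped, $original) = @_;
    return sub {
        # caller(0) reports the file containing the call to this guard.
        my (undef, $caller_file) = caller(0);
        # goto &sub replaces the current frame and passes @_ along.
        goto &$wrapped if $caller_file eq $allowed_file;
        goto &$original;
    };
}

my $wrapped  = sub { "wrapped(@_)" };
my $original = sub { "original(@_)" };
my $guard    = make_leak_guard(__FILE__, $wrapped, $original);

# Called from this file: the wrapper is in effect.
my $in_scope = $guard->('a');

# Called from a string eval: the call site is not this file, so the
# guard treats it as a leak and reverts to the original.
my $leaked = eval q{ $guard->('a') };

print "$in_scope\n$leaked\n";
```

Since the guard is an ordinary closure, creating one per importing file is just a sub call; nothing has to be compiled from a template.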

The extra runtime overhead of v2.18 was basically caused by the leak guard being split into two (a small closure, which called the “real” leak guard with an extra argument).  We managed to merge these two parts into one without (too much) loss of code maintainability.  With that, the call overhead of these wrappers dropped by ~3s (for 1M calls) and thus the wrappers in autodie v2.20 were even faster than the ones in v2.17.

Beyond these improvements, I have sent upstream several patches (still pending review) that will improve the performance of autodie even further.  If merged, we are looking at another 1.5s (per 1M calls) removed and almost a factor of 4 reduction in the load time of autodie.

This factor of 4 comes from lazily compiling the wrappers on demand, so some of the price is moved to the call overhead.  That said, it is a net gain, since it is highly unlikely that every module using autodie will use every single wrapped built-in (how often do you use both “read” and “sysread” in the same module?).
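The trade-off can be sketched as follows.  This is only an illustration of compiling on first use, not the pending patches themselves; compile_wrapper, lazy_wrapper and the generated code are hypothetical stand-ins for autodie’s template and eval step.

```perl
use strict;
use warnings;

my %compiled;            # cache of generated wrappers, keyed by name
my $compile_count = 0;   # how many times we paid for a string eval

# Hypothetical generator standing in for the "template + eval" step.
sub compile_wrapper {
    my ($name) = @_;
    $compile_count++;
    my $code = eval "sub { return '$name(' . join(',', \@_) . ')' }"
        or die "compile failed: $@";
    return $code;
}

# Returns a cheap stub that compiles the real wrapper on its first call
# and reuses the cached version afterwards.
sub lazy_wrapper {
    my ($name) = @_;
    return sub {
        my $real = $compiled{$name} //= compile_wrapper($name);
        goto &$real;
    };
}

my $read    = lazy_wrapper('read');
my $sysread = lazy_wrapper('sysread');   # never called, never compiled

my $first  = $read->(1, 2);
my $second = $read->(3);

print "compiled $compile_count wrapper(s)\n";
```

Each call now pays for a hash lookup in the stub, which is the extra call overhead mentioned above, but the wrapper for “sysread” is never compiled because nothing calls it.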

[1] It is compiled via a call to eval using some template code.

[2] For the rest of the built-ins, it could probably just use Symbol::qualify to qualify bareword filehandles.  Perhaps that will come in a future version of autodie.  🙂
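For reference, this is roughly what Symbol::qualify does with a bareword name; the package names here are made up.  How a shared wrapper would actually use the result is left open, as that part is not implemented.

```perl
use strict;
use warnings;
use Symbol qw(qualify);

# A plain bareword is qualified with the given (caller's) package.
my $plain = qualify('LOG', 'My::Module');

# The standard handles always belong to main, whatever the package.
my $std = qualify('STDERR', 'My::Module');

# A name that is already qualified is returned unchanged.
my $full = qualify('Other::LOG', 'My::Module');

print "$plain\n$std\n$full\n";
```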

 
