FCGI::ProcManager

I wanted to share what I did to Perl and FastCGI this week.

Recap:
mod_fcgid loses its connection table during a “graceful reload” with Apache 2.4 (we have the version included with Ubuntu 17.04, so it’s Apache 2.4.25). The current connections get broken, and we’ve reached a size where that can interrupt around 20 connections during one reload event, which we invoke when deploying new code. Therefore, I wanted something more reliable, so I built something that uses mod_proxy_fcgi to talk to a listening daemon.

We started with a script (our code) named handler.fcgi, which used CGI::Fast to manage the request loop. In this setup, Apache’s mod_fcgid worked as the process manager, spawning a new perl handler.fcgi instance whenever a request was routed to that script and there wasn’t an idle worker to process it. For N concurrent workers, N users are forced to wait for the startup cost.

To move the process management out of Apache’s hands, we need another process manager, and the plumbing to start it up when the system boots. I adapted handler.fcgi into daemon.fcgi, and then built the daemon management from scratch. Let’s start with that.

daemon.fcgi
Caveat: this is simplified. The OpenSocket parameters and the number of processes are configurable through the environment, using code like $ENV{FCGI_SOCKET_PATH} || ':9005', but I wanted to make the code below more concise. Likewise, I’ve left out all of our actual preloads, because those are boring.

First, we have our basic setup, and loading of the modules the daemon needs to use:
use 5.014;
use warnings;
use CGI ();
use FCGI ();
use FCGI::ProcManager ();
I’m very paranoid about keeping my scopes clean, thus the empty parentheses to forcibly prevent the modules from importing anything.
At this point in the code, there is an opportunity to preload immutable modules that may be needed. I say “immutable” because anything loaded at this point will not be reloaded during SIGHUP. It is already loaded in the managing parent, which is not restarted, so the new workers that are started will inherit the same code. Therefore, be very careful not to preload anything here that will need to be reloaded gracefully!
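A sketch of what belongs (and doesn’t belong) here; the module names are illustrative, not our actual preloads:

```perl
# Immutable preloads: stable dependencies that don't change on deploy.
# Workers inherit these across SIGHUP without reloading them.
require POSIX;
require List::Util;

# NOT here: our own application modules. They must be require'd after
# pm_manage(), in the workers, so a graceful reload picks up new code.
```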
Now, we get onto the business of starting the daemon. First, we open a listening socket, with a listen queue depth of 100 (this is the default in CGI::Fast so I just copied it for myself):
my ($socket, $pm, $req);
$socket = FCGI::OpenSocket(':9005', 100);
$req = FCGI::Request(\*STDIN, \*STDOUT, \*STDERR,
\%ENV, $socket, FCGI::FAIL_ACCEPT_ON_INTR);
This socket will be shared among the children. Note that all of these arguments (passing the filehandle globs, the environment, and the FAIL_ACCEPT_ON_INTR flag) are important for correct operation.

My working theory about the filehandles is that FCGI sets up the filehandles that are passed as the FCGI streams, and thus, passing \*STDERR sets up the STDERR handle to go to the FCGI request’s error stream during requests. (Where it can end up in the Web server’s error log.) The filehandles don’t have to make sense, and don’t “become” the FCGI stream handles.

All of that finishes preparing the parent, so at this point, we can call FCGI::ProcManager to fork for us:

$pm = FCGI::ProcManager->new({ n_processes => 5 });
$pm->pm_manage(); # forks, never returns in parent
From this point on, code will only execute in the context of a worker process. The manager gets “stuck” inside pm_manage() unless something goes bad and it has to call exit, but even then, it still doesn’t return.

What now remains is to write the main request loop:
require AppCode::DB; # preload, at run time
while ($req->Accept() >= 0) {
$pm->pm_pre_dispatch(); # defers signals
CGI->_clear_globals(); # prevent crosstalk
my $q = $CGI::Q = CGI->new;
# --- request processing goes here ---
$pm->pm_post_dispatch(); # acts on signals
}
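To make the placeholder concrete, the request-processing section might look roughly like this; AppCode::Dispatch::handle is a hypothetical stand-in for our real application entry point:

```perl
# The request's details are in %ENV and $q here, just as in plain CGI.
my $body = AppCode::Dispatch::handle($q);  # hypothetical app entry point
print $q->header(-type => 'text/html'), $body;
# print goes to STDOUT, which FCGI ties to this request's output stream
```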
Any code loaded between pm_manage() and the start of the while loop will be loaded before any requests, and remain persistent between requests. It will also be reloaded when the daemon is reloaded via SIGHUP, because the new workers will re-process the require statements. It’s vital to use require here, and not use, because the latter happens at compile time. Any use statements here would still be processed in the parent, and those modules would not be reloaded in any workers.

The calls to pm_pre_dispatch and pm_post_dispatch are exactly as instructed in the FCGI::ProcManager documentation. I looked inside their code, and they make it so that a “please shut down now” signal will be deferred until the request has been processed.

The CGI->_clear_globals() line and the setting of $CGI::Q (the default CGI object) are borrowed from the code of CGI::Fast. The globals must be cleared, or else the worker can return the wrong response to the client, and really mess things up. For example, I started getting nested UI elements: instead of loading search results via AJAX, the search form would come back and be put in the page again.

Starting the daemon
I wrote a systemd service file to start up the daemon. I’m not going to cover it here, because there are probably better systemd resources, and there are other init systems, too. Everything that was passed to handler.fcgi via FcgidInitialEnv (notably PERL5LIB) is now passed as an Environment setting at daemon startup.

As noted above, there’s no special consideration for input/output/error streams, because they will be shadowed by the FastCGI request streams while processing requests. (The manager will still write a bit to them, about the worker process lifecycles; systemd just logs those messages.)

Connecting Apache to the Daemon
In our VirtualHost block, we forward interesting URLs to the proxy:

ProxyPassMatch ^/(.*)\.pl$ fcgi://localhost:9005/$1
Other environments most likely need a different regular expression.
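As an illustration (the path and backend port here are just made-up variations on this post’s example), a deployment that routes an entire URL subtree, rather than *.pl scripts, might use:

```apache
# Hypothetical variant: forward everything under /app/ to the daemon.
ProxyPassMatch ^/app/(.*)$ fcgi://localhost:9005/app/$1
```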
Options, such as enablereuse=on, are also not shown here.

Why CGI::Fast isn’t involved anymore
I tried really hard to keep using CGI::Fast, because it had been working in the handler.fcgi version. However, it didn’t quite allow me enough control to get it integrated with FCGI::ProcManager.

If environment variables like FCGI_SOCKET_PATH are set, then CGI::Fast tries to open the socket to listen on. However, if there are multiple workers, only one of them can “win” this game, and the rest keep getting “socket already in use” errors and exiting. (Which, as a worker, means the manager tries to replace them, but it’s futile.)

If the environment variables aren’t set, then CGI::Fast seems to think it’s going to receive a CGI request on STDIN, and the whole thing comes tumbling down when the request is actually entirely blank. (For unimportant reasons, I don’t need to handle / in our app at work, so I don’t. It turns out that our app just crashes if such a request comes in.)

I wasn’t able to figure out how to open a socket in the parent, start managing, and then have CGI::Fast wait for requests in the workers.

Reflection
I worked really hard to stick to my initial plan and overcome all obstacles. I ended up with a thing that would work great for unaltered CGI.pm apps.

Our app was probably amenable to CGI::PSGI and Server::Starter: as I rewrote things from slow CGI to FastCGI, I also rewrote their output so that it’s templated (not inline print statements) and the response is only sent from the main request loop. Using CGI::PSGI probably would have been a better outcome: we would have been able to preload and share more code in the parent, without losing graceful reload support.

In the end, I also noticed that startup costs are lower than I expected. In the days when we ran on m1.small instances, storing sessions in DynamoDB added a full second to the page load time, because Net::Amazon::DynamoDB and its Moose dependency would take that long to load. We rolled out, then reverted, that change, because it increased the request time by an order of magnitude. But now, a t2.medium runs as fast as my Ivy Bridge desktop, able to preload that same code in around 0.34 seconds, and finish the entire set of preloads in 0.5 seconds.

The change is still worth it for the improvement in user experience when we have to do a deployment, though (which, because FastCGI is persistent, always involves a graceful reload).