Sunday, January 12, 2025

Systemd Allows Unknown Units in Before/After

Most of the time, my development virtual machine guest would boot and run perfectly fine.  Sometimes, though, the FastCGI service backing one of the websites would not be up and running.  It had a ConditionPathExists, and if the code to run the service wasn’t mounted, it wouldn’t start.

The intention was to allow colleagues to import a copy of this guest, then set up the mount to share the project from the host as they saw fit.  On their first boot, with no sharing, ConditionPathExists would prevent the FastCGI service from attempting to start, and therefore, systemd would not report that the system was degraded.  Another point about this system is that the sharing mechanism is unspecified: colleagues are free to use NFS (as I do), Plan9 file sharing, or the hypervisor’s shared-files mechanism.  The host paths are also unspecified, so there is no way I can set up the guest to expect specific sharing in advance.

In practice, NFS sometimes wasn’t ready in my guest when systemd checked the conditions for the FastCGI service.  The obvious answer was to add After=remote-fs.target to the FastCGI service.  I quickly added a drop-in carrying this directive to my own post-configuration scripts.

However, that’s a local solution to a global problem.  My colleagues can’t benefit from that, and I should minimize the burden on them when they periodically set up new guest images.  The fewer things they must remember, the better.

It turns out the answer was even simpler: I could skip the drop-in and add the After= line to the main service file. I added both remote-fs.target and the hypervisor’s guest services to the line, which means:

  1. In production, there are no remote filesystems to mount, nor guest services; there is no latency introduced.
  2. When using NFS or similar, systemd waits for the remote filesystem before starting the FastCGI service.
  3. With the hypervisor’s file sharing, the guest services mount the shared files before starting the FastCGI service.

My guest doesn’t actually have the guest services installed, but the FastCGI service starts up as intended.  Looking at systemctl list-units --all output, the guest services are (now) listed as not-found and inactive, which is pretty much what I would expect from a dangling reference.  systemd knows about it because I listed it in After, but since it’s not required by anything, the missing definition for it doesn’t cause any problems.
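As a rough sketch of what the main service file ends up looking like, the snippet below writes an example unit to a scratch directory and prints its After= line.  The unit name, the paths, the ExecStart command, and vboxadd-service.service (standing in for “the hypervisor’s guest services”) are all assumptions for illustration, not the names from my real setup.

```shell
# Write a sketch of the unit to a scratch directory for inspection.
# Hypothetical: the unit name, paths, ExecStart command, and the
# guest-services unit (vboxadd-service.service) are stand-ins.
unit_file="$(mktemp -d)/fastcgi-site.service"
cat > "$unit_file" <<'EOF'
[Unit]
Description=FastCGI backend for the site (example)
# Skip the start, without reporting a failure, when the project is not mounted.
ConditionPathExists=/srv/site/current
# Ordering only: After= does not pull these units in or require them to exist.
After=remote-fs.target vboxadd-service.service

[Service]
ExecStart=/usr/local/bin/site-fcgi

[Install]
WantedBy=multi-user.target
EOF

# Show the ordering line that was written.
grep '^After=' "$unit_file"
```

On a live system, something like systemctl list-units --all --state=not-found should then list the dangling reference, assuming the guest services are not installed.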

Sunday, January 5, 2025

Residual Config Without Config Files

apt makes a distinction between “removed” and “purged.” In both, the packages are uninstalled; in the former state, config files remain, and in the latter, those are also removed.  Actually, that’s not quite the whole story.

A package can have no configuration files, yet still be in “residual config” state when removed.  This happens if a package defines a postrm maintainer script. These can have basically any shell commands in them, so their actions aren’t visible in any list-of-files.

The specific package I was looking into was a library, with a postrm script that ran ldconfig… during removal.  The package was being shown in residual-config state because it had a script.  Although that script would do nothing during purge, apt (and dpkg) can’t know that.

How to list residual-config packages: apt list 2>/dev/null | grep residual-config or dpkg -l | grep ^rc.

Listing configuration files: try one of these answers as this gets real complex, real fast.

Reading a postrm script: look at /var/lib/dpkg/info/{PACKAGE}[:{ARCH}].postrm (the ARCH component may not be present.)
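Tying the listing step together: a small filter that pulls just the package names out of dpkg -l output, so they can be fed into other commands.  This is a sketch; the function name is my own invention, and it assumes the usual dpkg -l column layout (status flags first, package name second).

```shell
# Print the names of packages that dpkg reports as "rc"
# (removed, but still in residual-config state).
# Reads `dpkg -l` output on stdin.
residual_packages() {
    awk '$1 == "rc" { print $2 }'
}

# Typical use on a Debian/Ubuntu system:
#   dpkg -l | residual_packages
# For each name printed, a postrm script, if one exists, would be at
#   /var/lib/dpkg/info/NAME.postrm
```

Matching on the first column exactly avoids relying on an unquoted grep pattern, and the names come out in the same form (with any :ARCH suffix) that dpkg prints.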

Sunday, December 29, 2024

Scattered Notes on Dovecot’s userdb, passdb, and passwd-file

Dovecot can authenticate users using a passwd-like file.  This happens in two phases.  First, users are looked up in the passdb.  If the user is found and authenticated, then the user is looked up again in the userdb to get things like their UID/GID and home directory.

Now, this doesn’t allow for aliasing users in Dovecot.  If the login is user@example.com, then the defaults will lead to trying to find “user@example.com” in the passdb, then the userdb.  Failure to have these configured correctly can result in different errors:

  1. User not found in the passdb: authentication fails.  (Beware of fail2ban here.)
  2. User not found in the userdb: user can authenticate, but appears to have no mail!

For my own system, the virtual address needs to be resolved to a particular system user (aka Unix account.)  I also want to share the password files with Postfix for outbound email authentication.  This made Dovecot complicated: I want to log in as user@domain, then have that processed as user for both lookups in a file that is specific to the domain. I put the shortened user in the passwd-file, and now I have to configure passdb carefully:

# /etc/dovecot/local.conf snippet
passdb {
    args = username_format=%n /local/auth/%d/passwd
    override_fields = user=%n
    driver = passwd-file
}
userdb {
    args = /local/auth/%d/passwd
    driver = passwd-file
}

This makes passdb do the first lookup using the short username, %n, with the args setting.  Then, that short username is returned by override_fields for use in later lookups.  After that, userdb can continue with no special settings; it will use the overridden user to look up the short name, and nothing special needs to happen.

I believe that the passwd-file can’t return a different username, because there’s only one username field (the first field), and it is also the lookup key.  This is what requires us to use override_fields for this scenario.
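To make the %n and %d expansions concrete, here is a rough shell equivalent of the split that happens for a login of user@example.com.  It only illustrates the string handling; it is not how Dovecot is implemented, the function name is made up, and it assumes the login always contains an “@”.

```shell
# Rough shell equivalent of Dovecot's %n and %d for a login like user@domain.
# Assumes the login contains an "@"; without one, d would equal the whole login.
passwd_file_for() {
    login=$1
    n=${login%%@*}    # like %n: the part before the "@"
    d=${login#*@}     # like %d: the part after the "@"
    # Print the lookup key, then the passwd-file it would be looked up in.
    printf '%s /local/auth/%s/passwd\n' "$n" "$d"
}

passwd_file_for user@example.com
```

The first word is the key searched for in the file, and the path is where the args setting points after %d is filled in.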

Sunday, December 22, 2024

Don’t Let HTTP/2 Nest

For some time, I had problems accessing a dev server with HTTP/2.  Asking cURL to use HTTP/1.1 worked fine, so that’s what I did for a long time.

Today, I found the root cause.  I had nginx set up as reverse-proxy/TLS termination (to emulate ALB), proxying requests to apache2.  Both of them had HTTP/2 enabled!  I needed to deactivate support in Apache, and since the system is Debian/Ubuntu based, that meant:

sudo a2dismod http2
sudo systemctl reload apache2

After that, everything worked.

The problem was that the client would connect to nginx with HTTP/2, and then the request would be sent to Apache. Apache's HTTP/2 module would include an Upgrade: h2, h2c header in the response.  Then nginx would dutifully copy this back to the client.  When cURL or PHP streams received this header, they would detect it as invalid: we can’t upgrade to HTTP/2 from inside HTTP/2.

That error-handling resulted in discarding the response body… but not the HTTP 200 status code, which was extremely puzzling.  How could this successful request have failed?  It failed during header processing, after processing the status and before accepting the body.  (I think browsers must ignore it?  Or maybe they don’t use HTTP/2 through a proxy, even with CONNECT requests?  I would have had to figure out the problem much sooner, if they had seen this Upgrade header and treated it as an error.)

The other weird thing about this is that Apache doesn't have TLS configured, but it still provided h2 as an option in its Upgrade header.  I don’t think that’s a reasonable configuration.  It’s especially not a reasonable default, but I’m not sure whether that’s Apache’s problem, Debian’s, or Ubuntu’s.
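A quick way to check whether a backend is still advertising the upgrade is to look at its response headers directly, bypassing the proxy.  The helper below is a sketch: the function name is mine, and the backend address in the usage comment is an assumption, since where Apache listens will vary.

```shell
# Succeeds if the response headers on stdin offer an upgrade to HTTP/2
# (for example "Upgrade: h2,h2c").
advertises_h2_upgrade() {
    grep -qi '^upgrade:.*h2'
}

# Usage, talking to the backend directly (address is hypothetical):
#   curl -sI http://127.0.0.1:8080/ | advertises_h2_upgrade \
#       && echo "backend still offers h2"
```

After a2dismod http2 and a reload, the same check should stop matching.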

Tuesday, December 17, 2024

What I Learned Trying to Install Kubuntu (alongside Pop!_OS)

First and foremost, once again, this is clearly not a supported configuration that I tried to make.  I'm sure that if I wiped the drive and started afresh, things would have gone much better.  I just… wanted to push the envelope a bit.

Pop!_OS installs (with encryption) with the physical partition as a LUKS container, holding an LVM volume group, and the root filesystem is on a logical volume within.  The plan was hatched:

  • Create a logical volume for /home and move those files over to it
  • Create a logical volume for Kubuntu’s root filesystem
  • Install Kubuntu into the new volume, and share /home for easy switching (either direction)

Things immediately got weird.  The Kubuntu installer (calamares) knows how to install into a logical volume, but it doesn’t know how to open the LUKS container.  I quit the installer, unlocked the thing, and restarted the installer.  This let the installation proceed, up to the point where it failed to install grub.

Although that problem can be fixed, the whole installation ended up being irretrievably broken, all because booting Linux is clearly not important enough to get standardized. Oh well!

Sunday, December 8, 2024

Side Note: Firefox’s Primary Password is Local

When signing into Firefox Sync to set up a new computer, the primary password is not applied.  I usually forget this, and it takes a couple of runs for me to remember to set it up.

That’s not enough for a post, so here are some additional things about it:

The primary password protects all passwords, but not other data.  If someone can access Firefox data, bookmarks and history are effectively stored in the clear.

The primary password is intended to prevent reading credentials… and the Sync password is one of those credentials.  That’s why a profile with both Sync and a primary password wants that password as soon as Firefox starts; it wants to check for new data.

The same limitation of protections applies to Thunderbird.  If someone has access to the profile, they can read all historic/cached email, but they will not be able to connect and download newly received email without the primary password.

The Primary Password never times out.  As such, it creates a “before/after first unlock” distinction.  After first unlock, the password is in RAM somewhere, and the Passwords UI asking for it again is merely re-authentication.  Firefox obviously has the password saved already, because it can fill form data.

Some time ago, the hash that turns the primary password into an actual encryption key was strengthened somewhat.  I believe it is now a 10,000-iteration thing, and not just one SHA-1 invocation.  The problem with upgrading it further is that the crypto is always applied; “no password” is effectively a blank password, and the encryption key still needs to be derived from it to access the storage.  Mozilla understandably doesn’t want to introduce a noticeable startup delay for people who did not set a password.


Very recently (2024-10-17), the separate Firefox Sync authentication was upgraded.  Users need to log into Firefox Sync with their password again in order to take advantage of the change.

Sunday, December 1, 2024

Unplugging the Network

I ended up finding a use case for removing the network from something.  It goes like this:

I have a virtual machine (guest) set up with nodejs and npm installed, along with @redocly/cli for generating some documentation from an OpenAPI specification.  This machine has two NICs, one in the default NAT configuration, and one attached to a host-only network with a static IP.  The files I want to build are shared via NFS on the host-only network, and I connect over the host-only network to issue the build command.

Meaning, removing the default NIC (the one configured for NAT) loses no functionality, but it does cut npm off from the internet.  That’s an immediate UX improvement: npm can no longer complain that it is out of date! Furthermore, if the software I installed happened to be compromised and running a Bitcoin miner, it has been cut off from its C2 server, and can’t make anyone money.

An interesting side benefit is that it also cuts off everyone’s telemetry, impassively.

I can’t update the OS packages, but I’m not sure that is an actual problem.  If the code installed doesn’t have an exploit payload already, there’s no way to get one later.  The vulnerability remains, but nothing is there to go after it.

Level Up

(Updated 2024-12-19: this section was a P.S. hypothetical on the original post. Later sections are added.)

It is actually possible to deactivate both NICs.  The network was used for only two things: logging in to run commands, and to (re)use the NFS share to get the files.

Getting the files is easy: they can be shared using the hypervisor’s shared-folders system.  Logging in to run commands can be done on the hypervisor’s graphical console.  As a bonus, if the machine has a snapshot when starting, it can be shut down by closing the hypervisor’s window and reverting to snapshot.

Now, we really have a network-less (and stateless) appliance.

Reconfigure

Before I made that first snapshot, I configured the console to boot with the Dvorak layout, because the default of Qwerty is pretty much why I use SSH when available for virtual machines.  But then, after a while, I got tired of being told that the list of packages was more than a week old, so I set out to de-configure some other things.

I cleared out things that would just waste energy on a system that would revert to snapshot: services like rsyslog, cron, and logrotate.  Then I trawled through systemctl list-units --all and cleared a number of timers, such as ones associated with “ua”, apt, dpkg, man-db, and update-notifier.  Any work these tasks do will simply be thrown away every time.
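For the timers, a dry run is a handy first step: print the commands before committing to them.  This is only a sketch; the function name is invented, the timer names are examples of the kind found on an Ubuntu guest, and disable --now is one way of “clearing” a timer (masking is another).

```shell
# Print (rather than run) the commands that would disable a set of timers.
# The timer names passed in are examples; check `systemctl list-timers --all`
# for the ones actually present on a given system.
disable_timers_dry_run() {
    for t in "$@"; do
        printf 'systemctl disable --now %s\n' "$t"
    done
}

disable_timers_dry_run apt-daily.timer apt-daily-upgrade.timer man-db.timer
```

Once the list looks right, the printed lines can be run as root.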

I took the pam_motd modules out of /etc/pam.d/login, too.  If Canonical doesn't want me to clear out the dynamic motd entirely, the next best thing is to completely ignore it.

After a reboot, I went through systemd-analyze critical-chain and its friend, systemd-analyze blame, and turned off more things, like ufw and apport.

With all that out of the way, I rebooted and checked how much memory my actual task consumed; it was apparently a hundred megabytes, so I pared the machine’s memory allocation down from 2,048 MiB to 512 MiB.  The guest runs with neither swap nor earlyoom, so I didn’t want to push it much farther, but 384 MiB is theoretically possible.

NFS

A small, tiny note: besides cutting off the Internet as a whole, sharing files from the hypervisor instead of NFS adds another small bit of security.  The NFS export is a few directories up, and the host has no_subtree_check to improve performance on the other guest that the mount is actually meant for.

Super theoretically, if the guest turned evil, it could possibly look around the entire host filesystem, or at least the entire export.  When using the hypervisor’s file sharing, only the intended directory is accessible to the guest kernel.