The Unix login database

The Unix login database has been a part of Unix since the 1970s. It existed in BSD version 1 in 1977 and in Research UNIX Version 7 in 1979. It is a binary database with (nowadays) three tables that stores information about interactive login sessions as they come and go on the system.

The BSDs and Linux operating systems all possess it.

The database files

The database on modern Unices and Linux operating systems comprises three files, each of which is a single database table. There are no index files, and each table file comprises a sequence of fixed-length database records.
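Because every record has the same length and there is no index, reading a table amounts to reading fixed-size chunks until the file runs out. The following is a minimal sketch of that idea, not the code of any real tool. The record layout is an assumption: it is the 384-byte `struct utmp` used by the GNU C library on x86-64 Linux, and other platforms and C libraries lay their records out differently. The name `read_records` is purely illustrative.

```python
import io
import struct

# Assumed layout: glibc's `struct utmp` on x86-64 Linux (384 bytes per record).
# Fields, in order: type, (padding), pid, line, id, user, host, exit status (2),
# session, seconds, microseconds, address (4), reserved.
UTMP_RECORD = struct.Struct("<h2xi32s4s32s256shhiii4i20s")

def text(field):
    """Decode a NUL-padded fixed-width character field."""
    return field.split(b"\0", 1)[0].decode(errors="replace")

def read_records(stream):
    """Yield (type, pid, line, id, user, seconds) for each whole record."""
    while True:
        chunk = stream.read(UTMP_RECORD.size)
        if len(chunk) < UTMP_RECORD.size:
            return  # end of table, or a truncated trailing record
        fields = UTMP_RECORD.unpack(chunk)
        ut_type, pid, line, ut_id, user = fields[:5]
        seconds = fields[9]
        yield (ut_type, pid, text(line), text(ut_id), text(user), seconds)

# A synthetic two-record table, so that the sketch runs without a real file.
sample = UTMP_RECORD.pack(7, 1234, b"tty1", b"1", b"alice", b"",
                          0, 0, 0, 1700000000, 0, 0, 0, 0, 0, b"")
records = list(read_records(io.BytesIO(sample * 2)))
```

On a system whose records really do have this layout, the same generator could be handed a table file opened in binary mode instead of the in-memory sample.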

The table of active login sessions

On FreeBSD and derivatives: /run/utx.active
On Linux, OpenBSD, and MacOS before 10.3: /run/utmp

This gives a snapshot of what login sessions are actually active at the moment. Entries are written here when login sessions begin and end. This table is non-persistent, first by dint of living on a filesystem that is not persistent, and second by dint of being emptied of all records as part of system startup and shutdown.

The table of active login sessions on FreeBSD and its derivatives

Operating systems such as FreeBSD have a simple system for this table. It is driven from a PAM plug-in module named pam_lastlog, which is invoked when PAM sessions are created and destroyed by the login program. The SSH server does much the same, but does internally what the PAM module does rather than invoking the module.

When the PAM session is created the plug-in adds a "user" entry to this table giving the process ID of the PAM session manager process, which is the login program, or the process ID of the (unprivileged) SSH session manager process.
When the PAM session is destroyed, the PAM plug-in or SSH login session manager overwrites this entry with a "dead" entry.
The entire table is cleared at bootstrap and shutdown, as a side-effect of a utility that writes "boot" and "shutdown" events to the event log (discussed further on). This side-effect is actually built in to a BSD C library function that such utilities use.

Thus there are only two types of entry in this table; and they are created and destroyed symmetrically by the same process, the process that is performing the PAM, or SSH, session management. (Contrary to folklore, process #1 does not touch the login database at all.) Moreover, process IDs in this table are not those of the actual session leader processes in the login session, but of their various parent processes that spawned them.

The FreeBSD manual claims that it restricts indefinite table growth by re-using stale entries in the table wherever possible. In fact, it does not. The mechanism that it uses is somewhat thwarted by the PAM plug-in module assigning a new pseudo-random "id" value to each PAM session, meaning that existing stale "user" entries are not re-used in practice, only "dead" entries.
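The effect of the pseudo-random "id" can be seen in a small model. This is a simplified sketch of the re-use rule as described here (overwrite an entry with a matching "id", otherwise take the first "dead" slot, otherwise append), not the actual C library code; the dictionary fields and the function name `add_user_entry` are illustrative assumptions.

```python
import secrets

def add_user_entry(table, entry_id, pid, user):
    """Write a "user" entry: re-use a slot with the same id, else the first
    "dead" slot, else append a new record. Returns the slot index used."""
    new_entry = {"type": "user", "id": entry_id, "pid": pid, "user": user}
    first_dead = None
    for index, entry in enumerate(table):
        if entry["id"] == entry_id:
            table[index] = new_entry
            return index
        if first_dead is None and entry["type"] == "dead":
            first_dead = index
    if first_dead is not None:
        table[first_dead] = new_entry
        return first_dead
    table.append(new_entry)
    return len(table) - 1

# One stale "user" entry (its session was never marked "dead") and one "dead" entry.
table = [
    {"type": "user", "id": "stale-id", "pid": 100, "user": "alice"},
    {"type": "dead", "id": "old-id", "pid": 101, "user": ""},
]

# Every new session arrives with a fresh pseudo-random id, so it never
# matches the id of the stale "user" entry.
first_slot = add_user_entry(table, secrets.token_hex(8), 200, "bob")
second_slot = add_user_entry(table, secrets.token_hex(8), 201, "carol")
```

In this model the first new session takes over the "dead" slot, the second has to grow the table, and the stale "user" record at slot 0 is never reclaimed.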

The table of active login sessions on Linux operating systems

Linux operating systems that have historically used the obsolete van Smoorenburg clone of the AT&T System 5 unix inittab mechanism have a more complex system for this table. It contains many types of entry, some of which are also obsolete. The system is further confounded by local TUI login via the login program, remote login via an SSH server, local GUI login, and GUI terminal emulators running "login" shells, all having significantly different behaviours from one another.

For real terminal or virtual terminal login services that invoke the login program, the modern behaviour (as opposed to the obsolete van Smoorenburg behaviour) is as follows:

Some service management systems write an "init" entry to this table for every TTY login service, containing the process ID of the main process spawned and a fake inittab "id", (usually) generating that "id" from the name of the terminal that the TTY login session is using. An existing stale record, if it exists, is looked up by inittab "id" and replaced; or a new entry is added to the table. The record may even contain a terminal device name (the "line"), as that information is sometimes knowable at this point, unlike in the case of the original van Smoorenburg clone where it was not.
The TTY login service may invoke a getty program of some kind. This rewrites that "init" entry as a "getty" entry before it performs any speed negotiation and issues its login prompt. The getty program doesn't know its (fake) inittab "id", that information not actually being passed to the getty program. But it does know its own process ID, which has been recorded in the "init" entry. So it looks up the entry to change, searching by process ID.
These "init" and "getty" entries are rare in practice for virtual terminals, however. For virtual terminal login, some service management systems try to avoid starting the TTY login services until the virtual terminal has been activated, whereupon log-on is likely to swiftly follow. Other systems do not bother writing an "init" entry at all, or even running a getty program of any kind; and instead they invoke the login program directly, after the comparatively smaller amount of initialization work needed by virtual terminals.
The login program, before it prompts for the password or (if invoked directly) account name, either rewrites an existing "getty" entry as a "login" entry, or creates a new "login" entry.
Once the login program has successfully logged a user in and is spawning the logged-in account's login shell as the login session's session leader process, it changes the "login" entry to a "user" entry, changing the process ID in the entry from its own to that of the spawned session leader process.
Although the login program should clean up the table entries as part of its other cleanup actions (such as closing its PAM session) when the spawned login session leader process terminates, it actually does not. Cleanup at logoff is unfortunately not symmetrical with the writing of the entries at logon. Instead, it is left to the program that spawned the login program in the first place to clean up entries, changing them to "dead".
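The progression of a single record through these steps can be modelled in a few lines. This is only a sketch of the sequence described above, with the table held as an in-memory list. The text does not say how the login program or the spawning program locate the record, so two lookups here are assumptions: the login program searches by its own process ID (which it would share with a getty that executed it), and the spawning program searches by inittab "id". The path where the login program is invoked directly and creates a fresh "login" entry is left out.

```python
def find(table, key, value):
    """Return the index of the first entry whose field `key` equals `value`."""
    for index, entry in enumerate(table):
        if entry.get(key) == value:
            return index
    return None

def write_init(table, entry_id, pid, line):
    """Service manager: replace the stale record with this id, or append."""
    entry = {"type": "init", "id": entry_id, "pid": pid, "line": line, "user": ""}
    index = find(table, "id", entry_id)
    if index is None:
        table.append(entry)
    else:
        table[index] = entry

def rewrite(table, key, value, new_type, **changes):
    """Find one entry by the given field and change its type in place."""
    index = find(table, key, value)
    if index is None:
        return False
    table[index].update(type=new_type, **changes)
    return True

# The table starts with a stale record left over from an earlier session.
table = [{"type": "dead", "id": "tty1", "pid": 90, "line": "tty1", "user": ""}]
history = []

write_init(table, "tty1", 500, "tty1")                      # service manager
history.append(table[0]["type"])
rewrite(table, "pid", 500, "getty")                         # getty, by its own PID
history.append(table[0]["type"])
rewrite(table, "pid", 500, "login")                         # login (assumed: by PID)
history.append(table[0]["type"])
rewrite(table, "pid", 500, "user", pid=501, user="alice")   # login, at log-on
history.append(table[0]["type"])
rewrite(table, "id", "tty1", "dead", user="")               # spawner (assumed: by id)
history.append(table[0]["type"])
```

Note that after the "user" step the record carries the process ID of the session leader, not of the login program, which is one reason why the final cleanup in this model cannot search by the process ID that the spawning program knows about.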

For remote login services, the somewhat different behaviour is:

The SSH login session manager writes a "user" entry to this table, similar to FreeBSD. It writes no "init", "getty", or "login" entries. The process ID is, however, not the process ID of the (unprivileged) SSH session manager process but the process ID of the spawned session leader login shell. The inittab "id" is derived from the name of the slave device of the pseudo-terminal allocated for the login session, albeit in a way that produces rather odd results (e.g. "ts/0").
The SSH login session manager cleans up the entry when the spawned login session leader process terminates, changing it to "dead".

For GUI login services for subsystems like GNOME or MATE, the further different behaviour is:

The GUI login manager writes a "user" entry to this table. It writes no "init", "getty", or "login" entries. The terminal device is that of the virtual terminal being used by the X server. The process ID is, however, not the process ID of the X "session manager" process or of the X server but the process ID of the common parent of both. Moreover, the GUI login manager does not fill in an inittab "id", and always creates a new record in the table.
The GUI login session manager overwrites the entry with a "dead" entry when the login session ends.

For GUI terminal emulators running "login" shells, the yet further different behaviour is:

The terminal emulator, or a helper process that runs with elevated permissions, writes a "user" entry to this table. It writes no "init", "getty", or "login" entries. The inittab "id" is derived from the terminal device name, the terminal device is that of the pseudo-terminal slave, and the process ID is that of the session leader "login" shell.
The terminal emulator, or its helper, overwrites the entry with a "dead" entry when the emulated terminal session ends.

These inconsistent behaviours leave the table in a mess.

The GUI login session manager never re-uses any "dead" entries, and nothing else re-uses its "dead" entries; and so its terminated login sessions gradually accrue as more and more "dead" records in the table. This can be particularly messy if the system allocates virtual terminals to GUI login sessions on the fly as well as spawning TUI login services on demand, as it can result in both "dead" GUI login and "login" or "user" TUI login entries simultaneously for a single virtual terminal.
Not everyone derives an inittab "id" from a pseudo-terminal slave device name in the same odd way. The system for re-using stale "user" or "dead" entries relies upon the inittab "id"s matching, and so one ends up (depending on what software one actually runs) with login sessions not overwriting one another's old stale records for the same pseudo-terminal if they happen to re-use it, although they will re-use their own.
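A small model shows how mismatched "id" derivations leave several records for one pseudo-terminal. Both derivation rules below are assumptions for illustration: `id_last_four` keeps the last four characters of the device name, which is one rule that would yield the "ts/0" seen earlier, and `id_strip_prefix` is an entirely hypothetical alternative that a different program might use.

```python
def id_last_four(line):
    """Keep the last four characters of the device name: "pts/0" -> "ts/0"."""
    return line[-4:]

def id_strip_prefix(line):
    """Hypothetical alternative: keep only what follows the last slash."""
    return line.rsplit("/", 1)[-1][-4:]

def log_on_and_off(table, make_id, line, pid):
    """Write an entry keyed by id, re-using a record with the same id, and
    leave it in the "dead" state that it has once the session is over."""
    entry = {"type": "dead", "id": make_id(line), "line": line, "pid": pid}
    for index, old in enumerate(table):
        if old["id"] == entry["id"]:
            table[index] = entry
            return
    table.append(entry)

table = []
log_on_and_off(table, id_last_four, "pts/0", 300)     # say, a remote login
log_on_and_off(table, id_strip_prefix, "pts/0", 301)  # another program, same terminal
log_on_and_off(table, id_last_four, "pts/0", 302)     # the first program again

ids = [entry["id"] for entry in table]
lines = [entry["line"] for entry in table]
```

The table ends up with two "dead" records for the same pseudo-terminal: each program overwrites its own earlier record but never the other's.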

The log of login events

On FreeBSD and derivatives: /var/log/utx.log
On Linux, OpenBSD, and MacOS before 10.3: /var/log/wtmp

This is a binary log of all login session events that just grows indefinitely until trimmed. It is stored on persistent storage and is not wiped at either bootstrap or shutdown. With it, one can determine when a system was bootstrapped and shut down, any major clock change events (major enough to affect a log of login sessions), and a sequence of logon and logoff events.

On FreeBSD and its derivatives, this contains "boot", "shutdown", "old time", "new time", "user", and "dead" entries.

The "user" and "dead" entries are written by the same BSD C library routine that writes them to the table of active login sessions, and thus by the same programs at the same points.
The "boot" and "shutdown" entries are written by a program that is run at bootstrap, before entering normal (a.k.a. "multi-user") mode, and as part of transitioning to shutdown mode. The BSD C library routine that it uses to write them also clears the active login sessions table as a side-effect, as aforementioned.
The "old time" and "new time" entries should be written by any program that makes major changes to the system clock, so that programs (and thus people) reading the login database can account for system clock changes when interpreting the timestamps on the various records.

On Linux operating systems, this contains "init", "getty", "login", "user", "dead", "boot", "old time", "new time", and "runlevel" entries.

The "init", "getty", "login", "user" and "dead" entries are written by the programs that write them to the table of active login sessions.
The "boot" and "runlevel" entries are written by a program that is run at bootstrap, before entering normal (a.k.a. "multi-user") mode, and as part of transitioning to shutdown mode.
The "old time" and "new time" entries should be written by any program that makes major changes to the system clock, so that programs (and thus people) reading the login database can account for system clock changes when interpreting the timestamps on the various records.
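As an illustration of how a reader of the log might account for clock changes, here is a sketch that discounts clock steps when computing how long a session lasted. It assumes that each clock change appears in the log as an "old time" record immediately followed in time order by its "new time" record; the event dictionaries and the function name `corrected_duration` are illustrative and not taken from any real tool.

```python
def corrected_duration(events, start_index, end_index):
    """Elapsed seconds between two log records, discounting the clock steps
    recorded as "old time"/"new time" pairs that lie between them."""
    naive = events[end_index]["time"] - events[start_index]["time"]
    stepped = 0
    old_time = None
    for event in events[start_index + 1:end_index]:
        if event["type"] == "old time":
            old_time = event["time"]
        elif event["type"] == "new time" and old_time is not None:
            stepped += event["time"] - old_time
            old_time = None
    return naive - stepped

events = [
    {"type": "user", "time": 1000},      # logon
    {"type": "old time", "time": 1100},  # clock reading just before the change
    {"type": "new time", "time": 5000},  # clock reading just after the change
    {"type": "dead", "time": 5050},      # logoff
]

naive = events[3]["time"] - events[0]["time"]
corrected = corrected_duration(events, 0, 3)
```

Here the raw timestamps suggest a session of 4050 seconds, whereas the session actually lasted 150 seconds with the clock stepped forward by 3900 seconds in the middle.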

The program that is run at bootstrap, before entering normal (a.k.a. "multi-user") mode, and as part of transitioning to shutdown mode, is something such as utx, systemd-update-utmp, or login-update-utmpx.

The table of last account logins

On FreeBSD and derivatives: /var/log/utx.lastlogin
On Linux, OpenBSD, and MacOS before 10.3: /var/log/lastlog

This gives a snapshot of the last login session start time for all accounts on the system.

On FreeBSD and its derivatives, this contains just "user" entries. These entries are written by the same BSD C library routine that writes them to the table of active login sessions, and thus by the same programs at the same points. The table is neither sorted nor sparse; rather, the BSD C library always searches for and overwrites any existing entry that has the same account name when writing to the table, appending new entries to grow the table by one record if it finds nothing to overwrite. (Login sessions for accounts with different names and the same user IDs will show up as separate entries in this table.) This makes it potentially slow to locate a specific account, which involves a table scan with account name comparisons, but fast to dump information on all accounts that have ever logged in.
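The search-and-overwrite behaviour can be sketched with an in-memory list standing in for the table file. The function names and record fields are illustrative assumptions; the real routine works on fixed-length binary records.

```python
def record_last_login(table, user, when):
    """Overwrite the record with the same account name, else append one."""
    for entry in table:
        if entry["user"] == user:
            entry["time"] = when
            return
    table.append({"user": user, "time": when})

def last_login(table, user):
    """Locate one account: a linear scan comparing account names."""
    for entry in table:
        if entry["user"] == user:
            return entry["time"]
    return None

table = []
record_last_login(table, "alice", 100)
record_last_login(table, "toor", 200)   # a second account name for user ID 0
record_last_login(table, "root", 300)   # same user ID, but a separate record
record_last_login(table, "alice", 400)  # overwrites alice's earlier record

everyone = [(entry["user"], entry["time"]) for entry in table]
```

The table holds exactly one record per account name that has ever logged in, so dumping it is a straight read from start to end, while finding one account costs a scan.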

On Linux operating systems this contains one record per system account and is indexed by account user ID. It is thus normally a very large but sparse file, with the empty never-written-to records for accounts that have never been logged into causing holes in the file. (Login sessions for accounts with different names and the same user IDs will supersede one another in this table, leaving just the latest one present.) This makes it fast to locate a specific account, the position of whose record can be directly indexed to, but slow to dump information on all accounts because it has to skip a lot of empty records.
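Direct indexing by user ID, and the holes that it leaves, can be sketched as follows. The record layout is an assumption: a 292-byte record of a 32-bit time, a 32-byte terminal name, and a 256-byte host name, as used by the GNU C library on x86-64 Linux. The sketch writes to a scratch file in a temporary directory, not to the real table, and the function names are illustrative.

```python
import os
import struct
import tempfile

# Assumed layout: 32-bit time, 32-byte line, 256-byte host (292 bytes).
LASTLOG_RECORD = struct.Struct("<i32s256s")

def record_login(path, uid, when, line, host=b""):
    """Seek straight to the record for this user ID and overwrite it."""
    with open(path, "r+b") as table:
        table.seek(uid * LASTLOG_RECORD.size)
        table.write(LASTLOG_RECORD.pack(when, line, host))

def last_login(path, uid):
    """Return (time, line) for a user ID, or None if it has never logged in."""
    with open(path, "rb") as table:
        table.seek(uid * LASTLOG_RECORD.size)
        chunk = table.read(LASTLOG_RECORD.size)
    if len(chunk) < LASTLOG_RECORD.size:
        return None  # beyond the end of the file
    when, line, _host = LASTLOG_RECORD.unpack(chunk)
    if when == 0:
        return None  # an empty, never-written-to record
    return when, line.split(b"\0", 1)[0].decode()

workdir = tempfile.TemporaryDirectory()
path = os.path.join(workdir.name, "lastlog")
open(path, "wb").close()  # start with an empty table

record_login(path, 1000, 1700000000, b"tty1")
record_login(path, 65534, 1700000500, b"pts/0")

apparent_size = os.path.getsize(path)
```

Only two records have been written, yet the apparent file size covers 65535 records; on a filesystem that supports sparse files most of that is holes rather than allocated storage.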

The commands

On both sets of platforms, the who and the w commands (by default) read from the active login sessions table; and the last command reads from the log of login events. On FreeBSD and its derivatives, the lastlogin command reads from the table of last account logins. On Linux operating systems, it is the lastlog command that reads from the table of last account logins.

On both sets of platforms, the "boot" entry for the current bootstrap can be found in both the active login sessions table and the log of login events, and so can be listed with either who -b or (since it always has an account name of "reboot") last reboot.

On FreeBSD and its derivatives, the "shutdown" entry for the preceding bootstrap can be found in the log of login events, and so can be listed with (since it always has an account name of "reboot") last reboot. On Linux operating systems, there is no "shutdown" entry type, but the log of login events does include "runlevel" events, of which system shutdown is considered to be one, that can be listed with who -r.

The active login sessions table and the log of login events are both created by tools that are run at bootstrap before login services are activated. These tools are login-update-utmpx, the FreeBSD utx, or the systemd systemd-update-utmp.