Jul 17 2021

Making a Window Manager

By Edward Wibowo

In pursuit of achieving a better understanding of window managers, I created my own rendition of a tiling window manager: critwm.

At the time of writing, critwm is by no means a perfectly functioning window manager; it is riddled with bugs and impractical solutions. However, I would say it is somewhat usable and contains the core features of a window manager with some extra bells and whistles.

Extra features that can be found in critwm include:

Compile-time configuration
Dynamic layout switching
Multiple monitor support (with Xinerama)
Inter-process communication support

This blog post is not a tutorial of any sort. It is more of an overview of some of the things I learned throughout the process of writing critwm.

Backend

Beneath the layers of superfluous features such as layout management and IPC is a backend that communicates with the X server over the X11 display protocol. X11 is a windowing system commonly found on systems running Linux, and it provides a thorough framework for manipulating and retrieving information about windows on a screen. Xlib, a client library for the protocol, provides a myriad of useful functions: the size, position, and state of every window can be manipulated through it.

However, Xlib is written in C, so critwm, being written in Rust, relies on bindings: specifically, x11-dl. The usage of some Xlib functions deviates slightly because of Rust’s particular intricacies; for the most part, however, the bindings map one-to-one onto the C functions needed to build a window manager.

Handling X11 Events

The main job of the backend is to read, interpret, and act upon events received from the X11 server. An idiomatic Rust way to interpret these events is pattern matching:

pub fn handle_event(&mut self) -> CritResult<()> {
    // ...
    let mut event: xlib::XEvent = unsafe { mem::zeroed() };
    unsafe { (xlib.XNextEvent)(self.display, &mut event) };
    match event.get_type() {
        xlib::KeyPress => {
            // ...
        }
        xlib::MotionNotify => {
            // ...
        }
        // ...
        _ => {}
    }
    Ok(())
}

Listing 1: X11 Event handling using Rust pattern matching.

The function handle_event forms the core of the backend and is the gateway to the majority of the window manager’s functionality. A list of the possible event types can be found here. A basic window manager does not need to respond to every event.

Handling User-Invoked Events

In addition to X11 events, the window manager should be able to respond to user events as well. For example, the user may want to kill a client, toggle floating mode, or change workspace. critwm handles this through a signal system, which is a stack of pending signals (user-invoked events).

A list of possible user events can be summarized in the following enum:

#[derive(Debug, Clone)]
pub enum Signal {
    Quit,
    KillClient,
    ToggleFloating,
    ToggleBar,
    SetLayout(usize),
    ChangeWorkspace(usize),
    MoveToWorkspace(usize),
    FocusMon(Dir),
    FocusStack(Dir),
    // A combination of FocusStack and FocusMon. Only focuses another monitor if there is no other
    // client to focus on the stack without looping.
    FocusDir(Dir),
}

Listing 2: Signal enum.

Some signals have an associated value to further specify the actions that should be taken. For instance, SetLayout has an associated usize to specify which layout to switch to.

These signals are then sent to a global stack defined in the following snippet:

lazy_static! {
    // SIGNAL_STACK stores global signals that are executed accordingly in the backend.
    // This system allows signals to be freely added and executed externally.
    pub static ref SIGNAL_STACK: Arc<Mutex<Vec<Signal>>> = Arc::new(Mutex::new(Vec::new()));
}

Listing 3: Global SIGNAL_STACK for user-invoked event handling.

The lazy_static! macro allows SIGNAL_STACK to be static and initialized at runtime. The usage of Arc and Mutex ensures thread safety so it can be manipulated by both the user and backend.

With access to SIGNAL_STACK, the backend’s job is to simply pop a signal off the stack and act upon it.

// Returns true if quit signal is handled.
pub fn handle_signal(&mut self) -> CritResult<bool> {
    if let Some(signal) = SIGNAL_STACK.lock().unwrap().pop() {
        match signal {
            Signal::Quit => {
                self.quit();
                return Ok(true);
            }
            Signal::KillClient => self.kill_client(),
            // ...
            Signal::FocusDir(direction) => self.focus_dir(direction),
        }
    }
    Ok(false)
}

Listing 4: Backend signal interpretation using Rust pattern matching.

Similar to the handle_event function in Listing 1, handle_signal is invoked in the main loop and simply polls for any pending signals.
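To make the polling behaviour concrete, here is a minimal, standalone sketch of the same idea using only the standard library. It is not critwm's code: the trimmed-down Signal enum and the send and drain helpers are hypothetical names, and a plain static Mutex (possible since Rust 1.63) stands in for the lazy_static! and Arc combination.

```rust
use std::sync::Mutex;

// Simplified stand-in for critwm's Signal enum (only a few variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Signal {
    Quit,
    KillClient,
    SetLayout(usize),
}

// Hypothetical global stack. Mutex::new and Vec::new are const, so a plain
// static works here without lazy_static (assumes Rust 1.63 or newer).
pub static SIGNAL_STACK: Mutex<Vec<Signal>> = Mutex::new(Vec::new());

// Hypothetical helper that a key binding or another thread could call.
pub fn send(signal: Signal) {
    SIGNAL_STACK.lock().unwrap().push(signal);
}

// Pops pending signals until the stack is empty or Quit is seen.
// Returns the signals in the order they were handled.
pub fn drain() -> Vec<Signal> {
    let mut handled = Vec::new();
    loop {
        // The lock guard is a temporary, so it is released at the end of
        // this statement and senders are never blocked while we "act".
        let next = SIGNAL_STACK.lock().unwrap().pop();
        match next {
            Some(Signal::Quit) => {
                handled.push(Signal::Quit);
                break;
            }
            Some(signal) => handled.push(signal),
            None => break,
        }
    }
    handled
}
```

One consequence of popping from a Vec is that signals are handled last-in, first-out: if KillClient is sent before SetLayout(1), drain handles SetLayout(1) first. In practice the stack rarely holds more than one signal between polls, but a VecDeque with pop_front would be the swap to make if strict first-in, first-out ordering mattered.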

Extended Window Manager Hints

A good indicator of a thorough X11-based window manager is the extent to which it handles Extended Window Manager Hints (EWMH). In essence, EWMH is a set of standards made specifically for the interaction between a window manager and the clients it manages. EWMH involves a set of protocols that should each be handled according to some guidelines. As a quick disclaimer, critwm does not handle all protocols and instead handles a select number of them to provide a basic window managing experience.

EWMH protocols are prefixed with _NET_. Some examples:

_NET_SUPPORTING_WM_CHECK
_NET_WM_WINDOW_TYPE_DIALOG
_NET_WM_STATE_FULLSCREEN

The names of the protocols are quite descriptive but more documentation about them can be found here.

These protocols are represented as atoms in code. Atoms are used to identify and differentiate between protocols and other properties beyond the scope of EWMH. Atoms are obtained from the Xlib function XInternAtom, which is called as follows:

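fn get_atom(xlib: &xlib::Xlib, display: *mut xlib::Display, name: &str) -> xlib::Atom {
    // Keep the CString alive for the duration of the call. Using into_raw() here would
    // hand ownership to a raw pointer that is never reclaimed, leaking the string.
    let name = CString::new(name).unwrap_or_default();
    unsafe { (xlib.XInternAtom)(display, name.as_ptr(), xlib::False) }
}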

Listing 5: Atom fetching function.

The get_atom function in Listing 5 can thus be invoked like Self::get_atom(xlib, display, "_NET_WM_STATE_FULLSCREEN"), where the protocol name is given as the third argument (_NET_WM_STATE_FULLSCREEN).

With the aid of some helper functions, the state of a window can be queried:

if let Some(state) = self.get_atom_prop(self.clients[index].window, self.atoms.net_wm_state)
{
    if state == self.atoms.net_wm_state_fullscreen {
        self.toggle_fullscreen(index);
    }
}

Listing 6: Querying state of a window using atoms.

Layouts

One of the extra features of critwm is dynamic layout switching. Although critwm is mainly a tiling window manager (tiling is the default layout), it allows other layouts to be toggled at run time. For example, critwm supplies a floating mode, which allows windows to be freely spawned and moved around.

Fundamentally, each layout is a function that takes in information about the current clients and monitors and returns a vector of window geometries. The vector of window geometries represents the new positions and sizes of the clients, in the order the clients were passed. All layout functions share the same function type, LayoutFunc:

pub type LayoutFunc =
    fn(usize, usize, &MonitorGeometry, &[Client], &BarStatus) -> Vec<WindowGeometry>;

Listing 7: Layout function type.

For example, the following is the float function, representing the logic behind the floating layout:

pub fn float(
    monitor_index: usize,
    workspace: usize,
    monitor_geometry: &MonitorGeometry,
    clients: &[Client],
    _bar_status: &BarStatus,
) -> Vec<WindowGeometry> {
    clients
        .iter()
        .map(|client| {
            let mut geometry = client.get_geometry().clone();
            // Set client position to current monitor if it is arrangeable and currently outside.
            if layouts::is_arrangeable(client, monitor_index, workspace)
                && !monitor_geometry.has_window(&geometry)
            {
                geometry.x = monitor_geometry.x;
                geometry.y = monitor_geometry.y;
            }
            geometry
        })
        .collect::<Vec<WindowGeometry>>()
}

Listing 8: Floating mode layout function.

Essentially, float doesn’t arrange any existing windows. Instead, it only checks for clients that are not positioned within their corresponding monitor. When it finds one, it positions it at the top left corner of the monitor.

One interesting component of critwm’s layout functions is layouts::is_arrangeable(). critwm only lets a layout manipulate a client if the client is:

Not fullscreen.
Not floating.
Not a dock-type window.
Present in the monitor being arranged.
Present in the workspace being arranged.

These conditions can be condensed into the function is_arrangeable:

fn is_arrangeable(client: &Client, monitor_index: usize, workspace: usize) -> bool {
    // Layouts should only modify the geometry of clients that are arrangeable.
    !client.fullscreen
        && !client.floating
        && !client.dock
        && client.monitor == monitor_index
        && client.workspace == workspace
}

Listing 9: Checking if a client is arrangeable.
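To show how a layout other than float could use is_arrangeable, here is a rough sketch of a master-and-stack tiling function. This is not critwm's actual tiling layout. The Geometry and Client types are simplified, hypothetical stand-ins, and the bar status parameter of LayoutFunc is left out to keep the example self-contained.

```rust
// Hypothetical, simplified stand-ins for critwm's geometry and client types.
#[derive(Debug, Clone, PartialEq)]
pub struct Geometry {
    pub x: i32,
    pub y: i32,
    pub width: u32,
    pub height: u32,
}

#[derive(Debug, Clone)]
pub struct Client {
    pub geometry: Geometry,
    pub fullscreen: bool,
    pub floating: bool,
    pub dock: bool,
    pub monitor: usize,
    pub workspace: usize,
}

fn is_arrangeable(client: &Client, monitor_index: usize, workspace: usize) -> bool {
    !client.fullscreen
        && !client.floating
        && !client.dock
        && client.monitor == monitor_index
        && client.workspace == workspace
}

// The first arrangeable client takes the left half of the monitor; the
// remaining ones split the right half evenly from top to bottom.
// Clients that are not arrangeable keep their current geometry.
pub fn tile(
    monitor_index: usize,
    workspace: usize,
    monitor: &Geometry,
    clients: &[Client],
) -> Vec<Geometry> {
    let count = clients
        .iter()
        .filter(|c| is_arrangeable(c, monitor_index, workspace))
        .count() as u32;
    let mut position = 0u32; // Index among arrangeable clients only.
    clients
        .iter()
        .map(|client| {
            if !is_arrangeable(client, monitor_index, workspace) {
                return client.geometry.clone();
            }
            let index = position;
            position += 1;
            if count == 1 {
                // A lone client fills the whole monitor.
                return monitor.clone();
            }
            let half = monitor.width / 2;
            if index == 0 {
                Geometry { x: monitor.x, y: monitor.y, width: half, height: monitor.height }
            } else {
                let height = monitor.height / (count - 1);
                Geometry {
                    x: monitor.x + half as i32,
                    y: monitor.y + ((index - 1) * height) as i32,
                    width: monitor.width - half,
                    height,
                }
            }
        })
        .collect()
}
```

The returned vector has exactly one entry per input client, in input order, which is the contract described above: the backend can apply geometry i to client i without needing to know which clients the layout chose to skip.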

Configuration

Compile-Time Configuration

critwm uses compile-time configuration. This means that users of the window manager can customize the functionality before the program is even compiled. At first, this may seem quite counter-intuitive because it requires recompilation every time a customization change is made; however, it provides the following benefits:

Less prone to run time crashes due to configuration faults.
Avoids the need to parse a configuration file.
Custom functionality is more easily implemented. No abstraction layer is needed.
More Rust.

Nonetheless, a compile-time configuration has many flaws:

It requires the user to have some basic knowledge of a programming language.
Compiling can take a long time to complete.
Requires additional modifications to allow live reload.

Throughout the development of critwm, I added the following to my .xinitrc to allow live reloading:

while type critwm >/dev/null; do
    RUST_LOG=debug critwm 2>/tmp/critwm.log && continue || break
done

Listing 10: Shell snippet to allow live configuration reloading.

The three lines of shell script in Listing 10, placed at the end of ~/.xinitrc, mean that quitting critwm simply restarts the executable and keeps windows open; the X server is not closed. A clean exit (status 0) continues the loop, while any non-zero exit breaks out of it, so manually killing critwm ($ killall critwm) closes the X server.

Configuration File

The file ~/.config/critwm/config.rs is used if it exists when compiling. If it doesn’t, a default configuration is supplied by the file located in the source code ./src/config.def.rs.

critwm allows users to configure aspects such as workspaces, tags, borders, cursor warping, key binds, and layouts. For example, key binds are configured using a key map:

// ./src/util.rs
macro_rules! key {
    ($modifier:expr, $sym:expr, $action:expr) => {
        (
            Key::new($modifier, $sym as crate::util::XKeysym),
            Box::new(move || $action),
        )
    };
}

// ~/.config/critwm/config.rs
pub fn get_keymap() -> HashMap<Key, Action> {
    let mut keymap: Vec<(Key, Action)> = vec![
        key!(MODKEY, XK_space, util::spawn("dmenu_run")),
        key!(MODKEY, XK_Return, util::spawn(TERMINAL)),
        key!(MODKEY, XK_j, util::signal(Signal::FocusStack(Dir::Down))),
        key!(MODKEY, XK_k, util::signal(Signal::FocusStack(Dir::Up))),
        key!(MODKEY | ControlMask, XK_l, util::signal(Signal::FocusDir(Dir::Down))),
        key!(MODKEY | ControlMask, XK_h, util::signal(Signal::FocusDir(Dir::Up))),
        // ...
    ];
    // ...
    keymap.into_iter().collect::<HashMap<Key, Action>>()
}

Listing 11: Key map configuration.

This utilizes the key! macro, which builds a tuple (Key, Action). As the tuple suggests, when the key is pressed, the action is executed, for example by sending a signal or spawning a program. The purpose of the macro is to avoid repeatedly constructing a Key and a closure inside a Box. The function get_keymap is executed and the resulting HashMap is returned to the backend, where it is consulted on KeyPress X events.
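The lookup side of this can be sketched with the standard library alone. Everything below is a simplified assumption rather than critwm's real code: Key is a hypothetical two-field struct, Action returns a String purely so the result is easy to inspect, and handle_key_press is a made-up name for what the backend does on a KeyPress event.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for critwm's Key: a modifier mask plus a keysym.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Key {
    pub modifier: u32,
    pub sym: u64,
}

// Simplification: actions return a description instead of having side effects.
pub type Action = Box<dyn Fn() -> String>;

// Mirrors the shape of the key! macro: builds a (Key, Action) pair.
macro_rules! key {
    ($modifier:expr, $sym:expr, $action:expr) => {
        (
            Key { modifier: $modifier, sym: $sym },
            Box::new(move || $action) as Action,
        )
    };
}

pub const MODKEY: u32 = 1 << 6; // Value of Mod4Mask in X11.
pub const XK_J: u64 = 0x006a; // Keysym for "j".
pub const XK_K: u64 = 0x006b; // Keysym for "k".

pub fn get_keymap() -> HashMap<Key, Action> {
    let keymap: Vec<(Key, Action)> = vec![
        key!(MODKEY, XK_J, String::from("focus down")),
        key!(MODKEY, XK_K, String::from("focus up")),
    ];
    keymap.into_iter().collect()
}

// Looks up the binding for a key press and runs it if one exists.
pub fn handle_key_press(
    keymap: &HashMap<Key, Action>,
    modifier: u32,
    sym: u64,
) -> Option<String> {
    keymap.get(&Key { modifier, sym }).map(|action| action())
}
```

Because the map is keyed on both the modifier mask and the keysym, pressing j without MODKEY finds no binding and returns None, while MODKEY plus j runs the "focus down" action.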

Conditional Building

As mentioned before, either ~/.config/critwm/config.rs or ./src/config.def.rs is included in the source code depending on the existence of the former file. To do this, Rust’s include! macro is used:

pub mod config {
    // Parse configuration from user's filesystem if custom_config.
    #[cfg(feature = "custom_config")]
    include!(concat!(env!("HOME"), "/.config/critwm/config.rs"));

    // Fallback to default config in src if not custom_config.
    #[cfg(not(feature = "custom_config"))]
    include!("config.def.rs");
}

Listing 12: Include macro for conditional building.

The include! macro is used to parse the Rust code in the given file. The configuration file included is solely dependent on the active feature.

To avoid having to specify which configuration to use, a build script is used to check the existence of the custom configuration file (~/.config/critwm/config.rs).

The following is the entire content of the build script build.rs:

use std::path::Path;

fn main() {
    let config_path = concat!(env!("HOME"), "/.config/critwm/config.rs");
    println!("cargo:rerun-if-changed={}", config_path);
    if Path::new(config_path).exists() {
        // If "$HOME/.config/critwm/config.rs" exists, pass custom_config option.
        // This means that this configuration file will be sourced instead of "src/config.def.rs".
        println!("cargo:rustc-cfg=feature=\"custom_config\"");
    }
}

Listing 13: Build script.

Inter-Process Communication

Inter-process communication (IPC) is a system that allows multiple programs to communicate with each other. critwm utilizes IPC to allow external programs to query the state of the window manager. Information such as the position of windows, current layout, current workspace, and more can be externally queried.

When critwm starts, it opens a socket at the path /tmp/critwm_state.sock and continuously listens on it for incoming streams. The socket is managed concurrently with the window manager’s main functionality, using tokio for asynchronous execution.

pub async fn listen(&mut self) -> CritResult<()> {
    if self.socket_path.exists() {
        // If the program does not close properly, i.e. StateSocket::close is not run, the
        // socket_path may still exist. Preemptively delete it to avoid "Address is already in
        // use error."
        fs::remove_file(&self.socket_path).await.ok();
    }
    let state = self.state.clone();
    let listener = UnixListener::bind(&self.socket_path)?;
    self.listener = Some(tokio::spawn(async move {
        loop {
            match listener.accept() {
                Ok((mut stream, _)) => {
                    let mut state = state.lock().await;
                    if stream.write_all(state.last_state.as_bytes()).is_ok() {
                        info!("Pushed stream: {:?}", stream);
                        state.streams.push(Some(stream));
                    }
                }
                Err(e) => {
                    error!("Listener accept failed: {:?}", e);
                }
            }
        }
    }));
    Ok(())
}

Listing 13: Socket listening.

The state of the backend is then written as JSON to every stream in a vector of UnixStream. This is done easily using Serde, which supplies traits and functions to both serialize and deserialize data; serde_json is used in conjunction with Serde to produce the JSON output. This means the fields of the backend struct can be translated and aggregated into a single JSON representation. The backend data is represented as a struct defining the API layer:

// Serialized backend.
#[derive(Serialize)]
pub struct Api<'a> {
    clients: &'a Vec<Client>,
    layouts: &'a Vec<Layout>,
    monitors: &'a Vec<Monitor<{ config::WORKSPACE_COUNT }>>,
    workspaces: &'a [&'a str; config::WORKSPACE_COUNT],
    current_client: &'a Option<usize>,
    current_monitor: usize,
}

impl<'a> From<&'a Backend<'a>> for Api<'a> {
    fn from(backend: &'a Backend<'a>) -> Self {
        Self {
            clients: &backend.clients,
            layouts: &backend.layouts,
            monitors: &backend.monitors,
            workspaces: &config::WORKSPACES,
            current_client: &backend.current_client,
            current_monitor: backend.current_monitor,
        }
    }
}

Listing 14: Backend serialization.

The Api struct derives Serialize and is thus able to be serialized to JSON by the socket manager:

pub async fn write(&mut self, backend: &Backend<'_>) -> CritResult<()> {
    if self.listener.is_some() {
        let api = Api::from(backend);
        let mut json = serde_json::to_string(&api)?;
        json.push('\n');
        let mut state = self.state.lock().await;
        if json != state.last_state {
            state.streams.retain(Option::is_some);
            for stream in &mut state.streams {
                if let Some(mut_stream) = stream.as_mut() {
                    if mut_stream.write_all(json.as_bytes()).is_err() {
                        stream.take();
                    }
                }
            }
            state.last_state = json;
        }
    }
    Ok(())
}

Listing 15: Socket writing.
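On the other end of the socket, an external program such as a bar only needs to connect and read newline-delimited JSON. The following is a minimal sketch of such a client using the standard library; read_state is a hypothetical helper name, and parsing the JSON itself is left out.

```rust
use std::io::{self, BufRead, BufReader};
use std::os::unix::net::UnixStream;
use std::path::Path;

// Connects to the state socket and returns the first state update.
// Each update is a single JSON document terminated by '\n'.
pub fn read_state<P: AsRef<Path>>(socket_path: P) -> io::Result<String> {
    let stream = UnixStream::connect(socket_path)?;
    let mut reader = BufReader::new(stream);
    let mut line = String::new();
    // Blocks until a complete line (or end of stream) arrives.
    reader.read_line(&mut line)?;
    // Strip the trailing newline that delimits updates.
    Ok(line.trim_end().to_owned())
}
```

A bar would typically call something like read_state("/tmp/critwm_state.sock") once, or keep the BufReader around and call read_line in a loop to receive every subsequent state change as it is pushed.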

Next Steps

Although the IPC system allows any external bar program to be used with critwm, a native bar would greatly simplify setting up the window manager. A native bar could be drawn and rendered as part of the critwm binary through Xlib functions, avoiding the dependency on IPC sockets altogether.

Features such as the ability to focus a window as the primary window, tile resizing, and custom rules are yet to be implemented. There is also more room to implement additional layouts on top of the existing tiling and floating layouts. This is simply a matter of implementing a function of type LayoutFunc and adding it to the configuration file.

Since critwm isn’t a very usable window manager, here are some alternatives: