39

Modernizing swapping: virtual swap spaces

The swap/memory situation in linux has surprised me quite a bit coming from Windows.

Windows remains mostly fully responsive even when memory is being pushed to the limits and swapping gigabytes per second, while on linux when I ran a stress test that ate all the memory I had trouble even terminating the script

2 days agoNumerlor

There's two things that cause this. First, Windows has a variable swap file size, whereas Linux has a fixed size, so Windows can just fill up your drive, instead of running out of swap space. Second, the default behavior for the out-of-memory killer in Linux isn't very aggressive, with the default behavior being to over-commit memory instead of killing processes.

As far as I know, Linux still doesn't support a variable-sized swap file, but it is possible to change how aggressively it over-commits memory or kills processes to free memory.

As to why there differences are there, they're more historical than technical. My best guess is that Windows figured it out sooner, because it has always existed in an environment where multiple programs are memory hogs, whereas it wasn't common in Linux until the proliferation of web-based everything requiring hundreds of megabytes to gigabytes of memory for each process running in a Chrome tab or Electron instance, even if it's something as simple as a news article or chat client.

Check out this series of blog posts. for more information on Linux memory management: https://dev.to/fritshooglandyugabyte/series/16577

6 hours agodlcarrier

Windows "figured it out sooner" because it never really had to seriously deal with overcommitting memory: there is no fork(), so the memory usage figures of the processes are accurate. On Linux, however, the un-negotiable existence of fork() really leaves one with no truly good solution (and this has been debated for decades).

3 hours agoJoker_vD

fork is a massive feature, not a bug.

an hour agogoodpoint

Windows will also prioritise to keep the desktop and current focussed application running smoothly, the Linux kernel has no idea what's currently focused or what not to kill, your desktop shell is up there on the menu in oom situations.

3 hours agoChocolateGod

The same behavior exists as far back as NT4 Server, which does not provide a foreground priority boost by default.

32 minutes agop_ing

There are daemons (not installed by default) that monitor memory usage and can increase swap size or kill processes accordingly (you can ofc also configure OOM killer).

3 hours agokalaksi

> As far as I know, Linux still doesn't support a variable-sized swap file...

You can add (and remove) additional swapfiles during runtime, or rather on demand. I'm just unaware of any mechanism doing that automagically, though.

Could probably done in eBPF and some shell scripts, I guess?

5 hours agoLargoLasskhyfv

Huh? What does swap area size have to do with responsiveness under load? Linux has a long history of being unusable under memory pressure. systemd-oomd helps a little bit (killing processes before direct reclaim makes everything seize up), but there's still no general solution. The relevance to history is that Windows got is basically right ever and Linux never did.

Nothing to do with overcommit either. Why would that make a difference either? We're talking about interactivity under load. How we got to the loaded state doesn't matter.

3 hours agoquotemstr

In Linux the default swap behaviour is to also swap out the memory mapped to the executable file, not just memory allocated by the process. This is a relatively sane approach on servers, but not so much on desktops. I believe both Windows and macOS don't swap out code pages, so the applications remain responsive, at the of (potentially) lower swap efficiency

4 hours agonasretdinov

Don’t know about Macs, but on Windows executable code is treated like a readonly memory mapped file that can be loaded and restored as the kernel sees fit. It could also be shared between processes, though that is not happening that much anymore due to ASLR.

an hour agosuperjan

Oh yeah. Bug 12309 was reported now what, 20 years ago? It’s fair to say that at this point arrival of GNU Mach will happen sooner than Linux will be able to properly work under memory pressure.

2 hours agodemosito666

I've had that same experience. On new systems I install earlyoom. I'd rather have one app die than the whole system.

You'd think after 30 years of GUIs and multi-tasking, we'd have this figured out, but then again we don't even have a good GUI framework.

7 hours ago01HNNWZ0MV43FF

I used to use it but it's too aggressive. It kills stuff too quickly.

4 hours agoLtWorf

Like Linux / open source often, it depends on what you do with it!

The kernel is very slow to kill stuff. Very very very very slow. It will try and try and try to prevent having to kill anything. It will be absolutely certain it can reclaim nothing more, and it will be at an absolute crawl trying to make every little kilobyte it can free, swapping like mad to try options to free stuff.

But there are a number of daemons you can use if you want to be more proactive! Systemd now has systemd-oomd. It's pretty good! There's others, with other strategies for what to kill first, based on other indicators!

The flexibility is a feature, not a bug. What distro are you on? I'm kind of surprised it didn't ship with something on?

a day agojauntywundrkind

The annoying thing I've found with Linux under memory stress (and still haven't found a nice way to solve) is I want it to always always always kill firefox first. Instead it tends to either kill nothing (causing the system to hang) or kill some vital service.

5 hours agorwmj

Linux being... Linux, it's not easy to use, but it can do what you want.

1. Use `choom` to give your Firefox PIDs a score of +1000, so they always get reaped first

2. Use systemd to create a Control Group to limit firefox and reap it first (https://dev.to/msugakov/taking-firefox-memory-usage-under-co...)

3. Enable vm.oom_kill_allocating_task to kill the task that asked for too much memory

4. Nuclear option: change how all overcommiting works (https://www.kernel.org/doc/html/v5.1/vm/overcommit-accountin...)

4 hours ago0xbadcafebee

I'm not sure that I'd want the OS to kill my browser while I'm working within it.

Of course the browser is the largest process in my system, so when I notice that memory is running low I restart it and I gain some 15 GB.

Basically I am the memory manager of my system and I've been able to run my 32 GB Linux laptop with no swap since 2014. I read that a system with no swap is suboptimal but the only tradeoff I notice is that manual OOM vs less writes on my SSD. I'm happy with it.

5 hours agopmontra

There are two pillars to managing RAM with virtual memory: the obvious one is is writing one program's working set to disk, so that another program can use that memory. The other one - which isn't prevented by disabling swap - is flushing parts of a program which were loaded from disk, and reloading them from disk when next needed.

That second pillar is actually worse for interactivity than swapping the working set, which is why disabling swap entirely isn't considered optimal.

By far the best approach is just to have an absurd amount of RAM - which of course is a much less accessible option now than it was a year ago.

4 hours agorobinsonb5

You can bump /proc/$firefox_pid/oom_score_adj to make it likely target. The easiest way is to make wrapper script that bumps the score and then starts firefox. All children will inherit the score.

4 hours agodelamon

If using systemd-oomd, you can launch Firefox into it's own cgroup / systemd.scope, that has memory pressure control settings set to not kill it. ManagedOOMPreference=avoid.

https://www.freedesktop.org/software/systemd/man/latest/syst...

There's a variety of oom daemons. bustd is very lightweight & new. earlyoom has been around a long time, and has an --avoid flag. https://github.com/rfjakob/earlyoom?tab=readme-ov-file#prefe...

Your concerns are very addressable.

4 hours agojauntywundrkind

Yeah sytemd-oomd seems tuned for server workloads, I couldn't get it to stop killing my session instead of whichever app had eaten the memory.

Honestly on the desktop I'd rather a popup that allowed me to select which app to kill. But the kernel doesn't seem to even be able to prioritize the display server memory.

2 hours agoajb

> Windows remains mostly fully responsive even when memory is being pushed to the limits and swapping gigabytes per second

In my experience this is only on later versions of the NT Kernel and only on NVME (mostly the latter I think).

5 hours agoOnavo

Yeah I think SSD / NVME makes all the difference here - I certainly remember XP / Vista / Win 7 boxes that became unusable and more-or-less unrecoverable (just like Linux) once a swap storm starts.

4 hours agorobinsonb5

NT4 exhibits the same behavior under extreme load.

2 hours agop_ing

Is there a reason as to why Linux has not adopted straight up page compression like Windows and macOS?

3 hours agoChocolateGod

It has? Many distros have it enabled by default, too. The article discusses it quite a bit.

an hour agocreatonez

It did, there's two incompatible approaches, zram and zswap.

an hour agoGrayShade

OOM killers serve a purpose but - for a desktop OS - they're missing the point.

In a sane world the user would decide which application to shut down, not the OS; the user would click the appropriate application's close gadget, and the user interface would remain responsive enough for that to happen in a matter of seconds rather than minutes.

I understand the many reasons why that's not possible, but it's a huge failing of Linux as a desktop OS, and OOM killers are picking around the edges of the problem, not addressing it head-on.

(Which isn't to say, of course, that OOM killers aren't the right approach in a server context.)

4 hours agorobinsonb5

OpenBSD and the rest have a limits file where you can set RAM limits per user and sometimes per process, so is not a big issue.

On GNU/Linux and the rest not supporting dynamic swap files, you can swap into anything ensembling a file, even into virtual disk images.

Also set up ZRAM as soon as possible. 1/3 of the physical RAM for ZRAM it's perfect, it will almost double your effective RAM size with ease.

4 hours agoanthk

You can do this with cgroups but you aren't allowed to use cgroups if you use systemd, because it messes up systemd.

an hour agogzread

I think zswap is the better option because it's not a fixed RAM storage, it merely compresses pages in RAM up to a variable limit and then writes to swap space when needed, which is more efficient.

It worked very well with my preceding laptop limited to 4GB of RAM.