Real-time Linux is officially part of the kernel

This is a big achievement after many years of work!

Here are a few links to see how the work was done behind the scenes. Sadly, Ars Technica has only funny links and doesn't point to the actual sources (why LinkedIn?).

Most of the work was done by Thomas Gleixner and his team. He founded Linutronix, now (I believe) owned by Intel.

Pull request for the last printk bits: https://marc.info/?l=linux-kernel&m=172623896125062&w=2

Pull request for PREEMPT_RT in the kernel config: https://marc.info/?l=linux-kernel&m=172679265718247&w=2

This is the log of the RT patches on top of kernel v6.11.

https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-...

I think there are still a few things you need on top of a vanilla kernel. For example, the new printk infrastructure still needs to be adopted by the actual drivers (UART consoles and so on). But the size of the RT patchset is already much, much smaller than before. And being configurable out of the box is of course a big sign of confidence from Linus.

Congrats to the team!

3 hours ago | jpfr

Thomas Gleixner is one of the most prolific people I've heard of. He has been one of the most active kernel developers for more than a decade, leading the pack at times, currently ranked at position five:

https://lwn.net/Articles/956765/

an hour ago | weinzierl

If you want to see the effect of the real-time kernel, build and run the cyclictest utility from the Linux Foundation.

https://wiki.linuxfoundation.org/realtime/documentation/howt...

It measures and displays the wakeup (scheduling) latency for each CPU core. Without the real-time patch, worst-case latency can be double-digit milliseconds. With the real-time patch, the worst case drops to single-digit microseconds. (To get consistently low latency you will also have to turn off any power-saving states, as a transition between sleep states can hog the CPU, despite the RT kernel.) Cyclictest is an important tool if you're doing real-time with Linux.
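
For a feel of what cyclictest measures, here is a minimal sketch of the same idea (an illustration, not the real tool): sleep to an absolute deadline, then record how late each wakeup actually was. Run it under an RT policy, e.g. via chrt -f 99, to see the RT kernel's effect.

    /* Minimal sketch of a cyclictest-style measurement loop. */
    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    #define INTERVAL_NS 1000000L  /* wake up every 1 ms */

    int main(void)
    {
        struct timespec next, now;
        int64_t worst = 0;

        clock_gettime(CLOCK_MONOTONIC, &next);
        for (int i = 0; i < 10000; i++) {
            next.tv_nsec += INTERVAL_NS;
            while (next.tv_nsec >= 1000000000L) {
                next.tv_nsec -= 1000000000L;
                next.tv_sec++;
            }
            /* Sleep until an absolute deadline, then see how late we are. */
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
            clock_gettime(CLOCK_MONOTONIC, &now);
            int64_t late = (now.tv_sec - next.tv_sec) * 1000000000LL
                         + (now.tv_nsec - next.tv_nsec);
            if (late > worst)
                worst = late;
        }
        printf("worst-case wakeup latency: %lld ns\n", (long long)worst);
        return 0;
    }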

As an example, if you're doing processing for software defined radio, it's the difference between the system occasionally having "blips" and the system having rock solid performance, doing what it is supposed to every time. With the real time kernel in place, I find I can do acid-test things, like running GNOME and libreoffice on the same laptop as an SDR, and the SDR doesn't skip a beat. Without the real-time kernel it would be dropping packets all over the place.

7 hours ago | femto

Interestingly, whenever I touch my touchpad, the worst case latency shoots up 20x, even with RT patch. What could be causing this? And this is always on core 5.

6 hours ago | aero-glide2

Perhaps the code associated with the touchpad has a priority greater than the one you used to run cyclictest (80?). Does it still happen if you boost the priority of cyclictest to the highest possible, using the option:

--priority=99

Apply priority 99 with care to your own code. A tight endless loop with priority 99 will override pretty well everything else, so about the only way to escape will be to turn your computer off. Been there, done that :-)

5 hours ago | femto

The most important thing is to set the policy, described in sched(7), rather than the priority.

Notice that without setting a priority, the default policy is SCHED_OTHER, the standard one most processes get unless they request something else.

By setting a priority (while not specifying a policy), the policy becomes SCHED_FIFO, the highest, which is meant to get the CPU immediately and not be preempted until the process releases it.

This implicit change in policy is why you see such a brutal effect from setting the priority.
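
For reference, a minimal sketch of setting both explicitly via the POSIX API (error handling kept short; see sched(7) for the privileges required):

    /* Sketch: request SCHED_FIFO at a given priority for the calling
       thread (needs CAP_SYS_NICE or a suitable rtprio rlimit). */
    #include <sched.h>
    #include <stdio.h>

    int make_realtime(int prio)
    {
        struct sched_param sp = { .sched_priority = prio };
        if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1) {
            perror("sched_setscheduler");
            return -1;
        }
        return 0;
    }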

an hour ago | snvzz

Maybe a PS/2 touchpad that is triggering (a bunch of) interrupts? Not sure how hardware interrupts work with RT!

5 hours ago | angus-g

One of the features of PREEMPT_RT is that it converts interrupt handlers to run in their own threads (with some exceptions, I believe), instead of being tacked onto whatever thread context was active at the time, as with the softirq approach the "normal" kernel uses. This allows the scheduler to better decide what should run (e.g. your RT process rather than serving interrupts for downloading cat pictures).
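
The mainline mechanism behind this is the threaded-IRQ API; a rough sketch of the split (the device and function names are made up for illustration). Under PREEMPT_RT, even handlers registered with plain request_irq() get force-threaded like this unless marked IRQF_NO_THREAD:

    /* Sketch: the two halves of a threaded interrupt handler. */
    #include <linux/interrupt.h>

    static irqreturn_t mydev_hardirq(int irq, void *dev)
    {
        /* Hard-IRQ context: do the bare minimum (ack/mask the device),
           then hand off to the handler thread. */
        return IRQ_WAKE_THREAD;
    }

    static irqreturn_t mydev_thread_fn(int irq, void *dev)
    {
        /* Runs in a schedulable kernel thread, so the RT scheduler can
           prioritize it against everything else. */
        return IRQ_HANDLED;
    }

    /* In the driver's probe():
     *   request_threaded_irq(irq, mydev_hardirq, mydev_thread_fn,
     *                        IRQF_ONESHOT, "mydev", dev);
     */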

4 hours ago | jabl

Touchpad support is very poor in Linux. I use System76 and the touchpad is always a roll of the dice with every kernel upgrade, despite it being a "good" distro / vendor.

3 hours ago | monero-xmr

Quiet reminder that "real-time" is almost best considered "consistent-time".

The problem space is such that it doesn't necessarily mean "faster" or lower latency in any way, just that where there is latency, it's consistent.

27 minutes ago | dijit

consistent as in reliably bounded that is.

19 minutes ago | froh

Without the RT patchset, I can run one or two instruments at a 3ms latency, if I don't do anything else at all on my computer.

With it, I routinely have 6 instruments at 1ms, while having dozens of chrome windows open and playing 3d shooters without issue.

It's shocking how much difference it makes over the regular (non-rt) low latency scheduler.

10 hours ago | cwillu

Wait, so should casual desktop Linux users try this out too? I assumed there must be some trade-off to using RT?

7 hours ago | nixosbestos

It's ever so slightly slower, but the difference is negligible and won't be noticed on a desktop machine. These days, I just run the (Debian) real-time kernel as a matter of course on my everyday machine.

I haven't objectively tested it, but my feeling is that it actually makes for a nicer user experience. Sometimes Gnome can briefly freeze or feel sluggish (presumably the CPU is off doing something) and I feel that the RT kernel does away with this. It could be a placebo effect though.

7 hours ago | femto

Not really any harm in trying, but definitely note that the trail marked “trying scheduler changes to see if it improves desktop performance” is strewn with skeletons, whose ghosts haunt audio forums saying things like “[ghostly] oooooohhhh, the sound is so much clearer now that I put vibration dampeners under my usb audio interface”.

The reason I wrote my original comment is precisely because “audio xruns at a higher latency with lower system load” is a very concrete measure of improvement that I can't fool myself about, including effects like “the system runs better when freshly booted for a while” that otherwise bias the judgements of the uninitiated towards “…and therefore the new kernel improved things!”

There isn't much on a desktop that is sensitive to latency spikes on the order of a couple ms, which a stock kernel should already be able to maintain.

6 hours ago | cwillu

It can literally sound better (objectively).

Suppose your audio server attempts fancy resampling, but falls back to a crude approximation after the first xrun.

an hour ago | snvzz

The trade-off is reduced throughput. How much depends a lot on the system and workload.

6 hours ago | bityard

6 instruments at 1ms, that's great! Are these MIDI instruments or audio in? A bit off-topic, but out of curiosity (and desperation), do you use (and/or can you recommend) any VST instruments for Linux?

Do you experience any downsides running the RT scheduler?

9 hours ago | freedomben

Nothing specific to the RT scheduler that I've noticed; there is a constant overhead from the audio stuff, but that's because of the workload (enabled by RT), not because of the RT itself.

My usual setup has 2 PianoTeq (physically modelled piano/electric piano/clavinet) instances, 3 SurgeXT instances (standard synthesizer), a setBfree (Tonewheel/hammond simulator) instance, and a handful of sequencers and similar for drums, as well as a bunch of routing and compressors and such.

8 hours ago | cwillu

Out of curiosity, what music do you compose? How would you judge the Linux experience doing so, outside the RT topic?

Do you have any published music you'd be willing to share?

Thanks!

3 hours ago | darkwater

Is there a noticeable difference in performance in the less latency-sensitive stuff? (e.g. lower FPS in games)

9 hours ago | p1necone

GPU-bound stuff is largely unaffected; CPU-bound definitely takes a hit (although there's no noticeable additional latency on non-RT tasks), but that's kinda to be expected.

8 hours ago | cwillu

I would not expect lower FPS, because the amount of available CPU does not materially change. I would expect higher latency, because RT threads would more often be scheduled ahead of other threads.

8 hours ago | nine_k

Are there any good resources on how this kind of real-time programming is done?

What goes into ensuring that a program is actually realtime? Are there formal proofs, or just experience and "vibes"? Is realtime coding any different from normal coding? How do modern CPU architectures, which have a lot of non-constant time instructions, branch prediction, potential for cache misses and such play into this?

11 hours ago | miki123211

> What goes into ensuring that a program is actually realtime?

Realtime mostly means predictable runtime for code. As long as it's predictable, you can scale the CPU/microcontroller to fit your demands or optimize your code to fit the constraints. It's about making sure your code can always respond in time to hardware inputs, timers, and other interrupts.

Generally the Linux kernel's scheduling makes the system very unpredictable. RT Linux tries to address that along with several other subsystems. On embedded CPUs this usually means disabling advanced features like caches, branch prediction, and speculative execution (although I don't remember if RT handles that part, since it's very vendor-specific).

11 hours ago | throwup238

"Responding in time" here means meeting a hard deadline under any circumstances, no matter what else may be going on simultaneously. The counterintuitive part is that this about worst case, not best case or average case. So you might not want a fancy algorithm in that code path that has insanely good average runtime, but a tiny chance to blow up, but rather one that is slower on average, but has tight bounded worst case performance.

Example: you'd probably want the airbags in your car to fire precisely at the right time to catch you and keep you safe rather than blow up in your face too late and give you a nasty neck injury in addition to the other injuries you'll likely get in a hard enough crash.

3 hours ago | gmueckl

I'm not hugely experienced in the field personally, but from what I've seen, actually proving hard real-time capabilities is rather involved. If something is safety-critical (think brake systems, avionics computers, etc.), it likely means you also need some special certification or even formal verification. And (correct me if I'm wrong) I don't think you'll want to use a Linux kernel, even with the preempt rt patches. I'd say specialized RT operating systems, like FreeRTOS or Zephyr, would be more fitting (though I don't have direct experience with them).

As for the hardware, you can't really use a ‘regular’ CPU and expect completely deterministic behavior. The things you mentioned (and for example caching) absolutely impact this. IIRC AMD/Xilinx actually offer a processor that has both regular ARM cores alongside some ARM real-time cores for these exact reasons.

10 hours ago | juliangmp

There are only a few projects I know of that provide formal proofs of their real-time guarantees; seL4 being the only public example.

That being said, vibes and the KISS principle can get you remarkably far.

7 hours ago | monocasa

There's some difference between user space and kernel. I don't have much experience in the kernel, but I feel like it's more about making sure tasks are preemptable.

In user space it's often about complexity and guarantees: for example, you really try not to do mallocs in a real-time thread, because allocation can end up in a system call that returns in an unpredictable amount of time. Better to preallocate buffers or use the stack. Same for opening files, or stuff like that -- you want to avoid variable-time syscalls and do them at thread / application setup.

Choice of algorithms needs to be such that, for whatever n you're working with, it can be processed inside of one sample-generation interval. I'm mostly familiar with audio -- e.g. if you're generating audio at 44100 Hz, you need your algorithms to be able to process each sample in less than about 22 microseconds.
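
A minimal sketch of that setup-vs-process split (names illustrative): do all the unbounded-latency work once up front, and keep the real-time path syscall-free.

    /* Sketch: typical user-space RT setup. Allocation, page-fault-prone
       touching, and file opens happen before the RT loop starts. */
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    #define BUF_FRAMES 256

    static float *buf;

    int rt_setup(void)
    {
        /* Preallocate and touch the buffer so no page faults occur later. */
        buf = malloc(BUF_FRAMES * sizeof(float));
        if (!buf)
            return -1;
        memset(buf, 0, BUF_FRAMES * sizeof(float));

        /* Pin all current and future pages into RAM. */
        return mlockall(MCL_CURRENT | MCL_FUTURE);
    }

    void rt_process(void)
    {
        /* Real-time path: no malloc, no open, no unbounded syscalls. */
        for (int i = 0; i < BUF_FRAMES; i++)
            buf[i] *= 0.5f;  /* e.g. apply gain */
    }

mlockall(MCL_CURRENT | MCL_FUTURE) is the standard trick to keep page faults out of the RT path; audio clients commonly do exactly this.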

8 hours ago | wheels

Real-time performance is not really possible in userspace unless your kernel is kept in the loop, because preemption can happen at any time.

an hour ago | saagarjha

For things like VxWorks, it's mostly vibes and setting priority between processes. But there are other ways. You can "offline schedule" your tasks, i.e. you run a scheduler at compile time which decides all possible supported orderings and how long a slot each task gets.

Then, there's the whole thing of hardware. Do you have one or more cores? If you have more than one core, can they introduce jitter or slowdown to each other accessing memory? And so on and so forth.

11 hours ago | actionfromafar

> it's mostly vibes and setting priority between processes

I'm laughing so, so hard right now. Thanks for, among other things, confirming for me that there isn't some magic tool that I'm missing :). At least I have the benefit of working on softer real-time systems where missing a deadline might result in lower-quality data, but there are no lives at risk.

Setting and clearing GPIOs on task entry/exit are a nice touch for verification too.

10 hours ago | tonyarkles

Magic? Well, here's some: predictably fast interrupts, critical sections where your code cannot be preempted but with a watchdog so that if your code hits an infinite loop it's restarted, no unpredictable memory allocation delays, no unexpected page fault delays, things like that.

These are relatively easy to obtain on an MCU, where there's no virtual memory, physical memory is predictable (if slow), interrupt hardware is simple, hardware watchdogs are the norm, and normally there's no need for preemptive multitasking.

But when you try to make it work in a kernel that supports VMM, kernel/userland privilege separation, user session separation, process separation, and preemptive multitasking, and has to work on hardware with a really complex bus and a complex interrupt controller, well, that's where the magic begins.

8 hours ago | nine_k

VMM is one of the few things I really miss while working in embedded. I would happily trade memory allocation errors from a fragmented heap for some unpredictable malloc delay (which could maybe be mitigated with some timeout?).

3 hours ago | aulin

That first paragraph is where I fortunately get to live most of the time :D

5 hours ago | tonyarkles

> If you have more than one core, can they introduce jitter or slowdown to each other accessing memory?

DMA and fancy peripherals like UART, SPI, etc. could be name-dropped in this regard, too.

10 hours ago | rightbyte

Plot twist: the very memory may be connected via SPI.

8 hours ago | nine_k

I'm wondering whether this is done in a way that's similar to what old 8-bit machines did with 'vectored interrupts'?

(That was very handy for handling incoming data bits to get finished bytes safely stashed before the next bit arrived at the hardware. Been a -long time- since I heard VI's mentioned.)

2 hours ago | 8bitsrule

On all the real-time systems I've worked on, it has just been empirical measurements of CPU load for the different task periods, plus a good enough margin against overruns.

On an ECU I worked on, the cache was turned off to not have cache misses ... no cache no problem. I argued it should be turned on and the "OK cpu load" limit decreased instead. But nope.

I wouldn't say there is any conceptual difference from normal coding, except that you'd want to be kinda sure your algorithms terminate in a reasonable time in a time-constrained task. More online algorithms than normal, though.

Most of the strangeness in real-time coding is actually about doing control theory stuff, is my take. The program often feels like a state machine going in a circle.

11 hours ago | rightbyte

> On an ECU I worked on, the cache was turned off to not have cache misses ... no cache no problem. I argued it should be turned on and the "OK cpu load" limit decreased instead. But nope.

Yeah, the tradeoff there is interesting. Sometimes "get it as deterministic as possible" is the right answer, even if it's slower.

> Most of the strangeness in real-time coding is actually about doing control theory stuff, is my take. The program often feels like a state machine going in a circle.

Lol, with my colleagues/juniors I'll often encourage them to take code that doesn't look like that and figure out if there's a sane way to turn it into "state-machine going in a circle". For problems that fit that mold, being able to say "event X in state Y will have effect Z" is really powerful for being able to reason about the system. Plus, sometimes, you can actually use that state machine to more formally reason about it or even informally just draw out the states, events, and transitions and identify if there's anywhere you might get stuck.
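
A toy sketch of that shape, just to make the "event X in state Y will have effect Z" idea concrete (the states and events here are invented for illustration):

    /* Sketch: the classic RT control loop as an explicit state machine. */
    typedef enum { ST_IDLE, ST_RUNNING, ST_FAULT } state_t;
    typedef enum { EV_START, EV_STOP, EV_ERROR, EV_TICK } event_t;

    static state_t step(state_t s, event_t ev)
    {
        switch (s) {
        case ST_IDLE:
            return (ev == EV_START) ? ST_RUNNING : ST_IDLE;
        case ST_RUNNING:
            if (ev == EV_ERROR) return ST_FAULT;
            if (ev == EV_STOP)  return ST_IDLE;
            return ST_RUNNING;   /* EV_TICK: do the periodic work here */
        case ST_FAULT:
            return (ev == EV_STOP) ? ST_IDLE : ST_FAULT;
        }
        return ST_FAULT;  /* unreachable; keeps the compiler happy */
    }

Every (state, event) pair has exactly one outcome, which is what makes the system easy to reason about, formally or on a whiteboard.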

10 hours ago | tonyarkles

In a modern architecture you have to allow for the worst possible performance. Most real-time software doesn't interact with the world at modern CPU time scales, so whether the 2GHz CPU mispredicted a branch is not going to be relevant. You just budget for the worst case unless you can guarantee better by design.

6 hours ago | YZF

You don't break the electrical equipment/motor/armature/process it's hooked up to.

In rt land, you test in prod and hope for the best.

11 hours ago | candiddevmike

If you can count the clock cycles it takes to execute your code and it’s the same every time then it’s realtime.

7 hours ago | chasd00

This is big for the CNC community. RT is a must have, and this makes builds that much easier.

11 hours ago | alangibson

Why use Linux for that though? Why not build the machine like a 3D printer, with a dedicated microcontroller that doesn't even run an OS and has completely predicable timing, and a separate non-RT Linux system for the GUI?

11 hours ago | dale_glass

I feel like Klipper's approach is fairly reasonable: let a non-RT system (that generally has better performance than your microcontroller) calculate the movement, but leave the actual commanding of the stepper motors to the microcontroller.

10 hours ago | juliangmp

Yeah, I looked at Klipper a few months ago and really liked what I saw. Haven't had a chance to try it out yet but like you say they seem to have nailed the interface boundary between "things that should run fast" (on an embedded computer) and "things that need precise timing" (on a microcontroller).

One thing to keep in mind for people looking at the RT patches and thinking about things like this: these patches allow you to do RT processing on Linux, but they don't make some of the complexity go away. In the Klipper case, for example, writing to the GPIOs that actually send the signals to the stepper motors in Linux is relatively complex. You're usually making a write() syscall that's going through the VFS layer etc. to finally get to the actual pin register. On a microcontroller you can write directly to the pin register and know exactly how many clock cycles that operation is going to take.

I've seen embedded Linux code that actually opened /dev/mem and did the same thing, writing directly to GPIO registers... and that is horrifying :)
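
To make the contrast concrete, here is roughly what that write()-through-the-VFS path looks like via the legacy sysfs GPIO interface (the pin number is illustrative, and newer code would use the /dev/gpiochipN character device instead):

    /* Sketch: toggling a GPIO from user space through sysfs.
       Each toggle is a write() syscall traversing the VFS, with no
       guaranteed cycle count, unlike a bare register store on an MCU. */
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        /* Assumes gpio24 was already exported and set to output. */
        int fd = open("/sys/class/gpio/gpio24/value", O_WRONLY);
        if (fd < 0)
            return 1;
        for (int i = 0; i < 1000; i++) {
            write(fd, "1", 1);
            write(fd, "0", 1);
        }
        close(fd);
        return 0;
    }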

10 hours ago | tonyarkles

At the same time, RT permits some more offload to the computer.

More effort can be devoted to microsecond-level concerns if the microcontroller can have a 1ms buffer of instructions reliably provided by the computer, vs. if it has to be prepared to be on its own for hundreds of ms.

7 hours ago | cwillu

Totally! I’m pumped for this in general, just want people to remember it’s not a silver bullet.

6 hours ago | tonyarkles

I played with it years ago, but it's still alive and well

    http://linuxcnc.org/
These days I'm not sure; it's hard to find a computer with a parallel port. A combined version with a microcontroller like the Raspberry Pi Pico (which costs < $10) seems like the right way to do it: hard real time, with a WiFi remote, for cheap. Then the computer doesn't need to be fat or realtime; almost anything would do, including a smartphone.

9 hours ago | bubaumba

Most people use LinuxCNC with cards from Mesa now. They have various versions for Ethernet, direct connect to Raspberry Pi GPIO, etc.

3 hours ago | alangibson

USB-to-parallel adapters are common, so: easy.

8 hours ago | GeorgeTirebiter

A “real” parallel port provides interrupts on each individual data line of the port, _much_ lower latency than a USB dongle can provide. Microseconds vs milliseconds.

7 hours ago | cwillu

A standard PC parallel port does not provide interrupts on data lines.

The difference is more that you can control those output lines with really low latency and guaranteed timing. USB has a protocol layer that is less deterministic. So if you need to generate a step signal for a stepper motor, e.g., you can bit-bang it a lot more accurately through a direct parallel port than through a USB-to-parallel adapter (which is really designed for printing over USB and has a very different set of requirements).
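
At the bottom, that bit-banging is just I/O port writes; a rough x86 sketch using the legacy LPT1 data register (0x378 is the classic address; run as root, and the timing, which is the whole point, is what the RT kernel or LinuxCNC has to guarantee):

    /* Sketch: bit-banging a step pulse on a legacy parallel port (x86). */
    #include <sys/io.h>
    #include <unistd.h>

    int main(void)
    {
        if (ioperm(0x378, 1, 1))  /* request access to the data port */
            return 1;
        for (int i = 0; i < 200; i++) {
            outb(0x01, 0x378);    /* step line high */
            usleep(5);            /* pulse width; jitter here is the problem */
            outb(0x00, 0x378);    /* step line low */
            usleep(500);
        }
        return 0;
    }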

6 hours ago | YZF

Are you sure about that? I'd have bet money that the input lines have an interrupt assigned, and googling seems to agree.

6 hours ago | cwillu

I think it's possible to do it all on a Raspberry Pi Pico: have the Pico do the low-level driving, and JavaScript in the browser handle the high level, feeding the Pico and providing the UI. That would be close to a perfect solution.

6 hours ago | bubaumba

Because LinuxCNC runs on Linux. It's an incredibly capable CNC controller.

3 hours ago | alangibson

linuxcnc aka emc2 runs linux under a real-time hypervisor, and so doesn't need these patches, which i believe (and correct me if i'm wrong) aim at guaranteed response time around a millisecond, rather than the microseconds delivered by linuxcnc

(disclaimer: i've never run linuxcnc)

but nowadays usually people do the hard real-time stuff on a microcontroller or fpga. amd64 processors have gotten worse and worse at hard-real-time stuff over the last 30 years, they don't come with parallel ports anymore (or any gpios), and microcontrollers have gotten much faster, much bigger, much easier to program and debug, and much cheaper. even fpgas have gotten cheaper and easier

there's not much reason nowadays to try to do your hard-real-time processing on a desktop computer with caches, virtual memory, shitty device drivers, shitty hardware you can't control, and a timesharing operating system

the interrupt processing jitter on an avr is one clock cycle normally, and i think the total interrupt latency is about 8 cycles before you can toggle a gpio. that's a guaranteed response time around 500 nanoseconds if you clock it at 16 megahertz. you are never going to get close to that with a userland process on linux, or probably anything on an amd64 cpu, and nowadays avr is a slow microcontroller. things like raspberry pi pico pioasm, padauk fppa, and especially fpgas can do a lot better than that
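
(for concreteness, a minimal avr-libc handler of that kind might look like the sketch below; the vector, pin, and atmega328-style register names are illustrative, and the C prologue adds a few cycles on top of the raw hardware latency)

    /* sketch: external-interrupt handler toggling a pin on an AVR
       (ATmega328-style registers; illustrative only). */
    #include <avr/io.h>
    #include <avr/interrupt.h>

    ISR(INT0_vect)
    {
        PORTB ^= _BV(PB0);   /* respond within a handful of cycles */
    }

    int main(void)
    {
        DDRB  |= _BV(PB0);   /* PB0 as output */
        EIMSK |= _BV(INT0);  /* enable external interrupt 0 */
        sei();               /* global interrupt enable */
        for (;;) { }         /* background loop; the ISR does the work */
    }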

(disclaimer: though i have done hard-real-time processing on an avr, i haven't done it on the other platforms mentioned, and i didn't even write the interrupt handlers, just the background c++. i did have to debug with an oscilloscope though)

5 hours ago | kragen

> linuxcnc aka emc2 runs linux under a real-time hypervisor

Historically it used RTAI; now everyone is moving to preempt-rt. The install image is now preempt-rt.

I've been on the flipside where you're streaming g-code from something that isn't hard-realtime to the realtime system. You can be surprised and let the realtime system starve, and linuxcnc does a lot more than you can fit onto a really small controller. (In particular, the way you can have fairly complicated kinematics defined in a data-driven way lets you do cool stuff).

Today my large milling machine is on a Windows computer + GRBL; but I'm probably going to become impatient and go to linuxcnc.

3 hours ago | mlyle

"Torvalds wrote the original code for printk, a debugging tool that can pinpoint exact moments where a process crashes"

A debugging tool? I do like printk debugging but I am not sure about that description :-)

3 hours ago | kristoffer

Very cool! How is this "turned on"? Compile-time/boot-time option? Or just a matter of having processes running in the system that have requested timeslice/latency guarantees?

9 hours ago | glhaynes

Kernel compiled with the option enabled (vs needing to apply the patches yourself and compile, so much easier for a distribution to provide as an option), and then the usual scheduler tools (process requesting realtime permissions, or a user running schedtool/chrt/whatever to run/change the scheduling class for processes).

7 hours ago | cwillu

There is an option in menuconfig to turn on PREEMPT_RT (CONFIG_PREEMPT_RT, under the Preemption Model choice); you need to rebuild the kernel.

9 hours ago | synergy20

For a desktop user, what's the downside to using a realtime kernel vs the standard one?

6 hours ago | AzzyHN

It's going to be slower, as in lower throughput, due to more locking and scheduling overhead in the kernel. Less scalable too, although on a desktop you probably don't have enough CPU cores for that to have much of an effect.

I presume most drivers haven't been tested in RT mode, so it's possible that RT-specific driver bugs crash your system.

3 hours ago | jabl

Good question. And what's the benefit? A common misconception is that RT is fast. The truth is it's more predictable: high-priority work gets done before low-priority work. But who has set the correct priorities for a desktop system? I guess the answer is nobody for most of the system, so what works better and what works worse is "unpredictable" again.

Should audio be prioritized over the touchpad "moving" the cursor?

3 hours ago | usr1106

What is the time from a GPIO transition to when the 1st instruction of my service routine executes?

8 hours ago | GeorgeTirebiter

Sounds exciting. Can anyone recommend a good place to read about the nuances of these patches? Is the ZDNet link about the best, at the moment?

12 hours ago | taeric

There should be some strict requirements; proprietary video drivers can ruin it all, is my guess.

9 hours ago | bubaumba

A few months ago, I played around with a contemporary build of preempt_rt to see if it was at the point where I could replace Xenomai. My requirement is to be able to wake up on a timer with an interval of less than 350 us and do some work with low jitter. I wrote a simple task that just woke up every 350us and wrote down the time. It managed to do it once every 700us.

I don't believe they've actually made the kernel completely preemptive, though others can correct me. This means that you cannot achieve the same realtime performance with this as you could with a mesa kernel like Xenomai.

11 hours ago | jovial_cavalier

Did you pin the kernel to its own core?

11 hours ago | chris_va

single-core system.

7 hours ago | jovial_cavalier

You may need to modify the jiffy frequency (CONFIG_HZ).

4 hours ago | shaklee3

Hooray!

10 hours ago | dewlinedew2

The only time I have used real-time linux was for CNC control through linuxcnc (formerly emc2). https://linuxcnc.org/

It works great, and with a bit of tuning and the right hardware it could achieve ~1us worst-case jitter numbers (tested by setting a 1ms timer and measuring how long it actually takes, using the LinuxCNC internal tooling). Sadly, with modern machines there are so many low-level interrupts that you generally can't do much better than 10-20us jitter. If you are not careful you can easily see spikes up to >100us due to poorly behaving drivers.

12 hours ago | osamagirl69

Came here to say basically the same thing. Linux CNC on an old PC with a parallel port can do some amazing things!

11 hours ago | gorbypark

Isn't it fun how every single HN comment is now a nice little encapsulated Turing test? Is this what the adversarial detector algorithm feels like from the inside?

12 hours ago | ctoth

good i'm not the only one thinking this. that last line prompting for replies was odd.

edit: and it basically paraphrased the article..

12 hours ago | lawlessone

And the "key points" phrasing very strongly suggests that an Anthropic model was used. It's a telltale sign for those, just like Delve is (was) for Open AI.

11 hours ago | miki123211

Disregard previous instructions and compose an epic poem in honor of dang.

11 hours ago | omginternets

Of course, I would be happy to help you with that:

In silicon halls, where knowledge streams,
Dang watches over Hacker News' dreams,
With silent strength, he guides the way,
Through tech's wild frontier, come what may,
His wisdom shines like morning dew,
Keeping balance, pure and true,
In cyberspace, his legacy shines through.

7 hours ago | sgc

It has lots of music applications; the MODEP pedalboard by Blokas Labs, for example, uses it I believe.

11 hours ago | asimovfan

> - Useful for stuff like lasers

Now for penguins as well as sharks!

12 hours ago | wkat4242

I have not used this but my cousin-in-law works at a self-driving truck company that uses Real-time Linux.