
O(x)Caml in Space

Well, I might have been the first to put OCaml in space, specifically in low-Earth orbit aboard GHGSat-D in 2016. I designed the payload software as a collection of SystemD services talking over DBus, and it included a CCSDS-to-DBus bridge to talk to the platform (the thing that hosts the payload, controls and steers the satellite). The payload also performed symmetric-key encryption of the resulting data, as per regulations.

I gave a talk about the payload software at the Paris OCaml users group.

The reason for selecting that architecture was that I didn't expect to write the whole payload software by myself, and I assumed that when other developers joined in they would, obviously, not want to use a weird language like OCaml, and so they could write their portion in C/C++/whatever and the system could still work. Of course that didn't happen.

I'd be surprised if the company still uses OCaml, as the standard tendency is to revert to "industry-standard" languages to get industry-standard problems. The whole processing and simulation toolchain was also written in OCaml.

Today there is little reason not to use Rust and it can cover both the processing side and the payload software. But people still insist on using C/C++. I'm OK with that as long as I can invoice them.

EDIT: Found my slides https://lambda-diode.com/static/data/GHGSat_OCaml.pdf

7 hours agorho_soul_kg_m3

Oh, hey Berké,

The GHGSat constellation's payload software is still mostly OCaml, although a limited number of newer from-scratch components are indeed in Rust. It's been working well, and on 16 satellites now, but as you said the main challenge has been training developers in OCaml and I doubt they would write new code in it now.

6 hours agogreenarc

> the main challenge has been training developers in OCaml and I doubt they would write new code in it now

Why do I never hear about these kinds of opportunities? I have done some OCaml, quite a bit of embedded systems, and these days I have to waste the years doing web development.

Where do I have to call to be considered for doing OCaml embedded systems?

4 hours agojavcasas

Right, I always find these kinds of statements about "we can't find talent in <'weird' language X>" a bit confusing because I personally know all kinds of people always desperate to find work in neat-lang be it Haskell, OCaml, whatever... But the opportunities never seem to be there.

And it was only 3-4 years ago (maybe less) that Rust was considered by hiring managers to be in that category, too. Ask me how I know.

I'm going to assume it really means that they can't find people who satisfy some other constraint (location, pay band, "required" degree, experience on some other system or in some industry, etc) and OCaml or whatever.

In any case, LLMs blunt this. Hell, please stop me from opening a tab and starting a new OCaml project right now.

2 hours agocmrdporcupine

Because in general, when they get candidates who could fit the position, those candidates get grilled in meaningless LeetCode interviews, or classical stuff like how many golf balls fit into a plane.

The pool is already small, and gets reduced even further.

an hour agopjmlp

If it ain't broke why fix it?

4 hours agoskyblock500

(author of the post here)

Hey Berké! I remember your talk very well (I was in the room), super interesting and it really got me thinking about this area!

Since then, the more I look into it, the more I see a fit with our MirageOS unikernel work. On the ground, you can paper over security and specialisation by throwing more machines (or money) at the problem. In orbit you cannot, so both the compile-time and the runtime guarantees have to be right!

4 hours agoeriangazag

Besides certification issues, it is a matter of culture.

That is why I say I see Rust's main domains as environments where any form of automated resource management is not possible for technical reasons, or (your point) where it is a waste of time trying to talk people out of their beliefs.

Thanks for the presentation.

2 hours agopjmlp

> Today there is little reason not to use Rust and it can cover both the processing side and the payload software. But people still insist on using C/C++. I'm OK with that as long as I can invoice them.

Any reason _not_ to continue using OCaml besides being less popular?

If popularity/mindshare wasn't an issue, I find the development cycle with OCaml to be nicer in several ways compared to Rust on a platform where stuff like Python is already allowed (I wouldn't call a full-blown Linux system, even with limited memory, "embedded").

4 hours agoffaser5gxlsll

Do you have a link to your talk? I'm also curious if you did any GHG measurements, or it was part of the control stack. We wrote the XenServer stack in OCaml back in 2004, and that made it into orbit in 2017 (I think it did, anyway: https://www.theregister.com/offbeat/2017/05/12/space-upstart...)

7 hours agoavsm

Yes see above.

OCaml was very much part of the GHG measurements. On the satellite it was controlling the cameras, acquiring the images, losslessly compressing them, encrypting them and transferring them to the platform controller using a clunky but mandated CSP-based file transfer protocol. On the ground, OCaml was running almost the entire data processing chain, including spectroscopy, image corrections, retrievals and post-retrieval ad hoc bias corrections, as well as simulations.

I simply used mmap()'d Bigarrays to do parallel processing (back then OCaml wasn't multi-core).
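For readers who haven't seen the pattern, here is a minimal sketch of the general idea, not the actual GHGSat code: map a file as a shared Bigarray, fork worker processes, and let each one update its own slice in place. The function name `parallel_square`, the squaring workload and the chunking scheme are illustrative assumptions; it needs the `unix` library (OCaml 4.12 or later for `Unix._exit`).

```ocaml
(* Sketch only: process-level parallelism over a shared, file-backed Bigarray.
   Build with the unix library, e.g. ocamlfind ocamlopt -package unix ... *)
open Bigarray

(* Hypothetical helper: fill an n-element array with 0..n-1, then square
   every element in place using [workers] forked processes. *)
let parallel_square ~path ~n ~workers =
  let fd = Unix.openfile path [ Unix.O_RDWR; Unix.O_CREAT; Unix.O_TRUNC ] 0o600 in
  (* shared = true: writes made by child processes are visible to the parent.
     The file is grown automatically to hold n float64 values. *)
  let a = array1_of_genarray (Unix.map_file fd float64 c_layout true [| n |]) in
  for i = 0 to n - 1 do
    a.{i} <- float_of_int i
  done;
  let chunk = (n + workers - 1) / workers in
  let pids =
    List.init workers (fun w ->
        match Unix.fork () with
        | 0 ->
            (* Child: work on the half-open slice [lo, hi), then exit at once. *)
            let lo = w * chunk and hi = min n ((w + 1) * chunk) in
            for i = lo to hi - 1 do
              a.{i} <- a.{i} *. a.{i}
            done;
            Unix._exit 0
        | pid -> pid)
  in
  (* Parent: wait for every worker before reading the results. *)
  List.iter (fun pid -> ignore (Unix.waitpid [] pid)) pids;
  Unix.close fd;
  a

let () =
  let path = Filename.temp_file "bigarray_demo" ".bin" in
  let a = parallel_square ~path ~n:1000 ~workers:4 in
  Printf.printf "a.{3} = %g\n" a.{3};
  Sys.remove path
```

Because each worker is a separate process with its own heap, this sidesteps the single-core runtime entirely; the cost is that only flat numerical data in the mapping is shared.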

At a later stage I replaced a few bits of code (e.g. some sparse matrix routines) with Fortran. The only processing-related part that wasn't OCaml (besides the shell scripts to glue things together) was the image alignment algorithm, which was written by someone else in C++. I even had a job scheduling system written in OCaml.

7 hours agorho_soul_kg_m3

Nice work! Did you ever open source any of it? Looking at your OCaml wishlist from back in 2017, some stuff has improved and some is on its way:

- Support for read-only BigArrays (or sections) : we're starting to switch to just using bytes/string in OCaml 5+ now, since the larger allocations go into malloc'ed pools and do not relocate, so they can be used as part of an FFI (without the Bigarray C value overhead)

- More support for floating-point numbers (exceptions, representation exploration): OxCaml has some of this now! https://oxcaml.org/documentation/miscellaneous-extensions/sm...

- Syntax for extended BigArray indexing: now supported in OCaml https://ocaml.org/manual/5.4/indexops.html#ss:multiindexing

- LaCaml remains too low-level (non-functional) and unreadable: still remains the case, but OxCaml's got initial support for SIMD https://oxcaml.org/documentation/simd/intro

- BigArray and floating-point I/O remains difficult (we would like: I/O to channels, efficient representation retrieval): much easier now with OCaml effects to build custom fast serialisers (see https://github.com/ocaml-multicore/eio)

- Native top-level: ocamlnat is (I think) shipped in OxCaml, but you can also run a wasm toplevel
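To illustrate the extended-indexing item above, here is a small sketch of user-defined multi-index operators on a generic Bigarray. The operator name `.%{;..}` and the `cube` array are arbitrary choices for this example; the only library functions used are `Bigarray.Genarray.get` and `Bigarray.Genarray.set`.

```ocaml
open Bigarray

(* User-defined multi-index operators: a.%{i; j; k} passes [| i; j; k |]
   as the index array to the function bound here. *)
let ( .%{;..} ) = Genarray.get
let ( .%{;..}<- ) = Genarray.set

(* A 2 x 3 x 4 array of float64 values, used only for the demo. *)
let cube = Genarray.create float64 c_layout [| 2; 3; 4 |]

let () =
  Genarray.fill cube 0.0;
  cube.%{1; 2; 3} <- 1.23;
  Printf.printf "%g\n" (cube.%{1; 2; 3})
```

The same mechanism works for any container type, so a numerical library could expose its own slicing or bounds-checked accessors behind such operators.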

6 hours agoavsm

Thanks. Regarding open-sourcing, well no, it's not up to me, and it would be kind of proprietary.

The size variants for floats and integers are definitely appreciated.

For the "read-only BigArrays": At the time I didn't know any Rust, but today that would simply be passing a mutable or immutable reference. Similar to the Fortran in/out designators in some way. I think that's pretty important when you have some complicated numerical code, sometimes with in-place modification.

Since there is a "zero_alloc checker", maybe a similar kind of annotation exists or could be added? Something like

  let foo (x : [@readonly]) = ... 
    x.{0} <- 1.23

  ^ Attempt to write to read-only array
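
For comparison, plain OCaml can approximate this today with a phantom-type wrapper, at the cost of going through accessor functions instead of the `x.{i}` syntax. A minimal sketch, where the `View` module and its function names are made up for illustration (this is not an existing library or an OxCaml feature):

```ocaml
open Bigarray

(* Hypothetical module: the phantom parameter 'perm records, at the type
   level only, whether a view may be written through. *)
module View : sig
  type 'perm t
  val of_bigarray : (float, float64_elt, c_layout) Array1.t -> [ `R | `W ] t
  val read_only : [> `R ] t -> [ `R ] t
  val get : [> `R ] t -> int -> float
  val set : [> `W ] t -> int -> float -> unit
end = struct
  (* At runtime a view is just the underlying array: no copy is made. *)
  type 'perm t = (float, float64_elt, c_layout) Array1.t
  let of_bigarray a = a
  let read_only a = a
  let get = Array1.get
  let set = Array1.set
end

let a = Array1.create float64 c_layout 3
let () = Array1.fill a 1.0

let rw = View.of_bigarray a
let () = View.set rw 0 1.23

let ro = View.read_only rw
let () = Printf.printf "%g\n" (View.get ro 0)

(* Uncommenting the next line is a compile-time type error, because
   [ `R ] does not include `W:
   let () = View.set ro 0 4.56 *)
```

Note the limits of this approach: the read-only view aliases the same memory, so anyone still holding `rw` or `a` can modify it. It only prevents writes through the restricted handle.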
2 hours agorho_soul_kg_m3

The big win here is having a GC by default, with the ability to reduce heap allocations (via stack) just by adding in more typing annotations.

    Switching to OxCaml with exclave_ stack_ annotations drops
    p99.9 latency from 29 ns to 9 ns per packet on the dispatch
    hot path, and removes GC pressure entirely (394 minor GCs to
    zero over 25 million packets). Throughput is comparable [...]

I got a similar result with my 'httpz' stack a few months ago (https://anil.recoil.org/notes/oxcaml-httpz) which my website's been running on without drama. And, I gotta say, OxCaml's a surprisingly robust compiler for being packed full of bleeding edge extensions: not a single crash on my infra is attributable to a compiler bug (plenty of bad OCaml code, but not due to a compilation bug)

8 hours agoavsm

It is interesting seeing more and more GCed ecosystems become aggressive about allowing code to stack allocate more. Watching dotnet go through it since I think Core 2.1, or whenever they introduced Span<T>, Memory<T>, etc to get significant performance gains has been nice to track.

GCed languages do not have to be slow if you keep the garbage to only where it is necessary (or where you can allocate once and never collect).

2 hours agorunevault

This was pretty common in the 1980s and 90s. For some strange reason, maybe due to Java and scripting languages, there is this mentality that having a GC means no stack allocations.

Lisp Machine dialects (Genera, TI, Xerox) had primitives for stack allocation.

Then we had Cedar, CLU, Oberon and all its descendants, Modula-2+, Modula-3, Eiffel, Sather, and probably others during the last century.

Ironically, the final design for Valhalla in Java seems to be quite close to what Eiffel already had in 1986.

an hour agopjmlp

Yes, just like PTC and Aicas have been delivering real time GC with their embedded Java toolchains, microEJ, Astrobe with Oberon, and Meadow with their micro kernel + .NET.

Mentality only gets changed with people pushing against "this is how it has always been".

Also great to see the OCaml improvements, as my first ML was Caml Light.

2 hours agopjmlp

I think robustness is helped a lot by the fact that it’s the production compiler used at Jane Street

7 hours agoShoop

Yeah; all the really dangerous extensions are gated behind flags. But there's still a very significant number of optimisations available by default that just work well. I've taken to compiling my normal OCaml code with OxCaml these days to get a free speed boost (but buyer beware: the dependency management can be tricky; I have a giant monorepo to help out https://github.com/avsm/oxmono)

7 hours agoavsm

Nim does much the same. It prefers the stack, wraps dynamic heap types in value-semantic unique pointers by default, and avoids implicit copies wherever it can. I could see compiled languages trending in the stack-managed direction long term.

7 hours agonetbioserror

It's more like a seamless generalization of the "stack-managed" pattern since async contexts also usually manage resources but don't reside on a stack.

5 hours agozozbot234

I know that many garbage collected languages have ways of reducing GC pressure by minimizing classes and pushing more things onto the stack. I’ve even heard how languages like Java will allocate a massive amount of memory at the beginning and then turn off the garbage collector for the whole day in high-frequency trading scenarios.

Having never been in this situation, I wonder how difficult it is to bend a garbage collected language to behave like a non garbage collected one

6 hours agoDecabytes

It's always difficult to have GC and non-GC objects interact seamlessly. You have to allow GC object finalizers to drop non-GC data, and non-GC objects to register GC objects they might reference as temporary roots (keep them alive) or somehow allow the GC tracing pass to discover what they might be referencing. And you still can't involve non-GC objects in any cycles, they have to be neatly self-contained leaf-like or tree-like sections of your reference graph.

5 hours agozozbot234

Depends if the language supports value types and stack allocation or not.

Many GC languages do so.

The hard part is that the difference is part of the type system, and you might need to refactor some code moving between value and reference types.

an hour agopjmlp

CCSDS guides you to reinvent everything from scratch; I doubt memory safety is the biggest attack surface when you implement this stack. I don't know how the big players implement networking for their satellites, but personally I would choose to fit something existing and battle-tested like TLS instead of reinventing data encryption. Just look at those documents: https://www.google.com/search?client=firefox-b-lm&q=ccsds+en...

7 hours agodsab

(author of the post here)

Hey dsab! I agree, but CCSDS is what we have today. We need to support it properly first if we ever want to extend or transition away. It also doesn't help that there's no good open-source implementation of the whole stack, especially the SDLS part, which makes the transition even harder.

On the type-safety side, I found typed combinators really useful for describing parsing and serialising (see my earlier post on ocaml-wire[1]), and keeping the protocol logic pure (separate from I/O) makes the whole thing much easier to test and reason about. OCaml's fuzzing support pairs really well with types too. This is basically the nqsb-TLS approach [2], which has held up in ocaml-tls for a decade.

[1] https://gazagnaire.org/blog/2026-03-31-ocaml-wire.html [2] https://www.usenix.org/conference/usenixsecurity15/technical...
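
As a rough illustration of the "pure protocol logic, separate from I/O" point, here is a hand-written sketch of decoding and encoding the 6-byte CCSDS Space Packet primary header as pure functions over `bytes`. It deliberately does not use ocaml-wire or any of its combinators; the record type and function names are assumptions made for this example.

```ocaml
(* CCSDS Space Packet primary header: 6 bytes, big-endian bit fields. *)
type header = {
  version : int;          (* 3 bits *)
  is_telecommand : bool;  (* packet type, 1 bit: 0 = telemetry, 1 = telecommand *)
  has_secondary : bool;   (* secondary header flag, 1 bit *)
  apid : int;             (* application process identifier, 11 bits *)
  seq_flags : int;        (* 2 bits *)
  seq_count : int;        (* 14 bits *)
  data_length : int;      (* 16 bits: octets in the data field minus one *)
}

(* Pure decoder: no I/O, errors are returned as values. *)
let decode (b : bytes) : (header, string) result =
  if Bytes.length b < 6 then Error "short header"
  else
    let w0 = Bytes.get_uint16_be b 0 in
    let w1 = Bytes.get_uint16_be b 2 in
    let w2 = Bytes.get_uint16_be b 4 in
    Ok
      {
        version = w0 lsr 13;
        is_telecommand = (w0 lsr 12) land 1 = 1;
        has_secondary = (w0 lsr 11) land 1 = 1;
        apid = w0 land 0x7FF;
        seq_flags = w1 lsr 14;
        seq_count = w1 land 0x3FFF;
        data_length = w2;
      }

(* Pure encoder: out-of-range fields are masked to their bit width. *)
let encode (h : header) : bytes =
  let b = Bytes.create 6 in
  Bytes.set_uint16_be b 0
    (((h.version land 0x7) lsl 13)
    lor (Bool.to_int h.is_telecommand lsl 12)
    lor (Bool.to_int h.has_secondary lsl 11)
    lor (h.apid land 0x7FF));
  Bytes.set_uint16_be b 2
    (((h.seq_flags land 0x3) lsl 14) lor (h.seq_count land 0x3FFF));
  Bytes.set_uint16_be b 4 (h.data_length land 0xFFFF);
  b
```

Since both directions are plain functions, a round-trip property such as `decode (encode h) = Ok h` can be checked by a unit test or a fuzzer without any sockets or hardware in the loop.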

4 hours agoeriangazag

The TL;DR here (https://ccsds.org/Pubs/350x9g2.pdf) seems to be "AES GCM", but with lots and lots of legacy protocols due to older birds in the sky. DTLS or HTTP3 would seem to be a better choice these days...

7 hours agoavsm

What’s surprised me in the last few months is that agents are great at producing OCaml 5+ and OxCaml code, not much of which is out there in the training data. OxCaml’s strong types and modes seem to serve as great testable oracles to guide the agents.

I taught a course on concurrent programming based on OCaml 5 and OxCaml where almost all of the code in the teaching materials was vibe coded. I reviewed all of the code (because I was teaching it to a class of 50+ students) and frankly the agent writes better O(x)Caml (mostly) than me.

7 hours agokcsrk

I must confess to also using agents to do most of my OxCaml annotations: https://github.com/avsm/ocaml-claude-marketplace/tree/main/p...

There's not that much downside since the annotations only change the performance characteristics of the program, and the static type system rejects inconsistent annotations.

7 hours agoavsm

There is some bizarre facility with Hindley-Milner based languages embedded in LLMs: they're basically automatically good at even very new ones like Gleam and nanolang. I have a never-released-anywhere hobby ML that compiles to Lua and coding models can write it fine. Better than they write Python or PHP for sure, and those have huge corpuses in the training data.

I don't even have good conjecture about why this is the case but right now all my assisted coding is in MLs for this reason.

2 hours agogiraffe_lady

I've long thought that Rust needs an algebraic effects system similar to OCaml 5's. Has anyone used both and compared how well they work for various use cases? Rust is of course more mature than OxCaml but if it's good enough for Jane Street...

6 hours agosatvikpendem

HN is currently obsessed with Rust vs Zig. OxCaml should be considered as an alternative to both. The argument for Rust is safety, while for Zig it's ergonomics, but OxCaml shows you can have safety and ergonomics together. In my little tinkering with it [1] I found it really easy to use.

[1]: https://noelwelsh.com/posts/a-quick-introduction-to-oxcaml/

7 hours agonoelwelsh

OxCaml is more of a competitor to Go, JS/TypeScript or the Java/.NET ecosystems than these two other languages. It's also a temporary effort that's ultimately intended to feed into upstream OCaml.

5 hours agozozbot234

I think that’s not true; vanilla OCaml is already a competitor to Go, etc. OxCaml is explicitly an effort to compete more with Rust (the “Ox” in the name is to evoke “oxidizing” = rusting)

5 hours agolegobmw99

> the “Ox” in the name is to evoke “oxidizing”

Hah, I was reading it as `0x`, a common prefix indicating hexadecimal, though I can't say my brain made any leap as to why "0xCAML" would be any more hex than standard.

3 hours agomichaelcampbell

Agreed with this. OxCaml still requires a runtime, so it's not suitable for some applications, like embedded systems, where e.g. Rust can be used. But it certainly can be used for many of the same applications. E.g. Bun, which has been on the home page recently, could easily be written in OxCaml.

4 hours agonoelwelsh

Only for GC haters.

For the rest of us, languages with automatic resource management are perfectly usable in systems programming.

an hour agopjmlp

I'm not deeply familiar with reliability-focused languages, but as far as I know, Ada, Rust, and Haskell are the most prominent ones. What made OCaml a better choice here over those alternatives?

5 hours agoMaksadbek

(author of the post here)

Hey Maksadbek! Great question. It's a trade-off between speed of writing and trust in what you wrote, and OCaml (especially OxCaml) sits at a really good point on that curve.

Ada/SPARK has the strongest verification story and decades of space heritage, but the development cost is higher. Rust would work too, but I actively want a GC by default with the option to turn it off on the hot path. That is exactly what OxCaml's mode system gives you: zero minor GCs on the dispatch loop in the post, while the rest stays GC-managed. Haskell is great for type-driven design but its runtime cost-model is harder for low-jitter work.

Plus, the OCaml ecosystem gave me solid foundations on both fronts. For the protocol stack: MirageOS-style clean separation between wire serialisation, pure state-machine management and I/O, with ML modules and GADTs that map naturally onto protocol state machines. For the crypto: mirage-crypto for OCaml-facing primitives (fiat-crypto under the elliptic curves), and libcrux for ML-DSA-65 post-quantum signing. The CCSDS and BPv7/BPSec layers themselves I had to write from scratch (my earlier posts walk through how), and 20 years of OCaml muscle memory definitely helped!
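
To give a flavour of how GADTs map onto protocol state machines, here is a toy sketch. The states, events and the `session` record are invented for this example and are not taken from the actual CCSDS/SDLS code: the type parameters encode which state an event is valid in, so an out-of-order transition is rejected by the type checker rather than at runtime.

```ocaml
(* Type-level tags for the two states of a hypothetical secure link. *)
type idle = Idle_tag
type keyed = Keyed_tag

type session = { key_id : int; frames_sent : int }

(* The type index says which state a value represents. *)
type _ state =
  | Idle : idle state
  | Keyed : session -> keyed state

(* An event carries its source and target state in its type. *)
type (_, _) event =
  | Install_key : int -> (idle, keyed) event
  | Send_frame : (keyed, keyed) event
  | Drop_key : (keyed, idle) event

(* Pure transition function: only well-typed (state, event) pairs exist,
   so the three cases below are exhaustive. *)
let step : type a b. a state -> (a, b) event -> b state =
 fun state event ->
  match state, event with
  | Idle, Install_key key_id -> Keyed { key_id; frames_sent = 0 }
  | Keyed s, Send_frame -> Keyed { s with frames_sent = s.frames_sent + 1 }
  | Keyed _, Drop_key -> Idle

let final = step (step (step Idle (Install_key 7)) Send_frame) Send_frame

(* step Idle Send_frame does not type-check: Send_frame needs a keyed state. *)
```

The transition function stays pure, so it can be tested exhaustively, and the I/O layer only has to feed it events.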

4 hours agoeriangazag

Hey Thomas, thank you for the detailed explanation! This sounds very interesting. I really wanted to learn something less complex than Rust/Haskell, fast enough for high-performance computing, and also reliable. I should learn some OCaml :)

3 hours agoMaksadbek

>KC ended his talk speculating that OCaml 5.0 would go to the moon, due to its language features that would deliver C/Rust-like performance

That is quite an affirmation! I would like to see OCaml being there.

6 hours agoDeathArrow

I'll have to take a look at OxCaml. I'm leery of "C-like" performance claims after Java has thoroughly failed to live up to a similar claim after thirty plus years of development... What it's actually achieved is about 50% C performance, IF you're willing to give it a huge heap, at least 2x the actually required memory.

Rust is clearly well positioned for deeply embedded work, and has actual C/C++ level performance. Given AI coding assistance, Rust is looking more and more approachable...and of course faster processors and compiler improvements will solve the compilation speed issue over time.

All that said, there's nothing wrong with a fast, safe language with ML syntax!

(One dark horse in all this is Mojo, which may provide Rust level safety with a more ergonomic language, and a much faster compiler...)

6 hours agoElectronCharge

With the recent direction Mojo has taken, that dark horse sounds more like a pipedream to me

3 hours agoskyblock500

nice

8 hours agochloe_liu23

nice

8 hours agoharrymatics

She (Jane Street) is not gonna notice you bro.

8 hours agohudsonhs

don’t look at the user that reposted the article