Notes by djb on using Fil-C

To summarize, he's sufficiently impressed with it that he's embarking on an attempt to rebuild an entire Debian system with it, and he's written some software (a GC shim library and build scripts) that is likely to be of interest to others who are attempting the same thing.

> I had originally configured the server phoenix with only 12GB swap. I then had to restart ./build_all_fast_glibc.sh a few times because the Fil-C compilation ran out of memory. Switching to 36GB swap made everything work with no restarts; monitoring showed that almost 19GB swap (plus 12GB RAM) was used at one point. A larger server, 128 cores with 512GB RAM, took 8 minutes for Fil-C plus 6 minutes for musl, with no restarts needed.

Yikes, that’s a lot of memory! Fil-C is doing a lot of static analysis, apparently.

I think that's the build of LLVM+Clang itself.

Yes, linking LLVM takes up a lot of memory. The documented guidance is to allow one link job per 15 GB of RAM [1].

[1] https://llvm.org/docs/CMake.html#frequently-used-llvm-relate...

And, fairly uniquely, LLVM has a LLVM_PARALLEL_LINK_JOBS setting that is distinct from the number of parallel jobs for everything else. I think I was using that 15 years ago.
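
As a rough illustration of the 15 GB-per-link-job guidance mentioned above, here is a small Python sketch that derives a value one might pass as LLVM_PARALLEL_LINK_JOBS. The function name and the floor of one job are assumptions made for illustration; they are not taken from the LLVM documentation.

```python
def suggested_link_jobs(ram_gb: float, gb_per_link_job: float = 15.0) -> int:
    """Return how many parallel link jobs fit in RAM, never fewer than one."""
    return max(1, int(ram_gb // gb_per_link_job))


if __name__ == "__main__":
    # 12 GB and 512 GB are the two machine sizes quoted from the notes.
    for ram_gb in (12, 512):
        jobs = suggested_link_jobs(ram_gb)
        print(f"{ram_gb} GB RAM -> -DLLVM_PARALLEL_LINK_JOBS={jobs}")
```

By this estimate the 12 GB machine gets a single link job, which is consistent with it needing swap once several links ran at the same time.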

I wish GCC had it. I have a quad core machine with 16 GB RAM that OOMs on building recent GCC -- 15 and HEAD for sure, can't remember whether 14 is affected. Enabling even 1 GB of swap makes it work. The culprit is four parallel link jobs needing ~4 GB each.

There are only four of them, so a -j8 build (e.g., with HT) is no worse.

Is that why the Rust toolchain can't be compiled on a 32-bit system?

It's part of the problem. Pretty sure though even rustc at this point needs more than 3GB of addressable memory.

For those who might miss it, the notes cite a new 64-bit version of cdb that supports exabyte databases

https://cdb.cr.yp.to

Also maybe of interest is that the new cdb subdomain is using pqconnect instead of dnscurve

The PQConnect documentation, specifically the document "INSTALL.md", describes the pq1 portion of the CNAME as a subdomain.

   Please update your DNS A/AAAA records for all domains on this server as follows:

   Existing record:
   Type    Name        Value
   A/AAAA  SUBDOMAIN   IP Address

   New Records:
   Type    Name        Value
   CNAME   SUBDOMAIN   pq1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.DOMAIN.TLD
   A/AAAA  pq1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX  IP Address
   TXT    pq1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.DOMAIN.TLD    p=42424
   TXT    ks.pq1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.DOMAIN.TLD    ip=IP ADDRESS;p=42425"

NS record for cdb.cr.yp.to is as follows

   2 cdb.cr.yp.to - streamlined DNSCurve:
   209 bytes, 1+1+1+2 records, response, authoritative, noerror
   query: 2 cdb.cr.yp.to
   answer: cdb.cr.yp.to 30 CNAME pq1jbw2qzb2201xj6pyx177b8frqltf7t4wdpp32fhk0w3h70uytq5020w020l0.yp.to
   authority: yp.to 3600 NS uz5jmyqz3gz2bhnuzg0rr0cml9u8pntyhn2jhtqn04yt3sm5h235c1.yp.to
   additional: uz5jmyqz3gz2bhnuzg0rr0cml9u8pntyhn2jhtqn04yt3sm5h235c1.yp.to 3600 A 131.193.32.108
   additional: uz5jmyqz3gz2bhnuzg0rr0cml9u8pntyhn2jhtqn04yt3sm5h235c1.yp.to 3600 A 131.193.32.109

> Also maybe of interest is that the new cdb subdomain is using pqconnect instead of dnscurve

This is not correct. There isn't a cdb subdomain because cdb.cr.yp.to doesn't have NS records, which is where DNSCurve fits in. If you have a DNSCurve resolver, then your queries for cdb.cr.yp.to will use DNSCurve and will be sent to the yp.to nameservers.

From there, if you have pqconnect, your http(s) connection to cdb.cr.yp.to will happen over pqconnect.

Maybe the confusion is because both DNSCurve and pqconnect encode pubkeys in DNS, but they do different things.

Here is DNSCurve:

  $ dig +short ns yp.to
  uz5jmyqz3gz2bhnuzg0rr0cml9u8pntyhn2jhtqn04yt3sm5h235c1.yp.to.

Here is pqconnect:

  $ dig +short cdb.cr.yp.to
  pq1htvv9k4wkfcmpx6rufjlt1qrr4mnv0dzygx5mlrjdfsxczbnzun055g15fg1.yp.to.
  131.193.32.108

Like CurveCP, pqconnect puts the pubkey into a CNAME.

RFC 1034 Domain Concepts and Facilities November 1987 [Page 8]

"A domain is identified by a domain name, and consists of that part of the domain name space that is at or below the domain name which specifies the domain. A domain is a subdomain of another domain if it is contained within that domain. This relationship can be tested by seeing if the subdomain's name ends with the containing domain's name. For example, A.B.C.D is a subdomain of B.C.D, C.D, D, and " "."

   1 cdb.cr.yp.to - regular DNS:
   124 bytes, 1+2+0+0 records, response, noerror
   query: 1 cdb.cr.yp.to
   answer: cdb.cr.yp.to 30 CNAME pq1jbw2qzb2201xj6pyx177b8frqltf7t4wdpp32fhk0w3h70uytq5020w020l0.yp.to
   answer: pq1jbw2qzb2201xj6pyx177b8frqltf7t4wdpp32fhk0w3h70uytq5020w020l0.yp.to 30 A 131.193.32.109

In the terminology of RFC1034, cdb.cr.yp.to, a CNAME, can be described as a subdomain of cr.yp.to and yp.to
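
The RFC 1034 test quoted above ("ends with the containing domain's name") can be sketched in Python as follows. This is only an illustration: the function name is made up here, and the comparison is done label by label so that a name such as notyp.to is not mistaken for a subdomain of yp.to.

```python
def is_subdomain(name: str, domain: str) -> bool:
    """RFC 1034 style test: does `name` end with `domain`, label by label?

    A name is treated as being contained in itself, and every name is
    contained in the root domain (written "." or "").
    """
    name_labels = name.rstrip(".").lower().split(".")
    domain_labels = domain.rstrip(".").lower().split(".")
    if domain_labels == [""]:
        return True  # the root contains everything
    return name_labels[-len(domain_labels):] == domain_labels


if __name__ == "__main__":
    for parent in ("cr.yp.to", "yp.to", "to", "."):
        print(parent, is_subdomain("cdb.cr.yp.to", parent))
```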

(NB. The pq1 portion is not a public key, it is a hash of a server's long-term public key)

Correction: s/a CNAME/an alias/

Use of pqconnect at yp.to is probably old news but the cdb.cr.yp.to CNAME does appear to be new as of around 21 Oct

The notes on using Fil-C were submitted three days ago

https://news.ycombinator.com/item?id=45765718

s/CNAME/alias/

https://news.ycombinator.com/item?id=45663435 (discussed 11d ago)

Cool project! I take it the goal is that, overhead being acceptable, most C / C++ programmes don't actually "have to be" rewritten in something like Rust?

I wonder how / where Epic Games comes in?

Note that Fil-C is a garbage-collected language that is significantly slower than C.

It's not a target for writing new code (you'd be better off with C# or golang), but something like sandboxing with WASM, except that Fil-C crashes more precisely.

From the topic starter: "I've posted a graph showing nearly 9000 microbenchmarks of Fil-C vs. clang on cryptographic software (each run pinned to 1 core on the same Zen 4). Typically code compiled with Fil-C takes between 1x and 4x as many cycles as the same code compiled with clang"

Thus, Fil-C compiled code is 1 to 4 times as slow as plain C. This is not in the "significantly slower" ballpark, like where most interpreters are. The ROOT C/C++ interpreter is 20+ times slower than binary code, for example.

Cryptographic software is probably close to a best case scenario since there is very little memory management involved and runtime is dominated by computation in tight loops. As long as Fil-C is able to avoid doing anything expensive in the inner loops you get good performance.

  > best case scenario since there is very little memory management involved and runtime is dominated by computation in tight loops.

This describes most C programs and many, if not most, C++ programs. Basically, this is how C/C++ code is being written, by avoiding memory management, especially in tight loops.

This depends heavily on what problem domain you're talking about. For example, a DBMS is necessarily going to shuffle a lot of data into and out of memory.

I am a professional database developer. We do not do what you are thinking we are doing. ;)

Most databases do almost no memory management at runtime, at least not in any conventional sense. They mostly just DMA disk into and out of a fixed set of buffers. Objects don't have a conventional lifetime.

It depends. Consider DuckDB or another heavily vectorized columnar DB: there's a big part of the system (SQL parser, storage chunk manager, etc.) that's not especially performance-sensitive and a set of tiny, fast kernels that do things like predicate-push-down-based full table scans, ART lookups, and hash table creation for merge joins. DuckDB is a huge pile of C++. I don't see a RIIR taking off before AGI.

But you know what might work?

Take current DuckDB, compile it with Fil-C, and use a new escape hatch to call out to the tiny unsafe kernels that do vectorized high-speed columnar data operations on fixed memory buffers that the safe code sets up on behalf of the unsafe kernels. That's how it'd probably work if DuckDB were implemented in Rust today, and it's how it could be made to work with Fil-C without a major rewrite.

Granted, this model would require Fil-C's author to become somewhat less dogmatic about having no escape hatches at all whatsoever, but I suspect he'll un-harden his heart as his work gains adoption and legitimate use-cases for an FFI/escape hatch appear.

> DuckDB is a huge pile of C++. I don't see a RIIR taking off before AGI.

While I'm not a big fan of rewriting things, all of DuckDB has been written in the last 10 years. Surely a rewrite with the benefit of hindsight could reach equivalent functionality in less than 10 years?

the sqlite RIIR is going quite well: https://turso.tech/blog/beyond-the-single-writer-limitation-...

(sqlite is quite a bit smaller than DuckDB tho)

Is it? It's much less new.

for one, duckdb includes all of sqlite (and many other dependencies). it knows how to do things like efficiently query over parquet files in s3. it's expansive - a swiss army knife for working with data wherever it's at.

sqlite is a "self contained system" depending on no external software except c standard library for target os:

> A minimal build of SQLite requires just these routines from the standard C library:

> memcmp(), memcpy(), memmove(), memset(), strcmp(), strlen(), strncmp()

> Most builds also use the system memory allocation routines:

> malloc(), realloc(), free()

> Default builds of SQLite contain appropriate VFS objects for talking to the underlying operating system, and those VFS objects will contain operating system calls such as open(), read(), write(), fsync(), and so forth

Quoting from the appropriately named https://sqlite.org/selfcontained.html

as a very rough and unfair estimate between the two project's source, sqlite is about 8% the size of duckdb:

    $ pwd
    /Users/jitl/src/duckdb/src
    $ sloc .
    
    ---------- Result ------------
    
                Physical :  418092
                  Source :  317274
                 Comment :  50113
     Single-line comment :  46187
           Block comment :  3926
                   Mixed :  4415
     Empty block comment :  588
                   Empty :  55708
                   To Do :  136
    
    Number of files read :  2611
    
    ----------------------------
    $ cd ~/Downloads/sqlite-amalgamation-3500400/
    $ sloc .
    
    ---------- Result ------------
    
                Physical :  34742
                  Source :  25801
                 Comment :  8110
     Single-line comment :  1
           Block comment :  8109
                   Mixed :  1257
     Empty block comment :  1
                   Empty :  2089
                   To Do :  5
    
    Number of files read :  2
    
    ----------------------------
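
The "about 8%" figure can be checked directly from the two "Source" counts in the sloc output above. A minimal Python sketch (the variable names are just labels for those two numbers):

```python
# "Source" line counts copied from the two sloc runs above.
duckdb_source_lines = 317274
sqlite_source_lines = 25801

ratio = sqlite_source_lines / duckdb_source_lines
print(f"SQLite amalgamation is {ratio:.1%} of duckdb/src by source lines")
```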

Oh, wow! I really had no idea!

Along with the sibling comment, microbenchmarks should not be used as authoritative data when the use case is full applications. For that matter, highly optimized Java or Go may be "1 to 4 times as slow as plain C". Fil-C has its merits, but they should be described carefully, just as with any technology.

I replied to unwarranted (to my eye) call that Fil-C is significantly slower than plain C.

Fil-C has its drawbacks, but they should be described carefully, just as with any technology.

I maintain that microbenchmarks are not convincing, but you have a fair point that GP's statement is unfounded, and now I've made a reply to GP to that effect.

Or JavaScript for that matter

What does "significantly" mean to you? To my ear, "significantly" means "statistically significant".

What languages do people considering C as an option for a new project consider? Rust is the obvious one we aren't going to discuss, because then we won't be able to talk about anything else. Zig is probably almost as well loved and defended, but it isn't actually memory safe; it just makes memory safety much easier. As you say, C# and Go; also maybe F# and OCaml. If we are just writing simple C-style stuff, none of those would look all that different. Go has some UB related to concurrency that people run into, but most of these simple utilities are either single-threaded or fine-grained parallel, which is pretty easy to get right. Julia too, maybe?

In terms of GC quality, Nim comes to mind.

I keep ignoring Nim for some reason. How fast is it with all the checks on? The benchmarks for it, Julia, and Swift typically turn off safety checks, which is not how I would run them.

Since anything/0 = infinity, these kinds of things always depend upon what programs do and, as a sibling comment correctly observes, how much they interfere with SIMD autovectorization and several other things.

That said, as a rough guideline, nim c -d=release can certainly be almost the same speed as -d=danger and is often within a few (single digits) percent. E.g.:

    .../bu(main)$ nim c -d=useMalloc --panics=on --cc=clang -d=release -o=/t/rel unfold.nim
    Hint: mm: orc; opt: speed; options: -d:release
    61608 lines; 0.976s; 140.723MiB peakmem; proj: .../bu/unfold.nim; out: /t/rel [SuccessX]
    .../bu(main)$ nim c -d=useMalloc --panics=on --cc=clang -d=danger -o=/t/dan unfold.nim
    Hint: mm: orc; opt: speed; options: -d:danger
    61608 lines; 2.705s; 141.629MiB peakmem; proj: .../bu/unfold.nim; out: /t/dan [SuccessX]
    .../bu(main)$ seq 1 100000 > /t/dat
    .../bu(main)$ /t
    /t$ re=(chrt 99 taskset -c 2 env -i HOME=$HOME PATH=$PATH)
    /t$ $re tim "./dan -n50 <dat>/n" "./rel -n50 <dat>/n"
    225.5 +- 1.2 μs (AlreadySubtracted)Overhead
    4177 +- 15 μs   ./dan -n50 <dat>/n
    4302 +- 17 μs   ./rel -n50 <dat>/n
    /t$ a (4302 +- 17)/(4177 +- 15)
    1.0299 +- 0.0055
    /t$ a 299./55
    5.43636... # kurtosis=>5.4 sigmas is not so significant

Of course, as per my first sentence, the best benchmarks are your own applications run against your own data and its idiosyncratic distributions.

EDIT: btw, /t -> /tmp which is a /dev/shm bind mount while /n -> /dev/null.
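
For readers who want to reproduce the ratio in the transcript above, here is a hedged Python sketch. It assumes the two timings have independent errors and uses standard first-order error propagation (relative errors added in quadrature); whether the `a` calculator in the transcript does exactly this is an assumption, although the result agrees with its output to the digits shown.

```python
import math


def ratio_with_error(a, sigma_a, b, sigma_b):
    """Return a/b and its 1-sigma uncertainty, assuming independent errors."""
    ratio = a / b
    relative_error = math.hypot(sigma_a / a, sigma_b / b)
    return ratio, ratio * relative_error


# Timings in microseconds from the transcript: ./rel over ./dan.
ratio, sigma = ratio_with_error(4302, 17, 4177, 15)
print(f"{ratio:.4f} +- {sigma:.4f}")
print(f"excess over 1.0, in sigmas: {(ratio - 1) / sigma:.2f}")
```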

In Julia, at least, bounds checks tend to be a pretty minor hit (~20%) unless the bounds check gets in the way of vectorization

A GC lang isn't necessarily significantly slower than C. You should qualify your statements. Moreover, this is a variant of C, which means that the programs are likely less liberal with heap allocations. It remains to be seen how much of a slowdown Fil-C imposes under normal operating conditions. Moreover, although it is indeed primarily suited for existing programs, its use in new programs isn't necessarily worse than, e.g., C# or Go. If performance is the deciding factor, probably use Rust, Zig, Nim, D, etc.

WASM is a sandbox. It doesn't obviate memory safety measures elsewhere. A program with a buffer overflow running in WASM can still be exploited to do anything that program can do within the WASM sandbox, e.g. disclose information it shouldn't. WASM ensures such a program can't escape its container, but memory safety bugs within a container can still be plenty harmful.

At least WASM can be added incrementally. Fil-C is all or nothing and it cannot be used without rebuilding everything. In that respect a sandbox ranks lower in comprehensiveness but higher in practicality and that's the main issue with Fil-C. It's extremely impressive but it's not a practical solution for C's memory safety issues.

Filip of Fil-C is at Epic. Epic owns the copyright.

Can a program be written only partially in Fil-C? That is to say, can we link regular C and Fil-C object files in a single executable?

> There is no interoperability with Yolo-C (i.e. classic C). This is both a goal and the outcome of a non goal.

https://fil-c.org/runtime

(worth reading, i think all the stuff Fil writes is both super informative & quite entertaining.)

This is disappointing. I can write the networking parts in Rust and the rest of the program in C, but apparently can't do the same with Fil-C.

Is there a reason that some of the linked benchmarks, if I'm reading it right, have Fil-C running faster than C?[0] I assume it's just due to micro-benchmark variability but I'm curious. Some of them seem impossibly fast compared to C so I wonder if there are some correctness issues there.

[0] https://cr.yp.to/2025/20251028-filcc-vs-clang.html

Usually garbage collection does improve a lot of benchmarks; just look at the Hans Boehm GC benchmarks.

The two extreme outliers I see are labeled "aead/clx192q/opt,-O3" and "aead/schwaemm128128v2/opt,-Os" according to clicking on the points with devtools. aead/schwaemm128128v2/opt,-Os looks like it is almost at 0x. 1x is at about y = 659 and that test is at 769 out of I guess 780 based on the graph.

Back in the day, the cheat was to set up the GC so that the GC happened outside the timed portion of the benchmark. You know what's faster than the fastest GC? Not doing it.

Classic example: https://devblogs.microsoft.com/oldnewthing/20180228-00/?p=98...

For those, like me, that didn’t know what Fil-C is:

> Fil-C is a fanatically compatible memory-safe implementation of C and C++. Lots of software compiles and runs with Fil-C with zero or minimal changes. All memory safety errors are caught as Fil-C panics. Fil-C achieves this using a combination of concurrent garbage collection and invisible capabilities (InvisiCaps). Every possibly-unsafe C and C++ operation is checked. Fil-C has no unsafe statement and only limited FFI to unsafe code.

https://fil-c.org/

The posted article has a detailed explanation of djb successfully compiling a bunch of C and C++ codebases.

I guess to get on board with this, it is my understanding you have to accept the premise of a Garbage Collector in the runtime?

Note that it is a garbage collector designed and implemented by one of the most experienced GC experts on earth. He previously designed and implemented WebKit's state of the art concurrent GC, for example. So—yes, but don't dismiss it too quickly.

If that's all you need, the state of the art is very available already through the JVM and the .NET CLR, as well as a handful others depending on your use case. Most of those also come with decent languages, and great facilities to leverage the GC to its maximum.

But GCs aren't magic and you will never get rid of all the overhead. Even if the CPU time is not noticeable in your use case, the memory usage fundamentally needs to be at least 2-4x the actual working set of your program for GCs to be efficient. That's fine for a lot of use cases, especially when RAM isn't scarce.

Most people who use C or C++ or Rust have already made this calculation and deemed the cost to be something they don't want to take on.

That's not to say Fil-C isn't impressive, but it fills a very particular niche. In short, if you're bothering with a GC anyway, why wouldn't you also choose a better language than C or C++?

I don't understand the need to hammer in the point that Fil-C is only valuable for this tiny, teeny, irrelevant microscopic niche, while not even talking about what the niche is? To be clear, the niche is rebuilding your entire GNU/Linux userland with full memory safety and completely acceptable performance, tomorrow, without rewriting anything, right? Is this such a silly little idiosyncratic hobby?

So I don’t want to come off as dismissive of the effort - it’s certainly impressive!

The reason I’m not super excited is based on the widely publicized findings from Google and Microsoft (IIRC) about memory safety issues in their code: The vast majority is in new code.

As such, the returns on running the entire userspace with Fil-C may be quite diminished from the get-go. Those who need to guard against UB bugs in seriously battle-hardened C software in production are definitely a small niche.

But that doesn’t mean it isn’t also very useful as a tool during development.

Hmm, so if they're writing new memory unsafe code in C/C++, presumably to remain within their already established and entrenched C/C++ ecosystems, why isn't Fil-C interesting as a way to thwart memory safety issues in that new code?

Because every problem detected by Fil-C is already a serious problem in the existing code.

As a mitigation strategy, that becomes less interesting as the quality of that code increases, but you still pay the full cost regardless of whether there are actually any bugs.

That can certainly be valuable to you, but as a developer, the more interesting proposition is about how not to ship bugs in the first place.

As others have said, programs that have already been written are plainly not in the business of "not...[shipping] bugs in the first place". New code is new code; old code is old code.

It seems like there are constant updates for 20 year old packages on my Ubuntu systems. Ubuntu 20.04 Focal Fossa (first released April 2020) glibc had an update on 2025-05-28. Current stable updated glibc 2025-09-22. To say nothing about the rest of the packages in that operating system.

Oh, look at the time, a few more CVEs in C code, posted 3 hours ago to Hacker News: "X.Org Security Advisory: multiple security issues X.Org X server and Xwayland"

https://news.ycombinator.com/item?id=45790015

https://lists.x.org/archives/xorg-announce/2025-October/0036...

To torture the analogy: perhaps the "returns" are diminishing, but their absolute value is still a few million bucks, I'm happy to take those returns.

> The reason I’m not super excited is based on the widely publicized findings from Google and Microsoft (IIRC) about memory safety issues in their code: The vast majority is in new code

This makes perfect sense to me.

Which is why I don't at all understand the current fetish with rewriting things that have been working well for decades in Rust. Such as coreutils. Or apt.

It feels like an almost deliberate crippling of progress by diverting top talent into useless avenues, much like string theory in physics, or SLS/Artemis.

> It feels like an almost deliberate crippling of progress by diverting top talent into useless avenues, much like string theory in physics, or SLS/Artemis.

You don't have to be a "top talent" to rewrite old unix utilities. The hard part is writing it safely, which in Rust can be done without "top talent."

And then you end up with code 17 times slower than the C code it is replacing. When it didn't need replacing in the first place.

There's a contingent of rust fans that show up on every story about C – their premise is that C code is unsafe and most safety-critical C code should be rewritten in rust.

Fil-C is new and is a viable competitor to rust, that's why you're hearing all asides about tiny niches, unacceptable performance degradation, etc.

Hacker News is not a place where any one group brigades a thread. There are people who prefer C who don't want a GC, people who prefer Rust who don't want C, people who prefer Rust who agree with Fil-C for legacy C, people who don't prefer C or Rust and may use languages with GC.... We all have interests and face people who denigrate them in bad faith. If you have specific objections to inaccurate statements in this thread, then state them. I'll do the same for any technology if I'm qualified to make statements on it.

> Fil-C is new and is a viable competitor to rust

I’ve no horse in the race here, but the Fil-C page talks about a 4x overhead from using it, which feels like it would make it less competitive

Currently measured worst case for some types of code.

I tried it on my primes micro-benchmark (http://hoult.org/primes.txt) and got a 2:1 slowdown on 13th gen i9.

It does a LOT of array access and updating, probably near to worst-case for code that isn't just a loop copying bytes.

The average slowdown is probably more in the same region as using Java or C# or for that matter C++ std::array or std::vector.

If you missed it, djb himself posted this cute graph of "nearly 9000 microbenchmarks of Fil-C vs. clang on cryptographic software (each run pinned to 1 core on the same Zen 4)":

https://cr.yp.to/2025/20251028-filcc-vs-clang.html

I've heard Filip has some ideas about optimizing array performance to avoid capability checks on every access... doing that thread safely seems like an interesting challenge but I guess there are ways!

There’s no Rust fans here, only GC skeptics. GC skeptics existed long before anyone dreamed of Rust and will survive Rust as well.

It’s a pretty reasonable objection too (though I personally don’t agree). C has always been chosen when performance is paramount. For people who prioritise performance it must feel a bit weird to leave performance on the table in this way.

And Jesus Christ, give it a rest with this “Rust fans must be thinking” stuff. It sounds deranged.

No, back in the day C was used for everything. Vim was not written in C because it needed to wring every last bit of performance out of text editing.

Rewriting everything in rust "for memory-safety" is a false tradeoff given the millions of lines of C code out there and the fact that rewrites always introduce new bugs.

Please, I’m begging you, stop talking about Rust. You’re shoehorning Rust into a discussion where it hasn’t been mentioned, just to hate on some imaginary people you think are pushing Rust here. No one is talking about that. You sound deranged and obsessed.

The vast majority of the conversation here is about GC and the performance implications of that. Please stick to the rest of the thread.

I almost always find that building Boehm GC as a malloc replacement (malloc() -> GC_malloc(), free() -> NOP), and then using LD_PRELOAD to get it used makes any random C/C++ program not only still work but also run faster.

Not only that, but you can then use GC_FREE_SPACE_DIVISOR to tune RAM usage vs speed to your liking on a program by program (or even instance by instance) basis, something completely impossible with malloc().

Lol there are right now 33 mentions of rust in this thread but go on..

I am a member of this niche – thank you for the flake!! https://discourse.nixos.org/t/radically-improving-nix-nixos-...

I think Fil-C is for people who are using software that has already been written, not for people who are trying to pick what language to write new software in. A substantial amount of software has, after all, already been written.

It's super fun to write C and C++ code in Fil-C because it's like this otherworldly crossover between Java and C/C++:

- Unlike Java, you get fantastic startup times.

- Unlike Java, you get access to actual syscall APIs.

- Unlike Java, you can leverage the ecosystem of C/C++ libraries without having to write JNI wrappers (though you do have to be able to compile those libraries with Fil-C).

- Like Java, you can just `new` or `malloc` without `delete`ing or `free`ing.

It's so fun!

I like C, have a probably unhealthy relationship with C++ where I am amazed by what it can do and then get unrealistic expectations it keeps failing to fulfill, and don't really like Java.

You know Julia Ecklar's song where she says that programming in assembler is like construction work with a toothpick for a tool? I feel like C, C++, or Java are like having a teaspoon instead. Maybe Java is a tablespoon. I'd rather use something like OCaml or a sane version of Python without the Mean Girls community infighting. I just haven't found it.

On the other hand, the supposedly more powerful languages don't have a great record of shipping highly usable production software. There's no Lisp or Ruby or Lua alternative to Firefox, Linux, or LLVM.

> Like Java, you can just `new` or `malloc` without `delete`ing or `free`ing.

Is your intention that people use the Fil-C garbage collector instead of free()? Or is it just a backstop in case of memory leak bugs?

Can the GC be configured to warn or panic if something is GCed without free()? Then you could detect memory leak bugs by recompiling with Fil-C - with less overhead than valgrind, although I’m guessing still more than ASan - but more choices is always a good thing.

Yeah, it panics when you use after free. [1]

But I'm not sure it's worth porting your code to Fil-C just to get that property. Because Fil-C still needs to track the memory allocation with its garbage collector, there isn't much advantage to even calling free. If you don't have a use-after-free bug, then it's just adding the overhead of marking the allocation as freed.

And if you do have a use-after-free bug, you might be better off just letting it silently succeed, as it would in any other garbage collected language. (Though, probably still wise to free when the data in the allocation is now invalid).

IMO, if you plan to use Fil-C in production, then might as well lean on the garbage collector. If you just want memory safety checking during QA, I suspect you are better off sticking with ASan. Though, I will note that Fil-C will do a better job at detecting certain types of memory issues (but not use-after-free)

[1] See the "Use After Free" example on: https://fil-c.org/invisicaps_by_example

> Yeah, it panics when you use after free.

I wasn’t talking about use-after-free, I was talking about memory leaks - when you get a pointer from malloc(), and then you destroy your last copy of the pointer without ever having called free() on it.

Can the GC be configured to warn/panic if it deallocates a memory block which the program failed to explicitly deallocate?

Question: would this be a desirable outcome for drivers? As far as I can tell, most kernel driver crashes are the kind that would benefit from such protection. It would also obviate the need for full rewrites, if such a GC can protect against the faults and help with recovery, assuming the post-recovery process is similar to Erlang's BEAM, where a reload can bring back a healthy state.

> Is your intention that people use the Fil-C garbage collector instead of free()? Or is it just a backstop in case of memory leak bugs?

Wow great question!

My intention is to give folks powerful options. You can choose:

- Compile your code with Fil-C while still maintaining it for Yolo-C. In that case, you'll be calling free(). Fil-C's free() behavior ensures no GC-induced leaks (more on that below) so code that does this will not have leaks in Fil-C.

- Fully adopt Fil-C and don't look back. In that case, you'll probably just lean on the GC. You can still fight GC-induced leaks by selectively free()ing stuff.

- Both of the above, with `#ifdef __FILC__` guards to select what you do. I think you will want to do that if your C program has a custom GC (this is exactly what I did with emacs - I replaced its super awesome GC with calls to my GC) or if you're doing custom arena allocations (arenas work fine in Fil-C, but you get more security benefit, and better memory usage, if you just replace the arena with relying on GC).

The reason why the GC is there is not as a backstop against memory leaks, but because it lets me support free() in a totally sound way with deterministic panic on any use-after-free. Additionally, the way that the GC works means that a program that free()s memory is immune to GC-induced memory leaks.

What is a GC-induced leak? For decades now, GC implementers like me have noticed the following phenomena:

- Someone takes a program that uses manual memory management and has no known leaks or crashes in some set of tests, and converts it to use GC. The result is a program that leaks on that set of tests! I think Boehm noticed this when evangelizing his GC. I've noticed it in manual conversions of C++ code to Java. I've heard others mention it in GC circles.

- Someone writes a program in a GC'd language. Their top perf bug is memory leaks, and they're bad. You scratch your head and wonder: wasn't the whole point of GC to avoid this?

Here's why both phenomena happen: folks have a tendency to keep dangling pointers to objects that they are no longer using. Here's an evil example I once found: there's a Window god-object that gets created for every window that gets opened. And for reasons, the Window has a previousWindow pointer to the Window from which the user initiated opening the window. The previousWindow pointer is used in initialization of the Window, but never again. Nobody nulled previousWindow.

The result? A GC-induced leak!

In a malloc/free program, the call to previousWindow.destroy() (or whatever) would also delete (free()) the object, and you'd have a dangling pointer. But it's fine because nobody dereferences it. It's a correct case of dangling pointers! But in the GC'd program, the dangling pointer keeps previousWindow around, and then there's previousWindow.previousWindow, and previousWindow.previousWindow.previousWindow, and... you get the idea.
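To make the chain concrete, here is a small C sketch of the Window example. The `window` struct and the helper names are hypothetical, and `retained_by_gc` is only a stand-in for what a tracing GC would keep alive (everything reachable from the newest window); it is not how any real collector is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct window {
    struct window *previous_window; /* only needed during initialization */
    int id;
} window;

/* Open a window from `opener` (NULL for the first window). */
window *window_open(window *opener, int id) {
    window *w = malloc(sizeof *w);
    assert(w != NULL);
    w->previous_window = opener; /* used once for init, then forgotten */
    w->id = id;
    return w;
}

/* Stand-in for a tracing GC: count everything reachable from `root`. */
size_t retained_by_gc(const window *root) {
    size_t n = 0;
    for (const window *w = root; w != NULL; w = w->previous_window) n++;
    return n;
}

/* The fix in a GC'd program: drop the reference once init is done. */
void window_finish_init(window *w) {
    w->previous_window = NULL;
}
```

Opening three windows in a row makes `retained_by_gc` report 3 for the newest one, even if the user closed the first two. After `window_finish_init` on the newest window it reports 1.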

This is why Fil-C's answer to free() isn't to just ignore it. Fil-C strongly supports free():

- Freeing an object immediately flags the capability as being empty and free. No memory accesses will succeed on the object anymore.

- The GC does not scan any outgoing references from freed objects (and it doesn't have to because the program can't access those references). Note that there's almost a race here, except https://fil-c.org/safepoints saves us. This prevents previousWindow.previousWindow from leaking.

- For those pointers in the heap that the GC can mutate, the GC repoints the capability to the free'd singleton instead of marking the freed object. If all outstanding pointers to a freed object are repointable, then the object won't get marked, and will die. This prevents previousWindow from leaking.
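The three rules above can be sketched as a toy mark phase. This is a simplified model written for illustration, assuming objects with a single outgoing pointer; the `object` type, `free_singleton`, `toy_free`, and `toy_mark` are made-up names and say nothing about how the Fil-C runtime is actually implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct object {
    struct object *ref; /* single outgoing pointer, e.g. previousWindow */
    bool freed;         /* set by toy_free() */
    bool marked;        /* set by the toy mark phase */
} object;

/* Pointers to freed objects get redirected to this one shared object. */
object free_singleton = { NULL, true, false };

void toy_free(object *o) {
    assert(o != NULL);
    o->freed = true; /* access checks (not modeled here) would now fail */
}

/* Mark everything reachable from root, following the rules above. */
void toy_mark(object *root) {
    object *o = root;
    /* Freed objects are never marked and never scanned. */
    while (o != NULL && !o->freed && !o->marked) {
        o->marked = true;
        if (o->ref != NULL && o->ref->freed) {
            o->ref = &free_singleton; /* repoint instead of marking */
        }
        o = o->ref;
    }
}
```

With a chain w3 -> w2 -> w1 and `toy_free(&w2)`, marking from w3 marks only w3: w2 is unmarked because its only incoming pointer was repointed, and w1 is unmarked because w2's outgoing reference was never scanned.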

> Can the GC be configured to warn or panic if something is GCed without free()?

Nope. Reason: the Fil-C runtime itself now relies on GC, and there's some functionality that only a GC can provide that has proven indispensable for porting some complex stuff (like CPython and Perl5).

It would take a lot of work to convert the Fil-C runtime to not rely on GC. It's just too darn convenient to do nasty runtime stuff (like thread management and signal handling) by leaning on the fact that the GC prevents stuff like ABA problems. And if you did make the runtime not rely on GC, then your leak detector would go haywire in a lot of interesting ports (like CPython).

But, I think someone might end up doing this exercise eventually, because if you did it, then you could just as well build a version of Fil-C that has no GC at all but relies on the memory safety of sufficiently-segregated heaps.

Is it possible to use Fil-C as a replacement for valgrind/address sanitizer/leak sanitizer? I.e. say I have a C program that does manual memory management already. Can I then compile it with Fil-C and have it panic/assert on heap use after free, uninitialized memory read (including stack), array out of bounds read, etc?

Okay, that's brilliant. I didn't even imagine that the GC-induced leak problem was even solvable. I guess the freed-but-not-GCed object could be arbitrarily large, but that's almost never going to be a gradual leak.

What's awesome about the Emacs GC?

Even if you have a large freed object, it’ll really get GC’d unless it’s referenced from a root that the GC can’t edit. I allow for such things to simplify the runtime, but they’re rare.

As a GC dev I just found the emacs GC to be so nicely engineered:

- The code is a pleasure to read. I understood it very quickly.

- lots of features! Very sophisticated weak maps, weak references, and finalizers. Not to mention support for heap images (the portable dumper).

- the right amount of tuning but nothing egregious.

It’s super fun to read high quality engineering in an area that I am passionate about!

> Nope. Reason: the Fil-C runtime itself now relies on GC, and there's some functionality that only a GC can provide that has proven indispensable for porting some complex stuff (like CPython and Perl5).

What if there was a flag you could set on an allocation, “must be freed”. An app can set the “must be freed” flag on its allocations, meaning when the GC collects the allocation, it checks if free() has been called on it, and if it hasn’t, it logs a warning (or even panics), depending on process configuration flags. Meanwhile, internal allocations by the runtime won’t set that flag, so the GC will never panic/warn on collecting them.
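A rough sketch of the check being proposed, as a toy sweep over per-allocation metadata. Everything here is hypothetical: `alloc_info`, its fields, and `toy_sweep_count_unfreed` are names invented for illustration, and no such flag is claimed to exist in Fil-C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct alloc_info {
    bool must_be_freed; /* opt-in flag set by the application */
    bool freed;         /* free() was called on this allocation */
    bool marked;        /* still reachable at the last collection */
} alloc_info;

/* Count the allocations a sweep would warn (or panic) about. */
size_t toy_sweep_count_unfreed(const alloc_info *objs, size_t n) {
    assert(objs != NULL || n == 0);
    size_t warnings = 0;
    for (size_t i = 0; i < n; i++) {
        bool dying = !objs[i].marked; /* about to be collected */
        if (dying && objs[i].must_be_freed && !objs[i].freed) warnings++;
    }
    return warnings;
}
```

Runtime-internal allocations would leave `must_be_freed` unset, so collecting them never counts as a warning.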

Internal allocations can have pointers to user allocations

It seems kind of analogous to disabling hyperthreading. Sure there's an immediate performance hit, but in exchange you're now protected from entire classes of vulnerabilities. A few years later, no one remembers or cares about that old setback that has been long since eclipsed by subsequent hardware advancements.

Modern hardware is stupidly fast compared to what existed at the time that a lot of C/C++ projects first started. My M2 MacBook Air has 5x higher multi-core performance than my previous daily driver (a 2015 MacBook Pro, a highly capable machine in its own right), and the new iPhone is now even faster than that. I'd happily accept a worst-case 4x slowdown of all user space C/C++ code in the interest of security, especially when considering how much of that code is going to be written by AI going forward.

I do not think this is niche in the slightest. I would very happily take a 2-4x slowdown for almost all of the web facing C software I run if I get guaranteed memory safety. I will be using at the very least fil-c openssh (and likely much more) on every machine I run.

Sure, that makes sense. The point I’m making is just that from an engineering perspective, that also implies that there is no longer any reason for that software you’re running to be written in C at all.

From an engineering perspective, the software is already written in C, and you're weighing the tradeoffs between rewriting it and recompiling it.

Sure there is. Making tough choices between alternatives based on where to allocate a limited amount of manpower is an engineering choice. Choosing to use Fil-C to recompile existing (established, stabilized, functional...) software rather than rewrite it is an engineering choice.

[deleted]

Apologies ahead of time as this is pure FUD. That is, I don't actually know what I am talking about but had an interesting thought.

Remember the Debian weak keys kerfuffle? That was caused because the Debian package maintainer saw a warning about using uninitialized memory, fixed it, and then it turned out that uninitialized memory was a critical seed for the openssl random number generator.

Anyhow, my stupid FUD thought: is there a weak-key equivalent bug that shows up now that your C compiler is memory safe?

Even if you can't use something like Fil-C in your release/production builds, being able to e.g. compile unit tests with it to catch memory safety bugs is a huge win. My team use gcc for its mips codegen, but I'm working on adopting the clang bounds-safety annotations for test builds for exactly this reason.

Yeah, I haven't taken a serious look at it from that perspective yet, but something similar came to mind; while, outside of bootstrapping the JDK from GCJ, Boehm GC hasn't been super relevant to me for "release" builds of anything, it's been useful in leak detection mode on occasion.

I figure even if you cannot use, or do not want to use, something like Fil-C in production, there's solid potential for it to augment whatever existing suite of sanitizers and other tools that one may already build against.

The point is that it can compile most existing C and C++ code as-is, and do it while providing complete memory safety.

That's the claim, anyway. Doesn't sound all that niche to me.

If you write your software in a language that needs GC, everybody using your software needs GC, but they're guaranteed to get memory safety.

If you write your software in an unsafe, non-GC language, nobody needs GC, but nobody gets memory safety either.

This is why many software developers chose the latter option. If there were some use cases in which GC wasn't acceptable for their software, nobody would get GC, even the people who could afford it, and would prefer the increased memory safety.

Fil-C lets the user make this tradeoff. If you can accept the GC, you compile with Fil-C, otherwise you use a traditional C compiler.

The user of the code may plausibly want to make a different tradeoff than the author, without wanting to rewrite the project from scratch.

The value prop here is for existing projects in C or C++, as is made abundantly clear in the linked article

I would say that Rust would be a better choice rather than patching memory safety on top of C. But I think the reason for this is that most, if not all, cryptographic reference implementations are in C. So they want to use existing reference implementations without having to port them to Rust.

IMO cryptographers should start using Rust for their reference implementations, but I also get that they'd rather spend their time working on their next paper rather than learning a new language.

I'm not a practitioner of cryptography, but I would be wary about timing attacks that might become possible if such a dynamic runtime is introduced. At least relevant pieces of code would need to be re-evaluated in the Fil-C environment.

But maybe you could use C as the "glue language" and then build better-performing libraries in Rust for C to use. Like in Python!

Good call! Fil-C does in fact have a way to let you build and run OpenSSL with its constant time crypto. I don't know how this works exactly but I guess it's relatively easy to guarantee it's safe.

How easy is it to link Rust code with C compiled with Fil-C's ABI?

Memory safety is a very small concern for most cryptographic implementations compared with other issues (e.g. side-channel attacks). Rust solves essentially none of those other concerns.

The original poster got pretty much all of Debian running in Fil-C, in a fairly brief amount of time.

Re-writing even a single significant library or package in Rust would take exponentially longer, so in this case Rust would not be "a better choice", but rather a non-starter.

> IMO cryptographers should start using Rust for their reference implementations

IMO they should not, because if I look at typical Rust code, I have no clue what is going on even with a basic understanding of Rust. C, however, is as simple as it gets and much closer to pseudocode.

Good cryptographic code should match its algorithmic description. Rust enables abstractions that allow this. C does not. That you have some familiarity with C and not Rust should not be a contributing factor.

I say this as someone who has written cryptographic code that’s been downloaded millions of times.

The problem in terms of reference implementations is exactly the abstractions. Reference implementations should be free of abstractions and should be understandable. Abstractions make code much less understandable.

I say this as someone who has been involved in cryptography and has read through dozens of reference implementations. Stick to C, not Python or Rust, it is much easier to understand because the abstractions are just there to hide code. Less abstractions in reference implementations = better. If you do not think so, I will provide you a code snippet of a language of my own choosing that is full of abstractions, and let us see that you understand exactly what it does. You will not. That is the point.

Abstractions are the only way to make sense of cryptography. Pretending otherwise leads to cross layer bugs and vulnerabilities. Of course bad abstractions are bad, but that doesn’t mean no abstraction is good.

Feel free to post your challenge snippet.

  use std::fs::File;
  use std::io::{self, Read};
  use std::path::Path;

  pub fn read<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> {
    fn inner(path: &Path) -> io::Result<Vec<u8>> {
      let mut file = File::open(path)?;
      let mut bytes = Vec::new();
      file.read_to_end(&mut bytes)?;
      Ok(bytes)
    }
    inner(path.as_ref())
  }

"aBsTraCtiOnS aRe gOod"... Right.

Reference implementations must NOT have abstractions like this. Rust encourages it. Lots of Rust codebases are filled with them. Your feelings for Rust are irrelevant. C is simple and easy to understand, therefore reference implementations must be in C. Period.

...or Common Lisp, or OCaml... why not?!

I expect Fil-C is not really aimed at green field projects. But rather at making existing projects safe.

It's amazing how much technical discourse revolves around impressions.

"Oh, it has a GC! GC bad!"

"No, this GC by smart guy, so good!"

"No, GC always bad!"

People aren't engaging with the technical substance. GC-based systems can be plenty good and fast. How do people think JavaScript works? And Go? It's like people just absorbed from the discursive background radiation the idea that GC is slow without understanding why that might be or whether it's even true. Of course it's not.

You can wrack some people's brains by stating that for some problems, a GC is a great way to alleviate the performance problems caused by manual memory management.

For those problems arena allocators tend to perform even better.

Yeah, but if you actually need to retain a live subgraph of the allocated heap, the arena can't help you. So you make an arena allocator that only frees its slab after moving out the reachable set to a new compacted arena. Congratulations, you've implemented a Cheney-style compacting GC!
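For anyone who hasn't seen it, the "move out the reachable set, then free the slab" idea can be sketched in C as follows. This is a minimal Cheney-style copy under simplifying assumptions: fixed-size nodes with two child pointers, a single root, and a to-space big enough for the live set. The `node`, `space`, `space_alloc`, and `cheney_collect` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    struct node *forward; /* set once the node has been copied */
    struct node *left;
    struct node *right;
    int value;
} node;

typedef struct space {
    node *slots; /* caller-provided storage (the arena slab) */
    size_t used;
    size_t cap;
} space;

/* Bump-allocate one node; returns NULL when the space is full. */
node *space_alloc(space *s, int value) {
    if (s->used == s->cap) return NULL;
    node *n = &s->slots[s->used++];
    n->forward = NULL;
    n->left = NULL;
    n->right = NULL;
    n->value = value;
    return n;
}

/* Copy n into to-space unless already copied; return its new address. */
static node *evacuate(node *n, space *to) {
    if (n == NULL) return NULL;
    if (n->forward == NULL) {
        assert(to->used < to->cap);
        node *copy = &to->slots[to->used++];
        *copy = *n; /* children still point into from-space for now */
        n->forward = copy;
    }
    return n->forward;
}

/* Copy everything reachable from root, then release the old arena. */
node *cheney_collect(node *root, space *from, space *to) {
    to->used = 0;
    size_t scan = 0; /* nodes before `scan` have had their children fixed */
    node *new_root = evacuate(root, to);
    while (scan < to->used) {
        node *n = &to->slots[scan++];
        n->left = evacuate(n->left, to);
        n->right = evacuate(n->right, to);
    }
    from->used = 0; /* the whole old slab is dropped at once */
    return new_root;
}
```

The to-space doubles as the work queue, which is why no recursion or separate mark stack is needed. Shared nodes and cycles survive because the forwarding pointer makes every node get copied at most once.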

Not for all allocation patterns. It's hard to beat bump pointer allocation and escape analysis in general.

> How do people think JavaScript works?

Very slowly. Java, OCaml, or LuaJIT would be better examples here!

How many of the "GC is always slow" people would recognize those systems? Besides: V8 and JSC have pretty decent JITs nowadays. IME, performance of JIT systems has more to do with the structure of programs written in JS than with VM performance itself.

Maybe I don't know what I'm doing, but I rarely get performance within an order of magnitude of single-threaded C from V8. In those other systems I usually do, unless you count Java's startup time.

> It's amazing how much technical discourse revolves around impressions.

One of the single most incisive comments in the whole discussion.

My take: people don't take the time to even try to understand some things of only moderate complexity. They dismiss it as "too hard", drop it, accept the received wisdom and move on.

This is also behind the curse of "best practice". After coming up on 40Y in the industry, my impression is that this boils down to "this is what the previous guys did and it worked". In other words, very close to "nobody ever got fired for buying IBM" as a methodology.

What it means: "you don't need to think about it -- just do this." Which quickly turns to "you don't need to understand it, just do this."

Why I am saying this: because I think you're absolutely right, much of the industry discourse runs on impressions -- but there is a second factor that matters as much.

People form impressions of things they don't understand well themselves by listening to the opinions of people they trust. The question then is: where do they find those opinions?

There are communities of like-minded folks, folks interested in particular tech or whatever. Rust folks, "plain ol' C" folks, C++ folks, "let's replace C with $something_more_modern" folks (where that's D or Nim or whatever).

But those communities group together too. They hang out in the same fora, talk on the same media, etc. Result, there are hierarchies of communities, and the result is like religions: people in one church know of other related churches fairly well, and some are siblings, relatives, whatever; others are hated enemies to be scorned.

But they know next to nothing of other religions which are too far away, too different.

So when people are comparing the offspring of C, they are probably from the Unix faith. They don't know that but everyone they ever talked to, every software they ever saw, is a Unix, so they don't realise there's anything else.

I see passionate debates about Rust vs Go and things and I strongly suspect these are problems fixed among the Wirthian languages decades ago. Walls of text, thousands of lines of code, for things fixed in Modula-2 or Ada 30 or 40 years ago and those folks moved on.

Whereas the Lisp folks never had those problems and are politely baffled by tools that still have flaws that deeply passionate 20-somethings and 30-somethings are calling each other names about and blocking each other over.

I've had people in dead seriousness tell me that ALL OTHER SOFTWARE is written in C at the lowest level. They are amazed when I laugh at them. C is a niche in a niche and the team that wrote C and Unix moved on to Aleph and Limbo and one splinter wrote Go.

The Forth people laugh at the vastly verbose Lisp folks and their forgotten million-line OSes.

The APL people smile at the painfully iterative Forth and Lisp folks.

Unix won on servers and it's pushing Windows off desktops, now relegating everything else to embedded and realtime and the handwritten-by-one-person systems, where nobody else will ever read their code.

I can't help but think that there must be a better way. Not sure what it is. Classes in comparative software religion on Youtube? Sports style competitions for the smallest/simplest/fastest ways to solve particular problems in languages people might not consider? Tools for easier linkage between less-known languages and well-known OSes?

Hi, I noticed you made a typo in "JS bad, Go bad", it's not too late to edit your comment! /s

The author of Fil-C does have some ideas to avoid a garbage collector [1]. In summary: use-after-free at worst means you might see an object of the same size, but you cannot corrupt data structures (no pointer / integer confusion). This would be more secure than standard C, but less secure than Fil-C with GC.

[1] https://x.com/filpizlo/status/1917410045320650839

So far we haven't found a viable alternative; CHERI has holes in its temporal integrity guarantees.

Both Fil-C and CHERI rely on a concurrent GC/a GC-like task to find and invalidate all pointers to free()'d memory objects (in "quarantine") before putting them back into the memory pool.

The difference is that because Fil-C has bounds in each object's header, it only has to nullify it to remove access whereas in CHERI a quarantined object can still be accessed through any pointer that hasn't been invalidated yet.
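A toy model of the header-based revocation described here, assuming a fixed 32-byte payload and a bounds field stored with the object. The `header`, `toy_ptr`, `toy_load`, and `toy_revoke` names are hypothetical, and this is only meant to illustrate the comment's point, not the actual InvisiCaps layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct header {
    size_t size;               /* bounds live with the object */
    unsigned char payload[32];
} header;

typedef struct toy_ptr {
    header *obj;   /* which object this pointer belongs to */
    size_t offset; /* position inside the payload */
} toy_ptr;

/* Every access is checked against the object's header. */
bool toy_load(toy_ptr p, unsigned char *out) {
    assert(p.obj != NULL && out != NULL);
    if (p.offset >= p.obj->size) return false; /* a real runtime would panic */
    *out = p.obj->payload[p.offset];
    return true;
}

/* Freeing zeroes the bounds: one store revokes every alias at once. */
void toy_revoke(header *h) {
    h->size = 0;
}
```

After `toy_revoke`, every `toy_ptr` to the object fails its check, however many copies exist, without having to find and invalidate each pointer individually.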

I've seen discussions on adding an additional memory tag to CHERI for memory in quarantine, but I dunno what is best.

Fil-C relies on the compiler being trusted whereas CHERI does not. If we do, then perhaps we could come up with a hardware-accelerated system that is more lightweight than either.

Implementations can always get better, but I think the only categorical (if even that) improvement that will arise depends on careful program design/implementation; that is, reducing the scope and number of capabilities and providing semantic information on capability usage. Fil-C and CHERI do an admirable job of maximizing backwards compatibility and even allowing incremental improvements, but I think it's time that programmers bought into capabilities too.

I didn't know this, but aside from the GC, Fil-C promptly revokes capabilities to freed objects, not relying on a concurrent task to do it eventually; CHERI cannot AFAIK.

Fil-C: A memory-safe C implementation - https://news.ycombinator.com/item?id=45735877 - Oct 2025 (130 comments)

Safepoints and Fil-C - https://news.ycombinator.com/item?id=45258029 - Sept 2025 (44 comments)

Fil's Unbelievable Garbage Collector - https://news.ycombinator.com/item?id=45133938 - Sept 2025 (281 comments)

InvisiCaps: The Fil-C capability model - https://news.ycombinator.com/item?id=45123672 - Sept 2025 (2 comments)

Just some of the memory safety errors caught by Fil-C - https://news.ycombinator.com/item?id=43215935 - March 2025 (5 comments)

The Fil-C Manifesto: Garbage In, Memory Safety Out - https://news.ycombinator.com/item?id=42226587 - Nov 2024 (1 comment)

Rust haters, unite Fil-C aims to Make C Great Again - https://news.ycombinator.com/item?id=42219923 - Nov 2024 (6 comments)

Fil-C a memory-safe version of C and C++ - https://news.ycombinator.com/item?id=42158112 - Nov 2024 (1 comment)

Fil-C: Memory-Safe and Compatible C/C++ with No Unsafe Escape Hatches - https://news.ycombinator.com/item?id=41936980 - Oct 2024 (4 comments)

The Fil-C Manifesto: Garbage In, Memory Safety Out - https://news.ycombinator.com/item?id=39449500 - Feb 2024 (17 comments)

In addition, here are the major related subthreads from other submissions:

https://news.ycombinator.com/item?id=45568231 (Oct 2025)

https://news.ycombinator.com/item?id=45444224 (Oct 2025)

https://news.ycombinator.com/item?id=45235615 (Sept 2025)

https://news.ycombinator.com/item?id=45087632 (Aug 2025)

https://news.ycombinator.com/item?id=44874034 (Aug 2025)

https://news.ycombinator.com/item?id=43979112 (May 2025)

https://news.ycombinator.com/item?id=43948014 (May 2025)

https://news.ycombinator.com/item?id=43353602 (March 2025)

https://news.ycombinator.com/item?id=43195623 (Feb 2025)

https://news.ycombinator.com/item?id=43188375 (Feb 2025)

https://news.ycombinator.com/item?id=41899627 (Oct 2024)

https://news.ycombinator.com/item?id=41382026 (Aug 2024)

https://news.ycombinator.com/item?id=40556083 (June 2024)

https://news.ycombinator.com/item?id=39681774 (March 2024)

https://news.ycombinator.com/item?id=39542944 (Feb 2024)

Thanks for the subthread discussion links, e.g. authors of LuaJIT and Fil-C, https://news.ycombinator.com/item?id=40556083 (June 2024)

Mike Pall is the author of LuaJIT.

Thanks for the correction.

I’m glad Fil’s work is finally getting the recognition it deserves.

There may be useful takeaways here for Rust’s “unsafe” mode - particularly for applications willing to accept the extra burden of statically linking Fil-C-compiled dependencies. Best of both worlds!

> particularly for applications willing to accept the extra burden of statically linking Fil-C-compiled dependencies. Best of both worlds!

As near as I can tell Fil-C doesn't support this, or any other sort of FFI, at all. Nor am I sure FFI would even make sense, it seems like an approach that has to take over the entire program so that it can track pointer provenance.

For securing and maintaining a complex legacy application it seems like a reasonable approach would be to move the majority into Fil-C, then hook the bits that don't fit up via RPC. Maybe some bits get formal verification, rewritten in Rust, ported to new platform APIs, whatever, but at least you get some safety for the whole app without a rewrite.

He could add an API to mint a capability out of thin air. It could even be done out of process.

In fact, I think Fil-C and CHERI could implement 90% the same programmer-level API!

Does Fil-C catch uninitialized memory reads?

malloc'd memory is zeroed in fil-c:

> *zgc_alloc*

> Allocate count bytes of zero-initialized memory. May allocate slightly more than count, based on the runtime's minalign (which is currently 16).

> This is a GC allocation, so freeing it is optional. Also, if you free it and then use it, your program is guaranteed to panic.

> libc's malloc just forwards to this. There is no difference between calling malloc and zgc_alloc.

from https://fil-c.org/stdfil

[deleted]

Great to see some 3letter guy into this. This might be one of those rando things which gets posted on HN (and which doesn't involve me in the slightest), but a decade later is taking over the world. Rust and Go were like that.

Previously there was that Rust in APT discussion. A lot of this middle-aged linux infrastructure stuff is considered feature-complete and "done". Not many young people are coming in, so you either attract them with "heyy rewrite in rust" or maybe the best thing is to bottle it up and run in a VM.

>Great to see some 3letter guy into this

AFAIK, for many people djb hasn't been just "some 3letter guy" for about thirty years, but perhaps it's just an age-related issue with those who haven't been around as long.

https://en.wikipedia.org/wiki/Daniel_J._Bernstein

Just to be clear, I mean to venerate Bernstein for earning his 3letters, not to trivialize him.

[deleted]

Despite the cool shit the guy has done, keep in mind that "venerate" is not the word to use here. djb is very much not a shorthand used in any positive messaging pretty much ever by any cryptographer. He did it to himself, sadly.

Sorry, can you explain what he did to himself?

I would like to know as well. All that is public is that a couple of IETF apparatchiks want to ban him for criticizing corporate and NSA influence:

https://web.archive.org/web/20250513185456/https://mailarchi...

The IETF has now accepted the required new moderation guidelines, which will essentially be a CoC troika that can ban selectively:

https://mailarchive.ietf.org/arch/msg/mod-discuss/s4y2j3Dv6D...

It is very sad that all open source and Internet institutions are being subverted by bureaucrats.

... if he thinks some WG is making a mistake and he's not welcome there (everyone else seems to be okay with what's happening based on the quoted email on the first link), then - CoC or not - he should leave, and publicly distance himself from the outcome.

(Obviously he was never the one to back down from a just fight, but it's important to find the right hill to die on. And allies! And him not following RFC 2026 [from 1996, hardly the peak of Internet bureaucracy] is not a CoC thing anyway.)

Why should he leave? The IETF pretends on its sponsor page (https://www.ietf.org/support-us/endowment/):

The IETF is a global standards-setting organization, intentionally created without a membership structure so that anyone with the technical competency can participate in an individual capacity. This lack of membership ensures its position as the primary neutral standards body because participants cannot exert influence as they could in a pay-to-play organization where members, companies, or governments pay fees to set the direction. IETF standards are reached by rough consensus, allowing the ideas with the strongest technical merit to rise to the surface.

Further, these standards that advance technology, increase security, and further connect individuals on a global scale are freely available, ensuring small-to-midsize companies and entrepreneurs anywhere in the world are on equal footing with the large technology companies.

With a community from around the world, and an increased focus on diversity in all its forms, IETF seeks to ensure that the global Internet has input from the global community, and represents the realities of all who use it.

There is only one IETF, and telling dissenters to leave is like telling a dissenting citizen to go to another country. I don't think that people (apart from real spammers) were banned in 1996. The CoC discussion and power grab has reached the IETF around 2020 and it continues.

"Posting too many messages" has been deemed a CoC violation by for example the PSF and its henchmen, and functionally the IETF is using the same selective enforcement no matter what the official rationale is. They won't go after the "director" Wouters, even though his message was threatening and rude.

> Why should he leave?

Because the game is rigged apparently?

If not then let the WG work. If no one except djb feels this strongly about hybrid vs. pure post-quantum stuff then it's okay.

(And I haven't read the threads but this is a clear security trade-off. Involving complexity, processing power and bandwidth and RAM and so on, right? And the best and brightest cryptographers checked the PQ algorithms, and the closer we get to them getting anywhere near standardized in a pure form the more scrutiny they'll receive.

And someone being an NSA lackey is not a technical merit argument. Especially if it's true, because in this case the obvious thing is to start coalition building to have some more independent org up and running, because arguing with a bad faith actor is not going to end well.)

> If no one except djb feels this strongly about hybrid vs. pure post-quantum stuff then it's okay.

That is one of the contentious issues. See the last paragraphs of:

https://blog.cr.yp.to/20251004-weakened.html

Starting with "Remember that the actual tallies were 20 supporters, 2 conditional supporters, and 7 opponents".

Not to trivialise but being a 3 letter guy means being old. So, it's at best a celebration of achieving longevity and at worst a celebration of creaky joints and a short temper.

Most of us will have a problematic joint or two by a certain age. Almost none of us will be recognised by any name by that time.

Mate, we're not talking about the future, but about 3 letter guys now. I'm one, I've carried it with me for 40+ years as have the ten or twenty peers of mine I know by their tla. I got it at pobox.com when the door opened, the guy at the desk next door got a one letter name. I set up campus email for the entire uni in 1989 and gave myself the tla with my superuser rights before that. I'd done the same at ucl-cs in 85, and before that in Leeds and York.

My point here is we're not famous we're just old enough to have a tla from the time before HR demanded everyone get given.surname.

Every Unix system used to ship with a dmr account. It doesn't mean we all knew Dennis Ritchie, it means the account was in the release tape.

There are 17,000 odd of us. Ekr, Kre and Djb are famous but the other 17,573 of us exist.

I'm not sure what your point is here. OP was clearly using "three letter guy" in the sense "so famous people know them by their initials". This is hardly unheard of, e.g. https://wiki.c2.com/?ThreeLetterPerson

It was the underlined "some" in "Great to see _some_ 3letter guy into this" that did it.

It felt a bit like s/some/random/g would apply when reading it, whether or not the writer intended that. It prompted me to write my comment. There are many 3letter user accounts, some more famous than others. To my generation they are famous not because they were early users, but because of the great things they have done. I'm an early user too and did things back then that are still quite widely used in many distributions, but I wouldn't compare my achievements to those who became famous and widely known by their account name, short or long.

Anyhow, I thought that "djb" would ring a bell for anyone who has been around for a while. Not just those who were around in the early 90s or so, when he was known for the renegade opinions he expressed through his programming style (qmail, djbdns, etc.), ended up in court over ITAR issues, etc.

But also because of his later work with cryptography and running the cr.yp.to site for quite a long time.

https://cr.yp.to/

I was just wondering; I did not intend to start an argument.

Is this because they're that famous though or simply because there weren't as many people in the scene back then? We just don't do the initials thing anymore.

Yes: the fame is the subtext. It's akin to mononyms; they'd be referring to famous people like Shakira, Madonna, or Beyoncé. A lot of us have first names, but the point isn't that one's family calls them "Dave" without ambiguity.

There were many unix instances, and likely multiple djb logins around the world, but there's only one considered to be the djb, and it's due to fame.

[deleted]

It's wild how much he looks like ryg, another 3 letter genius

[deleted]

I am a bit surprised that the build_all_fast_glibc.sh script requires 31Gbyte of memory to run. Can somebody explain? I would like to try out Fil-C.

Building and linking llvm sucks.

Interesting to see some bash curl being used by a renowned cryptologist...

Almost like it's actually fine.

https://medium.com/@ewindisch/curl-bash-a-victimless-crime-d...

It is definitely not fine. The argument seems to be that since you need to trust somebody, curl | bash is fine because you just trust whoever controls the webserver. I think this is missing the point.

s/webserver/DNS/

HTTPS is there, so you go down to that level only if you want to distrust any element of the public key infrastructure. Which, to be fair, there are plenty of reasons if you are paranoid -- they do tell you who's doing what in a shady way as they revoke, so there's a huge list of transgressions.

It is not only that directly; the domain name might be reassigned to someone else, resulting in a valid certificate which is different than the one you wanted. (If you have the hash of the file which you have verified independently then it is more secure (if the hash algorithm is secure enough), although HTTPS is not needed in that case, it can still be used if you wish to avoid spies knowing which file you accessed. You can also use the server's public key if you know what it should be, although this has different issues, such as someone compromising the server (or the key) and modifying the script.) (There is also knowing if the script is what you intended or not anyways (or if there is something unexpected due to the configuration on your computer); if that is your issue, you can read it (and perhaps verifying the character encoding) before executing it, whether or not you trust the server operator and the author of that script.)
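A minimal sketch of the "verify the hash you obtained independently" idea, assuming the script bytes have already been downloaded and that a trusted SHA-256 digest is available from some other channel. The function name and the sample script are hypothetical, not taken from any real installer.

```python
import hashlib
import hmac


def script_matches_pin(script: bytes, pinned_sha256_hex: str) -> bool:
    """Return True only if the script's SHA-256 equals the pinned digest."""
    actual = hashlib.sha256(script).hexdigest()
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, pinned_sha256_hex.lower())


# Hypothetical script contents; in practice these bytes come from the download.
script = b"#!/bin/sh\necho hello\n"
# Stands in for a digest obtained and checked independently of the server.
pin = hashlib.sha256(script).hexdigest()

if script_matches_pin(script, pin):
    print("digest matches; the script is the one that was reviewed")
else:
    print("digest mismatch; do not execute")
```

This only shows that the bytes are the ones the digest was computed over; whether the script does what you want still requires reading it.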

> the domain name might be reassigned to someone else

If that happens it's game over. As the article I linked noted, the attackers can change the installation instructions to anything they want - even for packages that are available in Linux distros.

It's missing which point?

That you should be very careful about what you install. Cut&pasting some line from a website is the exact opposite of it. This is mostly about psychology and not technology. But there are also other issues with this, e.g. many independent failure points at different levels, no transparency, no audit chain, etc. The counter model we tried to teach people in the past is that people select a Linux distribution, independently verify fingerprints of the installation media, and then only install packages from a curated list of packages. A lot of effort went into making this safe and closing the remaining issues.

None of that has anything to do with curl|bash.

Be careful who you trust when installing software is a fine thing to teach. But that doesn't mean the only people you can trust are Linux distro packagers.

I think it has a lot to do with "curl|bash". Cutting and pasting a curl|bash command line bypasses all the inherent mechanisms and stumbling blocks that would properly ensure trust. It was basically invented to make it easy to install software by circumventing all the protection a Linux distribution would traditionally provide. It also eliminates all possibility of independent verification of what was installed or done on the machine.

Downloading and installing a `.deb` or `.rpm` is going to be no more secure. They can run arbitrary scripts too.

Downloading a deb via a package manager is more secure. Downloading a deb, comparing the hash (or at least noting down the hash) would also already be more secure.

But yes, that they run arbitrary scripts is also a known issue, but this is not the main point, as most code you download will be run at some point (and ideally this needs sandboxing of applications to fix).

[deleted]

> Downloading a deb via a package manager is more secure.

Not what I meant. Getting software into 5 different distros and waiting years for it to be available to users is not really viable for most software authors.

I think it would be quite viable if there were any willingness to work with the distributions in the interest of security.

Well, distros haven't really put any effort into making it viable as far as I know. They really should! Why isn't there a standard Linux package format that all distros support? Flatpak is fine for user GUI apps but I don't think it would be feasible to e.g. distribute Rust via a Flatpak.

(And when I say fine, I haven't actually used it successfully yet.)

I think distros don't want this though. They all want everyone to use their format, and spend time uploading software into their repo. Which just means that people don't.

[dead]

[flagged]

Building tools is one thing; building a database system like Postgres is going to be another.

Has anyone really tried building PG or MySQL, or a similarly complex system that relies heavily on I/O operations and multithreading?

Look at how fanatic the compatibility actually is. Building Postgres or MySQL is conceivable but probably will require some changes. (SQLite compiles and runs with zero changes right now.)

SQLite runs about 5 times faster compiled with GCC (13.3.0) than it does when compiled with Fil-C. And the resulting compiled binary from GCC is 13 times smaller.

Interesting! I guess that's from your standard benchmark setup. Please note that Fil-C makes no secret of having a performance penalty. It's definitely a pre-1.0 toolchain and only recently starting to pick up some momentum. The author is eager to keep improving it, and seems to think that there's still plenty of low hanging and medium hanging fruit to pick.

It does (or did, at some point) pass the thorough SQLite test suite, so at least it's probably correct! The famous SQLite test coverage and general proven quality might make SQLite itself less interesting to harden, but in order to run less comprehensively verified software that links with SQLite, we have to build SQLite with Fil-C too.

Thanks for checking! I was wondering.

If you run Nix (whether on NixOS or elsewhere) you can do `cachix use filc` and `nix run github:mbrock/filnix#sqlite` and it should drop you into a Fil-C SQLite after downloading the runtime dependencies from my binary cache (no warranty)!

Thanks!

djb uses a surprisingly low amount of RAM (12GB) considering my laptop already has 64G which is possible to expand to 128G in the future

I would really like to see Omarchy go this direction. A fully memory-safe userland for Omarchy is possible with existing technology.

Can you elaborate why Omarchy? I'm asking, in context of recompiling with Fil-C, because that seems to be just Arch + configurations.

For cultural reasons, I would like Omarchy—culturally—to adopt straightforward security as one of their goals, in addition to usability and beauty.

It's low hanging fruit, and a great way to further differentiate their Linux distribution.

I can't wait for all the delicious four-way flamewars. Choose your fighter!

1) Rewrite X in Rust

2) Recompile X using Fil-C

3) Recompile X for WASM

4) Safety is for babies

There are a lot of half baked Rust rewrites whose existence was justified on safety grounds and whose rationale is threatened now that HN has heard of Fil-C

Fil-C has come up on HN plenty of times before. If it was going to make much of a dent in the discussions, it would have by now.

odd fallacy. things grow in popularity / awareness over time

It's strange how ideas seem to explode at random into the discourse despite being known for a long time. It's as if some critical mass stumbles on a thing and it becomes "the current thing" everyone talks about until the next current thing.

I'm in camp 2.

We have a saying that jam is made of fruit that gave up the fight to become brandy.

Obviously someone needs to rewrite Rust in Fil-C

Yeah since Fil-C is just an LLVM transform we could make Rust memory safe with it

It's not an either-or (well, except for this last item).

It seems sensible to not write new software in plain C. Rust is certainly a valid choice for a safer language, but in many cases overkill wrt how painful the rewrite is vs benefits gained from avoiding a higher-level memory-safe one like OCaml.

At the same time, "let's just rewrite everything!" is also madness. We have many battle-tested libraries written in C already. Something like Fil-C is badly needed to keep them working while improving safety.

And as for wasm, it's sort of orthogonal - whether you're writing in C or in Rust, the software may be bug-free, but sandboxing it may still be desirable e.g. as a matter of trust (or lack thereof). Also, cross-platform binaries would be nice to have in general.

> the software may be bug-free, but sandboxing it may still be desirable e.g. as a matter of trust (or lack thereof)

Wouldn't the only cause of mistrust be bugs, or am I missing something? If the program is malicious, sandboxing isn't the pertinent action.

If any program can potentially be malicious (which is effectively the case today with any downloaded software), then sandboxing is exactly the pertinent action - provided that the sandbox is tight enough.

I should have elaborated. If a program is known to be malicious, or should be treated as malicious, then it should probably be terminated. Given a potentially malicious program and no easy way to determine (lack of) malice, sandboxing is a reasonable measure.

Wish we were talking about making Fil-C required for apt, not Rust...

Those seem to be independent issues. Fil-C is about the best way to compile/run C code.

Rust would be about what language to use for new code.

Now that I have been programming in Rust for a couple of years, I don't want to go back to C (except for some hobby projects).

I agree. The main advantage of Fil-C is compatibility with C, in a secure way. The disadvantages are speed and garbage collection. (Though I read that garbage collection might not be needed in some cases; I would be very interested in knowing more details.)

For new code, I would not use Fil-C. For kernel and low-level tools, other languages seem better. Right now, Rust is the only popular language in this space that doesn't have these disadvantages. But in my view, Rust also has issues, especially the borrow checker and code verbosity. Maybe in the future there will be a language that resolves these issues as well (as a hobby, I'm trying to build such a language). But right now, Rust seems to be the best choice for the kernel (for code that needs to be fast and secure).

> disadvantages are speed, and garbage collection.

And size. About 10x increase both on disk and in memory

  $  stat -c '%s %n' {/opt/fil,}/bin/bash
  15299472 /opt/fil/bin/bash
   1446024 /bin/bash

  $ ps -eo rss,cmd | grep /bash
  34772 /opt/fil/bin/bash
   4256 /bin/bash
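For reference, the ratios behind the "about 10x" figure can be worked out directly from the numbers in the `stat` and `ps` output above. This is just arithmetic on the quoted values, assuming `ps` reports RSS in KiB as it normally does on Linux; the variable names are made up for the example.

```python
# Figures copied from the stat and ps output quoted above.
disk_fil, disk_native = 15_299_472, 1_446_024  # bytes on disk
rss_fil, rss_native = 34_772, 4_256            # resident set size, KiB

disk_ratio = disk_fil / disk_native
rss_ratio = rss_fil / rss_native

print(f"disk: {disk_ratio:.1f}x, rss: {rss_ratio:.1f}x")
# prints "disk: 10.6x, rss: 8.2x"
```

So for this one bash binary the on-disk growth is a bit over 10x and the resident memory growth is closer to 8x.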

How does that compare with Rust? You don't happen to have an example of a binary that is in the process of moving to Rust in Ubuntu-land as well? Curious to see, as I honestly don't know whether Rust is nimble like C or not.

My impression is - rust fares a bit better on RAM footprint, and about as badly on disk binary size. It's darn hard to compare apples-to-apples, though - given it's a different language, so everything is a rewrite. One example:

Ubuntu 25.10's rust "coreutils" multicall binary: 10828088 bytes on disk, 7396 KB in RAM while doing "sleep".

Alpine 3.22's GNU "coreutils" multicall binary: 1057280 bytes on disk, 2320 KB in RAM while doing "sleep".

I don't have numbers, but Rust is also terrible for binary size. Large Rust binaries can be improved with various efforts, but it's not friendly by default. Rust focuses on runtime performance, high-level programming, and compile-time guarantees, but compile times and binary sizes are the drawback. Notably, Rust prefers static linking.

Fil-C is slow.

There is no C or C++ memory safe compiler with acceptable performance for kernels, rendering, games, etc. For that you need Rust.

The future includes Fil-C for legacy code that isn’t performance sensitive and Rust for new code that is.

No, Rust is awful for game development. It's not really what it was intended for. For one, all the graphics APIs are in C, so you would have to use unsafe FFI basically everywhere.

How slow? In some contexts, the trade-off might be acceptable. From what I've seen in pizlonator's tweets, in some cases the difference in speed didn't seem drastic to me.

Yeah, I would happily run a bunch of my network services in this. I have loads of services that are public-facing doing a lot of complex parsing and rule evaluation and are mostly idle. For example my whole mailserver stack could probably benefit from this. My few messages an hour can run 2x slower. Maybe I would leave dovecot native since the attack surface before authentication is much lower and the performance difference would be more noticeable (mostly for things like searches).

You may be aware that one of the things Bernstein is famous for is revolutionizing mailserver security.

[deleted]

I imagine Apt is usually IO constrained?

That's my guess, yeah

Also, Fil-C's overheads are the lowest for programs that are pushing primitive bits around.

Fil-C's overheads are the highest for programs that chase pointers.

I'm guessing the CPU bound bits of apt (if there are any) are more of the former

What does that have to do with apt?

Enough of it is performance sensitive that Fil-C is not an option.

Fil-C is useful for the long tail of C/C++ that no one will bother to rewrite and is still usable if slow.

How is apt performance sensitive?

Apt has been painfully slow since I started using Debian last millennium, but I suspect it's not because it uses a lot of CPU, or it would be snappy by now.

It parses formats and does TLS, I’m assuming it’d be quite bad. I don’t think you can mix and match.

[deleted]

stuff that talks to "the internet" and runs as "root" seems like a good thing to build with filc.

It probably uses OS sandboxing primitives already.

In normal operation, apt has to be able to upgrade the kernel, the bootloader, and libc, so it can't usefully be sandboxed except for testing or chroots.

No, that doesn't follow. That only means the networking and parsing functions can't be sandboxed in the same process that drops new root-owned files. C and C++ services have been using subprocesses for sandboxing risky functionality for a long time now. It appears Apt has some version of this:

https://salsa.debian.org/apt-team/apt/-/blob/main/apt-pkg/co...

That's true; you can't usefully sandbox apt as a whole, but, because it verifies the signatures of the packages it downloads, you could usefully sandbox the downloading process, and you could avoid doing any parsing on the package file until you've validated its signature. It's a pleasant surprise to hear that it already does something like this!
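A rough sketch of that ordering (check integrity first, then parse in a separate process), written in Python for brevity. It is not apt's code: the pinned SHA-256 digest is a stand-in for real signature verification, the `Key: value` format is a toy, all names are hypothetical, and the child here is only a separate process, not an actual sandbox. A real design would also drop privileges or apply OS sandboxing in the child.

```python
import hashlib
import json
import subprocess
import sys

# Code run by the child: parse "Key: value" lines from stdin, emit JSON.
CHILD = r"""
import json, sys
fields = {}
for line in sys.stdin.read().splitlines():
    key, sep, value = line.partition(":")
    if sep:
        fields[key.strip()] = value.strip()
json.dump(fields, sys.stdout)
"""


def parse_in_child(untrusted: bytes, timeout: float = 5.0) -> dict:
    """Parse in a separate interpreter so a parser crash cannot take down the parent."""
    proc = subprocess.run(
        [sys.executable, "-I", "-c", CHILD],
        input=untrusted,
        capture_output=True,
        timeout=timeout,
        check=True,  # a failing child raises CalledProcessError
    )
    return json.loads(proc.stdout)


def verify_then_parse(blob: bytes, pinned_sha256_hex: str) -> dict:
    """Refuse to parse anything that fails the integrity check."""
    if hashlib.sha256(blob).hexdigest() != pinned_sha256_hex:
        raise ValueError("integrity check failed; refusing to parse")
    return parse_in_child(blob)


blob = b"Package: bash\nVersion: 5.2\n"
digest = hashlib.sha256(blob).hexdigest()  # stand-in for a verified signature
print(verify_then_parse(blob, digest))
```

The point is only the shape: the privileged parent never runs the parser on unverified bytes, and the parser never runs in the process that writes root-owned files.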

Doesn't it only work on x86_64?

I wish we had something like Fil-C as an option for unsafe Rust.

Fil-C works because you recompile the whole C userspace. Unsafe Rust doesn't do that... and for many practical purposes you probably want to touch the non-safe-version of the C userspace.

Still, it's all LLVM, so perhaps unsafe Rust for Fil-space can be a thing, a useful one for catching (what would be) UBs even [Fil-C defines everything, so no UBs, but I'm assuming you want to eventually run it outside of Fil-space].

Now I actually wonder if Fil-C has an escape hatch somewhere for syscalls that it does not understand etc. Well it doesn't do inline assembly, so I shouldn't expect much... I wonder how far one needs to extend the asm clobber syntax for it to remotely come close to working.

at the bottom of the turtle stack, there's a yolo-c libc that does some syscall stuff:

> libyoloc.so. This is a mostly unmodified [musl/glibc] libc, compiled with Yolo-C. The only changes are to expose some libc internal functionality that is useful for implementing libpizlo.so. Note that libpizlo.so only relies on this library for system calls and a few low level functions. In the future, it's possible that the Fil-C runtime would not have a libc in Yolo Land, but instead libpizlo.so would make syscalls directly.

but mostly you are using a fil-c compiled libc:

> libc.so. This is a modified musl libc compiled with Fil-C. Most of the modifications are about replacing inline assembly for system calls with calls to libpizlo.so's syscall API.

That links here: https://github.com/pizlonator/fil-c/blob/deluge/filc/include...

Quotes from: https://fil-c.org/runtime

Unsafe Rust actually has a great runtime analyzer: Miri. It's very easy to just run `cargo +nightly miri test` in your project to get some confidence in the more questionable choices along the way.

> Debian using Fil-C (Filian?)

DJB SMACKER CONFIRMED?!