For anyone who liked this, I highly suggest you take a look at the CuriousMarc YouTube channel, where he chronicles lots of efforts to preserve and understand several parts of the Apollo AGC, with a team of really technically competent and passionate collaborators.
One of the more interesting things they have been working on is a potential re-interpretation of the infamous 1202 alarm. It is, as of current writing, popularly described as something related to nonsensical readings of a sensor which could be (and were) safely ignored in the actual moon landing. However, if I remember correctly, some of their investigation revealed that actually there were many conditions which would cause that error to have been extremely critical and would've likely doomed the astronauts. It is super fascinating.
And that's why it's harder (or easier?) to make the same landing again -- we're taking way fewer chances. Today we know of way more failure modes than back then.
They sent people up in a tin can with the bare minimum computational power to manage navigation and control sequencing. It was barely safer than taking a barrel over Niagara Falls. We do have much more capable and reliable technology.
Buzz Aldrin (?) was quoted as recalling holding a pencil inside the capsule as they were out in space and thinking "that wall isn't very thick or strong, I could probably jam a pencil through it pretty easily..."
Death being a layer of aluminum away changes your mind.
It's a miracle nobody died in flight during the program. Exploding oxygen tank, rockets shaking themselves to pieces during launch, getting hit by lightning on top of a flying skyscraper full of kerosene and liquid oxygen....
Gus Grissom, Ed White, and Roger Chaffee died on the Apollo program. I feel it's not polite to ignore that fact even if you add an 'in flight' qualifier.
Starting from the first test pilots, a lot of people died for us to get to the point to launch that flight. So while no one died on the flight, lots of people died just getting us there. If I recall, in The Right Stuff, it's mentioned that those early test pilots had something like a 25% mortality rate.
Think about the "failure mode" of the aircraft that won World War II, the Supermarine Spitfire.
There was a fuel tank mounted between the engine and cockpit so if it took enough of a hit to puncture right through (not hard, in practice) the failure mode was that the cockpit was now full of a 350mph jet of burning petrol.
Still, it did the job.
The early jet age was pretty nuts. Check the Wikipedia page for a random fighter from the era and you'll see figures like, 1,300 built, 50 lost in combat, 1,100 lost in accidents. And that's operational aircraft. Test pilots were in even more danger.
Some were pretty bad, but none were nearly that bad. The B-58 Hustler lost 22% of its airframes, the F7U Cutlass 25%, the F-104 Starfighter in German service lost 33%. And those were outliers.
You're right, those numbers are from the F-8 but include non-total-loss accidents.
I don't think the numbers you quoted are outliers, though. The F-100 lost ~900 out of 2,300. The F-106 lost ~120/342. That's a pretty big list of planes with a 1/5-1/3 loss rate.
You should go back even a little further, the USPS air mail service lost 31 of the first 40 pilots.
Back in the days where the plan was "So we've built literal signal fires and giant concrete arrows and well, good luck, it won't help"
Have you ever listened to Robert Calvert's "Captain Lockheed and the Starfighters"?
"popularly described" and how it's currently understood are two different things. Because it's hard to explain to lay people, it's popularly described in a number of simplified ways, but it's well understood.
CADR is an AGC assembly directive defining a "complete address" including a memory bank, in this case a subroutine to be called by the preceding BANKCALL (TC = transfer control, i.e., store return address and jump to subroutine), which switches to the memory bank specified in the CADR before jumping to the address specified in the CADR.
For a brief explanation of AGC subroutine calls, see [1].
CAR and CDR in Lisp come from the original implementation on the IBM 704, where pointers to the two components of a cons cell were stored as the (C)ontents of the (A)ddress and (D)ecrement fields of a (R)egister (memory word).
(CADR x) is just shorthand for (CAR (CDR x)), i.e., a function that returns the second element of a list (assuming x is a well-formed list).
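To make the Lisp meaning concrete, here is a minimal sketch in Python. It assumes a cons cell can be modeled as a 2-tuple and the empty list as None; the helper names (cons, car, cdr, cadr, from_list) are illustrative only and have nothing to do with the AGC directive of the same name.

```python
# A cons cell is modeled as a 2-tuple (car, cdr); None stands for the empty list.
# This is an illustration of the Lisp idea, not of the AGC CADR directive.

def cons(head, tail):
    """Build one cons cell."""
    return (head, tail)

def car(cell):
    """First component of a cons cell."""
    return cell[0]

def cdr(cell):
    """Second component of a cons cell (the rest of the list)."""
    return cell[1]

def cadr(cell):
    """Shorthand for car(cdr(cell)): the second element of a list."""
    return car(cdr(cell))

def from_list(items):
    """Turn a Python list into a chain of cons cells."""
    result = None
    for item in reversed(items):
        result = cons(item, result)
    return result

xs = from_list([10, 20, 30])
print(cadr(xs))  # prints 20
```

As in Lisp, cadr only makes sense here if the list has at least two elements.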
I think it's interesting that they found what seems to be a real bug (should be independently verified by experts). However I find their story mode, dramatization of how it could have happened to be poorly researched and fully in the realm of fiction. An elbow bumping a switch, the command module astronaut unable to handle the issue with only a faux nod to the fact that a reset would have cleared up the problem and it was part of their training. So it's really just building tension and storytelling to make the whole post more edgy. And yes, this is 100% AI written prose which makes it even more distasteful to me.
> An elbow bumping a switch [..] really just building tension and storytelling to make the whole post more edgy.
A guarded switch, no less.
But personally I'm trying to be more generous about this sort of thing: it is very very difficult to explain subtle bugs like this to non-technical people. If you don't give them a story for how it can actually happen, they tend to just assume it's not real. But then when you tell a nice story, all us dry aged curmudgeons tut tut about how irreverent and over the top it is :)
Finding the middle ground between a dry technical analysis and dramatization can be really hard when your audience is the entire internet.
The specs were derived from the code, not from the original requirements. So this is "we modeled what the code does, then found the code doesn't do what we modeled." That's circular unless the model captures intent that the code doesn't, and intent is exactly what you lose when you reverse-engineer specs. Would love to see this applied to a codebase where the original requirements still exist.
But this seems like a reasonable approach for reverse-engineering, and it seems the bug they found is real.
The code was inconsistent with itself: that's not circular. Every path dropped the lock except one.
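A rough sketch of what "every path dropped the lock except one" means as a check. Everything here is hypothetical: the Gyros class, the busy flag and the three exit paths are stand-ins invented for illustration, not the AGC code or the article's tooling.

```python
class Gyros:
    """Toy resource guarded by a busy flag (a stand-in for the real lock)."""

    def __init__(self):
        self.busy = False

    def acquire(self):
        # Refuse the request if someone already holds the flag.
        if self.busy:
            return False
        self.busy = True
        return True

    def release(self):
        self.busy = False


# Hypothetical exit paths that run after the flag has been set.
def normal_exit(gyros):
    gyros.release()

def error_exit(gyros):
    gyros.release()

def leaky_exit(gyros):
    # The inconsistent path: it returns without clearing the flag.
    pass


def find_leaky_paths(paths):
    """Run each exit path on a fresh resource and report the ones that leave it busy."""
    leaks = []
    for path in paths:
        gyros = Gyros()
        gyros.acquire()
        path(gyros)
        if gyros.busy:
            leaks.append(path.__name__)
    return leaks


print(find_leaky_paths([normal_exit, error_exit, leaky_exit]))  # prints ['leaky_exit']
```

The check needs no outside requirements, only the rule the other paths already follow, which is why the argument is not circular. Once a path leaks, every later acquire() on that resource fails.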
I took it as the extracted spec was weird and they looked into it.
Has someone verified this was an actual bug?
One of AI’s strengths is definitely exploration, e.g. in finding bugs, but it still has a high false positive rate. Depending on context, that matters or it doesn’t.
Also one has to be aware that there are a lot of bugs that AI won’t find but humans would
I don’t have the expertise to verify this bug actually happened, but I’m curious.
It's not even clear if AI was used to find the bug: they mention modeling the software with an "ai native" language, whatever that means. What is not clear is how they found themselves modeling the gyros software of the apollo code to begin with.
But, I do think their explanation of the lock acquisition and the failure scenario is quite clear and compelling.
Anyways, it seems it would take a dedicated professional serious work to understand if this bug is real. And considering this looks like an ad for their business, I would be skeptical.
> It's not even clear if AI was used to find the bug: they mention modeling the software with an "ai native" language, whatever that means.
Could the "AI native language" they used be Apache Drools?
The "when" syntax reminded me of it...
(Apache Drools is an open source rule language and interpreter to declaratively formulate and execute rule-based specifications; it easily integrates with Java code.)
How did you pick out AI native and miss the rest of the SAME sentence?
> We found this defect by distilling a behavioural specification of the IMU subsystem using Allium, an AI-native behavioural specification language.
That does not answer my confusion, especially when static analysis could reveal the same conclusion with that language. It's not clear what role ai played at all.
>It's not even clear if AI was used to find the bug
It's not even clear you read the article
Where do you think my confusion came from? All it says is that ai assists in resolving the gyroscope lock path, not why they decided to model the gyroscope lock path to begin with.
Please, keep your offensive comments to yourself when a clarifying comment might have sufficed.
Even worse, the other child comments are speculating (and didn't RTFA either) when the answer is clear in the article.
> We found this defect by distilling a behavioural specification of the IMU subsystem using Allium, an AI-native behavioural specification language.
> distilling
A.k.a. fabricating. No wonder they chose to use "AI".
That's the opposite of clear to me.
Has the article been updated?
2nd paragraph starts with: "We used Claude and Allium"
And later on: "With that obligation written down, Claude traced every path that runs after gyros_busy is set to true"
> It's not even clear if AI was used to find the bug
The intro says “We used Claude and Allium”. Allium looks like a tool they’ve built for Claude.
So the article is about how they used their AI tooling and workflow to find the bug.
The article does not explain anything about how they used AI—it just has some relation with the behavioral model a human seems to have written (and an AI does not seem necessary to use!)
Sure it does.
They used their AI tool to extract the rules for the Apollo guidance system based on the source code.
Then they used Claude to check if all paths followed those rules.
However, Phase 5 (deadlock demonstration) is entirely faked. The script just prints what it _thinks_ would happen. It doesn't actually use the emulator to prove that its thinking is right. Classic Claude being lazy (and the vibe coder not verifying).
I've vibe coded a fix so that the demonstration is actually done properly on the emulator. And also added verification that the 2 line patch actually fixes the bug: https://github.com/juxt/agc-lgyro-lock-leak-bug/pull/1
> However, Phase 5 (deadlock demonstration) is entirely faked. The script just prints what it _thinks_ would happen.
I see this a lot in AI slop, which I mostly get exposed to in the form of shitty pull requests.
You know when you're trying to explain Test-Driven Development to people and you want to explain how you write the simplest thing that passes the test and then improve the test, right? So you say "I want a routine that adds VAT onto a price, so I write a test that says £20+VAT is £24, and the simplest thing that can pass that test is just returning 24". Now you know and I know that the routine and its test will break if you feed it any value except £20, but we've proved we can write a routine and its test, and now we can make it more general.
Or maybe we don't care and we slap a big TODO: make this actually work on there because we don't need it to work properly now, we've got other things to do first, and every price coming up as £20+VAT is a useful indicator that we still have to make other bits work. It doesn't matter.
The problem is that AI slop code "generators" will just stop at that point and go "THERE LOOK IT'S DONE AND IT'S PERFECT!" and the people who believe in the usefulness of AI will just ship it.
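The VAT example above, sketched in Python. The 20% rate is inferred from £20 becoming £24; the function names are made up for illustration.

```python
def add_vat_stub(price):
    """Simplest thing that passes the first test: a hard-coded answer."""
    return 24


def add_vat(price, rate=0.20):
    """The generalised version; the 20% default rate is an assumption."""
    return round(price * (1 + rate), 2)


def first_test(add_vat_fn):
    """The original test: 20 pounds plus VAT is 24 pounds."""
    return add_vat_fn(20) == 24


def second_test(add_vat_fn):
    """A second data point that forces the implementation to generalise."""
    return add_vat_fn(10) == 12


print(first_test(add_vat_stub), second_test(add_vat_stub))
print(first_test(add_vat), second_test(add_vat))
```

The stub passes the first test and fails the second, which is the signal to keep going rather than ship it.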
Super interesting. I wish this article wasn’t written by an LLM though. It feels soulless and plastic.
It's not setting off any LLM alarm bells to me. It just reads like any other scientific article, which is very often soulless
It repeats a few points too many times for a professional writer to not catch it.
I don’t mind that they let an LLM write the text, but they should at least have edited it.
the subheadings are extremely AI IMHO
Isn't that just a normal way to organize a large document?
Any specific sections that stick out? Juxt in the past had really great articles, even before LLMs, and I know for a fact they don't lack the expertise or knowledge to write for themselves if they wanted, and while I haven't completely read this article yet, it'd surprise me if they just let LLMs write articles for them today.
Here's one tell-tale of many: "No alarm, no program light."
Another one: "Two instructions are missing: [...] Four bytes."
One more: "The defensive coding hid the problem, but it didn’t eliminate it."
That's just writing. I frequently write like that.
This insistence that certain stylistic patterns are "tell-tale" signs that an article was written by AI makes no sense, particularly when you consider that whatever stylistic tics an LLM may possess are a result of it being trained on human writing.
These are just some of the good examples I found.
My hunch that this is substantially LLM-generated is based on more than that.
In my head it's like a Bayesian classifier, you look at all the sentences and judge whether each is more or less likely to be LLM vs human generated. Then you add prior information like that the author did the research using Claude - which increases the likelihood that they also use Claude for writing.
Maybe your detector just isn't so sensitive (yet) or maybe I'm wrong but I have pretty high confidence at least 10% of sentences were LLM-generated.
Yes, the stylistic patterns exist in human speech but RLHF has increased their frequency. Also, LLM writing has a certain monotonicity that human writing often lacks. Which is not surprising: the machine generates more or less the most likely text in an algorithmic manner. Humans don't. They write a few sentences, then get a coffee, sleep, write a few more. That creates more variety than an LLM can.
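The "Bayesian classifier in my head" idea can be sketched as a naive log-odds update. This is only an illustration: the function name, the prior and the per-sentence likelihood ratios are all invented, and treating sentences as independent evidence is a strong simplifying assumption.

```python
import math


def posterior_llm_probability(prior, likelihood_ratios):
    """Combine a prior P(LLM) with per-sentence ratios P(sentence | LLM) / P(sentence | human).

    Sentences are treated as independent evidence (the naive Bayes assumption).
    """
    log_odds = math.log(prior / (1 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    return 1 / (1 + math.exp(-log_odds))


# Invented numbers: a prior nudged up because the authors say they used Claude
# for the research, then three sentences judged more or less LLM-like.
print(posterior_llm_probability(0.6, [1.5, 0.8, 2.0]))
```

A ratio above 1 pushes the estimate towards "LLM" and a ratio below 1 pushes it back, so a pair of equal and opposite impressions cancels out. The result is only as good as the ratios, which is the part nobody in this thread can calibrate.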
Here's an alternative way of thinking about this...
Someone probably expended a lot of time and effort planning, thinking about, and writing an interesting article, and then you stroll by and casually accuse them of being a bone idle cheat, with no supporting evidence other than your "sensitive detector" and a bunch of hand-wavy nonsense that adds up to naught.
To start, this is more or less an advertising piece for their product. It's pretty clear that they want to sell you Allium. And that's fine! They are allowed! But even if that was written by a human, they were compensated for it. They didn't expend lots of effort and thinking, it's their job.
More importantly, it's an article about using Claude from a company about using Claude. I think on the balance it's very likely that they would use Claude to write their technical blog posts.
> They didn't expend lots of effort and thinking, it's their job.
Your job doesn't require you to think or expend effort?
While I agree with the sentiment, using AI to write the final draft of the article isn’t cheating. People may not like it, but it’s more a stylistic preference.
Using AI and a human byline is 100% cheating.
Yet another way the mere possibility of AI/LLM being involved diminishes the value of ALL text.
If there is constant vigilance on the part of the reader as to how it was created, meaning and value become secondary, a sure path to the death of reading as a joy.
Those aren’t good examples - that’s just LLMs living for free in your head.
I am reminded of the Simpsons episode in which Principal Skinner tries to pass off the hamburgers from a near-by fast food restaurant for an old family recipe, 'steamed hams,' and his guest's probing into the kitchen mishaps is met with increasingly incredible explanations.
I’m so glad the witch hunt has moved on to phrasing so I get less grief for my em dashes.
In theory, it wouldn't be too hard to settle the question of whether he used ChatGPT to write it: get Olang to write a few paragraphs by hand, then have people judge (blindly) if it's the same style as the article. Which one sounds more like ChatGPT.
When people judge blindly, they are more likely to think the human is the AI and the AI is the human.
73% judged GPT 4.5 (edit: had incorrectly said 4o before) to be the human.
Not only are people bad at judging this, but they are directionally wrong.
There is research showing the contrary that is far more convincing:
> Our experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text, even without any specialized training or feedback. In fact, the majority vote among five such “expert” annotators misclassifies only 1 of 300 articles, significantly outperforming most commercial and open-source detectors we evaluated even in the presence of evasion tactics like paraphrasing and humanization.
The times I've written articles, and those have gone through multiple rounds of reviews (by humans) with countless edits each time, before it ends up being published, I wonder if I'd pass that test in those cases. Initial drafts with my scattered thoughts usually are very different from the published end results, even without involving multiple reviewers and editors.
I hate that I can’t write em dashes freely anymore without people accusing the writing of being AI generated.
Even though they are perfect for usage in writing down thoughts and notes.
One thing you can try⸺admittedly it's not quite correct⸺is replacing them with a two-em dash. I've never seen an AI use one, and it looks pretty funky.
I have nothing against em dashes. As long as your writing is human, experienced readers will be able to tell it's human. Only less experienced ones will use all or nothing rules. Em dashes just increase the likelihood that the text was LLM generated. They aren't proof.
That nuance is lost on the majority of anti-AI folks who’ve learned they get positive social reactions by declaring essentially everything to be AI written and condemnable.
“An em dash… they’re a witch!”… “it’s not just X, it’s Y… they’re a witch!”
> anti-AI folks who’ve learned they get positive social reactions by declaring essentially everything to be AI written and condemnable.
that's a strawman alright; all the comments complaining how they can't use their writing style without being ganged up on are positive karma from my angle, so I'm not sure the "positive social reactions" are really aligned with your imagination. Or does it only count when it aligns with your persecution complex?
You have the same problem apparently. You think it’s okay to go witch hunting and accuse people with no real evidence.
Evidently there are no experienced readers who post AI accusations.
Same weight as "there are no experienced men who'll ask a woman if she's pregnant."
Why do you care what others accuse you of?
No, it’s pretty obviously AI written. Not sure why you’re running so much interference for them…are you affiliated with this company?
This is my exact writing style - I'm screwed.
I doubt you write like that. Where can I find your writing other than your comments which IMO don't read like the blog post?
Justify your doubt.
This is just writing; terse maybe and maybe not grammatically correct, but people write like that.
It's not just terseness, it's the rhythm and "it's not x, it's y".
In fact, the latter is the opposite of terseness. LLMs love to tell you what things are not way more than people do.
(The irony that I started with "it's not just" isn't lost on me)
> (The irony that I started with "it's not just" isn't lost on me)
But an LLM wouldn't write "It's not just X, it's the Y and Z". No disrespect to your writing intended, but adding that extra clause adds just the slightest bit of natural slack to the flow of the sentence, whereas everything LLMs generate comes out like marketing copy that's trying to be as punchy and cloying as possible at all times.
The AI writing detectors are very unreliable. This is important to mention because they can trigger in the opposite direction (reporting human written text as AI generated) which can result in false accusations.
It’s becoming a problem in schools as teachers start accusing students of cheating based on these detectors or ignore obvious signs of AI use because the detectors don’t trigger on it.
Then pangram isn't very good, because that article is full of Claude-isms.
> because that article is full of Claude-isms
Not sure how I feel about the whole "LLMs learned from human texts, so now the people who helped write human texts are suddenly accused of plagiarizing LLMs" thing yet, but seems backwards so far and like a low quality criticism.
Real talk. You're not just making a good point -- you're questioning the dominant paradigm
Horrible
I'm sure some human writers would write:
> The specification forces this question on every path through the IMU mode-switching code. A reviewer examining BADEND would see correct, complete cleanup for every resource BADEND was designed to handle.
> The specification approaches from the other direction: starting from LGYRO and asking whether any paths fail to clear it.
> *Tests verify the code as written; a behavioural specification asks what the code is for.*
However this is a blog post about using Claude for XYZ, from an AI company whose tagline is
"AI-assisted engineering that unlocks your organization's potential"
Do you really think they spent the time required to actually write a good article by hand? My guess is that they are unlocking their own organization's potential by having Claude write the posts.
> Do you really think they spent the time required to actually write a good article by hand?
Given I'm familiar with Juxt since before, used plenty of their Clojure libraries in the past and hung out with people from Juxt even before LLMs were a thing, yes, I do think they could have spent the time required to both research and write articles like these. Again, won't claim for sure I know how they wrote this specific article, but I'm familiar with Juxt enough to feel relatively confident they could write it.
Juxt is more of a consultancy shop than "AI company", not sure where you got that from, guess their landing page isn't 100% clear what they actually do, but they're at least prominent in the Clojure ecosystem and have been for a decade if not more.
Your guess is worth what you paid for it.
Is it possible for a tool to know if something is AI written with high confidence at all? LLMs can be tuned/instructed to write in an infinite number of styles.
Don't understand how these tools exist.
The WikiEDU project has some thoughts on this. They found Pangram good enough to detect LLM usage while teaching editors to make their first Wikipedia edits, at least enough to intervene and nudge the student. They didn’t use it punitively or expect authoritative results however. https://wikiedu.org/blog/2026/01/29/generative-ai-and-wikipe...
They found that Pangram suffers from false positives in non-prose contexts like bibliographies, outlines, formatting, etc. The article does not touch on Pangram’s false negatives.
I personally think it’s an intractable problem, but I do feel pangram gives some useful signal, albeit not reliably.
It has Claude-isms, but it doesn't feel very Claude-written to me, at least not entirely.
What's making it even more difficult to tell now is people who use AI a lot seem to be actively picking up some of its vocab and writing style quirks.
AI tends to write like it is getting paid by the word. This article wasn't too egregious but an editor could have improved it.
"Written by an LLM" based on what data or symptom?
I'm starting to develop a physiological response when I recognize AI prose. Just like an overwhelming frustration, as if I'm hearing nails on chalkboard silently inside of my head.
I feel ya.... and I have to admit in the past I tried it for one article in my own blog thinking it might help me to express... tho when I read that post now I don't even like it myself, it's just not my tone.
Therefore I decided I'm not gonna use any LLM for blogging again, and even tho it takes a lot more time without (I'm not a very motivated writer) I prefer to release something that I did rather than some LLM stuff that I wouldn't read myself.
You have no evidence that it was.
This is the top reply on a substantial percentage of HN posts now and we should discourage it.
It is:
- sneering
- a shallow dismissal (please address the content)
- curmudgeonly
- a tangential annoyance
All things explicitly discouraged in the site guidelines. [1]
Downvoting is the tool for items that you think don't belong on the front page. We don't need the same comment on every single article.
It's not a shallow dismissal; it's a dismissal for good reason. It's tangential to the topic, but not to HN overall. It's only curmudgeonly if you assume AI-written posts are the inevitable and good future (aka begging the question). I really don't know how it's "sneering", so I won't address that.
It’s a dismissal with no evidence i.e. it’s a witch hunt. And no one should support that.
The fact that the whole thread has basically devolved into debates over if it is or isn't an LLM written article is proving well enough that it doesn't really matter one way or another
It is a witch hunt with no evidence whatsoever, all based on intuition. It is a distraction from the main topic, a topic that enough people find interesting to stay on the top page. What was intellectually interesting has now become a bore fest of repeated back and forth. That’s disrespectful and inconsiderate. Write a new post about why you think AI writing is dangerous. I don’t mind that. I’d upvote it.
> Downvoting is the tool for items that you think don't belong on the front page.
You can’t downvote submissions. That’s literally not a feature of the site. You can only flag submissions, if you have more than 31 karma.
And flagging is appropriate when you think content is not authentic
Twelve year old account and who knows how much lurking before that and I've never noticed this. Good lord.
Optimistically, I guess I can call myself some sort of live-and-let-live person.
The site guidelines were written pre-AI and stop making sense when you add AI-generated content into the equation.
Consider that by submitting AI generated content for humans to read, the statement you're making is "I did not consider this worth my time to write, but I believe it's worth your time to read, because your time is worth less than mine". It's an inherently arrogant and unbalanced exchange.
> The site guidelines were written pre-AI and stop making sense when you add AI-generated content into the equation.
Note: the guidelines are a living document that contain references to current AI tools.
> Consider that by submitting AI generated content for humans to read, the statement you're making is "I did not consider this worth my time to write, but I believe it's worth your time to read, because your time is worth less than mine". It's an inherently arrogant and unbalanced exchange.
This is something worth saying about a pure slop content. But the "charge" against the current item is that a reader encountered a feeling that an LLM was involved in the production of interesting content.
With enough eyeballs, all prose contains LLM tells.
We don't need to be told every time someone's personal AI detection algorithm flags. It's a cookie-banner comment: no new information for the reader, but a frustratingly predictable obstacle to scroll through.
We wouldn't need any personal AI detection algorithm flags if the authors simply stated up front that their content is AI generated.
But they won't do that, because deep down they feel shameful about it (as they should).
No idea why you're being downvoted. I've done my bit to redress the balance, I hope others do the same.
Not to single out your comment, but it feels like it's gotten to the point where HN could use a rule against complaining about AI generated content.
It seems like almost every discussion has at least someone complaining about "AI slop" in either the original post or the comments.
I disagree. I like to read articles and explore Show HN posts, but in the past 6 months I’ve wasted a lot of time following HN links that looked interesting but turned out to be AI slop. Several Show HN posts lately have taken me to repos that were AI generated plagiarisms of other projects, presented on HN as their own original ideas.
Seeing comments warning about the AI content of a link is helpful to let others know what they’re getting into when they click the link.
For this article the accusations are not about slop (which will waste your time) but about telltale signs of AI tone. The content is interesting but you know someone has been doing heavy AI polishing, which gives articles a laborious tone and has a tendency to produce a lot of words around a smaller amount of content (in other words, you’re reading an AI expansion of someone’s smaller prompt, which contained the original info you’re interested in).
Being able to share this information is important when discussing links. I find it much more helpful than the comments that appear criticizing color schemes, font choices, or that the page doesn’t work with JavaScript disabled.
> you’re reading an AI expansion of someone’s smaller prompt, which contained the original info you’re interested in
This got me thinking: what if LLMs are used to do the opposite? To condense a long prompt into a short article? That takes more work but might make the outcome more enjoyable as it contains more information.
> This got me thinking: what if LLMs are used to do the opposite? To condense a long prompt into a short article? That takes more work but might make the outcome more enjoyable as it contains more information.
You're fighting an uphill battle against the inherent tendency to produce more and longer text. There's also the regression to the mean problem, so you get less information (and more generic) even though the text is shorter.
Basically, it doesn't work
You're suggesting this is the complainant's fault?
Yes. These HN guidelines already basically cover it:
> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.
> Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.
> Yes. These HN guidelines already basically cover it:
>> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.
>> Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.
They don't. people. tangential.
Yes, because all of them are now irrational about the possibility of LLM writing something they read.
HN has gotten to the point where it’s not even worth clicking the link because of course it’s ai slop.
There is some real content in the haystack, but we almost need some kind of curator to find and display it rather than a vote system where most people vote on the title alone.
If you’re looking for a place that surfaces only human-written content regardless of whether it’s interesting, rather than interesting content regardless of how it was written, HN is not the place.
There might be a market for your alternative though. Should be easy enough to build with Claude Code.
If the content was interesting, the author would've written about it himself.
By asking AI to write the article for you, you're asserting that the subject matter is not interesting enough to be worth your time to write, so why would it be worth my time to read?
You just need AI to read it for you and summarise back in to the original prompt.
I know the author personally. He's hardly the type of person to publish AI slop. Read his other articles and watch his talks, this is very much Henry's literary style.
Stop voting up slop articles and I'll stop commenting on it.
I didn't say they're dispositive. I said they're suspicious. Most people don't write effectively.
So LLMs write effectively and when people do you accuse them of using an LLM?
No, they don't. They use short sentences in weird, stilted ways.
But you have the ability to detect those "weird, stilted ways." Impressive.
I did not get any “written by LLM vibes”. I enjoyed it and it pulled me in to keep reading.
Who gives a crap if it was written by an LLM. Read it or don’t read it. Your choice.
If it conveys the idea and you learn something new, then it’s mission accomplished.
[flagged]
it's actually the second one I read that fit that description.
[flagged]
Software that ran on 4KB of memory and got humans to the moon still has undiscovered bugs in it. That says something about the complexity hiding in even the smallest codebases.
My guess is that in such low memory regimes, program length is very loosely correlated with bug rate.
If anything, if you try to cram a ton of complexity into a few kb of memory, the likelihood of introducing bugs becomes very high.
Well you don't have room for a lot of "defensive" code. You write the program to function on expected inputs, and hope that all the "shouldn't happen" scenarios actually don't happen.
Yet here we are compounding the issues by adding more and more layers to these systems... The higher the level it becomes the more security risks we take.
^ This is slop. Typical platitude that really means nothing.
More likely the llm misinterpreted something and hallucinated an error. Just yesterday Claude code hallucinated itself an infinite loop.
Oh dear. I strongly suggest this author look specification up in a dictionary.
It's (what they're describing is) just reverse engineering. That's what reverse engineering is.
Fortunately reverse engineering too is in the dictionary - to help anyone mistaking it for spec generation.
[deleted][deleted]
Implying that I made such a mistake, which I did not, unless you're willfully taking me overly literally.
Nor did they make any mistakes when they described how they produced a specification, (and indeed, that it is a specification) despite your insinuation otherwise, for a similar reason.
Maybe instead of pointing towards dictionaries, stop pretending that you lack reading comprehension, and get off of your high horse please.
Another CTO "published" an AI slop to get attention to their vibe-coded company that will disappear in two years. Tell me something new...
is this bug the reason why the toilet malfunctioned?
I don't think apollo 11's toilet malfunctioned, it was just not very good. Everything smelled like poop mixed with chemicals, and that was by design.
> Rust’s ownership system makes lock leaks a compile-time error.
Rust specifically does not forbid deadlocks, including deadlocks caused by resource leaks. There are many ways in safe Rust to deliberately leak memory - either by creating reference count cycles, or the explicit .leak() methods on various memory-allocating structures in std. It's also not entirely useless to do this - if you want an &'static from heap memory, Box::leak() does exactly that.
Now, that being said, actually writing code to hold a LockGuard forever is difficult, but that's mainly because the Rust type system is incomplete in ways that primarily inconvenience programmers but don't compromise the safety or meaning of programs. The borrow checker runs separately from type checking, so there's no way to represent a type that both owns and holds a lock at the same time. Only stacks and async types, both generated by compiler magic, can own a LockGuard. You would have to spawn a thread and have it hold the lock and loop indefinitely[0].
[0] Panicking in the thread does not deadlock the lock. Rust's std locks are designed to mark themselves as poisoned if a LockGuard is unwound by a panic, and any attempt to lock them will yield an error instead of deadlocking. You can, of course, clear the poison condition in safe Rust if you are willing to recover from potentially inconsistent data half-written by a panicked thread. Most people just unwrap the lock error, though.
Someone please amend the title and add "using claude code" because that's customary nowadays.
It seems the difference between this and conventional specification languages is that Allium's specs are in natural language, and enforcement is by LLM. This places it in a middle ground between unstructured plan files, and formal specification languages. I can see this as a low friction way to improve code quality.
Fascinating read. Well done. Everyone involved in the Apollo program was amazing, and the program had many unsung heroes.
This is so insightfully and powerfully written I had literal chills running down my spine by the end.
What a horrible world we live in where the author of great writing like this has to sit and be accused of "being AI slop" simply because they use grammar and rhetoric well.
I was completely riveted the whole read. The description of Collins' dilemma is the first time I've seen an actual real world scenario described that might cause him to return to Earth alone.
If an LLM wrote that, then I no longer oppose LLM art.
I thought that was the least likeable part of the article. They speculated wildly, somehow making the leap that a trained astronaut would not resort to a computer reset if the problems persisted to weave the narrative that this bug was super-duper-serious indeed. They didn't need that and it weakened the presentation.
Are there any consequences for the Artemis 2 mission (ironic)?
For anyone who liked this, I highly suggest you take a look at the CuriousMarc youtube channel, where he chronicles lots of efforts to preserve and understand several parts of the Apollo AGC, with a team of really technically competent and passionate collaborators.
One of the more interesting things they have been working on is a potential re-interpretation of the infamous 1202 alarm. It is, as of this writing, popularly described as something related to nonsensical readings of a sensor which could be (and were) safely ignored in the actual moon landing. However, if I remember correctly, some of their investigation revealed that there were actually many conditions under which that error would have been extremely critical and would likely have doomed the astronauts. It is super fascinating.
And that's why it's harder (or easier?) to make the same landing again -- we're taking far fewer chances. Today we know of far more failure modes than back then.
They sent people up in a tin can with the bare minimum computational power to manage navigation and control sequencing. It was barely safer than taking a barrel over Niagara Falls. We do have much more capable and reliable technology.
Buzz Aldrin (?) was quoted as recalling holding a pencil inside the capsule as they were out in space and thinking "that wall isn't very thick or strong, I could probably jam a pencil through it pretty easily..."
Death being a layer of aluminum away changes your mind.
It's a miracle nobody died in flight during the program. Exploding oxygen tank, rockets shaking themselves to pieces during launch, getting hit by lightning on top of a flying skyscraper full of kerosene and liquid oxygen....
Gus Grissom, Ed White, and Roger Chaffee died on the Apollo program. I feel it's not polite to ignore that fact even if you add an 'in flight' qualifier.
Starting from the first test pilots, a lot of people died for us to get to the point to launch that flight. So while no one died on the flight, lots of people died just getting us there. If I recall, in The Right Stuff, it's mentioned that those early test pilots had something like a 25% mortality rate.
Think about the "failure mode" of the aircraft that won World War II, the Supermarine Spitfire.
There was a fuel tank mounted between the engine and cockpit so if it took enough of a hit to puncture right through (not hard, in practice) the failure mode was that the cockpit was now full of a 350mph jet of burning petrol.
Still, it did the job.
The early jet age was pretty nuts. Check the Wikipedia page for a random fighter from the era and you'll see figures like, 1,300 built, 50 lost in combat, 1,100 lost in accidents. And that's operational aircraft. Test pilots were in even more danger.
Some were pretty bad, but none were nearly that bad. The B-58 Hustler lost 22% of its airframes, the F7U Cutlass 25%, the F-104 Starfighter in German service lost 33%. And those were outliers.
You're right, those numbers are from the F-8 but include non-total-loss accidents.
I don't think the numbers you quoted are outliers, though. The F-100 lost ~900 out of 2,300. The F-106 lost ~120/342. That's a pretty big list of planes with a 1/5-1/3 loss rate.
You should go back even a little further, the USPS air mail service lost 31 of the first 40 pilots.
Back in the days where the plan was "So we've built literal signal fires and giant concrete arrows and well, good luck, it won't help"
Have you ever listened to Robert Calvert's "Captain Lockheed and the Starfighters"?
"popularly described" and how it's currently understood are two different things. Because it's hard to explain to lay people, it's popularly described in a number of simplified ways, but it's well understood.
Related topic on CuriousMarc and co.’s AGC restoration: https://news.ycombinator.com/item?id=47641528
Still my all time favorite snippet of code.
https://github.com/chrislgarry/Apollo-11/blob/master/Luminar...
Cadr here has no relation with lisp cadr, right?
Correct.
CADR is an AGC assembly directive defining a "complete address" including a memory bank, in this case a subroutine to be called by the preceding BANKCALL (TC = transfer control, i.e., store return address and jump to subroutine), which switches to the memory bank specified in the CADR before jumping to the address specified in the CADR.
For a brief explanation of AGC subroutine calls, see [1].
CAR and CDR in Lisp come from the original implementation on the IBM 704, where pointers to the two components of a cons cell were stored as the (C)ontents of the (A)ddress and (D)ecrement fields of a (R)egister (memory word).
(CADR x) is just shorthand for (CAR (CDR x)), i.e., a function that returns the second element of a list (assuming x is a well-formed list).
[1] https://epizodsspace.airbase.ru/bibl/inostr-yazyki/American_...
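To make the Lisp half of the comment above concrete, here is a minimal sketch of CAR, CDR and CADR. It models cons cells as Python 2-tuples, which is purely an illustrative assumption and not how the IBM 704 implementation or any real Lisp stores them; the helper names (cons, lisp_list, NIL) are made up for this example.

```python
# Toy model of Lisp cons cells as 2-tuples (illustration only).
NIL = None  # stand-in for the empty list


def cons(a, d):
    """Build a cell holding two components: the 'address' and 'decrement' parts."""
    return (a, d)


def car(cell):
    """Return the first component of a cons cell."""
    return cell[0]


def cdr(cell):
    """Return the second component of a cons cell (the rest of the list)."""
    return cell[1]


def cadr(cell):
    """(CADR x) is shorthand for (CAR (CDR x)): the second list element."""
    return car(cdr(cell))


def lisp_list(*items):
    """Chain cons cells into a well-formed list terminated by NIL."""
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result


xs = lisp_list(10, 20, 30)
print(car(xs))   # first element
print(cadr(xs))  # second element
```

Nothing here relates to the AGC's CADR directive, which, as noted above, is about a complete address including a memory bank.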
https://news.ycombinator.com/item?id=16008239
I'm having a really bad Mandala effect right now where I remember some XKCD that wrote a poem about this. Maybe I'm thinking of another comic.
Oh, it's Mandala effect now? I could swear it was Mandela before.
Can you explain this to me?
I think the point was the comments more than any of the code requiring explanation. There's nothing more permanent than a temporary solution
Wish I could... but I know of it from a previous HN post, where there is some discussion on its purpose.
https://news.ycombinator.com/item?id=22367416
I think it's interesting that they found what seems to be a real bug (should be independently verified by experts). However I find their story mode, dramatization of how it could have happened to be poorly researched and fully in the realm of fiction. An elbow bumping a switch, the command module astronaut unable to handle the issue with only a faux nod to the fact that a reset would have cleared up the problem and it was part of their training. So it's really just building tension and storytelling to make the whole post more edgy. And yes, this is 100% AI written prose which makes it even more distasteful to me.
> An elbow bumping a switch [..] really just building tension and storytelling to make the whole post more edgy.
A guarded switch, no less.
But personally I'm trying to be more generous about this sort of thing: it is very very difficult to explain subtle bugs like this to non-technical people. If you don't give them a story for how it can actually happen, they tend to just assume it's not real. But then when you tell a nice story, all us dry aged curmudgeons tut tut about how irreverent and over the top it is :)
Finding the middle ground between a dry technical analysis and dramatization can be really hard when your audience is the entire internet.
[flagged]
The specs were derived from the code, not from the original requirements. So this is "we modeled what the code does, then found the code doesn't do what we modeled." That's circular unless the model captures intent that the code doesn't, and intent is exactly what you lose when you reverse-engineer specs. Would love to see this applied to a codebase where the original requirements still exist.
But this seems like a reasonable approach for reverse-engineering, and it seems the bug they found is real.
The code was inconsistent with itself: that's not circular. Every path dropped the lock except one.
I took it as the extracted spec was weird and they looked into it.
Has someone verified this was an actual bug?
One of AI’s strengths is definitely exploration, e.g. in finding bugs, but it still has a high false positive rate. Depending on context, that matters or it won’t.
Also one has to be aware that there are a lot of bugs that AI won’t find but humans would
I don’t have the expertise to verify this bug actually happened, but I’m curious.
It's not even clear if AI was used to find the bug: they mention modeling the software with an "ai native" language, whatever that means. What is not clear is how they found themselves modeling the gyros software of the apollo code to begin with.
But, I do think their explanation of the lock acquisition and the failure scenario is quite clear and compelling.
They have some spec language and here,
https://github.com/juxt/Apollo-11/tree/master/specs
have many thousands of lines of code in it.
Anyways, it seems it would take a dedicated professional serious work to understand if this bug is real. And considering this looks like an Ad for their business, I would be skeptical.
> It's not even clear if AI was used to find the bug: they mention modeling the software with an "ai native" language, whatever that means.
Could the "AI native language" they used be Apache Drools? The "when" syntax reminded me of it...
https://kie.apache.org/docs/10.0.x/drools/drools/language-re...
(Apache Drools is an open source rule language and interpreter to declaratively formulate and execute rule-based specifications; it easily integrates with Java code.)
How did you pick out AI native and miss the rest of the SAME sentence?
> We found this defect by distilling a behavioural specification of the IMU subsystem using Allium, an AI-native behavioural specification language.
That does not answer my confusion, especially when static analysis could reveal the same conclusion with that language. It's not clear what role ai played at all.
It seems pretty clear when you follow the link?
https://juxt.github.io/allium/
>It's not even clear if AI was used to find the bug
It's not even clear you read the article
Where do you think my confusion came from? All it says is that ai assists in resolving the gyroscope lock path, not why they decided to model the gyroscope lock path to begin with.
Please, keep your offensive comments to yourself when a clarifying comment might have sufficed.
Even worse, the other child comments are speculating (and didn't RTFA either) when the answer is clear in the article.
> We found this defect by distilling a behavioural specification of the IMU subsystem using Allium, an AI-native behavioural specification language.
> distilling
A.k.a. fabricating. No wonder they chose to use "AI".
That's the opposite of clear to me.
Has the article been updated?
2nd paragraph starts with: "We used Claude and Allium"
And later on: "With that obligation written down, Claude traced every path that runs after gyros_busy is set to true"
> It's not even clear if AI was used to find the bug
The intro says “We used Claude and Allium”. Allium looks like a tool they’ve built for Claude.
So the article is about how they used their AI tooling and workflow to find the bug.
The article does not explain anything about how they used AI—it just has some relation with the behavioral model a human seems to have written (and an AI does not seem necessary to use!)
Sure it does.
They used their AI tool to extract the rules for the Apollo guidance system based on the source code.
Then they used Claude to check if all paths followed those rules.
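The "check that every path follows the rule" idea described above can be sketched in a few lines. This is a toy model under stated assumptions, not the Allium spec, not the AGC code, and not what Claude actually ran: the path names and action strings are hypothetical, and the only rule modelled is "whoever acquires the lock must release it before exit".

```python
def lock_held_at_exit(actions):
    """Replay a path's actions and report whether the lock is still held."""
    held = False
    for action in actions:
        if action == "acquire":
            held = True
        elif action == "release":
            held = False
    return held


# Hypothetical exit paths; only one of them forgets to release the lock.
EXIT_PATHS = {
    "normal_completion": ["acquire", "do_work", "release"],
    "operator_abort": ["acquire", "do_work", "release"],
    "error_exit": ["acquire", "do_work"],
}


def find_lock_leaks(paths):
    """Return the names of all paths that exit with the lock still held."""
    return sorted(
        name for name, actions in paths.items() if lock_held_at_exit(actions)
    )


print(find_lock_leaks(EXIT_PATHS))
```

The point of writing the obligation down separately from the code is that the check starts from the lock and asks about every path, rather than starting from one cleanup routine and asking whether it looks complete.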
I've had a look at the (vibe coded) repro linked in the article to see if it holds up: https://github.com/juxt/agc-lgyro-lock-leak-bug/blob/c378438...
The repro runs on my computer, that's positive.
However, Phase 5 (deadlock demonstration) is entirely faked. The script just prints what it _thinks_ would happen. It doesn't actually use the emulator to prove that its thinking is right. Classic Claude being lazy (and the vibe coder not verifying).
I've vibe coded a fix so that the demonstration is actually done properly on the emulator. And also added verification that the 2 line patch actually fixes the bug: https://github.com/juxt/agc-lgyro-lock-leak-bug/pull/1
> However, Phase 5 (deadlock demonstration) is entirely faked. The script just prints what it _thinks_ would happen.
I see this a lot in AI slop, which I mostly get exposed to in the form of shitty pull requests.
You know when you're trying to explain Test-Driven Development to people and you want to explain how you write the simplest thing that passes the test and then improve the test, right? So you say "I want a routine that adds VAT onto a price, so I write a test that says £20+VAT is £24, and the simplest thing that can pass that test is just returning 24". Now you know and I know that the routine and its test will break if you feed it any value except £20, but we've proved we can write a routine and its test, and now we can make it more general.
Or maybe we don't care and we slap a big TODO: make this actually work on there because we don't need it to work properly now, we've got other things to do first, and every price coming up as £20+VAT is a useful indicator that we still have to make other bits work. It doesn't matter.
The problem is that AI slop code "generators" will just stop at that point and go "THERE LOOK IT'S DONE AND IT'S PERFECT!" and the people who believe in the usefulness of AI will just ship it.
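For readers who have not seen the TDD step described in this comment, a minimal sketch follows. The function names are made up for illustration, and the 20% rate is an assumption inferred from the £20 to £24 example rather than anything stated as a rule.

```python
from decimal import Decimal


def add_vat_first_pass(price):
    """Simplest thing that passes the first test: always answer 24."""
    return Decimal("24")


def add_vat(price, rate=Decimal("0.20")):
    """Generalised version written once a second test exposes the shortcut."""
    return price * (1 + rate)


# The first test passes for both versions.
assert add_vat_first_pass(Decimal("20")) == Decimal("24")
assert add_vat(Decimal("20")) == Decimal("24")

# A second test is what forces the real implementation.
assert add_vat(Decimal("50")) == Decimal("60")
assert add_vat_first_pass(Decimal("50")) != Decimal("60")
```

Stopping after the first pair of assertions is the failure mode the comment describes: the test is green, but nothing beyond the one hard-coded case has been shown to work.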
Super interesting. I wish this article wasn’t written by an LLM though. It feels soulless and plastic.
It's not setting off any LLM alarm bells to me. It just reads like any other scientific article, which is very often soulless
It repeats a few points too many times for a professional writer to not catch it.
I don’t mind that they let an LLM write the text, but they should at least have edited it.
the subheadings are extremely AI IMHO
Isn't that just a normal way to organize a large document?
Any specific sections that stick out? Juxt in the past had really great articles, even before LLMs, and I know for a fact they don't lack the expertise or knowledge to write for themselves if they wanted to. While I haven't completely read this article yet, it'd surprise me if they just let LLMs write articles for them today.
Here's one tell-tale of many: "No alarm, no program light."
Another one: "Two instructions are missing: [...] Four bytes."
One more: "The defensive coding hid the problem, but it didn’t eliminate it."
That's just writing. I frequently write like that.
This insistence that certain stylistic patterns are "tell-tale" signs that an article was written by AI makes no sense, particularly when you consider that whatever stylistic tics an LLM may possess are a result of it being trained on human writing.
These are just some of the good examples I found.
My hunch that this is substantially LLM-generated is based on more than that.
In my head it's like a Bayesian classifier, you look at all the sentences and judge whether each is more or less likely to be LLM vs human generated. Then you add prior information like that the author did the research using Claude - which increases the likelihood that they also use Claude for writing.
Maybe your detector just isn't so sensitive (yet) or maybe I'm wrong but I have pretty high confidence at least 10% of sentences were LLM-generated.
Yes, the stylistic patterns exist in human speech but RLHF has increased their frequency. Also, LLM writing has a certain monotonicity that human writing often lacks. Which is not surprising: the machine generates more or less the most likely text in an algorithmic manner. Humans don't. They write a few sentences, then get a coffee, sleep, write a few more. That creates more variety than an LLM can.
Fun exercise: https://en.wikipedia.org/wiki/Wikipedia:AI_or_not_quiz
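The "Bayesian classifier in my head" framing in this comment can be written out as a small sketch. It assumes the sentences are independent pieces of evidence (the naive Bayes simplification), and every number below is invented for illustration; nothing here is a real detector or a measurement of the article.

```python
import math


def posterior_llm_probability(prior, likelihood_ratios):
    """Combine a prior with per-sentence evidence, naive-Bayes style.

    prior: initial probability that the text is LLM-written (0 < prior < 1).
    likelihood_ratios: for each sentence, P(sentence | LLM) / P(sentence | human).
    """
    log_odds = math.log(prior / (1 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical inputs: a prior nudged upward because the author used Claude
# for the research, plus a few sentences judged more or less "LLM-like".
prior = 0.6
ratios = [2.0, 1.5, 0.8, 3.0]
print(round(posterior_llm_probability(prior, ratios), 3))
```

The sketch also shows where the disagreement in this thread lives: the arithmetic is trivial, and everything depends on how well anyone can estimate the per-sentence ratios.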
Here's an alternative way of thinking about this...
Someone probably expended a lot of time and effort planning, thinking about, and writing an interesting article, and then you stroll by and casually accuse them of being a bone idle cheat, with no supporting evidence other than your "sensitive detector" and a bunch of hand-wavy nonsense that adds up to naught.
To start, this is more or less an advertising piece for their product. It's pretty clear that they want to sell you Allium. And that's fine! They are allowed! But even if that was written by a human, they were compensated for it. They didn't expend lots of effort and thinking, it's their job.
More importantly, it's an article about using Claude from a company about using Claude. I think on the balance it's very likely that they would use Claude to write their technical blog posts.
> They didn't expend lots of effort and thinking, it's their job.
Your job doesn't require you to think or expend effort?
While I agree with the sentiment, using AI to write the final draft of the article isn’t cheating. People may not like it, but it’s more a stylistic preference.
Using AI and a human byline is 100% cheating.
Yet another way the mere possibility of AI/LLM being involved diminishes the value of ALL text.
If there is constant vigilance on the part of the reader as to how it was created, meaning and value become secondary, a sure path to the death of reading as a joy.
Those aren’t good examples - that’s just LLMs living for free in your head.
I am reminded of the Simpsons episode in which Principal Skinner tries to pass off the hamburgers from a near-by fast food restaurant for an old family recipe, 'steamed hams,' and his guest's probing into the kitchen mishaps is met with increasingly incredible explanations.
I’m so glad the witch hunt has moved on to phrasing so I get less grief for my em dashes.
See also: “I'm Kenyan. I Don't Write Like ChatGPT. ChatGPT Writes Like Me” by Marcus Olang', https://marcusolang.substack.com/p/im-kenyan-i-dont-write-li...
For what it’s worth, Pangram reports that Marcus’ article is 100% LLM-written: https://www.pangram.com/history/640288b9-e16b-4f76-a730-8000...
In theory, it wouldn't be too hard to settle the question of whether he used ChatGPT to write it: get Olang to write a few paragraphs by hand, then have people judge (blindly) whether it's the same style as the article, and which one sounds more like ChatGPT.
When people judge blindly, they are more likely to think the human is the AI and the AI is the human.
73% judged GPT 4.5 (edit: had incorrectly said 4o before) to be the human.
https://arxiv.org/abs/2503.23674
Not only are people bad at judging this, but they are directionally wrong.
There is research showing the contrary that is far more convincing:
> Our experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text, even without any specialized training or feedback. In fact, the majority vote among five such “expert” annotators misclassifies only 1 of 300 articles, significantly outperforming most commercial and open-source detectors we evaluated even in the presence of evasion tactics like paraphrasing and humanization.
https://arxiv.org/html/2501.15654v2
Great find, I've submitted this preprint as a standalone item: https://news.ycombinator.com/item?id=47678270
The times I've written articles, and those have gone through multiple rounds of reviews (by humans) with countless edits each time, before it ends up being published, I wonder if I'd pass that test in those cases. Initial drafts with my scattered thoughts usually are very different from the published end results, even without involving multiple reviewers and editors.
I hate that I can’t write em dashes freely anymore without people accusing the writing of being AI generated.
Even though they are perfect for usage in writing down thoughts and notes.
One thing you can try⸺admittedly it's not quite correct⸺is replacing them with a two-em dash. I've never seen an AI use one, and it looks pretty funky.
Since the advantage of standards is that there are so many to choose from, one lesser-used but still regionally acceptable approach (e.g. https://www.alberta.ca/web-writing-style-guide-punctuation#j...) is to use en-dashes offset with spaces.
I have nothing against em dashes. As long as your writing is human, experienced readers will be able to tell it's human. Only less experienced ones will use all or nothing rules. Em dashes just increase the likelihood that the text was LLM generated. They aren't proof.
That nuance is lost on the majority of anti-AI folks who’ve learned they get positive social reactions by declaring essentially everything to be AI written and condemnable.
“An em dash… they’re a witch!”… “it’s not just X, it’s Y… they’re a witch!”
> anti-AI folks who’ve learned they get positive social reactions by declaring essentially everything to be AI written and condemnable.
that's a strawman alright; all the comments complaining how they can't use their writing style without being ganged up on are positive karma from my angle, so I'm not sure the "positive social reactions" are really aligned with your imagination. Or does it only count when it aligns with your persecution complex?
You have the same problem apparently. You think it’s okay to go witch hunting and accuse people with no real evidence.
Evidently there are no experienced readers who post AI accusations.
Same weight as "there are no experienced men who'll ask a woman if she's pregnant."
Why do you care what others accuse you of?
No, it’s pretty obviously AI written. Not sure why you’re running so much interference for them…are you affiliated with this company?
[dead]
This is my exact writing style - I'm screwed.
I doubt you write like that. Where can I find your writing other than your comments which IMO don't read like the blog post?
Justify your doubt.
This is just writing; terse maybe and maybe not grammatically correct, but people write like that.
It's not just terseness, it's the rhythm and "it's not x, it's y".
In fact, the latter is the opposite of terseness. LLMs love to tell you what things are not way more than people do.
See https://www.blakestockton.com/dont-write-like-ai-1-101-negat...
(The irony that I started with "it's not just" isn't lost on me)
> (The irony that I started with "it's not just" isn't lost on me)
But an LLM wouldn't write "It's not just X, it's the Y and Z". No disrespect to your writing intended, but adding that extra clause adds just the slightest bit of natural slack to the flow of the sentence, whereas everything LLMs generate comes out like marketing copy that's trying to be as punchy and cloying as possible at all times.
"Here’s how the bug might have manifested."
For what it’s worth, Pangram thinks this article is fully human-written: https://www.pangram.com/history/f5f68ce9-70ac-4c2b-b0c3-0ca8...
The AI writing detectors are very unreliable. This is important to mention because they can trigger in the opposite direction (reporting human written text as AI generated) which can result in false accusations.
It’s becoming a problem in schools as teachers start accusing students of cheating based on these detectors or ignore obvious signs of AI use because the detectors don’t trigger on it.
Then pangram isn't very good, because that article is full of Claude-isms.
> because that article is full of Claude-isms
Not sure how I feel about the whole "LLMs learned from human texts, so now the people who helped write human texts are suddenly accused of plagiarizing LLMs" thing yet, but seems backwards so far and like a low quality criticism.
Real talk. You're not just making a good point -- you're questioning the dominant paradigm
Horrible
I'm sure some human writers would write:
> The specification forces this question on every path through the IMU mode-switching code. A reviewer examining BADEND would see correct, complete cleanup for every resource BADEND was designed to handle.
> The specification approaches from the other direction: starting from LGYRO and asking whether any paths fail to clear it.
> *Tests verify the code as written; a behavioural specification asks what the code is for.*
However this is a blog post about using Claude for XYZ, from an AI company whose tagline is
"AI-assisted engineering that unlocks your organization's potential"
Do you really think they spent the time required to actually write a good article by hand? My guess is that they are unlocking their own organization's potential by having Claude write the posts.
> Do you really think they spent the time required to actually write a good article by hand?
Given that I've been familiar with Juxt for a long time, have used plenty of their Clojure libraries in the past and hung out with people from Juxt even before LLMs were a thing, yes, I do think they could have spent the time required to both research and write articles like these. Again, I won't claim to know for sure how they wrote this specific article, but I'm familiar enough with Juxt to feel relatively confident they could write it.
Juxt is more of a consultancy shop than an "AI company"; not sure where you got that from. I guess their landing page isn't 100% clear about what they actually do, but they're at least prominent in the Clojure ecosystem and have been for a decade if not more.
Your guess is worth what you paid for it.
Is it possible for a tool to know if something is AI written with high confidence at all? LLMs can be tuned/instructed to write in an infinite number of styles.
Don't understand how these tools exist.
The WikiEDU project has some thoughts on this. They found Pangram good enough to detect LLM usage while teaching editors to make their first Wikipedia edits, at least enough to intervene and nudge the student. They didn’t use it punitively or expect authoritative results, however. https://wikiedu.org/blog/2026/01/29/generative-ai-and-wikipe...
They found that Pangram suffers from false positives in non-prose contexts like bibliographies, outlines, formatting, etc. The article does not touch on Pangram’s false negatives.
I personally think it’s an intractable problem, but I do feel pangram gives some useful signal, albeit not reliably.
It has Claude-isms, but it doesn't feel very Claude-written to me, at least not entirely.
What's making it even more difficult to tell now is people who use AI a lot seem to be actively picking up some of its vocab and writing style quirks.
Pangram has a very low false positive rate, but not the best false negative rate: https://www.pangram.com/blog/third-party-pangram-evals
You sound like a flat earther and a moon landing denier combined.
Pangram doesn't reliably detect individual LLM-generated phrases or paragraphs among human written text.
It seems to look at sections of ~300 words. And for one section at least it has low confidence.
I tested it by getting ChatGPT to add a paragraph to one of my sister comments. Result is "100% human" when in fact it's only 75% human.
Pangram test result: https://www.pangram.com/history/1ee3ce96-6ae5-4de7-9d91-5846...
ChatGPT session where it added a paragraph that Pangram misses: https://chatgpt.com/share/69d4faff-1e18-8329-84fa-6c86fc8258...
This is useful, thanks! TIL
So you're saying Pangram isn't worth much?
And it turns out at least the part about Rust and locks is plain wrong. What a surprise: https://news.ycombinator.com/reply?id=47676938&goto=item%3Fi...
AI tends to write like it is getting paid by the word. This article wasn't too egregious but an editor could have improved it.
"Written by an LLM" based on what data or symptom?
I'm starting to develop a physiological response when I recognize AI prose. Just like an overwhelming frustration, as if I'm hearing nails on chalkboard silently inside of my head.
I feel ya... and I have to admit, in the past I tried it for one article on my own blog, thinking it might help me express myself... though when I read that post now I don't even like it myself; it's just not my tone.
So I decided not to use any LLM for blogging again, and even though it takes a lot more time without one (I'm not a very motivated writer), I prefer to release something that I wrote rather than some LLM stuff that I wouldn't read myself.
You have no evidence that it was.
This is the top reply on a substantial percentage of HN posts now and we should discourage it.
It is:
- sneering
- a shallow dismissal (please address the content)
- curmudgeonly
- a tangential annoyance
All things explicitly discouraged in the site guidelines. [1]
Downvoting is the tool for items that you think don't belong on the front page. We don't need the same comment on every single article.
[1] - https://news.ycombinator.com/newsguidelines.html
It's not a shallow dismissal; it's a dismissal for good reason. It's tangential to the topic, but not to HN overall. It's only curmudgeonly if you assume AI-written posts are the inevitable and good future (aka begging the question). I really don't know how it's "sneering", so I won't address that.
It’s a dismissal with no evidence i.e. it’s a witch hunt. And no one should support that.
The fact that the whole thread has basically devolved into debates over if it is or isn't an LLM written article is proving well enough that it doesn't really matter one way or another
It is a witch hunt with no evidence whatsoever, all based on intuition. It is a distraction from the main topic, a topic that enough people find interesting to stay on the top page. What was intellectually interesting has now become a bore fest of repeated back and forth. That’s disrespectful and inconsiderate. Write a new post about why you think AI writing is dangerous. I don’t mind that. I’d upvote it.
> Downvoting is the tool for items that you think don't belong on the front page.
You can’t downvote submissions. That’s literally not a feature of the site. You can only flag submissions, if you have more than 31 karma.
And flagging is appropriate when you think content is not authentic
Twelve year old account and who knows how much lurking before that and I've never noticed this. Good lord.
Optimistically, I guess I can call myself some sort of live-and-let-live person.
The site guidelines were written pre-AI and stop making sense when you add AI-generated content into the equation.
Consider that by submitting AI generated content for humans to read, the statement you're making is "I did not consider this worth my time to write, but I believe it's worth your time to read, because your time is worth less than mine". It's an inherently arrogant and unbalanced exchange.
> The site guidelines were written pre-AI and stop making sense when you add AI-generated content into the equation.
Note: the guidelines are a living document that contain references to current AI tools.
> Consider that by submitting AI generated content for humans to read, the statement you're making is "I did not consider this worth my time to write, but I believe it's worth your time to read, because your time is worth less than mine". It's an inherently arrogant and unbalanced exchange.
This is something worth saying about pure slop content. But the "charge" against the current item is that a reader encountered a feeling that an LLM was involved in the production of interesting content.
With enough eyeballs, all prose contains LLM tells.
We don't need to be told every time someone's personal AI detection algorithm flags. It's a cookie-banner comment: no new information for the reader, but a frustratingly predictable obstacle to scroll through.
We wouldn't need any personal AI detection algorithm flags if the authors simply stated up front that their content is AI generated.
But they won't do that, because deep down they feel shameful about it (as they should).
No idea why you're being downvoted. I've done my bit to redress the balance, I hope others do the same.
Not to single out your comment, but it feels like it's gotten to the point where HN could use a rule against complaining about AI generated content.
It seems like almost every discussion has at least someone complaining about "AI slop" in either the original post or the comments.
I disagree. I like to read articles and explore Show HN posts, but in the past 6 months I’ve wasted a lot of time following HN links that looked interesting but turned out to be AI slop. Several Show HN posts lately have taken me to repos that were AI generated plagiarisms of other projects, presented on HN as their own original ideas.
Seeing comments warning about the AI content of a link is helpful to let others know what they’re getting into when they click the link.
For this article the accusations are not about slop (which will waste your time) but about telltale signs of AI tone. The content is interesting but you know someone has been doing heavy AI polishing, which gives articles a laborious tone and has a tendency to produce a lot of words around a smaller amount of content (in other words, you’re reading an AI expansion of someone’s smaller prompt, which contained the original info you’re interested in).
Being able to share this information is important when discussing links. I find it much more helpful than the comments that appear criticizing color schemes, font choices, or that the page doesn’t work with JavaScript disabled.
> you’re reading an AI expansion of someone’s smaller prompt, which contained the original info you’re interested in
This got me thinking: what if LLMs are used to do the opposite? To condense a long prompt into a short article? That takes more work but might make the outcome more enjoyable as it contains more information.
> This got me thinking: what if LLMs are used to do the opposite? To condense a long prompt into a short article? That takes more work but might make the outcome more enjoyable as it contains more information.
You're fighting an uphill battle against the inherent tendency to produce more and longer text. There's also the regression to the mean problem, so you get less information (and more generic) even though the text is shorter.
Basically, it doesn't work
You're suggesting this is the complainant's fault?
Yes. These HN guidelines already basically cover it:
> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.
> Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.
> Yes. These HN guidelines already basically cover it:
>> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.
>> Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.
They don't. people. tangential.
Yes, because all of them are now irrational about the possibility of LLM writing something they read.
HN has gotten to the point where it’s not even worth clicking the link because of course it’s ai slop.
There is some real content in the haystack, but we almost need some kind of curator to find and display it rather than a vote system where most people vote on the title alone.
If you’re looking for a place that surfaces only human-written content regardless of whether it’s interesting, rather than interesting content regardless of how it was written, HN is not the place.
There might be a market for your alternative though. Should be easy enough to build with Claude Code.
If the content was interesting, the author would've written about it himself.
By asking AI to write the article for you, you're asserting that the subject matter is not interesting enough to be worth your time to write, so why would it be worth my time to read?
You just need AI to read it for you and summarise it back into the original prompt.
I know the author personally. He's hardly the type of person to publish AI slop. Read his other articles and watch his talks, this is very much Henry's literary style.
Stop voting up slop articles and I'll stop commenting on it.
Point to one.
This is on the front page now https://rajnandan.com/posts/taste-in-the-age-of-ai-and-llms/
I've seen way, way worse. Either someone LLM-polished something they already wrote, or they did their own manual editing pass.
The short sentence construction is the most suspicious, but I actually don't see anything glaring. It normally jumps out and hits me in the face.
>Hemingway's 4 Fast Rules For Effective Writing
1. Use Short Sentences
https://www.wordsthatsing.com.au/post/hemingway-rules
I didn't say they're dispositive. I said they're suspicious. Most people don't write effectively.
So LLMs write effectively and when people do you accuse them of using an LLM?
No, they don't. They use short sentences in weird, stilted ways.
But you have the ability to detect those "weird, stilted ways." Impressive.
I did not get any “written by LLM vibes”. I enjoyed it and it pulled me in to keep reading.
Who gives a crap if it was written by an LLM. Read it or don’t read it. Your choice.
If it conveys the idea and you learn something new, then it’s mission accomplished.
[flagged]
it's actually the second one I read that fit that description.
[flagged]
Software that ran on 4KB of memory and got humans to the moon still has undiscovered bugs in it. That says something about the complexity hiding in even the smallest codebases.
My guess is that in such low memory regimes, program length is very loosely correlated with bug rate.
If anything, if you try to cram a ton of complexity into a few kb of memory, the likelihood of introducing bugs becomes very high.
Well you don't have room for a lot of "defensive" code. You write the program to function on expected inputs, and hope that all the "shouldn't happen" scenarios actually don't happen.
Yet here we are compounding the issues by adding more and more layers to these systems... The higher the level it becomes the more security risks we take.
^ This is slop. Typical platitude that really means nothing.
More likely the llm misinterpreted something and hallucinated an error. Just yesterday Claude code hallucinated itself an infinite loop.
Both the article and repo[1] are slop.
[1] In the repo, the "reproduce" is just a bunch of print statements about what would happen, the bug isn't actually triggered: https://github.com/juxt/agc-lgyro-lock-leak-bug/blob/c378438...
> The specs were derived from the code itself
Oh dear. I strongly suggest this author look specification up in a dictionary.
It's (what they're describing is) just reverse engineering. That's what reverse engineering is.
Fortunately reverse engineering too is in the dictionary - to help anyone mistaking it for spec generation.
Implying that I did make such a mistake, which I did not, unless you're willfully taking me overly literally.
Nor did they make any mistakes when they described how they produced a specification, (and indeed, that it is a specification) despite your insinuation otherwise, for a similar reason.
Maybe instead of pointing towards dictionaries, stop pretending that you lack reading comprehension, and get off of your high horse please.
Another CTO "published" an AI slop to get attention to their vibe-coded company that will disappear in two years. Tell me something new...
is this bug the reason why the toilet malfunctioned?
I don't think apollo 11's toilet malfunctioned, it was just not very good. Everything smelled like poop mixed with chemicals, and that was by design.
> Rust’s ownership system makes lock leaks a compile-time error.
Rust specifically does not forbid deadlocks, including deadlocks caused by resource leaks. There are many ways in safe Rust to deliberately leak memory - either by creating reference count cycles, or the explicit .leak() methods on various memory-allocating structures in std. It's also not entirely useless to do this - if you want an &'static from heap memory, Box::leak() does exactly that.
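For anyone who wants to see both kinds of leak compile, here is a minimal sketch in safe Rust. The `Node` type and the function names are made up for illustration; only `Rc`, `RefCell`, and `Box::leak` come from std.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical node type, used only to build a reference-count cycle.
#[allow(dead_code)]
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

/// Builds a two-node cycle (a -> b -> a) and returns a's strong count.
/// When the locals go out of scope each count drops to 1, never to 0,
/// so neither node is ever freed. No `unsafe` is involved.
fn make_cycle() -> usize {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    // Two strong references to `a`: the local binding and `b.next`.
    let count = Rc::strong_count(&a);
    count
}

/// Deliberately leaks a heap allocation to obtain a `'static` reference.
fn leak_to_static() -> &'static str {
    let leaked: &'static mut str = Box::leak(String::from("never freed").into_boxed_str());
    leaked
}

fn main() {
    println!("strong count inside the cycle: {}", make_cycle());
    println!("leaked string: {}", leak_to_static());
}
```

Both functions compile without warnings about leaking, which is the point: the compiler treats a leak as safe, just wasteful.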
Now, that being said, actually writing code to hold a LockGuard forever is difficult, but that's mainly because the Rust type system is incomplete in ways that primarily inconvenience programmers but don't compromise the safety or meaning of programs. The borrow checker runs separately from type checking, so there's no way to represent a type that both owns and holds a lock at the same time. Only stacks and async types, both generated by compiler magic, can own a LockGuard. You would have to spawn a thread and have it hold the lock and loop indefinitely[0].
[0] Panicking in the thread does not deadlock the lock. Rust's std locks are designed to mark themselves as poisoned if a LockGuard is unwound by a panic, and any attempt to lock them will yield an error instead of deadlocking. You can, of course, clear the poison condition in safe Rust if you are willing to recover from potentially inconsistent data half-written by a panicked thread. Most people just unwrap the lock error, though.
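A short sketch of the two behaviours discussed here, using std's actual guard type, `MutexGuard`. One compact way to leak a lock in safe code is to pass the guard to `std::mem::forget`, which skips the destructor that would unlock it. The function names are illustrative, and `try_lock` is used instead of `lock` so the demo reports the stuck lock rather than hanging.

```rust
use std::mem;
use std::sync::{Arc, Mutex};
use std::thread;

/// Forgets a guard so its destructor never runs and the mutex stays locked.
/// Returns true if a later `try_lock` fails because the lock is still held.
fn leaked_guard_keeps_lock_held() -> bool {
    let m = Mutex::new(0u32);
    let guard = m.lock().unwrap();
    // Safe code: `mem::forget` takes ownership and never drops the guard.
    mem::forget(guard);
    // `try_lock` returns an error instead of blocking forever.
    let stuck = m.try_lock().is_err();
    stuck
}

/// Panics in a thread that holds the lock. Unwinding drops the guard,
/// which unlocks the mutex and marks it poisoned, so later `lock` calls
/// return an error instead of deadlocking.
fn panic_poisons_instead_of_deadlocking() -> bool {
    let m = Arc::new(Mutex::new(0u32));
    let m2 = Arc::clone(&m);
    let result = thread::spawn(move || {
        let _guard = m2.lock().unwrap();
        panic!("simulated failure while holding the lock");
    })
    .join();
    // The join result is an error because the thread panicked.
    assert!(result.is_err());
    let poisoned = m.lock().is_err();
    poisoned
}

fn main() {
    println!("leaked guard keeps lock held: {}", leaked_guard_keeps_lock_held());
    println!("panic poisons the lock: {}", panic_poisons_instead_of_deadlocking());
}
```

Neither function needs `unsafe`, so a leaked lock is something the type system makes unusual rather than a compile-time error.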
Someone please amend the title and add "using claude code" because that's customary nowadays.
Also add "AI can make mistakes". Thank you.
Thank you for your attention to this matter.
An application of their specification language, https://juxt.github.io/allium/
It seems the difference between this and conventional specification languages is that Allium's specs are in natural language, and enforcement is by LLM. This places it in a middle ground between unstructured plan files, and formal specification languages. I can see this as a low friction way to improve code quality.
Fascinating read. Well done. Everyone involved in the Apollo program was amazing, and the program had many unsung heroes.
This is so insightfully and powerfully written I had literal chills running down my spine by the end.
What a horrible world we live in where the author of great writing like this has to sit and be accused of "being AI slop" simply because they use grammar and rhetoric well.
I was completely riveted the whole read. The description of Collins' dilemma is the first time I've seen an actual real world scenario described that might cause him to return to Earth alone.
If an LLM wrote that, then I no longer oppose LLM art.
I thought that was the least likeable part of the article. They speculated wildly, somehow making the leap that a trained astronaut would not resort to a computer reset if the problems persisted to weave the narrative that this bug was super-duper-serious indeed. They didn't need that and it weakened the presentation.
Are there any consequences for the Artemis 2 mission (ironic)?