It is rare that I say this but, thanks MS! Arguably just as important, if not more so, is the BASIC that they wrote. That was what they actually wanted to do. DOS just got them the contract with IBM. For decades MS was really a developer tools company with a side biz of writing operating systems and other misc software. They open sourced that BASIC code too [1].
[1] https://opensource.microsoft.com/blog/2025/09/03/microsoft-o...
Discussion, on the source, at the time (79 points, 24 days ago, 19 comments) https://news.ycombinator.com/item?id=47957494
Or on the GitHub clone (162 points, 15 comments) https://news.ycombinator.com/item?id=47946813
wow, they had to OCR it back in from paper printouts
> This source code is old enough that it hadn’t been stored digitally. “A dedicated team of historians and preservationists led by Yufeng Gao and Rich Cini,” calling itself the “DOS Disassembly Group,” painstakingly transcribed and scanned in code from paper printouts provided by Paterson. This process was made even more difficult because modern OCR software struggled with the quality of the decades-old printout.
I'd like to hear more about what works in OCR of dot-matrix fonts.
I've been able to OCR letter-quality printer output to 97% (mostly Os and Xs problems).
But it seems that machine-learning text-recognition is also now biased to reject computer code because it doesn't look like human language.
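For anyone who wants to put a number on this kind of result, here is a minimal sketch (not the method used by the team in the article) that compares OCR output against a hand-checked transcription and tallies which characters get swapped. The function name `ocr_report` and the sample assembly line are made up for illustration; only the standard-library `difflib` and `collections` modules are used.

```python
from collections import Counter
from difflib import SequenceMatcher


def ocr_report(truth: str, ocr: str):
    """Return (character accuracy, Counter of (expected, got) substitutions)."""
    matcher = SequenceMatcher(None, truth, ocr, autojunk=False)
    matched = 0
    confusions = Counter()
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            matched += i2 - i1
        elif tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same-length replacement: pair the characters up one to one.
            confusions.update(zip(truth[i1:i2], ocr[j1:j2]))
    # Accuracy here means: share of ground-truth characters matched exactly.
    accuracy = matched / len(truth) if truth else 1.0
    return accuracy, confusions


# Hypothetical sample, not taken from the actual DOS listing.
truth = "MOV AX,0\nXOR BX,BX\n"
ocr = "M0V AX,O\nX0R BX,BX\n"

accuracy, confusions = ocr_report(truth, ocr)
print(f"character accuracy: {accuracy:.1%}")
print(confusions.most_common())
```

Sorting the confusion pairs by count makes it obvious when a handful of look-alike glyphs (O versus 0, for instance) account for most of the errors, which is the case where a targeted post-correction pass is worth trying.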
Yet another case where text printed on paper outlived any digital storage.
Seems like it was never digitally stored in the first place, and the printed text was barely readable due to age. Not really a big win for paper.
Well it had to have been on disk or tape at some point. It wasn't all typed in by hand every time they needed to build a new version.
unless they used punch cards
> unless they used punch cards
For MS-DOS?
The idea that it never existed digitally is obviously untrue. Likely poor wording on the author's part. They probably meant something like "so old that a printout is all that survived" (which sounds vaguely like not being digital to someone in an era so far removed from a time when programs were, or realistically could be, printed).
> a time when programs were/could realistically be printed
Really depends on the program. Source code is often quite manageable. Even artifacts aren't always as large as you might expect. Busybox on my system weighs in at 1.9 MiB or alternatively 928 KiB with zstd maxed out.
But I don't really see a point to printing any of it. A situation that might require the printouts is likely to largely preclude the continued existence of modern electronics, the ability to replace batteries, or even a connection to a reliable electrical grid.
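To give a rough sense of scale, here is a back-of-the-envelope sketch of how many pages a printout would take. The page geometry (80 columns by 66 lines, fully packed) and the choice of a plain hex dump for binaries are my assumptions, not anything from the article; real listings have short lines and headers, so actual page counts would be higher.

```python
import math

# Assumed page geometry: 80 columns by 66 lines, every position used.
COLS = 80
LINES = 66


def pages_for_text(n_chars: int, cols: int = COLS, lines: int = LINES) -> int:
    """Lower-bound page count for n_chars printable characters."""
    return math.ceil(n_chars / (cols * lines))


def pages_for_binary(n_bytes: int) -> int:
    """Page count for a binary printed as bare hex, two digits per byte."""
    return pages_for_text(2 * n_bytes)


# The compressed size quoted above (928 KiB), dumped as hex.
print(pages_for_binary(928 * 1024))
```

Even with dense packing, a binary just under a megabyte runs to hundreds of pages, which squares with the point that printing is possible for small programs but rarely worth doing.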
Yeah, that's why I tried to include both categories. Even for programs that are small enough to be printed, we just don't do it any more. I could have worded that part better myself.
How did they print it then, I wonder?
They had some old German guy with a big beard, and two interns, running some sort of big contraption that looked like a medieval torture instrument, and the interns would run and put letters in a row and then the old guy would move a massive letter and in the end out came a bit of paper with source code on it.
> struggled with the quality of the decades-old printout.
barely
It sounds like this printout has deteriorated badly and was barely readable.
Recent and related:
Microsoft open sources DOS 1.00 on 45th anniversary - https://news.ycombinator.com/item?id=47957494 - April 2026 (19 comments)
It is wonderful how brilliant the early years of modern computing were. We treated machines as what they really are: machines. Performance, creativity, science... everything it took to make a 386 machine work. Nowadays it is all about libraries, virtualization, [bad] code over [bad] code over [bad] code... I don't like it.
For a very long while now, we have had programmers who never understood any low level concepts at all. They started with js or python, and never looked 'down'. There are no limits to the monstrosities they will consider normal.
Linus Torvalds, a few months ago, said something to this effect when discussing AI coding tools: that his (and my) generation was lucky to have started with low level stuff and managed to retain an understanding of the whole stack, and kids these days don't get that. Good luck acquiring this level of feel for computers, algorithms, and data structures today, when a kid's first experience with coding will be a seemingly genius chatbot.
I sometimes think that my mental model of a computer still being an Apple ][+ with 48K of RAM leads to my writing better code.
And mine is a Commodore VIC-20 circa 1981, with 3583 bytes of free RAM. Programmed in 6502 assembler. Can't get much closer to the CPU than that.
I wonder how long it'll be before they release the source for the earliest Windows versions. The fact that they still have the source for this very old DOS at least gives hope that they also do for old Windows.
The day they make the Windows 2000 codebase open source (or source available) would be the day I could die happy (although I'd probably be long dead anyway by the time there's a glimmer of a chance of it happening). What a beautiful, smooth-running operating system it was.
Wasn't there a 2000 source leak a while ago? I remember some exploits coming out after the leak.
Agreed. It's still my favorite Windows version.
There is a mostly complete leak of it...
I imagine it's not far off. I get the impression they are almost done with Windows as a platform.
I am sure there is a lot of good material to learn from and take inspiration from, even in the early Windows 3.11.
Do a deep dive into how OS/360 formalized to having DOS.
/s ?
Pretty sure it's a bot or simple karma farming operation.
They waited a couple decades too long for this to be of interest.
Time to find vulnerabilities!
I remember, in the aughts, coming across a DOS machine that was quite out of time… even for the university basement it was living in, next to a pile of lead bricks. Its only job was to run an instrument via a home-built ISA card and write data out to 5.25” floppies.
What uses would this code have in 2026?
It's a single user OS that runs everything in ring zero by design. I'm not sure, definitionally, that it can have security vulnerabilities. I... guess maybe code execution on exposure to an untrusted floppy disk filesystem?
To see what decisions they made. Like any historical document. Aim to understand the people of the time.
Too little, too late.
in the words of Mr. Mitch Hedberg: “here, you throw this away”
He could have sold those printouts instead of giving them away.
Back when it was all written by hand and optimized well.