Kernighan on Programming
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it"
This has been a timely PSA.
Kernighan's Lever - https://linusakesson.net/programming/kernighans-lever/index....
This article is perennially posted here and is probably the best breakdown of this quote.
So is reviewing and verifying code. Maybe not twice as "hard" if you're skilled in such things. But most programmers I've worked with can't be bothered to write tests let alone verify correctness by other means (good tests, property tests, types, model checking, etc).
It's one thing to point out small trivialities like initialization and lifetime issues in a small piece of code. But it's quite another to prove they don't exist in a large code base.
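For a concrete flavor of the "property tests" idea, here is a minimal sketch using Python's hypothesis library; normalize_path is a made-up utility and the invariants are only illustrative:

    # Minimal property-based test sketch with the `hypothesis` library.
    # `normalize_path` is a hypothetical utility, not from any real project.
    from hypothesis import given, strategies as st

    def normalize_path(p: str) -> str:
        # Toy implementation: collapse repeated slashes.
        while "//" in p:
            p = p.replace("//", "/")
        return p

    @given(st.text(alphabet="/ab", min_size=1))
    def test_normalize_is_idempotent(p):
        # Property: normalizing twice is the same as normalizing once.
        assert normalize_path(normalize_path(p)) == normalize_path(p)

    @given(st.text(alphabet="/ab", min_size=1))
    def test_no_double_slashes_remain(p):
        assert "//" not in normalize_path(p)

Instead of a handful of hand-picked inputs, hypothesis generates many and shrinks any failure to a minimal counterexample.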
Kernighan is a good source of quotes and thinking on programming.
I am fascinated by the prevalence of wanting "tests" in Hacker News comments. Most of the code I have worked on in the past 20 years did not have tests. Most of it was shopping carts, custom data transformation code, orchestrating servers, plugin code to change some aspect of a website.
Now, I have had to do some Salesforce Apex coding and the framework requires tests. So I write up some dummy data for a user and a lead and pass it through the code, but it feels of limited value, almost like just additional ceremony. Most of the bugs I see come from different users having misconceptions about what a flag means. I cannot think of a time a test caught something.
The organization is huge and people do not go and run all the code every time some other area of the system is changed. Maybe they should? But I doubt that would ever happen given the politics of the organization.
So I am curious: what kinds of tests do people write in other areas of the industry?
> what kinds of tests do people write in other areas of the industry?
Aerospace here. Roughly this would be typical:
- comprehensive requirements on the software behavior, with tests to verify those requirements. Tests are automated as much as possible (e.g., scripts rather than manual testing)
- tests are generally run first in a test suite in a completely virtual software environment
- structural coverage analysis (depending on level of criticality) to show that all code in the subsystem was executed by the testing (or adequately explain why the testing can't hit that code)
- then once that passes, run the same tests in a hardware lab environment, testing the software as it runs on the actual physical component that will be installed on the plane
- then test on an actual plane, through a series of flight tests. (The flight testing would likely not be as comprehensive as the previous steps)
A full round of testing is very time-consuming and expensive, so as many issues as possible should be caught and fixed in the virtual software tests before anything even gets to the hardware lab, much less to the plane.
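As a rough, non-authoritative sketch of what the "requirements with automated tests" step can look like (this is not the actual certified toolchain; the requirement IDs and compute_flap_angle are invented for illustration):

    # Each test traces to a requirement ID so requirement coverage can be shown;
    # structural coverage is then measured over runs like these.
    import pytest

    def compute_flap_angle(airspeed_kts: float) -> float:
        """Toy stand-in for the unit under test."""
        if airspeed_kts < 0:
            raise ValueError("airspeed cannot be negative")
        return 30.0 if airspeed_kts < 140 else 10.0

    @pytest.mark.parametrize("airspeed,expected", [(100, 30.0), (200, 10.0)])
    def test_req_sw_042_flap_schedule(airspeed, expected):
        """Traces to REQ-SW-042: flap angle shall follow the airspeed schedule."""
        assert compute_flap_angle(airspeed) == expected

    def test_req_sw_043_rejects_negative_airspeed():
        """Traces to REQ-SW-043: negative airspeed shall be rejected."""
        with pytest.raises(ValueError):
            compute_flap_angle(-1)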
How do you know you haven't unknowingly broken something when you made a change?
I think if:
- the code base implements many code paths depending on options and user inputs, such that a fix for code path A may break code path B
- it takes a great deal of time to run in production, such that issues may only be caught weeks or months down the line, when it becomes difficult to pinpoint their cause (not all software is real-time or web)
- any given developer does not have it all in their head, such that they can anticipate issues codebase-wide
then it becomes useful to have (automated) tests that check a change in function A didn't break functionality in function B that relies on A in some way, tests that are just thorough enough to catch edge cases but don't take prod levels of resources to run.
Now I agree some things might not need testing beyond implementation. Things that don't depend on other program behavior, or that check their inputs thoroughly, and are never touched again once merged, don't really justify keeping unit tests around. But I'm not sure these are ever guarantees (especially the never touched again).
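A minimal illustration of that A/B coupling (all names invented): a regression test on B is what catches a "fix" to A that silently changes a convention B relies on.

    def parse_amount(text: str) -> int:
        """A: parse a money string like '12.50' into integer cents."""
        dollars, _, cents = text.partition(".")
        return int(dollars) * 100 + int(cents or 0)

    def invoice_total(lines: list[str]) -> int:
        """B: sum line amounts, relying on A's cents convention."""
        return sum(parse_amount(line) for line in lines)

    def test_invoice_total_regression():
        # If someone changes parse_amount to return float dollars instead,
        # this fails immediately and points at the break in B.
        assert invoice_total(["1.00", "2.50"]) == 350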
The value of tests is that when they fail they show you something you broke without realizing it. 80% (likely more, but I don't know how to measure) of the tests I write could safely be thrown away because they will never fail again - but I don't know in advance which tests will fail and thus inform me that I broke things.
The system I'm working on has been in production for 12 years - we have added a lot of new features over those years. Many of those needed us to hook into existing code, tests help us know that we didn't break something that used to work.
Maybe that helps answer the question of why they are important to me. They might not be for your problems.
I think the whole concept of testing confuses a lot of people. I know I was (and still sometimes am) confused about the various "best practices" and opinions around testing, as well as how/what/when to test.
For my projects, I mainly want to Get Shit Done. So I write tests for the major functional areas of the business logic, mainly because I want to know ASAP when I accidentally break something important. When a bug is found that a test didn't catch, that's usually an indicator that I forgot a test, or need to beef up that area of functional testing.
I do not bother with TDD, or tests that would only catch cosmetic issues, and I avoid writing tests that only actually test some major dependency (like an ORM).
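A hedged sketch of what a test for a "major functional area of the business logic" can mean in practice - exercise the rule directly, not the ORM or the UI (apply_discount and its tiers are made up):

    def apply_discount(subtotal_cents: int, loyalty_years: int) -> int:
        """Hypothetical business rule: 5% off per 2 loyalty years, capped at 20%."""
        pct = min(5 * (loyalty_years // 2), 20)
        return subtotal_cents * (100 - pct) // 100

    def test_discount_tiers():
        assert apply_discount(10_000, 0) == 10_000   # no discount
        assert apply_discount(10_000, 2) == 9_500    # first tier
        assert apply_discount(10_000, 10) == 8_000   # capped at 20%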
If the organization you are in does not value testing, you are probably not going to change their mind. But if you have the freedom to write worthwhile tests for your contributions to the code, doing so will probably make you a better developer.
Yes
I worked for a company that had no tests.
I worked on the core software, new employee, the programmer who wrote the software gone...
Regularly released new features and found out, some days later, that I'd broken some peripheral, but important, business logic.
Drove me mad! I was not allowed to write tests, it was "unproductive"
Follow-up questions: Do you test manually? Why? Do you debug manually? Why?
You wanted examples: https://github.com/openjdk/jdk/tree/master/test/jdk/java/uti...
I do test manually in Salesforce. Mainly it's because you do not control everything, and I find the best test is to log in as the user and go through the screens as they do. I built up some Selenium scripts to do testing.
In the old days, for the kinds of things I had to work on, I would test manually. Usually it is a piece of code that acts as glue to transform multiple data sources in different formats into a database to be used by another piece of code.
Or an AWS Lambda that had to ingest some JSON and make a determination about what to do: send an email, change a flag, that sort of thing.
Not saying mock testing is bad. Just seems like overkill for the kinds of things I worked on.
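For what it's worth, here is a hedged sketch of how glue like that can get a cheap test without any mocking, by keeping the decision logic out of the AWS plumbing (all names are made up):

    import json

    def decide_action(body: dict) -> str:
        """Pure decision logic: what to do with an incoming record."""
        if body.get("opt_out"):
            return "set_flag"
        if body.get("email"):
            return "send_email"
        return "ignore"

    def handler(event, context=None):
        """Thin Lambda wrapper (untested glue) around the tested logic."""
        return {"statusCode": 200, "action": decide_action(json.loads(event["body"]))}

    def test_decide_action():
        assert decide_action({"opt_out": True}) == "set_flag"
        assert decide_action({"email": "a@b.com"}) == "send_email"
        assert decide_action({}) == "ignore"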
I haven't worked in a codebase in 20 years that didn't have some sort of tests.
Out of interest, what language ecosystems do you tend to work in?
My guess is that some languages - like Go - have a more robust testing culture than other languages like PHP.
Not who you asked, but I think it comes down to risk/reward. The consequences of some user finding a bug in most websites are low, compared to the risk of an astronaut finding a bug the hard way whilst attempting re-entry.
There is genuinely a reasonable and rational argument that “testing requires more effort than fixing the issues as users find them” if the consequences are low. See video games, which are notorious for this.
So, industry is more important than language I’d say.
I don't see testing as a quality thing any more, I see it as a developer productivity thing.
If my project has tests I can work so much faster on it, because I can confidently add tests and refactor and know that I didn't break existing functionality.
You gotta pay that initial cost to get the framework in place though. That takes early discipline.
Reading this article, it seems outdated and therefore quaint in some areas now. The “we’ve all felt that moment of staring at a small bit of simple code that can’t possibly be failing and yet it does” - I so rarely experience this anymore, as I’d have an LLM take a look and they tend to find these sorts of “stupid” bugs very quickly. But my earlier days were full of these issues, so it’s almost nostalgic for me now. Bugs nowadays are far more insidious when they crop up.
Trying to get LLMs to understand bugs that I myself am stuck on has had an approximately 0% success rate for me.
They're energetic "interns" that can churn out a lot of stuff fast but seem to struggle a lot with critical thinking.
In the age of LLMs, debugging is going to be the large part of time spent.
Interesting, I actually find LLMs very useful at debugging. They are good at doing mindless grunt work and a great deal of debugging in my case is going through APIs and figuring out which of the many layers of abstraction ended up passing some wrong argument into a method call because of some misinterpretation of the documentation.
Claude Code can do this in the background tirelessly while I can personally focus more on tasks that aren't so "grindy".
> In the age of LLMs, debugging is going to be the large part of time spent.
That seems a premature conclusion. LLMs are quite good at debugging and much faster than people.
The real question is whether “debugging” the LLM is going to be as effective as debugging the code.
IME it pays dividends but it can be really painful. I’ve run into a situation multiple times where I’m using Claude Code to write something, then a week later while working it’ll come up with something like “Oh wait! Half the binaries are in .Net and not Delphi, I can just decompile them with ilspy”, effectively showing the way to a better rewrite that works better with fewer bugs that gets done in a few hours because I’ve got more experience from the v1. Either way it’s tens of thousands of lines of code that I could never have completed myself in that amount of time (which, given problems of motivation, means “at all”).
LLMs are where you need the most tests.
You want them writing tests, especially in critical sections; I'll push to 100% coverage. (Not all code goes there, but things that MUST work or everything crumbles? Yeah, I do it.)
There was one time I was doing the classic: pull a bug, find 2 more. And I just told the LLM: "100% test coverage on the thing giving me problems." It found 4 bugs, fixed them, and that functionality has been rock solid since.
100% coverage is not a normal tool, but when you need it, man does it help.
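For what it's worth, in a Python codebase one way to apply that kind of targeted full coverage is pytest-cov's fail-under gate, e.g. (the package and test paths here are just illustrative):

    pytest tests/test_retry.py --cov=billing --cov-report=term-missing --cov-fail-under=100

The run fails if any measured line goes unexecuted, which is exactly the "must work" pressure described above.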