
Vibecoding #2

Well, for me, kind of an IT jack of all trades, a little programming, a little server management, a little DBA, HTML, network and domain stuff, etc., yeah, a little bit of everything under my belt, I am finding Cursor incredibly enabling. You have heard it here before, I know, but I really wish I had this when I was in the trenches. I am retired now. I use it for various little programs I am writing and one big project. I use Cursor with Opus 4.5 mostly, and I'm finding that none of my questions and none of my requests have hit a brick wall. Some walls for sure, but not the kind of brick walls I would run into in the past, where I would have no one to turn to immediately, and those I could turn to were also very busy with their own s**, sometimes taking hours or even days to get through to. All that's gone. With the help of AI I can usually work out any kind of problem I have. Now, as for the quality of the code, well, that may be another story. It might be twice as much code as a more experienced programmer would write, but so far, in my experience, I have not seen anything that looks untoward.

Bottom line is that I am extremely grateful for AI as a teammate. As a solopreneur, even more so. I'm building an application that I know would have taken at least $10K to $20K to build, but all I'm paying is $60 a month for Cursor Pro+ and my public-facing server. And it's only $60 because I ran into a Cursor Claude limit.

Buckle up guys and gals, the midwit you always feared has the keys to the tank now...

a day agojeingham

I'm curious, what stopped you from learning the information you needed to complete these bigger projects before LLMs?

a day agogtowey

For me, mostly time: time to learn it, time it takes to complete these projects. We have so many other things to do; why bother learning the details of a specific language or tool if AI can do it in minutes? That leaves more time to learn about architecture/management/UX/design/guitar/etc.

a day agoako

But couldn't you then extend the argument to everything? Like why learn design if AI can do it in minutes? Or why learn guitar when AI can create music in minutes?

It's always worth learning something if you enjoy it, and the same applies to code and languages. You can definitely create better apps knowing the details of a specific language than not knowing them, and I think it's still worth doing if you care about the ultimate quality of your work.

15 hours agoaltmanaltman

I think architecture and UX have more impact on the quality of the software you write for the end user than the details of a specific language. And when you're creating guitar training software, music and guitar playing knowledge has more impact on the quality of the software, than the details of a specific language.

When working with an LLM, I care more about prompting it about software architecture, software UX, and the domain we're working on than about the details of the language it uses.

15 hours agoako

> I think architecture and UX have more impact on the quality of the software you write for the end user than the details of a specific language. And when you're creating guitar training software, music and guitar playing knowledge has more impact on the quality of the software, than the details of a specific language.

Hard disagree on both points. You're talking about "impact," but surely you'll be a better coder if you can actually, you know, code? The other stuff is important, sure, but if you literally cannot read the code and just pleasure yourself with dreams of architecture and UX, what you're generating is 99% bad quality.

But prove me wrong, would love to see something you've made.

9 hours agoaltmanaltman

Yup, Time.

I'm confident I can do anything with enough time. But I only have so much.

AI is going to enable so many more ideas to come to fruition and a better world because of it!

a day agoElijahLynn

I suppose that when someone is retired, there is more time for doing stuff, but time is running out…

If someone is in their 30s or 40s, planning to work the next 5+ years on a project is no problem, even if it takes 10+ years in the end.

For those 65 or older, it's a different story…

a day agosparky4pro

I can tell you from my perspective that it really is a different story when you're over 65; I'm 73, so it's even more different. The obligations that distract just keep coming. I'm just having fun with it at this point. I just can't imagine what you guys are facing right now. Some existential s**. It's like you were swinging through the trees and all the trees disappeared, and now you've got to learn how to live in the desert. You can do it!

a day agojeingham

I'm such a noob for an OFG; I responded to you at the top of the post. TL;DR: so much stuff got in the way, mostly of my own creation. A lot of excuses, but they all seemed reasonable at the time.

a day agojeingham

TS + Deno + dax is my favorite scripting environment. (Bun has a similar $ function built in.) For parsing CLI args, I like the builders from Cliffy (https://cliffy.io/) or Commander.js because you get typed options objects and beautiful help output for free.
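For a flavor of what that looks like in practice, here's a minimal dax sketch (the import specifier and commands are illustrative; check the dax docs for the current release):

    // dax: shell scripting in TypeScript on Deno
    import $ from "https://deno.land/x/dax/mod.ts"; // or "jsr:@david/dax"

    // run a command and capture stdout as text
    const branch = (await $`git rev-parse --abbrev-ref HEAD`.text()).trim();

    // interpolated values are quoted/escaped for you
    await $`echo building ${branch}`;

    // non-zero exit codes throw by default, so failures aren't silently ignored
    await $`deno test`;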

If you want to script in Rust, xshell (https://docs.rs/xshell/latest/xshell/) is explicitly inspired by dax.

a day agodcre

matklad made xshell

a day agoai_

Oh! Well that makes sense.

a day agodcre

The tension between 'learning fundamentals' and 'shipping products' has always existed in software development, but AI coding assistants make it more acute.

What's interesting is that the bottleneck is shifting. For experienced developers, the constraint was never typing speed or recalling syntax - it was understanding the problem domain, making architectural decisions, and maintaining systems over time. AI tools amplify this: they make the gap between 'can generate code' and 'can build maintainable systems' even wider.

The real question isn't whether to use AI tools, but how they change what's worth learning deeply. If AI can scaffold boilerplate, then understanding why certain patterns exist becomes more valuable, not less. The ability to evaluate AI-generated code, spot subtle bugs, or recognize when it's taking you down a bad architectural path - these skills require deep knowledge.

For solopreneurs and builders, the calculation is different. Getting something working that creates value for users is often better than perfect code that ships too late. The key is being honest about the tradeoffs: move fast with AI, but budget time to understand what you've built before scaling it.

4 hours agoharry34

> I am at the tail end of AI adoption, so I don’t expect to say anything particularly useful or novel.

Are they really late? Has everyone started using agents and paying $200 subscriptions?

Am I the one who's wrong here, or are these expressions of "falling behind" creating weird FOMO in the industry?

EDIT: I see the usefulness of these tools, however I can't estimate how many people use them.

a day agodudewhocodes

>Has everyone started using agents and paying $200 subscriptions?

If anything, in my small circle the promise is waning a bit, in that even the best models on the planet are still kinda shitty for big project work. I work as a game dev and have found agents only mildly useful for doing more of what I've already laid out; I only pay for the $100 annual plan with JetBrains, and that's plenty. I haven't worked at a big business in a while, but my ex-coworkers are basically the same. A friend only uses chat now because the agents were "entirely useless" for what he was doing.

I'm sure someone is getting use out of them making the 10 billionth Node.js Express API, but not anyone I know.

a day agoNothingAboutAny

I’m using it for scripts to automate yak shaving type tasks. But for code that’s expected to last, folks where I work are starting to get tired of all the early 2000s style code that solves a 15 LOC problem in 1000 lines through liberal application of enterprise development patterns. And, worse, we’re starting to notice an uptick in RCA meetings where a contributing factor was freshman errors sailing through code review because nobody can properly digest these 2,000 line pull requests at anywhere near the pace that Claude Code can generate them.

That would be fine if our value delivery rate were also higher. But it isn't. It seems to actually be getting worse, because projects are more likely to get caught in development hell. I believe the main problem there is that poorer collective understanding of generated code, combined with the apparent ease of vibecoding a replacement, leads teams to choose major rewrites over surgical fixes more often.

For my part, this “Duke Nukem Forever as a Service” factor feels the most intractable. Because it’s not a technology problem, it’s a human psychology problem.

a day agobunderbunder

So glad that I'm not the only one struggling with these huge generated PRs that are too big to honestly review, all while an AI reassuringly whispers in my ear "just trust me."

Don't get me wrong, overall I really like having AI in my workflow and have gotten many benefits. But even when I ask it to check its own work by writing test cases to prove that properties A, B and C hold, I just end up with thousands more lines of unit and integration tests that then take even more time to analyze -- like, what exactly is being tested here? Are the properties these tests purport to prove even the properties that I care about and asked the agent for in the first place? And so on.

I have tried (with at least modest success) to use a second or third agent to review the work of the original coding agent(s), but my general finding has been that there is no substitute for actual human understanding from a legitimate domain expert.

Part of my work involves silicon design, which requires a lot of precision and complex timing issues, and I'll add that the best AI success I've had in those cases is a test-first approach (TDD), where I hand write a boatload of testbenches (that's what we call functional tests in chip design land), then coach my various agents to write the Verilog until my `make test` runs with no errors.

a day agofulladder

Yeah, it seems the usual front/back complexity is well within the training corpus of Gemini, so you get good-enough output.

a day agoagumonkey

Definitely FOMO. I have tried it once or twice and saw absolutely zero value in it. I will stick to writing the code by hand, even the "boring" parts. If I have to sit down and review it anyway, I might as well go and write it myself.

Especially considering that these $200 subscriptions are just the start, because those companies are still mostly operating at a loss.

It's either going to be higher fees or ads pushed into the responses. The last thing I need is my code sprinkled with ads.

a day agorootnod3

> saw absolutely zero value in it

At the very least, it can quickly build throwaway productivity enhancing tools.

Some examples from building a small education game:

- I needed to record sound clips for the game. I vibe coded a webapp in <15 minutes that had a record button and keyboard shortcuts to progress through the list of clips I needed, and it output all the audio as over 100 separate files in the folder structure and with the file names I needed; it also wrote the ffmpeg script to post-process the files.

- I needed JSON files for the path of each letter. Gemini 3 converted images to JSON, and then Codex built me an interactive editor to tidy up by hand the bits Gemini got wrong.

The quality of the code didn't matter because all i needed was the outputs.

The final games can be found at https://www.robinlinacre.com/letter_constellations and https://www.robinlinacre.com/bee_letters/ ; code: https://github.com/robinL/

a day agoRobinL

So using something once or twice is plenty to give it a fair shake?

How long did it take to learn how to use your first IDE effectively? Or git? Or basically any other tool that is the bedrock of software engineering.

AI fools people into thinking it should be really easy to get good results because the interface is so natural. And it can be for simple tasks. But for more complex tasks, you need to learn how to use it well.

a day agobrokencode

So is it strictly necessary to sign up for the 200 a month subscription? Because every time, without fail, the free ChatGPT, Copilot, Gemini, Mistral, Deepseek whatever chatbots, do not write PowerShell faster than I do.

They “type” faster than me, but they do not type out correct PowerShell.

Fake modules, out-of-date module versions, fake options, fake expectations of object properties. Debugging their output makes them a significant slowdown compared to just typing, looking up PowerShell commands manually, and using the -Help and Get-Help features in my terminal.

But again, I haven't forked over money for the versions that cost hundreds of dollars a month. It doesn't seem worth it, even after 3 years. Unless the paid version is 10 times smarter with significantly fewer hallucinations, the quality doesn't seem worth the price.

a day agokemotep

Not necessary. I use the Claude/ChatGPT ~$20 plans. Then you get access to the CLI tools, Claude Code and Codex. With the web interface, they might hallucinate because they can't verify anything. With the CLI, they can test their own code and keep iterating on it. That's one of the main differences.

a day agoazuanrb

> So is it strictly necessary to sign up for the 200 a month subscription?

No, the $20/month plans are great for minimal use

> Because every time, without fail, the free ChatGPT, Copilot, Gemini, Mistral, Deepseek whatever chatbots, do not write PowerShell faster than I do.

The exact model matters a lot. It's critical to use the best model available to avoid wasting time.

The free plans generally don't give you the best model available. If they do, they have limited thinking tokens.

ChatGPT won't give you the Codex (programming) model. You have to be in the $20/month plan or a paid trial. I recommend setting it to "High" thinking.

Anthropic won't give you Opus for free, and so on.

You really have to use one of the paid plans or a trial if you want to see the same thing that others are seeing.

a day agoAurornis

You are exposing the fact that you haven't learned how to use the tools.

Tools like GitHub copilot can access the CLI. It can look up commands for you. Whatever you do in the terminal, it can do.

You can encode common instructions and info in AGENTS.md to say how and where to look up this info. You can describe what tools you expect it to use.
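As a purely illustrative example, an AGENTS.md for a PowerShell repo might be as short as this (the contents are mine, not a prescribed format):

    # Agent instructions
    - Scripts in this repo target PowerShell 7.
    - Before using a cmdlet or parameter you are unsure about, verify it in the
      terminal with `Get-Help <cmdlet> -Full` or `Get-Command -Module <module>`;
      do not guess.
    - Run PSScriptAnalyzer (`Invoke-ScriptAnalyzer -Path <script>`) on anything
      you change and fix the findings before you finish.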

There are MCPs to help hook up other sources of context and info the model can use as well.

These are the things you need to learn to make effective use of the technology. It’s not as easy as going to ChatGPT and asking a question. It just isn’t.

Too many people never get past this low level of knowledge, then blame the tool.

a day agobrokencode

I hate that Microsoft did this but I meant Microsoft 365 Copilot. Not Github Copilot. The Copilot I am talking about does not have those capabilities.

a day agokemotep

GitHub Copilot has a free tier as well. The $20/month one gives you much better models though.

All I’m saying is that the vast majority of people who say that AI dev tools don’t work and are a waste of time/money don’t know how and really haven’t even made a serious attempt at learning how to use them.

a day agobrokencode

Well I am not a dev so I am just using the freely available search assist and chatbots. I am not saying the dev tools don’t work; I am saying the chatbot makes up fake PowerShell commands. If the dev tool version is better it still seems significantly less efficient and more expensive than just running “Get-Help” in the terminal from my perspective.

a day agokemotep

You are not disproving my point. You are just repeating that you don't want to try to learn how to actually use AI tools to help you work, yet you still want to complain online that they are a waste of time and money.

a day agobrokencode

To be fair, there seems to be a weird dissonance between the marketing (fire your workers because AI can do everything now) and the reality (actually, you need to spend time, effort, and expertise to set up a good environment for AI tools and to monitor them).

So when people just YOLO the latter, they don't get the results they expect.

I'm personally in the middle; chat interface + scripts seems to be best for my productivity. Agentic stuff feels like a rabbit hole to me.

20 hours ago3vidence

I'm on the $20 plan with Claude. It's worth mentioning that Claude and Codex both support per token billing, if your usage is so light that $20 is not worth it.

But if you use them for more than a few minutes, the tokens start adding up, and the subscriptions are heavily discounted relative to the tokens used.

There are also API-neutral tools like Charm Crush which can be used with any AI provider with API keys, and work reasonably well (for simple tasks at least. If you're doing something bigger you will probably want to use Claude Code).

Although each AI appears to be "tailored" to the company's own coding tools, so you'll probably get better results "holding it right".

That being said, the $3/month Z.ai sub also works great in Claude Code, in my experience. It's a bit slower and dumber than actual Claude, so I just went for the real thing in the end. 60 cents a day is not so bad! That's like, 1/3 of my canned ice coffee... the greater cost is the mental atrophy I am now undergoing ;)

a day agoandai

No, it's not necessary to pay $200/mo.

I haven't had an issue with a hallucination in many months. They are typically a solved problem if you can use some sort of linter / static analysis tool: you tell the agent to run your tool(s) and fix all the errors. I am not familiar with PowerShell at all, but a quick GPT query tells me that there is PSScriptAnalyzer, which might be good for this.

That being said, it is possible that PowerShell is too far off the beaten path and LLMs aren't good at it. Try it again with something like TypeScript - you might change your mind.

a day agojohnfn

I'm unconvinced that you can learn to use it well while it's moving so quickly.

Whatever you learn now is going to be invalid and wasteful in 6 months.

a day ago8note

Who cares if it’s better in 6 months if you find it useful today?

And I reject that anything you learn today will be invalid. It’ll be a base of knowledge that will help you understand and adopt new tools.

a day agobrokencode

It can also backfire and sometimes give you absolute made-up nonsense. Or waste your whole day moving in a circle around a problem.

a day agodrw85

Good news, if you upgrade to our $300 plan you can avoid all ads, which will instead be injected into the code that you ship to your users.

a day agokibwen

Regarding the $200 subscription: for Claude Code with Opus (and also Sonnet) you do need that, yes.

I had ChatGPT Codex GPT5.2 high reasoning running on my side project for multiple hours over the last few nights. It created a server deployment for QA and PROD plus client builds. It waited for the builds to complete, got the logs from GitHub Actions, and fixed problems. Only after 4 days of this (around 2-4 hours of active coding per day) did I reach the weekly limit of the ChatGPT Plus plan (23€). Far better value so far.

To be fully honest, it fucked up one Flyway script. I have to fix this myself now :D. I will write a note in the Agent.md to never alter existing scripts. But the work otherwise was quite solid, and now my server is properly deployed. If I switched between high reasoning for planning and medium reasoning for coding, I would get even more usage.

a day agomixermachine

> ChatGPT Codex GPT5.2 high reasoning

"... brought to you by Costco."

But seriously, I can't help but think that this proliferation of massive numbers of iterations on these models and productizations of the models is an indication that their owners have no idea what they are doing with any of it. They're making variations and throwing them against the wall to see what sticks.

a day agomoron4hire

It's really not that hard.

Codex = The model trained specifically for programming tasks. You want this if you're writing code.

GPT5.2 = The current version. You don't have to think about this, you just use the latest.

High Reasoning = A setting you select for balancing between longer thinking time or quicker answers. It's usually set and forget.

a day agoAurornis

I've paid, but I am usually quick to adopt/trial things like this.

I think for me it's a case of fear of being left behind rather than missing out.

I've been a developer for over 20 years, and the last six months has blown me away with how different everything feels.

This isn't like jQuery hitting the scene, PHP going OO, or one of the many "this is a game changer" experiences I've had in my career before.

This is something else entirely.

a day agoCurleighBraces

Is it just because it feels faster, or are you actually satisfied with the code that is being churned out? And what about the long-term prospects of maintaining said code?

a day agorootnod3

I'm currently testing Claude Code for a project where it isn't coding. But the workflows built with it are now making me money after ~2 weeks, and I've previously done the same work manually, so I know the turnaround time: The turnaround for each deliverable is ~2 days with Claude and the fastest I've ever done it manually was 21 days. (Yes, I'm being intentionally vague - there isn't much of a moat for that project given how close Claude gets with very little prompting)

There are absolutely maintainability challenges. You can't just tell these tools to build X and expect to get away with not reviewing the output and/or telling it to revise it.

But if you loosen the reins and review finished output rather than sit there and metaphorically look over its shoulder for every edit, the time it takes me to get it to revise its work until the quality is what I'd expect of myself is still a tiny fraction of what it'd take me to do things manually.

The time estimate above includes my manual time spent on reviews and fixes. I expect that time savings to increase, as about half of the time I spend on this project now is time spent improving guardrails and adding agents etc. to refine the work automatically before I even glance at the output.

The biggest lesson for me is that when people are not getting good results, most of the time it's because they keep watching every step their agent takes, instead of putting in place a decent agent loop (create a plan for X; for each item on the plan: run tests until it works, review your code and fix any identified issues, repeat until the tests and review pass without any issues) and letting the agent work until it stops before they spend time reviewing the result.
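Spelled out as an actual prompt, a loop like that might look roughly like this (the wording and the PLAN.md file name are illustrative, not a recipe):

    Create a plan for <task> as a checklist in PLAN.md.
    For each unchecked item:
      1. Implement it.
      2. Run the test suite; fix failures and re-run until it passes.
      3. Review your own diff and fix anything you find.
      4. Repeat steps 2-3 until tests and review both come back clean.
      5. Check the item off in PLAN.md and commit.
    Stop only when every item is checked or you are genuinely blocked.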

Only when the agent repeatedly fails to do an assigned task adequately do I "slow it down" and have it do things step by step to figure out where it gets stuck / goes wrong. At which point I tell it to revise the agents accordingly, and then have it try again.

It's not cost effective to have expensive humans babysit cheap LLMs, yet a lot of people seem to want to babysit the LLMs.

a day agovidarh

Let's put it this way: I don't think AI will take my job/career away until company owners are also prepared to let it handle being on-call. I am still very accountable for the code produced.

I basically have two modes

1. "Snipe mode"

I need to solve problem X. Here I fire up my IDE, start Codex, and begin prompting to find the bug fix. Most of the time I have enough domain context about the code that once it has found and fixed the issue, it's trivial for me to reconcile that it's good code and I am shipping it. I can be sniping several targets at any one time.

Most of my day-to-day work is in snipe mode.

2. "Feature mode"

This is where I get agents to build features/apps. I've not used this mode in anger for anything other than toy/side projects, and I would not be happy about the long-term prospects of maintaining anything I've produced.

It's stupidly stupidly fun/addictive and yes satisfying! :)

I rebuilt a game that I used to play when I was 11, one that still had a small community of people actively wanting to play it, entirely by vibe coding. It works, it's live, and honestly some of the most rewarding feedback of my career has come from complete strangers playing it!

I've also built numerous tools for myself and my kids that I'd never have had time to build before, and now I can. Again, the level of reward for building apps that my kids (and their friends) are using is very different from anything I've experienced career-wise.

a day agoCurleighBraces

You must share that game. I don’t even know what it is and I want to play it!

a day agojannyfer

I fear you'll be very disappointed :joy:

It doesn't work on mobile, and unless you played it back in the day... well, the feedback from my friends who I've introduced it to is that it's got quite the learning curve.

https://playbattlecity.com/

You can see all the horrible vibe coding here ( it's slop, it's utter utter slop, but it's working slop )

https://github.com/battlecity-remastered/battlecity-remaster...

a day agoCurleighBraces

lol this might have been a mistake, this is the most players it's ever had on it....

a day agoCurleighBraces

If your job is going to be reduced to ops it's a different job.

a day agoesafak

Ah, sorry, that wasn't the point I was trying to make.

I think ultimately I've succumbed to the fact that writing code is no longer a primary aspect of my job.

Reading/reviewing and being accountable for code that something else has written very much is.

a day agoCurleighBraces

It's blown me away also.

I'm also fairly confident having it write my code is not a productivity boost, at least for production work I'd like to maintain long term.

a day agoAstroBen

> Are they really late? Has everyone started using agents and paying $200 subscriptions?

No, most programmers I know outside of my own work (friends, family, and old college pals) don't use AI at all. They just don't care.

I personally use Cursor at work and enjoy it quite a bit, but I think the author is maybe at the tail end of _their circle's_ adoption, but not the industry's.

a day agojjice

I can't figure out if I'm at the tail end of adoption, or the leading edge of disillusionment. I guess being able to say where you are in relation to the herd, depends on knowing where the herd is and which way it's headed. Which I don't know. All I know is, it seems to take longer to write the prompt, wait for the output, and then verify/correct the output, iteratively mind you, than to just write the goddamn code. And said process, in addition to being equal or longer, is also boring as fuck the entire time, and deeply annoying about half the time. Nobody is pressuring me to use it, but if this is the future, then I'm ready to change to a different career where I actually enjoy the work.

a day agordiddly

I do not pay for any AI nor does my employer pay for it on my behalf. It will stay this way for as long as I can make that work while remaining employed.

a day agosodapopcan

That's like being proud of not using Google or Stack Overflow and only reading manuals, or using Notepad instead of an IDE (or an editor with language server support).

A $10 GitHub Copilot or $20 ChatGPT/Claude subscription gets you a long way.

And if the employer isn't willing to spend this little money to improve their workers' productivity, they're pretty dumb.

There are valid concerns, like privacy and OSS licences. But lack of value or gain in productivity isn't one of them.

11 hours agozapnuk

What kind of work do you do?

a day agoquijoteuniv

I'm a developer.

a day agosodapopcan

Are you a programmer?

The $20/mo I pay is quite affordable given the ROI.

I could see jumping between various free models.

a day agoPlatoIsADisease

With the $20/month Claude subscription I frequently run into the session limit on after-work hobby projects. If the majority of your day job is actual programming (and not people management, requirements engineering, QA, etc., which is admittedly the reality of many "developer" jobs), the $200/month version seems almost required to have a productive coding assistant.

a day agowongarsu

How are you using it? I'm curious if you hit the limit so quickly because you're running it with Claude Code and so it's loading your whole project into its context, making tons of iterations, etc., or if you're using the chat and just asking focused questions and having it build out small functions or validate code quality of a file, and still hitting the limit with that.

Not because I think either way is better, just because personally I work well with AI in the latter capacity and have been considering subscribing to Claude, but don't know how limiting the usage limits are.

a day agorhines

The $20/month will go fast if you're trying to drive the LLM to do all the coding.

It also goes very fast if you don't actively manage your context by clearing it frequently for new tasks and keeping key information in a document to reference each session. Claude will eat through context way too fast if you just let it go.

For true vibecoding-style dev where you just prompt the LLM over and over until things are done, I agree that $100 or $200 plans would be necessary though.

a day agoAurornis

I am. I use Deepseek and free-tier ChatJippity as a sometimes-better search.

EDIT: I also wasn't going to say it, but it's not about the money for me; I just don't want to support any of these companies. I'm happy to waste their resources for my benefit, but I don't lean on it too often.

a day agosodapopcan

Well, that's your problem: you are using Deepseek.

It's not even SOTA open source anymore, let alone competitive with GPT/Gemini/Grok.

a day agoPlatoIsADisease

¯\_(ツ)_/¯ Wasn't my point.

a day agosodapopcan

But this matters for your usage of LLMs.

I couldn't use GPT-3 for coding, and Deepseek is at GPT-3 + CoT levels.

a day agoPlatoIsADisease

You're a little too focused on my dig about it being a "sometimes better search," which is fair.

I'm not going to send money every month to billion-dollar companies that capitulate to a goon threatening to annex my country. I accept whatever consequences that has on my programming career.

a day agosodapopcan

Okay so you have some cognitive bias that is ruining your ability to make good decisions. Yikes.

a day agoPlatoIsADisease

Hush, child.

a day agosodapopcan


I have a really simple app that I asked various models to build, but it requires interacting with an existing website.

(“Scrape kindle highlights from the kindle webpage, store it in a database, and serve it daily through an email digest”).

No success so far in getting it to do so without a lot of handholding and manually updating the web scraping logic.

It’s become something of a litmus test for me.

So, maybe there is some FOMO, but in my experience it's a lot of snake oil. Also, at work I manage a team of engineers, and like 2 out of 12 clearly submit AI-generated code. Others stopped using it, or just do a lot more wrangling of the output.

a day agoInsanity

> Are they really late? Has everyone started using agents and paying $200 subscriptions?

If you rephrase the question as "Are most engineers already using AI?" -- because it transcends the specific modality (agents vs chat vs autocomplete) and $200 subscriptions (because so many tools are available for free) -- signs point to "yes."

Adoption seems to be all the way up to 85%-90% in 2025, but there is a lot of variance in the frequency of use:

https://dora.dev/research/2025/

https://survey.stackoverflow.co/2025/

https://newsletter.pragmaticengineer.com/p/the-pragmatic-eng...

If there is FOMO, I'm not sure it's "weird."

a day agokeeda

> Are they really late? Has everyone started using agents and paying $200 subscriptions?

The $20/month subscriptions go a long way if you're using the LLM as an assistant. Having a developer in the loop to direct, review, and write some of the code is much more token efficient than trying to brute force it by having the LLM try things and rewrite until it looks like what you want.

If you jump to the other end of the spectrum and want to be in the loop as little as possible, the $100/$200 subscriptions start to become necessary.

My primary LLM use case is as a hyper-advanced search. I send the agent off to find specific parts of a big codebase I'm looking for and summarize how it's connected. I can hit the $20/month windowed limits from time to time on big codebases, but usually it's sufficient.

a day agoAurornis

Is it FOMO if for $100 a month you can build things that take months, and then refine them, polish them, test them, and have them more stable than most non-AI code has been for the last decade plus? I blame marketing-driven development for why software has gone downhill. Look at Windows as a great example. "We can fix that later" is a lie, but not with a coding agent. You can fix it now.

a day agogiancarlostoro

> Is it FOMO if for $100 a month you can build things that take months

It is the very definition of FOMO if there is an entire cult of people telling you that for a year, and yet after a year of hearing about how "everything has changed", there is still not a single example of amazing vibe-coded software capable of replacing any of the real-world software people use on a daily basis. Meanwhile Microsoft is shipping more critical bugs and performance regressions in updates than ever while boasting about 40% of their code being LLM-generated. It is especially strange to cite "Windows as a great example" when 2025 was perhaps one of the worst years I can remember for Windows updates despite, or perhaps because of, LLM adoption.

a day agoanonymous908213

You misunderstood what I meant about Microsoft as a great example ;) I meant a great example of a bloated piece of software driven by marketing. Are you telling me all the ads in Microsoft products were not the marketing / business department?

a day agogiancarlostoro

For MS, it's currently eroding every single one of their products.

Azure, Office, Visual Studio, VS Code, and Windows are all shipping faster than ever, but so much stuff is unfinished, buggy, incompatible with existing things, etc.

a day agodrw85

"We can fix it later" is not the staple of Marketing Driven Development. It's not why Windows has been getting more user-hostile and invasive, why its user experience has been getting worse and worse.

Enshittification is not primarily caused by "we can fix it later", because "we can fix it later" implies that there's something to fix. The changes we've seen in Windows and Google Search and many other products and services are there because that's what makes profit for Microsoft and Google and such, regardless of whether it's good for their users or not.

You won't fix that with AI. Hell, you couldn't even fix Windows with AI. Just because the company is making greedy, user-hostile decisions, it doesn't mean that their software is simple to develop. If you think Windows will somehow get better because of AI, then you're oversimplifying to an astonishing degree.

a day agoCodeMage

Every place where the marketing types are making us take on dev workloads has always deprioritized bugs. The only place I ever worked where I felt like I could get things done, and done correctly, was one where all the managers were former devs, including the director, and they didn't waste any time taking crap from anyone if a dev needed time to make sure he got something done and done correctly. That didn't mean a license to waste time by any means, but it meant we knew we could get things done correctly. Some of our products were completely off the grid once published and might not see updates for months, years, or ever again.

a day agogiancarlostoro

> Every place where the marketing types are making us take on dev workloads has always deprioritized bugs.

My point is that they will continue to do so no matter how easy it is to fix bugs. It's a people problem, not a tech problem.

an hour agoCodeMage

A hobby of mine was listening to motivational audio tapes from the 1980s.

In those days, the attitude toward professional work was already that if you aren't constantly advancing in your industry, you are falling behind.

a day agoandai

For the past 20 years the population of the internet has been increasingly sorted into filter bubbles, designed by media corporations which are incentivized to use dark patterns and addictive design to hijack the human brain, weaponizing its own emotions against it and creating the illusion of popular consensus. To suggest that someone who has been vibecoding for only a few months is at the tail end of mass adoption is to reveal that one's brain has been pickled by exposure to Twitter. These tools are still extremely undercooked; insert the "meet potential man" meme here.

a day agokibwen

The bulk of developers are probably on the $20 Claude plan or on Cursor, waiting for their company to pay up.

a day agonurettin

When Fortune 500, 100, and 50 organizations are buying AI coding tools at scale (I know from personal experience), then I would say you're late. So yes, late-stage adoption for this wave.

a day agoryanSrich

Thank you to the OP for writing an "LLM-assisted coding" usage report that feels humble and honest. I'm in an evaluation phase myself, and I find it very difficult to find good evaluations from typical developers.

a day agoliampulles

Hey ibobev! I've actually been building something very close to box at a snail's pace for 2 years. I built it because I was working a lot with a bunch of Raspberry Pis, where it was better to compile directly on the Pi than on my Mac, but I didn't want to bother SSHing in or lose my local setup. The major difference with what I have so far is that the tool takes a direnv-style automagical approach to working with multiple machines across multiple projects/directories. It works across Docker and SSH without any extra setup other than the tool on the client side.

I just got native LSP working this past weekend, and in Sublime it's as simple as:

    {
      "clients": {
        "remote-gopls": {
          "command": ["tool", "lsp", "gopls"],
          "enabled": false,
          "selector": "source.go"
        }
      }
    }

From what you built so far, do you think there's any appetite in people paying for this type of tool which lets you spin up infra on demand and gives you all the capabilities built so far? I'm skeptical and I may just release it all as OSS when it gets closer to being v1.0.

a day agoerdaniels

> do you think there's any appetite in people paying for this type of tool which lets you spin up infra on demand and gives you all the capabilities built so far?

(I'm not the author) The easiest way to charge for this kind of software is to make it SaaS, and I think that's pretty gross, especially for a CLI tool.

> I'm skeptical and I may just release it all as OSS

It doesn't have to be one or the other: you could sell the software under libre[1] terms, for example.

[1]: https://en.wikipedia.org/wiki/Free_software

a day agolightandlight

Agreed that SaaS feels ugly. Agreed on selling it as libre. The only thing I could imagine charging for in a subscription model would be hosted services (e.g. instance/GPU provisioning/monitoring/maintenance in the cloud), while also offering the ability to self-host the same machinery.

a day agoerdaniels

Any time you're doing hand-rolled, on-demand spinning up of EC2 instances, be sure they are properly spun down later.

It's very easy to get hit with a massive bill just from leaving instances around.

a day agojasonjmcghee

I run `sudo shutdown +15` (or some other number of minutes) when I need a compute instance and don't want to forget to turn it off. It's a simple trick that will save you in some cases.

a day agohendiatris

For expensive GPU instances I have a crontab one-liner that shuts the node down after 2 hours if I don't touch an override file (/var/run/keepalive).

    */5 * * * * [ -f /var/run/keepalive ] && [ $(( $(date +\%s) - $(stat -c \%Y /var/run/keepalive) )) -gt 7200 ] && shutdown -h now

a day agoapawloski

Sort of like cancelling Disney right after signing up for the free trial.. nice!

a day agotln

I bought a glm-4.7 subscription and paired it with Claude Code. According to the usage stats, I have already used millions of tokens, yet I have barely reached the prompt limits for my tier within the 5-hour window. I haven't done anything crazy with my prompts either. Now, if this were something billed per 1 million tokens, it would have cost me a lot more. Yet apparently the majority of LLM providers bill per 1 million tokens. What's the catch? What am I not understanding? What other providers have a similar usage/pricing pattern? I can't see any of the providers billing per 1 million tokens being useful/cost-effective at all for coding. Granted, I am new to all of this and I want to see what the fuss is about, so perhaps I am blind to something obvious.
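To put rough, made-up numbers on my confusion: at hypothetical metered rates of $2 per million input tokens and $8 per million output tokens, a month with 10 million input and 1 million output tokens would come to 10 × $2 + 1 × $8 = $28, and heavy agentic use can blow well past that, which is presumably how a flat subscription ends up cheaper. Actual rates vary by provider, so treat those figures purely as an illustration.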

20 hours agocandl

@gtowey I think it was basically a lack of ability to focus, due to the myriad distractions in my life: a suboptimal upbringing, family obligations, the fact that I think I'm a little dyslexic, larger career obligations. On top of everything else I had a career in the reserves, and that's where a lot of my prime-time energy went. Also, being a musician, I found it a lot easier to play a musical instrument than to spend the time needed to develop chops programming-wise. And then there's my family; a lot of time spent focused on them. A lot of distractions on top of a pretty poor memory, basically. I can say all that stuff now because yes, I'm an entrepreneur, but I'm also retired. It might actually boil down to a lack of discipline. But who knows. I'll tell you one thing for sure: I am totally getting off on AI-assisted programming. Any question I have, there's an answer that fits within my understanding of the way things work, which is not minimal given all my experience in IT, so whatever it is, it's working.

a day agojeingham

I've been meaning to try programming with AI, but it's a bit tricky to navigate the maze of models one can download and run. I know one can pull down qwen3 or starcoder2 with ollama, but there are multiple variations, and I have no idea what everything means or what the implications are for running them on my PC. I suppose cloud-based offerings simplify all that, but I don't trust them one bit.

a day agotmtvl

Unfortunately, local models are not good yet. For serious work, you'll need Claude/Gemini/OpenAI models. Pretty huge difference.

a day agoazuanrb

Just try out Antigravity (Google) or Claude (17$/month).

Ollama with qwen3 and starcoder2 is OK.

I'd recommend experimenting with the following models atm (e.g. with "open-webui"):

- gpt-oss:20b (fast)

- nemotron-3-nano:30b (good general purpose)

They don't compare to the large LLMs atm, though.

a day agoNicoJuicy

This isn't technically vibe coding, right? This is just using LLMs here and there for details you don't care to learn more about.

a day agocss_apologist

My personal definition/interpretation would be that it counts once AI produces more than 50% of the project's code/work, even if one is reviewing the code (to whatever extent).

a day agoindigodaddy

Love this. People making their own tools for their own problems.

Author needed a thing, it didn’t exist, so they made that thing.

That’s incredible empowerment.

15 hours agocadamsdotcom

> The spec ended up being 6KiB of English prose. The final implementation was 14KiB of TypeScript.

Wait, is this how people vibe code? I thought it was just giving instructions line by line and refining your program. People are really creating a dense, huge spec for their project first?

I have not seen any benefit from AI in programming yet, so maybe I should try it with specs and with auto-complete as well.

a day agonromiun

Yes, definitely! The AI tooling works much like a human: it works better if you have a solid specification in place before you start coding. My best successes have come from using a design document with clear steps and phases; usually the AI creates and reviews that as well, and I eyeball it.

Lots of people are using PRD files for this. https://www.atlassian.com/agile/product-management/requireme...

I've been using checklists and asking it to check off items as it works.

Another nice feature of using these specs is that you can give the AI tools multiple kicks at the can and see which one you like the most, or have multiple tools work on competing implementations, or have better tools rebuild them a few months down the line.

So I might have a spec that starts off:

    #### Project Setup

    - [ ] Create new module structure (`client.py`, `config.py`, `output.py`, `errors.py`)
    - [ ] Move `ApiClient` class to `client.py`
    - [ ] Add PyYAML dependency to `pyproject.toml`
    - [ ] Update package metadata in `pyproject.toml`
And then I just iterate with a prompt like:

    Please continue implementing the software described in the file "dashcli.md".  Please implement the software one phase at a time, using the checkboxes to keep track of what has already been implemented.  Checkboxes ("[ ]") that are checked ("[X]") are done.  When complete, please do a git commit of the changes.  Then run the skill "codex-review" to review the changes and address any findings or questions/concerns it raises.  When complete, please commit that review changeset.  Please make sure to implement tests, and use tests of both the backend and frontend to ensure correctness after making code changes.
a day agolinsomniac

I’ve always heard (despite the incredibly fluid definition) that “vibe” coding specifically was much more on the “not reading/writing code” side of the spectrum vs. AI assisted code writing where you review and tweak it manually

a day agoweakfish

Nice write-up, but one thing that irks me...

> I personally don’t know how to solve it without wasting a day. So, I spent a day vibecoding my own square wheel.

This worries me. In the case of the OP it seems they were diligent and reviewed everything thoroughly, but I can bet that that's not the majority... And pushing to prod an integral piece without fully knowing how it works just terrifies me.

a day agobilekas

Capistrano? Fabric?

a day agowiredfool

Terraform? Ansible?

Using those as hints, I bet CC would have one-shotted it pretty easily.

a day agodarkwater

It feels like GNU parallel with --transfer-file would have solved this problem.

a day agofoobarqux

The only way I would approach a problem like this is with NixOS and nixosTest/runTest. Development iterations can be against local VMs, and then you can fire it at AWS when you're confident it works correctly.

a day agoyakshaving_jgt

I wish I could be slightly more interested in this, to actually see what payoff this person is reporting, but I just can't bring myself to care about this agentic nonsense.

19 hours agofud101

Instructions unclear, Claude just spent three days and millions of tokens rebuilding SLURM from the ground up. /s

a day agojacobtomlinson

Maybe AWS ParallelCluster, which is managed SLURM on AWS.

a day agolukax

This is excellent and innovative, good stuff! I guess my only comment is: why not just Ansible? But this feels like a way simpler and better fit (and more fun/cool! Plus you can easily modify/bend the tool to your liking/needs), just for playing around in your local homelab etc.

a day agoindigodaddy

Lol the downvotes