90% of Claude-linked output going to GitHub repos w <2 stars

Perfect example of a base rate fallacy - https://en.wikipedia.org/wiki/Base_rate_fallacy

What percentage of GitHub activity goes to GitHub repos with less than 2 stars? I would guess it's close to the same number.

My reaction as well -- I have a few dozen public repos of 100% human-written code, most are 0 stars!

The first thing I do when I make a new repo is star it myself ;-)

https://knowyourmeme.com/memes/obama-awards-obama-a-medal

I have a few dozen org repos, of course none of them have stars, who stars their corporate repos?

We need to have a talk about your pieces of flair.

> who stars their corporate repos?

workers on the management track

The actual number is that 98% have less than 2 stars (0 or 1). About 90.25% have zero stars.

I think this is useful in answering the grandparent comment's question:

stars : uniq(k)

1 : 14946505

10 : 1196622

100 : 213026

1000 : 28944

10000 : 1847

100000 : 20

each line (mostly) being equal length provides me an odd comfort

interesting that you only need ~150 stars on a project for it to be in the top 1%
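A rough sanity check on the table above, as a sketch rather than a definitive analysis. It assumes `uniq(k)` means "number of repos with at least k stars" and reuses the 90.25% zero-star figure quoted earlier in the thread to back out an implied total. The helper names are made up for illustration.

```python
# Counts quoted in the thread. Assumption: each value is the number of
# repos with AT LEAST that many stars.
at_least = {
    1: 14_946_505,
    10: 1_196_622,
    100: 213_026,
    1_000: 28_944,
    10_000: 1_847,
    100_000: 20,
}

ZERO_STAR_SHARE = 0.9025  # "About 90.25% ... zero stars", from the thread

starred = at_least[1]
# Implied total number of repos (derived, not measured).
total_repos = starred / (1 - ZERO_STAR_SHARE)


def share_of_all(k: int) -> float:
    """Fraction of all repos with at least k stars."""
    return at_least[k] / total_repos


def share_of_starred(k: int) -> float:
    """Fraction of repos with >= 1 star that have at least k stars."""
    return at_least[k] / starred


for k in at_least:
    print(f">= {k:>6} stars: {share_of_all(k):.4%} of all, "
          f"{share_of_starred(k):.4%} of starred")
```

Under these assumptions the 1% cutoff among starred repos falls somewhere between 100 and 1,000 stars, which fits the ~150 figure if "top 1%" means top 1% of repos with at least one star. Measured against all repos, fewer than 1% reach even 10 stars.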

You should check recent commits, because obviously there are a lot of forked 0 star repos.

How do you know that?

https://ghe.clickhouse.tech/

It is relevant because if the vast, vast majority of repos have 2 or fewer stars, then it's not that weird that a great many of the linked repos also have 2 or fewer stars.

Yeah. Most of my public repos have 0 stars. Most of what I write sucks.

GitHub Stars (or any online 'star count') is not an indicator of quality.

Yeah, but knowing something sucks means you are probably reasonably competent at coding. =3

https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect

Doesn’t matter if the recruiter doesn’t call you back because you’re not a 1000x engineer.

Why would anyone settle for underpaid positions from an agency taking a 7% contract cut, and purging CVs from any external firm also contracting with their services?

Most people figure out this scam very early in life, but some cling to terrible jobs for unfathomable reasons. =3

> Why would anyone settle for

The answer to such questions is always that, given their circumstances, they have no realistic choice not to.

This is very obvious, and it's frustrating to continually see people pretend otherwise.

> they have no realistic choice not to

If folks expect someone to solve problems for them, then 100% of people end up unhappy. The old idea of loyalty buying a 30 year career with vertical movement died sometime in the 1990s.

Ikigai chart will help narrow down why people are unhappy:

https://stevelegler.com/2019/02/16/ikigai-a-four-circle-mode...

Even if folks are not thinking about doing a project, I still highly recommend this crash course in small business contracts

https://www.youtube.com/watch?v=jVkLVRt6c1U

Rule #24: The lawyer's Strategic Truth is to never lie, but also avoid voluntarily disclosing information that may help opponents.

Best of luck =3

+1 star for ttul

Off topic, but it reminds me of another principle: every geographic heatmap is just a population map. https://xkcd.com/1138/

https://reddit.com/r/PeopleLiveInCities/

That, or https://i.redd.it/soy72dye93o91.jpg

Yep, every time I see a heatmap of Australian lotto winners - very high correlation with Australia's population.

shouldn't a serious heatmap (or any comparative graph for that matter) normalize the stat being displayed versus the baseline population in that bucket?

in other words, plot the percentage or average metric and not the absolute metric.

e.g. number of lotto winners per thousand people living in that grid, percentage of starred repos as a percentage of all repos, per capita alcohol consumption, average screen-time etc.
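A minimal sketch of the normalization described above. The regions, populations, and winner counts are entirely made up, and `per_thousand` is a hypothetical helper, so treat the numbers as illustration only.

```python
# Hypothetical data: (region, population, lotto winners). All values invented.
regions = [
    ("Metro", 5_000_000, 500),
    ("Town", 100_000, 12),
    ("Outback", 2_000, 1),
]


def per_thousand(count: int, population: int) -> float:
    """Winners per thousand residents."""
    return 1000 * count / population


# Absolute counts mostly track where people live.
by_absolute = max(regions, key=lambda r: r[2])[0]

# Normalizing by population can change the ranking completely.
rates = {name: per_thousand(winners, pop) for name, pop, winners in regions}
by_rate = max(rates, key=rates.get)

print("most winners:", by_absolute)
print("highest rate per thousand:", by_rate)
```

With this toy data the biggest city has the most winners but the lowest rate, which is the whole point of plotting the per-capita number instead of the raw count.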

There is still a sampling bias if you compare blanket human written repos. I would guess people are far more likely to share their homework assignments, experiments, hackathon results, weekend toys, etc. as a public repo if they put some amount of work into it. I would guess a minority of those would get any stars at all. If the whole thing was generated by AI in less than 20 minutes, I would guess they are more likely to simply throw it away when they are done with it.

Personally I think comparing github stars is always going to be a fraught metric.

Already enough comments about base rate fallacy, so instead I'll say I'm worried for the future of GitHub.

Its business is underpinned by pre-AI assumptions about usage that, based on its recent instability, I suspect are being invalidated by surges in AI-produced code and commits.

I'm worried, at some point, they'll be forced to take an unpopular stance and either restrict free usage tiers or restrict AI somehow. I'm unsure how they'll evolve.

Having managed GitHub enterprises for thousands of developers who will ping you at the first sign of instability.. I can tell you there has not been one year pre-AI where GitHub was fully "stable" for a month or maybe even a week, and except for that one time with Cocoapods that downtime has always been their own doing.

In a (possibly near) future where most new code is generated by AI bots, the code itself becomes incidental/commoditized and it's nothing more than an intermediate representation (IR) of whatever solution it was prompt-engineered to produce. The value will come from the proposals, reviews, and specifications that caused that code to be produced.

GitHub is still code-centric with issues and discussions being auxiliary/supporting features around the code. At some point those will become the frontline features, and the code will become secondary.

The instability is related to their Azure migration isn't it? Cynically you could say it hasn't been helped by the rolling RIFs at Microsoft

I keep hearing this, and I know Azure has had some issues recently, but I rarely have an issue with Azure like I do with GitHub. I have close to 100 websites on Azure, running on .NET, mostly on Azure App Service (some on Windows 2016 VMs). These sites don't see the type of traffic or amount of features that GitHub has, but if we're talking about Azure being the issue, I'm wondering if I just don't see this because there aren't enough people dependent on these sites compared to GitHub?

Or instead, is it mistakes being made migrating to Azure, rather than Azure being the actual problem? Changing providers can be difficult, especially if you relied on any proprietary services from the old provider.

Running on Azure is not the same as migrating to Azure.

Making big changes, like swapping the tech that underpins your product, while still actively developing that product means a lot of things in a complicated system change at once, which is usually a recipe for problems.

Incidentally I think that is part of the current problem with AI generated code. It's a fire hose of changes in systems that were never designed for, or were barely holding together at, their existing rate of change. AI is able to produce perfectly acceptable code at times but the churn is high and the more code the more churn.

Azure is fine, stability wise.

The assumption is it would be mistakes in their migration - edge cases that have to be handled differently either in the infrastructure code, config or application services.

Does anyone actually know? So far I've just seen people guessing, and seeing that repeated.

I don't believe a sudden influx of a few million bots running 24/7, generating PRs and commits and invoking actions, does not impact GitHub.

It even sounds silly when you say it this way.

That is fair, in fact I just came across their recent blog post on this. They're pointing to usage growth as the issue https://github.blog/news-insights/company-news/addressing-gi...

Counterpoint: AI coding without GitHub is like performing a stunt where you set yourself on fire but without a fire crew to extinguish the flames

This.

But also, GitHub profiles and repos were at one point a window into specific developers - like a social site for coders. Now it's suffering from the same problem that social media sites suffer from - AI-slop and unreliable signals about developers. Maybe that doesn't matter so much if writing code isn't as valuable anymore.

Fuck GitHub. It's a corporate attempt at owning git by sprinkling socials on top. I hope it fails.

If you need to host git + a nice gui (as opposed to needing to promote your shit) Forgejo is free software.

The true value prop of GitHub isn't "hosted git + nice gui", it is the whole ecosystem of contributors, forks, and PRs. You don't get that by hosting your own forge.

Also, I wouldn't say GitHub is a corporate attempt to own git... GitHub is a huge part of why Git is as popular as it is these days, and GitHub started as a small startup.

Of course, you can absolutely say Microsoft bought GitHub in an attempt to own git, but I think you are really underselling the value of the community parts of GitHub.

Or they'll just keep forcing policies that let them steal the code you post on GitHub (for their AI training), and make everyone leave that way.

100% of all code I have put on github, using claude or not, is on repos with zero stars.

Just to clarify as OP, the point here is not that Claude is not contributing to serious work, just that the dashboard suggests a lot of usage in public GitHub repos seems to be tied to low attention, high LOC repos. This is at least something to keep in mind when considering the composition of coding agent usage, and when assessing the sustainability of current trends.

In hindsight the headline was a bit more sensational than I meant it to be!

This seems to be the same misunderstanding about agentic coding I see a lot of places.

Agentic coding is not about creating software, it's about solving the problems we used to need software to solve directly.

The only reason I put my agentic code in a repo is so that I can version control changes. I don't have any intention of sharing that code with other people because it wouldn't be useful for them. If people want to solve a similar problem to me, they're much better off making their own solution.

I'm not at all surprised that most Claude-linked output is in low-star repos. The only Claude repos I even bother sharing are those that are basically used as context stores to help other people get up to speed faster with their own CC work.

Do people really put weight in stars? It seems completely unrelated to anything but, well, popularity. Even when I modify other peoples' code I fork to a private repo and maintain my changes separately, and I'm fairly certain I have never starred a repo.

Stars have been useless as signals for project quality for a while. They’re mostly bought, at this point. I regularly see obviously vibe-coded nonsense projects on GitHub’s Trending page with 10,000 stars. I don’t believe 10,000 people have even cloned the repo, much less gotten any personal value from it. It’s meaningless.

I'm with you on all points except for it being bought.

Programming has long succumbed to influencer dynamics and is subject to the same critiques as any other kind of pop creation. Popular restaurants, fashion, movies - these aren't carefully crafted boundary pushing masterpieces.

Pop books are hastily written and usually derivative. Pop music is the same as is pop art. Popular podcasts and YouTube channels are usually just people hopping unprepared on a hot mic and pushing record.

Nobody is reading a PhD thesis or a scholarly journal on the bus.

The markers for the popularity of pop works are fairly independent from the quality of their content. It's the same dynamics as the popular kid at school.

So pop programming follows this exact trend. I don't know why we expect humans to behave foundationally differently here.

For example, it's used as a kind of internal bookmarking system. I don't necessarily star a repo because I think it has good code, but maybe a good idea or something related to something I'm interested in developing.

Stars on GitHub have nothing to do with quality.

They are bookmarks. It is a way to bookmark a repo, and while it might correlate with quality, it isn't a measure of it.

It's more of a signal for investigating "did this get spammed on Reddit or Twitter", "is this new/old/weird hype", and "does this provide real value"

I've seen people "buy" stars enough not to look at them so closely. Maybe will consider whether it has 0-1 or 2-2M.

Maybe not to devs, but I've had VCs ask about them because of popularity so there you go it's a signal to someone.

Whatever reaction you have to this know that my internal reaction and yours were probably close.

it’s my signal for popular forks

Probably not today, but there was a time when you could get funding based on just a github repo with a bunch of stars.

Shout out to Broadwayscore by thomaspryor@github

At 2mo old - nearly a 1GB repo, 24M loc, 52K commits

https://github.com/thomaspryor/Broadwayscore

Polished site: https://broadwayscorecard.com/

I was really confused how this could be possible for such a seemingly simple site but it looks like it's storing + writing many new commits every time there's a new review, or new financial data, or a new show, etc.

Someone might want to tell the author to ask Claude what a database is typically used for...

json in git for reference data actually isn't terrible. having it with the code isn't great, and the repo is massively bloated in other ways, but for change tracking a source of truth, not bad except for maybe it should be canonicalized.

It's not a terrible storage mechanism but 36,625 workflow runs taking between ~1-12 minutes seems like a terrible use of runner resources. Even at many orgs, constantly running actions for very little benefit has been a challenge. Whether it's wasted dev time or wasted cpu, to say nothing of the horrible security environment that global arbitrary pr action triggers introduce, there's something wrong with Actions as a product.
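For a sense of scale, here is a back-of-the-envelope bound on the runner time those numbers imply. It only uses the run count and the ~1-12 minute range quoted above. It assumes one runner per run and ignores how GitHub actually rounds or bills minutes.

```python
# Figures quoted in the comment above.
RUNS = 36_625
MIN_MINUTES, MAX_MINUTES = 1, 12  # approximate per-run duration range

# Lower and upper bounds on total runner time, in hours.
low_hours = RUNS * MIN_MINUTES / 60
high_hours = RUNS * MAX_MINUTES / 60

print(f"{low_hours:,.0f} to {high_hours:,.0f} runner-hours")
```

So somewhere between roughly 610 and 7,325 runner-hours for a two-month-old repo, depending on where in that range the typical run lands.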

It is pretty damn fast though.

Lol @ the proprietary license, you can just copy and use whatever Claude-committed code you want to from that repository.

Can you? My understanding is that AI cannot claim copyright and my assumption would be that copyright law immediately extends authorship to the user operating the AI (or their employer).

AI output can't be copyrighted, copyright applies to human creations.

Substantive transformation of AI output via human creativity can be copyrighted, but if you're sticking to Claude commits, that's AI output.

It looks like my one-star repository [1] came close to making this person's leaderboard for number of commits (currently 5,524 since January, all by Claude Code). I'm not sure what that means, though. Only a small percentage of those commits are code. The vast majority are entries for a Japanese-English dictionary being written by Claude under my supervision. I'm using Github for this personal project because it turned out to be more convenient than doing it on my local computer.

[1] https://github.com/tkgally/je-dict-1

Make your own Github: forgejo.org

One used Lenovo micro PC (size of a book) from eBay will serve you well.

Thanks for the recommendation. I didn’t know about forgejo.org.

The main convenience of Github for me is the ability to send preprepared prompts to Claude through its web interface or the mobile app and have it write or revise a batch of dictionary entries in the repository. I can then confirm the results on the built website, which is hosted on Github Pages, and request changes or reverts to Claude when necessary. Each prompt takes ten to thirty minutes to carry out and I run a dozen or more a day, and it is very convenient to be able to do that prompting and checking wherever I am.

When I have Claude make changes to the codebase, I find that I need to pay closer attention to the process. I can’t do that while sitting in a restaurant or taking a walk like I do with the prompting for dictionary-entry writing. The next time I start a mostly (vibe) coding project, I’ll look into Forgejo.

This is awesome. Your repo is now two stars.

Thanks! The dictionary should be more or less finished in a few months. If you or anyone else might find it helpful for studying Japanese, feel free to use it, copy it, and adapt it however you like.

I'm one of those zero star repos. I've been using Claude Code for some weeks now and built a personal knowledge graph with a reasoning engine, belief revision, link prediction. None of it is designed for stars, it's designed for me. The repo exists because git is the right tool for versioning a system that evolves every day.

The framing assumes github repos are supposed to be products.

Hold on. I'm in the middle of building this[0]! What the heck? Your email isn't in your profile -- reach out.

[0]: https://github.com/ctoth/propstore

Wait a minute! Ha, just saw this. The knowledge graph I mentioned is a separate project (heartwood on my profile). Different angle from propstore but I think we're circling the same problem, conflicting claims that shouldn't be silently resolved. Added my email to my profile now.

https://github.com/rodspeed/heartwood

Who cares?

I used Claude code to build a custom notes application for my specific requirements.

It’s not perfect, but I barely invested 10 hours in it and it does almost everything I could have asked for, plus some really cool stuff that mostly just works after one iteration. I’ll probably open source the code at some point, and I fully expect the project to have less than two stars.

Still, I have my application.

For anyone that’s interested in taking a look, my terrible landing page is at rayvroberts.com

Auto updates don’t work quite right just yet. You have to manually close the app after the update downloads, because it is still sandboxed from when I planned to distribute via the Mac App Store. Rejected in review because users bring their own Claude key.

I have many GH repos, most have no stars. Probably because most of what I write is not very useful to other people due to quality or use case. I would say this is true of most fully human-created repos on GitHub.

- 90% of Claude's repos have <2 stars

- 98% of humans' repos have <2 stars

Claude is 5 times smarter than humans!

The math is a bit of a stretch, but the correlation still holds up.
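For what it's worth, the "5 times" figure falls out of comparing the complements. This sketch assumes both percentages are shares of repos, which is a stretch: the headline's 90% is a share of Claude-linked output, while the 98% is a share of all repos, so the ratio is illustrative at best.

```python
# Shares with fewer than 2 stars, as quoted in the thread.
claude_low_star = 0.90  # assumption: treated here as a share of repos
all_low_star = 0.98

# Complements: share with 2 or more stars.
claude_two_plus = 1 - claude_low_star
all_two_plus = 1 - all_low_star

# How many times more often the Claude-linked share clears 2 stars.
ratio = claude_two_plus / all_two_plus
print(f"{ratio:.1f}x")
```

The low-star shares differ by only 8 percentage points, but the 2+ star shares differ by a factor of about five, which is why the joke works.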

I cannot overstate how much of an improvement that is. If I had a dollar for all the shit I made myself, the old fashioned way, that got 0 attention at all? I'd have enough for a month or two of claude

I hate everything about this headline and metric. As a lifelong graphics programmer from Pentium U/V pipeline assembly optimisation days: so fucking what.

I have never cared about LinkedIn or GitHub stars or any of those bullshit metrics (obviously because I don't score very highly in them), and am enjoying exploring a million things at the speed of thought; get left outside, if it suits you. Smart and flexible people have no trouble using it, and it's amazing.

Rather measure how much I've learnt and created recently compared to before, and get ready for some sobering shit because us experienced old dudes can judge good code from bad pretty well.

I'd betcha a lot more than 90% goes to repositories without any stars at all, or even public code!

Absolutely! I think the real stats will far exceed what we can see on public GitHub. That said, going through some of the top "performers" by commit and line count - I am surprised by how many people have all their code in public repos.

Isn't that expected as well?

The idea with Claude writing code for the most part is that everyone can write software that they need. Software for the audience of one. GitHub is just a place for them to live beyond my computer.

Why will I want to promote it or get stars?

Yeah, but all these internal and not so internal tools I baked with it are great - they solve my own problems - and without LLMs I would never have a chance to implement even 20% of that.

Maybe because people are using Claude to write code for themselves, to scratch their own itch, and upload it to the world just because. The value of code can't be measured in star counts.

Even if that stat were compared directly to the base rate (human output), it could easily be explained by correlating strongly with Claude usage skewing towards new repos.

the more interesting signal in that data is about intent, not quality. most of these low-star repos probably aren't failed open source attempts - they're personal tools that were never meant to be shared. before ai-assisted coding, the effort-to-build ratio was high enough that most personal scripts stayed on a laptop or in a private gist. pushing to a public repo implied an implicit claim that someone else might want this. now the build cost is low enough that people just push things to git for their own version history and move on. what's actually happening is that git is becoming a personal dev journal as much as a collaboration platform. stars were always a weak proxy for value, but they're especially wrong for this use case. the 90% number probably also undercounts the real extent of this - most serious claude code usage is on private repos and internal tooling that never touches public github at all. the 50b lines stat would look very different if you could see total token output vs just github-public-linked output.

It would be very interesting to see how much of this is the "audience of one" type of project - i.e. personal scripts - vs new developers/vibe coders trying to start an app. I have definitely been surprised by the scale of some of the repos that seem to be vibe-coded. People who seem to have no history in development are building game engines, and payroll systems, and Broadway review websites.

Unfortunately that type of analysis would take a bit more work, but I think the repo info and commit messages could probably be used to do that.

Yeah. Because they are mostly private I suspect.

I think the time for an AI Free Code (AIFC™) mark has arrived.

How long does it normally take projects to get stars though? You're not going to have a project with 100+ stars overnight or even within a month, you have to promote the project?

Depends widely on the target audience. In my case, targeting Julia developers who want to package their applications into installers, reaching 100 stars took 2 years - https://peacefounder.org/AppBundler.jl. If I were to target Python developers, I would have many more stars.

It depends on how much you promote your repo and how big it is. I know when my repo gets posted somewhere because I'll get a little burst of stars for a few days and then it'll calm down until it's posted somewhere again. Much larger repos will get stars at a more constant rate as they reach a critical liftoff velocity.

The 2 stars or fewer metric may show one thing. We’re moving from an era of 'open source as a digital monument' to 'open source as a disposable scratchpad.' Not that the code is slop, it’s that the cost of creating a repository has dropped to near zero.

This is just base rate neglect though. Something like 98% of all GitHub repos have <2 stars regardless of how they were made. If 90% of Claude repos have <2 stars that actually means they're outperforming the baseline...

The HN headline is at least misleading, because I suspect a majority of Claude usage is at the enterprise level (deep pockets), which goes to private GitHub repos.

Some of the comments point toward genuine concern, some smell of gatekeeping.

It is interesting to see a flip in attitude toward GitHub.

I have a star on one of my repos. Almost all of my work is only relevant to me or is internal to my org.

What percentage of non-Claude-linked output goes to repos with <2 stars?

Claude is only as good as the prompts it’s given

So wait, 10% is going to repos w>2 stars?

I mean, most of the code that I have written to GitHub with normal human intelligence also goes to GitHub repos with less than two stars. They're usually repos that I create and no one else touches.

At a glance this may read as “most of this code isn’t valuable to others” but reality is probably complected with “this type of code is reducing the need for shared libraries”.

Why is this interesting?

The LLM content piracy to isomorphic plagiarism business loop is unsustainable. Yet for context search it is reasonably useful. =3

https://www.youtube.com/watch?v=T4Upf_B9RLQ

embarrassing

guilty :) 1 Star here - and even that is worthless

Codeberg if you hate AI.

I wonder if there's a critical failure mode / safety feature of our species for some percentage of the population to always dislike whatever some other large percentage of the population likes.

As if it's to prevent the species from over-indexing on a particular set of behaviors.

Like how divisive films such as "Signs", "Cloud Atlas", and even "The Last Jedi" are loved by some and utterly reviled by others.

While that's kind of a silly case, maybe it's not just some random statistical fluke, but actually a function of the species at a population level to keep us from over-indexing and suboptimizing in some local minima or exploring some dangerous slope, etc.

Did we democratise software engineering? Seriously, I created a bunch of tools that I find useful without the bloated framework issues that are present in software nowadays. Joke's on me if something does not work.

Software production yes engineering no lol

Toggling the stars shows 50b lines of code created across all projects, only 5b on projects with 2+ stars since Claude Code launch. Kind of eye opening where these Claude Code tokens are going.

Came across this from this ShowHN post yesterday https://news.ycombinator.com/item?id=47501348

Thanks for starting the conversation and sharing my dashboard. :)

I hope you don't mind, I thought this was a really valuable dashboard.

Not at all! The ShowHN didn't really get a lot of feedback but this thread has already given me a lot to think about adding/improving.

Dashboard looks nice. Was this a Claude creation or did you instruct it to use a certain template or CSS framework?