Supply Chain Vuln Compromised Core AWS GitHub Repos & Threatened the AWS Console

Breaking this down: several of AWS's core repos, like the JS SDK, use an allowlist of contributor IDs that are permitted to run workflow actions in their PRs. The list was a regex, contained several short IDs, and wasn't anchored with ^ and $, so if it allowed user 12345, then any user ID containing 12345 could run its own actions on the PR, including one that exfiltrated access tokens. So the researchers spammed GH with user creation requests, got an ID that matched, and they were in like Flynn.
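To make the failure mode above concrete, here is a minimal Python sketch. The IDs and both patterns are made up for illustration; the actual filter's contents and the regex engine it ran on are not reproduced here.

```python
import re

# Hypothetical allowlist of trusted contributor IDs (not the real values).
UNANCHORED = re.compile(r"12345|67890")
# Note the group: "^12345|67890$" would anchor only the first alternative
# at the start and only the last one at the end.
ANCHORED = re.compile(r"(?:12345|67890)")


def is_allowed_unanchored(actor_id: str) -> bool:
    # search() succeeds if the pattern appears anywhere in the string,
    # which is how an unanchored filter behaves.
    return UNANCHORED.search(actor_id) is not None


def is_allowed_anchored(actor_id: str) -> bool:
    # fullmatch() requires the entire string to match one alternative,
    # so no explicit ^ or $ is needed (and a trailing newline is rejected).
    return ANCHORED.fullmatch(actor_id) is not None


if __name__ == "__main__":
    for candidate in ("12345", "226712345", "1234"):
        print(candidate,
              is_allowed_unanchored(candidate),
              is_allowed_anchored(candidate))
```

With the unanchored version, a longer ID that merely contains an allowed ID gets through; the full-match version accepts only the exact IDs.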

Said tokens didn't have admin access, but had enough privileges to invite other users to become full admins. Not sure if they were rotated, but GitHub tokens are usually long-lived, like up to a year. Hey, isn't AWS the one always lecturing us to use temporary credentials? To be fair, AWS did more than just fix the regex: they introduced an "approve workflow run" UI onto the PR process that I think GH is also using now (not sure about that).

As a security dude I spend way too much of my time fixing missing anchors or unescaped wildcards in regex. The good news is that it's trivial to detect with static analysis tooling. The bad news is that broken regex is often used for security checks.
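As a rough idea of what such a check looks for, here is a toy Python heuristic. It is an assumption-laden sketch, not how any real static analysis tool works: the function names are hypothetical, it only covers anchors (not unescaped wildcards), and it ignores corner cases such as an escaped trailing `$` or a literal `]` at the start of a character class.

```python
def has_top_level_alternation(pattern: str) -> bool:
    """Return True if a '|' appears outside any group or character class."""
    depth = 0
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the escaped character
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "|" and depth == 0:
            return True
        i += 1
    return False


def allowlist_pattern_warnings(pattern: str) -> list[str]:
    """Collect warnings for a regex that is meant to match whole strings."""
    warnings = []
    if not pattern.startswith("^"):
        warnings.append("missing ^ anchor")
    if not pattern.endswith("$"):
        warnings.append("missing $ anchor")
    if has_top_level_alternation(pattern):
        # e.g. "^12345|67890$" anchors each end on one alternative only
        warnings.append("ungrouped top-level alternation")
    return warnings


if __name__ == "__main__":
    for p in ("12345|67890", "^12345|67890$", "^(?:12345|67890)$"):
        print(p, allowlist_pattern_warnings(p))
```

The third warning matters because adding `^` and `$` to an ungrouped alternation looks like a fix but still leaves most of the alternatives unanchored on one side.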

https://xkcd.com/1171/

> Said tokens didn't have admin access, but had enough privileges to invite other users to become full admins.

Ah... GitHub permissions. What fun.

GitHub actually has a way to federate with AWS for short-lived credentials, but then it screws everything up by completely half-assing the ghcr.io implementation. It's only available using the old deprecated classic access tokens.

Right? How is it that you still need a PAT or a custom app installation to access a registry?

At least the vuln was old enough that they couldn't blame AI for it, otherwise the article would read differently ;)

Ironically (?) an AI code review would very likely have noticed the overly-permissive regex.

This is a good point. On my GH I’ve disabled Copilot reviews because the vast majority of them are false positives, but I’m reconsidering that position as it might still be worth it to wade through the spurious reviews just to catch some real issues.

> The list was a regex ...

Regexes for security allowlists: what could possibly ever go wrong, huh!?

Another success story for Regexes! Let's keep using this cryptic mess!

I met regexes when I was 13, I think. I spent a little time reading the Java API docs on the language's regex implementation and played with a couple of regex testing websites during an introductory programming class at that age. I've used them for the rest of my life without any difficulty. Strict (formal) regexes are extremely simple, and even when using crazy implementations that allow all kinds of backreferences and conditionals, 99.999% of regexes in the wild are extremely simple as well. And that's true in the example from TFA! There's nothing tricky or cryptic about this regex.

That said, what this regex wanted to be was obviously just a list. AWS should offer simpler abstractions (like lists) where they make sense.

> That said, what this regex wanted to be was obviously just a list. AWS should offer simpler abstractions (like lists) where they make sense.

Agree. I would understand if there was some obvious advantage here, but it doesn’t really seem like there is a dimension here where regex has an advantage over a list. It’s (1) harder to implement, (2) harder to review, (3) much harder to test comprehensively, (4) harder for users to use (correctly/safely).
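For comparison, a plain list version of the same check might look like the Python sketch below. The IDs and names are hypothetical, and this says nothing about what filter options AWS actually exposes.

```python
# Hypothetical allowlist of trusted contributor IDs, stored as exact strings.
ALLOWED_ACTOR_IDS = frozenset({"12345", "67890"})


def is_allowed(actor_id: str) -> bool:
    # Exact membership test: no partial matches and no pattern syntax
    # to get subtly wrong.
    return actor_id in ALLOWED_ACTOR_IDS


if __name__ == "__main__":
    for candidate in ("12345", "226712345", "12345\n"):
        print(repr(candidate), is_allowed(candidate))
```

Reviewing a change to this is a matter of reading the added or removed entries, and the test cases are just "in the set" and "not in the set".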

[flagged]

This is too hot a take. Regular expressions are used in some cases where they shouldn’t be, yes, but there’s also been a ton of code which used other string operations but had bugs due to the complexity or edge-cases which would have been easier to avoid with a regex. You should know both tools and when they’re appropriate.

Regex is not used for parsing HTML or C++ code. So it is not good for complex tasks.

What is the claim? That it is compact for simple cases. Well Brainfuck is a compact programming language but I don't see it in production. Why?

Because the whole point of programming is that multiple eyeballs of different competence are looking at the same code. It has to be as legible as possible.

> To escalate privileges, we abused the token’s repo scope, which can manage repository collaborators, and invited our own GitHub user to be a repository administrator.

From everything I know about pentesting, they should have stopped before doing this, right? From https://hackerone.com/aws_vdp?type=team :

> You may only interact with accounts you own or with explicit written permission from AWS or the account owner

I think it comes down to what you do with the access. Since this is a public repo I don't think I'd be too upset at the addition of a new admin so long as they didn't do anything with that access. It's a good way to prove the impact. If it were a private repo I might feel differently.

It’s possible that AWS is a Wiz customer, which would allow them to do more stuff.

I try to avoid regexes like the plague; they're right up there with passing stuff into SQL strings. They're tempting enough to be used, but it always goes wrong, no matter how good your sanitization. Even if the original author gets it right, sooner or later someone will tweak the regex just a little to allow some edge case and accidentally open the door to a whole pile of other cases. It's just too finicky and too powerful.

I worked on docs at GitHub which are open source, synced to an internal repo, and deployed on internal infra. I recall jumping through many hoops to make it work safely. These were workflows that had secrets access for deployments, and I recall zipping files, doing some weird handoffs/file filtering between different workflows based on the triggers and permissions. Security folks were really quick to find any gaps =)

Glad to see a few more security knobs on actions these days!

I always wondered if their decision to limit availability of CodeCommit had something to do with the overall quality of the underlying implementation. It always came off as an "also-ran" product without any real care or effort put into it. Either that or the team responsible for creating it ultimately left the company... anyways...

This article lends some credibility to that notion.

Oh no, is the AWS Console ok?

happens to the best of us