AlphaQubit: AI to identify errors in Quantum Computers

When maintaining a quantum memory, you measure parity checks of the quantum error correcting code. These parity checks don't contain any information about the logical state, just (partial) information about the error, so the logical quantum information remains coherent through the process (i.e. the logical part of the state is not collapsed).

These measurements are classical data, and a computation is required in order to infer the most likely error that led to the measured syndrome. This process is known as decoding.

This work is a model that acts as a decoding algorithm for a very common quantum code -- the surface code. The surface code is somewhat like the quantum analog of a repetition code in a sense.
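
To make "decoding" concrete, here is a minimal sketch on a classical 5-bit repetition code with independent bit flips. It brute-forces the most likely error for every possible syndrome. The function names, the code size, and the flip probability `p` are assumptions chosen for illustration. This is not how AlphaQubit works; it only shows the inference problem that a decoder solves.

```python
from itertools import product

def syndrome(error):
    """Parity checks between neighboring bits of a repetition code."""
    return tuple(error[i] ^ error[i + 1] for i in range(len(error) - 1))

def build_decoder(n, p):
    """Map each syndrome to the most probable error pattern (brute force)."""
    best = {}
    for error in product((0, 1), repeat=n):
        weight = sum(error)
        prob = p ** weight * (1 - p) ** (n - weight)
        s = syndrome(error)
        if s not in best or prob > best[s][1]:
            best[s] = (error, prob)
    return {s: error for s, (error, _) in best.items()}

# Assumed toy parameters: 5 bits, 5% independent flip probability.
decoder = build_decoder(n=5, p=0.05)
print(decoder[(0, 1, 1, 0)])  # most likely error given this syndrome
```

Brute force scales exponentially with the number of bits, which is why practical decoders rely on cleverer algorithms or, as in this work, a learned model.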

I would instead give the example of the Hamming code. As you probably know, you can construct a quantum code, the Steane code, which is the direct quantum analog of the Hamming code.

The Steane code is the simplest triangular color code, i.e. you can arrange all the qubits on a 2D triangular lattice and use only nearest-neighbor interactions [1]. The surface code is a similar quantum code, in which the qubits can also be placed on a 2D lattice, except that the lattice is made up of squares.

Why do we care about 2D surfaces and nearest-neighbor interactions? Because it makes building quantum hardware easier.

EDIT:

[1] The Steane code's picture is shown here. https://errorcorrectionzoo.org/c/steane Seven data qubits are on the vertices of the triangles. 2 syndrome qubits on each of the faces.
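
A small sketch of the Hamming/Steane relationship, using the standard parity-check matrix of the [7,4] Hamming code. The Steane code uses the same three rows twice, once as X-type checks and once as Z-type checks, which gives the six syndrome measurements (two per face) mentioned above. The helper names are made up for this example.

```python
# Parity-check matrix of the [7,4] Hamming code. Column j (1-indexed) is the
# binary expansion of j, least significant bit in the first row.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome(H, error):
    """Parity of the error restricted to each check (arithmetic mod 2)."""
    return tuple(sum(h * e for h, e in zip(row, error)) % 2 for row in H)

def checks_commute(H):
    """X-type and Z-type checks built from the same rows commute exactly
    when every pair of rows overlaps on an even number of positions."""
    return all(sum(a * b for a, b in zip(r1, r2)) % 2 == 0
               for r1 in H for r2 in H)

def locate_single_error(s):
    """For this H, the syndrome read as a binary number is the 1-indexed
    position of a single flipped bit (0 means no error detected)."""
    return s[0] + 2 * s[1] + 4 * s[2]

error = [0, 0, 0, 0, 1, 0, 0]  # flip on position 5
print(locate_single_error(syndrome(H, error)))
print(checks_commute(H))
```

The even-overlap property is what lets the classical code be reused for both error types in the quantum code.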

>AlphaQubit, a recurrent-transformer-based neural-network architecture that learns to predict errors in the logical observable based on the syndrome inputs (Methods and Fig. 2a). This network, after two-stage training—pretraining with simulated samples and finetuning with a limited quantity of experimental samples (Fig. 2b)—decodes the Sycamore surface code experiments more accurately than any previous decoder (machine learning or otherwise)

>One error-correction round in the surface code. The X and Z stabilizer information updates the decoder’s internal state, encoded by a vector for each stabilizer. The internal state is then modified by multiple layers of a syndrome transformer neural network containing attention and convolutions.

I can't seem to find a detailed description of the architecture beyond this bit in the paper and the figure it references. Gone are the days when Google handed out ML methodologies like candy... (note: not criticizing them for being protective of their IP, just pointing out how much things have changed since 2017)

Eh. It was always sort of muddy. We never actually had an implementation of doc2vec as described in the paper.

wait. Are you saying you were a paper author who described a method in their paper that wasn't actually implemented? IE, your methods section contained a false description?

No, I’m saying the original doc2vec paper described an approach which the ML community never seemed to actually implement. There were things that were called doc2vec, but they were not what the paper described. Folks mostly seemed to just not notice.

That's because attention is all we need.

…and a green line by the GOOG ticker.

So, an inherently error-prone computation is being corrected by another very error prone computation?

I feel like this is basically how humanity operates as a whole, and that seems to produce usable results, so why the heck not?

No problem, said von Neumann. https://www.scottaaronson.com/qclec/27.pdf

what he actually said: "as long as the physical error probability ε is small enough" you can build a reliable system from unreliable parts.

So it remains for you to show that AI.ε ~= QC.ε since JvN proved the case for a system made of similar parts, that is vacuum tubes, with the same error probability.

(p.s. thanks for the link)
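
A rough sketch of the "small enough ε" condition, under a strong simplifying assumption: three independent copies, each wrong with probability ε, combined by a perfect majority vote. Von Neumann's actual construction also has to cope with faulty voting gates, which gives a lower threshold, so treat the numbers here as illustrative only. The function names are invented for the example.

```python
def majority_of_three_error(eps):
    """Probability that a perfect majority vote over three independent
    copies is wrong: at least two of the three copies must fail."""
    return 3 * eps ** 2 * (1 - eps) + eps ** 3

def iterate(eps, levels):
    """Apply the three-way vote recursively (vote over votes)."""
    for _ in range(levels):
        eps = majority_of_three_error(eps)
    return eps

for eps in (0.01, 0.1, 0.4, 0.6):
    print(eps, iterate(eps, levels=3))
```

In this idealized model, redundancy helps whenever ε is below 1/2 and hurts above it. The open question raised above is what happens when the corrector has its own, different error rate.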

A quick careless Google didn’t yield Scott Aaronson’s take on this, which as a layperson is the one take I’d regard seriously.

Has he remarked on it and my search-fu failed?

Yes, he gave comments for a New Scientist piece about it: “It’s tremendously exciting,” says Scott Aaronson at the University of Texas at Austin. “It’s been clear for a while that decoding and correcting the errors quickly enough, in a fault-tolerant quantum computation, was going to push classical computing to the limit also. It’s also become clear that for just about anything classical computers do involving optimisation or uncertainty, you can now throw machine learning at it and they might do it better.”

https://www.newscientist.com/article/2457207-google-deepmind...

I've never seen so much money spent on a fundamentally flawed tech, since maybe Theranos. I'm really starting to doubt the viability of the current crop of quantum computing attempts. I think there probably is some way to harness quantum effects, but I'm not sure computing with inherently high margin of error is the right way to do it.

I feel like these are extremely different things being compared.

For a lot of technology, most really, the best way to study how to improve it is to make the best thing you know how to and then work on trying to make it better. That's what's been done with all the current quantum computing attempts. Pretty much all of the industry labs with general purpose quantum computers can in fact run programs on them, they just haven't reached the point where they're running programs that are useful beyond proving out and testing the system.

I'm optimistic about current quantum computers, because they are a tool to study wave function collapse. I hope that they will help us understand the relation between the number of particles and how long a system can stay in an entangled state, which will point to a physical interpretation of quantum mechanics (different from the "we don't talk about wave function collapse" Copenhagen interpretation).

The non-experts here might be interested in why you’d want to do that. Do you have explanations or links about it?

In short, quantum mechanics has a major issue at its core: quantum states evolve by purely deterministic, fully time reversible, evolutions of the wave function. But, once a classical apparatus measures a quantum system, the wave function collapses to a single point corresponding to the measurement result. This collapse is non-deterministic, and not time reversible.

It is also completely undefined in the theory: the theory doesn't say anything at all about what interaction constitutes "a quantum interaction", that keeps you in the deterministic time evolution regime; and what interactions constitute "a measurement" and collapse the wave function.

So, this is a major gap in the core of quantum mechanics. Quantum computers are all about keeping the qubits in the deterministic evolution state while running the program, and performing a measurement only at the end to get a classical result out of it (and then repeating that measurement a bunch of times, because this is a statistical computation). So, the hope is that they might shed some light on how to precisely separate quantum interactions from measurements.

Wow, that is a huge gap. Thanks for the explanation.

I think quantum computing research makes a lot more sense through the lens of “real scientists had to do something for funding while string theory was going on”.

Quantum computing may or may not get industrial results in the next N years, but those folks do theory, they often if not usually (in)validate it by experiment: it’s science.

> fundamentally flawed tech, since maybe Theranos

That's a pretty dramatic claim. We've had to (and still have to) deal with the same class of problems when going from analog -> digital in chips, communications, optics, etc. etc. The primitives that reality gives us to work with are not discrete.

How can a classical system detect/correct errors in a quantum one? I thought all the error correction algos for quantum also relied on qubits, e.g. the Shor code.

The full error correction system involves qubits. This paper is mainly about the decoder, which is responsible for taking the symptom data produced by the quantum circuit and determining the most likely errors that caused those symptoms. In the blog post it's not stated what code is being run, but in the illustration it's clear it's a surface code [1] and this is confirmed in the paper's abstract [2].

Disclaimer: am one of the authors, but not a main contributor. I wrote the simulator they used and made some useful suggestions on how to use it to extract information they wanted for training the models more efficiently, but know nothing of transformers.

[1]: https://errorcorrectionzoo.org/list/quantum_surface

[2]: https://www.nature.com/articles/s41586-024-08148-8.pdf

The world of quantum has all these interesting gotchas.

In a quantum computer, your logical quantum state is encoded in lots of physical qubits (called data qubits) in some special way. The errors that occur on these qubits are indeed arbitrary, and for enough physical qubits are indeed not practically classically simulatable.

To tackle these errors, we do "syndrome measurement" i.e. interact the data qubits with another set of physical qubits (called syndrome qubits), in a special way, and then measure the syndrome qubits. The quantum magic that happens is that the arbitrary errors get projected down to a countable and finite set of classical errors on the data and syndrome qubits!!! Without this magic result we would have no hope for quantum computers.

Anyway, this is where a decoder - a classical algorithm running on a classical computer - comes in. OP is a decoder. It takes the syndrome qubit measurements and tries to figure out what classical errors occurred and what sort of correction, if any, is needed on the data qubits.
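
The "projection" step can be checked on a toy case. The sketch below, which is an illustration and not anything from the paper, tracks the state vector of the 3-qubit bit-flip code, applies a small continuous rotation to one qubit, and then performs ideal parity measurements. The function names and the specific amplitudes and angle are assumptions. Real hardware extracts the parities through extra syndrome qubits, which is abstracted away here.

```python
import math

def encode(a, b):
    """Amplitudes of a|000> + b|111>, indexed by the 3-bit basis label."""
    state = [0j] * 8
    state[0b000] = a
    state[0b111] = b
    return state

def rotate_x(state, qubit, theta):
    """Apply exp(-i*theta*X) = cos(theta)*I - i*sin(theta)*X to one qubit."""
    c, s = math.cos(theta), math.sin(theta)
    out = [0j] * 8
    for idx, amp in enumerate(state):
        out[idx] += c * amp
        out[idx ^ (1 << qubit)] += -1j * s * amp
    return out

def syndrome_of(idx):
    """Parities of neighboring qubits for a computational basis state."""
    bits = [(idx >> q) & 1 for q in range(3)]
    return (bits[0] ^ bits[1], bits[1] ^ bits[2])

def syndrome_probabilities(state):
    """Probability of each outcome of an ideal parity measurement."""
    probs = {}
    for idx, amp in enumerate(state):
        key = syndrome_of(idx)
        probs[key] = probs.get(key, 0.0) + abs(amp) ** 2
    return probs

def project(state, syndrome):
    """Post-measurement state for a given syndrome outcome."""
    kept = [amp if syndrome_of(idx) == syndrome else 0j
            for idx, amp in enumerate(state)]
    norm = math.sqrt(sum(abs(amp) ** 2 for amp in kept))
    return [amp / norm for amp in kept]

theta = 0.3  # an arbitrary small over-rotation on qubit 0
noisy = rotate_x(encode(0.6, 0.8), qubit=0, theta=theta)
probs = syndrome_probabilities(noisy)
flipped = project(noisy, (1, 0))
print(probs)
```

Only two syndromes occur: (0, 0) with probability cos²θ, which leaves the encoded state untouched, and (1, 0) with probability sin²θ, which leaves exactly a bit flip on qubit 0 (up to a global phase). The continuous error has become a discrete one that a classical decoder can reason about.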

Quantum computing is not intractable and can still be simulated given enough time. This work used a quantum simulator to generate data points and then used them to train a transformer, which doesn't seem that different from other uses of neural networks to speed up computation-heavy problems.

The question would be whether this approach still works when it is scaled to thousands or even millions of qubits. The team is optimistic that that is the case, but we will see.
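
A toy version of the simulate-then-train workflow, with heavy simplifications: a classical 3-bit repetition code stands in for the surface code, and a per-syndrome majority table stands in for the transformer. All names and parameters are assumptions for the sketch. The label is whether the first bit was flipped, playing the role of the "logical observable" that the model predicts.

```python
import random
from collections import Counter, defaultdict

def sample_shot(rng, n, p):
    """Simulate one shot: random bit flips, their syndrome, and the label."""
    error = [1 if rng.random() < p else 0 for _ in range(n)]
    syndrome = tuple(error[i] ^ error[i + 1] for i in range(n - 1))
    label = error[0]  # was the observed bit flipped?
    return syndrome, label

def make_dataset(shots, n=3, p=0.1, seed=0):
    """Generate (syndrome, label) training pairs from the simulator."""
    rng = random.Random(seed)
    return [sample_shot(rng, n, p) for _ in range(shots)]

def fit_table(dataset):
    """Stand-in for a learned model: most frequent label per syndrome."""
    counts = defaultdict(Counter)
    for syndrome, label in dataset:
        counts[syndrome][label] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}

dataset = make_dataset(20000)
model = fit_table(dataset)
accuracy = sum(model[s] == y for s, y in dataset) / len(dataset)
print(model, accuracy)
```

A lookup table like this needs one entry per syndrome, so it cannot scale. Replacing it with a model that generalizes across syndromes is where a neural network comes in.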

The model could choose which measurement operations to make on the qubits, and which operations to take to repair the qubits?

In some quantum error correcting codes, there is a large set of operators whose measurement does not change the state when there are no errors (well, assuming the measurement itself is made without error), but which yields some information about the kind of error if there is one, and this info can be used to choose what operations to take to correct the error.

For a number of such schemes, there’s a choice of strategy: on what schedule to make which of the measurements, and how to correct the errors.

The way you describe this reminds me of the quantum bomb tester (Elitzur & Vaidman). Uhhh so this is treating a potential environmental interaction the same way as E&V's "bomb"? With at least the new wrinkle that there are multiple potential bombs, each with low probability?

It's not a perfect detector. If you give up perfection and settle for X% accuracy, then you can use a classical system (FWIU).

The error correction itself requires qubits, but reading out the final answer apparently becomes more probabilistic and complex, to the point where a neural net is a reasonable solution for interpretation and denoising.

Quantum computing + AI is undoubtedly the hype singularity.

We're almost there, now we just need to incorporate crypto here somehow :)

QUANTUM AI BLOCKCHAIN!

decentralized deep learning powered quantum computing enabled vertical farming onions on the blockchain

TAKE MY MONEY! TAKE IT!

Part of the problem of this form of benchmarking is that in some domains we wouldn't only be interested in the percent of times that an error channel is successfully mitigated, we would also be interested in the distribution of types of errors for cases where an error channel isn't successfully mitigated. The paper appears to be silent on that matter.

This all feels like the "with a computer" patents of yore.

I go on the front page and there’s nowhere to complain about AI hype?!

The one AI thing is semi-legitimate sounding?

What is YC coming to.

Short NVDA is what those guys are.

The tide goes out.

Interesting. I don't know too much about quantum computers tbh.

Quantum computer parts list:

- Everything you need

- A bunch of GPUs

Been trying for the longest time, I still don’t understand how quantum computing works. It’s always something-something tries all possible combinations and viola, your answer.

The whole "tries all possible combinations" thing is a very misleading oversimplification in the first place.

Instead, think of it more like a completely different set of operations than classical computers that, if you were to try and replicate/simulate them using a classical computer, you would have no choice but to try all possible combinations in order to do so. Even that is oversimplifying, but I find it at least doesn't hint at "like computers, but faster", and is as close to making the parallelism pov "correct" as you're going to get.

What these operations do is pretty exotic and doesn't really map onto any straightforward classical computing primitives, which puts a pretty harsh limit on what you can ask them to do. If you are clever enough, you can mix and match them in order to do some useful stuff really quickly, much faster than you ever could with classical computers. But that only goes for the stuff you can make them do in the first place.

That's pretty much the extent I believe someone can "understand" quantum computing without delving into the actual math of it.

Every quantum algorithm is a unitary operation in a Hilbert space. If you want to understand the theory then you will have to do the actual work of learning about Hilbert spaces and unitary operators.
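
For anyone who wants something to poke at before reading up on Hilbert spaces, here is a minimal sketch of what "unitary" means for a single qubit, using the Hadamard gate and plain Python lists. The helper names are made up for the example.

```python
import math

def dagger(U):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(U)
    return [[complex(U[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_unitary(U, tol=1e-12):
    """U is unitary when dagger(U) times U is the identity matrix."""
    product = matmul(dagger(U), U)
    n = len(U)
    return all(abs(product[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

def apply(U, state):
    """Apply a gate (matrix) to a state (vector of amplitudes)."""
    return [sum(U[i][k] * state[k] for k in range(len(state)))
            for i in range(len(U))]

r = 1 / math.sqrt(2)
HADAMARD = [[r, r], [r, -r]]

plus = apply(HADAMARD, [1, 0])   # equal superposition of |0> and |1>
back = apply(HADAMARD, plus)     # applying the gate again undoes it
print(is_unitary(HADAMARD), plus, back)
```

Unitary operations preserve the total probability (the squared amplitudes keep summing to 1) and are always reversible, which is part of why quantum algorithms look so different from ordinary programs.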

Thoroughly unhelpful response. There’s a number of analogies available.

I recommend you list them instead of telling random strangers about the helpfulness of their responses.

"""Are Git branches, in fact, "homeomorphic endofunctors mapping submanifolds of a Hilbert space"?"""

https://dlicata.wescreates.wesleyan.edu/pubs/amlh14patch/aml...

>It’s always something-something tries all possible combinations

"If you take nothing else from this blog: quantum computers won't solve hard problems instantly by just trying all solutions in parallel." - Scott Aaronson

This short comic he helped author actually summarizes the core idea fairly well https://www.smbc-comics.com/comic/the-talk-3

https://www.youtube.com/watch?v=F_Riqjdh2oM

This video is the simplest explanation that I have found for Quantum Computing which doesn't do the whole pop-sciency "is both zero and one at the same time" nonsense.

>viola

A large violin provides little answers.