My first real look into empirical quantum algorithms
In 1994, American mathematician Peter Shor published a paper to the Annual Symposium on Foundations of Computer Science that drew attention to the idea of quantum computation. His paper contained a series of algorithms that, if implemented on a quantum computer, expose critical vulnerabilities in widely-used public-key encryption standards. The primary algorithm in the paper, referred to as Shor's algorithm, was one of the first practical algorithms that showed quantum computers could outperform classical computers.
Part 1: Diffie-Hellman Key Exchange
Shor's algorithm breaks many widely used asymmetric encryption protocols. In asymmetric key encryption, two parties that wish to communicate in secrecy need to agree on shared public and private keys for encryption/decryption. However, these parties must find a way to agree on this private key without letting an adversary discover it, and they must coordinate together in public channels. In these public channels, adversarial parties can listen to both parties' communication. Prominent asymmetric encryption protocols include RSA, ECC, and Diffie-Hellman (DH). These protocols depend on mathematical problems that are easy to compute in one direction, but hard to reverse - a property of one-way functions. For simplicity, I will focus this blog on Diffie-Hellman, since all three use slightly different one-way functions. In DH, two parties agree on a set of public parameters, and then each party combines these public parameters with a secret value only they know. At the end, they have a mutually agreed upon secret.
DH relies on the discrete logarithm problem (DLP) for secretly establishing a key on a public channel. Put simply, DLP is a "hard" math problem that serves as a one-way function. Imagine we are playing a game where I take a number and raise it to a secret power, giving you the initial number and the final number. Your goal is to guess what power I raised the number to. This is essentially what the discrete logarithm problem is - given a base number g, a large prime p, and a result h, DLP asks: what is the exponent x such that
gxmodp=h
Here g is a generator for a multiplicative cyclic group G where G={g1,g2,g3,…,gnmodp} for some prime p. It is typically denoted as Zp∗. For any element in G, we can arrive at any other element within G by raising it to a certain power. This ensures that the secret x can be any number within a large, hard-to-guess space.
While it is easy to compute gxmodp where all three variables are known, it is believed to be computationally infeasible to attempt to find x given g, p, and h - especially as p increases. This property led to its adoption in modern-day encryption protocols like Diffie-Hellman, where we let x be a secret value that each party chooses and does not disclose. DLP is widely considered not realistically computable by classical computers with no known polynomial-time algorithms.
In the case of DH, g and p are the agreed upon public parameters, and each party selects their own secret x value, so my x would differ from your x. If we wanted to establish a mutual secret key, we would exchange g and p. Computing gxmemodp would yield my public key that I can share with you publicly. Despite my hidden key, xme remaining secret, we can openly exchange public keys. After receiving your public key, I raise your public key gxyou to my secret exponent xme resulting in the shared secret gxyou⋅xme. To write the actual calculation out, let B be your public key, secret=Bxmemodp. We can see that we are "sharing" our secret x values without directly telling the other person what the value is, and we can arrive at the same mutually agreed upon secret! Given my public key A, you can reach the same secret value by calculating Axyoumodp. An eavesdropper would be able to see g, p, and both of our public keys, but without the ability to efficiently solve DLP, they cannot discover the shared secret. I created an illustration of two parties using DH to agree on a secret key over a public channel.
Note: My illustration uses very small numbers for the sake of simplicity. In practice, much larger numbers are used. For example, the current RSA protocol uses 2048-bit integers.
Diffie-Hellman key exchange can be seen across the internet. A variation of DH using elliptic curves (ECC), is used in modern TLS which is THE standard encryption protocol for HTTPS, email services, voice over IP, etc (pretty much everywhere online). For classical computers, DLP is solvable in exponential-time (cn), but it is widely believed that for quantum computers, DLP can be solved in polynomial-time (nc). This means that DH, and any other encryption schemes using similar one-way functions, are not secure in a post-quantum world. Although large-scale quantum computers capable of running Shor's algorithm do not yet exist, encrypted packets can be stored in today's classical world and decrypted in tomorrow's quantum world - a threat known as "harvest now, decrypt later".
Part 2: The Hidden Subgroup Problem
Enough talk about Diffie-Hellman, how does Shor's algorithm potentially defeat it? By exploiting the properties of superposition and interference of quantum computers, it turns out discrete logarithms can be calculated efficiently in polynomial time. The discrete logarithm problem is an instance of a hidden subgroup problem (HSP) - a series of mathematical problems including DLP, prime factorization, and graph isomorphism. In these HSPs, an element or a set of elements is "hidden". HSPs can be seen elsewhere in quantum algorithms such as in Simon's and Grover's.
Note: For this blog, since I am learning much of this as I write about it, I want to go over what HSPs are first. Next, I will show how DLP is an instance of HSP. By covering HSPs first, this generalized knowledge can be applied to future algorithms that I may learn about.
In an instance of HSP, you are given a function that hides a subgroup, and you are challenged to find that subgroup. You may notice that there are similarities to DLP - x was the "hidden" variable in the problem, and we relied on x remaining secret in Diffie-Hellman. A lot of modern crypto protocols rely on these one-way "trapdoor" like functions where a value remains hidden. If HSP can be broken with quantum computing, by showing that a problem can be reduced to an instance of HSP, we can show that it too must be breakable with quantum computing. For this blog, I will focus on finite abelian groups (∀x,y∈G:xy=yx) since that is what DLP uses.
In HSPs, given a group G, a finite set S, and a function f:G↦S. f must satisfy the following:
∀g∈G:f(g)=f(g′)⟺gH=g′Hi.e. f is constant on the left cosets of H, and different cosets yield different values
for some hidden subgroup H (H≤G). For some g∈G, all elements in the coset gH map to the same f(g). Similar to DLP, the goal of HSP is to find the elements of H by repeatedly invoking f on a series of inputs. Fortunately for quantum computers and due to the periodic structure of f, Fourier analysis can be used to uncover hidden information about these subgroups. Analagous to the discrete Fourier transform where one vector of complex numbers is mapped to another, the quantum Fourier transform acts on one quantum state ∣x⟩, where x is an integer, and maps it to another quantum state ∣χx⟩ using the function f. ∣χx⟩ is the superposition of all ∣x⟩ states with complex amplitudes.
∣χx⟩=N1y=0∑n−1e2πixy/N∣y⟩
It is alright if you do not fully understand the math behind a QFT. Just note that we can treat it as a lens that allows us to see the "hidden" periodicity in HSPs. Despite these problems seeming random, with a QFT we can turn it into a structured frequency problem.
I will provide a sketch of a generalized quantum algorithm for HSPs, where we perform a quantum Fourier transform to uncover hidden values in H .
Sketch: Here, we are given a hidden group H and a function f:G↦S for some abelian group G and subgroup S.
Start out with two registers that have dimensions ∣G∣ and ∣S∣. One register will represent group elements g∈G, and the other will be used to store function values f(g)∈S∣0⟩∣0⟩
Create a uniform superposition of the first register over G.
∣G∣1g∈G∑∣g⟩∣0⟩
Let the second register be the computation of f(g).
∣G∣1g∈G∑∣g⟩∣f(g)⟩
Measure the value in the second register and discard it. It is some f(g) for an unknown g - we will refer to this value as s. By measuring s, we collapse the existing superposition onto the coset s+H⊆G that maps to the measured f(g). Since f has the same value for each element of coset s+H, where H is the hidden group, we can rewrite the first register to express this.
∣H∣1h∈H∑∣s+h⟩
Perform the QFT on this state.
∣H∣1h∈H∑∣χs+h⟩
After applying the QFT, we arrive at a superposition of contstraints on H. Now something very interesting has happened - if we measure this value after applying the QFT, we will observe some value h∈H⊥ - we have uncovered a constraint on H! To explain a little more, H⊥ is the set of characters χ:G↦C such that ∀h∈H:χ(h)=1. If we repeatedly run through these steps and try various possible inputs, we will eventually reveal enough contstraints to determine H.
This algorithm runs in polynomial time, specifically in O(log∣G∣) time - far faster than any known classical algorithm today. It falls within BQP - the class of problems that are solvable by a quantum computer in polynomial time with a higher than 50% success rate. It is important to note that this sketch only covers abelian HSPs: there are some non-abelian HSPs that are not in BQP.
Part 3: Cracking DLP
Given this generalized algorithm for solving HSP problems, if we are able reduce an instance of the DLP problem we saw earlier to an instance of this generalized algorithm, then we should be able to solve DLP within polynomial time using a quantum computer.
Proof: To reduce this, lets first go over what we are given:
A prime p
A cyclic group G=Zp+ of order p−1
A generator g∈G of order p−1
The result h∈G equal to gxmodp
This reduction will be over the group Zp−1×Zp−1. Define a function f as follows:
f:(Zp−1×Zp−1)→G such that f(α,β)=gαh−βmodp
As a note, from now on, I will omit the modulo p for simplicity, but everything we are doing remains modulo p.
Since h=gxmodp, we can rewrite f as
f(α,β)=gαg−βx=gα−βx
Let (α1,β1) and (α2,β2) be in Zp−1×Zp−1.
f(α1,β1)=f(α2,β2)⟺gα1−β1x=gα2−β2x
Then:
(α1−α2)+x(β1−β2)=0
Lets define the integer k:
k:=β1−β2⟹α1−α2=kx
We can express (α1−α2,β1−β2) as (xk,k)=k(x,1), which implies that (α1−α2,β1−β2) falls into the cyclic subgroup of Zp−1×Zp−1 generated by (x,1), which we will refer to as ⟨(x,1)⟩ or H. We have essentially created a scenario where we only need to find the first element that is x, since the second is 1 and any number raised to the first power is equal to itself. By finding the generator of H, we will have found the discrete logarithm for our DLP problem defeating the Diffie-Hellman key exchange protocol.