The goal of this article is to dive into the techniques behind the SuperSpartan's polynomial IOP, which uses the sum-check protocol to prove CCS instances, by writing the protocol explicitely for a specific example. This is part 1 of a two-post series. These techniques used in SuperSpartan are at the core of the HyperNova protocol that will be explored in part 2.
I'd like to thank Nicolas Mohnblatt for brainstorming, discussing and reviewing this article, and for his many shared insights. I'm also thankful to Srinath Setty, Jamie Gabbay and Apriori for their generous feedback.This article wouldn't have been possible without them. All mistakes are my own.
This article was originally written on a SageMath Jupyter notebook, which can be found here.
Introduction
The sum-check protocol is like one of those magic tricks that, despite having been exposed to how it works, it still makes your mind do some acrobatics every time you see it applied.
Any polynomial equation can be transformed into an instance of the sum-check, making this amazing algorithm surprisingly applicable.
Since the sum-check lies at the core of the HyperNova's folding techniques, we will dive first into its details on a simpler case: the checking of a Customisable Constraint System (CCS) instance using the SuperSpartan’s polynomial IOP as described in their paper.
Transforming any check into a sumcheck
The goal of the sum-check is to check that an untrusted prover has computed the following equation correctly, where f∈K[X1,...,Xl] is a multivariate polynomial (not necessarily multilinear):
After l rounds, the protocol allows the verifier to check f(r1,...,rl)=c, for some random ri∈F, which with high probability implies the original equation, due to the Schwartz-Zippel lemma.
Say we have a polynomial f encoding a vector of evaluations v=[v1,...,vn] such that for all xi in some subdomain H, f(xi)=vi. In our setting, a prover wants to convince a verifier that this is the case (i.e. ∀xi∈H.f(xi)=vi). If every vi=0, this is known as a zero-check.
Note that we can always transform the check f(xi)=vi into a zero-check since f(xi)−vi=0.
For example, in vanilla Plonk, f is a univariate polynomial that encodes an execution trace and H is a subset of the roots of unity of some prime field (e.g. H:={ω1,ω2,ω3,ω4,ω5,ω6,ω7,ω8}).
In HyperPlonk (a variant of Plonk that transforms the zero-checks in the protocol into sumchecks), H becomes a boolean hypercube {0,1}l (i.e. all the l combinations of zeroes and ones), and f a multivariate polynomial of some degree d such that ∀x∈{0,1}l.f(x)=0.
For example, if l is set to 3, then H:={0,1}3={(0,0,0),(0,0,1),(0,1,0),(0,1,1),(1,0,0),(1,0,1),(1,1,0),(1,1,1)}
To apply the sum-check protocol, we need a statement of the form ∑x∈{0,1}lf(x)=0.
It is clear that if ∀x∈{0,1}l.f(x)=0 then ∑x∈{0,1}lf(x)=0, but the converse doesn't necessarily hold. For example, f((0,0,1))=−1, f((0,1,0))=1 and f(x)=0 otherwise makes ∑x∈{0,1}lf(x)=0 but f(x)=0 for some x∈{0,1}l.
To overcome this, we define a different multilinear polynomial:
eq:Fl×Fl→F(X1,X2)↦1, if X1=X2,X1,X2∈{0,1}l(X1,X2)↦0, if X1=X2,X1,X2∈{0,1}l
The new polynomial eq together with the sum-check protocol will allow us to compute the desired zero-check,∀x∈H.f(x)=0, as follows:
We define a new polynomial
h:Fl→FX↦x∈{0,1}l∑f(x)⋅eq(x,X)
Note that X is a variable while x is not. h is known as a multilinear extension of f in the boolean hypercube, since ∀X∈{0,1}l.h(X)=f(X). That is, f is defined over {0,1}l while h is defined over Fl.
For example, if X=(1,0,1), then h((1,0,1))=∑x∈{0,1}lf(x)⋅eq(x,(1,0,1))=f((1,0,1)).
Since ∀x∈{0,1}l.f(x)=0, and ∀X∈{0,1}l.h(X)=f(X), then ∀X∈{0,1}l.h(X)=0.
The main difference between f and h is that f is a multivariate polynomial of (potentially high) degree d, whereas h is a multilinear polynomial.
We know that a univariate polynomial g of k>1 coefficients in uniquely determined by k evaluations, and it is of degree k−1. Similarly, a multilinear polynomial h of k variables is uniquely determined by 2k evaluations.
In this case, h(Y):=∑x∈{0,1}lf(x)⋅eq(x,Y) has 2l coefficients (most of them zero) and so it is uniquely determined by 2l evaluations. Since ∀{0,1}l.h(Y)=f(Y)=0, then h is zero at 2l points and must necessarily be the zero polynomial, h=0.
The verifier has reduced his task to checking that h is indeed the zero polynomial, which is a simpler problem.
By the Schwartz-Zippel lemma, we can verify that h is the zero polynomial with high probability by evaluating h at a random point r∈Fl.
That is, h(r):=∑x∈{0,1}lf(x)⋅eq(x,r) must be equal to zero. Here is where we apply the sum-check protocol.
So if h(r)=0, this implies with high probability that h=0, which implies that ∀Y∈{0,1}l.h(Y)=0, which implies that ∀x∈{0,1}l.f(x)=0, thus proving the original statement with high probability (w.h.p.). In short,
The following recurrence relation is tailored to help us illustrate the workings sum-check protocol for CCS relations as is used in HyperNova. This example reflects the SuperSpartan’s polynomial IOP for CCS protocol described in the CCS paper.
// i-th iteration
fibonnaci(x_2, x_1, y_2, y_1) {
x = x_1 + x_2 // Fibonacci
y = y_1 * y_2 // Multiplicative Fibonacci
t = x * y // Fibonacci x (Multiplicative Fibonacci)
(x, y, t)
}
In the second part of this post, we will prove the result of iterating n times over f, starting with some initial values, by applying the folding techniques in HyperNova.
For now, we focus on proving a single iteration, in particular, that which results from the initial values x2=0,x1=1,y2=2,y1=3 that outputs x=1,y=6,t=6.
a structures:=([M1,...,Mt],[S1,...,Sq],[c1,...,cq]), plus some other bound parameters such as the number of rows m and columns n of the matrices Mi.
an instance, which consists of public inputs ×∈Fl and private inputs ω∈Fm.
that satisfies:
i=0∑q−1ci⋅∘j∈SiMj⋅z=0
where:
z:=(ω,x,1)∈Fn.
x∈Fl are the public inputs.
ω∈Fm−l−1 are the private inputs.
Mj⋅z denotes matrix-vector multiplication,
∘ denotes the Hadamard product between vectors
0 is an m-sized vector with entries equal to the the additive identity 0∈F.
In our case, m=4 and n=8 are the rows and columns of the t=3 matrices M1,M2,M3. {M1,M2} are multiplied and added to {M3}, so S1={1,2},S2={3}. Since there are two addends S1 and S2, q=2.
A Fibonacci CCS relation
The Fibonacci combined recurrence relation of our example can easily be represented as a CCS relation:
We have a particular interest in having our rows and columns of size 2m for some m, since the sum-check protocol runs over the boolean hypercube. In this case, m=4=22 is the number of constraints (i.e. rows) and n=8=23 is the number of witnesses (i.e. columns) as well as the size of the vector z.
Note that we have added a dummy row is [0,...,0] for M1,M2,M3 (it could be any other vector as long as the CCS relation holds trivially).
Note that a matrix M can be seen as a function M:{0,1}logm×{0,1}logn→F, since for each row mi and column nj we can derive the element M(mi,nj) from the matrix M.
As in the HyperNova paper, we will denote s=logm and s′=logn.
The multilinear extensions Mi of Mi and z of z are defined as
# From-binary-to-decimal polynomials
row = x1 + 2*x2
col = y1 + 2*y2 + 4*y3
def Mi_linear(Mi):
return sum([
sum([
sum([
sum([
sum([
Mi[Integer(row(x1=x1, x2=x2))][Integer(col(y1=y1,y2=y2,y3=y3))] * eqx(x1=x1,x2=x2,x11=x11,x22=x22) * eqy(y1=y1, y11=y11, y2=y2, y22=y22, y3=y3, y33=y33)
for y3 in [0,1]])
for y2 in [0,1]])
for y1 in [0,1]])
for x2 in [0,1]])
for x1 in [0,1]])
def z_linear(zi):
return sum([
sum([
sum([
zi[Integer(col(y1=y1,y2=y2,y3=y3))]* eqy(y1=y1, y11=y11, y2=y2, y22=y22, y3=y3, y33=y33)
for y3 in [0,1]])
for y2 in [0,1]])
for y1 in [0,1]])
Thus the multilinear extension fi of our matrix-vector product M⋅z is
fi((X1,X2))=∑y∈{0,1}3Mi((X1,X2),y)⋅z(y)
def Mi_z_prod(Mi, zi):
return sum([
sum([
sum([
Mi_linear(Mi)(y11=y1,y22=y2,y33=y3) * z_linear(zi)(y11=y1,y22=y2,y33=y3)
for y3 in [0,1]])
for y2 in [0,1]])
for y1 in [0,1]])
As we did earlier, now h(X) is a multilinear extension that encodes the 4 evaluations ({(0,0),(0,1),(1,0),(1,1)}) of G into its coefficients, so it is uniquely determined by 4 evaluations (i.e. a polynomial of n coefficients is of degree n−1, thus uniquely defined by n evaluations).
Since ∀x1,x2∈{0,1}.h((x1,x2))=G((x1,x2))=0, then h must necessarily be the zero polynomial h=0.
h = sum([
sum([
G(x11=x11, x22=x22) * eqx(x11=x11, x22=x22)
for x22 in [0,1]])
for x11 in [0,1]])
h
0
Note that there would be no harm in doing such transformation if G were already multilinear.
If a polynomial G∈F[x1,...,xs] is multilinear, then G(X)=∑x∈{0,1}sG(x)⋅eq(x,X)
You can check lemma 6 of HyperNova for a proof.
To help us refer to the different sum-checks conceptually, we give them different names:
Outer sum-check: 0=G(x) for all x∈{0,1}2.
Inner sum-check: Tj=∑y∈{0,1}3Mj(r,y)⋅z(y)
Outer sum-check
The prover engages with the verifier in the following sum-check:
∑x1∈{0,1}∑x2∈{0,1}Q(x1,x2)=0
which implies that G((x1,x2))=0 for all x1,x2∈{0,1}.
sum([
sum([ Q(x11=x1, x22=x2) for x2 in [0,1]])
for x1 in [0,1]]) == 0
True
Round 1
The prover P sends the verifier V a univariate polynomial s1(X1) claiming to be equal to Q1(X1):=∑x2∈{0,1}Q(X1,x2). That is, we keep the first variable unbound while summing up the values over the boolean hypercube.
Q1 = sum([ Q(x22=x2) for x2 in [0,1]])
s1 = Q1
s1
17*x11^3 - 29*x11^2 + 12*x11
The verifier first checks that s1(0)+s1(1) matches the expected result, i.e. s1(0)+s1(1)=0.
s1(x11=0) + s1(x11=1) == 0
True
Then the verifier checks s1=Q1 by checking that Q1 and s1 agree at a random point r1 (Schwartz-Zippel lemma).
r1 = k.random_element()
r1
84
The verifier can compute directly s1(r1) but doesn't know what Q1 is, so the check s1=Q1 must be done recursively.
Round 2
The new claim is that s1(r1):=∑x2∈{0,1}Q(r1,x2), so P sends V a univariate s2(X2) which he claims to be equal to Q2(X2):=Q(r1,X2).
Q2 = Q(x11=r1)
s2 = Q2
s2
-18*x22^3 + 37*x22^2 + 27*x22 - 20
The verifier first checks that s1(r1) is indeed s2(0)+s2(1)
s2(x22=0) + s2(x22=1) == s1(x11=r1)
True
Then the verifier sends a random challenge r2 to check that s2(r2)=Q2(r2).
r2 = k.random_element()
r2
31
There is no more need for recursion, since the verifier can now evaluate
and to send Ti to the verifier for i∈{1,2,3} to sum-check it.
def Ti_generator(Mi, zi):
return sum([
sum([
sum([
Mi_linear(Mi)(x11=r1, x22= r2, y11=y1,y22=y2,y33=y3) * z_linear(zi)(y11=y1,y22=y2,y33=y3)
for y3 in [0,1]])
for y2 in [0,1]])
for y1 in [0,1]])
T1 = Ti_generator(M1, z1)
T2 = Ti_generator(M2, z1)
T3 = Ti_generator(M3, z1)
T1, T2, T3
(33, -10, 10)
However, instead of engaging in separate sum-checks Ti=∑y∈{0,1}3Mi((r1,r2),y)⋅z(y) for each i∈{1,2,3}, the prover will batch them using a random element α generated by the verifier.