2023 

The Sum-Check Protocol over Fields of Small Characteristic
Justin Thaler
Manuscript
Links: [pdf]
Abstract:
The sum-check protocol of Lund, Fortnow, Karloff, and Nisan underlies SNARKs with the fastest
known prover. In many of its applications, the prover can be implemented with a number of field
operations that is linear in the number, n, of terms being summed.
We describe an optimized prover implementation when the protocol is applied over an extension field
of a much smaller base field. The rough idea is to keep most of the prover’s multiplications over the base
field (at the cost of performing more total field multiplications).
When the sum-check protocol is applied to a product of polynomials that all output values in the base
field, our algorithm reduces the number of extension field operations by multiple orders of magnitude. In
other settings, our improvements are more modest but nonetheless meaningful.
In SNARK design, the sum-check protocol is often combined with polynomial commitment schemes,
which are growing faster, especially when the values being committed are small. These improved
commitment schemes may render the sum-check prover the overall bottleneck, which our results help to
mitigate.
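
Illustration (not from the manuscript): a minimal Python sketch of the basic sum-check protocol for a multilinear polynomial given by its $n$ evaluations over the Boolean hypercube, with an honest prover and the verifier run in a single loop. The modulus `P`, the function name `sumcheck_multilinear`, and the table-folding layout are assumptions made for illustration. The sketch shows only the standard linear-time prover, not the small-characteristic optimizations described above.

```python
import random

P = 2**61 - 1  # illustrative prime modulus (an assumption, not from the paper)

def sumcheck_multilinear(evals, rng, claimed_sum=None):
    """Return True if the verifier accepts the claimed sum of `evals`.

    `evals` lists the values of a multilinear polynomial on {0,1}^v
    (length must be a power of two); the first variable is the most
    significant bit of the index.
    """
    table = [e % P for e in evals]
    claim = sum(table) % P if claimed_sum is None else claimed_sum % P
    while len(table) > 1:
        half = len(table) // 2
        # Prover: the round polynomial is linear, so g(0) and g(1) suffice.
        g0 = sum(table[:half]) % P
        g1 = sum(table[half:]) % P
        # Verifier: check consistency with the running claim.
        if (g0 + g1) % P != claim:
            return False
        r = rng.randrange(P)                      # verifier's random challenge
        claim = (g0 * (1 - r) + g1 * r) % P       # g(r) by linear interpolation
        # Prover: bind the first variable to r, halving the table.
        table = [(table[i] * (1 - r) + table[half + i] * r) % P
                 for i in range(half)]
    # Final check: the claim must equal the polynomial evaluated at the challenges.
    return table[0] == claim

print(sumcheck_multilinear([3, 1, 4, 1, 5, 9, 2, 6], random.Random(0)))
```

Because the table halves in every round, the total prover work in this sketch is linear in the number of terms $n$. A real verifier would obtain the final evaluation from an oracle or a polynomial commitment rather than from the prover's table.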

BabySpartan: Lasso-based SNARK for non-uniform computation
Srinath Setty and
Justin Thaler
Manuscript
Links: [eprint]
Abstract:
Lasso (Setty, Thaler, Wahby, ePrint 2023/1216)
is a recent lookup argument that ensures that the prover
cryptographically commits to only "small" values. This note
describes BabySpartan, a SNARK for a large class of constraint systems
that achieves the same property. The SNARK is a simple combination of
SuperSpartan and Lasso. The specific class of constraint systems supported
is a generalization of so-called Plonkish constraint systems (and a special
case of customizable constraint systems (CCS)). Whereas a recent work called
Jolt (Arun, Setty, and Thaler, ePrint 2023/1217) can be viewed as an
application of Lasso to uniform computation, BabySpartan can be viewed
as applying Lasso to non-uniform computation.

Jolt: SNARKs for Virtual Machines via Lookups
Arasu Arun,
Srinath Setty, and
Justin Thaler
Manuscript
Links: [eprint]
Abstract:
Succinct Non-interactive Arguments of Knowledge (SNARKs) allow an untrusted prover to establish
that it correctly ran some "witness-checking procedure" on a witness.
A zkVM (short for zero-knowledge Virtual Machine) is a SNARK that allows
the witness-checking procedure to be specified as a computer program written in
the assembly language of a specific instruction set architecture (ISA).
A front-end converts computer programs into a lower-level representation such as an arithmetic circuit or
generalization thereof. A SNARK for circuit-satisfiability can then be applied to the resulting circuit.
We describe a new front-end technique called Jolt that applies to a variety of ISAs.
Jolt arguably realizes a vision called the "lookup singularity", which seeks to produce circuits that
only perform lookups into predetermined lookup tables.
The circuits output by Jolt primarily perform lookups into a gigantic lookup table, of size more than
$2^{128}$, that depends only on the ISA. The validity of the lookups is proved via a new lookup argument
called Lasso described in a companion work (Setty, Thaler, and Wahby, eprint 2023). Although size-$2^{128}$
tables are vastly too large to materialize in full, the tables arising in Jolt are structured, avoiding costs that grow linearly with the table size.
We describe performance and auditability benefits of Jolt compared to prior zkVMs,
focusing on the popular RISC-V ISA as a concrete example. The dominant cost for the Jolt prover applied to this ISA (on
64-bit data types) is cryptographically committing to about six
256-bit field elements per step of the RISC-V CPU.
This compares favorably to prior zkVM provers, even those focused on far simpler VMs.

Unlocking the lookup singularity with Lasso
Srinath Setty,
Justin Thaler, and
Riad Wahby
Manuscript
Links: [eprint]
Abstract:
This paper introduces Lasso, a new family of lookup arguments, which
allow an untrusted prover to commit to a vector $a \in \mathbb{F}^m$
and prove that all entries of $a$ reside in some predetermined table $t \in
\mathbb{F}^n$.
Lasso's performance characteristics unlock the so-called
``lookup singularity''.
Lasso works with any multilinear polynomial commitment scheme, and
provides the following efficiency properties.

For $m$ lookups into a table of size $n$, Lasso's
prover commits to just $m+n$ field elements.
Moreover, the committed field elements are \emph{small}, meaning that, no matter
how big the field $\mathbb{F}$ is,
they are all in the set $\{0, \dots, m\}$.
When using a multiexponentiation-based commitment scheme, this
results in the prover's costs dominated by only $O(m+n)$ group
\emph{operations} (e.g., elliptic curve point additions), plus the cost to
prove an evaluation of a multilinear polynomial whose
evaluations over the Boolean hypercube are the table entries.
This represents a significant
improvement in prover costs over prior lookup arguments (e.g., plookup, Halo2's
lookups, lookup arguments based on logarithmic derivatives).
 Unlike all prior lookup arguments, if the table $t$ is structured (in a precise sense that we define),
then no party needs to commit to $t$, enabling the use of much
larger tables than prior works (e.g., of size $2^{128}$ or larger).
Moreover, Lasso's prover only ``pays'' in runtime for table entries that are
accessed by the lookup operations.
This applies to tables commonly used to implement range checks,
bitwise operations, big-number arithmetic, and even transitions of a full-fledged CPU such as RISC-V.
Specifically, for any integer
parameter $c>1$, Lasso's prover's dominant cost
is committing to $3 \cdot c \cdot m + c \cdot n^{1/c}$ field
elements. Furthermore, all these field elements are ``small'', meaning they are in the set $\{0, \dots, \max\{m, n^{1/c}, q\} - 1\}$,
where $q$ is the maximum value in $a$.
Lasso's starting point is Spark, a
time-optimal polynomial commitment
scheme for sparse polynomials in Spartan (CRYPTO 2020).
We first provide a stronger security analysis for Spark.
Spartan's security analysis assumed that
certain metadata associated with a sparse polynomial is committed by an
honest party (this is acceptable for its purpose in Spartan, but not
for
Lasso).
We prove that Spark remains secure even when
that metadata is committed by a malicious party. This provides the first ``standard'' commitment scheme for
sparse multilinear polynomials with optimal prover costs.
We then generalize Spark to directly support a lookup argument for
both structured and unstructured tables, with the efficiency characteristics
noted above.

Customizable constraint systems for succinct arguments
Srinath Setty,
Justin Thaler,
and Riad Wahby
Manuscript
Links: [eprint]
Abstract:
This paper introduces customizable constraint system (CCS), a generalization of R1CS that can simultaneously capture R1CS, Plonkish, and AIR without overheads. Unlike existing descriptions of Plonkish and AIR, CCS is not tied to any particular proof system. Furthermore, we observe that the linear-time polynomial IOP for R1CS in Spartan (CRYPTO 20) extends easily to CCS, and when combined with a polynomial commitment scheme, it yields a family of SNARKs for CCS, which we refer to as SuperSpartan. SuperSpartan supports high-degree constraints without its prover incurring cryptographic costs that scale with the degree of constraints (only field operations scale with the constraint degree). Moreover, as in Spartan, it does not employ superlinear-time and hard-to-distribute operations such as FFTs. Similar properties were achieved for Plonkish by HyperPlonk (EUROCRYPT 23) via a different route. However, it is unclear how to prove CCS instances (or even R1CS instances) with HyperPlonk (or Plonk itself), without overheads. Furthermore, unlike HyperPlonk, SuperSpartan can prove uniform instances of CCS (including AIR) without requiring a linear-time preprocessing for the verifier, and for those instances, SuperSpartan provides “free” addition gates.
SuperSpartan for AIR is the first SNARK for AIR with a linear-time prover, transparent and sublinear-time preprocessing, polylogarithmic proof size, and plausible post-quantum security. In particular, SuperSpartan for AIR provides a faster prover than existing transparent SNARKs for AIR (which are sometimes referred to as STARKs).

Fiat-Shamir Security of FRI and Related SNARKs
Alexander R. Block,
Albert Garreta,
Jonathan Katz,
Justin Thaler,
Pratyush Ranjan Tiwari, and
Michał Zając
Asiacrypt 2023
Links: [eprint]
Abstract:
We establish new results on the Fiat-Shamir (FS) security of several protocols that
are widely used in practice, and we provide general tools for establishing
similar results for others. More precisely, we: (1) prove the FS security of the FRI and batched FRI protocols;
(2) analyze a general class of protocols, which we call
$\delta$-correlated, that use low-degree proximity testing as a subroutine
(this includes many ``Plonk-like'' protocols (e.g., Plonky2 and Redshift), ethSTARK, RISC Zero, etc.); and
(3) prove FS security of the aforementioned "Plonk-like" protocols, and sketch how to prove the same for the others.
We obtain our first result by analyzing the round-by-round (RBR) soundness and RBR knowledge soundness of FRI. For the second result, we prove that if a
$\delta$-correlated protocol is RBR (knowledge) sound under the assumption that adversaries always send low-degree polynomials, then it is RBR (knowledge) sound in general. Equipped with this tool, we prove our third result by formally showing that "Plonk-like" protocols are RBR (knowledge) sound under the assumption that adversaries always send low-degree polynomials. We then outline analogous arguments for the remainder of the aforementioned protocols.
To the best of our knowledge, ours is the first formal analysis of the Fiat-Shamir security of FRI and widely deployed protocols that invoke it.

Testudo: Linear Time Prover SNARKs with Constant Size Proofs and Square Root Size Universal Setup
Matteo Campanelli, Nicolas Gailly, Rosario Gennaro, Philipp Jovanovic, Mara Mihali, and Justin Thaler
Latincrypt 2023
Links: [eprint]
Abstract:
We present Testudo,
a new FFT-less SNARK with a near linear-time prover, constant-time verifier, constant-size proofs and a square-root-size universal setup.
Testudo is based on a variant of Spartan (and hence does not require FFTs) as well as a new, fast multivariate polynomial commitment scheme (PCS) with a square-root-sized trusted setup that is derived from PST (TCC 2013) and IPPs (Asiacrypt 2021).
To achieve constant-size SNARK proofs in Testudo,
we then combine our PCS opening proofs recursively with a Groth16 SNARK.
We also evaluate our construction and its building blocks: to compute a PCS opening proof for a polynomial of size
$2^{25}$, our new scheme's opening procedure achieves a 110x speedup compared to PST and 3x compared to Gemini (Eurocrypt 2022),
since opening computations are heavily parallelizable and operate on smaller polynomials. Furthermore, a
proof for a witness of size $2^{30}$
requires a setup of size only $2^{15}$. Finally, we show that a
variant for proving data-parallel computations is almost 10x faster at verifying
$2^{10}$ Poseidon-based Merkle tree opening proofs than the regular version.

Brakedown: Linear-time and post-quantum SNARKs for R1CS
Alexander Golovnev,
Jonathan Lee,
Srinath Setty,
Justin Thaler,
and Riad Wahby
CRYPTO 2023
Links: [eprint]
Abstract:
This paper introduces Brakedown, the first built system that provides
linear-time SNARKs for NP, meaning the prover incurs $O(N)$ finite
field operations to prove the satisfiability of an $N$-sized R1CS instance.
Brakedown's prover is faster, both concretely and asymptotically, than prior
SNARK implementations.
Brakedown does not require a trusted setup and is plausibly
post-quantum secure. Furthermore, it is compatible with arbitrary finite fields of
sufficient size; this property is new amongst implemented arguments with
sublinear proof sizes.
To design Brakedown, we observe that recent work of Bootle, Chiesa, and Groth (BCG, TCC 2020)
provides a polynomial commitment scheme that, when combined with the
linear-time interactive proof system of Spartan (CRYPTO 2020), yields linear-time IOPs and
SNARKs for R1CS (a similar theoretical result was previously established by BCG, but our approach is
conceptually simpler, and crucial for achieving high-speed SNARKs).
A core ingredient in the polynomial commitment scheme that we distill
from BCG is a linear-time encodable code.
Existing constructions of such codes are believed to be impractical.
Nonetheless,
we design and engineer a new one that is practical in our context.
We also implement a variant of Brakedown that uses Reed-Solomon codes instead of
our linear-time encodable codes; we refer to this variant as Shockwave.
Shockwave is not a linear-time SNARK, but it provides shorter proofs and lower verification
times than Brakedown (it also provides a faster prover than prior plausibly
post-quantum SNARKs).
As a modest additional contribution, we observe that one can render the aforementioned SNARK zero knowledge
and reduce the proof size and verifier time from $O(\sqrt{N})$ to polylog$(N)$, while maintaining a linear-time prover, by outsourcing the verifier's work
via one layer of proof
composition with an existing zkSNARK as the ``outer'' proof system.

2022 

Approximate Degree in Quantum and Classical Computing
Mark Bun and Justin Thaler
Foundations and Trends in Theoretical Computer Science
Links: [authors' pdf] [NOW Publishers]
Abstract:
The approximate degree of a Boolean function f captures how well f can be approximated pointwise by low-degree polynomials. This monograph surveys what is known about approximate degree and illustrates its applications in theoretical computer science.
A particular focus of the survey is a method of proving lower bounds via objects called dual polynomials. These represent a reformulation of approximate degree using linear programming duality. We discuss in detail a recent, powerful technique for constructing dual polynomials, called “dual block composition”.
Note: This is a vastly expanded version of the SIGACT News survey by the same title.

Order-Invariant Cardinality Estimators Are Differentially Private
Charlie Dickens, Daniel Ting, and Justin Thaler
NeurIPS 2022
Links: [arXiv]
Abstract:
We consider privacy in the context of streaming algorithms for cardinality
estimation. We show that a large class of algorithms all satisfy
$\epsilon$-differential privacy, so long as (a) the algorithm is combined with
a simple downsampling procedure, and (b) the cardinality of the input stream
is $\Omega(k/\epsilon)$. Here, $k$ is a certain parameter of the sketch that is
always at most the sketch size in bits, but is typically much smaller. We also
show that, even with no modification, algorithms in our class satisfy
$(\epsilon, \delta)$-differential privacy, where $\delta$ falls exponentially
with the stream cardinality.
Our analysis applies to essentially all popular cardinality estimation
algorithms, and substantially generalizes and tightens privacy bounds from
earlier works.

2021 

Proofs, Arguments, and Zero-Knowledge
Justin Thaler
Foundations and Trends in Theoretical Computer Science
Links: [book webpage]
Abstract:
Interactive proofs (IPs) and arguments are cryptographic protocols that enable an untrusted prover to provide
a guarantee that it performed a requested computation correctly. IPs and arguments
are zero-knowledge if they reveal nothing other than their own validity.
Within the last decade, general-purpose zero-knowledge arguments have made the jump from theory to
practice. This has opened up new doors in the design of cryptographic systems, and generated additional
insights into the power of IPs and arguments (zero-knowledge or otherwise). There are now no fewer than
five promising approaches to designing efficient, general-purpose zero-knowledge arguments. This
comprehensive survey
covers these approaches in a unified manner, emphasizing commonalities between them.
As of January 2021, the survey remains a work in progress, but I hope that readers will find it useful in its current state.

Guest Column: Approximate Degree in Classical and Quantum Computing
Mark Bun
and Justin Thaler
ACM SIGACT News, January 2021
Links: [pdf]
Abstract:
The approximate degree of a Boolean function
$f$ captures how well $f$ can be approximated pointwise by low-degree polynomials.
This article surveys what we know about approximate degree and illustrates
some of its applications in theoretical computer science.

Quantum Proofs of Proximity
Marcel Dall'Agnol, Tom Gur, Subhayan Roy Moulik, and Justin Thaler
TQC 2021, Quantum
Links: [arXiv] [Quantum journal page]
Abstract:
We initiate the systematic study of QMA algorithms in the setting of property testing,
to which we refer as QMA proofs of proximity (QMAPs).
These are quantum query algorithms that receive explicit access to a
sublinear-size untrusted proof and are required to accept inputs having a property $\Pi$
and reject inputs that are $\epsilon$-far from $\Pi$, while only probing a minuscule portion of their input.
Our algorithmic results include a general-purpose theorem that enables quantum speedups for
testing an expressive class of properties, namely, those that are succinctly decomposable.
Furthermore, we show quantum speedups for properties that lie outside of this family,
such as graph bipartiteness. We also investigate the complexity landscape of this model,
showing that QMAPs can be exponentially stronger than both classical proofs of proximity and
quantum testers. To this end, we extend the methodology of Blais, Brody and Matulef (Computational Complexity, 2012)
to prove quantum property testing lower bounds via reductions from communication complexity,
thereby resolving a problem raised by Montanaro and de Wolf (Theory of Computing, 2016).

Relative Error Streaming Quantiles
Graham Cormode,
Zohar Karnin,
Edo Liberty, Justin Thaler,
and Pavel Veselý
PODS 2021. Best Paper Award. Invited to J. ACM.
Links: [arXiv] [Condensed version in SIGMOD Record] [Technical perspective by Rasmus Pagh]
Abstract:
Approximating ranks, quantiles, and distributions over streaming data is a central task in data analysis and monitoring.
Given a stream of $n$ items from a data universe $U$ equipped with a total order, the task is to compute a sketch (data structure) of
size poly($\log(n)$,$1/\epsilon$). Given the sketch and a query item $y \in U$, one should be able to approximate its rank in the stream,
i.e., the number of stream elements smaller than $y$.
Most works to date focused on additive $\epsilon n$ error approximation, culminating in the KLL sketch that achieved
optimal asymptotic behavior.
This paper investigates multiplicative $(1 \pm \epsilon)$-error approximations to the rank.
The motivation stems from practical demand to understand the tails of distributions, and hence for sketches to be more accurate
near extreme values.
The most space-efficient algorithms that can be derived from prior work store either
$O(\log(\epsilon^2 n)/\epsilon^2)$ or
$O(\log^3(\epsilon n)/\epsilon)$ universe items.
This paper presents a sketch of size $O(\log^{1.5}(\epsilon n)/\epsilon)$
(ignoring poly($\log \log n$, $\log(1/\epsilon)$) factors) that achieves a
$(1 \pm \epsilon)$ multiplicative error guarantee, without prior knowledge of the stream length or dependence on the size
of the data universe. This is within an $O(\sqrt{\log(\epsilon n)})$ factor of optimal.
Moreover, our sketch is fully mergeable, which makes it suitable for a parallel or distributed environment.
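
Illustration (not from the paper): a small Python sketch contrasting the additive $\epsilon n$ guarantee with the multiplicative $(1 \pm \epsilon)$ guarantee for rank queries, using the rank definition stated above. The helper names `rank`, `within_additive`, and `within_multiplicative`, and the sample numbers, are hypothetical. The sketch computes exact ranks by brute force and does not implement the paper's sketch.

```python
def rank(stream, y):
    """Exact rank of y: the number of stream items strictly smaller than y."""
    return sum(1 for x in stream if x < y)

def within_additive(estimate, true_rank, eps, n):
    """Additive guarantee: error at most eps * n, regardless of the rank."""
    return abs(estimate - true_rank) <= eps * n

def within_multiplicative(estimate, true_rank, eps):
    """Multiplicative guarantee: error at most eps * (true rank)."""
    return abs(estimate - true_rank) <= eps * true_rank

stream = list(range(1000))
true_rank = rank(stream, 10)   # ten items (0..9) are smaller than 10
estimate = 15                  # a hypothetical answer from some sketch

print(within_additive(estimate, true_rank, 0.01, len(stream)))
print(within_multiplicative(estimate, true_rank, 0.01))
```

In this example an estimate of 15 for a true rank of 10 satisfies the additive bound (error 5 is below $\epsilon n = 10$) but violates the multiplicative one, which is why the multiplicative guarantee is the more demanding one near the tail of the distribution.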

2020 

Vanishing-Error Approximate Degree and QMA Complexity
Alexander A. Sherstov
and Justin Thaler
Accepted to CJTCS
Links: [ECCC]
Abstract:
The $\epsilon$-approximate degree of a function $f\colon X \to \{0,1\}$
is the least degree of a multivariate real polynomial $p$ such that $|p(x)-f(x)| \leq \epsilon$ for all $x \in X$.
We determine the $\epsilon$-approximate degree of the element distinctness function, the surjectivity function, and the permutation testing problem,
showing they are $\Theta(n^{2/3}
\log^{1/3}(1/\epsilon))$, $\tilde\Theta(n^{3/4} \log^{1/4}(1/\epsilon))$, and
$\Theta(n^{1/3} \log^{2/3}(1/\epsilon))$, respectively. Previously, these bounds were known only for constant $\epsilon.$
We also derive a connection between vanishing-error approximate
degree and quantum Merlin-Arthur (QMA) query complexity.
We use this connection to show that the QMA complexity of permutation testing is
$\Omega(n^{1/4})$. This
improves on the previous best lower bound of
$\Omega(n^{1/6})$ due to Aaronson (Quantum Information & Computation, 2012),
and comes somewhat close to matching a known upper bound of $O(n^{1/3})$.

Streaming Verification for Graph Problems: Optimal Tradeoffs and Nonlinear Sketches
Amit Chakrabarti,
Prantar Ghosh,
and Justin Thaler
RANDOM 2020
Links: [ECCC]
Abstract:
We study graph computations in an enhanced data streaming setting, where a space-bounded
client reading the edge stream of a massive graph may delegate some of its work to a cloud
service. We seek algorithms that allow the client to verify a purported
proof sent by the cloud service that the work done in the cloud is correct.
A line of work starting with Chakrabarti et al. (ICALP 2009) has provided such algorithms,
which we call schemes, for several statistical and graph-theoretic problems,
many of which exhibit a tradeoff between the length of the proof and the space used by the streaming verifier.
This work designs new schemes for a number of basic graph problems (including triangle counting,
maximum matching, topological sorting, and single-source shortest paths) where past work had either
failed to obtain smooth tradeoffs between these two key complexity measures or only obtained suboptimal tradeoffs.
Our key innovation is having the verifier compute certain \emph{nonlinear} sketches of the input stream,
leading to either new or improved tradeoffs. In many cases, our schemes in fact provide optimal tradeoffs up to logarithmic factors.
Specifically, for most graph problems that we study, it is known that the product of the verifier's space
cost $v$ and the proof length $h$ must be at least $\Omega(n^2)$ for $n$-vertex graphs.
However, matching upper bounds are only known for a handful of settings of $h$ and $v$
on the curve $h \cdot v=\tilde{\Theta}(n^2)$. For example, for counting triangles and maximum matching,
schemes with costs lying on this curve are only known for $(h=\tilde{O}(n^2), v=\tilde{O}(1))$, $(h=\tilde{O}(n), v=\tilde{O}(n))$,
and the trivial $(h=\tilde{O}(1), v=\tilde{O}(n^2))$. A major message of this work is that by
exploiting nonlinear sketches, a significant "portion" of costs on the tradeoff curve $h \cdot v = n^2$ can be achieved.

Improved Approximate Degree Bounds For k-Distinctness
Nikhil Mande,
Justin Thaler, and Shuchen Zhu
TQC 2020
Links: [ECCC] [Conference presentation video by Shuchen]
Abstract:
An open problem that is widely regarded
as one of the most important
in quantum query complexity is to resolve
the quantum query complexity of the $k$-distinctness
function on inputs of size $N$. While the case
of $k=2$ (also called Element Distinctness) is well-understood,
there is a polynomial gap between the known upper and lower bounds
for all constants $k>2$.
Specifically, the best known upper bound is
$O\left(N^{(3/4)-1/(2^{k+2}-4)}\right)$ (Belovs, FOCS 2012),
while
the best known lower bound for $k\geq 2$ is
$\tilde{\Omega}\left(N^{2/3} + N^{(3/4)-1/(2k)}\right)$
(Aaronson and Shi, J. ACM 2004; Bun, Kothari, and Thaler, STOC 2018).
For any constant $k \geq 4$, we improve the lower bound to
$\tilde{\Omega}\left(N^{(3/4)-1/(4k)}\right)$.
This yields, for example, the first proof that $4$-distinctness
is strictly harder than Element Distinctness. Our lower bound applies more generally to approximate degree.
As a secondary result, we give a simple construction of an approximating polynomial of degree $\tilde{O}(N^{3/4})$ that applies whenever $k \leq \text{polylog}(N)$.

Quantum Lower Bounds for Approximate Counting via Laurent Polynomials
Scott Aaronson,
Robin Kothari,
William Kretschmer, and Justin Thaler
QIP 2020, CCC 2020
Links: [arXiv] [slides] [30-minute talk video by William]
Abstract:
This paper proves new limitations on the power of quantum computers to solve
approximate counting, that is, multiplicatively estimating the
size of a nonempty set $S\subseteq \lbrack N]$.
Given only a membership oracle for $S$, it is well known that approximate
counting takes $\Theta \bigl(\sqrt{N/\left\vert S\right\vert }\bigr)$ quantum queries. But what
if a quantum algorithm is also given "QSamples" (i.e., copies of the state $\left\vert
S\right\rangle =\sum_{i\in S}\left\vert i\right\rangle$) or even the ability to apply
reflections about $\left\vert S\right\rangle$? Our first main result is that, even then, the
algorithm needs either $\Theta \bigl(\sqrt{N/\left\vert S\right\vert }\bigr)$
queries or else $\Theta \Bigl(\min \bigl\{\left\vert S\right\vert ^{1/3},
\sqrt{N/\left\vert S\right\vert }\bigr\}\Bigr)$ reflections or samples.
We also give matching upper bounds.
We prove the lower bound using a novel generalization of the polynomial
method of Beals et al. to Laurent polynomials, which can have
negative exponents. We lower-bound Laurent polynomial degree using two
methods: a new "explosion argument" that
pits the positive- and negative-degree parts of the polynomial against each
other, and a new formulation of the dual polynomials method.
Our second main result rules out the possibility of a black-box Quantum
Merlin-Arthur (or $\mathsf{QMA}$) protocol for proving that a set is large.
More precisely, we show that, even if Arthur can make $T$ quantum queries
to the set $S\subseteq \lbrack N]$, and also receives an $m$-qubit quantum
witness from Merlin in support of $S$ being large, we have $Tm=\Omega \Bigl(
\min \bigl\{ \left\vert S\right\vert ,\sqrt{N/\left\vert S\right\vert }
\bigr\} \Bigr) $. This resolves the open problem of giving an oracle
separation between $\mathsf{SBP}$, the complexity class that captures
approximate counting, and $\mathsf{QMA}$.
Note that $\mathsf{QMA}$ is "stronger"
than the queries+QSamples model in that Merlin's witness can be anything,
rather than just the specific state $\left\vert S\right\rangle $, but also
"weaker" in that Merlin's witness cannot
be trusted. Intriguingly, Laurent polynomials also play a crucial
role in our $\mathsf{QMA}$ lower bound, but in a completely different
manner than in the queries+QSamples lower bound. This suggests that the
"Laurent polynomial method" might be
broadly useful in complexity theory.

Ad Hoc Multi-Input Functional Encryption
Shweta Agrawal,
Michael Clear,
Ophir Frieder,
Sanjam Garg,
Adam O'Neill,
Justin Thaler
ITCS 2020
Links: [eprint]
Abstract:
Consider sources that supply sensitive data to an aggregator.
Standard encryption only hides the data from eavesdroppers,
but using specialized encryption one can hope to hide the data
(to the extent possible) from the aggregator itself. For flexibility and
security, we envision schemes that allow sources to supply encrypted data,
such that at any point a dynamically chosen subset of sources can allow an
agreed-upon joint function of their data to be computed by the aggregator.
A primitive called multi-input functional encryption (MIFE), due to Goldwasser et al.
(EUROCRYPT 2014), comes close, but has two main limitations:
it requires trust in a third party, who is able to decrypt all the data, and
it requires function arity to be fixed at setup time and to be equal to the number of parties.
To drop these limitations, we introduce a new notion of ad hoc MIFE.
In our setting, each source generates its own public key and issues individual,
function-specific secret keys to an aggregator. For successful decryption, an
aggregator must obtain a separate key from each source whose ciphertext is being
computed upon. The aggregator could obtain multiple such secret keys from a user corresponding
to functions of varying arity. For this primitive, we obtain the following results:
– We show that standard MIFE for general functions can be bootstrapped to ad hoc MIFE for free,
i.e. without making any additional assumption.
– We provide a direct construction of ad hoc MIFE for the inner product functionality based on the Learning with Errors
(LWE) assumption. This yields the first construction of this natural primitive based on a standard assumption.
At a technical level, our results are obtained by combining standard MIFE schemes and
two-round secure multiparty computation (MPC) protocols in novel
ways, highlighting an interesting interplay between MIFE and two-round
MPC in the construction of non-interactive primitives.

2019 

Approximate Degree, Secret Sharing, and Concentration Phenomena
Andrej Bogdanov,
Nikhil Mande,
Justin Thaler, and Christopher Williamson
RANDOM 2019
Links: [ECCC]
Abstract:
The $\epsilon$-approximate degree $\widetilde{\text{deg}}_{\epsilon}(f)$ of a
Boolean function $f$ is the least degree of a real-valued polynomial
that approximates $f$ pointwise to within $\epsilon$. A sound and
complete certificate for approximate degree being at least
$k$ is a pair of probability distributions, also known as a
dual polynomial, that are perfectly $k$-wise indistinguishable,
but are distinguishable by $f$ with advantage $1 - \epsilon$. Our contributions are:
– We give a simple, explicit new construction of a dual polynomial for the $\mathsf{AND}$ function on $n$ bits, certifying that its $\epsilon$-approximate degree is $\Omega\left(\sqrt{n \log 1/\epsilon}\right)$. This construction is the first to extend to the notion of weighted degree, and yields the first explicit certificate that the $1/3$-approximate degree of any (possibly unbalanced) read-once DNF is $\Omega(\sqrt{n})$. It draws a novel connection between the approximate degree of $\mathsf{AND}$ and
anti-concentration of the Binomial distribution.
– We show that any pair of symmetric distributions on $n$-bit strings that are perfectly $k$-wise indistinguishable are also statistically $K$-wise indistinguishable with at most $K^{3/2} \cdot \exp\left(-\Omega\left(k^2/K\right)\right)$ error for all $k < K \leq n/64$.
This bound is essentially tight, and implies that any symmetric function $f$ is a reconstruction function with constant advantage for a ramp secret sharing scheme that is secure against size-$K$ coalitions with statistical error $K^{3/2} \cdot \exp\left(-\Omega\left(\widetilde{\text{deg}}_{1/3}(f)^2/K\right)\right)$ for all values of $K$ up to $n/64$ simultaneously.
Previous secret sharing schemes required that $K$ be determined in advance, and only worked for $f=\mathsf{AND}$. Our analysis draws another new connection between approximate degree and concentration phenomena.
As a corollary of this result, we show that for any $d \leq n/64$, any degree $d$ polynomial approximating a symmetric function $f$ to error $1/3$
must have coefficients of $\ell_1$-norm at least $K^{3/2} \cdot \exp\left({\Omega\left(\widetilde{\text{deg}}_{1/3}\left(f\right)^2/d\right)}\right)$.
We also show this bound is essentially tight for any $d > \widetilde{\text{deg}}_{1/3}(f)$.
These upper and lower bounds were also previously only known in the case $f=\mathsf{AND}$.

The Large-Error Approximate Degree of AC^0
Mark Bun and Justin Thaler
RANDOM 2019; Theory of Computing (Special Issue for RANDOM 2019)
Links: [ECCC]
Abstract:
We prove two new results about the inability of lowdegree polynomials to uniformly approximate
constantdepth circuits, even to slightlybetterthantrivial error.
First, we prove a tight $\tilde{\Omega}(n^{1/2})$ lower bound on the threshold degree of the $\mathsf{SURJECTIVITY}$
function on $n$ variables. This matches the best known threshold degree bound for any AC$^0$ function,
previously exhibited by a much more complicated circuit of larger depth (Sherstov, FOCS 2015).
Our result also extends to a $2^{\tilde{\Omega}(n^{1/2})}$ lower bound on the sign-rank of an AC$^0$ function,
improving on the previous best bound of $2^{\Omega(n^{2/5})}$ (Bun and Thaler, ICALP 2016).
Second, for any $\delta>0$, we exhibit a function $f \colon \{-1, 1\}^n \to \{-1, 1\}$
that is computed by a circuit of depth $O(1/\delta)$ and is hard to approximate by polynomials in the
following sense: $f$ cannot be uniformly approximated to error
$\epsilon=1-2^{-\Omega(n^{1-\delta})}$, even by polynomials of degree $n^{1-\delta}$.
Our recent prior work (Bun and Thaler, FOCS 2017)
proved a similar lower bound, but which held only for error $\epsilon=1/3$.
Our result implies $2^{\Omega(n^{1-\delta})}$ lower bounds on the complexity of AC$^0$
under a variety of basic measures such as discrepancy, margin complexity, and threshold weight. This nearly
matches the trivial upper bound of $2^{O(n)}$ that holds for every function. The previous best lower bound on AC$^0$
for these measures was $2^{\Omega(n^{1/2})}$ (Sherstov, FOCS 2015). Additional applications in learning theory, communication
complexity, and cryptography are described.

Sign-Rank Can Increase
Under Intersection
Mark Bun,
Nikhil Mande,
Justin Thaler
ICALP 2019; TOCT
Links: [ECCC]
Abstract:
The communication class UPP$^{\text{cc}}$ is a communication analog of the Turing Machine complexity class PP.
It is characterized by a matrix-analytic complexity measure called sign-rank (also called dimension complexity),
and is essentially the most powerful communication class against which we know how to prove lower bounds.
For a communication problem $f$, let $f \wedge f$ denote the function that evaluates $f$ on two disjoint
inputs and outputs the AND of the results. We exhibit a communication problem $f$ with UPP$(f)=O(\log n)$, and
UPP$(f \wedge f)=\Theta(\log^2 n)$.
This is the first result showing that UPP communication complexity can increase by more than a constant
factor under intersection.
We view this as a first step toward showing that UPP$^{\text{cc}}$, the class of problems with
polylogarithmic-cost UPP communication protocols, is not closed under intersection.
Our result shows that the function class consisting of intersections of
two majorities on $n$ bits has dimension complexity $n^{\Omega(\log n)}$.
This matches an upper bound of (Klivans, O'Donnell, and Servedio, FOCS 2002),
who used it to give a quasipolynomial time algorithm for PAC learning intersections of
polylogarithmically many majorities. Hence, fundamentally new techniques will be needed
to learn this class of functions in polynomial time.

Quantum Algorithms and Approximating Polynomials
for Composed Functions With Shared Inputs
Mark Bun,
Robin Kothari,
Justin Thaler
SODA 2019, full version in Quantum.

2018 

Doubly-efficient zkSNARKs without trusted setup
Riad S. Wahby, Ioanna Tzialla, abhi shelat, Justin Thaler, and Michael Walfish
IEEE Symposium on Security and Privacy (S&P) 2018
Links: [eprint]
Abstract:
We present a zero-knowledge argument for NP with low communication complexity,
low concrete cost for both the prover and the verifier, and no trusted setup,
based on standard cryptographic assumptions (DDH). Specifically, communication
is proportional to at most the square root of the witness size, plus $d \cdot \log(G)$, for $d$ the depth and
$G$ the width of the verifying circuit.
Moreover, witness-related communication can be reduced below square root,
at the cost of increased verifier runtime. When applied to batched or
data-parallel statements, the prover's runtime is linear and the verifier's
is sublinear in the verifying circuit size, both with good constants.
Together, these properties represent a new point in the tradeoffs among setup,
complexity assumptions, proof size, and computational cost.
Our argument is public coin, so we apply the Fiat-Shamir heuristic to produce a
zero-knowledge succinct non-interactive argument of knowledge (zkSNARK), which we
call Hyrax. We evaluate Hyrax on three benchmarks: SHA-256 Merkle trees, image
transformation, and matrix multiplication. We find that Hyrax's proofs are 2−10x
smaller than prior work with similar properties, and that Hyrax scales to 6−27x
larger circuits than a highly-optimized prior system that requires trusted setup.

The Polynomial Method Strikes Back:
Tight Quantum Query Bounds via Dual Polynomials
Mark Bun,
Robin Kothari,
Justin Thaler
STOC 2018; Preliminary version in QIP 2018 (plenary talk). Invited to Theory of Computing.
Manuscript link: [arXiv]
Abstract:
The approximate degree of a Boolean function f is the least degree of a real polynomial
that approximates f pointwise to error at most 1/3. The approximate degree of f is known to
be a lower bound on the quantum query complexity of f (Beals et al., FOCS 1998 and J. ACM 2001).
We resolve or nearly resolve the approximate degree and quantum query complexities of
several basic functions. Specifically, we show the following:
 $k$-distinctness: For any constant $k$, the approximate degree and quantum query complexity
of the $k$-distinctness function is $\Omega(n^{3/4-1/(2k)})$.
This is nearly tight for large $k$, as Belovs (FOCS 2012)
has shown that for any constant $k$, the approximate degree and quantum query complexity of
$k$-distinctness is $O(n^{3/4-1/(2^{k+2}-4)})$.
 Image Size Testing: The approximate degree and quantum query complexity of testing the
size of the image of a function $[n] \rightarrow [n]$ is $\tilde{\Omega}(n^{1/2})$. This proves a conjecture of
Ambainis et al. (SODA 2016), and it implies tight lower bounds on the approximate degree and quantum query
complexity of the following natural problems.
 $k$-junta testing: A tight $\tilde{\Omega}(k^{1/2})$ lower bound for $k$-junta testing,
answering the main open question of Ambainis et al. (SODA 2016).
 Statistical Distance from Uniform: A tight $\tilde{\Omega}(n^{1/2})$ lower bound for
approximating the statistical distance from uniform of a distribution,
answering the main question left open by Bravyi et al. (STACS 2010 and IEEE Trans. Inf. Theory 2011).
 Shannon entropy: A tight $\tilde{\Omega}(n^{1/2})$ lower bound for approximating
Shannon entropy up to a certain additive constant, answering a question of Li and Wu (2017).
 Surjectivity: The approximate degree of the Surjectivity function is $\tilde{\Omega}(n^{3/4})$.
The best prior lower bound was $\Omega(n^{2/3})$. Our result matches an upper bound of
$\tilde{O}(n^{3/4})$ due to Sherstov, which we reprove using different techniques.
The quantum query complexity of this function is known to be $\Theta(n)$
(Beame and Machmouchi, Quantum Inf. Comput. 2012 and Sherstov, FOCS 2015).
Our upper bound for Surjectivity introduces new techniques for approximating Boolean
functions by low-degree polynomials. Our lower bounds are proved by significantly
refining techniques recently introduced by Bun and Thaler (FOCS 2017).
Slides: [slides giving unified
overview of BT17 and BKT18 (size: 3MB)]
[slides more focused on BKT18 (size: 11MB)]
Lecture notes: These lecture notes
elucidate various aspects of the proof. Whereas the proof in the paper
takes place entirely in the "dual" world, the notes explain how to replace a key step
of the proof with a "primal" argument that many will find more intuitive.

Approximate Degree and the Complexity of Depth Three Circuits
Mark Bun and
Justin Thaler
RANDOM 2018
Links: [ECCC] [slides]
Abstract:
Threshold weight, margin complexity, and Majority-of-Threshold circuit size
are basic complexity measures of Boolean functions that arise in learning
theory, communication complexity, and circuit complexity. Each of these
measures might exhibit a "chasm" at depth three: namely, all
polynomial size Boolean circuits of depth two have polynomial
complexity under the measure, but there may exist Boolean circuits
of depth three that have essentially maximal complexity $\exp(\Omega(n))$.
However, existing techniques are far from showing this:
for all three measures, the best lower bound for depth three circuits is
$\exp(\Omega(n^{2/5}))$. Moreover, current methods exclusively study block-composed
functions. Such methods appear intrinsically unable to prove lower
bounds better than $\exp(\Omega(n^{1/2}))$ even for depth four circuits,
and have yet to prove lower bounds better than $\exp(\Omega(n^{1/2}))$ for circuits of any constant depth.
We take a step toward showing that all of these complexity measures indeed exhibit a chasm at depth three.
Specifically, for any arbitrarily small constant $\delta > 0$, we exhibit a depth three circuit of polynomial size
(in fact, an $O(\log n)$-decision list) of complexity $\exp(\Omega(n^{1/2-\delta}))$ under each of these measures.
Our methods go beyond the block-composed functions studied in prior work,
and hence may not be subject to the same barriers. In particular, we suggest
natural candidate functions that may exhibit stronger bounds, of the form $\exp(\tilde{\Omega}(n))$,
where the notation hides factors polylogarithmic in $n$.

2017 

Full Accounting for Verifiable Outsourcing
Riad S. Wahby, Ye Ji, Andrew J. Blumberg, abhi shelat, Justin Thaler,
Michael Walfish, and Thomas Wies
CCS 2017
Links: [eprint]
Abstract:
Systems for verifiable outsourcing incur costs for a prover, a verifier,
and precomputation; outsourcing makes sense when these costs are cheaper than
not outsourcing. Yet, prover costs are generally ignored. The only exception is
Verifiable ASICs (VA), wherein the prover is a custom chip; however, the only
prior VA system ignores the cost of precomputation.
This paper describes a new VA system, called Giraffe; charges Giraffe for all
three costs; and identifies regimes where outsourcing is worthwhile. Giraffe’s
base is an interactive proof geared to data parallel computation. Giraffe makes
this protocol asymptotically optimal for the prover, which is of independent interest.
Giraffe also develops a design template that produces hardware designs automatically
for a wide range of parameters, introduces hardware primitives molded to the protocol’s
data flows, and incorporates program analyses that expand applicability. Giraffe wins
even when outsourcing several tens of subcomputations, scales to 500x larger
computations than prior work, and can profitably outsource parts of programs that
are not worthwhile to outsource in full.

A High-Performance Algorithm for Identifying Frequent Items in Data Streams
Daniel Anderson,
Pryce Bevan,
Kevin Lang,
Edo Liberty,
Lee Rhodes, and Justin Thaler
IMC 2017
Links: [arxiv]
Abstract:
Estimating frequencies of items over data streams is a common building block in streaming data measurement and analysis. Misra and Gries introduced their seminal algorithm for the problem in 1982, and the problem has since been revisited many times due to its practicality and applicability. We describe a highly optimized version of Misra and Gries’ algorithm that is suitable for deployment in industrial settings. Our code is made public via an open source library called DataSketches that is already used by several companies and production systems.
Our algorithm improves on two theoretical and practical aspects of prior work. First, it handles weighted updates in amortized constant time, a common requirement in practice. Second, it uses a simple and fast method for merging summaries that asymptotically improves on prior work even for unweighted streams. We describe experiments confirming that our algorithms are more efficient than prior proposals.
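For orientation, here is a minimal sketch of the textbook Misra-Gries summary extended to weighted updates. It is not the optimized algorithm from the paper or the DataSketches implementation; the function name `misra_gries` and the simple "subtract the minimum counter" reduction step are illustrative assumptions.

```python
def misra_gries(stream, k):
    """Textbook-style Misra-Gries summary with weighted updates (illustrative).

    `stream` yields (item, weight) pairs with positive weights; at most `k`
    counters are kept. Each returned count underestimates the true weight of
    its item by at most W / (k + 1), where W is the total stream weight.
    """
    counters = {}
    for item, weight in stream:
        counters[item] = counters.get(item, 0) + weight
        if len(counters) > k:
            # Reduction step: subtract the smallest counter from every
            # counter and drop the ones that reach zero.
            smallest = min(counters.values())
            counters = {x: c - smallest for x, c in counters.items() if c > smallest}
    return counters


# Hypothetical toy stream: total weight 12, true weight of "a" is 8.
example = [("a", 5), ("b", 1), ("c", 1), ("a", 3), ("d", 2)]
summary = misra_gries(example, k=2)
print(summary)  # {'a': 7, 'd': 2}
```

This naive reduction scans all counters whenever the summary overflows, so it does not achieve the amortized constant update time that the abstract describes.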

A Nearly Optimal Lower Bound on the Approximate Degree of AC$^0$
Mark Bun and
Justin Thaler
FOCS 2017. SICOMP (Special Issue for FOCS '17)
Manuscript link: [ECCC]
Abstract:
The approximate degree of a Boolean function $f \colon \{-1, 1\}^n \rightarrow \{-1, 1\}$ is the least degree of a real polynomial that approximates $f$ pointwise to error at most $1/3$. We introduce a generic method for increasing the approximate degree of a given function, while preserving its computability by constant-depth circuits.
Specifically, we show how to transform any Boolean function $f$ with approximate degree $d$ into a function $F$ on $O(n \cdot \text{polylog}(n))$ variables with approximate degree at least $D = \Omega(n^{1/3} \cdot d^{2/3})$. In particular, if $d= n^{1-\Omega(1)}$, then $D$ is polynomially larger than $d$. Moreover, if $f$ is computed by a polynomial-size Boolean circuit of constant depth, then so is $F$.
By recursively applying our transformation, for any constant $\delta > 0$ we exhibit an AC$^0$ function of approximate degree $\Omega(n^{1-\delta})$. This improves over the best previous lower bound of $\Omega(n^{2/3})$ due to Aaronson and Shi (J. ACM 2004), and nearly matches the trivial upper bound of $n$ that holds for any function. Our lower bounds also apply to (quasipolynomial-size) DNFs of polylogarithmic width.
We describe several applications of these results. We give:
 For any constant $\delta > 0$, an $\Omega(n^{1-\delta})$ lower bound on the quantum communication complexity of a function in AC$^0$.
 A Boolean function $f$ with approximate degree at least $C(f)^{2-o(1)}$, where $C(f)$ is the certificate complexity of $f$. This separation is optimal up to the $o(1)$ term in the exponent.
 Improved secret sharing schemes with reconstruction procedures in AC$^0$.
Slides and video: [slides] [BIRS Talk Video (45 minutes)]
Lecture notes: These lecture notes
elucidate various aspects of the proof. Whereas the proof in the paper
takes place entirely in the "dual" world, the notes explain how to replace a key step
of the proof with a "primal" argument that many will find more intuitive.

On the Power of Statistical Zero Knowledge
Adam Bouland,
Lijie Chen,
Dhiraj Holden, Justin Thaler, and
Prashant Nalini Vasudevan
FOCS 2017. SICOMP (special issue for FOCS '17)
Links: [ECCC]
Abstract:
We examine the power of statistical zero knowledge proofs (captured by the complexity class SZK) and their variants. First, we give the strongest known relativized evidence that SZK contains hard problems, by exhibiting an oracle relative to which SZK (indeed, even NISZK) is not contained in the class UPP, containing those problems solvable by randomized algorithms with unbounded error. This answers an open question of Watrous from 2002. Second, we "lift" this oracle separation to the setting of communication complexity, thereby answering a question of Göös et al. (ICALP 2016). Third, we give relativized evidence that perfect zero knowledge proofs (captured by the class PZK) are weaker than general zero knowledge proofs. Specifically, we exhibit oracles relative to which SZK is not contained in PZK, NISZK is not contained in NIPZK, and PZK is not equal to coPZK. The first of these results answers a question raised in 1991 by Aiello and Håstad (Information and Computation), and the second answers a question of Lovett and Zhang (2016). We also describe additional applications of these results outside of structural complexity.
The technical core of our results is a stronger hardness amplification theorem for approximate degree, which roughly says that composing the gapped-majority function with any function of high approximate degree yields a function with high threshold degree.

Reliably Learning the ReLU in Polynomial Time
Surbhi Goel,
Varun Kanade,
Adam Klivans
and Justin Thaler
COLT 2017
Links: [arXiv]
Abstract:
We give the first dimension-efficient algorithms for learning Rectified Linear Units (ReLUs), which are functions of the form $x \mapsto \max(0, w \cdot x)$ with $w \in \mathbb{S}^{n-1}$. Our algorithm works in the challenging Reliable Agnostic learning model of Kalai, Kanade, and Mansour (2009) where the learner is given access to a distribution $D$ on labeled examples but the labeling may be arbitrary. We construct a hypothesis that simultaneously minimizes the false-positive rate and the loss on inputs given positive labels by $D$, for any convex, bounded, and Lipschitz loss function.
The algorithm runs in polynomial time (in $n$) with respect to any distribution on $\mathbb{S}^{n-1}$ (the unit sphere in $n$ dimensions) and for any error parameter $\epsilon=\Omega(1/\log n)$ (this yields a PTAS for a question raised by F. Bach on the complexity of maximizing ReLUs). These results are in contrast to known efficient algorithms for reliably learning linear threshold functions, where $\epsilon$ must be $\Omega(1)$ and strong assumptions are required on the marginal distribution. We can compose our results to obtain the first set of efficient algorithms for learning constant-depth networks of ReLUs.
Our techniques combine kernel methods and polynomial approximations with a "dual-loss" approach to convex programming. As a byproduct we obtain a number of applications including the first set of efficient algorithms for "convex piecewise-linear fitting" and the first efficient algorithms for noisy polynomial reconstruction of low-weight polynomials on the unit sphere.

Determining Tournament Payout Structures for Daily Fantasy Sports
Christopher Musco, Maxim Sviridenko, and
Justin Thaler
ALENEX 2017; Invited to special issue of ACM Journal of Experimental Algorithmics for ALENEX 2017
Links: [arXiv]
Abstract:
With an exploding global market and the recent introduction of online cash prize tournaments, fantasy sports contests are quickly becoming a central part of the social gaming and sports industries. For sports fans and online media companies, fantasy sports contests are an opportunity for large financial gains. However, they present a host of technical challenges that arise from the complexities involved in running a web-scale, prize-driven fantasy sports platform.
We initiate the study of these challenges by examining one concrete problem in particular: how to algorithmically generate contest payout structures that are 1) economically motivating and appealing to contestants and 2) reasonably structured and succinctly representable. We formalize this problem and present a general two-staged approach for producing satisfying payout structures given constraints on contest size, entry fee, prize bucketing, etc.
We then propose and evaluate several potential algorithms for solving the payout problem efficiently, including methods based on dynamic programming, integer programming, and heuristic techniques. Experimental results show that a carefully designed heuristic scales very well, even to contests with over 100,000 prize winners.
Our approach extends beyond fantasy sports: it is suitable for generating engaging payout structures for any contest with a large number of entrants and a large number of prize winners, including other massive online games, poker tournaments, and real-life sports tournaments.

2016 

Dual Polynomials for Collision and Element Distinctness
Mark Bun and
Justin Thaler
Theory of Computing
Links: [ECCC]
Abstract:
The approximate degree of a Boolean function $f \colon \{-1, 1\}^n \rightarrow \{-1, 1\}$ is the minimum degree of a real polynomial that approximates $f$ to within error $1/3$ in the $L_{\infty}$ norm. In an influential result, Aaronson and Shi (J. ACM 2004) proved tight $\tilde{\Omega}(n^{1/3})$ and $\tilde{\Omega}(n^{2/3})$ lower bounds on the approximate degree of the Collision and Element Distinctness functions, respectively. Their proof was non-constructive, using a sophisticated symmetrization argument and tools from approximation theory.
More recently, several open problems in the study of approximate degree have been resolved via the construction of dual polynomials. These are explicit dual solutions to an appropriate linear program that captures the approximate degree of any function. We reprove Aaronson and Shi's results by constructing explicit dual polynomials for the Collision and Element Distinctness functions.
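For readers unfamiliar with the term, the standard linear-programming formulation behind dual polynomials can be stated as follows (this is the well-known general characterization, written in notation chosen here, not the specific construction of the paper). For $f \colon \{-1, 1\}^n \rightarrow \{-1, 1\}$ and a degree bound $d$, the best achievable error is the value of the linear program

$$\min_{p,\,\epsilon} \ \epsilon \quad \text{subject to} \quad |p(x) - f(x)| \leq \epsilon \ \text{ for all } x \in \{-1, 1\}^n, \qquad \deg(p) \leq d.$$

By linear programming duality, the $\epsilon$-approximate degree of $f$ exceeds $d$ if and only if there is a function $\psi \colon \{-1, 1\}^n \rightarrow \mathbb{R}$, called a dual polynomial, satisfying

$$\sum_{x} \psi(x) f(x) > \epsilon, \qquad \sum_{x} |\psi(x)| = 1, \qquad \sum_{x} \psi(x)\, p(x) = 0 \ \text{ for every polynomial } p \text{ with } \deg(p) \leq d.$$

The three conditions say that $\psi$ is well correlated with $f$, has unit $\ell_1$-norm, and is orthogonal to all low-degree polynomials.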

Improved Bounds on the Sign-Rank of AC$^0$
Mark Bun and
Justin Thaler
ICALP 2016
Links: [ECCC] [slides]
Abstract:
The sign-rank of a matrix $A$ with entries in $\{-1, +1\}$ is the least rank of a real matrix $B$ with $A_{ij}\cdot B_{ij} > 0$ for all $i$, $j$. Razborov and Sherstov (2008) gave the first exponential lower bounds on the sign-rank of a function in AC$^0$, answering an old question of Babai, Frankl, and Simon (1986). Specifically, they exhibited a matrix $A=[F(x, y)]_{x, y}$ for a specific function $F: \{-1, 1\}^n \times \{-1, 1\}^n \rightarrow \{-1, 1\}$ in AC$^0$, such that $A$ has sign-rank $\exp(\Omega(n^{1/3}))$.
We prove a generalization of Razborov and Sherstov's result, yielding exponential sign-rank lower bounds for a non-trivial class of functions (that includes the function used by Razborov and Sherstov). As a corollary of our general result, we improve Razborov and Sherstov's lower bound on the sign-rank of AC$^0$ from $\exp(\Omega(n^{1/3}))$ to $\exp(\tilde{\Omega}(n^{2/5}))$. We also describe several applications to communication complexity, learning theory, and circuit complexity.
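To make the definition of sign-rank concrete, the following small sketch checks that a low-rank real matrix sign-represents a full-rank sign matrix. The "greater-than" matrix and the helper names `rank` and `sign_represents` are illustrative choices, not taken from the paper.

```python
from fractions import Fraction


def rank(matrix):
    """Rank over the rationals via exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def sign_represents(A, B):
    """True if A_ij * B_ij > 0 for every entry, as in the definition above."""
    return all(a * b > 0 for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


n = 4
# "Greater-than" sign matrix: +1 when i >= j, -1 otherwise. It has full rank.
A = [[1 if i >= j else -1 for j in range(n)] for i in range(n)]
# B_ij = 2(i - j) + 1 has the same sign pattern but rank only 2.
B = [[2 * (i - j) + 1 for j in range(n)] for i in range(n)]

print(rank(A), rank(B), sign_represents(A, B))  # 4 2 True
```

The example shows that the sign-rank of this matrix is at most 2 even though its ordinary rank is 4, which is why sign-rank lower bounds require more than rank arguments.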

Lower Bounds for the Approximate Degree of Block-Composed Functions
Justin Thaler
ICALP 2016
Links: [ECCC] [slides]
Abstract:
We describe a new hardness amplification result for pointwise approximation of Boolean
functions by low-degree polynomials. Specifically, for any function $f$ on $N$ bits, define
$F(x_1, \dots, x_M)=\textsf{OMB}(f(x_1), \dots, f(x_M))$ to be the function on $M \cdot N$ bits obtained by
block-composing $f$ with a specific DNF known as ODD-MAX-BIT. We show that, if $f$ requires
large degree to approximate to error $2/3$ in a certain one-sided sense (captured by a
complexity measure known as positive one-sided approximate degree), then $F$ requires large
degree to approximate even to error $1-2^{-M}$. This generalizes a result of Beigel, who
proved an identical result for the special case $f=\textsf{OR}$.
Unlike related prior work, our result implies strong approximate degree lower bounds even
for many functions $F$ that have low threshold degree.
Our proof is constructive: we exhibit a solution to the dual of an appropriate linear program
capturing the approximate degree of any function.
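The block composition above can be illustrated with a short sketch. It assumes one common convention for ODD-MAX-BIT on 0/1 inputs (output 1 exactly when the highest 1-indexed position holding a 1 is odd); the paper's exact indexing and sign conventions may differ, and the helper names are hypothetical.

```python
def odd_max_bit(bits):
    """ODD-MAX-BIT under the assumed convention: 1 iff the highest
    1-indexed position holding a 1 is odd; 0 if there is no 1 at all."""
    for position in range(len(bits), 0, -1):
        if bits[position - 1] == 1:
            return position % 2
    return 0


def block_compose(outer, inner, blocks):
    """F(x_1, ..., x_M) = outer(inner(x_1), ..., inner(x_M))."""
    return outer([inner(block) for block in blocks])


def OR(block):
    return int(any(block))


# Special case f = OR with M = 3 blocks of N = 2 bits each.
blocks = [[0, 1], [0, 0], [1, 1]]
print(block_compose(odd_max_bit, OR, blocks))  # 1 (inner values are [1, 0, 1])
```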

Semi-Streaming Algorithms for Annotated Graph Streams
Justin Thaler
ICALP 2016
Links: [arXiv] [slides]
Abstract:
Considerable effort has been devoted to the development of streaming algorithms for analyzing massive graphs. Unfortunately, many results have been negative, establishing that a wide variety of problems require $\Omega(n^2)$ space to solve. One of the few bright spots has been the development of semi-streaming algorithms for a handful of graph problems; these algorithms use space $O(n\cdot\text{polylog}(n))$.
In the annotated data streaming model of Chakrabarti et al., a computationally limited client wants to compute some property of a massive input, but lacks the resources to store even a small fraction of the input, and hence cannot perform the desired computation locally. The client therefore accesses a powerful but untrusted service provider, who not only performs the requested computation, but also proves that the answer is correct.
We put forth the notion of semi-streaming algorithms for annotated graph streams (semi-streaming annotation schemes for short). These are protocols in which both the client's space usage and the length of the proof are $O(n \cdot \text{polylog}(n))$. We give evidence that semi-streaming annotation schemes represent a substantially more robust solution concept than does the standard semi-streaming model. On the positive side, we give semi-streaming annotation schemes for two dynamic graph problems that are intractable in the standard model: (exactly) counting triangles, and (exactly) computing maximum matchings. The former scheme answers a question of Cormode. On the negative side, we identify for the first time two natural graph problems (connectivity and bipartiteness in a certain edge update model) that can be solved in the standard semi-streaming model, but cannot be solved by annotation schemes of "sub-semi-streaming" cost. That is, these problems are just as hard in the annotations model as they are in the standard model.

Space Lower Bounds for Itemset Frequency Sketches
Edo Liberty,
Michael Mitzenmacher,
Justin Thaler, and
Jonathan Ullman,
PODS 2016
Links: [arXiv]
Abstract:
Given a database, computing the fraction of rows that contain a query itemset or determining whether this fraction is above some threshold are fundamental operations in data mining. A uniform sample of rows is a good sketch of the database in the sense that all sufficiently frequent itemsets and their approximate frequencies are recoverable from the sample, and the sketch size is independent of the number of rows in the original database. For many seemingly similar problems there are better sketching algorithms than uniform sampling. In this paper we show that for itemset frequency sketching this is not the case. That is, we prove that there exist classes of databases for which uniform sampling is a space optimal sketch for approximate itemset frequency analysis, up to constant or iterated-logarithmic factors.
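The uniform-sampling sketch discussed above can be illustrated as follows. This is a minimal sketch assuming rows are Python sets and sampling is done with replacement; the function names, the toy database, and the sample size are hypothetical.

```python
import random


def uniform_sketch(rows, sample_size, seed=0):
    """Sketch a database by sampling rows uniformly with replacement."""
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in range(sample_size)]


def itemset_frequency(rows, itemset):
    """Fraction of rows that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for row in rows if itemset <= set(row)) / len(rows)


# Toy database: 1000 rows, a quarter of which contain both "a" and "b".
database = [{"a", "b"}] * 250 + [{"a"}] * 250 + [{"c"}] * 500
sketch = uniform_sketch(database, sample_size=200)

exact = itemset_frequency(database, {"a", "b"})      # 0.25
estimate = itemset_frequency(sketch, {"a", "b"})     # close to 0.25 with high probability
print(exact, estimate)
```

The sketch size (200 rows here) does not depend on the number of rows in the database, only on the desired accuracy, which is the property the abstract refers to.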

A Framework for Estimating Stream Expression Cardinalities
Anirban Dasgupta, Kevin Lang, Lee Rhodes, and Justin Thaler
ICDT 2016; Best Newcomer Paper Award; Invited to special issue of TODS for ICDT 2016
Links: [arXiv]
Abstract:
Given $m$ distributed data streams $A_1,\dots,A_m$, we consider the problem of estimating the number of unique identifiers in streams defined by set expressions over $A_1,\dots,A_m$. We identify a broad class of algorithms for solving this problem, and show that the estimators output by any algorithm in this class are perfectly unbiased and satisfy strong variance bounds. Our analysis unifies and generalizes a variety of earlier results in the literature. To demonstrate its generality, we describe several novel sampling algorithms in our class, and show that they achieve a novel tradeoff between accuracy, space usage, update speed, and applicability.

2015 

Streaming Verification in Data Analysis
Samira Daruki, Justin Thaler, and Suresh Venkatasubramanian
ISAAC 2015
Links: [arXiv]
Abstract:
Streaming interactive proofs (SIPs) are a framework to reason about outsourced computation, where a data owner (the verifier) outsources a computation to the cloud (the prover), but wishes to verify the correctness of the solution provided by the cloud service. In this paper we present streaming interactive proofs for problems in data analysis. We present protocols for clustering and shape fitting problems, as well as an improved protocol for rectangular matrix multiplication. The latter can in turn be used to verify $k$ eigenvectors of a (streamed) $n \times n$ matrix. In general our solutions use polylogarithmic rounds of communication and polylogarithmic total communication and verifier space. For special cases (when optimality certificates can be verified easily), we present constant round protocols with similar costs. For rectangular matrix multiplication and eigenvector verification, our protocols work in the more restricted annotated data streaming model, and use sublinear (but not polylogarithmic) communication.

Variable Selection is Hard
Dean Foster, Howard Karloff, and Justin Thaler
COLT 2015
Links: [arXiv] [Slides (Short Talk)]
Abstract:
Variable selection for sparse linear regression is the problem of finding, given an $m \times p$ matrix $B$ and a target vector $y$, a sparse vector $x$ such that $B \cdot x$ approximately equals $y$. Assuming a standard complexity hypothesis, we show that no polynomial-time algorithm can find a $k'$-sparse $x$ with $\|Bx-y\|^2 \leq h(m,p)$, where $k'=k \cdot 2^{\log^{1-\delta} p}$ and $h(m,p) \leq p^{C_1} \cdot m^{1-C_2}$, where $\delta>0, C_1>0,C_2>0$ are arbitrary. This is true even under the promise that there is an unknown $k$-sparse vector $x^*$ satisfying $Bx^*=y$. We prove a similar result for a statistical version of the problem in which the data are corrupted by noise.
To the authors' knowledge, these are the first hardness results for sparse regression that apply when the algorithm simultaneously has $k'>k$ and $h(m,p)>0$.

Hardness Amplification and the Approximate Degree of Constant-Depth Circuits
Mark Bun and
Justin Thaler
ICALP 2015
Links: [ECCC]
[Slides]
[Talk Video]
Abstract:
We establish a generic form of hardness amplification for the approximability of constant-depth Boolean circuits by polynomials. Specifically, we show that if a Boolean circuit cannot be pointwise approximated by low-degree polynomials to within constant error in a certain one-sided sense, then an OR of disjoint copies of that circuit cannot be pointwise approximated even with very high error. As our main application, we show that for every sequence of degrees $d(n)$, there is an explicit depth-three circuit $F: \{-1,1\}^n \to \{-1,1\}$ of polynomial size such that any degree-$d$ polynomial cannot pointwise approximate $F$ to error better than $1-\exp\left(-\tilde{\Omega}(nd^{-3/2})\right)$. As a consequence of our main result, we obtain an $\exp\left(-\tilde{\Omega}(n^{2/5})\right)$ upper bound on the discrepancy of a function in AC$^0$, and an $\exp\left(\tilde{\Omega}(n^{2/5})\right)$ lower bound on the threshold weight of AC$^0$, improving over the previous best results of $\exp\left(-\Omega(n^{1/3})\right)$ and $\exp\left(\Omega(n^{1/3})\right)$ respectively.
Our techniques also yield a new lower bound of $\Omega\left(n^{1/2}/\log^{(d-2)/2}(n)\right)$ on the approximate degree of the AND-OR tree of depth $d$, which is tight up to polylogarithmic factors for any constant $d$, as well as new bounds for read-once DNF formulas. In turn, these results imply new lower bounds on the communication and circuit complexity of these classes, and demonstrate strong limitations on existing PAC learning algorithms.

Verifiable Stream Computation and Arthur-Merlin Communication
Amit Chakrabarti, Graham Cormode, Andrew McGregor, Justin Thaler, and Suresh Venkatasubramanian
CCC 2015, SICOMP
Links: [ECCC] [Slides] [Talk Video]
Abstract:
In the setting of streaming interactive proofs (SIPs), a client (verifier) needs to compute a given function on a massive stream of data, arriving online, but is unable to store even a small fraction of the data. It outsources the processing to a third party service (prover), but is unwilling to blindly trust answers returned by this service. Thus, the service cannot simply supply the desired answer; it must convince the verifier of its correctness via a short interaction after the stream has been seen.
In this work we study "barely interactive" SIPs. Specifically, we show that two or three rounds of interaction suffice to solve several query problems (including Index, Median, Nearest Neighbor Search, Pattern Matching, and Range Counting) with polylogarithmic space and communication costs. Such efficiency with $O(1)$ rounds of interaction was thought to be impossible based on previous work.
On the other hand, we initiate a formal study of the limitations of constant-round SIPs by introducing a new hierarchy of communication models called Online Interactive Proofs (OIPs). The online nature of these models is analogous to the streaming restriction placed upon the verifier in an SIP. We give upper and lower bounds that (1) characterize, up to quadratic blowups, every finite level of the OIP hierarchy in terms of other well-known communication complexity classes, (2) separate the first four levels of the hierarchy, and (3) reveal that the hierarchy collapses to the fourth level. Our study of OIPs reveals marked contrasts and some parallels with the classic Turing Machine theory of interactive proofs, establishes limits on the power of existing techniques for developing constant-round SIPs, and provides a new characterization of (non-online) Arthur-Merlin communication in terms of an online model.

A Note on the GKR Protocol
Justin Thaler
Manuscript
Links: [pdf]
Abstract:
This note describes a simplification of the GKR interactive proof for circuit evaluation (Goldwasser, Kalai, and Rothblum, J. ACM 2015), as efficiently instantiated by Cormode, Mitzenmacher, and Thaler (ITCS 2012). The simplification reduces the prover runtime, round complexity, and total communication cost of the protocol by roughly 33%.

Verifiable Computation Using Multiple Provers
Andrew J. Blumberg, Justin Thaler, Victor Vu, and Michael Walfish
Manuscript
Links: [eprint]
Abstract:
The increasing ubiquity of the cloud computing paradigm has renewed focus on the classical problem of allowing weak clients to check the results of computation delegated to powerful servers. Recent advances in proof-based verifiable computation have led to several near-practical protocols. Protocols based on interactive proofs (IPs) work with highly restrictive models of computation and are thus efficient only for a limited class of computations. In contrast, protocols based on argument systems apply to a much larger class of computations, but efficiency requires amortization of very expensive setup costs.
This paper initiates the study of the practical efficiency of multiprover interactive proofs (MIPs). We present a new MIP for delegating computation that extends insights from a powerful IP protocol (Goldwasser et al., STOC, 2008). Without reductions or amplification, our protocol uses only two provers (departing from prior work on MIPs), and achieves both the efficiency of interactive proof-based protocols and the generality of argument system-based protocols. Also, this result, together with recently developed machinery, creates a potential avenue toward concretely efficient arguments without setup costs.
We describe Clover, a built system for verifiable computation, based on our protocol. Although Clover does not implement the full theory (it has setup costs), it applies to problems that existing IPs cannot efficiently handle, and achieves performance comparable to, or better than, the best argument systems.

2014 

Distribution-Independent Reliable Learning
Varun Kanade and
Justin Thaler
COLT 2014
Links: [arXiv]
Abstract:
We study several questions in the reliable agnostic learning framework of Kalai et al. (2009), which captures learning tasks in which one type of error is costlier than other types. A positive reliable classifier is one that makes no false positive errors. The goal in the positive reliable agnostic framework is to output a hypothesis with the following properties: (i) its false positive error rate is at most $\epsilon$, (ii) its false negative error rate is at most $\epsilon$ more than that of the best positive reliable classifier from the class. A closely related notion is fully reliable agnostic learning, which considers partial classifiers that are allowed to predict "unknown" on some inputs. The best fully reliable partial classifier is one that makes no errors and minimizes the probability of predicting "unknown", and the goal in fully reliable learning is to output a hypothesis that is almost as good as the best fully reliable partial classifier from a class.
For distribution-independent learning, the best known algorithms for PAC learning typically utilize polynomial threshold representations, while the state-of-the-art agnostic learning algorithms use pointwise polynomial approximations. We show that one-sided polynomial approximations, an intermediate notion between polynomial threshold representations and pointwise polynomial approximations, suffice for learning in the reliable agnostic settings. We then show that majorities can be fully reliably learned and disjunctions of majorities can be positive reliably learned, through constructions of appropriate one-sided polynomial approximations. Our fully reliable algorithm for majorities provides the first evidence that fully reliable learning may be strictly easier than agnostic learning. Our algorithms also satisfy strong attribute-efficiency properties, and in many cases they provide smooth tradeoffs between sample complexity and running time.

Parallel Peeling Algorithms
Jiayang Jiang, Michael Mitzenmacher, and Justin Thaler
SPAA 2014; Winner of Best Paper Award; full version in TOPC (special issue for SPAA 2014)
Links:
[arXiv] [slides] [Talk Video]
Abstract:
The analysis of several algorithms and data structures can be framed as a peeling process on a random hypergraph: vertices with degree less than $k$ are removed until there are no vertices of degree less than $k$ left. The remaining hypergraph is known as the $k$-core. In this paper, we analyze parallel peeling processes, where in each round, all vertices of degree less than $k$ are removed. It is known that, below a specific edge density threshold, the $k$-core is empty with high probability. We show that, with high probability, below this threshold, only $(\log \log n)/\log((k-1)(r-1)) + O(1)$ rounds of peeling are needed to obtain the empty $k$-core for $r$-uniform hypergraphs. Interestingly, we show that above this threshold, $\Omega(\log n)$ rounds of peeling are required to find the nonempty $k$-core. Since most algorithms and data structures aim to peel to an empty $k$-core, this asymmetry appears fortunate. We verify the theoretical results both with simulation and with a parallel implementation using graphical processing units (GPUs). Our implementation provides insights into how to structure parallel peeling algorithms for efficiency in practice.
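As an illustration of the parallel peeling process described in this abstract, the following is a minimal sequential sketch in Python. It is not the paper's GPU implementation; the function name `parallel_peel` and the hypergraph representation (a list of vertex tuples) are assumptions made for this example. It counts how many rounds are needed before no vertex of degree less than $k$ remains.

```python
from collections import defaultdict


def parallel_peel(num_vertices, hyperedges, k):
    """Simulate round-based peeling; return (rounds, surviving_vertices).

    In each round, every vertex whose current degree is less than k is
    removed, together with all hyperedges that contain such a vertex.
    The surviving vertices form the k-core (possibly empty).
    """
    alive_edges = {i: tuple(edge) for i, edge in enumerate(hyperedges)}
    incident = defaultdict(set)  # vertex -> ids of alive edges containing it
    for i, edge in alive_edges.items():
        for v in edge:
            incident[v].add(i)

    alive_vertices = set(range(num_vertices))
    rounds = 0
    while True:
        # All low-degree vertices are chosen simultaneously within a round.
        to_remove = {v for v in alive_vertices if len(incident[v]) < k}
        if not to_remove:
            return rounds, alive_vertices
        rounds += 1
        alive_vertices -= to_remove
        dead_edges = set()
        for v in to_remove:
            dead_edges |= incident[v]
        for i in dead_edges:
            for u in alive_edges[i]:
                incident[u].discard(i)
            del alive_edges[i]


# A path on 5 vertices (a 2-uniform hypergraph) peels from both ends.
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(parallel_peel(5, path, k=2))
```

On the path, the two endpoints go in the first round, their neighbours in the second, and the middle vertex in the third, so the 2-core is empty after three rounds.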

Annotations in Data Streams
Amit Chakrabarti,
Graham Cormode,
Andrew McGregor, and Justin Thaler
ACM Transactions on Algorithms
Links: [ECCC]
Note: This paper supersedes a preliminary version by Chakrabarti, Cormode, and McGregor that appeared in ICALP 2009.
Abstract:
The central goal of data stream algorithms is to process massive streams of data
using sublinear storage space. Motivated by work in the database community on
outsourcing database and data stream processing, we ask whether the space
usage of such algorithms can be further reduced by enlisting a more powerful "helper"
who can annotate the stream as it is read. We do not wish to blindly trust the helper,
so we require that the algorithm be convinced of having computed a correct answer.
We show upper bounds that achieve a nontrivial tradeoff between the amount of
annotation used and the space required to verify it. We also prove lower bounds
on such tradeoffs, often nearly matching the upper bounds, via notions related to
Merlin-Arthur communication complexity. Our results cover the classic data stream
problems of selection, frequency moments, and fundamental graph problems such as
triangle-freeness and connectivity. Our work is also part of a growing trend (including recent
studies of multi-pass streaming, read/write streams and randomly ordered streams) of asking
more complexity-theoretic questions about data stream processing. It is a recognition that,
in addition to practical relevance, the data stream model raises many interesting theoretical
questions in its own right.

Faster Private Release of Marginals on Small Databases
Karthekeyan Chandrasekaran,
Justin Thaler,
Jonathan Ullman, and
Andrew Wan
ITCS 2014
Links: [arXiv]
Abstract:
We study the problem of answering $k$-way marginal queries on a
database $D \in (\{0,1\}^d)^n$, while preserving differential privacy. The
answer to a $k$-way marginal query is the fraction of the database's records
$x \in \{0,1\}^d$ with a given value in each of a given set of up to $k$
columns. Marginal queries enable a rich class of statistical analyses on a
dataset, and designing efficient algorithms for privately answering marginal
queries has been identified as an important open problem in private data
analysis. For any $k$, we give a differentially private online algorithm that
runs in time $\mathrm{poly}\left(n, 2^{o(d)} \right)$ per query and answers
any sequence of $\mathrm{poly}(n)$ many $k$-way marginal queries with error at
most $\pm 0.01$ on every query, provided $n \gtrsim d^{0.51}$. To the best of
our knowledge, this is the first algorithm capable of privately answering
marginal queries with a nontrivial worst-case accuracy guarantee for databases
containing $\mathrm{poly}(d, k)$ records in time $\exp(o(d))$. Our algorithm
runs the private multiplicative weights algorithm (Hardt and Rothblum, FOCS '10)
on a new approximate polynomial representation of the database.
We derive our representation for the database by approximating the OR
function restricted to low Hamming weight inputs using low-degree polynomials
with coefficients of bounded $L_1$-norm. In doing so, we show new upper and
lower bounds on the degree of such polynomials, which may be of independent
approximation-theoretic interest.
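To make the definition of a $k$-way marginal query concrete, here is a small Python sketch that computes the exact, non-private answer. It illustrates only the query itself, not the differentially private algorithm of the paper; the function name `marginal` and the representation of the database as a list of 0/1 tuples are assumptions made for this example.

```python
def marginal(database, columns, values):
    """Exact (non-private) answer to a marginal query.

    Returns the fraction of records whose entries in `columns`
    equal the corresponding entries of `values`.
    """
    if len(columns) != len(values):
        raise ValueError("columns and values must have the same length")
    matches = sum(
        all(row[c] == v for c, v in zip(columns, values))
        for row in database
    )
    return matches / len(database)


# A database with n = 4 records over d = 3 binary attributes.
D = [(1, 0, 1), (1, 1, 0), (0, 1, 1), (1, 1, 1)]

# 2-way marginal: fraction of records with column 0 = 1 and column 2 = 1.
print(marginal(D, columns=[0, 2], values=[1, 1]))
```

A private mechanism would release a perturbed or approximated version of this value rather than the exact fraction.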

Annotations for Sparse Data Streams
Amit Chakrabarti,
Graham Cormode,
Navin Goyal,
and Justin Thaler
SODA 2014
Links: [arXiv] [Slides]
Abstract:
Motivated by cloud computing, a number of recent works have studied annotated data streams and variants thereof. In this setting, a computationally weak verifier (cloud user), lacking the resources to store and manipulate his massive input locally, accesses a powerful but untrusted prover (cloud service). The verifier must work within the restrictive data streaming paradigm. The prover, who can annotate the data stream as it is read, must not just supply the answer but also convince the verifier of its correctness. Ideally, both the amount of annotation and the space used by the verifier should be sublinear in the relevant input size parameters.
A rich theory of such algorithms, which we call schemes, has emerged. Prior work has shown how to leverage the prover's power to efficiently solve problems that have no nontrivial standard data stream algorithms. However, while optimal schemes are now known for several basic problems, such optimality holds only for streams whose length is commensurate with the size of the data universe. In contrast, many real-world datasets are relatively sparse, including graphs that contain only $o(n^2)$ edges, and IP traffic streams that contain far fewer than the total number of possible IP addresses, $2^{128}$ in IPv6.
We design the first schemes that allow both the annotation and the space usage to be sublinear in the total number of stream updates rather than the size of the data universe. We solve significant problems, including variations of INDEX, SET-DISJOINTNESS, and FREQUENCY-MOMENTS, plus several natural problems on graphs. On the other hand, we give a new lower bound that, for the first time, rules out smooth tradeoffs between annotation and space usage for a specific problem. Our technique brings out new nuances in Merlin-Arthur communication complexity models, and provides a separation between online versions of the MA and AMA models.

2013 

Time-Optimal Interactive Proofs for Circuit Evaluation
Justin Thaler
CRYPTO 2013
Links: [Extended Abstract] [Full Paper (arXiv)] [Source Code] [Slides] [Talk Video]
Abstract:
Recently, researchers have been working toward the development of practical general-purpose protocols for verifiable computation. These protocols enable a computationally weak verifier to offload computations to a powerful but untrusted prover, while providing the verifier with a guarantee that the prover performed the computations correctly. Despite substantial progress, existing implementations are not yet practical. The main bottleneck is typically the extra effort required by the prover to return an answer with a guarantee of correctness, compared to returning an answer with no guarantee.
We describe a refinement of a powerful interactive proof protocol originally due to Goldwasser, Kalai, and Rothblum. Cormode, Mitzenmacher, and Thaler show how to implement the prover in this protocol in time $O(S \log S)$, where $S$ is the size of an arithmetic circuit computing the function of interest. Our refinements apply to circuits whose wiring pattern is sufficiently "regular"; for these circuits, we bring the runtime of the prover down to $O(S)$. That is, our prover can evaluate the circuit with a guarantee of correctness, with only a constant-factor blowup in work compared to evaluating the circuit with no guarantee.
We argue that our refinements capture a large class of circuits, and prove some theorems formalizing this. Experimentally, our refinements yield a 200x speedup for the prover over the implementation of Cormode et al., and our prover is less than 10x slower than a C++ program that simply evaluates the circuit. Along the way, we describe a special-purpose protocol for matrix multiplication that is of interest in its own right.
Our final contribution is a protocol targeted at general data parallel computation. Compared to prior work, this protocol can more efficiently verify complicated computations as long as that computation is applied independently to many pieces of data.

Dual Lower Bounds for Approximate Degree and Markov-Bernstein Inequalities
Mark Bun and
Justin Thaler
ICALP 2013; Winner of Best Paper award for Track A; Journal version in Information and Computation (special issue for ICALP 2013)
Links: [arXiv] [Expository Blog Post]
Abstract:
The $\epsilon$-approximate degree of a Boolean function $f: \{-1, 1\}^n \to \{-1, 1\}$ is the minimum degree of a real polynomial that approximates $f$ to within $\epsilon$ in the $\ell_\infty$ norm. We prove several lower bounds on this important complexity measure by explicitly constructing solutions to the dual of an appropriate linear program. Our first result resolves the $\epsilon$-approximate degree of the two-level AND-OR tree for any constant $\epsilon > 0$. We show that this quantity is $\Theta(\sqrt{n})$, closing a line of incrementally larger lower bounds. The same lower bound was recently obtained independently by Sherstov using related techniques. Our second result gives an explicit dual polynomial that witnesses a tight lower bound for the approximate degree of any symmetric Boolean function, addressing a question of Špalek. Our final contribution is to reprove several Markov-type inequalities from approximation theory by constructing explicit dual solutions to natural linear programs. These inequalities underlie the proofs of many of the best-known approximate degree lower bounds, and have important uses throughout theoretical computer science.

2012 

Faster Algorithms for Privately Releasing Marginals
Justin Thaler,
Jonathan Ullman, and
Salil Vadhan
ICALP 2012
Links: [arXiv]
Abstract:
We study the problem of releasing $k$-way marginals of a
database $D \in (\{0,1\}^d)^n$, while preserving differential privacy.
The answer to a $k$-way marginal query is the fraction of $D$'s records
$x \in \{0,1\}^d$ with a given value in each of a given set of up to $k$
columns. Marginal queries enable a rich class of statistical analyses of
a dataset, and designing efficient algorithms for privately releasing
marginal queries has been identified as an important open problem in
private data analysis (cf. Barak et al., PODS '07).
We give an algorithm that runs in time $d^{O(\sqrt{k})}$ and releases
a private summary capable of answering any $k$-way marginal query with at
most $\pm .01$ error on every query as long as $n \geq d^{O(\sqrt{k})}$.
To our knowledge, ours is the first algorithm capable of privately releasing
marginal queries with nontrivial worst-case accuracy guarantees in time
substantially smaller than the number of $k$-way marginal queries, which
is $d^{\Theta(k)}$ (for $k \ll d$).

Attribute-Efficient Learning and Weight-Degree Tradeoffs for Polynomial Threshold Functions
Li-Yang Tan,
Rocco Servedio,
and Justin Thaler
COLT 2012
Links: [ECCC version] [Expository Blog Post] [Slides (20 Minute Presentation)] [Li-Yang's Slides (50 Minute Presentation)]
Abstract:
We study the challenging problem of learning decision lists
attribute-efficiently, giving both positive and negative results.
Our main positive result is a
new tradeoff between the running time and
mistake bound for learning length-$k$ decision
lists over $n$ Boolean variables. When the allowed running time
is relatively high, our new mistake bound improves significantly
on the mistake bound of the best previous algorithm of Klivans and Servedio.
Our main negative result is a new lower bound on the weight of any
degree-$d$ polynomial threshold function (PTF) that computes a particular
decision list over $k$ variables (the "OddMaxBit" function). The main result of
Beigel (Computational Complexity, 1994) is a weight lower bound of $2^{\Omega(k/d^2)}$,
which was shown to be essentially optimal for $d \leq k^{1/3}$ by Klivans and Servedio.
Here we prove a $2^{\Omega(\sqrt{k/d})}$ lower bound, which improves on Beigel's lower bound for
$d > k^{1/3}.$ This lower bound establishes strong limitations on the effectiveness of the
Klivans and Servedio approach and suggests that it may be difficult to improve on our positive result.
The main tool used in our lower bound is a new variant of Markov's classical inequality which may
be of independent interest; it provides a bound on the derivative of a univariate polynomial in terms of both its
degree and the size of its coefficients.

Peeling Arguments and Double Hashing
Michael Mitzenmacher,
and Justin Thaler
Allerton 2012 (Invited Paper)
Links: [pdf]
Abstract:
The analysis of several algorithms and data structures can be reduced to the analysis of the following greedy
"peeling" process: start with a random hypergraph; find a vertex of degree at most $k$, and
remove it and all of its adjacent hyperedges from the graph; repeat until there is no suitable vertex.
This specific process finds the $k$-core of a hypergraph, and variations on this theme have proven useful
in analyzing, for example, decoding from low-density parity-check codes,
several hash-based data structures such as cuckoo hashing, and algorithms
for satisfiability of random formulae. This approach can be analyzed several
ways, with two common approaches being via a corresponding branching process
or a fluid limit family of differential equations.
In this paper, we make note of an interesting aspect of these types of processes:
the results are generally the same when the randomness is structured in the manner of
double hashing. This phenomenon allows us to use less randomness and simplify the
implementation for several hash-based data structures and algorithms. We explore this
approach from both an empirical and theoretical perspective, examining theoretical
justifications as well as simulation results for specific problems.

Verifiable Computation with Massively Parallel Interactive Proofs
Justin Thaler, Mike Roberts, Michael Mitzenmacher,
and Hanspeter Pfister
HotCloud 2012
Links: [arXiv] [source code] [extended abstract] [Talk Video] [Slides]
Abstract:
As the cloud computing paradigm has gained prominence, the need for verifiable computation has grown increasingly urgent. The concept of verifiable computation enables a weak client to outsource difficult computations to a powerful, but untrusted, server. Protocols for verifiable computation aim to provide the client with a guarantee that the server performed the requested computations correctly, without requiring the client to perform the computations herself. By design, these protocols impose a minimal computational burden on the client. However, existing protocols require the server to perform a large amount of extra bookkeeping in order to enable a client to easily verify the results. Verifiable computation has thus remained a theoretical curiosity, and protocols for it have not been implemented in real cloud computing systems.
Our goal is to leverage GPUs to reduce the server-side slowdown for verifiable computation. To this end, we identify abundant data parallelism in a state-of-the-art general-purpose protocol for verifiable computation, originally due to Goldwasser, Kalai, and Rothblum, and recently extended by Cormode, Mitzenmacher, and Thaler. We implement this protocol on the GPU, obtaining 40-120x server-side speedups relative to a state-of-the-art sequential implementation. For benchmark problems, our implementation reduces the slowdown of the server to factors of 100-500x relative to the original computations requested by the client. Furthermore, we reduce the already small runtime of the client by 100x. Similarly, we obtain 20-50x server-side and client-side speedups for related protocols targeted at specific streaming problems. We believe our results demonstrate the immediate practicality of using GPUs for verifiable computation, and more generally that protocols for verifiable computation have become sufficiently mature to deploy in real cloud computing systems.

Continuous Time Channels with Interference
Ioanna Ivan, Michael Mitzenmacher, Justin Thaler, and Henry Yuen
ISIT 2012
Links: [arXiv]
Abstract:
Khanna and Sudan studied a natural model of continuous time channels where signals are corrupted by the effects of both noise and delay, and showed that, surprisingly, in some cases both are not enough to prevent such channels from achieving unbounded capacity. Inspired by their work, we consider channels that model continuous time communication with adversarial delay errors. The sender is allowed to subdivide time into an arbitrarily large number $M$ of microunits in which binary symbols may be sent, but the symbols are subject to unpredictable delays and may interfere with each other. We model interference by having symbols that land in the same microunit of time be summed, and we study $k$-interference channels, which allow receivers to distinguish sums up to the value $k$. We consider both a channel adversary that has a limit on the maximum number of steps it can delay each symbol, and a more powerful adversary that only has a bound on the average delay.
We give precise characterizations of the threshold between finite and infinite capacity depending on the interference behavior and on the type of channel adversary: for max-bounded delay, the threshold is at $D_{\text{max}}=\Theta(M \log(\min\{k, M\}))$, and for average bounded delay the threshold is at $D_{\text{avg}} = \Theta(\sqrt{M \cdot \min\{k, M\}})$.
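The interference model in this abstract can be illustrated with a short Python sketch. It rests on assumptions that go beyond the abstract: the function name `receive` is hypothetical, delays are given as whole numbers of microunits, symbols delayed past the time horizon are dropped, and "distinguishing sums up to the value $k$" is modelled as the receiver observing the sum capped at $k$.

```python
def receive(symbols, delays, k, horizon):
    """Simulate one use of a k-interference channel with delays.

    symbols[i] is the bit sent in microunit i, and it arrives in
    microunit i + delays[i]. Bits landing in the same microunit are
    summed, and the receiver sees each sum capped at k (an assumed
    reading of "distinguish sums up to the value k").
    """
    totals = [0] * horizon
    for t, (bit, delay) in enumerate(zip(symbols, delays)):
        arrival = t + delay
        if arrival < horizon:  # symbols delayed past the horizon are lost
            totals[arrival] += bit
    return [min(total, k) for total in totals]


# Three 1-bits are delayed so that they all land in microunit 2.
sent = [1, 1, 1, 0]
delays = [2, 1, 0, 0]
print(receive(sent, delays, k=2, horizon=4))
print(receive(sent, delays, k=3, horizon=4))
```

With $k = 2$ the receiver cannot tell that three symbols collided, while with $k = 3$ it can, which is the sense in which larger $k$ gives the receiver more information.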

Hierarchical Heavy Hitters with the Space Saving Algorithm
Michael Mitzenmacher, Thomas Steinke, and Justin Thaler
ALENEX 2012
Links: [arXiv] [source code]
Abstract:
The Hierarchical Heavy Hitters problem extends the notion of frequent items to data arranged in a hierarchy. This problem has applications to network traffic monitoring, anomaly detection, and DDoS detection. We present a new streaming approximation algorithm for computing Hierarchical Heavy Hitters that has several advantages over previous algorithms. It improves on the worst-case time and space bounds of earlier algorithms, is conceptually simple and substantially easier to implement, offers improved accuracy guarantees, is easily adapted to a distributed or parallel setting, and can be efficiently implemented in commodity hardware such as ternary content addressable memory (TCAMs). We present experimental results showing that for parameters of primary practical interest, our two-dimensional algorithm is superior to existing algorithms in terms of speed and accuracy, and competitive in terms of space, while our one-dimensional algorithm is also superior in terms of speed and accuracy for a more limited range of parameters.

Practical Verified Computation with Streaming Interactive Proofs
Graham Cormode, Michael Mitzenmacher, and Justin Thaler
ITCS 2012
Links: [arXiv] [source code] [slides]
Abstract:
When delegating computation to a service provider, as in cloud computing, we seek some reassurance that the output is correct and complete. Yet recomputing the output as a check is inefficient and expensive, and it may not even be feasible to store all the data locally. We are therefore interested in proof systems which allow a service provider to prove the correctness of its output to a streaming (sublinear space) user, who cannot store the full input or perform the full computation herself.
Our approach is twofold. First, we describe a carefully chosen instantiation of one of the most efficient general-purpose constructions for arbitrary computations (streaming or otherwise), due to Goldwasser, Kalai, and Rothblum. This requires several new insights to make the methodology more practical. Our main contribution is in achieving a prover who runs in time $O(S(n) \log S(n))$, where $S(n)$ is the size of an arithmetic circuit computing the function of interest. Our experimental results demonstrate that a practical general-purpose protocol for verifiable computation may be significantly closer to reality than previously realized.
Second, we describe techniques that achieve genuine scalability for protocols fine-tuned for specific important problems in streaming and database processing. Focusing in particular on noninteractive protocols for problems ranging from matrix-vector multiplication to bipartite perfect matching, we build on prior work to achieve a prover who runs in nearly linear time, while obtaining optimal tradeoffs between communication cost and the user's working memory. Existing techniques required (substantially) superlinear time for the prover. We argue that even if general-purpose methods improve, fine-tuned protocols will remain valuable in real-world settings for key problems, and hence special attention to specific problems is warranted.

Verifying Computations with Streaming Interactive Proofs
Graham Cormode, Justin Thaler, and Ke Yi
VLDB 2012
Links: [arXiv] [Conference slides]
Abstract:
When computation is outsourced, the data owner would like to be assured that the desired computation has been performed correctly by the service provider. In theory, proof systems can give the necessary assurance, but prior work is not sufficiently scalable or practical. In this paper, we develop new proof protocols for verifying computations which are streaming in nature: the verifier (data owner) needs only logarithmic space and a single pass over the input, and after observing the input follows a simple protocol with a prover (service provider) that takes logarithmic communication spread over a logarithmic number of rounds. These ensure that the computation is performed correctly: that the service provider has not made any errors or missed out some data. The guarantee is very strong: even if the service provider deliberately tries to cheat, there is only vanishingly small probability of doing so undetected, while a correct computation is always accepted. We first observe that some theoretical results can be modified to work with streaming verifiers, showing that there are efficient protocols for problems in the complexity classes NP and NC. Our main results then seek to bridge the gap between theory and practice by developing usable protocols for a variety of problems of central importance in streaming and database processing. All these problems require linear space in the traditional streaming model, and therefore our protocols demonstrate that adding a prover can exponentially reduce the effort needed by the verifier. Our experimental results show that our protocols are practical and scalable.

Cache-Oblivious Dictionaries and Multimaps with Negligible Failure Probability
Michael T. Goodrich, Dan Hirschberg, Michael Mitzenmacher, and Justin Thaler
MedAlg 2012
Links: [pdf] [slides]
Abstract:
A dictionary (or map) is a key-value store that requires all keys be unique, and a multimap is a key-value store that allows for multiple values to be associated with the same key. We design hashing-based indexing schemes for dictionaries and multimaps that achieve worst-case optimal performance for lookups and updates, with minimal space overhead and subpolynomial probability that the data structure will require a rehash operation. Our dictionary structure is designed for the Random Access Machine (RAM) model, while our multimap implementation is designed for the cache-oblivious external memory (I/O) model. The failure probabilities for our structures are subpolynomial, which can be useful in cryptographic or data-intensive applications.

2011 

On the Zero-Error Capacity Threshold for Deletion Channels
Ian Kash,
Michael Mitzenmacher,
Justin Thaler, and
Jonathan Ullman
ITA 2011
Links: [arXiv]
Abstract:
We consider the zero-error capacity of deletion channels. Specifically,
we consider the setting where we choose a codebook $C$ consisting of strings
of $n$ bits, and our model of the channel corresponds to an adversary who may
delete up to $pn$ of these bits for a constant $p$. Our goal is to decode
correctly without error regardless of the actions of the adversary. We
consider what values of $p$ allow nonzero capacity in this setting. We
suggest multiple approaches, one of which makes use of the natural
connection between this problem and the problem of finding the expected
length of the longest common subsequence of two random sequences.

External Memory Multimaps
Elaine Angelino,
Michael Mitzenmacher, and
Justin Thaler
ISAAC 2011; Algorithmica (special issue for ISAAC 2011)
Links: [arXiv]
Abstract:
Many data structures support dictionaries, also known as maps or associative arrays, which store and manage a set of key-value pairs. A multimap is a generalization that allows multiple values to be associated with the same key. For example, the inverted file data structure that is used prevalently in the infrastructure supporting search engines is a type of multimap, where words are used as keys and document pointers are used as values. We study the multimap abstract data type and how it can be implemented efficiently online in external memory frameworks, with constant expected I/O performance. The key technique used to achieve our results is a combination of cuckoo hashing using buckets that hold multiple items with a multiqueue implementation to cope with varying numbers of values per key. Our external-memory results are for the standard two-level memory model.

2010 

Streaming Graph Computations with a Helpful Advisor
Graham Cormode, Michael Mitzenmacher, and Justin Thaler
ESA 2010; Algorithmica 2013
Links: [ECCC] [Conference Slides]
Abstract:
Motivated by the trend to outsource work to commercial cloud computing services,
we consider a variation of the streaming paradigm where a streaming algorithm
can be assisted by a powerful helper that can provide annotations to the data stream.
We extend previous work on such annotation models by considering a number of graph
streaming problems. Without annotations, streaming algorithms for graph problems
generally require significant memory; we show that for many standard problems,
including all graph problems that can be expressed with totally unimodular
integer programming formulations, only constant space (measured in words)
is needed for single-pass algorithms given linear-sized annotations. We
also obtain protocols achieving essentially optimal tradeoffs between
annotation length and memory usage for several important problems,
including integer matrix-vector multiplication, as well as shortest $s$-$t$
path in small-diameter graphs. We also obtain nontrivial tradeoffs for
minimum weight bipartite perfect matching and shortest $s$-$t$ path in general graphs.

2009 

Graph Covers and Quadratic Minimization
Nicholas Ruozzi, Sekhar Tatikonda, and Justin Thaler
Allerton 2009
Links: [pdf]
[slides]
Abstract:
We formulate a new approach to understanding the behavior of the min-sum algorithm
by exploiting the properties of graph covers. First, we present a new, natural
characterization of scaled diagonally dominant matrices in terms of graph covers;
this result motivates our approach because scaled diagonal dominance is a known
sufficient condition for the convergence of min-sum in the case of quadratic minimization.
We use our understanding of graph covers to characterize the periodic behavior of the
min-sum algorithm on a single cycle. Lastly, we explain how to extend the single
cycle results to understand the 2-periodic behavior of min-sum for general pairwise MRFs.
Some of our techniques apply more broadly, and we believe that by capturing the notion
of indistinguishability, graph covers represent a valuable tool for understanding
the abilities and limitations of general message-passing algorithms.

Expository Articles 
