Justin Thaler

Associate Professor
Georgetown University
Department of Computer Science
St. Mary's Hall, Room 354
3700 Reservoir Road NW
Washington, DC 20057

e-mail: justin [dot] thaler [at] georgetown [dot] edu

About Me
Teaching
Publications
Service
Open Source Software
CV

About Me

I am broadly interested in algorithms and computational complexity, primarily focusing on the following three research goals.

Understanding the power of low-degree polynomials to approximate Boolean functions. Answering these questions has a variety of applications, especially to quantum computing, learning theory, and computational complexity theory.
Designing protocols for proving the correctness of computations (possibly in zero-knowlege), in which the prover and verifier are highly efficient both in theory and in practice.
Developing efficient streaming and sketching algorithms for basic tasks that are often performed on large data sets.

I am an Associate Professor at Georgetown University and a Research Partner at a16z crypto research. Prior to joining Georgetown, I spent two years as a Research Scientist at Yahoo Labs in New York. Before that, I was a Research Fellow at the Simons Institute for the Theory of Computing at UC Berkeley. I received my Ph.D. from the Theory of Computation Group at Harvard University, where I was fortunate to be advised by Michael Mitzenmacher, and I graduated from Yale University in 2009 with a B.S. in Computer Science and a second major in Mathematics.

I am currently at a16z crypto research and on leave from Georgetown.

Teaching

COSC 544 -- Probabilistic Proof Systems (Fall 2020, Fall 2017)
COSC 547 -- Analytic Techniques in Computer Science (Fall 2022)
COSC 548 -- Streaming Algorithms (Fall 2018, Fall 2016)
ANLY 550 -- Structures and Algorithms for Analytics (Spring 2019, Spring 2018, Spring 2017)
ANLY 558 -- Advanced Algorithms for Analytics (Spring 2021)

Current and Former Students and Postdocs

Shuchen Zhu (PhD Student)
Sidhant Saraogi (PHD Student, co-advised with Sasha Golovnev)
Alex Block (postdoc)
Prantar Ghosh (postdoc)
Nikhil Mande (former postdoc, now a postdoc at CWI)

Research Papers

2023	The Sum-Check Protocol over Fields of Small Characteristic Justin Thaler Manuscript Links: [pdf] Abstract: The sum-check protocol of Lund, Fortnow, Karloff, and Nisan underlies SNARKs with the fastest known prover. In many of its applications, the prover can be implemented with a number of field operations that is linear in the number, n, of terms being summed. We describe an optimized prover implementation when the protocol is applied over an extension field of a much smaller base field. The rough idea is to keep most of the prover’s multiplications over the base field (at the cost of performing more total field multiplications). When the sum-check protocol is applied to a product of polynomials that all output values in the base field, our algorithm reduces the number of extension field operations by multiple orders of magnitude. In other settings, our improvements are more modest but nonetheless meaningful. In SNARK design, the sum-check protocol is often combined with a polynomial commitment scheme, which are growing faster, especially when the values being committed are small. These improved commitment schemes may render the sum-check prover the overall bottleneck, which our results help to mitigate. BabySpartan: Lasso-based SNARK for non-uniform computation Srinath Setty, and Justin Thaler Manuscript Links: [e-print] Abstract: Lasso (Setty, Thaler, Wahby, ePrint 2023/1216) is a recent lookup argument that ensures that the prover cryptographically commits to only "small" values. This note describes BabySpartan, a SNARK for a large class of constraint systems that achieves the same property. The SNARK is a simple combination of SuperSpartan and Lasso. The specific class of constraint systems supported is a generalization of so-called Plonkish constraint systems (and a special case of customizable constraint systems (CCS)). Whereas a recent work called Jolt (Arun, Setty, and Thaler, ePrint 2023/1217) can be viewed as an application of Lasso to uniform computation, BabySpartan can be viewed as applying Lasso to non-uniform computation. Jolt: SNARKs for Virtual Machines via Lookups Arasu Arun Srinath Setty, and Justin Thaler Manuscript Links: [e-print] Abstract: Succinct Non-interactive Arguments of Knowledge (SNARKs) allow an untrusted prover to establish that it correctly ran some "witness-checking procedure" on a witness. A zkVM (short for zero-knowledge Virtual Machine) is a SNARK that allows the witness-checking procedure to be specified as a computer program written in the assembly language of a specific instruction set architecture (ISA). A front-end converts computer programs into a lower-level representation such as an arithmetic circuit or generalization thereof. A SNARK for circuit-satisfiability can then be applied to the resulting circuit. We describe a new front-end technique called Jolt that applies to a variety of ISAs. Jolt arguably realizes a vision called the "lookup singularity", which seeks to produce circuits that only perform lookups into pre-determined lookup tables. The circuits output by Jolt primarily perform lookups into a gigantic lookup table, of size more than $2^{128}$, that depends only on the ISA. The validity of the lookups are proved via a new lookup argument called Lasso described in a companion work (Setty, Thaler, and Wahby, e-print 2023). Although size-$2^{128}$ tables are vastly too large to materialize in full, the tables arising in Jolt are structured, avoiding costs that grow linearly with the table size. We describe performance and auditability benefits of Jolt compared to prior zkVMs, focusing on the popular RISC-V ISA as a concrete example. The dominant cost for the Jolt prover applied to this ISA (on 64-bit data types) is cryptographically committing to about six 256-bit field elements per step of the RISC-V CPU. This compares favorably to prior zkVM provers, even those focused on far simpler VMs. Unlocking the lookup singularity with Lasso Srinath Setty, Justin Thaler, and Riad Wahby Manuscript Links: [e-print] Abstract: This paper introduces Lasso, a new family of lookup arguments, which allow an untrusted prover to commit to a vector $a \in \mathbb{F}^m$ and prove that all entries of $a$ reside in some predetermined table $t \in \mathbb{F}^n$. Lasso's performance characteristics unlock the so-called ``lookup singularity''. Lasso works with any multilinear polynomial commitment scheme, and provides the following efficiency properties. For $m$ lookups into a table of size $n$, Lasso's prover commits to just $m+n$ field elements. Moreover, the committed field elements are \emph{small}, meaning that, no matter how big the field $\mathbb{F}$ is, they are all in the set $\{0, \dots, m\}$. When using a multiexponentiation-based commitment scheme, this results in the prover's costs dominated by only $O(m+n)$ group \emph{operations} (e.g., elliptic curve point additions), plus the cost to prove an evaluation of a multilinear polynomial whose evaluations over the Boolean hypercube are the table entries. This represents a significant improvement in prover costs over prior lookup arguments (e.g., plookup, Halo2's lookups, lookup arguments based on logarithmic derivatives). Unlike all prior lookup arguments, if the table $t$ is structured (in a precise sense that we define), then no party needs to commit to $t$, enabling the use of much larger tables than prior works (e.g., of size $2^{128}$ or larger). Moreover, Lasso's prover only ``pays'' in runtime for table entries that are accessed by the lookup operations. This applies to tables commonly used to implement range checks, bitwise operations, big-number arithmetic, and even transitions of a full-fledged CPU such as RISC-V. Specifically, for any integer parameter $c>1$, Lasso's prover's dominant cost is committing to $3 \cdot c \cdot m + c \cdot n^{1/c}$ field elements. Furthermore, all these field elements are ``small'', meaning they are in the set $\{0, \dots, \max\{m, n^{1/c}, q\}-1\}$, where $q$ is the maximum value in $a$. Lasso's starting point is Spark, a time-optimal polynomial commitment scheme for sparse polynomials in Spartan (CRYPTO 2020). We first provide a stronger security analysis for Spark. Spartan's security analysis assumed that certain metadata associated with a sparse polynomial is committed by an honest party (this is acceptable for its purpose in Spartan, but not for Lasso). We prove that Spark remains secure even when that metadata is committed by a malicious party. This provides the first ``standard'' commitment scheme for sparse multilinear polynomials with optimal prover costs. We then generalize Spark to directly support a lookup argument for both structured and unstructured tables, with the efficiency characteristics noted above. Customizable constraint systems for succinct arguments Srinath Setty, Justin Thaler, and Riad Wahby Manuscript Links: [e-print] Abstract: This paper introduces customizable constraint system (CCS), a generalization of R1CS that can simultaneously capture R1CS, Plonkish, and AIR without overheads. Unlike existing descriptions of Plonkish and AIR, CCS is not tied to any particular proof system. Furthermore, we observe that the linear-time polynomial IOP for R1CS in Spartan (CRYPTO 20) extends easily to CCS, and when combined with a polynomial commitment scheme, it yields a family of SNARKs for CCS, which we refer to as SuperSpartan. SuperSpartan supports high-degree constraints without its prover incurring cryptographic costs that scale with the degree of constraints (only field operations scale with the constraint degree). Moreover, as in Spartan, it does not employ superlinear-time and hard-to-distribute operations such as FFTs. Similar properties were achieved for Plonkish by HyperPlonk (EUROCRYPT 23) via a different route. However, it is unclear how to prove CCS instances (or even R1CS instances) with HyperPlonk (or Plonk itself), without overheads. Furthermore, unlike HyperPlonk, SuperSpartan can prove uniform instances of CCS (including AIR) without requiring a linear-time preprocessing for the verifier, and for those instances, SuperSpartan provides “free” addition gates. SuperSpartan for AIR is the first SNARK for AIR with a linear-time prover, transparent and sublinear-time pre-processing, polylogarithmic proof size, and plausible post-quantum security. In particular, SuperSpartan for AIR provides a faster prover than existing transparent SNARKs for AIR (which are sometimes referred to as STARKs). Fiat-Shamir Security of FRI and Related SNARKs Alexander R. Block, Albert Garreta, Jonathan Katz, Justin Thaler, Pratyush Ranjan Tiwari, and Michał Zając Asiacrypt 2023 Links: [e-print] Abstract: We establish new results on the Fiat-Shamir (FS) security of several protocols that are widely used in practice, and we provide general tools for establishing similar results for others. More precisely, we: (1) prove the FS security of the FRI and batched FRI protocols; (2) analyze a general class of protocols, which we call $\delta$-correlated, that use low-degree proximity testing as a subroutine (this includes many ``Plonk-like'' protocols (e.g., Plonky2 and Redshift), ethSTARK, RISC Zero, etc.); and (3) prove FS security of the aforementioned "Plonk-like" protocols, and sketch how to prove the same for the others. We obtain our first result by analyzing the round-by-round (RBR) soundness and RBR knowledge soundness of FRI. For the second result, we prove that if a $\delta$-correlated protocol is RBR (knowledge) sound under the assumption that adversaries always send low-degree polynomials, then it is RBR (knowledge) sound in general. Equipped with this tool, we prove our third result by formally showing that "Plonk-like" protocols are RBR (knowledge) sound under the assumption that adversaries always send low-degree polynomials. We then outline analogous arguments for the remainder of the aforementioned protocols. To the best of our knowledge, ours is the first formal analysis of the Fiat-Shamir security of FRI and widely deployed protocols that invoke it. Testudo: Linear Time Prover SNARKs with Constant Size Proofs and Square Root Size Universal Setup Matteo Campanelli, Nicolas Gailly, Rosario Gennaro, Philipp Jovanovic, Mara Mihali, and Justin Thaler Latincrypt 2023 Links: [e-print] Abstract: We present Testudo, a new FFT-less SNARK with a near linear-time prover, constant-time verifier, constant-size proofs and a square-root-size universal setup. Testudo is based on a variant of Spartan--and hence does not require FFTs--as well as a new, fast multivariate polynomial commitment scheme (PCS) with a square-root-sized trusted setup that is derived from PST (TCC 2013) and IPPs (Asiacrypt 2021). To achieve constant-size SNARK proofs in Testudo, we then combine our PCS openings proofs recursively with a Groth16 SNARK. We also evaluate our construction and its building blocks: to compute a PCS opening proof for a polynomial of size $2^{25}$, our new scheme opening procedure achieves a 110x speed-up compared to PST and 3x compared to Gemini (Eurocrypt 2022), since opening computations are heavily parallelizable and operate on smaller polynomials. Furthermore, a proof for a witness of size $2^{30}$ requires a setup of size only $2^{15}$. Finally, we show that a variant for proving data-parallel computations is almost 10x faster at verifying $2^{10}$ Poseidon-based Merkle tree opening proofs than the regular version. Brakedown: Linear-time and post-quantum SNARKs for R1CS Alexander Golovnev, Jonathan Lee, Srinath Setty, Justin Thaler, and Riad Wahby CRYPTO 2023 Links: [e-print] Abstract: This paper introduces Brakedown, the first built system that provides linear-time SNARKs for NP, meaning the prover incurs $O(N)$ finite field operations to prove the satisfiability of an $N$-sized R1CS instance. Brakedown's prover is faster, both concretely and asymptotically, than prior SNARK implementations. Brakedown does not require a trusted setup and is plausibly post-quantum secure. Furthermore, it is compatible with arbitrary finite fields of sufficient size; this property is new amongst implemented arguments with sublinear proof sizes. To design Brakedown, we observe that recent work of Bootle, Chiesa, and Groth (BCG, TCC 2020) provides a polynomial commitment scheme that, when combined with the linear-time interactive proof system of Spartan (CRYPTO 2020), yields linear-time IOPs and SNARKs for R1CS %that are transparent and plausibly post-quantum secure (a similar theoretical result was previously established by BCG, but our approach is conceptually simpler, and crucial for achieving high-speed SNARKs). A core ingredient in the polynomial commitment scheme that we distill from BCG is a linear-time encodable code. Existing constructions of such codes are believed to be impractical. Nonetheless, we design and engineer a new one that is practical in our context. We also implement a variant of Brakedown that uses Reed-Solomon codes instead of our linear-time encodable codes; we refer to this variant as Shockwave. Shockwave is not a linear-time SNARK, but it provides shorter proofs and lower verification times than Brakedown (it also provides a faster prover than prior plausibly post-quantum SNARKs). As a modest additional contribution, we observe that one can render the aforementioned SNARK zero knowledge and reduce the proof size and verifier time from $O(\sqrt{N})$ to polylog$(N)$---while maintaining a linear-time prover---by outsourcing the verifier's work via one layer of proof composition with an existing zkSNARK as the ``outer'' proof system.
2022	Approximate Degree in Quantum and Classical Computing Mark Bun and Justin Thaler Foundations and Trends in Theoretical Computer Science Links: [authors' pdf] [NOW Publishers] Abstract: The approximate degree of a Boolean function f captures how well f can be approximated pointwise by low-degree polynomials. This monograph surveys what is known about approximate degree and illustrates its applications in theoretical computer science. A particular focus of the survey is a method of proving lower bounds via objects called dual polynomials. These represent a reformulation of approximate degree using linear programming duality. We discuss in detail a recent, powerful technique for constructing dual polynomials, called “dual block composition”. Note: This is a vastly expanded version of the SIGACT News survey by the same title. Order-Invariant Cardinality Estimators Are Differentially Private Charlie Dickens, Daniel Ting, and Justin Thaler NeurIPS 2022 Links: [arXiv] Abstract: We consider privacy in the context of streaming algorithms for cardinality estimation. We show that a large class of algorithms all satisfy $\epsilon$-differential privacy, so long as (a) the algorithm is combined with a simple down-sampling procedure, and (b) the cardinality of the input stream is $\Omega(k/\epsilon)$. Here, $k$ is a certain parameter of the sketch that is always at most the sketch size in bits, but is typically much smaller. We also show that, even with no modification, algorithms in our class satisfy $(\epsilon, \delta)$-differential privacy, where $\delta$ falls exponentially with the stream cardinality. Our analysis applies to essentially all popular cardinality estimation algorithms, and substantially generalizes and tightens privacy bounds from earlier works.
2021	Proofs, Arguments, and Zero-Knowledge Justin Thaler Foundations and Trends in Theoretical Computer Science Links: [book webpage] Abstract: Interactive proofs (IPs) and arguments are cryptographic protocols that enable an untrusted prover to provide a guarantee that it performed a requested computation correctly. IPs and arguments are zero-knowledge if they reveal nothing other than their own validity. Within the last decade, general-purpose zero-knowledge arguments have made the jump from theory to practice. This has opened up new doors in the design of cryptographic systems, and generated additional insights into the power of IPs and arguments (zero-knowledge or otherwise). There are now no fewer than five promising approaches to designing efficient, general-purpose zero-knowledge arguments. This comprehensive survey covers these approaches in a unified manner, emphasizing commonalities between them. As of January 2021, the survey remains a work in progress, but I hope that readers will find it useful in its current state. Guest Column: Approximate Degree in Classical and Quantum Computing Mark Bun and Justin Thaler ACM SIGACT News, January 2021 Links: [pdf] Abstract: The approximate degree of a Boolean function $f$ captures how well $f$ can be approximated pointwise by low-degree polynomials. This article surveys what we know about approximate degree and illustrates some of its applications in theoretical computer science. Quantum Proofs of Proximity Marcel Dall'Agnol, Tom Gur, Subhayan Roy Moulik, and Justin Thaler TQC 2021, Quantum Links: [arXiv] [Quantum journal page] Abstract: We initiate the systematic study of QMA algorithms in the setting of property testing, to which we refer as QMA proofs of proximity (QMAPs). These are quantum query algorithms that receive explicit access to a sublinear-size untrusted proof and are required to accept inputs having a property $\Pi$ and reject inputs that are $\epsilon$-far from $\Pi$, while only probing a minuscule portion of their input. Our algorithmic results include a general-purpose theorem that enables quantum speedups for testing an expressive class of properties, namely, those that are succinctly decomposable. Furthermore, we show quantum speedups for properties that lie outside of this family, such as graph bipartitneness. We also investigate the complexity landscape of this model, showing that QMAPs can be exponentially stronger than both classical proofs of proximity and quantum testers. To this end, we extend the methodology of Blais, Brody and Matulef (Computational Complexity, 2012) to prove quantum property testing lower bounds via reductions from communication complexity, thereby resolving a problem raised by Montanaro and de Wolf (Theory of Computing, 2016). Relative Error Streaming Quantiles Graham Cormode, Zohar Karnin, Edo Liberty, Justin Thaler, and Pavel Veselý PODS 2021. Best Paper Award. Invited to J. ACM. Links: [arXiv] [Condensed version in SIGMOD Record] [Technical perspective by Rasmus Pagh] Abstract: Approximating ranks, quantiles, and distributions over streaming data is a central task in data analysis and monitoring. Given a stream of $n$ items from a data universe $U$ equipped with a total order, the task is to compute a sketch (data structure) of size poly($\log(n)$,$1/\epsilon$). Given the sketch and a query item $y \in U$, one should be able to approximate its rank in the stream, i.e., the number of stream elements smaller than $y$. Most works to date focused on additive $\epsilon n$ error approximation, culminating in the KLL sketch that achieved optimal asymptotic behavior. This paper investigates multiplicative $(1 \pm \epsilon$)-error approximations to the rank. The motivation stems from practical demand to understand the tails of distributions, and hence for sketches to be more accurate near extreme values. The most space-efficient algorithms that can be derived from prior work store either $O(\log(\epsilon^2 n)/\epsilon^2)$ or $O(\log^3(\epsilon n)/\epsilon)$ universe items. This paper presents a sketch of size $O(\log^{1.5}(\epsilon n)/\epsilon)$ (ignoring poly($\log \log n$, $\log(1/\epsilon$)) factors) that achieves a ($1 \pm \epsilon$) multiplicative error guarantee, without prior knowledge of the stream length or dependence on the size of the data universe. This is within a $O(\sqrt{\log(\epsilon n)})$ factor of optimal. Moreover, our sketch is fully mergeable, which makes it suitable for a parallel or distributed environment.
2020	Vanishing-Error Approximate Degree and QMA Complexity Alexander A. Sherstov and Justin Thaler Accepted to CJTCS Links: [ECCC] Abstract: The $\epsilon$-approximate degree of a function $f\colon X \to \{0,1\}$ is the least degree of a multivariate real polynomial $p$ such that $\|p(x)-f(x)\| \leq \epsilon$ for all $x \in X$. We determine the $\epsilon$-approximate degree of the element distinctness function, the surjectivity function, and the permutation testing problem, showing they are $\Theta(n^{2/3} \log^{1/3}(1/\epsilon))$, $\tilde\Theta(n^{3/4} \log^{1/4}(1/\epsilon))$, and $\Theta(n^{1/3} \log^{2/3}(1/\epsilon))$, respectively. Previously, these bounds were known only for constant $\epsilon.$ We also derive a connection between vanishing-error approximate degree and quantum Merlin--Arthur (QMA) query complexity. We use this connection to show that the QMA complexity of permutation testing is $\Omega(n^{1/4})$. This improves on the previous best lower bound of $\Omega(n^{1/6})$ due to Aaronson (Quantum Information & Computation, 2012), and comes somewhat close to matching a known upper bound of $O(n^{1/3})$. Streaming Verification for Graph Problems: Optimal Tradeoffs and Nonlinear Sketches Amit Chakrabarti, Prantar Ghosh, and Justin Thaler RANDOM 2020 Links: [ECCC] Abstract: We study graph computations in an enhanced data streaming setting, where a space-bounded client reading the edge stream of a massive graph may delegate some of its work to a cloud service. We seek algorithms that allow the client to verify a purported proof sent by the cloud service that the work done in the cloud is correct. A line of work starting with Chakrabarti et al. (ICALP 2009) has provided such algorithms, which we call schemes, for several statistical and graph-theoretic problems, many of which exhibit a tradeoff between the length of the proof and the space used by the streaming verifier. This work designs new schemes for a number of basic graph problems---including triangle counting, maximum matching, topological sorting, and single-source shortest paths---where past work had either failed to obtain smooth tradeoffs between these two key complexity measures or only obtained suboptimal tradeoffs. Our key innovation is having the verifier compute certain \emph{nonlinear} sketches of the input stream, leading to either new or improved tradeoffs. In many cases, our schemes in fact provide optimal tradeoffs up to logarithmic factors. Specifically, for most graph problems that we study, it is known that the product of the verifier's space cost $v$ and the proof length $h$ must be at least $\Omega(n^2)$ for $n$-vertex graphs. However, matching upper bounds are only known for a handful of settings of $h$ and $v$ on the curve $h \cdot v=\tilde{\Theta}(n^2)$. For example, for counting triangles and maximum matching, schemes with costs lying on this curve are only known for $(h=\tilde{O}(n^2), v=\tilde{O}(1))$, $(h=\tilde{O}(n), v=\tilde{O}(n))$, and the trivial $(h=\tilde{O}(1), v=\tilde{O}(n^2))$. A major message of this work is that by exploiting nonlinear sketches, a significant "portion" of costs on the tradeoff curve $h \cdot v = n^2$ can be achieved. Improved Approximate Degree Bounds For k-Distinctness Nikhil Mande, Justin Thaler, and Shuchen Zhu TQC 2020 Links: [ECCC] [Conference presentation video by Shuchen] Abstract: An open problem that is widely regarded as one of the most important in quantum query complexity is to resolve the quantum query complexity of the $k$-distinctness function on inputs of size $N$. While the case of $k=2$ (also called Element Distinctness) is well-understood, there is a polynomial gap between the known upper and lower bounds for all constants $k>2$. Specifically, the best known upper bound is $O\left(N^{(3/4)-1/(2^{k+2}-4)}\right)$ (Belovs, FOCS 2012), while the best known lower bound for $k\geq 2$ is $\tilde{\Omega}\left(N^{2/3} + N^{(3/4)-1/(2k)}\right)$ (Aaronson and Shi, J. ACM 2004; Bun, Kothari, and Thaler, STOC 2018). For any constant $k \geq 4$, we improve the lower bound to $\tilde{\Omega}\left(N^{(3/4)-1/(4k)}\right)$. This yields, for example, the first proof that $4$-distinctness is strictly harder than Element Distinctness. Our lower bound applies more generally to approximate degree. As a secondary result, we give a simple construction of an approximating polynomial of degree $\tilde{O}(N^{3/4})$ that applies whenever $k \leq \text{polylog}(N)$. Quantum Lower Bounds for Approximate Counting via Laurent Polynomials Scott Aaronson, Robin Kothari, William Kretschmer, and Justin Thaler QIP 2020, CCC 2020 Links: [arXiv] [slides] [30-minute talk video by William] Abstract: This paper proves new limitations on the power of quantum computers to solve approximate counting---that is, multiplicatively estimating the size of a nonempty set $S\subseteq \lbrack N]$. Given only a membership oracle for $S$, it is well known that approximate counting takes $\Theta \bigl(\sqrt{N/\|S\|}\bigr)$ quantum queries. But what if a quantum algorithm is also given "QSamples"---i.e., copies of the state $\left\vert S\right\rangle =\sum_{i\in S}\|i\rangle$---or even the ability to apply reflections about $\|S\rangle$? Our first main result is that, even then, the algorithm needs either $\Theta \bigl(\sqrt{N/\left\vert S\right\vert }\bigr)$ queries or else $\Theta \Bigl(\min \bigl\{\left\vert S\right\vert ^{1/3}, \sqrt{N/\left\vert S\right\vert }\bigr\}\Bigr)$ reflections or samples. We also give matching upper bounds. We prove the lower bound using a novel generalization of the polynomial method of Beals et al. to Laurent polynomials, which can have negative exponents. We lower-bound Laurent polynomial degree using two methods: a new "explosion argument" that pits the positive- and negative-degree parts of the polynomial against each other, and a new formulation of the dual polynomials method. Our second main result rules out the possibility of a black-box Quantum Merlin-Arthur (or $\mathsf{QMA}$) protocol for proving that a set is large. More precisely, we show that, even if Arthur can make $T$ quantum queries to the set $S\subseteq \lbrack N]$, and also receives an $m$-qubit quantum witness from Merlin in support of $S$ being large, we have $Tm=\Omega \Bigl( \min \bigl\{ \left\vert S\right\vert ,\sqrt{N/\left\vert S\right\vert } \bigr\} \Bigr) $. This resolves the open problem of giving an oracle separation between $\mathsf{SBP}$, the complexity class that captures approximate counting, and $\mathsf{QMA}$. Note that $\mathsf{QMA}$ is "stronger" than the queries+QSamples model in that Merlin's witness can be anything, rather than just the specific state $\left\vert S\right\rangle $, but also "weaker" in that Merlin's witness cannot be trusted. Intriguingly, Laurent polynomials also play a crucial role in our $\mathsf{QMA}$\ lower bound, but in a completely different manner than in the queries+QSamples lower bound. This suggests that the "Laurent polynomial method" might be broadly useful in complexity theory. Ad Hoc Multi-Input Functional Encryption Shweta Agrawal, Michael Clear, Ophir Frieder, Sanjam Garg, Adam O'Neill, Justin Thaler ITCS 2020 Links: [e-print] Abstract: Consider sources that supply sensitive data to an aggregator. Standard encryption only hides the data from eavesdroppers, but using specialized encryption one can hope to hide the data (to the extent possible) from the aggregator itself. For flexibility and security, we envision schemes that allow sources to supply encrypted data, such that at any point a dynamically chosen subset of sources can allow an agreed-upon joint function of their data to be computed by the aggregator. A primitive called multi-input functional encryption (MIFE), due to Goldwasser et al. (EUROCRYPT 2014), comes close, but has two main limitations: it requires trust in a third party, who is able to decrypt all the data, and it requires function arity to be fixed at setup time and to be equal to the number of parties. To drop these limitations, we introduce a new notion of ad hoc MIFE. In our setting, each source generates its own public key and issues individual, function-specific secret keys to an aggregator. For successful decryption, an aggregator must obtain a separate key from each source whose ciphertext is being computed upon. The aggregator could obtain multiple such secret keys from a user corresponding to functions of varying arity. For this primitive, we obtain the following results: – We show that standard MIFE for general functions can be bootstrapped to ad hoc MIFE for free, i.e. without making any additional assumption. – We provide a direct construction of ad hoc MIFE for the inner product functionality based on the Learning with Errors (LWE) assumption. This yields the first construction of this natural primitive based on a standard assumption. At a technical level, our results are obtained by combining standard MIFE schemes and two-round secure multiparty computation (MPC) protocols in novel ways highlighting an interesting interplay between MIFE and two-round MPC in the construction of non interactive primitives.
2019	Approximate Degree, Secret Sharing, and Concentration Phenomena Andrej Bogdanov, Nikhil Mande, Justin Thaler, and Christopher Williamson RANDOM 2019 Links: [ECCC] Abstract: The $\epsilon$-approximate degree $\widetilde{\text{deg}}_{\epsilon}(f)$ of a Boolean function $f$ is the least degree of a real-valued polynomial that approximates $f$ pointwise to within $\epsilon$. A sound and complete certificate for approximate degree being at least $k$ is a pair of probability distributions, also known as a dual polynomial, that are perfectly $k$-wise indistinguishable, but are distinguishable by $f$ with advantage $1 - \epsilon$. Our contributions are: We give a simple, explicit new construction of a dual polynomial for the $\mathsf{AND}$ function on $n$ bits, certifying that its $\epsilon$-approximate degree is $\Omega\left(\sqrt{n \log 1/\epsilon}\right)$. This construction is the first to extend to the notion of weighted degree, and yields the first explicit certificate that the $1/3$-approximate degree of any (possibly unbalanced) read-once DNF is $\Omega(\sqrt{n})$. It draws a novel connection between the approximate degree of $\mathsf{AND}$ and anti-concentration of the Binomial distribution. We show that any pair of symmetric distributions on $n$-bit strings that are perfectly $k$-wise indistinguishable are also statistically $K$-wise indistinguishable with at most $K^{3/2} \cdot \exp\left(-\Omega\left(k^2/K\right)\right)$ error for all $k < K \leq n/64$. This bound is essentially tight, and implies that any symmetric function $f$ is a reconstruction function with constant advantage for a ramp secret sharing scheme that is secure against size-$K$ coalitions with statistical error $K^{3/2} \cdot \exp\left(-\Omega\left(\widetilde{\text{deg}}_{1/3}(f)^2/K\right)\right)$ for all values of $K$ up to $n/64$ simultaneously. Previous secret sharing schemes required that $K$ be determined in advance, and only worked for $f=\mathsf{AND}$. Our analysis draws another new connection between approximate degree and concentration phenomena. As a corollary of this result, we show that for any $d \leq n/64$, any degree $d$ polynomial approximating a symmetric function $f$ to error $1/3$ must have coefficients of $\ell_1$-norm at least $K^{-3/2} \cdot \exp\left({\Omega\left(\widetilde{\text{deg}}_{1/3}\left(f\right)^2/d\right)}\right)$. We also show this bound is essentially tight for any $d > \widetilde{\text{deg}}_{1/3}(f)$. These upper and lower bounds were also previously only known in the case $f=\mathsf{AND}$. The Large-Error Approximate Degree of AC^0 Mark Bun and Justin Thaler RANDOM 2019; Theory of Computing (Special Issue for RANDOM 2019) Links: [ECCC] Abstract: We prove two new results about the inability of low-degree polynomials to uniformly approximate constant-depth circuits, even to slightly-better-than-trivial error. First, we prove a tight $\tilde{\Omega}(n^{1/2})$ lower bound on the threshold degree of the $\mathsf{SURJECTIVITY}$ function on $n$ variables. This matches the best known threshold degree bound for any AC$^0$ function, previously exhibited by a much more complicated circuit of larger depth (Sherstov, FOCS 2015). Our result also extends to a $2^{\tilde{\Omega}(n^{1/2})}$ lower bound on the sign-rank of an AC$^0$ function, improving on the previous best bound of $2^{\Omega(n^{2/5})}$ (Bun and Thaler, ICALP 2016). Second, for any $\delta>0$, we exhibit a function $f \colon \{-1, 1\}^n \to \{-1, 1\}$ that is computed by a circuit of depth $O(1/\delta)$ and is hard to approximate by polynomials in the following sense: $f$ cannot be uniformly approximated to error $\epsilon=1-2^{-\Omega(n^{1-\delta})}$, even by polynomials of degree $n^{1-\delta}$. Our recent prior work (Bun and Thaler, FOCS 2017) proved a similar lower bound, but which held only for error $\epsilon=1/3$. Our result implies $2^{\Omega(n^{1-\delta})}$ lower bounds on the complexity of AC$^0$ under a variety of basic measures such as discrepancy, margin complexity, and threshold weight. This nearly matches the trivial upper bound of $2^{O(n)}$ that holds for every function. The previous best lower bound on AC$^0$ for these measures was $2^{\Omega(n^{1/2})}$ (Sherstov, FOCS 2015). Additional applications in learning theory, communication complexity, and cryptography are described. Sign-Rank Can Increase Under Intersection Mark Bun, Nikhil Mande, Justin Thaler ICALP 2019; TOCT Links: [ECCC] Abstract: The communication class UPP$^{\text{cc}}$ is a communication analog of the Turing Machine complexity class PP. It is characterized by a matrix-analytic complexity measure called sign-rank (also called dimension complexity), and is essentially the most powerful communication class against which we know how to prove lower bounds. For a communication problem $f$, let $f \wedge f$ denote the function that evaluates $f$ on two disjoint inputs and outputs the AND of the results. We exhibit a communication problem $f$ with UPP$(f)=O(\log n)$, and UPP$(f \wedge f)=\Theta(\log^2 n)$. This is the first result showing that UPP communication complexity can increase by more than a constant factor under intersection. We view this as a first step toward showing that UPP$^{\text{cc}}$, the class of problems with polylogarithmic-cost UPP communication protocols, is not closed under intersection. Our result shows that the function class consisting of intersections of two majorities on $n$ bits has dimension complexity $n^{\Omega(\log n)}$. This matches an upper bound of (Klivans, O'Donnell, and Servedio, FOCS 2002), who used it to give a quasipolynomial time algorithm for PAC learning intersections of polylogarithmically many majorities. Hence, fundamentally new techniques will be needed to learn this class of functions in polynomial time. Quantum Algorithms and Approximating Polynomials for Composed Functions With Shared Inputs Mark Bun, Robin Kothari, Justin Thaler SODA 2019, full version in Quantum. Links: [arXiv] [Slides] [Talk Video (30 minutes)] Abstract: We give new quantum algorithms for evaluating composed functions whose inputs may be shared between bottom-level gates. Let $f$ be a Boolean function and consider a function $F$ obtained by applying $f$ to conjunctions of possibly overlapping subsets of $n$ variables. If $f$ has quantum query complexity $Q(f)$, we give an algorithm for evaluating $F$ using $\tilde{O}(\sqrt{Q(f) \cdot n})$ quantum queries. This improves on the bound of $O(Q(f) \cdot \sqrt{n})$ that follows by treating each conjunction independently, and is tight for worst-case choices of $f$. Using completely different techniques, we prove a similar tight composition theorem for the approximate degree of $f$. By recursively applying our composition theorems, we obtain a nearly optimal $\tilde{O}(n^{1-2^{-d}})$ upper bound on the quantum query complexity and approximate degree of linear-size depth-$d$ AC$^0$ circuits. As a consequence, such circuits can be PAC learned in subexponential time, even in the challenging agnostic setting. Prior to our work, a subexponential-time algorithm was not known even for linear-size depth-3 AC$^0$ circuits. We also show that any substantially faster learning algorithm will require fundamentally new techniques. As an additional consequence, we show that AC0 circuits of depth $d+1$ require size $\tilde{\Omega}(n^{1/(1−2^{-d})})=\Omega(n^{1+2^{-d}})$ to compute the Inner Product function even on average. The previous best size lower bound was $\Omega(n^{1+4^{-(d+1)}})$ and only held in the worst case (Cheraghchi et al., JCSS 2018).
2018	Doubly-efficient zkSNARKs without trusted setup Riad S. Wahby, Ioanna Tzialla, abhi shelat, Justin Thaler, and Michael Walfish IEEE Symposium on Security and Privacy (S&P) 2018 Links: [e-print] Abstract: We present a zero-knowledge argument for NP with low communication complexity, low concrete cost for both the prover and the verifier, and no trusted setup, based on standard cryptographic assumptions (DDH). Specifically, communication is proportional to at most the square root of the witness size, plus $d \cdot log(G)$, for $d$ the depth and $G$ the width of the verifying circuit. Moreover, witness-related communication can be reduced below square root, at the cost of increased verifier runtime. When applied to batched or data-parallel statements, the prover's runtime is linear and the verifier's is sub-linear in the verifying circuit size, both with good constants. Together, these properties represent a new point in the tradeoffs among setup, complexity assumptions, proof size, and computational cost. Our argument is public coin, so we apply the Fiat-Shamir heuristic to produce a zero-knowledge succinct non-interactive argument of knowledge (zkSNARK), which we call Hyrax. We evaluate Hyrax on three benchmarks: SHA-256 Merkle trees, image transformation, and matrix multiplication. We find that Hyrax's proofs are 2−10x smaller than prior work with similar properties, and that Hyrax scales to 6−27x larger circuits than a highly-optimized prior system that requires trusted setup. The Polynomial Method Strikes Back: Tight Quantum Query Bounds via Dual Polynomials Mark Bun, Robin Kothari, Justin Thaler STOC 2018; Preliminary version in QIP 2018 (plenary talk). Invited to Theory of Computing. Manuscript link: [arXiv] Abstract: The approximate degree of a Boolean function f is the least degree of a real polynomial that approximates f pointwise to error at most 1/3. The approximate degree of f is known to be a lower bound on the quantum query complexity of f (Beals et al., FOCS 1998 and J. ACM 2001). We resolve or nearly resolve the approximate degree and quantum query complexities of several basic functions. Specifically, we show the following: k-distinctness: For any constant k, the approximate degree and quantum query complexity of the k-distinctness function is $\Omega(n^{3/4-1/(2k)})$. This is nearly tight for large $k$, as Belovs (FOCS 2012) has shown that for any constant k, the approximate degree and quantum query complexity of k-distinctness is $O(n^{3/4-1/(2^{k+2}-4)})$. Image Size Testing: The approximate degree and quantum query complexity of testing the size of the image of a function $[n] \rightarrow [n]$ is $\tilde{\Omega}(n^{1/2})$. This proves a conjecture of Ambainis et al. (SODA 2016), and it implies tight lower bounds on the approximate degree and quantum query complexity of the following natural problems. k-junta testing: A tight $\tilde{\Omega}(k^{1/2})$ lower bound for k-junta testing, answering the main open question of Ambainis et al. (SODA 2016). Statistical Distance from Uniform: A tight $\tilde{\Omega}(n^{1/2})$ lower bound for approximating the statistical distance from uniform of a distribution, answering the main question left open by Bravyi et al. (STACS 2010 and IEEE Trans. Inf. Theory 2011). Shannon entropy: A tight $\tilde{\Omega}(n^{1/2})$ lower bound for approximating Shannon entropy up to a certain additive constant, answering a question of Li and Wu (2017). Surjectivity: The approximate degree of the Surjectivity function is $\tilde{\Omega}(n^{3/4})$. The best prior lower bound was $\Omega(n^{2/3})$. Our result matches an upper bound of $\tilde{O}(n^{3/4})$ due to Sherstov, which we reprove using different techniques. The quantum query complexity of this function is known to be $\Theta(n)$ (Beame and Machmouchi, Quantum Inf. Comput. 2012 and Sherstov, FOCS 2015). Our upper bound for Surjectivity introduces new techniques for approximating Boolean functions by low-degree polynomials. Our lower bounds are proved by significantly refining techniques recently introduced by Bun and Thaler (FOCS 2017). Slides: [slides giving unified overview of BT17 and BKT18 (size: 3MB)] [slides more focused on BKT18 (size: 11MB)] Lecture notes: These lecture notes elucidate various aspects of the proof. Whereas the proof in the paper takes place entirely in the "dual" world, the notes explain how to replace a key step of the proof with a "primal" argument that many will find more intuitive. Approximate Degree and the Complexity of Depth Three Circuits Mark Bun and Justin Thaler RANDOM 2018 Links: [ECCC] [slides] Abstract: Threshold weight, margin complexity, and Majority-of-Threshold circuit size are basic complexity measures of Boolean functions that arise in learning theory, communication complexity, and circuit complexity. Each of these measures might exhibit a \emph{chasm} at depth three: namely, all polynomial size Boolean circuits of depth two have polynomial complexity under the measure, but there may exist Boolean circuits of depth three that have essentially maximal complexity $\exp(\Omega(n))$. However, existing techniques are far from showing this: for all three measures, the best lower bound for depth three circuits is $\exp(\Omega(n^{2/5}))$ . Moreover, current methods exclusively study block-composed functions. Such methods appear intrinsically unable to prove lower bounds better than $\exp(\Omega(n^{1/2}))$ even for depth four circuits, and have yet to prove lower bounds better than $\exp(\Omega(n^{1/2}))$ for circuits of any constant depth. We take a step toward showing that all of these complexity measures indeed exhibit a chasm at depth three. Specifically, for any arbitrarily small constant $\delta > 0$, we exhibit a depth three circuit of polynomial size (in fact, an $O(\log n)$-decision list) of complexity $\exp(\Omega(n^{1/2-\delta}))$ under each of these measures. Our methods go beyond the block-composed functions studied in prior work, and hence may not be subject to the same barriers. In particular, we suggest natural candidate functions that may exhibit stronger bounds, of the form $\exp(\tilde{\Omega}(n))$, where the notation hides factors polylogarithmic in $n$.
2017	Full Accounting for Verifiable Outsourcing Riad S. Wahby, Ye Ji, Andrew J. Blumberg, abhi shelat, Justin Thaler, Michael Walfish, and Thomas Wies CCS 2017 Links: [e-print] Abstract: Systems for verifiable outsourcing incur costs for a prover, a verifier, and precomputation; outsourcing makes sense when these costs are cheaper than not outsourcing. Yet, prover costs are generally ignored. The only exception is Verifiable ASICs (VA), wherein the prover is a custom chip; however, the only prior VA system ignores the cost of precomputation. This paper describes a new VA system, called Giraffe; charges Giraffe for all three costs; and identifies regimes where outsourcing is worthwhile. Giraffe’s base is an interactive proof geared to data parallel computation. Giraffe makes this protocol asymptotically optimal for the prover, which is of independent interest. Giraffe also develops a design template that produces hardware designs automatically for a wide range of parameters, introduces hardware primitives molded to the protocol’s data flows, and incorporates program analyses that expand applicability. Giraffe wins even when outsourcing several tens of sub-computations, scales to 500x larger computations than prior work, and can profitably outsource parts of programs that are not worthwhile to outsource in full. A High-Performance Algorithm for Identifying Frequent Items in Data Streams Daniel Anderson, Pryce Bevan, Kevin Lang, Edo Liberty, Lee Rhodes, and Justin Thaler IMC 2017 Links: [arxiv] Abstract: Estimating frequencies of items over data streams is a common building block in streaming data measurement and analysis. Misra and Gries introduced their seminal algorithm for the problem in 1982, and the problem has since been revisited many times due its practicality and applicability. We describe a highly optimized version of Misra and Gries’ algorithm that is suitable for deployment in industrial settings. Our code is made public via an open source library called DataSketches that is already used by several companies and production systems. Our algorithm improves on two theoretical and practical aspects of prior work. First, it handles weighted updates in amortized constant time, a common requirement in practice. Second, it uses a simple and fast method for merging summaries that asymptotically improves on prior work even for unweighted streams. We describe experiments confirming that our algorithms are more efficient than prior proposals. A Nearly Optimal Lower Bound on the Approximate Degree of AC$^0$ Mark Bun and Justin Thaler FOCS 2017. SICOMP (Special Issue for FOCS '17) Manuscript link: [ECCC] Abstract: The approximate degree of a Boolean function $f \colon \{-1, 1\}^n \rightarrow \{-1, 1\}$ is the least degree of a real polynomial that approximates $f$ pointwise to error at most $1/3$. We introduce a generic method for increasing the approximate degree of a given function, while preserving its computability by constant-depth circuits. Specifically, we show how to transform any Boolean function $f$ with approximate degree $d$ into a function $F$ on $O(n \cdot \text{polylog}(n))$ variables with approximate degree at least $D = \Omega(n^{1/3} \cdot d^{2/3})$. In particular, if $d= n^{1-\Omega(1)}$, then $D$ is polynomially larger than $d$. Moreover, if $f$ is computed by a polynomial-size Boolean circuit of constant depth, then so is $F$. By recursively applying our transformation, for any constant $\delta > 0$ we exhibit an AC$^0$ function of approximate degree $\Omega(n^{1-\delta})$. This improves over the best previous lower bound of $\Omega(n^{2/3})$ due to Aaronson and Shi (J. ACM 2004), and nearly matches the trivial upper bound of $n$ that holds for any function. Our lower bounds also apply to (quasipolynomial-size) DNFs of polylogarithmic width. We describe several applications of these results. We give: For any constant $\delta > 0$, an $\Omega(n^{1-\delta})$ lower bound on the quantum communication complexity of a function in AC$^0$. A Boolean function $f$ with approximate degree at least $C(f)^{2-o(1)}$, where $C(f)$ is the certificate complexity of $f$. This separation is optimal up to the $o(1)$ term in the exponent. Improved secret sharing schemes with reconstruction procedures in AC$^0$. Slides and video: [slides] [BIRS Talk Video (45 minutes)] Lecture notes: These lecture notes elucidate various aspects of the proof. Whereas the proof in the paper takes place entirely in the "dual" world, the notes explain how to replace a key step of the proof with a "primal" argument that many will find more intuitive. On the Power of Statistical Zero Knowledge Adam Bouland, Lijie Chen, Dhiraj Holden, Justin Thaler, and Prashant Nalini Vasudevan FOCS 2017. SICOMP (special issue for FOCS '17) Links: [ECCC] Abstract: We examine the power of statistical zero knowledge proofs (captured by the complexity class SZK) and their variants. First, we give the strongest known relativized evidence that SZK contains hard problems, by exhibiting an oracle relative to which SZK (indeed, even NISZK) is not contained in the class UPP, containing those problems solvable by randomized algorithms with unbounded error. This answers an open question of Watrous from 2002. Second, we "lift" this oracle separation to the setting of communication complexity, thereby answering a question of Goos et al. (ICALP 2016). Third, we give relativized evidence that perfect zero knowledge proofs (captured by the class PZK) are weaker than general zero knowledge proofs. Specifically, we exhibit oracles relative to which SZK is not contained in PZK, NISZK is not contained in NIPZK, and PZK is not equal to coPZK. The first of these results answers a question raised in 1991 by Aiello and Hastad (Information and Computation), and the second answers a question of Lovett and Zhang (2016). We also describe additional applications of these results outside of structural complexity. The technical core of our results is a stronger hardness amplification theorem for approximate degree, which roughly says that composing the gapped-majority function with any function of high approximate degree yields a function with high threshold degree. Reliably Learning the ReLU in Polynomial Time Surbhi Goel, Varun Kanade, Adam Klivans and Justin Thaler COLT 2017 Links: [arXiv] Abstract: We give the first dimension-efficient algorithms for learning Rectified Linear Units (ReLUs), which are functions of the form $x \mapsto max(0,w \cdot x)$ with $w \in \mathbb{S}^{n-1}$. Our algorithm works in the challenging Reliable Agnostic learning model of Kalai, Kanade, and Mansour (2009) where the learner is given access to a distribution $D$ on labeled examples but the labeling may be arbitrary. We construct a hypothesis that simultaneously minimizes the false-positive rate and the loss on inputs given positive labels by $D$, for any convex, bounded, and Lipschitz loss function. The algorithm runs in polynomial-time (in $n$) with respect to any distribution on $\mathbb{S}^{n-1}$ (the unit sphere in $n$ dimensions) and for any error parameter $\epsilon=\Omega(1/\log n)$ (this yields a PTAS for a question raised by F. Bach on the complexity of maximizing ReLUs). These results are in contrast to known efficient algorithms for reliably learning linear threshold functions, where $\epsilon$ must be $\Omega(1)$ and strong assumptions are required on the marginal distribution. We can compose our results to obtain the first set of efficient algorithms for learning constant-depth networks of ReLUs. Our techniques combine kernel methods and polynomial approximations with a "dual-loss" approach to convex programming. As a byproduct we obtain a number of applications including the first set of efficient algorithms for "convex piecewise-linear fitting" and the first efficient algorithms for noisy polynomial reconstruction of low-weight polynomials on the unit sphere. Determining Tournament Payout Structures for Daily Fantasy Sports Christopher Musco, Maxim Sviridenko, and Justin Thaler ALENEX 2017; Invited to special issue of ACM Journal of Experimental Algorithmics for ALENEX 2017 Links: [arXiv] Abstract: With an exploding global market and the recent introduction of online cash prize tournaments, fantasy sports contests are quickly becoming a central part of the social gaming and sports industries. For sports fans and online media companies, fantasy sports contests are an opportunity for large financial gains. However, they present a host of technical challenges that arise from the complexities involved in running a web-scale, prize driven fantasy sports platform. We initiate the study of these challenges by examining one concrete problem in particular: how to algorithmically generate contest payout structures that are 1) economically motivating and appealing to contestants and 2) reasonably structured and succinctly representable. We formalize this problem and present a general two-staged approach for producing satisfying payout structures given constraints on contest size, entry fee, prize bucketing, etc. We then propose and evaluate several potential algorithms for solving the payout problem efficiently, including methods based on dynamic programming, integer programming, and heuristic techniques. Experimental results show that a carefully designed heuristic scales very well, even to contests with over 100,000 prize winners. Our approach extends beyond fantasy sports -- it is suitable for generating engaging payout structures for any contest with a large number of entrants and a large number of prize winners, including other massive online games, poker tournaments, and real-life sports tournaments.
2016	Dual Polynomials for Collision and Element Distinctness Mark Bun and Justin Thaler Theory of Computing Links: [ECCC] Abstract: The approximate degree of a Boolean function $f \colon \{-1, 1\}^n \rightarrow \{-1, 1\}$ is the minimum degree of a real polynomial that approximates $f$ to within error $1/3$ in the $L_{\infty}$ norm. In an influential result, Aaronson and Shi (J. ACM 2004) proved tight $\tilde{\Omega}(n^{1/3})$ and $\tilde{\Omega}(n^{2/3})$ lower bounds on the approximate degree of the Collision and Element Distinctness functions, respectively. Their proof was non-constructive, using a sophisticated symmetrization argument and tools from approximation theory. More recently, several open problems in the study of approximate degree have been resolved via the construction of dual polynomials. These are explicit dual solutions to an appropriate linear program that captures the approximate degree of any function. We reprove Aaronson and Shi's results by constructing explicit dual polynomials for the Collision and Element Distinctness functions. Improved Bounds on the Sign-Rank of AC$^0$ Mark Bun and Justin Thaler ICALP 2016 Links: [ECCC] [slides] Abstract: The sign-rank of a matrix $A$ with entries in $\{-1, +1\}$ is the least rank of a real matrix $B$ with $A_{ij}\cdot B_{ij} > 0$ for all $i$, $j$. Razborov and Sherstov (2008) gave the first exponential lower bounds on the sign-rank of a function in AC$^0$, answering an old question of Babai, Frankl, and Simon (1986). Specifically, they exhibited a matrix $A=[F(x, y)]_{x, y}$ for a specific function $F: \{-1, 1\}^n \times \{-1, 1\}^n \rightarrow \{-1, 1\}$ in AC$^0$, such that $A$ has sign-rank $\exp(\Omega(n^{1/3}))$. We prove a generalization of Razborov and Sherstov's result, yielding exponential sign-rank lower bounds for a non-trivial class of functions (that includes the function used by Razborov and Sherstov). As a corollary of our general result, we improve Razborov and Sherstov's lower bound on the sign-rank of AC$^0$ from $\exp(\Omega(n^{1/3}))$ to $\exp(\tilde{\Omega}(n^{2/5}))$. We also describe several applications to communication complexity, learning theory, and circuit complexity. Lower Bounds for the Approximate Degree of Block-Composed Functions Justin Thaler ICALP 2016 Links: [ECCC] [slides] Abstract: We describe a new hardness amplification result for point-wise approximation of Boolean functions by low-degree polynomials. Specifically, for any function $f$ on $N$ bits, define $F(x_1, \dots, x_M)=\textsf{OMB}(f(x_1), ..., f(x_M))$ to be the function on $M\cdot N $bits obtained by block-composing $f$ with a specific DNF known as ODD-MAX-BIT. We show that, if $f$ requires large degree to approximate to error $2/3$ in a certain one-sided sense (captured by a complexity measure known as positive one-sided approximate degree), then $F$ requires large degree to approximate even to error $1-2^{-M}$. This generalizes a result of Beigel, who proved an identical result for the special case $f=\textsf{OR}$. Unlike related prior work, our result implies strong approximate degree lower bounds even for many functions $F$ that have low threshold degree. Our proof is constructive: we exhibit a solution to the dual of an appropriate linear program capturing the approximate degree of any function. Semi-Streaming Algorithms for Annotated Graph Streams Justin Thaler ICALP 2016 Links: [arXiv] [slides] Abstract: Considerable effort has been devoted to the development of streaming algorithms for analyzing massive graphs. Unfortunately, many results have been negative, establishing that a wide variety of problems require $\Omega(n^2)$ space to solve. One of the few bright spots has been the development of semi-streaming algorithms for a handful of graph problems -- these algorithms use space $O(n\cdot\text{polylog}(n))$. In the annotated data streaming model of Chakrabarti et al., a computationally limited client wants to compute some property of a massive input, but lacks the resources to store even a small fraction of the input, and hence cannot perform the desired computation locally. The client therefore accesses a powerful but untrusted service provider, who not only performs the requested computation, but also proves that the answer is correct. We put forth the notion of semi-streaming algorithms for annotated graph streams (semi-streaming annotation schemes for short). These are protocols in which both the client's space usage and the length of the proof are $O(n \cdot \text{polylog}(n))$. We give evidence that semi-streaming annotation schemes represent a substantially more robust solution concept than does the standard semi-streaming model. On the positive side, we give semi-streaming annotation schemes for two dynamic graph problems that are intractable in the standard model: (exactly) counting triangles, and (exactly) computing maximum matchings. The former scheme answers a question of Cormode. On the negative side, we identify for the first time two natural graph problems (connectivity and bipartiteness in a certain edge update model) that can be solved in the standard semi-streaming model, but cannot be solved by annotation schemes of "sub-semi-streaming" cost. That is, these problems are just as hard in the annotations model as they are in the standard model. Space Lower Bounds for Itemset Frequency Sketches Edo Liberty, Michael Mitzenmacher, Justin Thaler, and Jonathan Ullman, PODS 2016 Links: [arXiv] Abstract: Given a database, computing the fraction of rows that contain a query itemset or determining whether this fraction is above some threshold are fundamental operations in data mining. A uniform sample of rows is a good sketch of the database in the sense that all sufficiently frequent itemsets and their approximate frequencies are recoverable from the sample, and the sketch size is independent of the number of rows in the original database. For many seemingly similar problems there are better sketching algorithms than uniform sampling. In this paper we show that for itemset frequency sketching this is not the case. That is, we prove that there exist classes of databases for which uniform sampling is a space optimal sketch for approximate itemset frequency analysis, up to constant or iterated-logarithmic factors. A Framework for Estimating Stream Expression Cardinalities Anirban Dasgupta, Kevin Lang, Lee Rhodes, and Justin Thaler ICDT 2016; Best Newcomer Paper Award; Invited to special issue of TODS for ICDT 2016 Links: [arXiv] Abstract: Given $m$ distributed data streams $A_1,\dots,A_m$, we consider the problem of estimating the number of unique identifiers in streams defined by set expressions over $A_1,\dots,A_m$. We identify a broad class of algorithms for solving this problem, and show that the estimators output by any algorithm in this class are perfectly unbiased and satisfy strong variance bounds. Our analysis unifies and generalizes a variety of earlier results in the literature. To demonstrate its generality, we describe several novel sampling algorithms in our class, and show that they achieve a novel tradeoff between accuracy, space usage, update speed, and applicability.
2015	Streaming Verification in Data Analysis Samira Daruki, Justin Thaler, and Suresh Venkatasubramanian ISAAC 2015 Links: [arXiv] Abstract: Streaming interactive proofs (SIPs) are a framework to reason about outsourced computation, where a data owner (the verifier) outsources a computation to the cloud (the prover), but wishes to verify the correctness of the solution provided by the cloud service. In this paper we present streaming interactive proofs for problems in data analysis. We present protocols for clustering and shape fitting problems, as well as an improved protocol for rectangular matrix multiplication. The latter can in turn be used to verify $k$ eigenvectors of a (streamed) $n \times n$ matrix. In general our solutions use polylogarithmic rounds of communication and polylogarithmic total communication and verifier space. For special cases (when optimality certificates can be verified easily), we present constant round protocols with similar costs. For rectangular matrix multiplication and eigenvector verification, our protocols work in the more restricted annotated data streaming model, and use sublinear (but not polylogarithmic) communication. Variable Selection is Hard Dean Foster, Howard Karloff, and Justin Thaler COLT 2015 Links: [arXiv] [Slides (Short Talk)] Abstract: Variable selection for sparse linear regression is the problem of finding, given an $m \times p$ matrix $B$ and a target vector $y$, a sparse vector $x$ such that $B \cdot x$ approximately equals $y$. Assuming a standard complexity hypothesis, we show that no polynomial-time algorithm can find a $k'$-sparse $x$ with $\\|Bx-y\\|^2 \leq h(m,p)$, where $k'=k \cdot 2^{\log^{1-\delta} p}$ and $h(m,p) \leq p^{C_1} \cdot m^{1-C_2}$, where $\delta>0, C_1>0,C_2>0$ are arbitrary. This is true even under the promise that there is an unknown $k$-sparse vector $x^$ satisfying $Bx^=y$. We prove a similar result for a statistical version of the problem in which the data are corrupted by noise. To the authors' knowledge, these are the first hardness results for sparse regression that apply when the algorithm simultaneously has $k'>k$ and $h(m,p)>0$. Hardness Amplification and the Approximate Degree of Constant-Depth Circuits Mark Bun and Justin Thaler ICALP 2015 Links: [ECCC] [Slides] [Talk Video] Abstract: We establish a generic form of hardness amplification for the approximability of constant-depth Boolean circuits by polynomials. Specifically, we show that if a Boolean circuit cannot be pointwise approximated by low-degree polynomials to within constant error in a certain one-sided sense, then an OR of disjoint copies of that circuit cannot be pointwise approximated even with very high error. As our main application, we show that for every sequence of degrees $d(n)$, there is an explicit depth-three circuit $F: \{-1,1\}^n \to \{-1,1\}$ of polynomial-size such that any degree-$d$ polynomial cannot pointwise approximate $F$ to error better than $1-\exp\left(-\tilde{\Omega}(nd^{-3/2})\right)$. As a consequence of our main result, we obtain an $\exp\left(-\tilde{\Omega}(n^{2/5})\right)$ upper bound on the the discrepancy of a function in AC$^0$, and an $\exp\left(\tilde{\Omega}(n^{2/5})\right)$ lower bound on the threshold weight of AC$^0$, improving over the previous best results of $\exp\left(-\Omega(n^{1/3})\right)$ and $\exp\left(\Omega(n^{1/3})\right)$ respectively. Our techniques also yield a new lower bound of $\Omega\left(n^{1/2}/\log^{(d-2)/2}(n)\right)$ on the approximate degree of the AND-OR tree of depth $d$, which is tight up to polylogarithmic factors for any constant $d$, as well as new bounds for read-once DNF formulas. In turn, these results imply new lower bounds on the communication and circuit complexity of these classes, and demonstrate strong limitations on existing PAC learning algorithms. Verifiable Stream Computation and Arthur-Merlin Communication Amit Chakrabarti, Graham Cormode, Andrew McGregor, Justin Thaler, and Suresh Venkatasubramanian CCC 2015, SICOMP Links: [ECCC] [Slides] [Talk Video] Abstract: In the setting of streaming interactive proofs (SIPs), a client (verifier) needs to compute a given function on a massive stream of data, arriving online, but is unable to store even a small fraction of the data. It outsources the processing to a third party service (prover), but is unwilling to blindly trust answers returned by this service. Thus, the service cannot simply supply the desired answer; it must convince the verifier of its correctness via a short interaction after the stream has been seen. In this work we study "barely interactive" SIPs. Specifically, we show that two or three rounds of interaction suffice to solve several query problems --- including Index, Median, Nearest Neighbor Search, Pattern Matching, and Range Counting---with polylogarithmic space and communication costs. Such efficiency with $O(1)$ rounds of interaction was thought to be impossible based on previous work. On the other hand, we initiate a formal study of the limitations of constant-round SIPs by introducing a new hierarchy of communication models called Online Interactive Proofs (OIPs). The online nature of these models is analogous to the streaming restriction placed upon the verifier in an SIP. We give upper and lower bounds that (1) characterize, up to quadratic blowups, every finite level of the OIP hierarchy in terms of other well-known communication complexity classes, (2) separate the first four levels of the hierarchy, and (3) reveal that the hierarchy collapses to the fourth level. Our study of OIPs reveals marked contrasts and some parallels with the classic Turing Machine theory of interactive proofs, establishes limits on the power of existing techniques for developing constant-round SIPs, and provides a new characterization of (non-online) Arthur-Merlin communication in terms of an online model. A Note on the GKR Protocol Justin Thaler Manuscript Links: [pdf] Abstract: This note describes a simplification of the GKR interactive proof for circuit evaluation (Goldwasser, Kalai, and Rothblum, J. ACM 2015), as efficiently instantiated by Cormode, Mitzenmacher, and Thaler (ITCS 2012). The simplification reduces the prover runtime, round complexity, and total communication cost of the protocol by roughly 33%. Verifiable Computation Using Multiple Provers Andrew J. Blumberg, Justin Thaler, Victor Vu, and Michael Walfish Manuscript Links: [e-print] Abstract: The increasing ubiquity of the cloud computing paradigm has renewed focus on the classical problem of allowing weak clients to check the results of computation delegated to powerful servers. Recent advances in proof-based verifiable computation have led to several near-practical protocols. Protocols based on interactive proofs (IPs) work with highly restrictive models of computation and are thus efficient only for a limited class of computations. In contrast, protocols based on argument systems apply to a much larger class of computations, but efficiency requires amortization of very expensive setup costs. This paper initiates the study of the practical efficiency of multiprover interactive proofs (MIPs). We present a new MIP for delegating computation that extends insights from a powerful IP protocol (Goldwasser et al., STOC, 2008). Without reductions or amplification, our protocol uses only two provers (departing from prior work on MIPs), and achieves both the efficiency of interactive proof-based protocols and the generality of argument system-based protocols. Also, this result, together with recently developed machinery, creates a potential avenue toward concretely efficient arguments without setup costs. We describe Clover, a built system for verifiable computation, based on our protocol. Although Clover does not implement the full theory (it has setup costs), it applies to problems that existing IPs cannot efficiently handle, and achieves performance comparable to, or better than, the best argument systems.
2014	Distribution-Independent Reliable Learning Varun Kanade and Justin Thaler COLT 2014 Links: [arXiv] Abstract: We study several questions in the reliable agnostic learning framework of Kalai et al. (2009), which captures learning tasks in which one type of error is costlier than other types. A positive reliable classifier is one that makes no false positive errors. The goal in the positive reliable agnostic framework is to output a hypothesis with the following properties: (i) its false positive error rate is at most $\epsilon$, (ii) its false negative error rate is at most $\epsilon$ more than that of the best positive reliable classifier from the class. A closely related notion is fully reliable agnostic learning, which considers partial classifiers that are allowed to predict ``unknown'' on some inputs. The best fully reliable partial classifier is one that makes no errors and minimizes the probability of predicting ``unknown'', and the goal in fully reliable learning is to output a hypothesis that is almost as good as the best fully reliable partial classifier from a class. For distribution-independent learning, the best known algorithms for PAC learning typically utilize polynomial threshold representations, while the state of the art agnostic learning algorithms use point-wise polynomial approximations. We show that one-sided polynomial approximations, an intermediate notion between polynomial threshold representations and point-wise polynomial approximations, suffice for learning in the reliable agnostic settings. We then show that majorities can be fully reliably learned and disjunctions of majorities can be positive reliably learned, through constructions of appropriate one-sided polynomial approximations. Our fully reliable algorithm for majorities provides the first evidence that fully reliable learning may be strictly easier than agnostic learning. Our algorithms also satisfy strong attribute-efficiency properties, and in many cases they provide smooth tradeoffs between sample complexity and running time. Parallel Peeling Algorithms Jiayang Jiang, Michael Mitzenmacher, and Justin Thaler SPAA 2014; Winner of Best Paper Award; full version in TOPC (special issue for SPAA 2014) Links: [arXiv] [slides] [Talk Video] Abstract: The analysis of several algorithms and data structures can be framed as a peeling process on a random hypergraph: vertices with degree less than $k$ are removed until there are no vertices of degree less than $k$ left. The remaining hypergraph is known as the $k$-core. In this paper, we analyze parallel peeling processes, where in each round, all vertices of degree less than k are removed. It is known that, below a specific edge density threshold, the $k$-core is empty with high probability. We show that, with high probability, below this threshold, only $(\log \log n)/\log(k-1)(r-1) + O(1)$ rounds of peeling are needed to obtain the empty $k$-core for $r$-uniform hypergraphs. Interestingly, we show that above this threshold, $\Omega(\log n)$ rounds of peeling are required to find the non-empty $k$-core. Since most algorithms and data structures aim to peel to an empty $k$-core, this asymmetry appears fortunate. We verify the theoretical results both with simulation and with a parallel implementation using graphical processing units (GPUs). Our implementation provides insights into how to structure parallel peeling algorithms for efficiency in practice. Annotations in Data Streams Amit Chakrabarti, Graham Cormode, Andrew McGregor, and Justin Thaler ACM Transactions on Algorithms Links: [ECCC] Note: This paper supersedes a preliminary version by Chakrabarti, Cormode, and McGregor that appeared in ICALP 2009. Abstract: The central goal of data stream algorithms is to process massive streams of data using sublinear storage space. Motivated by work in the database community on outsourcing database and data stream processing, we ask whether the space usage of such algorithms can be further reduced by enlisting a more powerful "helper" who can annotate the stream as it is read. We do not wish to blindly trust the helper, so we require that the algorithm be convinced of having computed a correct answer. We show upper bounds that achieve a non-trivial tradeoff between the amount of annotation used and the space required to verify it. We also prove lower bounds on such tradeoffs, often nearly matching the upper bounds, via notions related to Merlin-Arthur communication complexity. Our results cover the classic data stream problems of selection, frequency moments, and fundamental graph problems such as triangle-freeness and connectivity. Our work is also part of a growing trend---including recent studies of multi-pass streaming, read/write streams and randomly ordered streams---of asking more complexity-theoretic questions about data stream processing. It is a recognition that, in addition to practical relevance, the data stream model raises many interesting theoretical questions in its own right. Faster Private Release of Marginals on Small Databases Karthekeyan Chandrasekaran, Justin Thaler, Jonathan Ullman, and Andrew Wan ITCS 2014 Links: [arXiv] Abstract: We study the problem of answering $k$-way marginal queries on a database $D \in (\{0,1\}^d)^n$, while preserving differential privacy. The answer to a $k$-way marginal query is the fraction of the database's records $x \in \{0,1\}^d$ with a given value in each of a given set of up to $k$ columns. Marginal queries enable a rich class of statistical analyses on a dataset, and designing efficient algorithms for privately answering marginal queries has been identified as an important open problem in private data analysis. For any $k$, we give a differentially private online algorithm that runs in time $$\mathrm{poly}\left(n, 2^{o(d)} \right)$$ per query and answers any sequence of $\mathrm{poly}(n)$ many $k$-way marginal queries with error at most $\pm 0.01$ on every query, provided $n \gtrsim d^{0.51} $. To the best of our knowledge, this is the first algorithm capable of privately answering marginal queries with a non-trivial worst-case accuracy guarantee for databases containing $\mathrm{poly}(d, k)$ records in time $\exp(o(d))$. Our algorithm runs the private multiplicative weights algorithm (Hardt and Rothblum, FOCS '10) on a new approximate polynomial representation of the database. We derive our representation for the database by approximating the OR function restricted to low Hamming weight inputs using low-degree polynomials with coefficients of bounded $L_1$-norm. In doing so, we show new upper and lower bounds on the degree of such polynomials, which may be of independent approximation-theoretic interest. Annotations for Sparse Data Streams Amit Chakrabarti, Graham Cormode, Navin Goyal. and Justin Thaler SODA 2014 Links:[arXiv] [Slides] Abstract: Motivated by cloud computing, a number of recent works have studied annotated data streams and variants thereof. In this setting, a computationally weak verifier (cloud user), lacking the resources to store and manipulate his massive input locally, accesses a powerful but untrusted prover (cloud service). The verifier must work within the restrictive data streaming paradigm. The prover, who can annotate the data stream as it is read, must not just supply the answer but also convince the verifier of its correctness. Ideally, both the amount of annotation and the space used by the verifier should be sublinear in the relevant input size parameters. A rich theory of such algorithms---which we call schemes---has emerged. Prior work has shown how to leverage the prover's power to efficiently solve problems that have no non-trivial standard data stream algorithms. However, while optimal schemes are now known for several basic problems, such optimality holds only for streams whose length is commensurate with the size of the data universe. In contrast, many real-world datasets are relatively sparse, including graphs that contain only $O(n^2)$ edges, and IP traffic streams that contain much fewer than the total number of possible IP addresses, $2^{128}$ in IPv6. We design the first schemes that allow both the annotation and the space usage to be sublinear in the total number of stream updates rather than the size of the data universe. We solve significant problems, including variations of INDEX, SET-DISJOINTNESS, and FREQUENCY-MOMENTS, plus several natural problems on graphs. On the other hand, we give a new lower bound that, for the first time, rules out smooth tradeoffs between annotation and space usage for a specific problem. Our technique brings out new nuances in Merlin-Arthur communication complexity models, and provides a separation between online versions of the MA and AMA models.
2013	Time-Optimal Interactive Proofs for Circuit Evaluation Justin Thaler CRYPTO 2013 Links: [Extended Abstract] [Full Paper (arXiv)] [Source Code] [Slides] [Talk Video] Abstract: Recently, researchers have been working toward the development of practical general-purpose protocols for verifiable computation. These protocols enable a computationally weak verifier to offload computations to a powerful but untrusted prover, while providing the verifier with a guarantee that the prover performed the computations correctly. Despite substantial progress, existing implementations are not yet practical. The main bottleneck is typically the extra effort required by the prover to return an answer with a guarantee of correctness, compared to returning an answer with no guarantee. We describe a refinement of a powerful interactive proof protocol originally due to Goldwasser, Kalai, and Rothblum. Cormode, Mitzenmacher, and Thaler show how to implement the prover in this protocol in time $O(S \log S)$, where $S$ is the size of an arithmetic circuit computing the function of interest. Our refinements apply to circuits whose wiring pattern is sufficiently "regular"; for these circuits, we bring the runtime of the prover down to $O(S)$. That is, our prover can evaluate the circuit with a guarantee of correctness, with only a constant-factor blowup in work compared to evaluating the circuit with no guarantee. We argue that our refinements capture a large class of circuits, and prove some theorems formalizing this. Experimentally, our refinements yield a 200x speedup for the prover over the implementation of Cormode et al., and our prover is less than 10x slower than a C++ program that simply evaluates the circuit. Along the way, we describe a special-purpose protocol for matrix multiplication that is of interest in its own right. Our final contribution is a protocol targeted at general data parallel computation. Compared to prior work, this protocol can more efficiently verify complicated computations as long as that computation is applied independently to many pieces of data. Dual Lower Bounds for Approximate Degree and Markov-Bernstein Inequalities Mark Bun and Justin Thaler ICALP 2013; Winner of Best Paper award for Track A; Journal version in Information and Computation (special issue for ICALP 2013) Links: [arXiv] [Expository Blog Post] Abstract: The $\epsilon$-approximate degree of a Boolean function $f: \{-1, 1\}^n \to \{-1, 1\}$ is the minimum degree of a real polynomial that approximates $f$ to within $\epsilon$ in the $\ell_\infty$ norm. We prove several lower bounds on this important complexity measure by explicitly constructing solutions to the dual of an appropriate linear program. Our first result resolves the $\epsilon$-approximate degree of the two-level AND-OR tree for any constant $\epsilon > 0$. We show that this quantity is $\Theta(\sqrt{n})$, closing a line of incrementally larger lower bounds. The same lower bound was recently obtained independently by Sherstov using related techniques. Our second result gives an explicit dual polynomial that witnesses a tight lower bound for the approximate degree of any symmetric Boolean function, addressing a question of \v{S}palek. Our final contribution is to reprove several Markov-type inequalities from approximation theory by constructing explicit dual solutions to natural linear programs. These inequalities underly the proofs of many of the best-known approximate degree lower bounds, and have important uses throughout theoretical computer science.
2012	Faster Algorithms for Privately Releasing Marginals Justin Thaler, Jonathan Ullman, and Salil Vadhan ICALP 2012 Links: [arXiv] Abstract: We study the problem of releasing $k$-way marginals of a database $D \in (\{0,1\}^d)^n$, while preserving differential privacy. The answer to a $k$-way marginal query is the fraction of $D$'s records $x \in \{0,1\}^d$ with a given value in each of a given set of up to $k$ columns. Marginal queries enable a rich class of statistical analyses of a dataset, and designing efficient algorithms for privately releasing marginal queries has been identified as an important open problem in private data analysis (cf.~Barak et.~al., PODS '07). We give an algorithm that runs in time $d^{O(\sqrt{k})}$ and releases a private summary capable of answering any $k$-way marginal query with at most $\pm .01$ error on every query as long as $n \geq d^{O(\sqrt{k})}$. To our knowledge, ours is the first algorithm capable of privately releasing marginal queries with non-trivial worst-case accuracy guarantees in time substantially smaller than the number of $k$-way marginal queries, which is $d^{\Theta(k)}$ (for $k \ll d$). Attribute-Efficient Learning and Weight-Degree Tradeoffs for Polynomial Threshold Functions Li-Yang Tan, Rocco Servedio, and Justin Thaler COLT 2012 Links: [ECCC version] [Expository Blog Post] [Slides (20 Minute Presentation)] [Li-Yang's Slides (50 Minute Presentation)] Abstract: We study the challenging problem of learning decision lists attribute-efficiently, giving both positive and negative results. Our main positive result is a new tradeoff between the running time and mistake bound for learning length-$k$ decision lists over $n$ Boolean variables. When the allowed running time is relatively high, our new mistake bound improves significantly on the mistake bound of the best previous algorithm of Klivans and Servedio. Our main negative result is a new lower bound on the weight of any degree-$d$ polynomial threshold function (PTF) that computes a particular decision list over $k$ variables (the ``Odd-Max-Bit'' function). The main result of Beigel (Computational Complexity, 1994) is a weight lower bound of $2^{\Omega(k/d^2)}$, which was shown to be essentially optimal for $d \leq k^{1/3}$ by Klivans and Servedio. Here we prove a $2^{\Omega(\sqrt{k/d})}$ lower bound, which improves on Beigel's lower bound for $d > k^{1/3}.$ This lower bound establishes strong limitations on the effectiveness of the Klivans and Servedio approach and suggests that it may be difficult to improve on our positive result. The main tool used in our lower bound is a new variant of Markov's classical inequality which may be of independent interest; it provides a bound on the derivative of a univariate polynomial in terms of both its degree and the size of its coefficients. Peeling Arguments and Double Hashing Michael Mitzenmacher, and Justin Thaler Allerton 2012 (Invited Paper) Links: [pdf] Abstract: The analysis of several algorithms and data structures can be reduced to the analysis of the following greedy "peeling" process: start with a random hypergraph; find a vertex of degree at most $k$, and remove it and all of its adjacent hyperedges from the graph; repeat until there is no suitable vertex. This specific process finds the $k$-core of a hypergraph, and variations on this theme have proven useful in analyzing for example decoding from low-density parity-check codes, several hash-based data structures such as cuckoo hashing, and algorithms for satisfiability of random formulae. This approach can be analyzed several ways, with two common approaches being via a corresponding branching process or a fluid limit family of differential equations. In this paper, we make note of an interesting aspect of these types of processes: the results are generally the same when the randomness is structured in the manner of double hashing. This phenomenon allows us to use less randomness and simplify the implementation for several hash-based data structures and algorithms. We explore this approach from both an empirical and theoretical perspective, examining theoretical justifications as well as simulation results for specific problems. Verifiable Computation with Massively Parallel Interactive Proofs Justin Thaler, Mike Roberts, Michael Mitzenmacher, and Hanspeter Pfister HotCloud 2012 Links: [arXiv] [source code] [extended abstract] [Talk Video] [Slides] Abstract: As the cloud computing paradigm has gained prominence, the need for verifiable computation has grown increasingly urgent. The concept of verifiable computation enables a weak client to outsource difficult computations to a powerful, but untrusted, server. Protocols for verifiable computation aim to provide the client with a guarantee that the server performed the requested computations correctly, without requiring the client to perform the computations herself. By design, these protocols impose a minimal computational burden on the client. However, existing protocols require the server to perform a large amount of extra bookkeeping in order to enable a client to easily verify the results. Verifiable computation has thus remained a theoretical curiosity, and protocols for it have not been implemented in real cloud computing systems. Our goal is to leverage GPUs to reduce the server-side slowdown for verifiable computation. To this end, we identify abundant data parallelism in a state-of-the-art general-purpose protocol for verifiable computation, originally due to Goldwasser, Kalai, and Rothblum, and recently extended by Cormode, Mitzenmacher, and Thaler. We implement this protocol on the GPU, obtaining 40-120x server-side speedups relative to a state-of-the-art sequential implementation. For benchmark problems, our implementation reduces the slowdown of the server to factors of 100-500x relative to the original computations requested by the client. Furthermore, we reduce the already small runtime of the client by 100x. Similarly, we obtain 20-50x server-side and client-side speedups for related protocols targeted at specific streaming problems. We believe our results demonstrate the immediate practicality of using GPUs for verifiable computation, and more generally that protocols for verifiable computation have become sufficiently mature to deploy in real cloud computing systems. Continuous Time Channels with Interference Ioanna Ivan, Michael Mitzenmacher, Justin Thaler, and Henry Yuen ISIT 2012 Links: [arXiv] Abstract: Khanna and Sudan studied a natural model of continuous time channels where signals are corrupted by the effects of both noise and delay, and showed that, surprisingly, in some cases both are not enough to prevent such channels from achieving unbounded capacity. Inspired by their work, we consider channels that model continuous time communication with adversarial delay errors. The sender is allowed to subdivide time into an arbitrarily large number $M$ of micro-units in which binary symbols may be sent, but the symbols are subject to unpredictable delays and may interfere with each other. We model interference by having symbols that land in the same micro-unit of time be summed, and we study $k$-interference channels, which allow receivers to distinguish sums up to the value $k$. We consider both a channel adversary that has a limit on the maximum number of steps it can delay each symbol, and a more powerful adversary that only has a bound on the average delay. We give precise characterizations of the threshold between finite and infinite capacity depending on the interference behavior and on the type of channel adversary: for max-bounded delay, the threshold is at $D_{\text{max}}=\Theta(M \log\min{k, M}))$, and for average bounded delay the threshold is at $D_{\text{avg}} = \Theta(\sqrt{M \cdot \min\{k, M\}})$. Hierarchical Heavy Hitters with the Space Saving Algorithm Michael Mitzenmacher, Thomas Steinke, and Justin Thaler ALENEX 2012 Links: [arXiv] [source code] Abstract: The Hierarchical Heavy Hitters problem extends the notion of frequent items to data arranged in a hierarchy. This problem has applications to network traffic monitoring, anomaly detection, and DDoS detection. We present a new streaming approximation algorithm for computing Hierarchical Heavy Hitters that has several advantages over previous algorithms. It improves on the worst-case time and space bounds of earlier algorithms, is conceptually simple and substantially easier to implement, offers improved accuracy guarantees, is easily adopted to a distributed or parallel setting, and can be efficiently implemented in commodity hardware such as ternary content addressable memory (TCAMs). We present experimental results showing that for parameters of primary practical interest, our two-dimensional algorithm is superior to existing algorithms in terms of speed and accuracy, and competitive in terms of space, while our one-dimensional algorithm is also superior in terms of speed and accuracy for a more limited range of parameters. Practical Verified Computation with Streaming Interactive Proofs Graham Cormode, Michael Mitzenmacher, and Justin Thaler ITCS 2012 Links: [arXiv] [source code] [slides] Abstract: When delegating computation to a service provider, as in cloud computing, we seek some reassurance that the output is correct and complete. Yet recomputing the output as a check is inefficient and expensive, and it may not even be feasible to store all the data locally. We are therefore interested in proof systems which allow a service provider to prove the correctness of its output to a streaming (sublinear space) user, who cannot store the full input or perform the full computation herself. Our approach is two-fold. First, we describe a carefully chosen instantiation of one of the most efficient general-purpose constructions for arbitrary computations (streaming or otherwise), due to Goldwasser, Kalai, and Rothblum. This requires several new insights to make the methodology more practical. Our main contribution is in achieving a prover who runs in time $O(S(n) \log S(n))$, where $S(n)$ is the size of an arithmetic circuit computing the function of interest. Our experimental results demonstrate that a practical general-purpose protocol for verifiable computation may be significantly closer to reality than previously realized. Second, we describe techniques that achieve genuine scalability for protocols fine-tuned for specific important problems in streaming and database processing. Focusing in particular on non-interactive protocols for problems ranging from matrix-vector multiplication to bipartite perfect matching, we build on prior work to achieve a prover who runs in nearly linear-time, while obtaining optimal tradeoffs between communication cost and the user's working memory. Existing techniques required (substantially) superlinear time for the prover. We argue that even if general-purpose methods improve, fine-tuned protocols will remain valuable in real-world settings for key problems, and hence special attention to specific problems is warranted. Verifying Computations with Streaming Interactive Proofs Graham Cormode, Justin Thaler, and Ke Yi VLDB 2012 Links: [arXiv] [Conference slides] Abstract: When computation is outsourced, the data owner would like to be assured that the desired computation has been performed correctly by the service provider. In theory, proof systems can give the necessary assurance, but prior work is not sufficiently scalable or practical. In this paper, we develop new proof protocols for verifying computations which are streaming in nature: the verifier (data owner) needs only logarithmic space and a single pass over the input, and after observing the input follows a simple protocol with a prover (service provider) that takes logarithmic communication spread over a logarithmic number of rounds. These ensure that the computation is performed correctly: that the service provider has not made any errors or missed out some data. The guarantee is very strong: even if the service provider deliberately tries to cheat, there is only vanishingly small probability of doing so undetected, while a correct computation is always accepted. We first observe that some theoretical results can be modified to work with streaming verifiers, showing that there are efficient protocols for problems in the complexity classes NP and NC. Our main results then seek to bridge the gap between theory and practice by developing usable protocols for a variety of problems of central importance in streaming and database processing. All these problems require linear space in the traditional streaming model, and therefore our protocols demonstrate that adding a prover can exponentially reduce the effort needed by the verifier. Our experimental results show that our protocols are practical and scalable. Cache-Oblivious Dictionaries and Multimaps with Negligible Failure Probability Michael T. Goodrich, Dan Hirschberg, Michael Mitzenmacher, and Justin Thaler MedAlg 2012 Links: [pdf] [slides] Abstract: A dictionary (or map) is a key-value store that requires all keys be unique, and a multimap is a key-value store that allows for multiple values to be associated with the same key. We design hashing- based indexing schemes for dictionaries and multimaps that achieve worst-case optimal performance for lookups and updates, with minimal space overhead and sub-polynomial probability that the data structure will require a rehash operation. Our dictionary structure is designed for the Random Access Machine (RAM) model, while our multimap implementation is designed for the cache-oblivious external memory (I/O) model. The failure probabilities for our structures are sub-polynomial, which can be useful in cryptographic or data-intensive applications.
2011	On the Zero-Error Capacity Threshold for Deletion Channels Ian Kash, Michael Mitzenmacher, Justin Thaler, and Jonathan Ullman ITA 2011 Links: [arXiv] Abstract: We consider the zero-error capacity of deletion channels. Specifically, we consider the setting where we choose a codebook $C$ consisting of strings of n bits, and our model of the channel corresponds to an adversary who may delete up to $pn$ of these bits for a constant $p$. Our goal is to decode correctly without error regardless of the actions of the adversary. We consider what values of $p$ allow non-zero capacity in this setting. We suggest multiple approaches, one of which makes use of the natural connection between this problem and the problem of finding the expected length of the longest common subsequence of two random sequences. External Memory Multimaps Elaine Angelino, Michael Mitzenmacher, and Justin Thaler ISAAC 2011; Algorithmica (special issue for ISAAC 2011) Links:[arXiv] Abstract: Many data structures support dictionaries, also known as maps or associative arrays, which store and manage a set of key-value pairs. A \emph{multimap} is generalization that allows multiple values to be associated with the same key. For example, the inverted file data structure that is used prevalently in the infrastructure supporting search engines is a type of multimap, where words are used as keys and document pointers are used as values. We study the multimap abstract data type and how it can be implemented efficiently online in external memory frameworks, with constant expected I/O performance. The key technique used to achieve our results is a combination of cuckoo hashing using buckets that hold multiple items with a multiqueue implementation to cope with varying numbers of values per key. Our external-memory results are for the standard two-level memory model.
2010	Streaming Graph Computations with a Helpful Advisor Graham Cormode, Michael Mitzenmacher, and Justin Thaler ESA 2010; Algorithmica 2013 Links: [ECCC] [Conference Slides] Abstract: Motivated by the trend to outsource work to commercial cloud computing services, we consider a variation of the streaming paradigm where a streaming algorithm can be assisted by a powerful helper that can provide annotations to the data stream. We extend previous work on such annotation models by considering a number of graph streaming problems. Without annotations, streaming algorithms for graph problems generally require significant memory; we show that for many standard problems, including all graph problems that can be expressed with totally unimodular integer programming formulations, only constant space (measured in words) is needed for single-pass algorithms given linear-sized annotations. We also obtain protocols achieving essentially optimal tradeoffs between annotation length and memory usage for several important problems, including integer matrix-vector multiplication, as well as shortest $s$-$t$ path in small-diameter graphs. We also obtain non-trivial tradeoffs for minimum weight bipartite perfect matching and shortest $s$-$t$ path in general graphs.
2009	Graph Covers and Quadratic Minimization Nicholas Ruozzi, Sekhar Tatikonda, and Justin Thaler Allerton 2009 Links: [pdf] [slides] Abstract: We formulate a new approach to understanding the behavior of the min-sum algorithm by exploiting the properties of graph covers. First, we present a new, natural characterization of scaled diagonally dominant matrices in terms of graph covers; this result motivates our approach because scaled diagonal dominance is a known sufficient condition for the convergence of min-sum in the case of quadratic minimization. We use our understanding of graph covers to characterize the periodic behavior of the min-sum algorithm on a single cycle. Lastly, we explain how to extend the single cycle results to understand the 2-periodic behavior of min-sum for general pairwise MRFs. Some of our techniques apply more broadly, and we believe that by capturing the notion of indistinguishability, graph covers represent a valuable tool for understanding the abilities and limitations of general message-passing algorithms.
Expository Articles	Technical Perspective: Catching Lies (and Mistakes) in Offloaded Computation Michael Mitzenmacher, and Justin Thaler CACM, February 2016 Links: [article on CACM website] Stream Verification Justin Thaler Springer's Encyclopedia of Algorithms, 2015 Links: [arXiv] [slides] Abstract: In this short article, we survey models and algorithms for stream verification.

Academic Service

I have been, or currently am, a member of the following program committees:

RANDOM 2022
TCC 2021
ESA 2021
TCC 2020
STOC 2019
SOSA 2019
SODA 2018
FSTTCS 2017
ICALP 2016
ALENEX 2016
SDM 2015

Open Source Software

I am a co-creator and core contributor to DataSketches, an open source library of production quality streaming algorithms for a variety of important query classes, including identifying frequent items, unique count queries, quantiles, and subset sum queries.

Slides From Survey Talks

Interactive Proofs. Presented at the Proofs, Consensus, and Decentralizing Society Boot Camp, Simons Institute for the Theory of Computing, August 2019. Video of Part 1 Video of Part 2
A Crash Course on Fast Interactive Proofs. A followup to the talk above, presented in September 2019 during the Simons Institute semester on Proofs, Consensus, and Decentralizing Society.
Approximate Degree: A Survey. Presented at the 2018 CMO workshop on Analytic Techniques in Theoretical Computer Science.
Verifiable Computing: Between Theory and Practice. Presented at the STOC 2017 Theory Fest Workshop on "Probabilistically checkable and interactive proofs (PCP/IP): between theory and practice".
Chebyshev Polynomials, Approximate Degree, and their Applications. FOCS 2016 Workshop on "(Some) Orthogonal Polynomials and their Applications to TCS". [talk video available here]
Approximate Degree and the Method of Dual Polynomials. Columbia University Theory Seminar, November 2014.
Provides a unified overview of [BT13], [BT14], and [Tha14].
Modern Verifiable Computation. A 3-hour tutorial on interactive proofs and argument systems. Presented at 2016 Summer School on Secure and Oblivious Computation and Outsourcing at University of Notre Dame.

Miscellaneous Links

Yale Probabilistic Networks Group.
A website describing my project on streaming entropy computation during the 2007 REU at DIMACS. Note that I can no longer maintain the webpage, and the links to code on the site are now broken; however, the code can be accessed here. The site contains implementations of the algorithms described in A Near-Optimal Algorithm for Computing the Entropy of a Stream, by A. Chakrabarti, G. Cormode, and A. McGregor (In ACM Transactions on Algorithms, 6 (2010), no. 3, pg. 1-21). The REU was a terrific experience.