Authors are listed in alphabetical order of last name. Safe to assume equal contribution unless the primary author is highlighted explicitly.

Publications

Refutations of Perfect Matchings In Expanders Is Hard

Ari Biswas    Rajko Nenadov
This work studies the complexity of refuting the existence of a perfect matching in spectral expanders with an odd number of vertices, in the Polynomial Calculus (PC) and Sum of Squares (SoS) proof system. Austrin and Risse [SODA, 2021] showed that refuting perfect matchings in sparse \(d\)-regular random graphs in the above proof systems requires proofs with degree \(\Omega(n/\log n)\) on average. We extend their result by showing the same lower bound holds for all \(d\)-regular graphs with a mild spectral gap.Additionally, as Austrin and Risse do for random graphs, we show that the hardness of refuting a more general class of formulae, which includes both perfect matchings and even-colouring formulas, also reduces to the hardness of refuting perfect matchings in spectral expanders, thereby settling an open problem of Buss and Nordstrom.

Differentially Private Hierarchical Heavy Hitters

Ari Biswas    Graham Cormode    Yaron Kanza    Divesh Srivastava    Zhengyi Zhou
The task of finding Hierarchical Heavy Hitters (HHH) was introduced by Cormode et al. [12] as a generalisation of the heavy hitter problem. While finding HHH in data streams has been studied extensively, the question of releasing HHH when the underlying data is private remains unexplored. In this paper, we formalise and study the notion of differentially private HHH, in both the streaming and non-streaming setting. In the non-streaming setting, we show the surprising result that the relative error in estimating the count for any prefix is independent of the height of the hierarchy and the number of heavy hitters in the stream. Additionally, our algorithms also improve the error guarantees of Ghazi et al. [24] for the problem of counting over trees. Meanwhile, in the streaming setting, the main issue is that although the exact version of HHH has low global sensitivity (as counting queries are 1-sensitive), the approximation functions due to streaming have high global sensitivity, linear in the available space. Despite this obstacle, we show that the absolute error for estimating frequencies in the streaming setting is independent of the available space.
The poster appeared at TPDP 2024

Interactive Proofs For Differentially Private Counting

Ari Biswas    Graham Cormode
Differential Privacy (DP) is often presented as a strong privacy-enhancing technology with broad applicability and advocated as a de facto standard for releasing aggregate statistics on sensitive data. However, in many embodiments, DP introduces a new attack surface: a malicious entity entrusted with releasing statistics could manipulate the results and use the randomness of DP as a convenient smokescreen to mask its nefariousness. Since revealing the random noise would obviate the purpose of introducing it, the miscreant may have a perfect alibi. To close this loophole, we introduce the idea of Interactive Proofs For Differential Privacy, which requires the publishing entity to output a zero knowledge proof that convinces an efficient verifier that the output is both DP and reliable. Such a definition might seem unachievable, as a verifier must validate that DP randomness was generated faithfully without learning anything about the randomness itself. We resolve this paradox by carefully mixing private and public randomness to compute verifiable DP counting queries with theoretical guarantees and show that it is also practical for real-world deployment. We also demonstrate that computational assumptions are necessary by showing a separation between information-theoretic DP and computational DP under our definition of verifiability.
The extended abstract appeared at TPDP 2023

Seeker: Real-Time Interactive Search

Ari Biswas    Thai Pham    Houssam Nassif    Ben Snyder    Michael Vogelsong
This paper introduces Seeker, a system that allows users to interactively refine search rankings in real time, through feedback in the form of likes and dislikes. When searching online, users may not know how to accurately describe their product of choice in words. An alternative approach is to search an embedding space, allowing the user to query using a representation of the item (like a tune for a song, or a picture for an object). However, this approach requires the user to possess an example representation of their desired item. Additionally, most current search systems do not allow the user to dynamically adapt the results with further feedback. On the other hand, users often have a mental picture of the desired item and are able to answer ordinal questions of the form: Is this item similar to what you have in mind? With this assumption, our algorithm allows for users to provide sequential feedback on search results to adapt the search feed. We show that our proposed approach works well both qualitatively and quantitatively. Unlike most previous representation-based search systems, we can quantify the quality of our algorithm by evaluating humans-in-the-loop experiments.

Adversarial Principal Component Analysis

Ari Biswas    Daniel Pimentel    Claudia Solís-Lemus
This paper studies the following question: where should an adversary place an outlier of a given magnitude in order to maximize the error of the subspace estimated by PCA? We give the exact location of this worst possible outlier, and the exact expression of the maximum possible error. Equivalently, we determine the information-theoretic bounds on how much an outlier can tilt a subspace in its direction. This in turn provides universal (worst-case) error bounds for PCA under arbitrary noisy settings. Our results also have several implications on adaptive PCA, online PCA, and rank-one updates. We illustrate our results with a subspace tracking experiment.

Hands-on Introduction To Computer Science At The Freshman Level

This paper details the creation of a hands-on introduction course that reflects the dramatic growth and diversity in computer science. Our aim was to enable students to get an end-to-end perspective on computer system design by building one. We report on a two-year exercise in using the Arduino platform to build a series of hands-on projects. We have used these projects in two course instances, and have obtained detailed student feedback, which we analyze and present in this paper. The instructions, code and videos developed are available open-source.