Ari | Last updated: Mon 19 Feb 2024 15:19:42 GMT


Hybrid Argument

Computational Indistinguishability

Two ensembles \{ X_n\}_{n \in \N} and \{ Y_n\}_{n \in \N} are computationally indistinguishable if for every probabilistic polynomial-time distinguisher D and every polynomial p(\cdot), for all sufficiently large n we have

\begin{equation} \Bigg| \Pr{\Big[D(X_n, 1^n ) = 1 \Big]} - \Pr{\Big[D(Y_n, 1^n ) = 1 \Big]}\Bigg| \leq \frac{1}{p(n)} \end{equation}

Alternatively, consider the following mental model when thinking about distinguishing between two ensembles. For \alpha \in \{ 0, 1\}^\star, let d(\alpha) denote the probability that D outputs 1 on input \alpha. Then define d_X(n) = \mathbb{E}_{x \sim X_n}[d(x)] and d_Y(n) = \mathbb{E}_{y \sim Y_n}[d(y)]. Then X and Y are computationally indistinguishable if |d_X(n) - d_Y(n)| is negligible for every polynomial-time D.
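As a toy sketch of this mental model, the snippet below estimates d_X and d_Y by Monte Carlo for two made-up ensembles and a trivial distinguisher. The distributions and the distinguisher here are purely illustrative assumptions, not part of the definition above.

```python
import random

# Hypothetical ensembles: X_n is uniform over n-bit strings, Y_n is an
# n-bit string with the last bit fixed to 0. The distinguisher simply
# outputs the last bit, so d_X ~ 1/2 while d_Y = 0 -- a noticeable gap,
# i.e. these two ensembles are distinguishable.

def sample_X(n):
    return [random.randint(0, 1) for _ in range(n)]

def sample_Y(n):
    return [random.randint(0, 1) for _ in range(n - 1)] + [0]

def D(alpha):
    return alpha[-1]  # d(alpha) = Pr[D(alpha) = 1]

def estimate_d(sampler, n, trials=10_000):
    """Monte Carlo estimate of E[d(x)] over x drawn from the sampler."""
    return sum(D(sampler(n)) for _ in range(trials)) / trials

n = 16
gap = abs(estimate_d(sample_X, n) - estimate_d(sample_Y, n))  # roughly 1/2
```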

Statistical Indistinguishability

Two ensembles \{ X_n\}_{n \in \N} and \{ Y_n\}_{n \in \N} are statistically indistinguishable if their statistical distance is negligible, i.e. for every polynomial p(\cdot) there exists n_0 \in \N such that for all n \geq n_0

\begin{equation} \frac{1}{2}\sum_{\alpha }\Big| \Pr{[X_n=\alpha]} - \Pr{[Y_n=\alpha]}\Big| \leq \frac{1}{p(n)} \end{equation}
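The statistical-distance formula above is easy to evaluate exactly on small finite distributions. Below is a minimal sketch using exact rational arithmetic; the two distributions are made up for illustration.

```python
from fractions import Fraction

def statistical_distance(P, Q):
    """Total variation distance: (1/2) * sum over alpha of |P(alpha) - Q(alpha)|.

    P and Q are dicts mapping outcomes to probabilities; outcomes
    missing from a dict have probability 0.
    """
    support = set(P) | set(Q)
    return sum(abs(P.get(a, Fraction(0)) - Q.get(a, Fraction(0)))
               for a in support) / 2

# Two toy distributions over 2-bit strings
P = {"00": Fraction(1, 4), "01": Fraction(1, 4),
     "10": Fraction(1, 4), "11": Fraction(1, 4)}
Q = {"00": Fraction(1, 2), "01": Fraction(1, 2)}

statistical_distance(P, Q)  # Fraction(1, 2)
```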


Indistinguishability By Repeated Sampling

Two ensembles \{ X_n\}_{n \in \N} and \{ Y_n\}_{n \in \N} are indistinguishable by polynomial-time sampling if for every probabilistic polynomial-time distinguisher D and all polynomials m(\cdot) and p(\cdot), for all sufficiently large n,

\begin{equation} \Bigg| \Pr{\Big[D(X_n^{(1)}, \dots, X_n^{(m)}, 1^n ) = 1 \Big]} - \Pr{\Big[D(Y_n^{(1)}, \dots, Y_n^{(m)}, 1^n ) = 1 \Big]}\Bigg| \leq \frac{1}{p(n)} \end{equation}

where X_n^{(1)}, \dots, X_n^{(m)} are i.i.d. samples from X_n, Y_n^{(1)}, \dots, Y_n^{(m)} are i.i.d. samples from Y_n, and m=m(n).

Efficiently Constructible Ensembles

An ensemble \{ X_n\}_{n \in \N} is efficiently constructible if there exists a probabilistic polynomial-time algorithm S such that the output distribution of S(1^n) is identical to X_n for every n.
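A minimal example of an efficiently constructible ensemble is the uniform ensemble U_n: the sampler below runs in time polynomial in n and its output is distributed exactly as U_n.

```python
import secrets

# S(1^n): output a uniform n-bit string, here represented as a list of
# bits. This is an efficient sampler whose output distribution equals
# the uniform ensemble U_n exactly.

def S(n):
    return [secrets.randbits(1) for _ in range(n)]

sample = S(32)  # a fresh uniform 32-bit sample
```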

Examples Of Proofs Using The Hybrid Technique

If two ensembles are computationally indistinguishable, then they are also indistinguishable by repeated sampling. That is, for two efficiently constructible ensembles X=\{ X_n\}_{n \in \N} and Y=\{ Y_n\}_{n \in \N}, if X and Y are computationally indistinguishable, then X and Y are also indistinguishable by polynomial-time sampling.

Proof:

Prove the contrapositive. Assume that there is a probabilistic polynomial-time distinguisher D such that X and Y are not indistinguishable by repeated sampling, i.e. for some polynomials m(\cdot) and p(\cdot), D distinguishes m=m(n) samples of X_n from m samples of Y_n with advantage greater than \frac{1}{p(n)}. Define the hybrid random variable H_n^{(k)} = \Big(X_n^{(1)}, \dots, X_n^{(k)}, Y_n^{(k+1)}, \dots, Y_n^{(m)}\Big), where the X_n^{(\cdot)} are i.i.d. samples from X_n and the Y_n^{(\cdot)} are i.i.d. samples from Y_n; note that H_n^{(m)} consists only of X-samples and H_n^{(0)} only of Y-samples. We can use D as a subroutine in another PPT algorithm D^\prime and show that X and Y are not computationally indistinguishable. Define D^\prime as follows:

  • Sample k \xleftarrow{R} \{0, 1, \dots, m-1\}
  • Recall that X_n and Y_n are efficiently constructible, so D^\prime can sample i.i.d. from both. Denote k i.i.d. samples from X_n by x_1, \dots, x_k, and let y_{k+2}, \dots, y_m denote m-k-1 i.i.d. samples from Y_n.
  • Output D^\prime(\alpha) = D(x_1, \dots, x_k, \alpha, y_{k+2}, \dots, y_m), where the input \alpha to D^\prime is drawn from either X_n or Y_n
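The steps above can be sketched as follows. Here sample_X, sample_Y, and D are placeholders for the efficient samplers and for the assumed m-sample distinguisher; they are passed in as parameters rather than being any concrete implementation.

```python
import random

def D_prime(alpha, n, m, D, sample_X, sample_Y):
    """Sketch of the reduction: embed alpha at random position k+1 of a
    length-m hybrid vector and hand the whole vector to D."""
    k = random.randrange(m)                       # k <-R- {0, ..., m-1}
    xs = [sample_X(n) for _ in range(k)]          # x_1, ..., x_k
    ys = [sample_Y(n) for _ in range(m - k - 1)]  # y_{k+2}, ..., y_m
    return D(xs + [alpha] + ys)                   # D sees exactly m samples
```

If alpha was drawn from X_n, the vector handed to D is distributed as H_n^{(k+1)}; if alpha was drawn from Y_n, it is distributed as H_n^{(k)}.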

By the law of total probability (conditioning on the uniformly random choice of k) we have

\begin{align} \Pr{[D^\prime(X_n) = 1]} &= \frac{1}{m}\sum_{k}\Pr{\Big[ D(H_n^{(k+1)}) = 1\Big]} \end{align}

\begin{align} \Pr{[D^\prime(Y_n) = 1]} &= \frac{1}{m}\sum_{k}\Pr{\Big[ D(H_n^{(k)}) = 1\Big]} \end{align}

Therefore we get

\begin{align} |\Pr{[D^\prime(X_n) = 1]} - \Pr{[D^\prime(Y_n) = 1]}| &= \frac{1}{m}\Bigg| \sum_{k}\Big(\Pr{\Big[ D(H_n^{(k+1)}) = 1\Big]} - \Pr{\Big[ D(H_n^{(k)}) = 1\Big]}\Big)\Bigg| \\ &= \frac{1}{m}\Bigg|\Pr{\Big[ D(H_n^{(m)}) = 1\Big]} - \Pr{\Big[ D(H_n^{(0)}) = 1\Big]}\Bigg| \\ &\geq \frac{1}{m p(n)} \end{align}

The last line follows from our assumption that X and Y are not indistinguishable by repeated sampling: H_n^{(m)} is exactly m i.i.d. samples from X_n and H_n^{(0)} is exactly m i.i.d. samples from Y_n, so D separates them with advantage at least \frac{1}{p(n)}. Since m(n) is polynomial, \frac{1}{m(n)p(n)} is non-negligible, and D^\prime contradicts the computational indistinguishability of X and Y.
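The telescoping step in the derivation can be checked mechanically: with arbitrary made-up acceptance probabilities p_k for the hybrids, averaging the gaps between adjacent hybrids collapses to the gap between the two extremes divided by the number of steps.

```python
from fractions import Fraction

m = 8
# p[k] stands in for Pr[D(H_n^{(k)}) = 1]; the values are arbitrary.
p = [Fraction(k * k, 100) for k in range(m + 1)]

lhs = sum(p[k + 1] - p[k] for k in range(m)) / m  # (1/m) * sum of adjacent gaps
rhs = (p[m] - p[0]) / m                           # (1/m) * gap between extremes

lhs == rhs  # True: every intermediate p[k] cancels
```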

The general idea of the hybrid argument, which we will see more of, is the ability to reduce a statement about a complex ensemble to a statement about a simple ensemble. In the proof above we actually did the opposite: we assumed the complex ensemble was distinguishable and showed that, if that is the case, a much simpler ensemble is also distinguishable. To use hybrids, it is essential that the two extreme hybrid variables are the complex ensembles we want to say something about, and that adjacent pairs of hybrids are easy to analyse. The number of hybrids must also be polynomial in n: the distinguishing advantage is divided by the number of hybrids, so the more hybrids there are, the weaker the resulting bound.