Quantized sub-Gaussian random matrices are still RIP!

I have always been intrigued by the fact that, in Compressed Sensing (CS), beyond Gaussian random matrices, a couple of other unstructured random matrices respecting, with high probability (whp), the Restricted Isometry Property (RIP) look like “quantized” version of the Gaussian case, i.e., their discrete entries have a probability density function (pdf) that seems induced by a discretized version of the Gaussian pdf.

For instance, two random constructions that are known to allow CS of sparse signals by respecting the RIP, namely the Bernoulli random matrix, with entries identically and independently distributed (iid) as a rv taking \pm 1 with equal probability, and the ternary matrix with iid random entries selected within \{0, \pm 1\} with probability 2/3 and 1/6 [3], respectively, look like a certain “quantization” of a Gaussian pdf.

This short post aims simply to show that this fact can be easily understood thanks to known results showing that sub-Gaussian random matrices respect the RIP [3]. Certain of the relations described below are probably very well known in the statistical literature but, as we are always re-inventing the wheel (at least me ;-)), I found anyway interesting to share this through this blog post.

Let’s first recall what a sub-Gaussian random variable (rv) is. For this, I’m following [1]. A random variable X is sub-Gaussian if its sub-Gaussian norm

\|X\|_{\psi_2} := \sup_{p\geq 1} p^{-1/2}\,(\mathbb E |X|^p)^{1/p}

is finite. In particular, any rv with such a finite norm has a tail bound that decays as fast as the one of a Gaussian rv, i.e., for some c>0 independent of X, if \|X\|_{\psi_2} \leq L, then

\mathbb P[|X| \geq t] \lesssim \exp(- c t^2/L^2).

The set of sub-Gaussian random variables includes for instance the Gaussian, the Bernoulli and the bounded rv’s, as \|X\|_{\psi_2} \leq {\rm inf}\{t > 0: \mathbb P(|X|<t) = 1\}.

In CS theory, it is now well known that if a random matrix \Phi \in \mathbb R^{M \times N} is generated randomly with iid entries drawn as a sub-Gaussian rv with zero mean and unit variance, then, with high probability and provided M \gtrsim K \log N/K,the matrix \Phi respects the RIP of order K and constant \delta \in (0,1), i.e.,

\forall u \in \mathbb R^N: \|u\|_0 := |{\rm supp}\,u| \leq K,\ (1-\delta)\|u\| \leq \|\Phi u\| \leq (1+\delta) \|u\|.

In fact, if \Phi has the RIP(2K,\delta), any K-sparse signal x (with \|x\|_0 \leq K) can be stably estimated (e.g., using Basis Pursuit or greedy algorithms) from y = \Phi x + n, possibly with some moderate noise n [2].

The simple point I’d like to show here is that a large class of RIP matrices can be generated by quantizing (elementwise) Gaussian and sub-Gaussian random matrices. I’m going to show this by proving that (i) quantizing a Gaussian rv leads to a sub-Gaussian rv, (ii) its sub-Gaussian norm can be easily upper bounded.

But what do I mean by “quantizing”?

This operation is actually defined here through a partition  \mathcal P = \{\mathcal P_i: i \in \mathbb Z\} of \mathbb R, i.e., \mathcal P_i \cap \mathcal P_j = \emptyset if i \neq j and \cup_i \mathcal P_i = \mathbb R.

Given a rv X, this partition determines a set of countable levels (or codebook) \mathcal C =\{c_i: i\in \mathbb Z\} such that

c_i(X) := \mathbb E[X|\mathcal P_i] = \mathbb E[X|X\in \mathcal P_i].

The quantized version Z = \mathcal Q(X) of X is then defined as

Z = \alpha^{-1/2} c_i\quad\Leftrightarrow\quad X \in \mathcal P_i,

with \alpha := \sum_i c_i^2 p_i and p_i = \mathbb P(Z = c_i) = \mathbb P(X \in \mathcal P_i) = \mathbb E[1|\mathcal P_i]. Note that we follow above a definition of levels that are known to lead (with an additional optimization of the partition \mathcal P) to an optimal quantizer of the rv X, as induced by the Lloyd-Max condition for minimizing the distortion \mathbb E|X - \mathcal Q[X]|^2 of a rv X [5].

We can thus deduce that, thanks to the definition of the levels c_i,

\mathbb E Z = \alpha^{-1/2} \sum_i c_i p_i = \alpha^{-1/2} \mathbb E X

and \mathbb E Z^2 = \alpha^{-1} \sum_i c^2_i p_i = 1, this last equality actually justifying the specific normalization in \alpha^{-1/2} of the levels.Therefore, \mathbb E X = 0 induces that \mathbb E Z = 0.

Moreover, for any p \geq 1, using  Jensen inequality and the definition of the c_i,

\mathbb E |Z|^p = \alpha^{-p/2}\,\sum_i |c_i|^p p_i \leq \alpha^{-p/2}\,\sum_i \mathbb E[|X|^p|\mathcal P_i]\, p_i \leq \alpha^{-p/2}\,\mathbb E |X|^p.

Therefore, by definition of the sub-Gaussian norm above, we find

\|Z\|_{\psi_2} = \|\mathcal Q(X)\|_{\psi_2} \leq \|X\|_{\psi_2}/\sqrt{\alpha},

which shows that Z = \mathcal Q(X) is also sub-Gaussian!

For instance, for a Gaussian rv X, taking \mathcal{P} = \{(-\infty, 0], (0, +\infty)\}, leads to \mathcal C = \{\pm 1\} and p_0 = p_1 = 1/2, since \mathbb E|X| = \sqrt{2/\pi} = 2 |c_i| and \alpha = 1/2\pi. We recover then, by quantization, a Bernoulli rv!

In consequence, a matrix \tfrac{1}{\sqrt M}\mathcal Q(\Phi) \in \mathbb R^{M \times N} obtained by quantizing the entries of a random Gaussian matrix \Phi will satisfy, whp, the RIP of order K (as for \tfrac{1}{\sqrt M} \Phi)  provided M \gtrsim K \log N/K, where the hidden constant depends only on \alpha (and implicitly on the quantizing partition \mathcal P).

From what is described above, it is clear that the entries of \mathcal Q(\Phi) are iid as \mathcal Q(X) with X \sim \mathcal N(0,1). Then, as explained in [1, 4, 3], it remains to show that  the rows of \mathcal Q(\Phi) are isotropic, i.e., that for any row r of \mathcal Q(\Phi), \mathbb E(r^T x)^2 = \|x\|^2 for any vector x \in \mathbb R^N. This trivially holds as r is composed of N entries iid as Z that has unit variance.

Note that nowhere above we have used the Gaussianity of X. Therefore the whole development still holds if we quantify a random matrix whose entries are generated by a sub-Gaussian distribution, possibility already discrete or quantized. In other words, the class of sub-Gaussian random matrices (with iid entries) that are RIP with high probability is somehow closed under entrywise quantization.

Open question:

  • What will happen if we quantize a structured random matrix, such as a random Fourier ensemble [2], a spreadspectrum sensing matrix [7] or a random convolution [6]? Or more simply random matrices were only the rows (or the columns) are guaranteed to be independent [1]? Do we recover (almost) known random matrix construction such as random partial Hadamard ensembles ?

References:

[1] R. Vershynin, “Introduction to the non-asymptotic analysis of random matrices”, http://arxiv.org/abs/1011.3027

[2] S. Foucart, H. Rauhut. A mathematical introduction to compressive sensing. Vol. 1. No. 3. Basel: Birkhäuser, 2013.

[3] R. Baraniuk, M. Davenport, R. DeVore, M. Wakin, “A simple proof of the restricted isometry property for random matrices”. Constructive Approximation, 28(3), 253-263, 2008.

[4] S. Mendelson, A. Pajor, N. Tomczak-Jaegermann, “Uniform uncertainty principle for bernoulli and subgaussian ensembles. Constructive Approximation, 28(3):277-289, 2008.

[5] R. M. Gray, D. L. Neuhoff, “Quantization”, IEEE Transactions on Information Theory, 44(6), 2325-2383, 1998.

[6] J. Romberg, “Compressive sensing by random convolution”. SIAM Journal on Imaging Sciences, 2(4), 1098-1128, 2009.
[7] G. Puy, P. Vandergheynst, R. Gribonval, Y. Wiaux, “Universal and efficient compressed sensing by spread spectrum and application to realistic Fourier imaging techniques”, EURASIP Journal on Advances in Signal Processing, 2012(1), 1-13.
Advertisements
This entry was posted in General. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s