Last January, I was honored to be invited in RWTH Aachen University by Holger Rauhut and Sjoerd Dirksen to give a talk on the general topic of quantized compressed sensing. In particular, I decided to focus my presentation on the **quasi-isometric** embeddings arising in 1-bit compressed sensing, as developed by a few researchers in this field (e.g., Petros Boufounos, Richard Baraniuk, Yaniv Plan, Roman Vershynin and myself).

In comparison with isometric (bi-Lipschitz) embeddings suffering only from a multiplicative distortion, as for the restricted isometry property (RIP) for sparse vectors sets or Johnson-Lindenstrauss Lemma for finite sets, a quasi-isometric embedding between two metric spaces and is characterized by both a *multiplicative* **and** an *additive* distortion, i.e., for some values , respects

for all .

In the case of 1-bit Compressed Sensing, one observes a quasi-isometric relation between the *angular distance* of two vectors (or signals) in and the Hamming distance of their 1-bit quantized projections in . This mapping is simply obtained by keeping the sign (componentwise) of their multiplication by a random Gaussian matrix. Mathematically,

with and . Notice that for a general sensing matrix (e.g., possibly non-Gaussian), if reachable, can only induce a quasi-isometric relation. This is due to the information loss induced by the “sign” quantization in, e.g., the loss of the signal amplitude. Moreover, it is known that for a Bernoulli sensing matrix , two vectors can be arbitrarily close and share the same quantization point [2,5].

Interestingly, it has been shown in a few works that quasi-isometric 1-bit embeddings exist for -sparse signals [1] or for any bounded subset of provided the typical dimension of this set is bounded [2]. This dimension is nothing but the Gaussian mean width of the set [2] defined by

Since the 80’s, this dimension has been recognized as central for instance in characterizing random processes [9], shrinkage estimators in signal denoising and high-dimensional statistics [10], linear inverse problem solving with convex optimization [11] or classification efficiency in randomly projected signal sets [12]. Moreover, for the set of bounded *K*-sparse signals, we have , which is the quantity of interest for characterizing the RIP of order of random Gaussian matrices (with other interesting characterizations for the *compressible* signals set or for “signals” consisting of rank-*r* matrices).

In [2], by connecting the problem to the context of random tessellation of the sphere , the authors have shown that if is larger than

then, for all ,

For -sparse vectors, this condition is even reduced to as shown in [1].

As explained in my previous post on quantized embedding and the funny connection with Buffon’s needle problem, I have recently noticed that for finite sets , quasi-isometric relations also exist with high probability between the Euclidean distance (or -distance) of vectors and the -distance of their (dithered) quantized random projections. Generalizing above, this new mapping reads

for a (“round-off”) scalar quantizer with resolution applied componentwize on vectors, a random Gaussian (as above) and a *dithering *, with uniformizing the action of the quantizer (as used in [6] for more general quantizers).

In particular, provided , with high probability, for all ,

for some general constants .

One observes that the additive distortion of this last relation reads , i.e., it can be made arbitrarily low with ! However, directly integrating the quantization in the RIP satisfied by would have rather led to a constant additive distortion, only function of .

*Remark*: *For information, as provided by an anonymous expert reviewer, it happens that a much shorter and elegant proof of this fact exists using the sub-Gaussian properties of the quantized mapping (see Appendix A in the associated revised paper).*

**In that work, the question remained, however, on how to extend (1) for vectors taken in any (bounded) subset of , hence generalizing the quasi-isometric embeddings observed for 1-bit random projections? Moreover, it was still unclear if the sensing matrix could be non-Gaussian, i.e., if any sub-Gaussian matrix (e.g., Bernoulli ) could define a suitable
**

While I was in Aachen discussing with Holger and Sjoerd during and after my presentation, I realized that the works [2-5] of Plan, Verhsynin, Ai and collaborators provided already many important tools whose adaptation to the mapping above could answer those questions.

At the heart of [2] lies the equivalence between and a counting procedure associated to

where is the row of , is the indicator set function equals to 1 if is non-empty and 0 otherwise, and

.

In words, counts the number of hyperplanes defined by the normals that separate from .

Without giving too many details in this post, the work [2] leverages this equivalence to make a connection with tessellation of the -sphere where it is shown that the number of such separating random hyperplanes is somehow close (up to some distortions) to the angular distance between the two vecors. They obtain this by “softening” the Hamming distance above, i.e., by introducing, for some , the soft Hamming distance

with now

.

This distance has an interesting continuity property compared to . In particular, if one allows to vary, it is continuous up to small () perturbations of and .

In our context, considering the quantized mapping specified above, we can define the generalized *soften* distance :

When we actually recover the distance used in the quasi-isometry (1), i.e.,

similarly to the recovering of the Hamming distance in 1-bit case.

The interpretation of when is now that we again count the number of hyperplanes defined by the normals that separate from , but now, for each measurement index this count can be bigger than 1 as we allow multiple parallel hyperplanes for each normal , all far apart. This is in connection with the “hyperplane wave partitions” defined in [7,8], e.g., for explaining quantization of vector frame coefficients. As explained in the paper, when , then relaxes (for ) or strengthens (for ) the conditions allowing to count a separating hyperplane.

As for 1-bit projections, we can show the continuity of with respect to small ( now) perturbation of and , and this fact allows to study the random (sub-Gaussian) concentration of , i.e., studying it for a coverage of the set of interest, and then extending it to the whole set from this continuity.

From this observation, and after a few months of works, I have gathered all these developments in the following paper, now submitted on arxiv:

Briefly, benefiting of the context defined above, this work contains** two main results**.

**First**, it shows that given a symmetric sub-Gaussian distribution and a precision , if

and if the sensing matrix has entries i.i.d. as , then, *with high probability*, the mapping above provides a -quasi-isometry between vector pair of whose difference are “*not too sparse*” (as already noticed in [5] for 1-bit CS) and their images in . More precisely, for some , if for ,

then, given some constant depending on the sub-Gaussian distribution (with if is Gaussian),

Interestingly, in this quasi-isometric embedding, we notice that the additive distortion is driven by as in (1) while the multiplicative distortion reads now

In addition to its common dependence in the precision , this distortion is also function of the “antisparse” nature of . Indeed, if then it cannot be sparser than with anyway . In other words, when the matrix is non-Gaussian (but sub-Gaussian anyway), the distortion is smaller for “anti-sparse” vector differences.

In the particular case where is the set of **bounded -sparse vectors** in any orthonormal basis, then only

measurements suffice for defining the same embedding with high probability. As explained in the paper, this case allows somehow to mitigate the anti-sparse requirement as sparse vectors in some basis can be made less sparse in another, e.g., for dual bases such as DCT/Canonical, Noiselet/Haar, etc.

**The second result** concerns the **consistency width** of the mapping , i.e., the biggest distance separating distinct vectors that are projected by on the same quantization point. With high probability, it happens that the consistency width of any pair of vectors whose difference is again“not too sparse” decays as

for a general subset and, up to some log factors, as for sparse vector sets.

**Open problem**: I’m now wondering if the tools and results above could be extended to other quantizers , such as the *universal* quantizer defined in [6]. This periodic quantizer has been shown to provide exponentially decaying distortions [6,13] for embedding of sparse vectors, with interesting applications (e.g., information retrieval). Knowing if this holds for other vector sets and for other (sub-Gaussian) sensing matrix is an appealing open question.

**References:**

[1] L. Jacques, J. N. Laska, P. T. Boufounos, and R. G Baraniuk. “Robust 1-bit compressive sensing via binary

stable embeddings of sparse vectors”. IEEE Transactions on Information Theory, 59(4):2082–2102, 2013.

[2] Y. Plan and R. Vershynin. “Dimension reduction by random hyperplane tessellations”. Discrete & Computational Geometry, 51(2):438–461, 2014, Springer.

[3] Y. Plan and R. Vershynin. “Robust 1-bit compressed sensing and sparse logistic regression: A convex programming approach”. IEEE Transactions on Information Theory, 59(1): 482–494, 2013.

[4] Y. Plan and R. Vershynin. “One-bit compressed sensing by linear programming”. Communications on Pure and Applied Mathematics, 66(8):1275–1297, 2013.

[5] A. Ai, A. Lapanowski, Y. Plan, and R. Vershynin. “One-bit compressed sensing with non-gaussian measurements”. Linear Algebra and its Applications, 441:222–239, 2014.

[6] P. T. Boufounos. Universal rate-efficient scalar quantization. IEEE Trans. Info. Theory, 58(3):1861–1872, March 2012.

[7] V. K Goyal, M. Vetterli, and N. T. Thao. Quantized overcomplete expansions in : Analysis, synthesis, and algorithms. IEEE Trans. Info. Theory, 44(1):16–31, 1998.

[8] N. T . Thao and M. Vetterli. “Lower bound on the mean-squared error in oversampled quantization of periodic signals using vector quantization analysis”. IEEE Trans. Info. Theory, 42(2):469–479, March 1996.

[9] A. W. Vaart and J. A. Wellner. Weak convergence and empirical processes. Springer, 1996.

[10] V. Chandrasekaran and M. I. Jordan. Computational and statistical tradeoffs via convex relaxation. arXiv preprint arXiv:1211.1073, 2012.

[11] V. Chandrasekaran, B. Recht, P. A Parrilo, and A. S. Willsky. The convex geometry of linear inverse problems. Foundations of Computational mathematics, 12(6):805–849, 2012.

[12] A. Bandeira, D. G Mixon, and B. Recht. Compressive classification and the rare eclipse problem. arXiv preprint arXiv:1404.3203, 2014.

[13] P. T. Boufounos and S. Rane. Efficient coding of signal distances using universal quantized embeddings. In Proc. Data Compression Conference (DCC), Snowbird, UT, March 20-22 2013.