HOME

TheInfoList



OR:

Pyramid vector quantization (PVQ) is a method used in audio and video
codec A codec is a computer hardware or software component that encodes or decodes a data stream or signal. ''Codec'' is a portmanteau of coder/decoder. In electronic communications, an endec is a device that acts as both an encoder and a decoder o ...
s to quantize and transmit
unit vector In mathematics, a unit vector in a normed vector space is a Vector (mathematics and physics), vector (often a vector (geometry), spatial vector) of Norm (mathematics), length 1. A unit vector is often denoted by a lowercase letter with a circumfle ...
s, i.e. vectors whose magnitudes are known to the decoder but whose directions are unknown. PVQ may also be used as part of a gain/shape quantization scheme, whereby the magnitude and direction of a vector are quantized separately from each other. PVQ was initially described in 1986 in the paper "A Pyramid Vector Quantizer" by Thomas R. Fischer. One caveat of PVQ is that it operates under the taxicab distance (L1-norm). Conversion to/from the more familiar
Euclidean distance In mathematics, the Euclidean distance between two points in Euclidean space is the length of the line segment between them. It can be calculated from the Cartesian coordinates of the points using the Pythagorean theorem, and therefore is o ...
(L2-norm) is possible via
vector projection The vector projection (also known as the vector component or vector resolution) of a vector on (or onto) a nonzero vector is the orthogonal projection of onto a straight line parallel to . The projection of onto is often written as \oper ...
, though results in a less uniform distribution of quantization points (the poles of the Euclidean ''n''-sphere become denser than non-poles). No efficient algorithm for the ideal (i.e., uniform) vector quantization of the Euclidean ''n''-sphere is known as of 2010. This non-uniformity can be reduced by applying deformation like coordinate-wise power before projection, reducing mean-squared quantization error by ~10%. PVQ is used in the
CELT The Celts ( , see Names of the Celts#Pronunciation, pronunciation for different usages) or Celtic peoples ( ) were a collection of Indo-European languages, Indo-European peoples. "The Celts, an ancient Indo-European people, reached the apoge ...
audio codec (inherited into
Opus Opus (: opera Opera is a form of History of theatre#European theatre, Western theatre in which music is a fundamental component and dramatic roles are taken by Singing, singers. Such a "work" (the literal translation of the Italian word "opera ...
) and the
Daala Daala is a video coding format under development by the Xiph.Org Foundation under the lead of Timothy B. Terriberry mainly sponsored by the Mozilla Corporation. Like Theora and Opus, Daala is available free of any royalties and its reference ...
video codec.


Overview

As a form of
vector quantization Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. Developed in the early 1980s by Robert M. Gray, it was ori ...
, PVQ defines a codebook of quantization points, each of which is assigned an
integer An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
codeword from 0 to −1. The goal of the encoder is to find the codeword of the closest vector, which the decoder must decode back into a vector. The PVQ codebook consists of all -dimensional points \vec with integer-only coordinates whose
absolute value In mathematics, the absolute value or modulus of a real number x, is the non-negative value without regard to its sign. Namely, , x, =x if x is a positive number, and , x, =-x if x is negative (in which case negating x makes -x positive), ...
s sum to a constant (i.e. whose
L1-norm In mathematics, the spaces are function spaces defined using a natural generalization of the Norm (mathematics)#p-norm, -norm for finite-dimensional vector spaces. They are sometimes called Lebesgue spaces, named after Henri Lebesgue , although a ...
equals ). In
set-builder notation In mathematics and more specifically in set theory, set-builder notation is a notation for specifying a set by a property that characterizes its members. Specifying sets by member properties is allowed by the axiom schema of specification. Th ...
: : S(N,K)=\left\ where \left\, \vec \right\, _1 denotes the L1-norm of \vec. As it stands, the set tesselates the surface of an -dimensional pyramid. If desired, we may reshape it into a sphere by "projecting" the points onto the sphere, i.e. by normalizing them: : S_\text(N,K)=\left\ where \left\, \vec \right\, _2 denotes the
L2-norm In mathematics, a norm is a function from a real or complex vector space to the non-negative real numbers that behaves in certain ways like the distance from the origin: it commutes with scaling, obeys a form of the triangle inequality, and ze ...
of \vec. Increasing the parameter results in more quantization points, and hence typically yields a more "accurate" approximation of the original
unit vector In mathematics, a unit vector in a normed vector space is a Vector (mathematics and physics), vector (often a vector (geometry), spatial vector) of Norm (mathematics), length 1. A unit vector is often denoted by a lowercase letter with a circumfle ...
\vec at the cost of larger integer codewords that require more bits to transmit.


Example

Suppose we wish to quantize three-dimensional unit vectors using the parameter =2. Our codebook becomes: (0.707 = \sqrt/2 rounded to 3 decimal places.) Now, suppose we wish to transmit the unit vector <0.592, −0.720, 0.362> (rounded here to 3 decimal places, for clarity). According to our codebook, the closest point we can pick is codeword 13 (<0.707, −0.707, 0.000>), located approximately 0.381 units away from our original point. Increasing the parameter results in a larger codebook, which typically increases the reconstruction accuracy. For example, based on the Python code below, =5 (codebook size: 102) yields an error of only 0.097 units, and =20 (codebook size: 1602) yields an error of only 0.042 units.


Python code

import itertools import math from typing import List, NamedTuple, Tuple class PVQEntry(NamedTuple): codeword: int point: Tuple nt, ... normalizedPoint: Tuple loat, ... def create_pvq_codebook(n: int, k: int) -> List VQEntry """ Naive algorithm to generate an n-dimensional PVQ codebook with k pulses. Runtime complexity: O(k**n) """ ret = [] for p in itertools.product(range(-k, k + 1), repeat=n): if sum(abs(x) for x in p)

k: norm = math.sqrt(sum(x ** 2 for x in p)) q = tuple(x / norm for x in p) ret.append(PVQEntry(len(ret), p, q)) return ret def search_pvq_codebook( codebook: List VQEntry p: Tuple loat, ...) -> Tuple VQEntry, float """ Naive algorithm to search the PVQ codebook. Returns the point in the codebook that's "closest" to p, according to the Euclidean distance.) """ ret = None min_dist = None for entry in codebook: q = entry.normalizedPoint dist = math.sqrt(sum((q - p ** 2 for j in range(len(p)))) if min_dist is None or dist < min_dist: ret = entry min_dist = dist return ret, min_dist def example(p: Tuple loat, ... k: int) -> None: n = len(p) codebook = create_pvq_codebook(n, k) print("Number of codebook entries: " + str(len(codebook))) entry, dist = search_pvq_codebook(codebook, p) print("Best entry: " + str(entry)) print("Distance: " + str(dist)) phi = 1.2 theta = 5.4 x = math.sin(phi) * math.cos(theta) y = math.sin(phi) * math.sin(theta) z = math.cos(phi) p = (x, y, z) example(p, 2) example(p, 5) example(p, 20)


Complexity

The PVQ codebook can be searched in O(KN). Encoding and decoding can likewise be performed in O(KN) using O(K+N) memory. The codebook size obeys the recurrence : V(N,K) = V(N-1,K) + V(N,K-1) + V(N-1,K-1) with V(N,0) = 1 for all N \ge 0 and V(0,K) = 0 for all K \ne 0. A closed-form solution is given by : V(N,K) = 2N \cdot _2F_1 (1-K,1-N;2;2). where _2F_1 is the
hypergeometric function In mathematics, the Gaussian or ordinary hypergeometric function 2''F''1(''a'',''b'';''c'';''z'') is a special function represented by the hypergeometric series, that includes many other special functions as specific or limiting cases. It is ...
.


See also

*
Vector quantization Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. Developed in the early 1980s by Robert M. Gray, it was ori ...
*
ACELP Algebraic code-excited linear prediction (ACELP) is a speech coding algorithm in which a limited set of pulses is distributed as excitation to a linear prediction filter. It is a linear predictive coding (LPC) algorithm that is based on the cod ...
*
Opus (audio format) Opus is a Lossy audio compression, lossy audio coding format developed by the Xiph.Org Foundation and standardized by the Internet Engineering Task Force, designed to efficiently speech coding, code speech and general audio in a single format, w ...


References

{{Reflist Lossy compression algorithms