Tensor reshaping

From HandWiki

In multilinear algebra, a reshaping of tensors is any bijection between the set of indices of an order-[math]\displaystyle{ d }[/math] tensor and the set of indices of an order-[math]\displaystyle{ \ell }[/math] tensor, where [math]\displaystyle{ \ell \lt d }[/math]. The use of indices presupposes tensors in coordinate representation with respect to a basis. The coordinate representation of a tensor can be regarded as a multi-dimensional array, and a bijection from one set of indices to another therefore amounts to a rearrangement of the array elements into an array of a different shape. Such a rearrangement constitutes a particular kind of linear map between the vector space of order-[math]\displaystyle{ d }[/math] tensors and the vector space of order-[math]\displaystyle{ \ell }[/math] tensors.

Definition

Given a positive integer [math]\displaystyle{ d }[/math], the notation [math]\displaystyle{ [d] }[/math] refers to the set [math]\displaystyle{ \{1, \dots, d \} }[/math] of the first d positive integers.

For each integer [math]\displaystyle{ k }[/math] where [math]\displaystyle{ 1 \le k \le d }[/math] for a positive integer [math]\displaystyle{ d }[/math], let Vk denote an nk-dimensional vector space over a field [math]\displaystyle{ F }[/math]. Then there are vector space isomorphisms (linear maps)

[math]\displaystyle{ \begin{align} V_1 \otimes \cdots \otimes V_d & \simeq F^{n_1} \otimes \cdots \otimes F^{n_d} \\ & \simeq F^{n_{\pi_1}} \otimes \cdots \otimes F^{n_{\pi_d}} \\ & \simeq F^{n_{\pi_1} n_{\pi_2}} \otimes F^{n_{\pi_3}} \otimes \cdots \otimes F^{n_{\pi_d}} \\ & \simeq F^{n_{\pi_1} n_{\pi_3}} \otimes F^{n_{\pi_2}} \otimes F^{n_{\pi_4}} \otimes \cdots \otimes F^{n_{\pi_d}} \\ & \,\,\,\vdots \\ & \simeq F^{n_1 n_2 \ldots n_d}, \end{align} }[/math]

where [math]\displaystyle{ \pi \in \mathfrak{S}_d }[/math] is any permutation and [math]\displaystyle{ \mathfrak{S}_d }[/math] is the symmetric group on [math]\displaystyle{ d }[/math] elements. Via these (and other) vector space isomorphisms, a tensor can be interpreted in several ways as an order-[math]\displaystyle{ \ell }[/math] tensor where [math]\displaystyle{ \ell \le d }[/math].

Coordinate representation

The first vector space isomorphism on the list above, [math]\displaystyle{ V_1 \otimes \cdots \otimes V_d \simeq F^{n_1} \otimes \cdots \otimes F^{n_d} }[/math], gives the coordinate representation of an abstract tensor. Assume that each of the [math]\displaystyle{ d }[/math] vector spaces [math]\displaystyle{ V_k }[/math] has a basis [math]\displaystyle{ \{ v_1^k, v_2^k, \ldots, v_{n_k}^k \} }[/math]. The expression of a tensor with respect to this basis has the form [math]\displaystyle{ \mathcal{A} = \sum_{j_1=1}^{n_1}\ldots\sum_{j_d=1}^{n_d} a_{j_1,j_2,\ldots,j_d} v_{j_1}^1 \otimes v_{j_2}^2 \otimes \cdots \otimes v_{j_d}^{d}, }[/math] where the coefficients [math]\displaystyle{ a_{j_1,j_2,\ldots,j_d} }[/math] are elements of [math]\displaystyle{ F }[/math]. The coordinate representation of [math]\displaystyle{ \mathcal{A} }[/math] is [math]\displaystyle{ \sum_{j_1=1}^{n_1}\ldots\sum_{j_d=1}^{n_d} a_{j_1,j_2,\ldots,j_d} \mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d, }[/math]where [math]\displaystyle{ \mathbf{e}_{j}^k }[/math] is the [math]\displaystyle{ j^\text{th} }[/math] standard basis vector of [math]\displaystyle{ F^{n_k} }[/math]. This can be regarded as a d-dimensional array whose elements are the coefficients [math]\displaystyle{ a_{j_1,j_2,\ldots,j_d} }[/math].

Vectorization

By means of a bijective map [math]\displaystyle{ \mu : [n_1] \times \cdots \times [n_d] \to [n_1\cdots n_d] }[/math], a vector space isomorphism between [math]\displaystyle{ F^{n_1} \otimes \cdots \otimes F^{n_d} }[/math] and [math]\displaystyle{ F^{n_1 \cdots n_d} }[/math] is constructed via the mapping [math]\displaystyle{ \mathbf{e}_{j_1}^1 \otimes \cdots \otimes \mathbf{e}_{j_d}^d \mapsto \mathbf{e}_{\mu(j_1,j_2,\ldots,j_d)}, }[/math] where for every natural number [math]\displaystyle{ j }[/math] such that [math]\displaystyle{ 1 \le j \le n_1 \cdots n_d }[/math], the vector [math]\displaystyle{ \mathbf{e}_j }[/math] denotes the jth standard basis vector of [math]\displaystyle{ F^{n_1 \cdots n_d} }[/math]. In such a reshaping, the tensor is simply interpreted as a vector in [math]\displaystyle{ F^{n_1 \cdots n_d} }[/math]. This is known as vectorization, and is analogous to vectorization of matrices. A standard choice of bijection [math]\displaystyle{ \mu }[/math] is such that

[math]\displaystyle{ \operatorname{vec}(\mathcal{A}) := \begin{bmatrix} a_{1,1,\ldots,1} & a_{2,1,\ldots,1} & \cdots & a_{n_1,1,\ldots,1} & a_{1,2,1,\ldots,1} & \cdots & a_{n_1,n_2,\ldots,n_d} \end{bmatrix}^T, }[/math]

which is consistent with the way in which the colon operator in Matlab and GNU Octave reshapes a higher-order tensor into a vector. In general, the vectorization of [math]\displaystyle{ \mathcal{A} }[/math] is the vector [math]\displaystyle{ [ a_{\mu^{-1}(j)} ]_{j=1}^{n_1 \cdots n_d} }[/math].

General flattenings

For any permutation [math]\displaystyle{ \pi \in \mathfrak{S}_d }[/math] there is a canonical isomorphism between the two tensor products of vector spaces [math]\displaystyle{ V_1 \otimes V_2 \otimes \cdots \otimes V_d }[/math] and [math]\displaystyle{ V_{\pi(1)} \otimes V_{\pi(2)} \otimes \cdots \otimes V_{\pi(d)} }[/math]. Parentheses are usually omitted from such products due to the natural isomorphism between [math]\displaystyle{ V_i\otimes(V_j\otimes V_k) }[/math] and [math]\displaystyle{ (V_i\otimes V_j)\otimes V_k }[/math], but may, of course, be reintroduced to emphasize a particular grouping of factors. In the grouping,

[math]\displaystyle{ (V_{\pi(1)} \otimes \cdots \otimes V_{\pi(r_1)})\otimes(V_{\pi(r_1+1)} \otimes \cdots \otimes V_{\pi(r_2)})\otimes\cdots\otimes(V_{\pi(r_{\ell-1}+1)} \otimes \cdots \otimes V_{\pi(r_\ell)}), }[/math]

there are [math]\displaystyle{ \ell }[/math] groups with [math]\displaystyle{ r_j-r_{j-1} }[/math] factors in the [math]\displaystyle{ j^\text{th} }[/math] group (where [math]\displaystyle{ r_0=0 }[/math] and [math]\displaystyle{ r_\ell=d }[/math]).

Letting [math]\displaystyle{ S_j=(\pi(r_{j-1}+1),\pi(r_{j-1}+2),\ldots,\pi(r_j)) }[/math] for each [math]\displaystyle{ j }[/math] satisfying [math]\displaystyle{ 1\le j\le\ell }[/math], an [math]\displaystyle{ (S_1,S_2,\ldots,S_\ell) }[/math]-flattening of a tensor [math]\displaystyle{ \mathcal{A} }[/math], denoted [math]\displaystyle{ \mathcal{A}_{(S_1,S_2,\ldots,S_\ell)} }[/math], is obtained by applying the two processes above within each of the [math]\displaystyle{ \ell }[/math] groups of factors. That is, the coordinate representation of the [math]\displaystyle{ j^\text{th} }[/math] group of factors is obtained using the isomorphism [math]\displaystyle{ (V_{\pi(r_{j-1}+1)} \otimes V_{\pi(r_{j-1}+2)} \otimes \cdots \otimes V_{\pi(r_j)})\simeq(F^{n_{\pi(r_{j-1}+1)}}\otimes F^{n_{\pi(r_{j-1}+2)}}\otimes\cdots\otimes F^{n_{\pi(r_j)}}) }[/math], which requires specifying bases for all of the vector spaces [math]\displaystyle{ V_k }[/math]. The result is then vectorized using a bijection [math]\displaystyle{ \mu_j:[n_{\pi(r_{j-1}+1)}]\times[n_{\pi(r_{j-1}+2)}]\times\cdots\times[n_{\pi(r_j)}]\to[N_{S_j}] }[/math] to obtain an element of [math]\displaystyle{ F^{N_{S_j}} }[/math], where [math]\displaystyle{ N_{S_j} := \prod_{i=r_{j-1}+1}^{r_j} n_{\pi(i)} }[/math], the product of the dimensions of the vector spaces in the [math]\displaystyle{ j^\text{th} }[/math] group of factors. The result of applying these isomorphisms within each group of factors is an element of [math]\displaystyle{ F^{N_{S_1}} \otimes \cdots \otimes F^{N_{S_\ell}} }[/math], which is a tensor of order [math]\displaystyle{ \ell }[/math].

The vectorization of [math]\displaystyle{ \mathcal{A} }[/math] is an [math]\displaystyle{ (S_1) }[/math]-reshaping, [math]\displaystyle{ \mathcal{A}_{(S_1)} }[/math] wherein [math]\displaystyle{ S_1 = (1,2,\ldots,d) }[/math].

Matricization

Let [math]\displaystyle{ \mathcal{A} \in F^{n_1} \otimes F^{n_2} \otimes \cdots \otimes F^{n_d} }[/math] be the coordinate representation of an abstract tensor with respect to a basis. A standard factor-k flattening of [math]\displaystyle{ \mathcal{A} }[/math] is an [math]\displaystyle{ (S_1, S_2) }[/math]-reshaping in which [math]\displaystyle{ S_1 = (k) }[/math] and [math]\displaystyle{ S_2 = (1,2,\ldots,k-1,k+1,\ldots,d) }[/math]. Usually, a standard flattening is denoted by

[math]\displaystyle{ \mathcal{A}_{(k)} := \mathcal{A}_{(S_1,S_2)} }[/math]

These reshapings are sometimes called matricizations or unfoldings in the literature. A standard choice for the bijections [math]\displaystyle{ \mu_1,\ \mu_2 }[/math] is the one that is consistent with the reshape function in Matlab and GNU Octave, namely

[math]\displaystyle{ \mathcal{A}_{(k)} := \begin{bmatrix} a_{1,1,\ldots,1,1,1,\ldots,1} & a_{2,1,\ldots,1,1,1,\ldots,1} & \cdots & a_{n_1,n_2,\ldots,n_{k-1},1,n_{k+1},\ldots,n_d} \\ a_{1,1,\ldots,1,2,1,\ldots,1} & a_{2,1,\ldots,1,2,1,\ldots,1} & \cdots & a_{n_1,n_2,\ldots,n_{k-1},2,n_{k+1},\ldots,n_d} \\ \vdots & \vdots & & \vdots \\ a_{1,1,\ldots,1,n_k,1,\ldots,1} & a_{2,1,\ldots,1,n_k,1,\ldots,1} & \cdots & a_{n_1,n_2,\ldots,n_{k-1},n_k,n_{k+1},\ldots,n_d} \end{bmatrix} }[/math]