Special matrices

This is a fairly long post that summarizes many kinds of special matrices.

1. orthogonal matrix

An orthogonal matrix, or real orthogonal matrix, is a matrix with real entries whose columns are orthonormal vectors, that is, \[
Q^T Q = I
\]
where \(I\) is the identity matrix.

Writing out the matrix product in terms of the columns \(\mathbf {q}_i\) of \(Q\) gives:
\[
\begin {align}
Q^T Q &= \begin {bmatrix}
\mathbf {q}^T_1 \\
\mathbf {q}^T_2 \\
\vdots \\
\mathbf {q}^T_n
\end {bmatrix}
\begin {bmatrix}
\mathbf {q}_1 & \mathbf {q}_2 &\cdots& \mathbf {q}_n \\
\end {bmatrix} \\
&= \begin {bmatrix}
\mathbf {q}_1^T \mathbf {q}_1 & \mathbf {q}_1^T\mathbf {q}_2 &\cdots& \mathbf {q}_1^T\mathbf {q}_n \\
\mathbf {q}_2^T \mathbf {q}_1 & \mathbf {q}_2^T\mathbf {q}_2 &\cdots& \mathbf {q}_2^T\mathbf {q}_n \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf {q}_n^T \mathbf {q}_1 & \mathbf {q}_n^T\mathbf {q}_2 &\cdots& \mathbf {q}_n^T\mathbf {q}_n \\
\end {bmatrix}
\end{align}
\]
Comparing this with \(I\) entry by entry shows that the columns of \(Q\) are pairwise orthonormal, that is, \[
\mathbf {q}_i^T\mathbf {q}_j = \begin{cases}
0, &\text{whenever } i \neq j ,\quad \text {giving the orthogonality}\\
1, &\text{whenever } i = j, \quad \text {giving the normalization}
\end{cases}
\]
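Notes:

  1. Strictly speaking, an orthogonal matrix ought to be called an orthonormal matrix; the name is a historical accident
  2. A matrix \(Q\) with \(Q^T Q = I\) is not necessarily square; it can be a rectangular matrix with orthonormal columns (the term orthogonal matrix is usually reserved for the square case). If \(Q\) also satisfies \(QQ^T = I\), then the rows of \(Q\) are orthonormal as well and \(Q\) is square

Orthogonal matrices are covered in detail in the posts on projection and QR decomposition.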
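Note 2 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the rotation angle and the tall matrix `Q_tall` are arbitrary illustrative choices, not taken from the post.

```python
import numpy as np

# Square case: a 2x2 rotation matrix (the angle is an arbitrary choice).
theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
square_cols_ok = np.allclose(Q.T @ Q, np.eye(2))   # columns are orthonormal
square_rows_ok = np.allclose(Q @ Q.T, np.eye(2))   # rows are orthonormal too

# Rectangular case: the first two columns of the 3x3 identity.
Q_tall = np.eye(3)[:, :2]                          # shape (3, 2)
tall_cols_ok = np.allclose(Q_tall.T @ Q_tall, np.eye(2))   # Q^T Q = I still holds
tall_rows_ok = np.allclose(Q_tall @ Q_tall.T, np.eye(3))   # Q Q^T = I fails
```

For the tall matrix, \(Q Q^T\) is a projection onto the column space rather than the identity, which is why both conditions together force \(Q\) to be square.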

Generalizing the orthogonal matrix \(Q\) to the complex field gives the unitary matrix \(U\) below.

The transpose is replaced by the conjugate transpose.

2. unitary matrix and normal matrix

2.1 unitary matrix

A complex square matrix \(U\) is unitary if its conjugate transpose \(U^{H}\) is also its inverse, that is, \[
U^{H} U = I
\]
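where \(I\) is the identity matrix.

Note:

  • The conjugate transpose is written in two ways: \(U^H\) or \(U^{\ast}\)
  • As with \(Q\) above, the condition \(U^H U = I\) on its own also makes sense for a rectangular matrix with orthonormal columns; the term unitary is reserved for the square case, where \(U U^H = I\) holds as well

Equivalently, a complex square matrix is unitary exactly when its columns \(\mathbf {u}_i\) are orthonormal: \[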
\mathbf {u}_i^{H}\mathbf {u}_j = \begin{cases}
0, &\text{whenever } i \neq j ,\quad \text {giving the orthogonality}\\
1, &\text{whenever } i = j, \quad \text {giving the normalization}
\end{cases}
\]
Properties:

  1. Inner products are preserved: \((U\mathbf {x})^H (U\mathbf {y}) = \mathbf {x}^H U^H U\mathbf {y} = \mathbf {x}^H \mathbf {y}\)
    Length unchanged: \(||U\mathbf {x}||^2 = (U\mathbf {x})^H (U\mathbf {x}) = ||\mathbf {x}||^2\)

  2. Every eigenvalue of \(U\) has absolute value \(|\lambda| = 1\), so all eigenvalues lie on the unit circle in the complex plane.
    Proof: let \(U\mathbf {x} = \lambda \mathbf {x}\) with \(\mathbf {x} \neq \mathbf {0}\). By property 1, \(||U\mathbf {x}|| = ||\mathbf {x}||\), while
    \(||U\mathbf {x}|| = ||\lambda \mathbf {x}|| = |\lambda| \, ||\mathbf {x}||\), hence \(|\lambda| = 1\)

  3. Eigenvectors corresponding to different eigenvalues are orthogonal
    Proof: suppose \(U\mathbf {x} = \lambda_1 \mathbf {x}\) and \(U\mathbf {y} = \lambda_2 \mathbf {y}\). Then by property 1: \[
    \mathbf {x}^H \mathbf {y} = (U\mathbf {x})^H (U\mathbf {y}) = (\lambda_1 \mathbf {x})^H (\lambda_2 \mathbf {y}) = \overline {\lambda_1} \lambda_2 \mathbf {x}^H \mathbf {y}
    \]
    By property 2, \(|\lambda_1| = 1\), so \(\overline {\lambda_1} = 1/\lambda_1\) and \(\overline {\lambda_1} \lambda_2 = \lambda_2 / \lambda_1\). If \(\lambda_1 \neq \lambda_2\), then \(\overline {\lambda_1} \lambda_2 \neq 1\), and \((1 - \overline {\lambda_1} \lambda_2)\, \mathbf {x}^H \mathbf {y} = 0\) forces \(\mathbf {x}^H \mathbf {y} = 0\)
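The three properties can be spot-checked on a small example. This is only a sketch, assuming NumPy; the matrix \(U\) and the vectors `x`, `y` below are hypothetical test data chosen for illustration.

```python
import numpy as np

# A 2x2 unitary matrix chosen for illustration: U = (1/sqrt(2)) [[1, i], [i, 1]].
U = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)
is_unitary = np.allclose(U.conj().T @ U, np.eye(2))

x = np.array([1 + 2j, 3 - 1j])
y = np.array([0.5j, 2.0])

# Property 1: np.vdot conjugates its first argument, so vdot(a, b) = a^H b.
inner_preserved = np.isclose(np.vdot(U @ x, U @ y), np.vdot(x, y))
length_preserved = np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))

# Property 2: every eigenvalue has modulus 1.
eigvals, eigvecs = np.linalg.eig(U)
moduli = np.abs(eigvals)

# Property 3: eigenvectors of the two distinct eigenvalues are orthogonal.
overlap = np.vdot(eigvecs[:, 0], eigvecs[:, 1])
```

Here the eigenvalues are \((1 \pm i)/\sqrt {2}\), which are distinct, so the orthogonality check in property 3 applies.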

The most important example of a unitary matrix is the normalized Fourier matrix (the matrix behind the fast Fourier transform): \[
U = \cfrac {1}{\sqrt {n}}
\begin {bmatrix}
1 & 1 & \cdots & 1 \\
1 & w & \cdots & w^{n-1} \\
\vdots & \vdots & \ddots & \vdots \\
1 & w^{n-1} & \cdots & w^{(n-1)^2}
\end {bmatrix}
= \cfrac {\text {Fourier matrix}} {\sqrt {n}}
\]
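where the complex number \(w\) is the point on the unit circle at angle \(\theta = 2\pi/n\), i.e. \(w = e^{2\pi i/n}\)

Since \(U\) is symmetric (\(U^T = U\)), its conjugate transpose is simply its complex conjugate, and therefore \(U^{-1} = U^H = \overline U\)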
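As a quick numerical check of these claims, here is a sketch assuming NumPy, with \(n = 4\) picked only for illustration. It builds the Fourier matrix from \(w\) and verifies that \(U\) is unitary and symmetric, and that its inverse is its complex conjugate.

```python
import numpy as np

n = 4                                    # size chosen only for illustration
w = np.exp(2j * np.pi / n)               # w = e^{2 pi i / n}
exponents = np.outer(np.arange(n), np.arange(n))
F = w ** exponents                       # Fourier matrix, entry (j, k) is w^{jk}
U = F / np.sqrt(n)

is_unitary = np.allclose(U.conj().T @ U, np.eye(n))
is_symmetric = np.allclose(U, U.T)
inverse_is_conjugate = np.allclose(np.linalg.inv(U), U.conj())

# NumPy's FFT uses the opposite sign convention, e^{-2 pi i jk / n},
# so applying it to the identity yields the conjugate of F.
matches_fft = np.allclose(np.fft.fft(np.eye(n)), F.conj())
```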

2.2 normal matrix

Let \(A\) be an \(n \times n\) complex matrix. \(A\) is called a normal matrix if it satisfies
\[
A^H A = A A^H
\]
A matrix is normal if and only if it is unitarily diagonalizable,

that is:
\[
A = U D U^H
\]
where \(U\) is a unitary matrix, i.e. \(U^{-1} = U^H\), and \(D\) is a diagonal matrix

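Important normal matrices:

  1. Real symmetric matrices, because if \(A = A^T\), then \(A^T A = A A^T = A^2\)

  2. The Hermitian matrices introduced below, which generalize real symmetric matrices to the complex field

  3. Orthogonal matrices and unitary matrices

  4. The skew-Hermitian matrices mentioned below (\(K^H = -K\))
    \(K^H K = (-K) (-K^H) = K K^H\)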

Each of the following is equivalent to \(A = [a_{ij}]\), with eigenvalues \(\lambda_1, \ldots, \lambda_n\), being a normal matrix:

  1. \(A\) can be unitarily diagonalized, \(A = U\Lambda U^H\), where \(\Lambda = \mathrm{diag} (\lambda_1, \ldots, \lambda_n)\) is diagonal and \(U\) is a unitary matrix, \(U^H = U^{-1}\)
  2. \(A = B + iC\), where \(B\) and \(C\) are commuting (\(BC = CB\)) Hermitian matrices
  3. \(\sum_{i=1}^n \vert \lambda_i \vert^2 = \sum_{i=1}^n \sum_{j=1}^n \vert a_{ij} \vert^2\)
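Conditions 2 and 3 are easy to test numerically. The sketch below assumes NumPy; the matrices `A` (normal but not symmetric) and `J` (a Jordan block, not normal) and the helper names are hypothetical examples introduced here for illustration.

```python
import numpy as np

def is_normal(M):
    """Check the defining identity M^H M = M M^H."""
    return np.allclose(M.conj().T @ M, M @ M.conj().T)

def eig_vs_frobenius(M):
    """Return (sum of |lambda_i|^2, sum of |m_ij|^2) for condition 3."""
    eigvals = np.linalg.eigvals(M)
    return np.sum(np.abs(eigvals) ** 2), np.sum(np.abs(M) ** 2)

A = np.array([[1.0, -1.0], [1.0, 1.0]])   # normal, eigenvalues 1 +/- i
J = np.array([[1.0, 1.0], [0.0, 1.0]])    # Jordan block, not normal

# Condition 2: split A into Hermitian parts, A = B_part + i * C_part.
B_part = (A + A.conj().T) / 2
C_part = (A - A.conj().T) / 2j
parts_hermitian = (np.allclose(B_part, B_part.conj().T)
                   and np.allclose(C_part, C_part.conj().T))
parts_commute = np.allclose(B_part @ C_part, C_part @ B_part)

# Condition 3: equality holds for A (4 == 4) but not for J (2 != 3).
lhs_A, rhs_A = eig_vs_frobenius(A)
lhs_J, rhs_J = eig_vs_frobenius(J)
```

The split in condition 2 always produces Hermitian parts for any square matrix; what characterizes normality is that the two parts commute.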

3. real symmetric matrix and Gramian matrix

3.1 symmetric matrix

If \(A\) satisfies \(A^T = A\), then \(A\) is a symmetric matrix (note: complex matrices are not considered yet)

Properties:

  1. Eigenvectors corresponding to different eigenvalues are orthogonal
    Proof: suppose there are two eigenvalues \(\lambda\) and \(\mu\) with \(\lambda \neq \mu\), and corresponding eigenvectors \(\mathbf {x}\) and \(\mathbf {y}\), i.e. \(A \mathbf {x} = \lambda \mathbf {x}\) and \(A \mathbf {y} = \mu \mathbf {y}\). Then: \[
    \begin {align}
    \mu \mathbf {y}^{T} \mathbf {x} &= (\mu \mathbf {y})^{T} \mathbf {x} = (A\mathbf {y})^{T} \mathbf {x} = \mathbf {y}^{T} A^T\mathbf {x} \\
    &= \mathbf {y}^{T} A\mathbf {x} =\mathbf {y}^{T} (A\mathbf {x}) = \lambda \mathbf {y}^{T} \mathbf {x}
    \end {align}
    \]
    Hence \((\lambda - \mu)\, \mathbf {y}^{T} \mathbf {x} = 0\). Since \(\lambda \neq \mu\), we get \(\mathbf {y}^{T} \mathbf {x} = 0\), i.e. \(\mathbf {x}\) and \(\mathbf {y}\) are orthogonal

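  2. A real symmetric matrix is orthogonally diagonalizable: \(A = Q \Lambda Q^T\)
    A non-rigorous argument combines the diagonalization \(A = S\Lambda S^{-1}\) (where \(AS = S\Lambda\) collects the equations \(A\mathbf {x} = \lambda \mathbf {x}\) for all eigenpairs)
    with the symmetry of \(A\); a rigorous proof uses the normal matrix property above
    Here \(Q\) is an orthogonal matrix whose columns are normalized eigenvectors (their orthogonality follows from property 1), so \(Q^T = Q^{-1}\), and \(\Lambda = \mathrm {diag}(\lambda_1, \ldots, \lambda_n)\)
    This is the famous principal axis theorem, also known as the spectral theorem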
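The factorization \(A = Q \Lambda Q^T\) can be reproduced with a symmetric eigensolver. This is a minimal sketch, assuming NumPy's `np.linalg.eigh`; the \(3 \times 3\) matrix is an arbitrary example whose eigenvalues happen to be 1, 2 and 4.

```python
import numpy as np

# An illustrative real symmetric matrix (eigenvalues 1, 2, 4).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# eigh is designed for symmetric/Hermitian input: it returns real eigenvalues
# in ascending order and orthonormal eigenvectors as the columns of Q.
eigvals, Q = np.linalg.eigh(A)
Lam = np.diag(eigvals)

q_is_orthogonal = np.allclose(Q.T @ Q, np.eye(3))   # Q^T = Q^{-1}
reconstructs = np.allclose(Q @ Lam @ Q.T, A)        # A = Q Lam Q^T
```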

Note: if \(A\) satisfies \(A^T = -A\), then \(A\) is a skew-symmetric matrix

3.2 Gramian matrix (very important)

Let \(A\) be an \(m \times n\) real matrix. The \(n \times n\) square matrix \(G = [g_{ij}] = A^T A\) is called the Gramian matrix of \(A\).

Writing \(A\) in terms of its columns, \(A = \begin {bmatrix} \mathbf {a}_1 & \mathbf {a}_2 & \cdots & \mathbf {a}_n\end {bmatrix}\) with \(\mathbf {a}_i \in \mathbb {R}^m\), we get \[
\begin {align}
G &= A^T A =
\begin {bmatrix} \mathbf {a}_1^T \\ \mathbf {a}_2^T \\ \vdots \\ \mathbf {a}_n^T \end {bmatrix} \begin {bmatrix} \mathbf {a}_1 & \mathbf {a}_2 & \cdots & \mathbf {a}_n \end {bmatrix} \\
&= \begin {bmatrix} \mathbf {a}_1^T \mathbf {a}_1 & \mathbf {a}_1^T \mathbf {a}_2 & \cdots & \mathbf {a}_1^T\mathbf {a}_n \\ \mathbf {a}_2^T \mathbf {a}_1 & \mathbf {a}_2^T \mathbf {a}_2 & \cdots & \mathbf {a}_2^T\mathbf {a}_n \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf {a}_n^T \mathbf {a}_1 & \mathbf {a}_n^T \mathbf {a}_2 & \cdots & \mathbf {a}_n^T\mathbf {a}_n\end {bmatrix}
\end {align}
\]
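Properties:

  1. \(A^T A\) is symmetric, so it has all the properties of a symmetric matrix. The most important one:
    eigenvectors of \(A^TA\) corresponding to distinct eigenvalues are orthogonal

  2. \(A^T A\) is positive semidefinite
    In passing, here are the standard characterizations of positive semidefiniteness, stated for \(A^TA\):

    • For every \(\mathbf {x} \in \mathbb {R}^n\), \[
      \mathbf {x}^T A^T A \mathbf {x} = (A\mathbf {x})^T (A\mathbf {x}) = ||A \mathbf {x}||^2 \ge 0
      \]

    • All eigenvalues of \(A^TA\) satisfy \(\lambda_i \ge 0\)

    • No principal submatrices have negative determinant

    • No pivots are negative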

  3. \(\mathrm {rank} (A^T A) = \mathrm {rank} A\)
    First show: \(A\mathbf {x} = \mathbf {0} \to A^T A \mathbf {x} = \mathbf {0}\), which is obvious.
    Then show: \(A^T A \mathbf {x} = \mathbf {0} \to A\mathbf {x} = \mathbf {0}\)
    There are two ways to prove this:

    1. Use \(\mathbf {x}^T A^T A \mathbf {x} = ||A \mathbf {x}||^2 = 0\), which gives \(A \mathbf {x} = \mathbf {0}\)

    2. Use \(A\mathbf {x} \in C(A)\) together with \(A^T(A\mathbf {x}) = \mathbf {0} \to A\mathbf {x} \in N(A^T)\)
      Since \(N(A^T)\) is the orthogonal complement of \(C(A)\), \(C(A) \cap N(A^T) = \{ \mathbf {0} \}\), and therefore \(A \mathbf {x} = \mathbf {0}\)

    Hence: \(N(A) = N(A^T A)\), so \(\dim N(A) = \dim N(A^TA)\)
    By the rank-nullity theorem: \[
    \mathrm {rank} A = n - \dim N(A) \\
    \mathrm {rank} (A^TA) = n - \dim N(A^TA)
    \]
    Therefore: \(\mathrm {rank} (A^T A) = \mathrm {rank} A\)

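  4. \(A^T A\) is invertible and positive definite if and only if the columns \(\mathbf {a}_1, \ldots, \mathbf {a}_n\) of \(A\) are linearly independent
    By definition, \(\dim C(A)\) is the number of linearly independent columns of \(A\), so independent columns mean \(\dim C(A) = n\), i.e. \(\mathrm{rank} A = n\)
    By property 3, \(\text {rank}(A^TA) = \text {rank}(A) = n\), so the \(n \times n\) matrix \(A^T A\) is invertible. Conversely, if \(A^TA\) is invertible then \(\text {rank}(A) = \text {rank}(A^TA) = n\) and the columns are independent
    Proof of positive definiteness:
    since \(A^TA\) is invertible, \(\det (A^TA) \neq 0\), which gives
    \(\prod_{i=1}^n \lambda_i = \det(A^TA) \neq 0\). Combined with \(\lambda_i \ge 0\), this yields \(\lambda_i \gt 0\)
    so \(A^TA\) is positive definite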

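  5. Every real symmetric positive semidefinite matrix \(M\) can be written as a Gramian matrix
    For example, with \(M = Q \Lambda Q^T\) and \(A = \Lambda^{1/2} Q^T\), we have \(A^T A = Q \Lambda Q^T = M\)

  6. A Gramian matrix \(G = A^TA\) can be written as \(G = R^TR\), where \(R\) is an upper triangular matrix
    For an \(m \times n\) matrix \(A\) (with \(m \ge n\)), the QR decomposition gives \(A = QR\), where \(Q\) is \(m \times n\) with orthonormal columns and \(R\) is \(n \times n\) upper triangular, so \(A^TA = R^T Q^T Q R = R^T R\)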
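Properties 2, 3, 4 and 6 can be verified on a small example. The sketch below assumes NumPy; the matrices `A` (independent columns) and `A_dep` (dependent columns) are hypothetical test data chosen for illustration.

```python
import numpy as np

# A 3x2 matrix with linearly independent columns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
G = A.T @ A                                   # Gramian matrix, here [[3, 3], [3, 5]]

# Property 3: rank(A^T A) == rank(A).
rank_equal = np.linalg.matrix_rank(G) == np.linalg.matrix_rank(A)

# Properties 2 and 4: eigenvalues are nonnegative, and strictly positive
# here because the columns of A are independent.
eigvals = np.linalg.eigvalsh(G)

# Property 6: reduced QR gives Q (3x2) and upper triangular R (2x2), G = R^T R.
Q, R = np.linalg.qr(A)
r_is_upper = np.allclose(R, np.triu(R))
g_equals_rtr = np.allclose(R.T @ R, G)

# Dependent columns (second column = 2 * first): G becomes singular.
A_dep = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 6.0]])
G_dep = A_dep.T @ A_dep
rank_dep = np.linalg.matrix_rank(G_dep)       # rank 1, so G_dep is not invertible
```

In the dependent case the Gramian is still positive semidefinite, but one eigenvalue drops to zero, matching the "if and only if" in property 4.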

Uses:

  1. Projection matrices
  2. Covariance matrices
  3. SVD
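As an illustration of the first use, the projection onto the column space of \(A\) is commonly written \(P = A (A^T A)^{-1} A^T\), which relies on the Gramian matrix being invertible. The following is only a sketch, assuming NumPy and independent columns; the matrix `A` and vector `b` are illustrative data not taken from this post.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
G = A.T @ A                            # Gramian matrix, invertible here

# P = A (A^T A)^{-1} A^T, computed with a linear solve instead of an explicit inverse.
P = A @ np.linalg.solve(G, A.T)

is_idempotent = np.allclose(P @ P, P)  # projecting twice changes nothing
is_symmetric = np.allclose(P, P.T)

b = np.array([6.0, 0.0, 0.0])
p = P @ b                              # projection of b onto C(A)
residual_orthogonal = np.allclose(A.T @ (b - p), 0.0)
```

The residual \(\mathbf {b} - \mathbf {p}\) lies in \(N(A^T)\), the same orthogonal complement used in the rank proof of property 3.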

Keep this sentence in mind:

With rectangular matrices, the key is almost always to consider \(A^TA\) and \(AA^T\)

Generalizing symmetric matrices to the complex field gives the Hermitian matrix.

4. Hermitian matrix

A Hermitian matrix is a complex square matrix that is equal to its own conjugate transpose, that is, \[
A = A^{H}
\]
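If \(A\) is a Hermitian matrix, it has the following properties:

  1. For any vector \(\mathbf {x} \in \mathbb {C}^n\), \(\mathbf {x}^{H}A\mathbf {x}\) is a real number
    The converse also holds
    For the forward direction, note that \(\mathbf {x}^{H}A\mathbf {x}\) is a scalar and \((\mathbf {x}^{H}A\mathbf {x})^{H} = \mathbf {x}^{H}A^{H}\mathbf {x} = \mathbf {x}^{H}A\mathbf {x}\), so it equals its own conjugate
    For the converse, expand \((\mathbf {x}+\mathbf {y})^{H}A(\mathbf {x}+\mathbf {y})\) (and likewise with \(\mathbf {x}+i\mathbf {y}\))

  2. All eigenvalues of \(A\) are real
    From \(A\mathbf {x} = \lambda \mathbf {x}\) we get \(\mathbf {x}^{H}A\mathbf {x} = \lambda \mathbf {x}^{H}\mathbf {x}\)
    so \(\lambda = \mathbf {x}^{H}A\mathbf {x}/\mathbf {x}^{H}\mathbf {x}\)
    The numerator is real by property 1 and the denominator \(\mathbf {x}^{H}\mathbf {x} = ||\mathbf {x}||^2\) is real and positive, so \(\lambda\) is real

  3. The remaining properties mirror those of the symmetric matrices above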

Note: if \(K^H = -K\), then \(K\) is called a skew-Hermitian matrix

If \(A\) is a Hermitian matrix, then \(K = iA\) is a skew-Hermitian matrix, because \(K^H = \overline {i} A^H = -iA = -K\)
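These Hermitian and skew-Hermitian facts can be checked directly. This is a minimal sketch, assuming NumPy; the \(2 \times 2\) matrix `A` and the vector `x` are hypothetical examples chosen so the numbers come out cleanly.

```python
import numpy as np

# An illustrative Hermitian matrix: real diagonal, conjugate off-diagonal pair.
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
is_hermitian = np.allclose(A, A.conj().T)

# Property 1: x^H A x is real (np.vdot conjugates its first argument).
x = np.array([1 + 2j, -1j])
quad = np.vdot(x, A @ x)

# Property 2: eigenvalues are real. The general solver is used on purpose,
# so that realness is observed rather than assumed.
eigvals = np.linalg.eigvals(A)

# K = iA is skew-Hermitian, and its eigenvalues are purely imaginary.
K = 1j * A
is_skew_hermitian = np.allclose(K.conj().T, -K)
k_eigvals = np.linalg.eigvals(K)
```

For this particular matrix the trace is 5 and the determinant is 4, so the eigenvalues are 1 and 4, and the quadratic form evaluates to 7.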

Summary: real symmetric matrices versus Hermitian matrices (their complex counterpart), side by side:

| Real | Complex |
| --- | --- |
| \(\mathbb {R}^n\) (n real components) | \(\mathbb {C}^n\) (n complex components) |
| length: \(\Vert \mathbf {x} \Vert^2 = x_1^2 + \cdots + x_n^2\) | length: \(\Vert \mathbf {x} \Vert^2 = \vert x_1 \vert^2 + \cdots + \vert x_n \vert^2\) |
| transpose: \(A_{ij}^T = A_{ji}\) | Hermitian transpose: \(A_{ij}^H = \overline {A_{ji}}\) |
| inner product: \(\mathbf {x}^T \mathbf {y} = x_1 y_1 + \cdots + x_n y_n\) | inner product: \(\mathbf {x}^H \mathbf {y} = \overline x_1 y_1 + \cdots + \overline x_n y_n\) |
| \((A\mathbf {x})^T \mathbf {y} = \mathbf {x}^T (A^T \mathbf {y})\) | \((A\mathbf {x})^H \mathbf {y} = \mathbf {x}^H (A^H \mathbf {y})\) |
| orthogonality: \(\mathbf {x}^T \mathbf {y} = 0\) | orthogonality: \(\mathbf {x}^H \mathbf {y} = 0\) |
| symmetric matrices: \(A^T = A\) | Hermitian matrices: \(A^H = A\) |
| \(A = Q \Lambda Q^{-1} = Q \Lambda Q^T\) (real \(\Lambda\)) | \(A = U \Lambda U^{-1} = U \Lambda U^H\) (real \(\Lambda\)) |
| skew-symmetric: \(K^T = -K\) | skew-Hermitian: \(K^H = -K\) |
| orthogonal: \(Q^TQ = I\) or \(Q^T = Q^{-1}\) | unitary: \(U^H U = I\) or \(U^{-1} = U^H\) |
| \((Q\mathbf {x})^T (Q\mathbf {y}) = \mathbf {x}^T \mathbf {y}\) and \(\Vert Q\mathbf {x} \Vert = \Vert \mathbf {x} \Vert\) | \((U\mathbf {x})^H (U\mathbf {y}) = \mathbf {x}^H \mathbf {y}\) and \(\Vert U\mathbf {x} \Vert = \Vert \mathbf {x} \Vert\) |

The columns and rows of a square \(Q\) or \(U\) are orthonormal, their eigenvectors can be chosen orthonormal, and every eigenvalue satisfies \(|\lambda| = 1\)
