
[Numerical Analysis] 7. PCA

hwijin97 2022. 5. 20. 18:20

Bounding Volume Construction

- Principal Component Analysis (PCA)

 

 

Principal Components

- All principal components (PCs) start at the origin

- First PC is direction of maximum variance from origin

- Subsequent PCs are orthogonal to the preceding PCs and describe the maximum residual variance

 

- For an arbitrary set of $N$ vertices $P_1, P_2, \cdots, P_N$

- Mean position : $m = \frac{1}{N} \sum_{i=1}^N P_i$

- $3 \times 3$ covariance matrix : $C = \frac{1}{N} \sum_{i=1}^N (P_i-m)(P_i-m)^T$

    - Represents the correlation between each pair of the $x,y,z$ coordinates ( a NumPy sketch follows below )
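
As a concrete illustration, here is a minimal NumPy sketch of the two formulas above. The point set `P` is made-up random data, not from the notes; `m` and `C` follow the definitions exactly.

```python
import numpy as np

# Illustrative data: N points in 3D, one per row (not from the notes)
rng = np.random.default_rng(0)
P = rng.normal(size=(100, 3))
N = len(P)

# Mean position: m = (1/N) * sum_i P_i
m = P.mean(axis=0)

# 3x3 covariance matrix: C = (1/N) * sum_i (P_i - m)(P_i - m)^T
D = P - m            # deviations from the mean, one per row
C = (D.T @ D) / N    # equal to the sum of outer products above
```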

 

- Covariance matrix : an off-diagonal entry of zero -> no correlation between that pair of coordinates

- $C$ is a diagonal matrix -> the three coordinates are uncorrelated ( all entries except the diagonal are 0 )

 

 

- We want to transform the points so that the covariance matrix is diagonal

    - Transformation matrix : $A$

    - New covariance matrix ( diagonal ) : $C' = \frac{1}{N}\sum_{i=1}^N (AP_i - Am)(AP_i-Am)^T = \frac{1}{N} \sum_{i=1}^N A(P_i - m)(P_i-m)^TA^T = ACA^T$ -> choose $A$ to diagonalize $C$

    - Find $A$ using eigenvectors : Rows of $A$ are unit eigenvectors sorted by eigenvalues in decreasing order

    - $A$ is orthogonal, so $A^{-1} = A^T = [X_1 \cdots X_n]$, where $X_i$ is the $i$-th unit eigenvector ( see the sketch below )
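
A minimal sketch of the diagonalization step, assuming the `C` computed in the sketch above. `np.linalg.eigh` (for symmetric matrices) returns eigenvalues in ascending order, so we reverse to get the decreasing order the notes call for.

```python
import numpy as np

# Eigendecomposition of the symmetric covariance matrix C from above;
# eigh returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(C)

# Rows of A = unit eigenvectors, sorted by eigenvalue in decreasing order.
order = np.argsort(eigvals)[::-1]
A = eigvecs[:, order].T

# A is orthogonal: A^{-1} = A^T.
assert np.allclose(A @ A.T, np.eye(3))

# New covariance C' = A C A^T is diagonal (up to round-off),
# with the sorted eigenvalues on the diagonal.
C_prime = A @ C @ A.T
assert np.allclose(C_prime, np.diag(eigvals[order]))
```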

 

- Suppose data points are N-dimensional

    - Same procedure applies : $C' = ACA^T$

    - The eigenvectors define a new coordinate system

      :: eigenvector with largest eigenvalue captures the most variation among training vectors

      :: eigenvector with smallest eigenvalue has least variation

    - We can compress the data by only using the top few eigenvectors

      :: corresponds to choosing a "linear subspace" ( see the sketch below )
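
A sketch of this compression idea, under the same assumptions as the sketches above; the helper name `pca_compress` and its interface are mine, not from the notes.

```python
import numpy as np

def pca_compress(P, k):
    """Project points (rows of P) onto the top-k principal subspace.

    Illustrative helper: returns the k coordinates per point plus the
    basis and mean needed for approximate reconstruction.
    """
    m = P.mean(axis=0)
    C = (P - m).T @ (P - m) / len(P)
    eigvals, eigvecs = np.linalg.eigh(C)              # ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]   # top-k eigenvectors as columns
    coords = (P - m) @ top                            # k numbers per point
    return coords, top, m

# Approximate reconstruction: P ≈ coords @ top.T + m
# (exact when the dropped eigenvalues are zero)
```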

 

In short, we compute the eigenvalues and eigenvectors of the covariance matrix and transform the points with the transformation matrix $A = \begin{bmatrix} R \\ S \\ T \end{bmatrix}$, whose rows $R, S, T$ are the unit eigenvectors sorted by decreasing eigenvalue. We can then compress the data by dropping the axis with the smallest eigenvalue, reducing the dimension along that direction.
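
Tying this back to the bounding-volume topic at the top: a sketch of building an oriented bounding box from the PCA axes $R, S, T$. The helper name `obb_from_pca` is mine, and taking min/max projections for the extents is a simple choice for illustration, not something prescribed by the notes.

```python
import numpy as np

def obb_from_pca(P):
    """Oriented bounding box from PCA (sketch).

    Rows of A are the unit eigenvectors R, S, T from the summary above.
    """
    m = P.mean(axis=0)
    C = (P - m).T @ (P - m) / len(P)
    eigvals, eigvecs = np.linalg.eigh(C)
    A = eigvecs[:, np.argsort(eigvals)[::-1]].T   # rows R, S, T
    local = (P - m) @ A.T                         # points in the PCA frame
    lo, hi = local.min(axis=0), local.max(axis=0)
    center = m + A.T @ ((lo + hi) / 2)            # box center back in world space
    half_extents = (hi - lo) / 2
    return A, center, half_extents
```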