# Generalizing projections to Mahalanobis-type metrics

In a previous post, we defined projections onto a subspace and derived the matrix representing this projection. Specifically, if $A$ is a matrix with full column rank and $\text{proj}_A v$ is the projection of $v$ onto the column space of $A$, then $\text{proj}_A v = Pv$, where $P = A (A^\top A)^{-1} A^\top$. In this post, we generalize the idea of projections to Mahalanobis-type metrics.

Let $M \in \mathbb{R}^{d \times d}$ be a symmetric positive definite matrix, and define the metric $\| \cdot \|_M$ by $\| x \|_M^2 = x^\top Mx$ for $x \in \mathbb{R}^d$. Let $A \in \mathbb{R}^{d \times k}$ be a full-rank matrix with $k \leq d$. We define the projection onto the column space of $A$ under the metric $\| \cdot \|_M$, denoted by $\Pi$, as the matrix such that for all $x$, $\Pi x = Ay$, where $y \in \mathbb{R}^k$ minimizes the quantity $\| x - Ay \|_M^2$. That is, the projection of $x$ is the vector in the column space of $A$ that is closest to $x$ in the metric $\| \cdot \|_M$.

We can determine the projection matrix $\Pi$ explicitly. Differentiating the objective function with respect to $y$ (using the symmetry of $M$) and setting the derivative to zero, \begin{aligned} \dfrac{\partial}{\partial y} \| x - Ay \|_M^2 &= \dfrac{\partial}{\partial y} (x - Ay)^\top M (x - Ay) \\ &= -2 A^\top M (x - Ay) = 0, \\ A^\top M A y &= A^\top Mx. \end{aligned}

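Since $A$ has full column rank and $M$ is positive definite, $A^\top M A$ is positive definite and hence non-singular, so $y = \left( A^\top M A \right)^{-1} A^\top Mx$. (Non-singularity of $M$ alone would not be enough here; positive definiteness also makes the objective strictly convex in $y$, so this stationary point is the unique minimizer.) This implies that \begin{aligned} \Pi = A \left( A^\top M A \right)^{-1} A^\top M. \end{aligned}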
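The closed form for $\Pi$ can be checked numerically. The following is a minimal NumPy sketch, not a reference implementation: the helper name `m_projection`, the dimensions, and the randomly generated $A$, $M$, and $x$ are all assumptions made for illustration. It verifies two properties that follow from the derivation: $\Pi$ is idempotent, and the residual $x - \Pi x$ satisfies the normal equations $A^\top M (x - \Pi x) = 0$.

```python
import numpy as np

def m_projection(A, M):
    """Return Pi = A (A^T M A)^{-1} A^T M (hypothetical helper).

    Assumes A has full column rank and M is symmetric positive definite.
    """
    AtM = A.T @ M
    # Solve (A^T M A) Y = A^T M instead of forming the inverse explicitly.
    return A @ np.linalg.solve(AtM @ A, AtM)

# Illustrative random inputs (assumed, not from the post).
rng = np.random.default_rng(0)
d, k = 5, 2
A = rng.standard_normal((d, k))      # full column rank with probability 1
B = rng.standard_normal((d, d))
M = B @ B.T + d * np.eye(d)          # symmetric positive definite by construction
x = rng.standard_normal(d)

Pi = m_projection(A, M)
residual = x - Pi @ x

# Projecting twice should change nothing.
idempotent = np.allclose(Pi @ Pi, Pi)
# The residual should satisfy the normal equations A^T M (x - Pi x) = 0.
m_orthogonal = np.allclose(A.T @ M @ residual, 0.0)

print(idempotent, m_orthogonal)
```

Using `np.linalg.solve` rather than an explicit inverse is a design choice for numerical stability; the result is the same matrix up to rounding.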

How does this relate to our previous post? Setting $M = I$ reduces $\Pi$ to our initial expression for the projection matrix, $A (A^\top A)^{-1} A^\top$. That is, our original expression was for the (orthogonal) projection with respect to the standard Euclidean metric.
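The reduction to the Euclidean case can also be confirmed numerically. This is a short sketch under assumed inputs (a random $A$ with arbitrary dimensions): with $M = I$ the general formula should agree with $A (A^\top A)^{-1} A^\top$, which is a symmetric matrix.

```python
import numpy as np

# Illustrative random input (assumed, not from the post).
rng = np.random.default_rng(1)
d, k = 4, 2
A = rng.standard_normal((d, k))   # full column rank with probability 1
M = np.eye(d)                     # M = I gives the Euclidean metric

# General formula: Pi = A (A^T M A)^{-1} A^T M.
Pi = A @ np.linalg.solve(A.T @ M @ A, A.T @ M)
# Euclidean formula: P = A (A^T A)^{-1} A^T.
P = A @ np.linalg.solve(A.T @ A, A.T)

matches_euclidean = np.allclose(Pi, P)
is_symmetric = np.allclose(P, P.T)

print(matches_euclidean, is_symmetric)
```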

References:

1. Ferguson, T. S. (1996). A Course in Large Sample Theory. Chapter 23.