NMF Algorithm

Optimization:

For $\boldsymbol{u}\in \mathbb{R}^d$, $\boldsymbol{V} \in \mathbb{R}^{n\times d}$ and $V \subseteq \mathbb{R}^d$, let $\mathscr{D}(\boldsymbol{u};\boldsymbol{V})$, $\mathscr{D}(\boldsymbol{u};V)$ be the squared $\ell_2$ distances of $\boldsymbol{u}$ from the convex envelope of the rows of $\boldsymbol{V}$ and from the set $V$, respectively. Namely, letting $\Delta^n$ be the standard $(n-1)$-simplex, we let

$$\mathscr{D}(\boldsymbol{u};\boldsymbol{V}) \equiv \min_{\boldsymbol{\pi}\in \Delta^n}\|\boldsymbol{u} - \boldsymbol{V}^{\mathsf{T}}\boldsymbol{\pi}\|_2^2, \qquad \mathscr{D}(\boldsymbol{u};V) \equiv \min_{\boldsymbol{v}\in V}\|\boldsymbol{u} - \boldsymbol{v}\|_2^2.$$

Using this notation, if $\boldsymbol{X}\in \mathbb{R}^{n\times d}$ is the matrix with data points $\boldsymbol{x}_1, \boldsymbol{x}_2, \dots, \boldsymbol{x}_n \in \mathbb{R}^d$ as its rows and $\boldsymbol{H}\in \mathbb{R}^{r\times d}$ is the matrix with archetypes $\boldsymbol{h}_1, \boldsymbol{h}_2, \dots, \boldsymbol{h}_r \in \mathbb{R}^d$ as its rows, we attempt to solve the following nonconvex problem:

$$\mathop{\mathrm{minimize}}_{\boldsymbol{H}} ~~ \sum_{i=1}^{n}\mathscr{D}(\boldsymbol{x}_i; \boldsymbol{H}) + \lambda\sum_{\ell=1}^{r}\mathscr{D}(\boldsymbol{h}_\ell; \boldsymbol{X}).$$
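
As a concrete illustration, here is a minimal NumPy sketch of this cost, with $\mathscr{D}$ taken as the squared distance defined above. It is not the package's implementation; the helper names (`project_simplex`, `simplex_lsq`, `dist2`, `objective`) and the projected-gradient inner solver are our own assumptions.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1} (sort-based method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

def simplex_lsq(u, V, n_iter=300):
    """Approximately solve min_{pi in simplex} ||u - V.T @ pi||_2^2 by projected gradient."""
    pi = np.full(V.shape[0], 1.0 / V.shape[0])
    step = 1.0 / (np.linalg.norm(V, 2) ** 2 + 1e-12)  # 1/L with L = ||V||_2^2
    for _ in range(n_iter):
        pi = project_simplex(pi - step * V @ (V.T @ pi - u))
    return pi

def dist2(u, V):
    """Squared distance D(u; V) of u from the convex envelope of the rows of V."""
    r = u - V.T @ simplex_lsq(u, V)
    return float(r @ r)

def objective(X, H, lam):
    """sum_i D(x_i; H) + lam * sum_l D(h_l; X), the cost displayed above."""
    return sum(dist2(x, H) for x in X) + lam * sum(dist2(h, X) for h in H)
```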

We propose an initialization algorithm based on the successive projections algorithm of Araújo et al. In addition, we use the Proximal Alternating Linearized Minimization (PALM) scheme of Bolte et al. to minimize the above cost function.

Initialization

Input: Data $\{\boldsymbol{x}_i\}_{i\le n}$, $\boldsymbol{x}_i\in\mathbb{R}^d$, integer $r$;
Output: Initial archetypes $\{\hat{\boldsymbol{h}}_{\ell}^{(0)}\}_{1\le \ell\le r}$;

  1. Set $i(1) = \arg\max_{i \le n} \|\boldsymbol{x}_i\|_2^2$;

  2. Set $\hat{\boldsymbol{h}}^{(0)}_{1} = \boldsymbol{x}_{i(1)}$;

  3. For $\ell\in \{1,\dots, r-1\}$

    1. Define $V_{\ell}\equiv \mathrm{aff}(\hat{\boldsymbol{h}}_{1}^{(0)},\hat{\boldsymbol{h}}_{2}^{(0)},\dots,\hat{\boldsymbol{h}}_{\ell}^{(0)})$, the affine hull of $\hat{\boldsymbol{h}}_{1}^{(0)},\hat{\boldsymbol{h}}_{2}^{(0)},\dots,\hat{\boldsymbol{h}}_{\ell}^{(0)}$;

    2. Set $i(\ell+1) = \arg\max_{i\le n} \mathscr{D}(\boldsymbol{x}_{i};V_{\ell})$;

    3. Set $\hat{\boldsymbol{h}}^{(0)}_{\ell+1} = \boldsymbol{x}_{i(\ell+1)}$;

  4. Return $\{\hat{\boldsymbol{h}}_{\ell}^{(0)}\}_{1\le \ell\le r}$.
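
The sketch below mirrors these steps in NumPy, computing the squared distance to the affine hull with an ordinary least-squares solve; `spa_init` is a name we introduce for illustration, not one from the original code.

```python
def spa_init(X, r):
    """Successive-projections initialization (a sketch of the steps above)."""
    idx = int(np.argmax(np.sum(X ** 2, axis=1)))      # step 1: point of largest norm
    H = [X[idx]]                                      # step 2
    for _ in range(r - 1):                            # step 3
        base = H[0]
        if len(H) == 1:
            # aff(H) is a single point, so the distance is just to that point.
            dists = [float((x - base) @ (x - base)) for x in X]
        else:
            # Columns of B span the directions of aff(H) relative to h_1.
            B = np.stack([h - base for h in H[1:]], axis=1)
            dists = []
            for x in X:
                coef, *_ = np.linalg.lstsq(B, x - base, rcond=None)
                res = x - base - B @ coef
                dists.append(float(res @ res))
        H.append(X[int(np.argmax(dists))])            # steps 3.2-3.3
    return np.stack(H)                                # step 4: r initial archetypes
```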

After finding the initial set of archetypes $\{\hat{\boldsymbol{h}}_{\ell}^{(0)}\}_{1\le \ell\le r}$, the initial set of weights $\{\hat{\boldsymbol{w}}_{i}^{(0)}\}_{1\le i\le n}$ can be found by

$$\hat{\boldsymbol{w}}_{i}^{(0)} = \arg\min_{\boldsymbol{w}\in \Delta^r}\|\boldsymbol{x}_{i} - \boldsymbol{w}\hat{\boldsymbol{H}}^{(0)}\|_2.$$
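
Each $\hat{\boldsymbol{w}}_i^{(0)}$ is again a simplex-constrained least-squares problem, so the `simplex_lsq` helper from the first sketch applies directly (our construction, not the package's interface):

```python
# H0 = spa_init(X, r) from the sketch above; rows of W0 are the initial weights.
W0 = np.stack([simplex_lsq(x, H0) for x in X])
```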

After finding the above initial estimates, we perform the PALM iterations, which are guaranteed to converge to critical points of the risk function. For $\boldsymbol{u}\in \mathbb{R}^d$ and $V \subseteq \mathbb{R}^d$, if we let

$$\boldsymbol{\Pi}_{V}(\boldsymbol{u}) = \arg\min_{\boldsymbol{v}\in V}\|\boldsymbol{u} - \boldsymbol{v}\|_2$$

denote the projection of $\boldsymbol{u}$ onto $V$, the iterations for this risk function take the following form:
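
For $V = \mathrm{conv}(\boldsymbol{X})$ this projection is itself a simplex-constrained least squares, so it can be sketched with the same helper as before (again our own approximation, not the reference implementation):

```python
def proj_conv(u, X):
    """Approximate projection of u onto the convex envelope of the rows of X."""
    return X.T @ simplex_lsq(u, X)
```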

Proximal Alternating Linearized Minimization (PALM) Iterations

Let $\mathrm{conv}(\boldsymbol{X})$ represent the convex envelope of the rows of $\boldsymbol{X}$, and initialize $k = 0$. Below, the projections $\boldsymbol{\Pi}_{\mathrm{conv}(\boldsymbol{X})}$ and $\boldsymbol{\Pi}_{\Delta^r}$ act on each row of their matrix arguments. While $\|\boldsymbol{H}^{k+1}-\boldsymbol{H}^{k}\|_F > \epsilon_1$ or $\|\boldsymbol{W}^{k+1}-\boldsymbol{W}^{k}\|_F > \epsilon_2$:

  1. $\widetilde{\boldsymbol{H}}^{k} = \boldsymbol{H}^{k} - \frac{1}{\gamma_1^k}(\boldsymbol{W}^k)^{\mathsf{T}}\big(\boldsymbol{W}^k\boldsymbol{H}^k - \boldsymbol{X}\big)$;

  2. $\boldsymbol{H}^{k+1} = \widetilde{\boldsymbol{H}}^{k} - \frac{\lambda}{\lambda + \gamma_1^k}\Big(\widetilde{\boldsymbol{H}}^{k} - \boldsymbol{\Pi}_{\mathrm{conv}(\boldsymbol{X})}\big(\widetilde{\boldsymbol{H}}^{k}\big)\Big)$;

  3. $\boldsymbol{W}^{k+1} = \boldsymbol{\Pi}_{\Delta^r}\bigg(\boldsymbol{W}^k - \frac{1}{\gamma_2^k}\Big(\boldsymbol{W}^k\boldsymbol{H}^{k+1}-\boldsymbol{X}\Big)(\boldsymbol{H}^{k+1})^{\mathsf{T}}\bigg)$;

  4. k := k+1.
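
Putting the pieces together, one PALM loop might look like the following sketch, reusing `project_simplex` and `proj_conv` from above. Setting the step sizes $\gamma_1^k, \gamma_2^k$ to the usual Lipschitz constants of the block gradients is our choice here, not necessarily the reference implementation's.

```python
def palm(X, H0, W0, lam, eps1=1e-6, eps2=1e-6, max_iter=1000):
    """PALM iterations for the archetypal cost (a sketch of steps 1-4 above)."""
    H, W = H0.copy(), W0.copy()
    for _ in range(max_iter):
        gamma1 = np.linalg.norm(W, 2) ** 2 + 1e-12        # Lipschitz const., H-block
        H_t = H - (W.T @ (W @ H - X)) / gamma1            # step 1
        Pi_H = np.stack([proj_conv(h, X) for h in H_t])   # row-wise projection
        H_new = H_t - lam / (lam + gamma1) * (H_t - Pi_H) # step 2
        gamma2 = np.linalg.norm(H_new, 2) ** 2 + 1e-12    # Lipschitz const., W-block
        G = W - ((W @ H_new - X) @ H_new.T) / gamma2
        W_new = np.stack([project_simplex(g) for g in G]) # step 3
        done = (np.linalg.norm(H_new - H) <= eps1
                and np.linalg.norm(W_new - W) <= eps2)
        H, W = H_new, W_new                               # step 4
        if done:
            break
    return H, W
```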

In our code, we have also provided an accelerated version of the PALM iterations, employing the monotone acceleration technique used by Beck and Teboulle in MFISTA and the extension of this technique by Li and Lin. Our code gives the user the option to choose the parameter $\lambda$ or to let the algorithm pick an appropriate value using a data-driven procedure, explained below.
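
For intuition only, a much-simplified version of that idea wraps each PALM pass in a FISTA-style extrapolation with a monotone safeguard that keeps the best iterate found so far. The sketch below is our own simplification in the spirit of MFISTA, not the authors' accelerated code; it assumes the `objective` and `palm` helpers from the earlier sketches.

```python
def accelerated_palm(X, H0, W0, lam, max_iter=500):
    """Extrapolated PALM passes with a monotone safeguard -- a rough sketch."""
    def palm_pass(H, W):
        return palm(X, H, W, lam, max_iter=1)     # one pass of steps 1-3
    H_prev, W_prev = H0, W0
    H_best, W_best = palm_pass(H0, W0)
    t = 1.0
    for _ in range(max_iter):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next                 # FISTA extrapolation weight
        H_c, W_c = palm_pass(H_best + beta * (H_best - H_prev),
                             W_best + beta * (W_best - W_prev))
        H_prev, W_prev, t = H_best, W_best, t_next
        # Monotone safeguard: accept the candidate only if the cost decreases.
        if objective(X, H_c, lam) <= objective(X, H_best, lam):
            H_best, W_best = H_c, W_c
    return H_best, W_best
```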

How we choose the parameter $\lambda$:

Let $\boldsymbol{X} = \boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{V}^{\mathsf{T}}$ be the singular value decomposition of the data matrix, with $\Sigma_{1,1}\geq \Sigma_{2,2} \geq \dots \geq \Sigma_{n,n}$. Taking $\widehat{\boldsymbol{\Sigma}}^{(r)}$ such that $\widehat{\Sigma}_{i,i}^{(r)} = \Sigma_{i,i}$ for $i \leq r$ and $\widehat{\Sigma}_{i,i}^{(r)} = 0$ for $i \geq r+1$, we let

$$\widehat{\boldsymbol{X}}^{(r)}\equiv \boldsymbol{U}\widehat{\boldsymbol{\Sigma}}^{(r)}\boldsymbol{V}^{\mathsf{T}}$$

be the rank-$r$ PCA approximation of $\boldsymbol{X}$ and

$$\mathcal{D}^{(r)}_{\mathrm{pca}} \equiv \big\|\boldsymbol{X} - \widehat{\boldsymbol{X}}^{(r)}\big\|_F$$

be its corresponding error. Starting from $\lambda = \lambda_0$, we increase the value of $\lambda$ over a grid of values until we reach $\lambda^*$, the smallest $\lambda$ such that

$$\mathcal{D}\big(\boldsymbol{X}, \widehat{\boldsymbol{H}}_\lambda\big) - \mathcal{D}^{(r)}_{\mathrm{pca}} \geq c \big(\mathcal{D}\big(\boldsymbol{X}, \widehat{\boldsymbol{H}}_{\lambda_0}\big) - \mathcal{D}^{(r)}_{\mathrm{pca}}\big)$$

for some constant $c>1$, which the user can choose; the default in the code is $c = 1.2$.
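
A sketch of this selection loop follows, reusing the helpers defined above. We read $\mathcal{D}(\boldsymbol{X}, \widehat{\boldsymbol{H}}_\lambda)$ as the Frobenius fitting error $\|\boldsymbol{X} - \widehat{\boldsymbol{W}}_\lambda\widehat{\boldsymbol{H}}_\lambda\|_F$ to match $\mathcal{D}^{(r)}_{\mathrm{pca}}$; that reading, the `choose_lambda` name, and the grid interface are our assumptions.

```python
def choose_lambda(X, r, lambdas, c=1.2):
    """Pick the smallest lambda on an increasing grid (starting at lambda_0)
    that satisfies the criterion above."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    D_pca = np.linalg.norm(X - (U[:, :r] * S[:r]) @ Vt[:r])  # rank-r PCA error
    H0 = spa_init(X, r)
    W0 = np.stack([simplex_lsq(x, H0) for x in X])

    def fit_error(lam):
        H, W = palm(X, H0, W0, lam)
        return np.linalg.norm(X - W @ H)

    gap0 = fit_error(lambdas[0]) - D_pca        # baseline gap at lambda_0
    for lam in lambdas[1:]:
        if fit_error(lam) - D_pca >= c * gap0:
            return lam
    return lambdas[-1]  # criterion never triggered on this grid
```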