Combining unsupervised dimension reduction with sufficient dimension reduction
Keywords: unsupervised dimension reduction, sufficient dimension reduction, complex measures, hybrid setting

Abstract
We present a new method for dimension reduction that combines unsupervised dimension reduction (UDR) with sufficient dimension reduction (SDR). In UDR the goal is to find a low-dimensional linear subspace that approximates the support of the data distribution. If the data are supervised, then in SDR the goal is to find a low-dimensional linear subspace, called the effective subspace, such that the projection of an input vector onto that subspace captures as much information as possible about the dependence between the input and the output.
The objective that we suggest to minimize consists of two parts. The first is responsible for the UDR part: it forces a low-dimensional probability measure \(\mu\) to approximate the distribution over inputs. The second is responsible for the SDR part: it forces a regression function \(f\) to be consistent with the supervised data. Additionally, we require the support of \(\mu\) and the effective subspace of \(f\) to coincide. In this hybrid setting we solve the two problems, UDR and SDR, jointly, so that the UDR term serves as a regularizer for the SDR term.
We reformulate the problem as the optimization task of finding a \(k\)-dimensional linear subspace \(S\) and a pair of complex measures \((\mu,\mu')\) supported in \(S\). Instead of optimizing over complex measures, we suggest minimizing over ordinary functions \((g_1,g_2)\), but with an additional term \(R\) that penalizes the deviation of the common support of \(g_1\) and \(g_2\) from a \(k\)-dimensional linear subspace. The algorithm that we develop can be formulated for the functions \((g_1,g_2)\) as well as for their inverse Fourier transforms. Finally, we report results of numerical experiments on well-known datasets.
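To illustrate the hybrid setting, the following is a minimal toy sketch of a two-term objective of this shape: a UDR term measuring how well a candidate subspace \(S\) (spanned by an orthonormal basis `W`) approximates the support of the input distribution, plus an SDR term measuring how well a regression function restricted to \(S\) fits the supervised data. The names `hybrid_objective`, `g`, and the weight `lam` are hypothetical illustrations, not the paper's actual formulation (which works with complex measures and a support penalty \(R\)).

```python
import numpy as np

def hybrid_objective(W, X, y, g, lam=1.0):
    """Toy two-term objective combining UDR and SDR (hypothetical sketch).

    W   : (d, k) orthonormal basis of the candidate subspace S.
    X   : (n, d) inputs; y : (n,) outputs.
    g   : regression function acting on inputs projected onto S
          (a stand-in for f restricted to its effective subspace).
    lam : weight trading off the SDR fit against the UDR regularizer.
    """
    P = W @ W.T                                   # orthogonal projector onto S
    Z = X @ P                                     # projections of the inputs onto S
    udr = np.mean(np.sum((X - Z) ** 2, axis=1))   # UDR: S should approximate the support of the inputs
    sdr = np.mean((y - g(X @ W)) ** 2)            # SDR: f must be consistent with the supervised data
    return udr + lam * sdr
```

On synthetic data concentrated near a one-dimensional subspace, the objective is smaller at the true subspace than at a random one, which is the behavior the regularized formulation relies on.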


