Submitted by antodima t3_10oxy9j in MachineLearning

Hi all!

Given inputs X ∈ ℝ^(Nx×T) and targets Y ∈ ℝ^(Ny×T) (T samples as columns), and β ∈ ℝ^(+), the ridge regression solution is

W = YX^(T)(XX^(T)+βI)^(-1) (the inverse can be computed with the Moore–Penrose pseudoinverse)

where A = YX^(T) and B = XX^(T)+βI, so W = AB^(-1).
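
A minimal NumPy sketch of this closed form (the sizes and β below are just placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
Nx, Ny, T, beta = 5, 2, 100, 1e-2          # placeholder dimensions
X = rng.standard_normal((Nx, T))            # inputs, one sample per column
Y = rng.standard_normal((Ny, T))            # targets

A = Y @ X.T                                 # Ny x Nx
B = X @ X.T + beta * np.eye(Nx)             # Nx x Nx, full rank thanks to beta*I
W = A @ np.linalg.inv(B)                    # ridge solution, Ny x Nx
```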

Now suppose we select an arbitrary subset of indices/units (fewer than Nx), so that we keep only the corresponding columns of A and the corresponding rows and columns ("crosses") of B; the rest of A and B are set to zero.

Does sparsifying A and B in this way break the ridge regression solution W = AB^(-1)? If yes, are there ways to avoid it?

Many thanks!


Comments


thevillagersid t1_j6kvatk wrote

Are you asking about the feasibility of ridge regression with sparse inputs, or about regularization to enforce a sparse solution?


antodima OP t1_j6m5aqd wrote

Basically it's the feasibility of ridge regression with sparse inputs, but I want to select a partial set of units of W by acting on A and B. For instance, if A is 2×5 and B is 5×5 and I choose units 2 and 4, then columns [0, 1, 3] of A are zeros and the rows and columns of B with indices [0, 1, 3] are also zero. I select units 2 and 4 with some importance mechanism. The question is: is there a way of obtaining a W* from the filtered A and B that is similar to the W computed without filtering A and B?

I ask because filtering A and B breaks the inversion, and hence the computation of W. I don't know if there is some way of decomposing B so that it can still be inverted, or something like that.
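
Here is a small sketch of what I mean, with made-up data (sizes as in the example above): zeroing the crosses of B leaves zero rows, so B can no longer be inverted:

```python
import numpy as np

rng = np.random.default_rng(0)
Nx, Ny, T, beta = 5, 2, 100, 1e-2
X = rng.standard_normal((Nx, T))
Y = rng.standard_normal((Ny, T))

A = Y @ X.T                                   # 2 x 5
B = X @ X.T + beta * np.eye(Nx)               # 5 x 5

keep = np.array([2, 4])                       # units chosen by the importance mechanism
drop = np.setdiff1d(np.arange(Nx), keep)      # [0, 1, 3]

A_f = A.copy(); A_f[:, drop] = 0.0                         # zero columns of A
B_f = B.copy(); B_f[drop, :] = 0.0; B_f[:, drop] = 0.0     # zero crosses of B

print(np.linalg.matrix_rank(B_f))             # 2, not 5: B_f is singular
# np.linalg.inv(B_f)                          # would raise LinAlgError
```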

Anyway thanks for your interest!


thevillagersid t1_j6nfjle wrote

You can still compute the estimator with sparse inputs because the regularization term ensures the denominator is full rank. If the zeros are standing in for missing values, however, your estimates will be biased.

As for your second question, W* computed from only columns 2 and 4 will only yield the same values as W in the unrestricted model if the columns of X are orthogonal. Could you work with an orthogonal transform (e.g. PCA projection) of the X matrix?
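
A rough numerical illustration of this point (made-up data; `idx`, `Z`, and the helper below are just placeholder names): the restricted solution matches the corresponding columns of the full W once the unit directions of X are orthogonal, i.e. once XX^(T) is diagonal, e.g. after an SVD/PCA-style projection:

```python
import numpy as np

rng = np.random.default_rng(0)
Nx, Ny, T, beta = 5, 2, 100, 1e-2
X = rng.standard_normal((Nx, T))               # units as rows, samples as columns
Y = rng.standard_normal((Ny, T))
idx = [2, 4]                                   # the selected units

def ridge(X, Y, beta):
    A = Y @ X.T
    B = X @ X.T + beta * np.eye(X.shape[0])
    return A, B, A @ np.linalg.inv(B)

# raw X: XX^T is not diagonal, so the restricted W* differs from the full W
A, B, W = ridge(X, Y, beta)
W_star = A[:, idx] @ np.linalg.inv(B[np.ix_(idx, idx)])
print(np.allclose(W_star, W[:, idx]))          # generally False

# project onto the left singular vectors (a PCA-style transform): ZZ^T is diagonal
U, _, _ = np.linalg.svd(X, full_matrices=False)
Z = U.T @ X
A_z, B_z, W_z = ridge(Z, Y, beta)
W_z_star = A_z[:, idx] @ np.linalg.inv(B_z[np.ix_(idx, idx)])
print(np.allclose(W_z_star, W_z[:, idx]))      # True: the crosses of B_z decouple
```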
