# Smooth Lasso cross-validation

class mrinversion.linear_model.SmoothLassoCV(alphas, lambdas, inverse_dimension, folds=10, max_iterations=10000, tolerance=1e-05, positive=True, sigma=0.0, randomize=False, times=2, verbose=False, n_jobs=-1, method='gradient_decent')

Bases: mrinversion.linear_model._base_l1l2.GeneralL2LassoCV

A linear model trained with combined l1 and l2 priors as the regularizer. The method minimizes the objective function,

(14)$\| {\bf Kf - s} \|^2_2 + \alpha \sum_{i=1}^{d} \| {\bf J}_i {\bf f} \|_2^2 + \lambda \| {\bf f} \|_1 ,$

where $${\bf K} \in \mathbb{R}^{m \times n}$$ is the kernel, $${\bf s} \in \mathbb{R}^{m \times m_\text{count}}$$ is the known signal containing noise, and $${\bf f} \in \mathbb{R}^{n \times m_\text{count}}$$ is the desired solution. The parameters, $$\alpha$$ and $$\lambda$$, are the hyperparameters controlling the smoothness and sparsity of the solution $${\bf f}$$. The matrix $${\bf J}_i$$ is given as

(15)${\bf J}_i = {\bf I}_{n_1} \otimes \cdots \otimes {\bf A}_{n_i} \otimes \cdots \otimes {\bf I}_{n_{d}},$

where $${\bf I}_{n_i} \in \mathbb{R}^{n_i \times n_i}$$ is the identity matrix,

(16)$\begin{split}{\bf A}_{n_i} = \left(\begin{array}{ccccc} 1 & -1 & 0 & \cdots & \vdots \\ 0 & 1 & -1 & \cdots & \vdots \\ \vdots & \vdots & \vdots & \vdots & 0 \\ 0 & \cdots & 0 & 1 & -1 \end{array}\right) \in \mathbb{R}^{(n_i-1)\times n_i},\end{split}$

and the symbol $$\otimes$$ is the Kronecker product. The terms, $$\left(n_1, n_2, \cdots, n_d\right)$$, are the number of points along the respective dimensions, with the constraint that $$\prod_{i=1}^{d}n_i = n$$, where $$d$$ is the total number of dimensions.
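The operators in Eqs. (15) and (16) can be assembled directly from Kronecker products. The following is a minimal NumPy sketch, not the library's implementation: the helper names `difference_matrix`, `smoothing_operator`, and `objective` are hypothetical, the solution is assumed to be flattened in row-major order with $$n_1$$ as the slowest axis, and the matrix norms in Eq. (14) are assumed to be taken entrywise.

```python
import numpy as np


def difference_matrix(n):
    """Return the (n - 1) x n first-difference matrix A_n of Eq. (16)."""
    # Ones on the main diagonal, minus ones on the first superdiagonal.
    return np.eye(n - 1, n) - np.eye(n - 1, n, k=1)


def smoothing_operator(shape, i):
    """Return J_i of Eq. (15) for a grid with shape (n_1, ..., n_d)."""
    J = np.ones((1, 1))
    for axis, n in enumerate(shape):
        # A_n on the i-th dimension, identity on every other dimension.
        factor = difference_matrix(n) if axis == i else np.eye(n)
        J = np.kron(J, factor)
    return J


def objective(K, s, f, shape, alpha, lam):
    """Evaluate Eq. (14) for a candidate solution f of shape (n, m_count)."""
    misfit = np.sum((K @ f - s) ** 2)
    smooth = sum(
        np.sum((smoothing_operator(shape, i) @ f) ** 2)
        for i in range(len(shape))
    )
    sparse = np.sum(np.abs(f))
    return misfit + alpha * smooth + lam * sparse


shape = (3, 4)                      # n_1 = 3, n_2 = 4, so n = 12
J0 = smoothing_operator(shape, 0)   # shape (2 * 4, 12)
J1 = smoothing_operator(shape, 1)   # shape (3 * 3, 12)
```

A constant solution has zero smoothness penalty, since every row of $${\bf A}_{n_i}$$ sums to zero. Dense matrices are used here for readability only; for realistic grid sizes a sparse representation would be the natural choice.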

The cross-validation is carried out using a stratified splitting of the signal.
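The text does not spell out how the stratified split is built. One plausible reading for an ordered signal is an interleaved assignment, where fold $$k$$ tests on every `folds`-th sample starting at index $$k$$, so each fold spans the full range of the signal. The sketch below illustrates that reading only; the function name `interleaved_folds` is hypothetical and the scheme is an assumption, not the documented behavior of the class.

```python
import numpy as np


def interleaved_folds(m, folds):
    """Split indices 0..m-1 into `folds` test sets, taking every folds-th index."""
    indices = np.arange(m)
    return [indices[k::folds] for k in range(folds)]


m = 10
test_sets = interleaved_folds(m, folds=3)

# The training set of each fold is every index not held out for testing.
train_sets = [np.setdiff1d(np.arange(m), test) for test in test_sets]
```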

Parameters
• alphas (ndarray) – A list of $$\alpha$$ hyperparameters.

• lambdas (ndarray) – A list of $$\lambda$$ hyperparameters.

• inverse_dimension (list) – A list of csdmpy Dimension objects representing the inverse space.

• folds (int) – The number of folds used in cross-validation. The default is 10.

• max_iterations (int) – The maximum number of iterations allowed when solving the problem. The default value is 10000.

• tolerance (float) – The tolerance at which the solution is considered converged. The default value is 1e-5.

• positive (bool) – If True, the amplitudes in the solution, $${\bf f}$$, are constrained to positive values only; otherwise the solution may contain both positive and negative amplitudes. The default is True.

• sigma (float) – The standard deviation of the noise in the signal. The default is 0.0.

• randomize (bool) – If True, the folds are created by randomly assigning the samples to each fold. If False, stratified sampling is used to generate the folds. The default is False.

• times (int) – The number of times the randomized n-folds are created. Only applicable when randomize is True. The default is 2.

• verbose (bool) – If True, prints progress during the computation. The default is False.

• n_jobs (int) – The number of CPUs used for computation. The default is -1, that is, all available CPUs are used.
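The `alphas` and `lambdas` arrays define the grid of candidate models that the cross-validation searches. A log-spaced grid is a common choice for regularization strengths. The sketch below builds one with NumPy; the bounds and the number of points are hypothetical and should be chosen for the data at hand.

```python
import numpy as np

# Hypothetical search ranges, log-spaced over several decades.
alphas = np.logspace(-7, -4, num=10)
lambdas = np.logspace(-6, -4, num=10)

# Every (alpha, lambda) pair is one candidate model: 10 x 10 = 100 in total.
grid = [(alpha, lam) for alpha in alphas for lam in lambdas]
```

Following the class signature above, these arrays would be passed as the `alphas` and `lambdas` arguments together with the `inverse_dimension` list. Each pair is fitted once per fold, so the cost of the search grows with the product of both array lengths and `folds`.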

f

An ndarray of shape (m_count, nd, …, n1, n0). The solution, $${\bf f} \in \mathbb{R}^{m_\text{count} \times n_d \times \cdots \times n_1 \times n_0}$$ or an equivalent CSDM object.

Type

ndarray or CSDM object.

n_iter

The number of iterations required to reach the specified tolerance.

Type

int.

hyperparameters

A dictionary with the $$\alpha$$ and $$\lambda$$ hyperparameters.

Type

dict.

cross_validation_curve

The cross-validation error metric determined as the mean square error.

Type

CSDM object.
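The selected hyperparameters correspond to the minimum of the cross-validation error over the $$(\alpha, \lambda)$$ grid. The sketch below shows that selection on a plain NumPy array. The error values are made up for illustration, and the axis order (rows for $$\alpha$$, columns for $$\lambda$$) and the dictionary keys are assumptions rather than the layout of the actual attributes.

```python
import numpy as np

alphas = np.array([1e-6, 1e-5, 1e-4])
lambdas = np.array([1e-6, 1e-5])

# Made-up mean squared errors: rows follow alphas, columns follow lambdas.
cv_error = np.array([
    [0.9, 0.7],
    [0.4, 0.5],
    [0.8, 0.6],
])

# Locate the grid point with the smallest cross-validation error.
i, j = np.unravel_index(np.argmin(cv_error), cv_error.shape)
best = {"alpha": alphas[i], "lambda": lambdas[j]}
```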

Methods Documentation

fit(K, s)

Fit the model using the coordinate descent method from scikit-learn for all alpha and lambda values using the n-folds cross-validation technique. The cross-validation metric is the mean squared error.

Parameters
• K – A $$m \times n$$ kernel matrix, $${\bf K}$$. A numpy array of shape (m, n).

• s – A $$m \times m_\text{count}$$ signal matrix, $${\bf s}$$, as a csdm object or a numpy array of shape (m, m_count).

predict(K)

Predict the signal using the linear model.

Parameters

K – A $$m \times n$$ kernel matrix, $${\bf K}$$. A numpy array of shape (m, n).

Returns

A numpy array of shape (m, m_count) with the predicted values.

residuals(K, s)

Return the residual as the difference between the data and the predicted data (fit), following

(17)$\text{residuals} = {\bf s - Kf^*}$

where $${\bf f^*}$$ is the optimum solution.

Parameters
• K – A $$m \times n$$ kernel matrix, $${\bf K}$$. A numpy array of shape (m, n).

• s – A csdm object or a $$m \times m_\text{count}$$ signal matrix, $${\bf s}$$.

Returns

If s is a csdm object, returns a csdm object with the residuals. If s is a numpy array, returns a $$m \times m_\text{count}$$ residual matrix.
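For numpy inputs, Eq. (17) reduces to a matrix product and a subtraction. The sketch below reproduces that arithmetic with synthetic data. It assumes the solution is held as an $$n \times m_\text{count}$$ matrix, as in the objective function, and the array `f_opt` is a stand-in for $${\bf f^*}$$, not the output of an actual fit.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, m_count = 6, 4, 2

K = rng.normal(size=(m, n))              # kernel matrix, shape (m, n)
f_opt = rng.normal(size=(n, m_count))    # stand-in for the optimum solution f*
noise = 0.01 * rng.normal(size=(m, m_count))
s = K @ f_opt + noise                    # synthetic noisy signal, shape (m, m_count)

predicted = K @ f_opt                    # predicted signal, shape (m, m_count)
residuals = s - predicted                # Eq. (17): s - K f*
```

When the model describes the data well, the residuals should resemble the noise, so their standard deviation gives a rough check against the expected noise level.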