Smooth Lasso cross-validation
- class mrinversion.linear_model.SmoothLassoCV(alphas, lambdas, inverse_dimension, folds=10, max_iterations=10000, tolerance=1e-05, positive=True, sigma=0.0, randomize=False, times=2, verbose=False, n_jobs=-1, method='gradient_decent')
Bases:
GeneralL2LassoCV
The linear model trained with the combined l1 and l2 priors as the regularizer. The method minimizes the objective function,
\[\| {\bf Kf - s} \|^2_2 + \alpha \sum_{i=1}^{d} \| {\bf J}_i {\bf f} \|_2^2 + \lambda \| {\bf f} \|_1 , \tag{18}\]
where \({\bf K} \in \mathbb{R}^{m \times n}\) is the kernel, \({\bf s} \in \mathbb{R}^{m \times m_\text{count}}\) is the known signal containing noise, and \({\bf f} \in \mathbb{R}^{n \times m_\text{count}}\) is the desired solution. The parameters, \(\alpha\) and \(\lambda\), are the hyperparameters controlling the smoothness and sparsity of the solution \({\bf f}\). The matrix \({\bf J}_i\) is given as
\[{\bf J}_i = {\bf I}_{n_1} \otimes \cdots \otimes {\bf A}_{n_i} \otimes \cdots \otimes {\bf I}_{n_{d}}, \tag{19}\]
where \({\bf I}_{n_i} \in \mathbb{R}^{n_i \times n_i}\) is the identity matrix,
\[{\bf A}_{n_i} = \left(\begin{array}{ccccc} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & -1 \end{array}\right) \in \mathbb{R}^{(n_i-1)\times n_i}, \tag{20}\]
and the symbol \(\otimes\) is the Kronecker product. The terms, \(\left(n_1, n_2, \cdots, n_d\right)\), are the number of points along the respective dimensions, with the constraint that \(\prod_{i=1}^{d}n_i = n\), where \(d\) is the total number of dimensions.
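The operators in Eqs. (19) and (20) can be assembled directly from Kronecker products. The following is a minimal NumPy sketch, not the library's implementation: the helper names `first_difference`, `smoothing_operator`, and `objective` are hypothetical, the solution is treated as a single flattened column (\(m_\text{count} = 1\)), and a row-major flattening of the \(\left(n_1, \cdots, n_d\right)\) grid is assumed.

```python
import numpy as np


def first_difference(n):
    """Return the (n-1) x n first-difference matrix A_n of Eq. (20)."""
    A = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    A[idx, idx] = 1.0       # +1 on the main diagonal
    A[idx, idx + 1] = -1.0  # -1 on the first superdiagonal
    return A


def smoothing_operator(shape, i):
    """Return J_i of Eq. (19) for a grid with shape (n_1, ..., n_d)."""
    J = np.ones((1, 1))
    for k, n_k in enumerate(shape):
        # A_{n_i} on the i-th dimension, identity on every other dimension.
        factor = first_difference(n_k) if k == i else np.eye(n_k)
        J = np.kron(J, factor)
    return J


def objective(K, f, s, alpha, lam, shape):
    """Evaluate Eq. (18) for one flattened solution vector f."""
    misfit = np.sum((K @ f - s) ** 2)
    smooth = sum(
        np.sum((smoothing_operator(shape, i) @ f) ** 2)
        for i in range(len(shape))
    )
    sparse = np.sum(np.abs(f))
    return misfit + alpha * smooth + lam * sparse


shape = (3, 4)                     # n_1 = 3, n_2 = 4, so n = 12
J0 = smoothing_operator(shape, 0)  # A_3 kron I_4
J1 = smoothing_operator(shape, 1)  # I_3 kron A_4
```

A constant solution has zero first differences along every dimension, so for such a vector only the data-misfit and \(\ell_1\) terms contribute to the objective.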
The cross-validation is carried out using a stratified splitting of the signal.
- Parameters:
alphas (ndarray) – A list of \(\alpha\) hyperparameters.
lambdas (ndarray) – A list of \(\lambda\) hyperparameters.
inverse_dimension (list) – A list of csdmpy Dimension objects representing the inverse space.
folds (int) – The number of folds used in cross-validation. The default is 10.
max_iterations (int) – The maximum number of iterations allowed when solving the problem. The default value is 10000.
tolerance (float) – The tolerance at which the solution is considered converged. The default value is 1e-5.
positive (bool) – If True, the amplitudes in the solution, \({\bf f}\), are constrained to positive values only; otherwise the solution may contain both positive and negative amplitudes. The default is True.
sigma (float) – The standard deviation of the noise in the signal. The default is 0.0.
randomize (bool) – If True, the folds are created by randomly assigning the samples to each fold. If False, stratified sampling is used to generate the folds. The default is False.
times (int) – The number of times the randomized n-folds are created. Only applicable when randomize is True. The default is 2.
verbose (bool) – If True, prints the progress of the computation. The default is False.
n_jobs (int) – The number of CPUs used for computation. The default is -1, that is, all available CPUs are used.
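To make the effect of the folds and randomize parameters concrete, here is a minimal NumPy sketch of splitting the \(m\) signal rows into folds. It is an illustration under stated assumptions, not the library's code: the helpers `make_folds` and `train_test_rows` are hypothetical, and "stratified" is assumed here to mean an interleaved assignment in which fold \(j\) takes every folds-th row starting at row \(j\).

```python
import numpy as np


def make_folds(m, folds=10, randomize=False, seed=None):
    """Split row indices 0..m-1 into `folds` disjoint test sets."""
    indices = np.arange(m)
    if randomize:
        # Random assignment: shuffle the rows before dealing them out.
        indices = np.random.default_rng(seed).permutation(m)
    # Interleaved assignment (assumed meaning of "stratified" here):
    # fold j receives every `folds`-th index starting at position j.
    return [indices[j::folds] for j in range(folds)]


def train_test_rows(m, test_idx):
    """Return (train, test) row indices for one fold."""
    mask = np.ones(m, dtype=bool)
    mask[test_idx] = False
    return np.flatnonzero(mask), test_idx


test_sets = make_folds(m=25, folds=5)
train_idx, test_idx = train_test_rows(25, test_sets[0])
```

Each fold is held out once as the test set while the model is fit on the remaining rows of \({\bf K}\) and \({\bf s}\); the held-out prediction errors are then averaged.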
- f
An ndarray of shape (m_count, nd, …, n1, n0). The solution, \({\bf f} \in \mathbb{R}^{m_\text{count} \times n_d \times \cdots \times n_1 \times n_0}\), or an equivalent CSDM object.
- Type:
ndarray or CSDM object.
- n_iter
The number of iterations required to reach the specified tolerance.
- Type:
int.
- hyperparameters
A dictionary with the \(\alpha\) and \(\lambda\) hyperparameters.
- Type:
dict.
- cross_validation_curve
The cross-validation error metric, determined as the mean squared error.
- Type:
CSDM object.
Methods Documentation
- fit(K, s)
Fit the model using the coordinate descent method from scikit-learn for all alpha and lambda values using n-fold cross-validation. The cross-validation metric is the mean squared error.
- Parameters:
K – An \(m \times n\) kernel matrix, \({\bf K}\). A numpy array of shape (m, n).
s – An \(m \times m_\text{count}\) signal matrix, \({\bf s}\), as a csdm object or a numpy array of shape (m, m_count).
- predict(K)
Predict the signal using the linear model.
- Parameters:
K – An \(m \times n\) kernel matrix, \({\bf K}\). A numpy array of shape (m, n).
- Returns:
A numpy array of shape (m, m_count) with the predicted values.
- residuals(K, s)
Return the residual as the difference between the data and the predicted data (fit), following
\[\text{residuals} = {\bf s - Kf^*} \tag{21}\]
where \({\bf f^*}\) is the optimum solution.
- Parameters:
K – An \(m \times n\) kernel matrix, \({\bf K}\). A numpy array of shape (m, n).
s – A csdm object or an \(m \times m_\text{count}\) signal matrix, \({\bf s}\).
- Returns:
If s is a csdm object, returns a csdm object with the residuals. If s is a numpy array, returns an \(m \times m_\text{count}\) residual matrix.
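The relation between the predicted signal, the residuals, and the mean squared error used as the cross-validation metric can be checked with plain NumPy arrays. This is a small sketch on synthetic data, assuming the solution is stored as an \(n \times m_\text{count}\) matrix as in the objective function; the array `f_star` is a stand-in for the optimum solution, not output from the library.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, m_count = 6, 4, 2

K = rng.normal(size=(m, n))             # kernel matrix, shape (m, n)
f_star = rng.normal(size=(n, m_count))  # stand-in for the optimum solution
s = K @ f_star + 0.01 * rng.normal(size=(m, m_count))  # noisy signal

predicted = K @ f_star         # predicted signal, shape (m, m_count)
residuals = s - predicted      # residuals = s - K f*
mse = np.mean(residuals ** 2)  # mean squared error of the fit
```

Adding the residuals back to the predicted signal recovers the input signal exactly, which is a quick consistency check when comparing predict and residuals output.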