Unimodal distribution

The following example demonstrates the statistical learning based determination of the nuclear shielding tensor parameters from a one-dimensional cross-section of a magic-angle flipping (MAF) spectrum. In this example, we use a synthetic MAF lineshape from a unimodal tensor distribution.

Before getting started

Import all relevant packages.

import csdmpy as cp
import matplotlib.pyplot as plt
import numpy as np

from mrinversion.kernel.nmr import ShieldingPALineshape
from mrinversion.linear_model import SmoothLasso, SmoothLassoCV, TSVDCompression
from mrinversion.utils import get_polar_grids

# Setup for the matplotlib figures


# function for 2D x-y plot.
def plot2D(ax, csdm_object, title=""):
    # convert the dimension coordinates of the csdm_object from Hz to pmm.
    _ = [item.to("ppm", "nmr_frequency_ratio") for item in csdm_object.dimensions]

    levels = (np.arange(9) + 1) / 10
    plt.figure(figsize=(4.5, 3.5))
    ax.contourf(csdm_object, cmap="gist_ncar", levels=levels)
    ax.grid(None)
    ax.set_title(title)
    get_polar_grids(ax)
    ax.set_aspect("equal")

Dataset setup

Import the dataset

Load the dataset. Here, we import the dataset as a CSDM data-object.

# the 1D MAF cross-section data in csdm format
filename = "https://osu.box.com/shared/static/puxfgdh25rru1q3li124anylkgup8rdp.csdf"
data_object = cp.load(filename)

# convert the data dimension from `Hz` to `ppm`.
data_object.dimensions[0].to("ppm", "nmr_frequency_ratio")

The variable data_object holds the 1D MAF cross-section. For comparison, let’s also import the true tensor parameter distribution from which the synthetic 1D pure anisotropic MAF cross-section line-shape is simulated.

datafile = "https://osu.box.com/shared/static/s5wpm26w4cv3w64qjhouqu458ch4z0nd.csdf"
true_data_object = cp.load(datafile)

The plot of the 1D MAF cross-section along with the 2D true tensor parameter distribution of the synthetic dataset is shown below.

# the plot of the 1D MAF cross-section dataset.
_, ax = plt.subplots(1, 2, figsize=(9, 3.5), subplot_kw={"projection": "csdm"})
ax[0].plot(data_object)
ax[0].invert_xaxis()

# the plot of the true tensor distribution.
plot2D(ax[1], true_data_object, title="True distribution")
plt.tight_layout()
plt.show()
  • True distribution
  • plot 1D 3 MAF

Linear Inversion setup

Dimension setup

Anisotropic-dimension: The dimension of the dataset that holds the pure anisotropic frequency contributions, which in this case, is the only dimension.

x-y dimensions: The two inverse dimensions corresponding to the x and y-axis of the x-y grid.

inverse_dimension = [
    cp.LinearDimension(count=25, increment="370 Hz", label="x"),  # the `x`-dimension.
    cp.LinearDimension(count=25, increment="370 Hz", label="y"),  # the `y`-dimension.
]

Generating the kernel

For MAF datasets, the line-shape kernel corresponds to the pure nuclear shielding anisotropy line-shapes. Use the ShieldingPALineshape class to generate a shielding line-shape kernel.

lineshape = ShieldingPALineshape(
    anisotropic_dimension=anisotropic_dimension,
    inverse_dimension=inverse_dimension,
    channel="29Si",
    magnetic_flux_density="9.4 T",
    rotor_angle="90 deg",
    rotor_frequency="14 kHz",
    number_of_sidebands=4,
)

Here, lineshape is an instance of the ShieldingPALineshape class. The required arguments of this class are the anisotropic_dimension, inverse_dimension, and channel. We have already defined the first two arguments in the previous sub-section. The value of the channel argument is the nucleus observed in the MAF experiment. In this example, this value is ‘29Si’. The remaining arguments, such as the magnetic_flux_density, rotor_angle, and rotor_frequency, are set to match the conditions under which the spectrum was acquired. The value of the number_of_sidebands argument is the number of sidebands calculated for each line-shape within the kernel.

Once the ShieldingPALineshape instance is created, use the kernel() method of the instance to generate the MAF line-shape kernel.

K = lineshape.kernel(supersampling=1)

Data Compression

Data compression is optional but recommended. It may reduce the size of the inverse problem and, thus, further computation time.

Out:

compression factor = 1.5737704918032787
truncation_index = 61

Solving the inverse problem

Smooth-LASSO problem

Solve the smooth-lasso problem. You may choose to skip this step and proceed to the statistical learning method. Usually, the statistical learning method is a time-consuming process that solves the smooth-lasso problem over a range of predefined hyperparameters. If you are unsure what range of hyperparameters to use, you can use this step for a quick look into the possible solution by giving a guess value for the \(\alpha\) and \(\lambda\) hyperparameters, and then decide on the hyperparameters range accordingly.

# guess alpha and lambda values.
s_lasso = SmoothLasso(alpha=5e-5, lambda1=5e-6, inverse_dimension=inverse_dimension)
s_lasso.fit(K=compressed_K, s=compressed_s)
f_sol = s_lasso.f

Here, f_sol is the solution corresponding to hyperparameters \(\alpha=5\times10^{-5}\) and \(\lambda=5\times 10^{-6}\). The plot of this solution is

_, ax = plt.subplots(1, 2, figsize=(9, 3.5), subplot_kw={"projection": "csdm"})

# the plot of the guess tensor distribution solution.
plot2D(ax[0], f_sol / f_sol.max(), title="Guess distribution")

# the plot of the true tensor distribution.
plot2D(ax[1], true_data_object, title="True distribution")
plt.tight_layout()
plt.show()
  • Guess distribution, True distribution
  • plot 1D 3 MAF
  • plot 1D 3 MAF

Predicted spectrum

You may also evaluate the predicted spectrum from the above solution following

residuals = s_lasso.residuals(K, data_object)
predicted_spectrum = data_object - residuals

plt.figure(figsize=(4, 3))
plt.subplot(projection="csdm")
plt.plot(data_object, color="black", label="spectrum")  # the original spectrum
plt.plot(predicted_spectrum, color="red", label="prediction")  # the predicted spectrum
plt.gca().invert_xaxis()
plt.legend()
plt.tight_layout()
plt.show()
plot 1D 3 MAF

As you can see from the predicted spectrum, our guess isn’t far from the optimum hyperparameters. Let’s create a search grid about the guess hyperparameters and run a cross-validation method for selection.

Statistical learning of the tensors

Smooth LASSO cross-validation

Create a guess range of values for the \(\alpha\) and \(\lambda\) hyperparameters. The following code generates a range of \(\lambda\) and \(\alpha\) values that are uniformly sampled on the log scale.

lambdas = 10 ** (-5.2 - 1 * (np.arange(6) / 5))
alphas = 10 ** (-4 - 2 * (np.arange(6) / 5))

# set up cross validation smooth lasso method
s_lasso_cv = SmoothLassoCV(
    alphas=alphas,
    lambdas=lambdas,
    inverse_dimension=inverse_dimension,
    sigma=0.005,
    folds=10,
)
# run the fit using the compressed kernel and compressed signal.
s_lasso_cv.fit(compressed_K, compressed_s)

The optimum hyper-parameters

Use the hyperparameters attribute of the instance for the optimum hyper-parameters, \(\alpha\) and \(\lambda\), determined from the cross-validation.

print(s_lasso_cv.hyperparameters)

Out:

{'alpha': 6.30957344480193e-06, 'lambda': 1.584893192461114e-06}

The cross-validation surface

Optionally, you may want to visualize the cross-validation error curve/surface. Use the cross_validation_curve attribute of the instance, as follows. The cross-validation metric is the mean square error (MSE).

cv_curve = s_lasso_cv.cross_validation_curve

# plot of the cross-validation curve
plt.figure(figsize=(5, 3.5))
ax = plt.subplot(projection="csdm")
ax.contour(np.log10(s_lasso_cv.cross_validation_curve), levels=25)
ax.scatter(
    -np.log10(s_lasso_cv.hyperparameters["alpha"]),
    -np.log10(s_lasso_cv.hyperparameters["lambda"]),
    marker="x",
    color="k",
)
plt.tight_layout(pad=0.5)
plt.show()
plot 1D 3 MAF

The optimum solution

The f attribute of the instance holds the solution.

The corresponding plot of the solution, along with the true tensor distribution, is shown below.

_, ax = plt.subplots(1, 2, figsize=(9, 3.5), subplot_kw={"projection": "csdm"})

# the plot of the tensor distribution solution.
plot2D(ax[0], f_sol / f_sol.max(), title="Optimum distribution")

# the plot of the true tensor distribution.
plot2D(ax[1], true_data_object, title="True distribution")
plt.tight_layout()
plt.show()
  • Optimum distribution, True distribution
  • plot 1D 3 MAF
  • plot 1D 3 MAF

Total running time of the script: ( 0 minutes 19.771 seconds)

Gallery generated by Sphinx-Gallery