PLS regression example
In this example, we perform a PLS regression to predict the moisture of corn samples from their NIR spectra.
Import the spectrochempy API package
import spectrochempy as scp
The data set is available to download from the Eigenvector Archive:
ds_list = scp.read("http://www.eigenvector.com/data/Corn/corn.mat", merge=False)
ds_list_names = [f"{i} : {ds.name}({ds.shape})" for i, ds in enumerate(ds_list)]
print(ds_list_names)
['0 : m5nbs((3, 700))', '1 : mp5nbs((4, 700))', '2 : mp6nbs((4, 700))', '3 : propvals((80, 4))', '4 : m5spec((80, 700))', '5 : mp5spec((80, 700))', '6 : mp6spec((80, 700))']
This data set, originally taken at Cargill, consists of 80 samples of corn measured on
3 different NIR spectrometers. The moisture, oil, protein and starch
values for each of the samples are also included.
The 5th dataset, named 'm5spec', contains the NIR spectra of the 80 corn samples recorded
on the same instrument. Let’s assign this NDDataset to X, add some
information and plot it:
X = ds_list[4]
X.plot()

The values of the properties we want to predict are in the 4th dataset, named 'propvals':
Y = ds_list[3]
Y.T.plot(cmap=None, legend=Y.x.labels)

We are interested in predicting the moisture content.
First we select the first 57 samples (about 70% of the 80 samples) to train/calibrate the model and keep the remaining 23 to test/validate it:
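The sequential split described above can be sketched on stand-in data. This is only an illustration of the slicing: the names `X_demo`, `y_demo` and their values are hypothetical and are not taken from the corn data set.

```python
# Stand-in data with the same number of rows (80) as the corn spectra.
# Plain Python lists are used so the sketch runs without any download.
n_samples = 80
n_train = 57  # number of calibration samples used in this example

X_demo = [[float(i), float(i) ** 2] for i in range(n_samples)]  # hypothetical features
y_demo = [2.0 * i for i in range(n_samples)]  # hypothetical target values

# Sequential split: the first n_train rows calibrate, the rest validate.
X_train_demo, X_test_demo = X_demo[:n_train], X_demo[n_train:]
y_train_demo, y_test_demo = y_demo[:n_train], y_demo[n_train:]

train_fraction = len(X_train_demo) / n_samples
print(len(X_train_demo), len(X_test_demo), round(train_fraction, 2))
```

With the NDDatasets of this example, the same slices would presumably read `X[:57]`, `X[57:]`, `y[:57]` and `y[57:]`, where `y` is assumed to be the moisture column of `Y`. A sequential split keeps the example reproducible, but it assumes the samples are not ordered in a way that biases either subset.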
Then we create a PLSRegression object and fit it to the training datasets:
pls = scp.PLSRegression(n_components=5)
pls.fit(X_train, y_train)
<spectrochempy.analysis.crossdecomposition.pls.PLSRegression object at 0x7fcff2005400>
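To give a feel for what such a fit computes, here is a minimal pure-Python sketch of single-response PLS (PLS1) with a NIPALS-style loop. It is a textbook illustration under simplifying assumptions (centering only, no scaling) and is not SpectroChemPy's implementation; the helper names `dot`, `pls1_fit`, `pls1_predict` and the toy data are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_components):
    """Fit an illustrative PLS1 model; returns (x_mean, y_mean, components)."""
    n, m = len(X), len(X[0])
    # Centre the predictors and the response.
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    f = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weights: direction in predictor space with maximal covariance with f.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(m)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:
            break  # the residual response is no longer correlated with E
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]  # scores of each sample
        tt = dot(t, t)
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]  # X loadings
        q = dot(f, t) / tt  # y loading
        # Deflate: remove what this component explains from E and f.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one sample by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat


# Toy data following an exact linear relation y = 3*x1 - 2*x2 + 1.
X_toy = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0], [6.0, 4.0]]
y_toy = [3.0 * a - 2.0 * b + 1.0 for a, b in X_toy]
model = pls1_fit(X_toy, y_toy, n_components=2)
prediction = pls1_predict(model, [7.0, 2.0])  # the linear relation gives 18
print(round(prediction, 6))
```

With as many components as the rank of the predictors, PLS1 reproduces the ordinary least-squares fit, which is why the toy relation is recovered here. With spectra, far fewer components than wavelengths are normally kept, as with `n_components=5` above.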
Finally we generate a parity plot comparing the predicted and actual values, for both the training set and the test set.
ax = pls.parityplot(label="calibration", s=150)
pls.parityplot(
    y_test, pls.predict(X_test), s=150, c="red", label="validation", clear=False
)
ax.legend(loc="lower right")

<matplotlib.legend.Legend object at 0x7fcff199efd0>
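A parity plot is read visually, but the same comparison can also be summarized with numbers. The sketch below computes the root-mean-square error and the coefficient of determination for actual/predicted pairs in plain Python. The functions `rmse` and `r2` are hypothetical helpers, and the values are made up for illustration; they are not results from this example.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root-mean-square error between actual and predicted values."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r2(actual, predicted):
    """Coefficient of determination (1 means a perfect parity line)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, NOT taken from the corn data set.
actual_demo = [10.0, 10.5, 11.0, 9.5]
predicted_demo = [10.1, 10.4, 10.8, 9.6]

print(round(rmse(actual_demo, predicted_demo), 4))
print(round(r2(actual_demo, predicted_demo), 4))
```

Computing these figures separately for the calibration and validation sets gives a quick check for overfitting: a validation error much larger than the calibration error suggests too many components.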
This ends the example! The following line can be uncommented if no plot shows when running the .py script with python:
# scp.show()