Chemometric preprocessing

This example demonstrates the standard preprocessing operations available in SpectroChemPy: normalization, mean-centering, autoscaling, Standard Normal Variate (SNV), and Multiplicative Scatter Correction (MSC).

import spectrochempy as scp

Load a stacked IR dataset and select a single spectral region.

dataset = scp.read_omnic("irdata/nh4y-activation.spg")
region = dataset[:, 2200.0:1800.0]

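Normalization rescales each spectrum by a single factor. With method='max' (the default), that factor is taken from the spectrum's maximum along the chosen dimension.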

norm = region.normalize(method="max", dim="x")
_ = norm.plot(title="Max-normalized")
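
To make the arithmetic explicit, here is a minimal plain-Python sketch of max normalization on made-up values. It assumes the factor is the plain maximum of each spectrum; the helper name `normalize_max` and the toy data are hypothetical, and this is not SpectroChemPy's implementation.

```python
# Illustrative sketch only: toy data, hypothetical helper name.
spectra = [
    [0.2, 0.5, 1.0, 0.4],
    [0.1, 0.8, 2.0, 0.6],
]


def normalize_max(rows):
    """Divide every row (one spectrum per row) by its own maximum value."""
    normalized_rows = []
    for row in rows:
        peak = max(row)  # assumption: plain maximum, not maximum absolute value
        normalized_rows.append([value / peak for value in row])
    return normalized_rows


normalized = normalize_max(spectra)
```

After this step the largest value in every row is 1, so spectra of different overall intensity share a common scale.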

Mean-centering subtracts the mean along a chosen dimension. Here we center each spectrum individually (dim='x').

centered = region.center(dim="x")
_ = centered.plot(title="Mean-centered per spectrum")
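
The effect of the chosen dimension can be sketched in plain Python on made-up values: centering along each row (one spectrum per row) versus centering along each column (one variable per column). The helper names `center_rows` and `center_columns` are hypothetical and are not part of SpectroChemPy.

```python
from statistics import fmean

# Illustrative sketch only: toy data, hypothetical helper names.
spectra = [
    [1.0, 2.0, 3.0],
    [4.0, 6.0, 11.0],
]


def center_rows(rows):
    """Subtract each row's own mean (center every spectrum individually)."""
    centered_rows = []
    for row in rows:
        row_mean = fmean(row)
        centered_rows.append([value - row_mean for value in row])
    return centered_rows


def center_columns(rows):
    """Subtract each column's mean (center every variable across spectra)."""
    column_means = [fmean(column) for column in zip(*rows)]
    return [[value - mean for value, mean in zip(row, column_means)] for row in rows]


row_centered = center_rows(spectra)
column_centered = center_columns(spectra)
```

Row centering leaves every spectrum with zero mean, while column centering leaves every variable with zero mean across the set of spectra.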

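Autoscaling mean-centers the data and divides by the standard deviation along the chosen dimension. Applied per variable, this is the classic z-score used before PCA or PLS; here dim='x' applies it to each spectrum individually.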

scaled = region.autoscale(dim="x")
_ = scaled.plot(title="Autoscaled (z-score) per spectrum")
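
As a rough sketch of the z-score arithmetic, the plain-Python snippet below autoscales each row of a made-up dataset. It assumes the population standard deviation (`statistics.pstdev`); the sample standard deviation is another common convention, and which one SpectroChemPy uses is not stated here. The helper name `autoscale` is hypothetical.

```python
from statistics import fmean, pstdev

# Illustrative sketch only: toy data, hypothetical helper name.
spectra = [
    [1.0, 2.0, 3.0, 6.0],
    [10.0, 10.5, 12.0, 11.5],
]


def autoscale(values):
    """Return the z-scores of a sequence: (value - mean) / standard deviation."""
    mean = fmean(values)
    std = pstdev(values)  # assumption: population standard deviation
    return [(value - mean) / std for value in values]


scaled = [autoscale(row) for row in spectra]
```

Each scaled row then has zero mean and unit standard deviation under the same convention.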

Standard Normal Variate (SNV) is a convenience wrapper that autoscales each spectrum individually. It is equivalent to autoscale(dim='x').

snv = region.snv()
_ = snv.plot(title="SNV corrected")
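
One way to see why SNV is useful: a spectrum and a copy of it with a constant offset and a positive gain applied map to the same SNV result. The plain-Python sketch below checks this on made-up values. The helper name `snv` is hypothetical, and the population standard deviation is an assumption; this is not SpectroChemPy's implementation.

```python
import math
from statistics import fmean, pstdev


def snv(spectrum):
    """Center one spectrum on its own mean and scale by its own standard deviation."""
    mean = fmean(spectrum)
    std = pstdev(spectrum)  # assumption: population standard deviation
    return [(value - mean) / std for value in spectrum]


# Toy spectrum and a distorted copy (gain 3, offset 2); values are made up.
base = [0.1, 0.4, 0.9, 0.3]
distorted = [3.0 * value + 2.0 for value in base]

# Both versions should coincide after SNV, up to floating-point rounding.
same_after_snv = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(snv(base), snv(distorted))
)
```

The invariance holds for any offset and any positive gain, which is why SNV is commonly used to suppress per-spectrum baseline shifts and intensity differences.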

Multiplicative Scatter Correction (MSC) removes multiplicative and additive scatter effects by regressing each spectrum against a mean reference spectrum, then subtracting the fitted intercept and dividing by the fitted slope.

msc = region.msc()
_ = msc.plot(title="MSC corrected")
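
The regression step can be sketched in plain Python as follows. The reference is taken to be the mean spectrum, each spectrum is fitted by ordinary least squares as intercept + slope * reference, and the fit is then inverted. The helper name `msc_sketch` and the toy data are hypothetical, and the sketch makes no claim about how SpectroChemPy implements `msc()`.

```python
from statistics import fmean


def msc_sketch(rows):
    """Correct each row against the mean spectrum; return (corrected, reference)."""
    # Reference spectrum: point-by-point mean over all spectra.
    reference = [fmean(column) for column in zip(*rows)]
    ref_mean = fmean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)

    corrected = []
    for row in rows:
        row_mean = fmean(row)
        # Ordinary least squares fit: row ~ intercept + slope * reference.
        slope = sum(
            (r - ref_mean) * (v - row_mean) for r, v in zip(reference, row)
        ) / ref_ss
        intercept = row_mean - slope * ref_mean
        # Undo the fitted additive and multiplicative effects.
        corrected.append([(v - intercept) / slope for v in row])
    return corrected, reference


# Toy spectra: one shape with different made-up gains and offsets.
base = [1.0, 2.0, 4.0, 3.0]
spectra = [
    base,
    [2.0 * v + 1.0 for v in base],
    [0.5 * v - 0.3 for v in base],
]
corrected, reference = msc_sketch(spectra)
```

Because the toy spectra differ only by gain and offset, every corrected row collapses onto the reference spectrum. With real data the rows stay distinct, and what remains after correction is the variation not explained by scatter.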

This ends the example. Uncomment the next line to display the figures when running the script directly with Python.

# scp.show()
