Quickstart Tutorial 🚀

What you’ll learn

  • Load and visualize spectroscopic data

  • Perform basic data manipulation and plotting

  • Apply common processing techniques

  • Use advanced analysis methods

Installing SpectroChemPy

Prerequisites

  • Python 3.10 or later

  • Basic knowledge of Python

  • Jupyter notebook environment

You can install SpectroChemPy using either pip or mamba:

Using pip:

pip install spectrochempy

Using mamba:

mamba install -c spectrocat spectrochempy

See the Installation Guide for detailed instructions.
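The prerequisites and the result of the installation can be checked from Python itself. The sketch below uses only the standard library; the helper name `check_environment` is hypothetical and is not part of SpectroChemPy.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 10)):
    """Return (python_ok, spectrochempy_found) for the current interpreter."""
    # Compare the running interpreter against the minimum supported version.
    python_ok = sys.version_info[:2] >= tuple(min_version)
    # find_spec returns None when the package cannot be located.
    spectrochempy_found = importlib.util.find_spec("spectrochempy") is not None
    return python_ok, spectrochempy_found


python_ok, spectrochempy_found = check_environment()
print(f"Python >= 3.10: {python_ok}")
print(f"spectrochempy importable: {spectrochempy_found}")
```

If the second line prints `False`, the package is not visible to the interpreter that runs the notebook, which usually means it was installed into a different environment.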

The rest of this tutorial assumes that SpectroChemPy is running in a Jupyter notebook. See here for details on how to start a Jupyter notebook.

Getting Started

Let’s start by importing SpectroChemPy and checking its version:

[2]:
import spectrochempy as scp
  SpectroChemPy's API - v.0.8.2.dev7
©Copyright 2014-2025 - A.Travert & C.Fernandez @ LCS

Working with Spectroscopic Data

Loading Data

SpectroChemPy supports many file formats including:

  • OMNIC (.spa, .spg)

  • JCAMP-DX (.dx, .jdx)

  • CSV files

  • And many more

Let’s load an example FTIR dataset:

[3]:
# Load FTIR data of NH4Y zeolite activation
ds = scp.read("irdata/nh4y-activation.spg")
print(f"Dataset shape: {ds.shape}")  # Show dimensions
print(f"x-axis unit: {ds.x.units}")  # Show wavenumber units
print(f"y-axis unit: {ds.y.units}")  # Show time units
Dataset shape: (55, 5549)
x-axis unit: cm⁻¹
y-axis unit: s

The read function loads files in any of the supported formats. In this case, it loaded an OMNIC .spg file. For a full list of supported formats, see the Import tutorial section.

The read function returns an NDDataset object.
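To make the idea of a single entry point for several formats concrete, here is a minimal sketch that maps a file extension to one of the formats listed above. It is not how `scp.read` is implemented; `KNOWN_FORMATS` and `guess_format` are hypothetical names, and the table contains only the extensions mentioned in this tutorial.

```python
from pathlib import Path

# Extension-to-format table built only from the formats listed above.
KNOWN_FORMATS = {
    ".spa": "OMNIC",
    ".spg": "OMNIC",
    ".dx": "JCAMP-DX",
    ".jdx": "JCAMP-DX",
    ".csv": "CSV",
}


def guess_format(filename):
    """Return the format name for a filename, or None if it is not in the table."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match
    return KNOWN_FORMATS.get(suffix)


print(guess_format("irdata/nh4y-activation.spg"))  # OMNIC
print(guess_format("notes.txt"))  # None
```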

Exploring the Data

Understanding the NDDataset object

The NDDataset is the core data structure in SpectroChemPy. It’s designed specifically for spectroscopic data and provides:

  • Multi-dimensional data support

  • Coordinates and units handling

  • Built-in visualization

  • Processing methods

You can display the loaded NDDataset in a Jupyter notebook as follows:

[4]:
ds
[4]:
NDDataset: [float64] a.u. (shape: (y:55, x:5549))[nh4y-activation]
Summary
  name: nh4y-activation
  author: runner@fv-az2211-104
  created: 2025-04-27 01:44:12+00:00
  description:
    Omnic title: NH4Y-activation.SPG
    Omnic filename: /home/runner/.spectrochempy/testdata/irdata/nh4y-activation.spg
  history:
    2025-04-27 01:44:12+00:00> Imported from spg file /home/runner/.spectrochempy/testdata/irdata/nh4y-activation.spg.
    2025-04-27 01:44:12+00:00> Sorted by date
Data
  title: absorbance
  values:
    [[ 2.057 2.061 ... 2.013 2.012]
     [ 2.033 2.037 ... 1.913 1.911]
     ...
     [ 1.794 1.791 ... 1.198 1.198]
     [ 1.816 1.815 ... 1.24 1.238]] a.u.
  shape: (y:55, x:5549)
Dimension `x`
  size: 5549
  title: wavenumbers
  coordinates: [ 6000 5999 ... 650.9 649.9] cm⁻¹
Dimension `y`
  size: 55
  title: acquisition timestamp (GMT)
  coordinates: [1.468e+09 1.468e+09 ... 1.468e+09 1.468e+09] s
  labels:
    [[ 2016-07-06 19:03:14+00:00 2016-07-06 19:13:14+00:00 ... 2016-07-07 04:03:17+00:00 2016-07-07 04:13:17+00:00]
     [ vz0466.spa, Wed Jul 06 21:00:38 2016 (GMT+02:00) vz0467.spa, Wed Jul 06 21:10:38 2016 (GMT+02:00) ...
       vz0520.spa, Thu Jul 07 06:00:41 2016 (GMT+02:00) vz0521.spa, Thu Jul 07 06:10:41 2016 (GMT+02:00)]]

Basic information about the data is given in the summary: data type, units, shape, and name of the dataset.

Clicking on the arrow on the left side of the summary will expand the metadata section, which contains additional information about the dataset.

The data itself is stored in the data attribute, which is a NumPy array of shape (55, 5549).

[5]:
ds.data
[5]:
array([[   2.057,    2.061, ...,    2.013,    2.012],
       [   2.033,    2.037, ...,    1.913,    1.911],
       ...,
       [   1.794,    1.791, ...,    1.198,    1.198],
       [   1.816,    1.815, ...,     1.24,    1.238]], shape=(55, 5549))

The x and y attributes contain the coordinates of the dataset. In this case, the x-axis represents the wavenumber:

[6]:
ds.x
[6]:
Coord: [float64] cm⁻¹ (size: 5549)[x]
Summary
  size: 5549
  title: wavenumbers
  coordinates: [ 6000 5999 ... 650.9 649.9] cm⁻¹

The y-axis represents the sample acquisition time:

[7]:
ds.y
[7]:
Coord: [float64] s (size: 55)[y]
Summary
  size: 55
  title: acquisition timestamp (GMT)
  coordinates: [1.468e+09 1.468e+09 ... 1.468e+09 1.468e+09] s
  labels:
    [[ 2016-07-06 19:03:14+00:00 2016-07-06 19:13:14+00:00 ... 2016-07-07 04:03:17+00:00 2016-07-07 04:13:17+00:00]
     [ vz0466.spa, Wed Jul 06 21:00:38 2016 (GMT+02:00) vz0467.spa, Wed Jul 06 21:10:38 2016 (GMT+02:00) ...
       vz0520.spa, Thu Jul 07 06:00:41 2016 (GMT+02:00) vz0521.spa, Thu Jul 07 06:10:41 2016 (GMT+02:00)]]
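The y coordinate values and the date labels describe the same instants. The sketch below shows the correspondence using only the standard library. It assumes the values are seconds since the Unix epoch, which is consistent with the displayed 1.468e+09 s but is not stated explicitly in the output.

```python
from datetime import datetime, timezone

# First and last acquisition labels shown in the output above (UTC).
first = datetime(2016, 7, 6, 19, 3, 14, tzinfo=timezone.utc)
last = datetime(2016, 7, 7, 4, 13, 17, tzinfo=timezone.utc)

# Seconds since the Unix epoch (assumed meaning of the y coordinate values).
t_first = first.timestamp()
t_last = last.timestamp()

print(f"{t_first:.3e}")  # 1.468e+09
print(t_last - t_first)  # 33003.0 (total duration in seconds)

# Going back from seconds to a timezone-aware datetime.
print(datetime.fromtimestamp(t_first, tz=timezone.utc).isoformat())
```

The total duration of about 33000 s is small compared with 1.468e+09 s, which is why all displayed coordinate values look identical at four significant digits.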

Data Visualization

SpectroChemPy’s plotting capabilities are built on matplotlib but provide spectroscopy-specific features:

[8]:
ds.plot()
[8]:
../_images/gettingstarted_quickstart_20_1.png

Data Selection and Manipulation

You can easily select specific regions of your spectra using intuitive slicing. Here we select wavenumbers between 4000 and 2000 cm⁻¹:

[9]:
region = ds[:, 4000.0:2000.0]
region.plot()
[9]:
../_images/gettingstarted_quickstart_22_1.png
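Slicing with floats selects by coordinate value rather than by index position. The following pure-Python sketch illustrates that idea on a toy descending wavenumber axis. The helper `select_by_coordinate` is hypothetical, and treating both bounds as inclusive is an assumption about the library's behavior.

```python
def select_by_coordinate(coords, rows, start, stop):
    """Keep the columns whose coordinate lies between start and stop (inclusive)."""
    # Accept the bounds in either order, as with a descending wavenumber axis.
    lo, hi = min(start, stop), max(start, stop)
    keep = [i for i, c in enumerate(coords) if lo <= c <= hi]
    new_coords = [coords[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_coords, new_rows


# Toy descending wavenumber axis (cm^-1) and two toy spectra.
wavenumbers = [6000.0, 5000.0, 4000.0, 3000.0, 2000.0, 1000.0]
spectra = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    [1.1, 1.2, 1.3, 1.4, 1.5, 1.6],
]

coords, region = select_by_coordinate(wavenumbers, spectra, 4000.0, 2000.0)
print(coords)  # [4000.0, 3000.0, 2000.0]
print(region)  # [[0.3, 0.4, 0.5], [1.3, 1.4, 1.5]]
```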

Mathematical Operations

NDDataset supports various mathematical operations. Here’s an example that shifts the time axis and subtracts a reference spectrum, a simple form of baseline correction:

Make y coordinate relative to the first point

[10]:
region.y -= region.y[0]
region.y.title = "Dehydration time"

Subtract the last spectrum from all spectra

[11]:
region -= region[-1]

Plot with colorbar to show intensity changes

[12]:
region.plot(colorbar=True)
[12]:
../_images/gettingstarted_quickstart_29_1.png
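The two operations used above, shifting a coordinate to start at zero and subtracting one spectrum from all the others, can be sketched with plain lists. The function names are hypothetical and the numbers are toy values, not taken from the dataset.

```python
def make_relative(coords):
    """Shift coordinates so that the first one becomes zero."""
    origin = coords[0]
    return [c - origin for c in coords]


def subtract_reference(rows, ref_index=-1):
    """Subtract one row (by default the last) from every row."""
    ref = rows[ref_index]
    return [[v - r for v, r in zip(row, ref)] for row in rows]


times = [1000.0, 1600.0, 2200.0]  # toy acquisition times (s)
spectra = [[2.0, 3.0], [1.5, 2.5], [1.0, 2.0]]  # toy absorbances

print(make_relative(times))  # [0.0, 600.0, 1200.0]
print(subtract_reference(spectra))  # [[1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
```

After the subtraction the reference row is all zeros, so the remaining rows show only the changes relative to the final state.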

Other Operations

NDDataset supports many other operations, such as:

  • Arithmetic operations

  • Statistical analysis

  • Data transformation

  • And more


Data Processing Techniques

SpectroChemPy includes numerous processing methods. Here are some common examples:

Spectral Smoothing

Reduce noise while preserving spectral features:

[13]:
smoothed = region.smooth(size=51, window="hanning")
smoothed.plot(colormap="magma")
[13]:
../_images/gettingstarted_quickstart_33_1.png
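For readers who want to see what window smoothing does, here is a minimal pure-Python sketch of a Hann-weighted moving average. It is not SpectroChemPy's implementation. The edge handling (truncating the window and renormalizing the weights) is an assumption made for this sketch, and the function names are hypothetical.

```python
import math


def hann_weights(size):
    """Normalized Hann window of odd length `size` (weights sum to 1)."""
    if size < 3 or size % 2 == 0:
        raise ValueError("size must be an odd integer >= 3")
    raw = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (size - 1)) for n in range(size)]
    total = sum(raw)
    return [w / total for w in raw]


def smooth(values, size=5):
    """Weighted moving average; near the edges the window is truncated."""
    weights = hann_weights(size)
    half = size // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        norm = 0.0
        for k, w in enumerate(weights):
            j = i + k - half  # index of the neighbor covered by weight k
            if 0 <= j < n:
                acc += w * values[j]
                norm += w
        out.append(acc / norm)  # renormalize by the weights actually used
    return out


noisy = [0.0, 1.0] * 5  # alternating toy signal
print(smooth(noisy, size=5))
```

A larger `size` averages over more neighboring points, which removes more noise but also broadens narrow bands.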

Baseline Correction

Remove baseline artifacts using various algorithms:

Prepare data

[14]:
region = ds[:, 4000.0:2000.0]
smoothed = region.smooth(size=51, window="hanning")

Configure baseline correction

[15]:
blc = scp.Baseline()
blc.ranges = [[2000.0, 2300.0], [3800.0, 3900.0]]  # Baseline regions
blc.multivariate = True
blc.model = "polynomial"
blc.order = "pchip"
blc.n_components = 5

Apply correction

[16]:
blc.fit(smoothed)
blc.corrected.plot()
[16]:
../_images/gettingstarted_quickstart_40_1.png
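The role of the baseline ranges can be illustrated with a much simpler model than the one configured above. The sketch below fits a straight line through the mean point of two baseline ranges and subtracts it from a single spectrum. It is a univariate, linear stand-in chosen for clarity, not the multivariate pchip model, and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def linear_baseline(x, y, ranges):
    """Straight line through the mean point of each of two baseline ranges."""
    anchors = []
    for lo, hi in ranges:
        pts = [(xi, yi) for xi, yi in zip(x, y) if lo <= xi <= hi]
        if not pts:
            raise ValueError(f"no data points in range [{lo}, {hi}]")
        anchors.append((mean([p[0] for p in pts]), mean([p[1] for p in pts])))
    (x0, y0), (x1, y1) = anchors
    slope = (y1 - y0) / (x1 - x0)
    return [y0 + slope * (xi - x0) for xi in x]


def correct(x, y, ranges):
    """Subtract the fitted baseline from the spectrum."""
    base = linear_baseline(x, y, ranges)
    return [yi - bi for yi, bi in zip(y, base)]


# Toy spectrum: sloping background plus one band at 3000 cm^-1.
x = [4000.0 - 100.0 * i for i in range(21)]  # 4000 down to 2000
y = [0.001 * xi + 0.5 + (1.0 if xi == 3000.0 else 0.0) for xi in x]

corrected = correct(x, y, [[2000.0, 2300.0], [3800.0, 3900.0]])
print(round(corrected[x.index(3000.0)], 6))  # 1.0
```

The ranges should cover regions without absorption bands. Any real signal inside them would be treated as background and removed.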

SpectroChemPy provides many other processing techniques, such as:

  • Normalization

  • Derivatives

  • Peak detection

  • And more

For more information, see the Processing tutorial section.

Advanced Analysis

IRIS Processing example

IRIS (Iterative Regularized Inverse Solver) is an advanced technique for analyzing spectroscopic data. Here’s an example with CO adsorption data:

Load and prepare CO adsorption data

[17]:
ds = scp.read_omnic("irdata/CO@Mo_Al2O3.SPG")[:, 2250.0:1950.0]

Define pressure coordinates

[18]:
pressure = [
    0.00300,
    0.00400,
    0.00900,
    0.01400,
    0.02100,
    0.02600,
    0.03600,
    0.05100,
    0.09300,
    0.15000,
    0.20300,
    0.30000,
    0.40400,
    0.50300,
    0.60200,
    0.70200,
    0.80100,
    0.90500,
    1.00400,
]
ds.y = scp.Coord(pressure, title="Pressure", units="torr")

Plot the dataset

[19]:
ds.plot(colormap="magma")
[19]:
../_images/gettingstarted_quickstart_48_1.png

Perform IRIS analysis

[20]:
iris = scp.IRIS(reg_par=[-10, 1, 12])
K = scp.IrisKernel(ds, "langmuir", q=[-8, -1, 50])
iris.fit(ds, K)
iris.plotdistribution(-7, colormap="magma")
[20]:
../_images/gettingstarted_quickstart_50_1.png
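To give a feel for what a Langmuir kernel contains, the sketch below builds a small matrix of surface coverages from pressures and a grid of q values. Two points are assumptions made for illustration rather than facts taken from this tutorial: that `q=[-8, -1, 50]` denotes 50 equally spaced values from -8 to -1, and that the Langmuir constant is exp(-q). The function names are hypothetical.

```python
import math


def linspace(start, stop, num):
    """Return `num` equally spaced values from start to stop (inclusive)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def langmuir_kernel(pressures, q_values):
    """Coverage theta[i][j] = K_j * p_i / (1 + K_j * p_i)."""
    kernel = []
    for p in pressures:
        row = []
        for q in q_values:
            k = math.exp(-q)  # assumed relation between q and the Langmuir constant
            row.append(k * p / (1.0 + k * p))
        kernel.append(row)
    return kernel


pressures = [0.003, 0.051, 1.004]  # three of the pressures above (torr)
q_grid = linspace(-8.0, -1.0, 50)  # assumed reading of q=[-8, -1, 50]
K = langmuir_kernel(pressures, q_grid)
print(len(K), len(K[0]))  # 3 50
```

Each column is the isotherm of one hypothetical adsorption site. The inversion then looks for the distribution of sites whose weighted sum of isotherms reproduces the measured spectra.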

Other Advanced Analysis Techniques

SpectroChemPy includes many other advanced analysis techniques, such as:

  • Multivariate analysis

  • Peak fitting

  • Kinetic modeling

  • And more

For more information, see the Advanced Analysis tutorial section.

Next Steps 🎯

Now that you’ve got a taste of SpectroChemPy’s capabilities, here are some suggestions for diving deeper: