Quickstart Tutorial 🚀

Contents

Installing SpectroChemPy
Getting Started
Working with Spectroscopic Data
Data Processing Techniques
Advanced Analysis

What you’ll learn

Load and visualize spectroscopic data

Perform basic data manipulation and plotting

Apply common processing techniques

Use advanced analysis methods

Installing SpectroChemPy

Prerequisites ✅

Python 3.11 or later

Basic knowledge of Python

Jupyter notebook environment

You can install SpectroChemPy using either pip or mamba:

Using pip:

pip install spectrochempy

Using mamba:

mamba install -c spectrocat spectrochempy

See the Installation Guide for detailed instructions.
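To confirm that the installation worked before opening a notebook, a small standard-library check along these lines can help. This is only a sketch: the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `spectrochempy`, as in the pip command above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical helper applied to the package installed above
version = installed_version("spectrochempy")
if version is None:
    print("spectrochempy is not installed in this environment")
else:
    print(f"spectrochempy {version} is installed")
```

The check only reads package metadata, so it does not import SpectroChemPy itself and runs quickly even in a fresh environment.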

In the following, we assume that SpectroChemPy is running in a Jupyter notebook. See the documentation on starting a Jupyter notebook for details.

Getting Started

Let’s start by importing SpectroChemPy and checking its version:

[2]:

import spectrochempy as scp

SpectroChemPy's API - v.0.8.2.dev20
©Copyright 2014-2025 - A.Travert & C.Fernandez @ LCS

Working with Spectroscopic Data

Loading Data

SpectroChemPy supports many file formats including:

OMNIC (.spa, .spg)
JCAMP-DX (.dx, .jdx)
CSV files
And many more

Let’s load an example FTIR dataset:

[3]:

# Load FTIR data of NH4Y zeolite activation
ds = scp.read("irdata/nh4y-activation.spg")
print(f"Dataset shape: {ds.shape}")  # Show dimensions
print(f"x-axis unit: {ds.x.units}")  # Show wavenumber units
print(f"y-axis unit: {ds.y.units}")  # Show time units

Dataset shape: (55, 5549)
x-axis unit: cm⁻¹
y-axis unit: s

The read function is a generic entry point that can load many file formats. In this case, we loaded an OMNIC file (.spg). For a full list of supported formats, see the Import tutorial section.

The read function returns an NDDataset object.

Exploring the Data

Understanding the NDDataset object

The NDDataset is the core data structure in SpectroChemPy. It’s designed specifically for spectroscopic data and provides:

Multi-dimensional data support
Coordinates and units handling
Built-in visualization
Processing methods

You can display the loaded NDDataset in a Jupyter notebook as follows:

[4]:

ds

[4]:

NDDataset: [float64] a.u. (shape: (y:55, x:5549))[nh4y-activation]

Summary
    name: nh4y-activation
    author: runner@runnervmwhb2z
    created: 2025-10-12 01:32:38+00:00
    description:
        Omnic title: NH4Y-activation.SPG
        Omnic filename: /home/runner/.spectrochempy/testdata/irdata/nh4y-activation.spg
    history:
        2025-10-12 01:32:38+00:00> Imported from spg file /home/runner/.spectrochempy/testdata/irdata/nh4y-activation.spg.
        2025-10-12 01:32:38+00:00> Sorted by date

Data
    title: absorbance
    values:
        [[ 2.057 2.061 ... 2.013 2.012]
         [ 2.033 2.037 ... 1.913 1.911]
         ...
         [ 1.794 1.791 ... 1.198 1.198]
         [ 1.816 1.815 ... 1.24 1.238]] a.u.
    shape: (y:55, x:5549)

Dimension `x`
    size: 5549
    title: wavenumbers
    coordinates: [ 6000 5999 ... 650.9 649.9] cm⁻¹

Dimension `y`
    size: 55
    title: acquisition timestamp (GMT)
    coordinates: [1.468e+09 1.468e+09 ... 1.468e+09 1.468e+09] s
    labels:
        [[ 2016-07-06 19:03:14+00:00 2016-07-06 19:13:14+00:00 ... 2016-07-07 04:03:17+00:00 2016-07-07 04:13:17+00:00]
         [ vz0466.spa, Wed Jul 06 21:00:38 2016 (GMT+02:00) vz0467.spa, Wed Jul 06 21:10:38 2016 (GMT+02:00) ...
           vz0520.spa, Thu Jul 07 06:00:41 2016 (GMT+02:00) vz0521.spa, Thu Jul 07 06:10:41 2016 (GMT+02:00)]]

Basic information about the data is given in the summary: data type, units, shape, and name of the dataset.

Clicking on the arrow on the left side of the summary will expand the metadata section, which contains additional information about the dataset.

The data itself is contained in the data attribute, which is a NumPy array of shape (55, 5549).

[5]:

ds.data

[5]:

array([[   2.057,    2.061, ...,    2.013,    2.012],
       [   2.033,    2.037, ...,    1.913,    1.911],
       ...,
       [   1.794,    1.791, ...,    1.198,    1.198],
       [   1.816,    1.815, ...,     1.24,    1.238]], shape=(55, 5549))

The x and y attributes contain the coordinates of the dataset. In this case, the x-axis represents the wavenumber:

[6]:

ds.x

[6]:

Coord: [float64] cm⁻¹ (size: 5549)[x]

Summary
    size: 5549
    title: wavenumbers
    coordinates: [ 6000 5999 ... 650.9 649.9] cm⁻¹

The y-axis represents the sample acquisition time:

[7]:

ds.y

[7]:

Coord: [float64] s (size: 55)[y]

Summary
    size: 55
    title: acquisition timestamp (GMT)
    coordinates: [1.468e+09 1.468e+09 ... 1.468e+09 1.468e+09] s
    labels:
        [[ 2016-07-06 19:03:14+00:00 2016-07-06 19:13:14+00:00 ... 2016-07-07 04:03:17+00:00 2016-07-07 04:13:17+00:00]
         [ vz0466.spa, Wed Jul 06 21:00:38 2016 (GMT+02:00) vz0467.spa, Wed Jul 06 21:10:38 2016 (GMT+02:00) ...
           vz0520.spa, Thu Jul 07 06:00:41 2016 (GMT+02:00) vz0521.spa, Thu Jul 07 06:10:41 2016 (GMT+02:00)]]
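The y coordinates are shown as numbers of seconds, while the labels carry readable dates. The sketch below relates the two, assuming the coordinates are POSIX timestamps (seconds since 1970-01-01 UTC). That assumption is not stated in the output above, but it is consistent with the 1.468e+09 magnitude and the 2016 labels. Only the standard library is used.

```python
from datetime import datetime, timezone

# First label shown in the output above, interpreted as a UTC datetime
first_label = datetime(2016, 7, 6, 19, 3, 14, tzinfo=timezone.utc)

# Assumed convention: coordinate value = seconds since 1970-01-01 UTC
seconds = first_label.timestamp()
print(f"{seconds:.4g}")  # 1.468e+09

# Converting back recovers the label
print(datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat())
# 2016-07-06T19:03:14+00:00
```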

Data Visualization

SpectroChemPy’s plotting capabilities are built on matplotlib but provide spectroscopy-specific features:

[8]:

ds.plot()

[8]:

../_images/gettingstarted_quickstart_20_1.png

Data Selection and Manipulation

You can easily select specific regions of your spectra using intuitive slicing. Here we select wavenumbers between 4000 and 2000 cm⁻¹:

[9]:

region = ds[:, 4000.0:2000.0]
region.plot()

[9]:

../_images/gettingstarted_quickstart_22_1.png
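To build intuition for what slicing by coordinate value involves, here is a minimal pure-Python sketch that maps a value range onto index positions of a descending axis. It is not SpectroChemPy's implementation: the helper `value_slice_to_indices` is hypothetical, the axis is synthetic (1 cm⁻¹ spacing), and it assumes that both bounds are included and that each bound is matched to the nearest coordinate.

```python
def value_slice_to_indices(coords, start, stop):
    """Map a slice given in coordinate values to an inclusive index range.

    Hypothetical helper for illustration only. Each bound is matched to the
    nearest coordinate, so ascending and descending axes both work.
    """
    def nearest(value):
        return min(range(len(coords)), key=lambda i: abs(coords[i] - value))

    i, j = nearest(start), nearest(stop)
    return min(i, j), max(i, j)


# Synthetic descending wavenumber axis: 6000, 5999, ..., 650 cm^-1
wavenumbers = [6000.0 - k for k in range(5351)]

first, last = value_slice_to_indices(wavenumbers, 4000.0, 2000.0)
selected = wavenumbers[first:last + 1]

print(first, last, len(selected))  # 2000 4000 2001
print(selected[0], selected[-1])   # 4000.0 2000.0
```

The point of the sketch is that the bounds are expressed in physical units rather than positions, so the same request works however the axis happens to be sampled or ordered.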

Mathematical Operations

NDDataset supports various mathematical operations. Here’s an example of a simple reference subtraction, a crude form of baseline correction:

Make y coordinate relative to the first point

[10]:

region.y -= region.y[0]
region.y.title = "Dehydration time"

Subtract the last spectrum from all spectra

[11]:

region -= region[-1]

Plot with colorbar to show intensity changes

[12]:

region.plot(colorbar=True)

[12]:

../_images/gettingstarted_quickstart_29_1.png
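The two in-place operations above can be illustrated on a tiny synthetic dataset with plain Python lists. The values below are made up for illustration, and the real NDDataset additionally keeps track of units and coordinates, which this sketch ignores.

```python
# Synthetic stand-in: 3 spectra (rows) x 4 wavenumbers (columns)
times = [100.0, 700.0, 1300.0]  # acquisition times in seconds (made-up values)
spectra = [
    [2.0, 2.5, 2.2, 2.0],
    [1.8, 2.1, 1.9, 1.8],
    [1.5, 1.6, 1.5, 1.5],
]

# Same idea as `region.y -= region.y[0]`: time elapsed since the first spectrum
elapsed = [t - times[0] for t in times]

# Same idea as `region -= region[-1]`: subtract the last spectrum from every row
reference = spectra[-1]
difference = [[a - r for a, r in zip(row, reference)] for row in spectra]

print(elapsed)         # [0.0, 600.0, 1200.0]
print(difference[-1])  # [0.0, 0.0, 0.0, 0.0]
```

After the subtraction the last row is zero by construction, so every other row shows only what differs from the final state.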

Other Operations

NDDataset supports many other operations, such as:

Arithmetic operations
Statistical analysis
Data transformation
And more

For more information, see:

More insight on the NDDataset objects section.
API Reference for a full list of available operations.
Plotting tutorial for more information on advanced plotting.

Data Processing Techniques

SpectroChemPy includes numerous processing methods. Here are some common examples:

Spectral Smoothing

Reduce noise while preserving spectral features:

[13]:

smoothed = region.smooth(size=51, window="hanning")
smoothed.plot(colormap="magma")

[13]:

../_images/gettingstarted_quickstart_33_1.png
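For readers curious about what window smoothing does numerically, the following pure-Python sketch applies a normalized Hann ("hanning") window as a weighted moving average. It is an illustration under stated assumptions, not SpectroChemPy's code: the function names are hypothetical, the window size is assumed odd, and the edges are handled by reflecting the signal, which may differ from the library's behavior.

```python
import math


def hann_window(size):
    """Normalized Hann window of odd length `size` (weights sum to 1)."""
    weights = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (size - 1)) for n in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, size):
    """Weighted moving average with a Hann window; edges use reflected samples."""
    half = size // 2
    window = hann_window(size)
    # Reflect `half` samples on each side (the edge sample itself is not repeated)
    padded = signal[half:0:-1] + signal + signal[-2:-half - 2:-1]
    return [
        sum(w * padded[i + k] for k, w in enumerate(window))
        for i in range(len(signal))
    ]


# A flat signal with a single spike: smoothing spreads and lowers the spike
noisy = [1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0]
result = smooth(noisy, size=5)
print([round(v, 2) for v in result])  # [1.0, 1.0, 2.0, 3.0, 2.0, 1.0, 1.0]
```

A larger window, such as the 51 points used above, averages over more neighbors and therefore removes more noise at the cost of broadening narrow bands.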

Baseline Correction

Remove baseline artifacts using various algorithms:

Prepare data

[14]:

region = ds[:, 4000.0:2000.0]
smoothed = region.smooth(size=51, window="hanning")

Configure baseline correction

[15]:

blc = scp.Baseline()
blc.ranges = [[2000.0, 2300.0], [3800.0, 3900.0]]  # Baseline regions
blc.multivariate = True
blc.model = "polynomial"
blc.order = "pchip"
blc.n_components = 5

Apply correction

[16]:

blc.fit(smoothed)
blc.corrected.plot()

[16]:

../_images/gettingstarted_quickstart_40_1.png
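The role of the baseline ranges can be illustrated with a much simpler stand-in than the configuration above: fit a straight line by least squares to the points lying inside the ranges, then subtract it everywhere. The sketch below uses only plain Python, synthetic data, and hypothetical helper names. It does not reproduce the pchip interpolation or the multivariate treatment used by scp.Baseline.

```python
def in_ranges(x, ranges):
    """True if x falls inside any of the (low, high) baseline ranges."""
    return any(low <= x <= high for low, high in ranges)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct_baseline(xs, ys, ranges):
    """Subtract a line fitted only on the points inside the baseline ranges."""
    points = [(x, y) for x, y in zip(xs, ys) if in_ranges(x, ranges)]
    slope, intercept = fit_line([p[0] for p in points], [p[1] for p in points])
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Synthetic spectrum: sloping baseline plus a small band centered at 3000 cm^-1
xs = [2000.0 + 100.0 * k for k in range(21)]
band = {2900.0: 0.5, 3000.0: 1.0, 3100.0: 0.5}
ys = [0.0005 * x + 0.2 + band.get(x, 0.0) for x in xs]

corrected = correct_baseline(xs, ys, [(2000.0, 2300.0), (3800.0, 3900.0)])
print(round(corrected[10], 6))  # 1.0
```

Because the ranges contain no band intensity, the fitted line matches the underlying baseline and the band height is recovered after subtraction. Choosing ranges that overlap real absorption bands would bias the fit.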

SpectroChemPy provides many other processing techniques, such as:

Normalization
Derivatives
Peak detection
And more

For more information, see the Processing tutorial section.

Advanced Analysis

IRIS Processing example

IRIS (Iterative Regularized Inverse Solver) is an advanced technique for analyzing spectroscopic data. Here’s an example with CO adsorption data:

Load and prepare CO adsorption data

[17]:

ds = scp.read_omnic("irdata/CO@Mo_Al2O3.SPG")[:, 2250.0:1950.0]

Define pressure coordinates

[18]:

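# CO pressure for each spectrum (units are set on the coordinate below)
pressure = [
    0.00300,
    0.00400,
    0.00900,
    0.01400,
    0.02100,
    0.02600,
    0.03600,
    0.05100,
    0.09300,
    0.15000,
    0.20300,
    0.30000,
    0.40400,
    0.50300,
    0.60200,
    0.70200,
    0.80100,
    0.90500,
    1.00400,
]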
ds.y = scp.Coord(pressure, title="Pressure", units="torr")

Plot the dataset

[19]:

ds.plot(colormap="magma")

[19]:

../_images/gettingstarted_quickstart_48_1.png

Perform IRIS analysis

[20]:

iris = scp.IRIS(reg_par=[-10, 1, 12])
K = scp.IrisKernel(ds, "langmuir", q=[-8, -1, 50])
iris.fit(ds, K)
iris.plotdistribution(-7, colormap="magma")

[20]:

../_images/gettingstarted_quickstart_50_1.png
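To give a feel for what a "langmuir" kernel represents, the sketch below tabulates the Langmuir coverage θ = Kp / (1 + Kp) over a grid of pressures and energies in plain Python. Several points are assumptions made for illustration rather than confirmed details of scp.IrisKernel: that `q=[-8, -1, 50]` means 50 evenly spaced values from -8 to -1, that q is a dimensionless adsorption energy with K = exp(-q), and the helper names and pressure values themselves.

```python
import math


def linspace(start, stop, num):
    """Evenly spaced values from start to stop inclusive (num >= 2)."""
    step = (stop - start) / (num - 1)
    return [start + step * i for i in range(num)]


def langmuir_coverage(p, q):
    """Langmuir coverage K*p / (1 + K*p), with K = exp(-q) (assumed convention)."""
    k = math.exp(-q)
    return k * p / (1.0 + k * p)


pressures = [0.01, 0.1, 1.0]      # arbitrary illustrative pressures
q_grid = linspace(-8.0, -1.0, 50)  # assumed reading of q=[-8, -1, 50]

# Kernel matrix: one row per pressure, one column per energy value
kernel = [[langmuir_coverage(p, q) for q in q_grid] for p in pressures]

print(len(kernel), len(kernel[0]))  # 3 50
print(langmuir_coverage(1.0, 0.0))  # 0.5
```

In this picture, the inversion looks for a distribution over q whose kernel-weighted sum reproduces the measured intensities at each pressure, and the regularization parameter controls how smooth that distribution is required to be.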

Other Advanced Analysis Techniques

SpectroChemPy includes many other advanced analysis techniques, such as:

Multivariate analysis
Peak fitting
Kinetic modeling
And more

For more information, see the Advanced Analysis tutorial section.

Next Steps 🎯

Now that you’ve got a taste of SpectroChemPy’s capabilities, here are some suggestions for diving deeper:

Examples Gallery 📈: Browse through practical examples and use cases
User Guide 📖: Learn about specific features in detail
API Reference 🔍: Explore the complete API documentation
Get Help 💬: Join our community discussions