Import of OMNIC files

Thermo Scientific OMNIC software has two proprietary binary file formats:

  • .spa files that handle single spectra

  • .spg files which contain a group of spectra

Both have been reverse engineered, which makes it possible to extract their key data. The OMNIC reader of SpectroChemPy ( read_omnic() ) was developed from posts in open forums on the .spa file format and then extended to the .spg file format.

Import of .spg files

Let’s import an .spg file from the datadir (see :ref:import.ipynb for details) and display its main attributes:

[1]:
import spectrochempy as scp
  SpectroChemPy's API - v.0.8.2.dev7
©Copyright 2014-2025 - A.Travert & C.Fernandez @ LCS
[2]:
X = scp.read_omnic("irdata/CO@Mo_Al2O3.SPG")
X
[2]:
NDDataset: [float64] a.u. (shape: (y:19, x:3112))[CO@Mo_Al2O3]
Summary
name
:
CO@Mo_Al2O3
author
:
runner@fv-az2211-104
created
:
2025-04-27 01:45:42+00:00
description
:
Omnic title: Group sust Mo_Al2O3_base line.SPG
Omnic filename: /home/runner/.spectrochempy/testdata/irdata/CO@Mo_Al2O3.SPG
history
:
2025-04-27 01:45:42+00:00> Imported from spg file /home/runner/.spectrochempy/testdata/irdata/CO@Mo_Al2O3.SPG.
2025-04-27 01:45:42+00:00> Sorted by date
Data
title
:
absorbance
values
:
...
[[0.0008032 3.788e-05 ... 0.0003027 0.0003745]
[-3.608e-05 -0.0001981 ... 0.0003089 0.00117]
...
[0.0008357 -0.0001387 ... -0.0005221 -0.001121]
[0.0005655 -0.000116 ... -0.00057 -0.0006307]] a.u.
shape
:
(y:19, x:3112)
Dimension `x`
size
:
3112
title
:
wavenumbers
coordinates
:
[ 4000 3999 ... 1001 999.9] cm⁻¹
Dimension `y`
size
:
19
title
:
acquisition timestamp (GMT)
coordinates
:
[1.477e+09 1.477e+09 ... 1.477e+09 1.477e+09] s
labels
:
...
[[ 2016-10-18 13:49:35+00:00 2016-10-18 13:54:06+00:00 ... 2016-10-18 16:01:33+00:00 2016-10-18 16:06:37+00:00]
[ *Résultat de Soustraction:04_Mo_Al2O3_calc_0.003torr_LT_after sulf_Oct 18 15:46:42 2016 (GMT+02:00)
*Résultat de Soustraction:04_Mo_Al2O3_calc_0.004torr_LT_after sulf_Oct 18 15:51:12 2016 (GMT+02:00) ...
*Résultat de Soustraction:04_Mo_Al2O3_calc_0.905torr_LT_after sulf_Oct 18 17:58:42 2016 (GMT+02:00)
*Résultat de Soustraction:04_Mo_Al2O3_calc_1.004torr_LT_after sulf_Oct 18 18:03:41 2016 (GMT+02:00)]]

The displayed attributes are detailed in the following:

  • name is the name of the group of spectra as it appears in the .spg file. OMNIC sets this name to the .spg filename used at the creation of the group. In this example, the name (“Group sust Mo_Al2O3_base line.SPG”) differs from the filename ("CO@Mo_Al2O3.SPG") because the latter has been changed from outside OMNIC (directly in the OS).

  • author is the creator of the NDDataset (not of the .spg file, which, to our knowledge, has no attribute of this type). The string is composed of the username and the machine name as given by the OS, e.g., "username@machinename". It can be accessed and changed using X.author .

  • created is the creation date of the NDDataset (again not that of the .spg file). It can be accessed (or even changed) using X.created .

  • description indicates the complete pathname of the .spg file. As the pathname is also given in the history (below), it can be a good practice to give a self-explaining description of the group, for instance:

[3]:
X.description = "CO adsorption on CoMo/Al2O3, difference spectra"
X.description
[3]:
'CO adsorption on CoMo/Al2O3, difference spectra'

or directly at the import:

[4]:
X = scp.read_omnic("irdata/CO@Mo_Al2O3.SPG", description="CO@CoMo/Al2O3, diff spectra")
X.description
[4]:
'CO@CoMo/Al2O3, diff spectra'
  • history records changes made to the dataset. Here, right after its creation, it has been sorted by date (see below).

Then come the attributes related to the data themselves:

  • title (not to be confused with the name of the dataset) describes the nature of data (here absorbance ).

  • values shows the data as quantity (with their units when they exist - here a.u. for absorbance units).

  • The numerical values are accessed through the data attribute and the units through the units attribute.

[5]:
X.values
[5]:
Magnitude
[[0.000803191214799881 3.787875175476074e-05 ... 0.000302683562040329 0.0003744959831237793] [-3.607943654060364e-05 -0.0001980997622013092 ... 0.0003089122474193573 0.0011698119342327118] ... [0.0008356980979442596 -0.0001386702060699463 ... -0.0005221068859100342 -0.001121222972869873] [0.0005654506385326385 -0.00011600926518440247 ... -0.0005699768662452698 -0.000630699098110199]]
Units
a.u.
[6]:
X.data
[6]:
array([[0.0008032, 3.788e-05, ..., 0.0003027, 0.0003745],
       [-3.608e-05, -0.0001981, ..., 0.0003089,  0.00117],
       ...,
       [0.0008357, -0.0001387, ..., -0.0005221, -0.001121],
       [0.0005655, -0.000116, ..., -0.00057, -0.0006307]], shape=(19, 3112))
[7]:
X.units
[7]:
a.u.
  • shape is the same as the ndarray shape attribute and gives the shape of the data array, here 19 x 3112.

Then come the attributes related to the dimensions of the dataset.

  • x : this dimension has one coordinate (a Coord object) made of the 3112 wavenumbers.

[8]:
print(X.x)
X.x
Coord: [float64] cm⁻¹ (size: 3112)
[8]:
Coord: [float64] cm⁻¹ (size: 3112)[x]
Summary
size
:
3112
title
:
wavenumbers
coordinates
:
[ 4000 3999 ... 1001 999.9] cm⁻¹
  • y : this dimension contains:

    • one coordinate made of the 19 acquisition timestamps

    • two labels:

      • the acquisition date (UTC) of each spectrum

      • the name of each spectrum.

[9]:
X.y
[9]:
Coord: [float64] s (size: 19)[y]
Summary
size
:
19
title
:
acquisition timestamp (GMT)
coordinates
:
[1.477e+09 1.477e+09 ... 1.477e+09 1.477e+09] s
labels
:
...
[[ 2016-10-18 13:49:35+00:00 2016-10-18 13:54:06+00:00 ... 2016-10-18 16:01:33+00:00 2016-10-18 16:06:37+00:00]
[ *Résultat de Soustraction:04_Mo_Al2O3_calc_0.003torr_LT_after sulf_Oct 18 15:46:42 2016 (GMT+02:00)
*Résultat de Soustraction:04_Mo_Al2O3_calc_0.004torr_LT_after sulf_Oct 18 15:51:12 2016 (GMT+02:00) ...
*Résultat de Soustraction:04_Mo_Al2O3_calc_0.905torr_LT_after sulf_Oct 18 17:58:42 2016 (GMT+02:00)
*Résultat de Soustraction:04_Mo_Al2O3_calc_1.004torr_LT_after sulf_Oct 18 18:03:41 2016 (GMT+02:00)]]
  • dims : Note that the x and y dimensions are the second and first dimensions, respectively. Hence, X[i,j] will return the absorbance of the ith spectrum at the jth wavenumber. However, this is subject to change, for instance if you perform an operation such as a transposition on your data. At any time, the attribute dims gives the correct names (which can be modified) and order of the dimensions.

[10]:
X.dims
[10]:
['y', 'x']
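
The index order and the effect of a transposition can be illustrated with a plain-Python sketch in which a nested list stands in for the data array. The tiny 2 x 3 array and all variable names are made up for illustration, and no SpectroChemPy call is involved:

```python
# Hypothetical 2 x 3 array: 2 spectra (rows, dimension `y`),
# 3 wavenumbers (columns, dimension `x`).
data = [
    [0.1, 0.2, 0.3],  # spectrum 0
    [0.4, 0.5, 0.6],  # spectrum 1
]
dims = ["y", "x"]

i, j = 1, 2
value = data[i][j]  # absorbance of spectrum i at wavenumber j

# Transposing swaps rows and columns, so the order of the names swaps too.
data_t = [list(col) for col in zip(*data)]
dims_t = dims[::-1]

print(value)         # 0.6
print(dims_t)        # ['x', 'y']
print(data_t[j][i])  # 0.6, the same element with swapped indices
```

This is why checking dims before indexing is safer than assuming a fixed order.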

Acquisition dates and y axis

The acquisition timestamps are the Unix times of the acquisition, i.e. the time elapsed in seconds since the reference date of Jan 1st 1970, 00:00:00 UTC.

[11]:
X.y.values
[11]:
Magnitude
[1476798575.0 1476798846.0 ... 1476806493.0 1476806797.0]
Units
s
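
To make the meaning of these numbers concrete, the following sketch converts the first two timestamps displayed above into UTC dates using only the Python standard library. It is independent of SpectroChemPy; the variable names are illustrative:

```python
from datetime import datetime, timezone

# First two acquisition timestamps shown above
# (seconds since 1970-01-01 00:00:00 UTC).
timestamps = [1476798575.0, 1476798846.0]

# Convert each Unix time to a timezone-aware datetime in UTC.
dates = [datetime.fromtimestamp(t, tz=timezone.utc) for t in timestamps]

for d in dates:
    print(d.isoformat(sep=" "))
# 2016-10-18 13:49:35+00:00
# 2016-10-18 13:54:06+00:00
```

The printed dates agree with the acquisition-date labels of the y dimension.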

In OMNIC, the acquisition time is that of the start of the acquisition. Such values may not be convenient to use directly (they are currently of the order of 1.5 billion). In this respect, it can be convenient to shift the origin of the time coordinate to that of the first spectrum, which has index 0:

[12]:
X.y = X.y - X.y[0]
X.y.values
[12]:
Magnitude
[0.0 271.0 ... 7918.0 8222.0]
Units
s

Note that you can also use the inplace subtract operator to perform the same operation.

[13]:
X.y -= X.y[0]

It is also possible to use the ability of SpectroChemPy to handle unit changes. For this one can use the to or ito (inplace) methods.

val = val.to(some_units)
val.ito(some_units)   # the same inplace
[14]:
X.y.ito("minute")
X.y.values
[14]:
Magnitude
[0.0 4.517 ... 131.967 137.033]
Units
min
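
The arithmetic behind the origin shift and the unit change can be checked by hand. The sketch below reproduces it with plain Python lists, using the four timestamps visible in the output above. It only illustrates the numbers, not how SpectroChemPy performs the conversion internally:

```python
# Acquisition timestamps in seconds (first two and last two values shown above).
timestamps_s = [1476798575.0, 1476798846.0, 1476806493.0, 1476806797.0]

# Shift the origin to the first spectrum, then convert seconds to minutes.
elapsed_s = [t - timestamps_s[0] for t in timestamps_s]
elapsed_min = [round(t / 60.0, 3) for t in elapsed_s]

print(elapsed_s)    # [0.0, 271.0, 7918.0, 8222.0]
print(elapsed_min)  # [0.0, 4.517, 131.967, 137.033]
```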

As shown above, the values of the Coord object are accessed through the values attribute. To get the last value, which corresponds to the last row of the X dataset, you can use:

[15]:
tf = X.y.values[-1]
tf
[15]:
137.033 min

A negative index in Python indicates a position counted from the end of a sequence, so -1 indicates the last element.

Finally, if for instance you want the y time axis to be shifted by 2 minutes, it is also very easy to do so:

[16]:
X.y = X.y + 2
X.y.values
[16]:
Magnitude
[2.0 6.517 ... 133.967 139.033]
Units
min

or using the inplace add/subtract operator:

[17]:
X.y -= 2  # this restores the previous coordinates
X.y.values
[17]:
Magnitude
[0.0 4.517 ... 131.967 137.033]
Units
min

The order of spectra

The order of spectra in OMNIC .spg files depends on the order in which the spectra were included in the OMNIC window before the group was saved. By default, SpectroChemPy reorders the spectra by acquisition date, but the original OMNIC order can be kept by passing sortbydate=False in the function call. For instance:

[18]:
X2 = scp.read_omnic("irdata/CO@Mo_Al2O3.SPG", sortbydate=False)

In the present case, this will change nothing because the spectra in the OMNIC file were already ordered by increasing date.
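
The effect of such a flag can be sketched in plain Python. The helper order_spectra and the sample data below are hypothetical and only mimic the behaviour described above; they are not part of SpectroChemPy:

```python
# Hypothetical (timestamp, spectrum name) pairs in the order stored in a file.
stored = [
    (1476798846.0, "spectrum_b"),
    (1476806493.0, "spectrum_c"),
    (1476798575.0, "spectrum_a"),
]


def order_spectra(items, sortbydate=True):
    """Return items sorted by timestamp, or a copy in their stored order."""
    if sortbydate:
        return sorted(items, key=lambda item: item[0])
    return list(items)


by_date = [name for _, name in order_spectra(stored)]
as_stored = [name for _, name in order_spectra(stored, sortbydate=False)]

print(by_date)    # ['spectrum_a', 'spectrum_b', 'spectrum_c']
print(as_stored)  # ['spectrum_b', 'spectrum_c', 'spectrum_a']
```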

Finally, it is worth mentioning that an NDDataset can generally be manipulated like a NumPy ndarray. Hence, for instance, the following will reverse the order of the first dimension:

[19]:
X = X[::-1]  # reorders the NDDataset along the first dimension going backward
X.y.values  # displays the `y` dimension
[19]:
Magnitude
[137.033 131.967 ... 4.517 0.0]
Units
min
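
The slice used here follows the standard Python sequence rules. As a minimal sketch with plain lists (the values are illustrative stand-ins for the coordinates and the rows of the dataset):

```python
# Hypothetical elapsed times (min) and one-point "spectra", in matching order.
times = [0.0, 4.517, 131.967, 137.033]
rows = [[0.10], [0.20], [0.30], [0.40]]

# A slice with step -1 walks the sequence backward; applying it to both
# sequences keeps each row paired with its coordinate.
times_rev = times[::-1]
rows_rev = rows[::-1]

print(times_rev)      # [137.033, 131.967, 4.517, 0.0]
print(rows_rev[0])    # [0.4]
print(times_rev[-1])  # 0.0 (index -1 is the last element)
```

With plain lists the two sequences have to be reversed separately; the output above shows the y coordinates of the NDDataset reversed along with the data.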

Note

Case of groups with different wavenumbers: an OMNIC .spg file can contain spectra having different wavenumber axes (e.g. different spacings or wavenumber ranges). In its current implementation, the spg reader will purposely raise an error because such spectra cannot be included in a single NDDataset which, by definition, contains items that share common axes or dimensions. Future releases might include an option to deal with such a case and return a list of NDDatasets. Let us know if you are interested in such a feature, see Bug reports and enhancement requests.
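
What "sharing a common axis" means can be sketched as a simple comparison of two wavenumber arrays. The function same_axis, its tolerances and the sample axes are assumptions made for illustration; this is not the check performed by the reader:

```python
import math


def same_axis(a, b, rel_tol=1e-9, abs_tol=1e-6):
    """Return True if two wavenumber axes have the same length and values."""
    return len(a) == len(b) and all(
        math.isclose(u, v, rel_tol=rel_tol, abs_tol=abs_tol) for u, v in zip(a, b)
    )


# Hypothetical axes (cm^-1): the third one has a different spacing.
axis_1 = [4000.0, 3999.0, 3998.0]
axis_2 = [4000.0, 3999.0, 3998.0]
axis_3 = [4000.0, 3998.0, 3996.0]

print(same_axis(axis_1, axis_2))  # True
print(same_axis(axis_1, axis_3))  # False
```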

Import of .spa files

The import of a single spectrum follows exactly the same rules as that of the import of a group:

[20]:
scp.read_omnic("irdata/subdir/7_CZ0-100_Pd_101.SPA")
[20]:
NDDataset: [float64] a.u. (shape: (y:1, x:5549))[7_CZ0-100 Pd_101]
Summary
name
:
7_CZ0-100 Pd_101
author
:
runner@fv-az2211-104
created
:
2025-04-27 01:45:42+00:00
description
:
# Omnic name: 7_CZ0-100 Pd_101
# Filename: 7_CZ0-100_Pd_101.SPA
history
:
2025-04-27 01:45:42+00:00> Imported from spa file(s)
2025-04-27 01:45:42+00:00> Data processing history from Omnic :
------------------------------------
Acquisition échantillon

Background acquis le Ven Nov 30 08:03:45 2018 (GMT+01:00)
Format Final : Absorbance
Résolution: 4,000 de 649,9207 à 5999,7134
Roue de validation: 0
Roue porte écran atténuation: Vide
Numéro Série du banc:ALK1100494
Data
title
:
absorbance
values
:
...
[[ 1.544 1.543 ... 2.1 2.091]] a.u.
shape
:
(y:1, x:5549)
Dimension `x`
size
:
5549
title
:
wavenumbers
coordinates
:
[ 6000 5999 ... 650.9 649.9] cm⁻¹
Dimension `y`
size
:
1
title
:
acquisition timestamp (GMT)
coordinates
:
[1.544e+09] s
labels
:
...
[[ 2018-11-30 07:10:57+00:00]
[ /home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_101.SPA]]

The OMNIC reader can also import several .spa files together, provided that they share a common axis for the wavenumbers.

This is the case of the following files in the irdata/subdir directory: “7_CZ0-100_Pd_101.SPA”, …, “7_CZ0-100_Pd_104.SPA”.

It is possible to import them in a single NDDataset by using the list of filenames in the function call:

[21]:
list_files = (
    "7_CZ0-100_Pd_101.SPA",
    "7_CZ0-100_Pd_102.SPA",
    "7_CZ0-100_Pd_103.SPA",
    "7_CZ0-100_Pd_104.SPA",
)
scp.read_omnic(list_files, directory="irdata/subdir", name="Merged 7_CZ0-100 Pd")
[21]:
NDDataset: [float64] a.u. (shape: (y:4, x:5549))[Merged 7_CZ0-100 Pd]
Summary
name
:
Merged 7_CZ0-100 Pd
author
:
runner@fv-az2211-104
created
:
2025-04-27 01:45:42+00:00
description
:
Concatenation of 4 datasets:
( 7_CZ0-100 Pd_101, 7_CZ0-100 Pd_102, 7_CZ0-100 Pd_103, 7_CZ0-100 Pd_104 )
history
:
2025-04-27 01:45:42+00:00> Created by concatenate
2025-04-27 01:45:42+00:00> Merged from several files
Data
title
:
absorbance
values
:
...
[[ 1.544 1.543 ... 2.1 2.091]
[ 1.552 1.553 ... 2.161 2.109]
[ 1.461 1.46 ... 2.087 2.088]
[ 1.448 1.447 ... 2.071 2.065]] a.u.
shape
:
(y:4, x:5549)
Dimension `x`
size
:
5549
title
:
wavenumbers
coordinates
:
[ 6000 5999 ... 650.9 649.9] cm⁻¹
Dimension `y`
size
:
4
title
:
acquisition timestamp (GMT)
coordinates
:
[1.544e+09 1.544e+09 1.544e+09 1.544e+09] s
labels
:
...
[[ 2018-11-30 07:10:57+00:00 2018-11-30 07:22:52+00:00 2018-11-30 07:34:49+00:00 2018-11-30 07:46:44+00:00]
[ /home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_101.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_102.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_103.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_104.SPA]]
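
When the filenames follow a regular numbering, as is the case here, the list does not have to be typed by hand. A minimal sketch that rebuilds the same four names (it assumes the numbering 101 to 104 used above):

```python
# Build the four filenames used above instead of typing them out.
list_files = tuple(f"7_CZ0-100_Pd_{n}.SPA" for n in range(101, 105))

print(list_files)
```

The resulting tuple can then be passed to read_omnic together with the directory argument, exactly as in the call above.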

When compatible .spa files are alone in a directory, a very convenient way to gather them is to call read_omnic with only the directory path as argument:

[22]:
scp.read_omnic("irdata/subdir/1-20")
[22]:
NDDataset: [float64] a.u. (shape: (y:3, x:5549))[7_CZ0-100 Pd_5]
Summary
name
:
7_CZ0-100 Pd_5
author
:
runner@fv-az2211-104
created
:
2025-04-27 01:45:42+00:00
description
:
Concatenation of 3 datasets:
( 7_CZ0-100 Pd_3, 7_CZ0-100 Pd_4, 7_CZ0-100 Pd_5 )
history
:
2025-04-27 01:45:42+00:00> Created by concatenate
2025-04-27 01:45:42+00:00> Merged from several files
Data
title
:
absorbance
values
:
...
[[ 1.245 1.245 ... 1.311 1.307]
[ 1.245 1.245 ... 1.302 1.299]
[ 1.236 1.235 ... 1.3 1.296]] a.u.
shape
:
(y:3, x:5549)
Dimension `x`
size
:
5549
title
:
wavenumbers
coordinates
:
[ 6000 5999 ... 650.9 649.9] cm⁻¹
Dimension `y`
size
:
3
title
:
acquisition timestamp (GMT)
coordinates
:
[1.544e+09 1.544e+09 1.544e+09] s
labels
:
...
[[ 2018-11-29 16:23:05+00:00 2018-11-29 16:35:00+00:00 2018-11-29 16:46:56+00:00]
[ /home/runner/.spectrochempy/testdata/irdata/subdir/1-20/7_CZ0-100_Pd_3.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/1-20/7_CZ0-100_Pd_4.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/1-20/7_CZ0-100_Pd_5.SPA]]

When not all files are compatible, they are returned in different NDDatasets (with independent merging).

For example:

[23]:
Y = scp.read_omnic("irdata/subdir/")
Y
[23]:
List (len=2, type=NDDataset)
    0: NDDataset: [float64] a.u. (shape: (y:335, x:1868))[dd_6.6_19039_538]
    Summary
    name
    :
    dd_6.6_19039_538
    author
    :
    runner@fv-az2211-104
    created
    :
    2025-04-27 01:45:43+00:00
    description
    :
    Concatenation of 1 datasets:
    ( dd_6.6_19039_538 )
    history
    :
    2025-04-27 01:45:43+00:00> Created by concatenate
    2025-04-27 01:45:43+00:00> Merged from several files
    Data
    title
    :
    absorbance
    values
    :
    ...
    [[-0.007524 -0.0003661 ... 8.291e-05 9.239e-05]
    [-0.009306 -0.002252 ... 0.0001051 0.000107]
    ...
    [ 0.02474 0.02814 ... 0.002962 0.002967]
    [ 0.02663 0.02899 ... 0.002907 0.002916]] a.u.
    shape
    :
    (y:335, x:1868)
    Dimension `x`
    size
    :
    1868
    title
    :
    wavenumbers
    coordinates
    :
    [ 4000 3998 ... 401.1 399.2] cm⁻¹
    Dimension `y`
    size
    :
    335
    title
    :
    Time
    coordinates
    :
    [ 0.26 0.52 ... 86.76 87.02] min
    labels
    :
    ...
    [ Verknüpftes Spektrum bei 0,260 Min.
    Verknüpftes Spektrum bei 0,519 Min. ...
    Verknüpftes Spektrum bei 86,756 Min.
    Verknüpftes Spektrum bei 87,016 Min.]
    1: NDDataset: [float64] a.u. (shape: (y:4, x:5549))[7_CZ0-100 Pd_104]
    Summary
    name
    :
    7_CZ0-100 Pd_104
    author
    :
    runner@fv-az2211-104
    created
    :
    2025-04-27 01:45:43+00:00
    description
    :
    Concatenation of 1 datasets:
    ( 7_CZ0-100 Pd_104 )
    history
    :
    2025-04-27 01:45:43+00:00> Created by concatenate
    2025-04-27 01:45:43+00:00> Merged from several files
    Data
    title
    :
    absorbance
    values
    :
    ...
    [[ 1.544 1.543 ... 2.1 2.091]
    [ 1.552 1.553 ... 2.161 2.109]
    [ 1.461 1.46 ... 2.087 2.088]
    [ 1.448 1.447 ... 2.071 2.065]] a.u.
    shape
    :
    (y:4, x:5549)
    Dimension `x`
    size
    :
    5549
    title
    :
    wavenumbers
    coordinates
    :
    [ 6000 5999 ... 650.9 649.9] cm⁻¹
    Dimension `y`
    size
    :
    4
    title
    :
    acquisition timestamp (GMT)
    coordinates
    :
    [1.544e+09 1.544e+09 1.544e+09 1.544e+09] s
    labels
    :
    ...
    [[ 2018-11-30 07:10:57+00:00 2018-11-30 07:22:52+00:00 2018-11-30 07:34:49+00:00 2018-11-30 07:46:44+00:00]
    [ /home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_101.SPA
    /home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_102.SPA
    /home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_103.SPA
    /home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_104.SPA]]

Here we get a list of two NDDatasets because there are two types of files in the directory (.spa and .srs).
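
To see in advance how many groups a directory is likely to produce, the filenames can be grouped by extension. The sketch below uses only the standard library; the directory listing is hypothetical (in particular the exact name of the .srs file is an assumption), and it is not a description of how read_omnic sorts files internally:

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical directory listing mixing two OMNIC file types.
filenames = [
    "7_CZ0-100_Pd_101.SPA",
    "7_CZ0-100_Pd_102.SPA",
    "dd_6.6_19039_538.srs",
]

# Group the names by lower-cased extension.
groups = defaultdict(list)
for name in filenames:
    groups[PurePosixPath(name).suffix.lower()].append(name)

for ext, names in sorted(groups.items()):
    print(ext, len(names))
# .spa 2
# .srs 1
```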

The desired dataset can be obtained by indexing the list:

[24]:
Y[1]
[24]:
NDDataset: [float64] a.u. (shape: (y:4, x:5549))[7_CZ0-100 Pd_104]
Summary
name
:
7_CZ0-100 Pd_104
author
:
runner@fv-az2211-104
created
:
2025-04-27 01:45:43+00:00
description
:
Concatenation of 1 datasets:
( 7_CZ0-100 Pd_104 )
history
:
2025-04-27 01:45:43+00:00> Created by concatenate
2025-04-27 01:45:43+00:00> Merged from several files
Data
title
:
absorbance
values
:
...
[[ 1.544 1.543 ... 2.1 2.091]
[ 1.552 1.553 ... 2.161 2.109]
[ 1.461 1.46 ... 2.087 2.088]
[ 1.448 1.447 ... 2.071 2.065]] a.u.
shape
:
(y:4, x:5549)
Dimension `x`
size
:
5549
title
:
wavenumbers
coordinates
:
[ 6000 5999 ... 650.9 649.9] cm⁻¹
Dimension `y`
size
:
4
title
:
acquisition timestamp (GMT)
coordinates
:
[1.544e+09 1.544e+09 1.544e+09 1.544e+09] s
labels
:
...
[[ 2018-11-30 07:10:57+00:00 2018-11-30 07:22:52+00:00 2018-11-30 07:34:49+00:00 2018-11-30 07:46:44+00:00]
[ /home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_101.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_102.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_103.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_104.SPA]]

Other ways to select only the files with the required extension (.spa) are:

  • writing a list, as previously, that explicitly names the required files.

  • using a more specific reader:

[25]:
scp.read_spa("irdata/subdir/")
[25]:
NDDataset: [float64] a.u. (shape: (y:4, x:5549))[7_CZ0-100 Pd_104]
Summary
name
:
7_CZ0-100 Pd_104
author
:
runner@fv-az2211-104
created
:
2025-04-27 01:45:43+00:00
description
:
Concatenation of 4 datasets:
( 7_CZ0-100 Pd_101, 7_CZ0-100 Pd_102, 7_CZ0-100 Pd_103, 7_CZ0-100 Pd_104 )
history
:
2025-04-27 01:45:43+00:00> Created by concatenate
2025-04-27 01:45:43+00:00> Merged from several files
Data
title
:
absorbance
values
:
...
[[ 1.544 1.543 ... 2.1 2.091]
[ 1.552 1.553 ... 2.161 2.109]
[ 1.461 1.46 ... 2.087 2.088]
[ 1.448 1.447 ... 2.071 2.065]] a.u.
shape
:
(y:4, x:5549)
Dimension `x`
size
:
5549
title
:
wavenumbers
coordinates
:
[ 6000 5999 ... 650.9 649.9] cm⁻¹
Dimension `y`
size
:
4
title
:
acquisition timestamp (GMT)
coordinates
:
[1.544e+09 1.544e+09 1.544e+09 1.544e+09] s
labels
:
...
[[ 2018-11-30 07:10:57+00:00 2018-11-30 07:22:52+00:00 2018-11-30 07:34:49+00:00 2018-11-30 07:46:44+00:00]
[ /home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_101.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_102.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_103.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_104.SPA]]
  • using a pattern filter

[26]:
scp.read_omnic("irdata/subdir/", pattern="*.spa")
[26]:
NDDataset: [float64] a.u. (shape: (y:4, x:5549))[7_CZ0-100 Pd_104]
Summary
name
:
7_CZ0-100 Pd_104
author
:
runner@fv-az2211-104
created
:
2025-04-27 01:45:43+00:00
description
:
Concatenation of 4 datasets:
( 7_CZ0-100 Pd_101, 7_CZ0-100 Pd_102, 7_CZ0-100 Pd_103, 7_CZ0-100 Pd_104 )
history
:
2025-04-27 01:45:43+00:00> Created by concatenate
2025-04-27 01:45:43+00:00> Merged from several files
Data
title
:
absorbance
values
:
...
[[ 1.544 1.543 ... 2.1 2.091]
[ 1.552 1.553 ... 2.161 2.109]
[ 1.461 1.46 ... 2.087 2.088]
[ 1.448 1.447 ... 2.071 2.065]] a.u.
shape
:
(y:4, x:5549)
Dimension `x`
size
:
5549
title
:
wavenumbers
coordinates
:
[ 6000 5999 ... 650.9 649.9] cm⁻¹
Dimension `y`
size
:
4
title
:
acquisition timestamp (GMT)
coordinates
:
[1.544e+09 1.544e+09 1.544e+09 1.544e+09] s
labels
:
...
[[ 2018-11-30 07:10:57+00:00 2018-11-30 07:22:52+00:00 2018-11-30 07:34:49+00:00 2018-11-30 07:46:44+00:00]
[ /home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_101.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_102.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_103.SPA
/home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_104.SPA]]

One advantage of the latter solution is its greater flexibility. For instance, the following will select only the *101.spa and *102.spa files:

[27]:
scp.read_omnic("irdata/subdir/", pattern="*10[12].spa", merge=False)
[27]:
List (len=2, type=NDDataset)
    0: NDDataset: [float64] a.u. (shape: (y:1, x:5549))[7_CZ0-100 Pd_101]
    Summary
    name
    :
    7_CZ0-100 Pd_101
    author
    :
    runner@fv-az2211-104
    created
    :
    2025-04-27 01:45:43+00:00
    description
    :
    # Omnic name: 7_CZ0-100 Pd_101
    # Filename: 7_CZ0-100_Pd_101.SPA
    history
    :
    2025-04-27 01:45:43+00:00> Imported from spa file(s)
    2025-04-27 01:45:43+00:00> Data processing history from Omnic :
    ------------------------------------
    Acquisition échantillon

    Background acquis le Ven Nov 30 08:03:45 2018 (GMT+01:00)
    Format Final : Absorbance
    Résolution: 4,000 de 649,9207 à 5999,7134
    Roue de validation: 0
    Roue porte écran atténuation: Vide
    Numéro Série du banc:ALK1100494
    Data
    title
    :
    absorbance
    values
    :
    ...
    [[ 1.544 1.543 ... 2.1 2.091]] a.u.
    shape
    :
    (y:1, x:5549)
    Dimension `x`
    size
    :
    5549
    title
    :
    wavenumbers
    coordinates
    :
    [ 6000 5999 ... 650.9 649.9] cm⁻¹
    Dimension `y`
    size
    :
    1
    title
    :
    acquisition timestamp (GMT)
    coordinates
    :
    [1.544e+09] s
    labels
    :
    ...
    [[ 2018-11-30 07:10:57+00:00]
    [ /home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_101.SPA]]
    1: NDDataset: [float64] a.u. (shape: (y:1, x:5549))[7_CZ0-100 Pd_102]
    Summary
    name
    :
    7_CZ0-100 Pd_102
    author
    :
    runner@fv-az2211-104
    created
    :
    2025-04-27 01:45:43+00:00
    description
    :
    # Omnic name: 7_CZ0-100 Pd_102
    # Filename: 7_CZ0-100_Pd_102.SPA
    history
    :
    2025-04-27 01:45:43+00:00> Imported from spa file(s)
    2025-04-27 01:45:43+00:00> Data processing history from Omnic :
    ------------------------------------
    Acquisition échantillon

    Background acquis le Ven Nov 30 08:12:56 2018 (GMT+01:00)
    Format Final : Absorbance
    Résolution: 4,000 de 649,9207 à 5999,7134
    Roue de validation: 0
    Roue porte écran atténuation: Vide
    Numéro Série du banc:ALK1100494
    Data
    title
    :
    absorbance
    values
    :
    ...
    [[ 1.552 1.553 ... 2.161 2.109]] a.u.
    shape
    :
    (y:1, x:5549)
    Dimension `x`
    size
    :
    5549
    title
    :
    wavenumbers
    coordinates
    :
    [ 6000 5999 ... 650.9 649.9] cm⁻¹
    Dimension `y`
    size
    :
    1
    title
    :
    acquisition timestamp (GMT)
    coordinates
    :
    [1.544e+09] s
    labels
    :
    ...
    [[ 2018-11-30 07:22:52+00:00]
    [ /home/runner/.spectrochempy/testdata/irdata/subdir/7_CZ0-100_Pd_102.SPA]]
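
The way such a shell-style pattern selects filenames can be tried out with the standard fnmatch module. This is only a sketch of the pattern semantics: whether read_omnic relies on fnmatch internally is not stated here, and lower-casing the names to ignore the case of the extension is an assumption of the example:

```python
from fnmatch import fnmatchcase

filenames = [
    "7_CZ0-100_Pd_101.SPA",
    "7_CZ0-100_Pd_102.SPA",
    "7_CZ0-100_Pd_103.SPA",
    "7_CZ0-100_Pd_104.SPA",
]
pattern = "*10[12].spa"

# Lower-case the names so that ".SPA" and ".spa" are treated alike
# (an assumption of this sketch, not a statement about the reader).
selected = [f for f in filenames if fnmatchcase(f.lower(), pattern)]

print(selected)  # ['7_CZ0-100_Pd_101.SPA', '7_CZ0-100_Pd_102.SPA']
```

In the pattern, * matches any run of characters and [12] matches a single character that is either 1 or 2.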

Handling Metadata

Here is an example of accessing metadata:

[28]:
X = scp.read_omnic("irdata/CO@Mo_Al2O3.SPG")
print(f"Title: {X.title}")
print(f"Origin: {X.origin}")
print(f"Description: {X.description}")
Title: absorbance
Origin:
Description: Omnic title: Group sust Mo_Al2O3_base line.SPG
Omnic filename: /home/runner/.spectrochempy/testdata/irdata/CO@Mo_Al2O3.SPG

and now do some modifications:

[29]:
X.title = "Modified title"
X.origin = "OMNIC measurement"
X.description = "Modified description"
print("Modified metadata:")
print(f"Title: {X.title}")
print(f"Origin: {X.origin}")
print(f"Description: {X.description}")
Modified metadata:
Title: Modified title
Origin: OMNIC measurement
Description: Modified description

Reading the metadata now reflects the change:

[30]:
X.title
[30]:
'Modified title'

Error Handling

When trying to read a file, it is good practice to handle errors explicitly. For example:

[31]:
try:
    X = scp.read_omnic("nonexistent_file.spa")
except FileNotFoundError:
    scp.error_(FileNotFoundError, "File not found")
except Exception as e:
    scp.error_(f"Error reading file: {e}")
 File/directory not found locally: Attempt to download it from the GitHub repository `spectrochempy_data`...
 ERROR | FileNotFoundError: File not found
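
The same pattern applies to any file access in Python. The sketch below is a standard-library analogue: read_bytes_or_none is a hypothetical helper, not a SpectroChemPy function. It shows why the more specific exception has to be listed first, since FileNotFoundError is a subclass of OSError:

```python
from pathlib import Path


def read_bytes_or_none(path):
    """Return the file content, or None if the file cannot be read."""
    try:
        return Path(path).read_bytes()
    except FileNotFoundError:
        # Most specific case first: the path does not exist.
        print(f"File not found: {path}")
    except OSError as exc:
        # Any other I/O problem, e.g. missing permissions.
        print(f"Error reading file: {exc}")
    return None


content = read_bytes_or_none("nonexistent_file.spa")
print(content)  # None
```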

Advanced Data Operations

Example of data manipulation:

[32]:
X = scp.read_omnic("irdata/CO@Mo_Al2O3.SPG")
  • Baseline correction

[33]:
X_corrected = X - X[0]  # Subtract first spectrum as baseline
  • Normalization

[34]:
X_normalized = X / X.max()
[35]:
print("Original data shape:", X.shape)
print("Max value before normalization:", X.max())
print("Max value after normalization:", X_normalized.max())
Original data shape: (19, 3112)
Max value before normalization: 0.24812382459640503 a.u.
Max value after normalization: 1.0
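
The two operations are simple element-wise arithmetic. As a sketch with nested lists standing in for the data array (the 3 x 3 values are made up for illustration):

```python
# Hypothetical 3 x 3 absorbance array (rows are spectra).
data = [
    [0.0, 1.0, 0.0],
    [0.5, 2.0, 0.5],
    [1.0, 4.0, 1.0],
]

# Subtract the first spectrum from every spectrum (reference subtraction).
reference = data[0]
corrected = [[v - r for v, r in zip(row, reference)] for row in data]

# Divide by the global maximum so that the largest value becomes 1.
global_max = max(max(row) for row in data)
normalized = [[v / global_max for v in row] for row in data]

print(corrected[2])                         # [1.0, 3.0, 1.0]
print(max(max(row) for row in normalized))  # 1.0
```

Note that subtracting the first spectrum removes a common reference rather than fitting a baseline, and that dividing by a single global maximum preserves the relative intensities between spectra. Normalizing each spectrum separately would instead divide every row by its own maximum.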