Adding a Reader
Import of spectroscopic data with their metadata from various file formats is a key feature of SpectroChemPy. Data
import is handled through the generic read() function (in spectrochempy.core.readers.importer).
This guide describes the steps to add a specific reader, using the example of reading Grams/Thermo .spc files.
Step 1: Add Tests and Sample Files
Following Test-Driven Development, start by writing tests and providing sample files:
Create a test file in tests/test_core/test_readers/test_xxx.py
Add sample files in spectrochempy_data/testdata/xxx_data/
Write a basic test case:
import spectrochempy as scp


def test_read_spc():
    path = "spc_data/BENZENE.SPC"
    dataset = scp.read_spc(path)
    assert dataset.shape == (1, 1842)
    assert isinstance(dataset, scp.NDDataset)
For local testing, configure the data directory:
from pathlib import Path

scp.preferences.datadir = Path("path/to/testdata")
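The effect of setting a data directory can be pictured with a small sketch. It assumes, as the test above suggests, that a relative path such as "spc_data/BENZENE.SPC" is resolved against the configured data directory. The helper resolve_sample is hypothetical, written only for illustration, and is not part of SpectroChemPy.

```python
from pathlib import Path


def resolve_sample(datadir, relpath):
    """Return the path of a sample file (hypothetical helper).

    Absolute paths are returned unchanged; relative paths are
    joined to ``datadir``.
    """
    relpath = Path(relpath)
    if relpath.is_absolute():
        return relpath
    return Path(datadir) / relpath


sample = resolve_sample("path/to/testdata", "spc_data/BENZENE.SPC")
print(sample.as_posix())
```

Keeping test paths relative in this way lets the same test run on any machine once the data directory points at a local copy of the test data.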
Step 2: Register the File Format
Add format details in spectrochempy/core/readers/importer.py:
FILETYPES = [
    # ...existing code...
    ("galactic", "GRAMS/Thermo Galactic files (*.spc)"),
]
ALIAS = [
    # ...existing code...
    ("galactic", "spc"),
]
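To see how the two tables relate, the sketch below looks up a protocol name from a file extension. It only illustrates how such tables could be consulted; the function protocol_for is hypothetical and the lookup logic is an assumption, not the code used in importer.py.

```python
import re
from pathlib import Path

# Entries copied from the registration step; the real lists are longer.
FILETYPES = [
    ("galactic", "GRAMS/Thermo Galactic files (*.spc)"),
]
ALIAS = [
    ("galactic", "spc"),
]


def protocol_for(filename):
    """Guess a protocol name from a file extension (illustrative only)."""
    suffix = Path(filename).suffix.lower().lstrip(".")
    aliases = dict(ALIAS)
    for key, description in FILETYPES:
        # Collect the extensions written as "*.ext" in the description.
        extensions = re.findall(r"\*\.(\w+)", description)
        if suffix in extensions:
            # Fall back to the key itself when no alias is registered.
            return aliases.get(key, key)
    return None


print(protocol_for("BENZENE.SPC"))
```

In this sketch the FILETYPES entry ties the "galactic" key to the .spc extension, and the ALIAS entry maps that key to the shorter name "spc" used by the reader.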
Step 3: Create the Reader Module
Create spectrochempy/core/readers/reader_xxx.py:
# Basic structure for reader_spc.py
from spectrochempy.core.dataset.nddataset import NDDataset
from spectrochempy.core.readers.importer import _importer_method, Importer

__all__ = ["read_spc"]
__dataset_methods__ = __all__


def read_spc(*paths, **kwargs):
    """Read Thermo Galactic .spc file(s).

    Parameters
    ----------
    *paths : str or Path
        Path(s) to .spc file(s)
    **kwargs
        Additional import options

    Returns
    -------
    NDDataset or list of NDDataset
        Loaded spectral data
    """
    kwargs["filetypes"] = ["GRAMS/Thermo Galactic files (*.spc)"]
    kwargs["protocol"] = ["spc"]
    importer = Importer()
    return importer(*paths, **kwargs)


@_importer_method
def _read_spc(*args, **kwargs):
    """Internal reader implementation."""
    dataset, filename = args
    # ...implementation details...
    return dataset
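The split between a public read_spc and a private, decorated _read_spc can be illustrated with a toy model. The classes and names below (MiniImporter, importer_method) are hypothetical stand-ins written for this sketch; they assume the decorator attaches the private function to the importer class so that it can be found by protocol name, and they are not SpectroChemPy's actual Importer or _importer_method.

```python
class MiniImporter:
    """Toy stand-in for an importer class (not SpectroChemPy's Importer)."""

    def __call__(self, path, protocol):
        # Look up the private reader attached under the name _read_<protocol>.
        reader = getattr(self, f"_read_{protocol}", None)
        if reader is None:
            raise NotImplementedError(f"no reader registered for {protocol!r}")
        # An empty dict plays the role of the dataset to be filled.
        return reader({}, path)


def importer_method(func):
    """Attach ``func`` to MiniImporter under its own name."""
    setattr(MiniImporter, func.__name__, staticmethod(func))
    return func


@importer_method
def _read_spc(dataset, filename):
    # A real reader would parse the file; here only the origin is recorded.
    dataset["origin"] = f"spc:{filename}"
    return dataset


result = MiniImporter()("BENZENE.SPC", protocol="spc")
print(result)
```

With this arrangement, adding a format means writing one decorated function; the dispatching code itself does not change.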
Step 4: Data Format Guidelines
When implementing the reader:
Always return 2D datasets, even for 1D spectra
Use timestamps for time axes when available
Include relevant metadata and units
Add a proper description
Example of proper axis setup:
# Set up coordinates
x_data = get_wavelengths(file)  # Your implementation
x_coord = scp.Coord(x_data, title="wavelength", units="nm")
y_data = get_timestamps(file)  # Your implementation
y_coord = scp.Coord(y_data, title="Time", units="s",
                    labels=acquisition_dates)

# Create dataset
data = get_spectra(file)  # Your implementation
dataset = NDDataset(data,
                    coords=[y_coord, x_coord],
                    title="Absorption",
                    units="absorbance")
dataset.description = "Dataset from .spc file\n"
dataset.history.append(f"Imported from {filename}")
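Two of the guidelines above, returning 2D data even for a single spectrum and using timestamps for the time axis, can be sketched with the standard library alone. The helpers as_2d and to_timestamps are hypothetical names used only for illustration; a real reader would more likely work with NumPy arrays, and the sample values are made up.

```python
from datetime import datetime, timezone


def as_2d(values):
    """Wrap a single spectrum (flat list) into a one-row 2D list."""
    if values and isinstance(values[0], (list, tuple)):
        # Already 2D: copy each row as a list.
        return [list(row) for row in values]
    return [list(values)]


def to_timestamps(dates):
    """Convert timezone-aware datetimes to POSIX timestamps in seconds."""
    return [d.timestamp() for d in dates]


spectrum = [0.12, 0.15, 0.11]  # one 1D spectrum with three points
acquired = [datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)]

data = as_2d(spectrum)
y_values = to_timestamps(acquired)
print(len(data), len(data[0]), y_values)
```

The resulting data has one row per acquisition, so its first dimension lines up with the list of timestamps used for the y coordinate.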
Step 5: Documentation
Add docstrings following NumPy style
Include examples in docstrings
Add reader to main documentation
Update whatsnew/changelog.rst
For complete implementation examples, see existing readers in spectrochempy/core/readers/.