spectrochempy.CP

class CP(*, log_level='WARNING', warm_start=False, cvg_criterion='abs_rec_error', fixed_modes, hard_sparsity=None, init='svd', l1_reg=None, l2_reg=None, l2_square_reg=None, monotonicity=None, n_components=0, n_iter_max=100, n_iter_max_inner=10, non_negative=False, normalize=None, normalized_sparsity=None, random_state=None, return_errors=False, simplex=None, smoothness=None, soft_sparsity=None, svd='truncated_svd', tol_inner=1e-06, tol_outer=1e-08, unimodality=None, verbose=0)[source]

CP/PARAFAC decomposition of 3D datasets using TensorLy.

CP (Canonical Polyadic) decomposition, also known as PARAFAC, factorizes a 3D tensor into a sum of rank-1 tensors. By default, this implementation uses TensorLy’s parafac function. When constraints or penalties are active (e.g., non_negative, l1_reg, smoothness), it automatically switches to TensorLy’s constrained_parafac function, which uses AO-ADMM optimization.
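To make the "sum of rank-1 tensors" concrete, here is a minimal NumPy sketch. It does not call spectrochempy or TensorLy; the sizes, the random factors, and the unit weights are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 6, 2          # tensor dimensions and rank (assumed values)
A = rng.random((I, R))           # mode-0 factor matrix
B = rng.random((J, R))           # mode-1 factor matrix
C = rng.random((K, R))           # mode-2 factor matrix
weights = np.ones(R)             # one weight per component

# Build the tensor explicitly as a weighted sum of R rank-1 tensors a_r o b_r o c_r.
X = np.zeros((I, J, K))
for r in range(R):
    rank_one = np.multiply.outer(np.multiply.outer(A[:, r], B[:, r]), C[:, r])
    X += weights[r] * rank_one

# The same model written as a single einsum contraction.
X_einsum = np.einsum("r,ir,jr,kr->ijk", weights, A, B, C)
```

Both constructions yield the same (I, J, K) array; a CP fit searches for the A, B, C and weights that best reproduce a given dataset in this form.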

Note: fixed_modes is supported by both parafac and constrained_parafac, so using fixed_modes alone does not trigger the constrained path.
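The selection rule described above can be sketched as a small predicate. This is an illustration only: the helper name `uses_constrained_path`, the tuple `CONSTRAINT_NAMES`, and the rule "anything other than None or False counts as active" are assumptions, not the actual spectrochempy implementation.

```python
# Hypothetical list of the options that count as constraints or penalties.
CONSTRAINT_NAMES = (
    "non_negative", "l1_reg", "l2_reg", "l2_square_reg", "unimodality",
    "normalize", "simplex", "normalized_sparsity", "soft_sparsity",
    "smoothness", "monotonicity", "hard_sparsity",
)


def uses_constrained_path(**params):
    """Return True if any constraint or penalty option is active (assumed rule)."""
    for name in CONSTRAINT_NAMES:
        value = params.get(name)
        if value is not None and value is not False:
            return True
    return False


only_fixed = uses_constrained_path(fixed_modes=[0])    # fixed_modes is not a constraint
with_nonneg = uses_constrained_path(non_negative=True)
with_l1 = uses_constrained_path(l1_reg=0.1, fixed_modes=[0])
```

Under these assumptions, `only_fixed` is False while `with_nonneg` and `with_l1` are True, which mirrors the note: fixed_modes on its own stays on the parafac path.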

Only 3D datasets are supported. For 2D data, use PCA, NMF, or SVD instead.

Parameters:

n_components – Number of components (rank) for the decomposition.
n_iter_max – Maximum number of outer iterations for the ALS algorithm.
n_iter_max_inner – Number of iterations for inner loop (ADMM optimization).
init – Type of factor matrix initialization. If a CPTensor is passed, it is used directly.
svd – Function to use for SVD computation during initialization.
tol_outer – Relative reconstruction error tolerance for outer loop convergence.
tol_inner – Absolute reconstruction error tolerance for inner loop (ADMM optimization).
random_state – Seed for random number generator or RandomState instance for reproducibility.
verbose – Level of verbosity for iteration logging.
return_errors – Whether to store iteration errors after fitting. Errors are only available when using constrained_parafac (i.e., when constraints are active). For unconstrained parafac, errors will be None.
non_negative – If True, applies non-negative constraint to all modes. Can also be a dict specifying constraints per mode.
l1_reg – L1 norm regularization parameter for sparsity. Applied to factors.
l2_reg – L2 norm regularization parameter. Applied to factors.
l2_square_reg – L2 square norm regularization parameter. Applied to factors.
unimodality – If True, enforces unimodality constraint on all modes.
normalize – If True, normalizes factors by dividing by maximum value.
simplex – Projects factors onto the simplex with the given parameter.
normalized_sparsity – Normalizes factors with L1 norm after hard thresholding.
soft_sparsity – Imposes L1 norm bound on factor columns.
smoothness – Optimizes factors by solving a banded system for smoothness.
monotonicity – If True, projects factor columns to monotonically decreasing distribution.
hard_sparsity – Applies hard thresholding with the given threshold.
cvg_criterion – Stopping criterion for ALS. “abs_rec_error” uses absolute difference, “rec_error” uses relative difference.
fixed_modes – Modes for which initial values are not modified during optimization. The last mode cannot be fixed.
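How the outer-loop parameters interact (n_iter_max, tol_outer, fixed_modes) can be illustrated with a bare-bones unconstrained ALS written in NumPy. This is a teaching sketch under stated assumptions, not TensorLy's parafac: the function `cp_als` and its signature are hypothetical, it handles 3-way arrays only, it always stops on the absolute change of the relative error, and it does not reproduce the restriction that the last mode cannot be fixed.

```python
import numpy as np


def cp_als(X, factors, n_iter_max=100, tol_outer=1e-8, fixed_modes=()):
    """Unconstrained ALS for a 3-way array; `factors` holds the initial A, B, C."""
    factors = [np.array(f, dtype=float) for f in factors]
    # einsum patterns computing X_(mode) times the Khatri-Rao product of the other factors
    mttkrp_specs = ("ijk,jr,kr->ir", "ijk,ir,kr->jr", "ijk,ir,jr->kr")
    norm_x = np.linalg.norm(X)
    errors = []
    for _ in range(n_iter_max):
        for mode in range(3):
            if mode in fixed_modes:
                continue  # a fixed mode keeps its initial value
            others = [factors[m] for m in range(3) if m != mode]
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])
            mttkrp = np.einsum(mttkrp_specs[mode], X, *others)
            factors[mode] = mttkrp @ np.linalg.pinv(gram)  # least-squares update
        X_hat = np.einsum("ir,jr,kr->ijk", *factors)
        errors.append(np.linalg.norm(X - X_hat) / norm_x)
        # stop when the error no longer changes by more than tol_outer
        if len(errors) > 1 and abs(errors[-2] - errors[-1]) < tol_outer:
            break
    return factors, errors


rng = np.random.default_rng(0)
A_true, B_true, C_true = (rng.random((n, 2)) for n in (6, 8, 10))
X = np.einsum("ir,jr,kr->ijk", A_true, B_true, C_true)

# Fix mode 0 at its known value and let ALS update only modes 1 and 2.
init = [A_true, rng.random((8, 2)), rng.random((10, 2))]
factors, errors = cp_als(X, init, fixed_modes=[0])
```

After the call, `factors[0]` is still identical to `A_true`, and `errors` holds one relative reconstruction error per outer iteration.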

cvg_criterion : any value of ['abs_rec_error', 'rec_error'], optional, default: 'abs_rec_error'

Stopping criterion for ALS, works if tol is not None.

If ‘rec_error’, ALS stops at current iteration if (previous rec_error - current rec_error) < tol. If ‘abs_rec_error’, ALS terminates when |previous rec_error - current rec_error| < tol.

fixed_modes : list, optional, default: []

A list of modes for which the initial value is not modified. The last mode cannot be fixed due to error computation.

hard_sparsity : a float or a list or a dict, optional, default: None

Hard thresholding with the given threshold.

init : any value of ['random', 'svd', 'CPTensor'], optional, default: 'svd'

Type of factor matrix initialization.

If a CPTensor is passed, this is directly used for initialization. See initialize_factors.

l1_reg : a float or a list or a dict, optional, default: None

Penalizes the factor with the l1 norm using the input value as regularization parameter.

l2_reg : a float or a list or a dict, optional, default: None

Penalizes the factor with the l2 norm using the input value as regularization parameter.

l2_square_reg : a float or a list or a dict, optional, default: None

Penalizes the factor with the l2 square norm using the input value as regularization parameter.

monotonicity : a boolean or a dict, optional, default: None

Projects columns to monotonically decreasing distribution. Applied to each column separately. If True, monotonicity constraint is applied to all modes.

n_components : int, optional, default: 0

Number of components (this is the ‘rank’ parameter used in TensorLy).

n_iter_max : int, optional, default: 100

Maximum number of outer iterations.

n_iter_max_inner : int, optional, default: 10

Number of iterations for inner loop (ADMM optimization).

non_negative : a boolean or a dict, optional, default: False

This constraint clips negative values to ‘0’.

If True, non-negative constraint is applied to all modes.

normalize : a boolean or a dict, optional, default: None

This constraint divides all the values by the maximum value of the input array. If True, normalize constraint is applied to all modes.

normalized_sparsity : a float or a list or a dict, optional, default: None

Normalizes with the L1 norm after hard thresholding.

random_state : an int or a RandomState, optional, default: None

If int, used to set the seed of the random number generator. If numpy.random.RandomState, used to initialize factor matrices with uniform distribution.

return_errors : bool, optional, default: False

Activate return of iteration errors.

simplex : a float or a list or a dict, optional, default: None

Projects on the simplex with the given parameter. Applied to each column separately.

smoothness : a float or a list or a dict, optional, default: None

Optimizes the factors by solving a banded system.

soft_sparsity : a float or a list or a dict, optional, default: None

Impose that the columns of factors have L1 norm bounded by a user-defined threshold.

svd : any value of ['numpy_svd', 'truncated_svd', 'randomized_svd'], optional, default: 'truncated_svd'

Function to use to compute the SVD. Maps to tensorly SVD functions.

tol_inner : float, optional, default: 1e-06

Absolute reconstruction error tolerance for factor update during the inner loop, i.e., ADMM optimization.

tol_outer : float, optional, default: 1e-08

Relative reconstruction error tolerance for outer loop.

The algorithm is considered to have found a local minimum when the reconstruction error is less than tol_outer.

unimodality : a boolean or a dict, optional, default: None

If True, enforces unimodality constraint on all modes. Applied to each mode separately.

verbose : int, optional, default: 0

Level of verbosity.
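The two cvg_criterion rules listed among the parameters above differ only in whether the change in reconstruction error is taken in absolute value. A minimal sketch follows, assuming a hypothetical helper `has_converged` that is not part of spectrochempy or TensorLy.

```python
def has_converged(prev_err, cur_err, tol, cvg_criterion="abs_rec_error"):
    """Apply the documented stopping rules to two consecutive reconstruction errors."""
    if tol is None:
        return False  # the criterion only applies when a tolerance is set
    decrease = prev_err - cur_err
    if cvg_criterion == "abs_rec_error":
        return abs(decrease) < tol
    if cvg_criterion == "rec_error":
        return decrease < tol
    raise ValueError(f"unknown cvg_criterion: {cvg_criterion!r}")


# The error went UP from 0.5 to 0.6 between two iterations.
stop_signed = has_converged(0.5, 0.6, 1e-8, "rec_error")        # signed difference is negative
stop_absolute = has_converged(0.5, 0.6, 1e-8, "abs_rec_error")  # |difference| is 0.1
```

With 'rec_error', an increase in the error also satisfies the test and stops the iteration, whereas 'abs_rec_error' stops only once the error has stabilized in either direction.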

See also

PCA: Principal Component Analysis for 2D data.
SVD: Singular Value Decomposition for 2D data.

Notes

This method requires the optional dependency tensorly. Install it with:

pip install tensorly

CP decomposition is sensitive to initialization and rank choice. The results may vary with different random seeds. Use random_state for reproducibility.

Core consistency (CORCONDIA) helps assess model validity. Values close to 100% indicate a good fit, while negative values suggest overfactoring.
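For reference, the core consistency diagnostic is commonly defined by comparing the least-squares Tucker core computed from the fitted loadings with an ideal superdiagonal core. The sketch below follows that common definition in NumPy; whether spectrochempy computes its core_consistency attribute in exactly this way is an assumption, and the function name is hypothetical. Component weights are assumed to be absorbed into the factor matrices.

```python
import numpy as np


def core_consistency(X, A, B, C):
    """CORCONDIA (in percent) for a 3-way array X and loadings A, B, C."""
    n_comp = A.shape[1]
    # Least-squares Tucker core for the given loadings: G = X x1 A+ x2 B+ x3 C+
    G = np.einsum(
        "ijk,pi,qj,rk->pqr",
        X, np.linalg.pinv(A), np.linalg.pinv(B), np.linalg.pinv(C),
    )
    # Ideal CP core: ones on the superdiagonal, zeros elsewhere
    T = np.zeros((n_comp, n_comp, n_comp))
    idx = np.arange(n_comp)
    T[idx, idx, idx] = 1.0
    return 100.0 * (1.0 - np.sum((G - T) ** 2) / n_comp)


rng = np.random.default_rng(0)
A, B, C = (rng.random((n, 2)) for n in (6, 8, 10))
X = np.einsum("ir,jr,kr->ijk", A, B, C)   # exactly trilinear, rank-2 data
cc = core_consistency(X, A, B, C)
```

For exactly trilinear data and the true loadings, the computed core matches the superdiagonal one and `cc` is 100 up to rounding. Off-superdiagonal elements in G lower the value, and it becomes negative once the squared deviation exceeds the number of components.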

Examples

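>>> import numpy as np
>>> import spectrochempy as scp
>>> X = np.random.rand(6, 8, 10)  # synthetic 3D data
>>> ds = scp.NDDataset(X)
>>> cp = scp.CP(n_components=2)
>>> _ = cp.fit(ds)  # fit returns the fitted instance
>>> A, B, C = cp.loadings  # one factor matrix per mode
>>> Xr = cp.inverse_transform()  # reconstructed dataset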

A

Factor matrix for mode 0 with shape (mode_0_size, n_components).

Type: NDDataset

B

Factor matrix for mode 1 with shape (mode_1_size, n_components).

Type: NDDataset

C

Factor matrix for mode 2 with shape (mode_2_size, n_components).

Type: NDDataset

loadings

Tuple of factor matrices (A, B, C).

Type: tuple

weights

Weights from CP decomposition.

Type: ndarray

errors

Iteration errors during fitting. Available when TensorLy returns them (typically with constrained_parafac). Returns None if unavailable or if not fitted yet.

Type: list or None

SSE

Sum of Squared Errors of the reconstruction.

Type: float

explained_variance

Percentage of variance explained by the model.

Type: float

core_consistency

CORCONDIA (Core Consistency) diagnostic value. Can be negative if overfactoring.

Type: float
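The SSE and explained_variance attributes listed above can be related to the reconstruction with a short NumPy sketch. The helper `fit_diagnostics` is hypothetical, and the choice to express explained variance relative to the uncentered total sum of squares of X is an assumption; the library may use a different reference.

```python
import numpy as np


def fit_diagnostics(X, X_hat):
    """Return (SSE, explained variance in percent) for a reconstruction X_hat of X."""
    residual = X - X_hat
    sse = float(np.sum(residual ** 2))
    total = float(np.sum(X ** 2))          # uncentered total sum of squares (assumed)
    explained = 100.0 * (1.0 - sse / total)
    return sse, explained


rng = np.random.default_rng(0)
X = rng.random((6, 8, 10))
sse_perfect, ev_perfect = fit_diagnostics(X, X.copy())        # perfect reconstruction
sse_null, ev_null = fit_diagnostics(X, np.zeros_like(X))      # model that explains nothing
```

A perfect reconstruction gives an SSE of 0 and 100% explained variance, while an all-zero reconstruction gives 0%.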

fit(X)[source]: Fit the CP model to a 3D dataset.

fit_transform(X)[source]: Fit the model and return the reconstructed tensor.

inverse_transform()[source]: Return the reconstructed tensor from fitted factors.

Initialize the BaseConfigurable class.

Parameters:

log_level (int, optional) – The log level at startup. Default is logging.WARNING.
**kwargs (dict) – Additional keyword arguments for configuration.

Attributes Summary

`A`	Return factor matrix A (mode 0 loadings).
`B`	Return factor matrix B (mode 1 loadings).
`C`	Return factor matrix C (mode 2 loadings).
`SSE`	Return Sum of Squared Errors.
`X`	Return the X input dataset (possibly modified by the model).
`Y`	The `Y` input.
`components`	Return factor B (mode 1) as components, following PCA convention.
`config`	`traitlets.config.Config` object.
`core_consistency`	Return CORCONDIA (Core Consistency) diagnostic.
`cvg_criterion`	Stopping criterion for ALS, works if `tol` is not None.
`errors`	Return iteration errors if return_errors was True.
`explained_variance`	Return explained variance percentage.
`fixed_modes`	A list of modes for which the initial value is not modified.
`hard_sparsity`	Hard thresholding with the given threshold.
`init`	Type of factor matrix initialization.
`l1_reg`	Penalizes the factor with the l1 norm using the input value as regularization parameter.
`l2_reg`	Penalizes the factor with the l2 norm using the input value as regularization parameter.
`l2_square_reg`	Penalizes the factor with the l2 square norm using the input value as regularization parameter.
`loadings`	Return tuple of factor matrices (A, B, C).
`log`	Return `log` output.
`monotonicity`	Projects columns to monotonically decreasing distribution.
`n_components`	Number of components (this is the 'rank' parameter used in TensorLy).
`n_iter_max`	Maximum number of outer iterations.
`n_iter_max_inner`	Number of iterations for inner loop (ADMM optimization).
`name`	Object name
`non_negative`	This constraint clips negative values to '0'.
`normalize`	This constraint divides all the values by maximum value of the input array.
`normalized_sparsity`	Normalizes with the norm after hard thresholding.
`random_state`	If int, used to set the seed of the random number generator.
`return_errors`	Activate return of iteration errors.
`simplex`	Projects on the simplex with the given parameter.
`smoothness`	Optimizes the factors by solving a banded system.
`soft_sparsity`	Impose that the columns of factors have L1 norm bounded by a user-defined threshold.
`svd`	Function to use to compute the SVD.
`tol_inner`	Absolute reconstruction error tolerance for factor update during inner loop, i.e., ADMM optimization.
`tol_outer`	Relative reconstruction error tolerance for outer loop.
`unimodality`	If True, enforces unimodality constraint on all modes.
`verbose`	Level of verbosity.
`weights`	Return the weights from CP decomposition.

Methods Summary

`fit`(X)	Fit the CP model on X.
`fit_transform`(X, **kwargs)	Fit the CP model on X and return the factors.
`get_components`([n_components])	Return the component's dataset: (selected n_components, n_features).
`inverse_transform`([X_transform])	Transform data back to its original space.
`parameters`([replace, removed, default])	Alias for `params` method.
`params`([default])	Return current or default configuration values.
`plot_merit`([X, X_hat])	Plot the input (`X`), reconstructed (`X_hat`) and residuals.
`plotmerit`([X, X_hat])	Plot the input (`X`), reconstructed (`X_hat`) and residuals.
`reconstruct`([X_transform])	Transform data back to its original space.
`reduce`([X])	Apply dimensionality reduction to `X`.
`reset`()	Reset configuration parameters to their default values.
`to_dict`()	Return config value in a dict form.
`transform`([X])	Apply dimensionality reduction to `X`.

Attributes Documentation

A: Return factor matrix A (mode 0 loadings).

B: Return factor matrix B (mode 1 loadings).

C: Return factor matrix C (mode 2 loadings).

SSE: Return Sum of Squared Errors.

X: Return the X input dataset (eventually modified by the model).

Y: The Y input.

components

Return factor B (mode 1) as components, following PCA convention.

Returns: NDDataset – Factor B with shape (mode_1_size, n_components).

config: traitlets.config.Config object.

core_consistency

Return CORCONDIA (Core Consistency) diagnostic.

Returns: float – Core consistency value. Can be negative if overfactoring occurred.

cvg_criterion

Stopping criterion for ALS, works if tol is not None.

If ‘rec_error’, ALS stops at current iteration if (previous rec_error - current rec_error) < tol. If ‘abs_rec_error’, ALS terminates when |previous rec_error - current rec_error| < tol.

errors: Return iteration errors if return_errors was True.

explained_variance: Return explained variance percentage.

fixed_modes: A list of modes for which the initial value is not modified. The last mode cannot be fixed due to error computation.

hard_sparsity: Hard thresholding with the given threshold.

init

Type of factor matrix initialization.

If a CPTensor is passed, this is directly used for initialization. See initialize_factors.

l1_reg: Penalizes the factor with the l1 norm using the input value as regularization parameter.

l2_reg: Penalizes the factor with the l2 norm using the input value as regularization parameter.

l2_square_reg: Penalizes the factor with the l2 square norm using the input value as regularization parameter.

loadings: Return tuple of factor matrices (A, B, C).

log: Return log output.

monotonicity: Projects columns to monotonically decreasing distribution. Applied to each column separately. If True, monotonicity constraint is applied to all modes.

n_components: Number of components (this is the ‘rank’ parameter used in TensorLy).

n_iter_max: Maximum number of outer iterations.

n_iter_max_inner: Number of iterations for inner loop (ADMM optimization).

name: Object name

non_negative

This constraint clips negative values to ‘0’.

If True, non-negative constraint is applied to all modes.

normalize: This constraint divides all the values by maximum value of the input array. If True, normalize constraint is applied to all modes.

normalized_sparsity: Normalizes with the norm after hard thresholding.

random_state: If int, used to set the seed of the random number generator. If numpy.random.RandomState, used to initialize factor matrices with uniform distribution.

return_errors: Activate return of iteration errors.

simplex: Projects on the simplex with the given parameter. Applied to each column separately.

smoothness: Optimizes the factors by solving a banded system.

soft_sparsity: Impose that the columns of factors have L1 norm bounded by a user-defined threshold.

svd: Function to use to compute the SVD. Maps to tensorly SVD functions.

tol_inner: Absolute reconstruction error tolerance for factor update during inner loop, i.e., ADMM optimization.

tol_outer

Relative reconstruction error tolerance for outer loop.

The algorithm is considered to have found a local minimum when the reconstruction error is less than tol_outer.

unimodality: If True, enforces unimodality constraint on all modes. Applied to each mode separately.

verbose: Level of verbosity.

weights: Return the weights from CP decomposition.

Methods Documentation

fit(X)[source]

Fit the CP model on X.

Parameters: X (NDDataset) – 3D dataset to decompose.
Returns: self (CP) – The fitted CP instance.

fit_transform(X, **kwargs)[source]

Fit the CP model on X and return the factors.

Parameters:

X (NDDataset) – 3D dataset to decompose.
**kwargs – Additional keyword arguments passed to fit.

Returns:

tuple of NDDataset – The factor matrices (A, B, C).

get_components(n_components=None)

Return the component’s dataset: (selected n_components, n_features).

Parameters: n_components (int, optional, default: None) – The number of components to keep in the output dataset. If None, all calculated components are returned.
Returns: NDDataset – Dataset with shape (n_components, n_features).

inverse_transform(X_transform=None, **kwargs)[source]

Transform data back to its original space.

Reconstruct the original tensor from the CP factors.

Parameters:

X_transform (None) – Ignored. Present for API compatibility.
**kwargs – Additional keyword arguments (ignored).

Returns:

NDDataset – Reconstructed dataset with shape matching the input.

parameters(default=False)[source]: Alias for params method.

Deprecated since version 0.8.0: Use params instead.

params(default=False)[source]

Return current or default configuration values.

Parameters: default (bool, optional, default: False) – If default is True, the default parameters are returned, else the current values.
Returns: dict – Current or default configuration values.

plot_merit(X=None, X_hat=None, **kwargs)[source]

Plot the input (X), reconstructed (X_hat) and residuals.

\(X\) and \(\hat{X}\) can be passed as arguments. If not, the X attribute is used for \(X\) and \(\hat{X}\) is computed by the inverse_transform method.

Parameters:

X (NDDataset, optional) – Original dataset. If not provided (default), the X attribute is used and X_hat is computed using inverse_transform.
X_hat (NDDataset, optional) – Inverse transformed dataset. If X is provided, X_hat must also be provided, as computed externally.

Returns:

Axes – Matplotlib subplot axes.

Other Parameters:

exp_c (color, colormap, or list of colors, optional) – Color(s) for experimental spectra.
    - None: use unified semantic resolver (auto-detect categorical/sequential)
    - Single color: use for all experimental spectra
    - Colormap name/object: sample colors from colormap
    - List/tuple: use as explicit color cycle
calc_c (color, colormap, or list of colors, optional) – Color(s) for calculated spectra.
    - None: use default blue "#2a6fbb"
    - Single color: use for all calculated spectra
    - Colormap name/object: sample colors from colormap
    - List/tuple: use as explicit color cycle
resid_c (color, colormap, or list of colors, optional) – Color(s) for residual spectra.
    - None: use default grey "0.4"
    - Single color: use for all residual spectra
    - Colormap name/object: sample colors from colormap
    - List/tuple: use as explicit color cycle
exp_linestyle (str, optional) – Line style for experimental spectra. Default: "-".
calc_linestyle (str, optional) – Line style for calculated spectra. Default: "--".
resid_linestyle (str, optional) – Line style for residual spectra. Default: "-".
exp_linewidth (float, optional) – Line width for experimental spectra. Default: 1.2.
calc_linewidth (float, optional) – Line width for calculated spectra. Default: 1.0.
resid_linewidth (float, optional) – Line width for residual spectra. Default: 1.0.
min_contrast (float, optional) – Minimum contrast ratio for sequential colormaps. Default: 1.5.
offset (float, optional, default: None) – Specify the separation (in percent) between \(X\), \(\hat{X}\) and \(E\).
nb_traces (int or 'all', optional) – Number of lines to display. Default is 'all'.
**others (other keyword parameters) – Parameters passed to the internal plot method of the X dataset.

plotmerit(X=None, X_hat=None, **kwargs)[source]

Plot the input (X), reconstructed (X_hat) and residuals.

\(X\) and \(\hat{X}\) can be passed as arguments. If not, the X attribute is used for \(X\) and \(\hat{X}\) is computed by the inverse_transform method.

Parameters:

X (NDDataset, optional) – Original dataset. If not provided (default), the X attribute is used and X_hat is computed using inverse_transform.
X_hat (NDDataset, optional) – Inverse transformed dataset. If X is provided, X_hat must also be provided, as computed externally.

Returns:

Axes – Matplotlib subplot axes.

Other Parameters:

exp_c (color, colormap, or list of colors, optional) – Color(s) for experimental spectra.
    - None: use unified semantic resolver (auto-detect categorical/sequential)
    - Single color: use for all experimental spectra
    - Colormap name/object: sample colors from colormap
    - List/tuple: use as explicit color cycle
calc_c (color, colormap, or list of colors, optional) – Color(s) for calculated spectra.
    - None: use default blue "#2a6fbb"
    - Single color: use for all calculated spectra
    - Colormap name/object: sample colors from colormap
    - List/tuple: use as explicit color cycle
resid_c (color, colormap, or list of colors, optional) – Color(s) for residual spectra.
    - None: use default grey "0.4"
    - Single color: use for all residual spectra
    - Colormap name/object: sample colors from colormap
    - List/tuple: use as explicit color cycle
exp_linestyle (str, optional) – Line style for experimental spectra. Default: "-".
calc_linestyle (str, optional) – Line style for calculated spectra. Default: "--".
resid_linestyle (str, optional) – Line style for residual spectra. Default: "-".
exp_linewidth (float, optional) – Line width for experimental spectra. Default: 1.2.
calc_linewidth (float, optional) – Line width for calculated spectra. Default: 1.0.
resid_linewidth (float, optional) – Line width for residual spectra. Default: 1.0.
min_contrast (float, optional) – Minimum contrast ratio for sequential colormaps. Default: 1.5.
offset (float, optional, default: None) – Specify the separation (in percent) between \(X\), \(\hat{X}\) and \(E\).
nb_traces (int or 'all', optional) – Number of lines to display. Default is 'all'.
**others (other keyword parameters) – Parameters passed to the internal plot method of the X dataset.

reconstruct(X_transform=None, **kwargs)[source]

Transform data back to its original space.

In other words, return an input X_original whose reduce/transform would be X_transform.

Parameters:

X_transform (array-like of shape (n_observations, n_components), optional) – Reduced X data, where n_observations is the number of observations and n_components is the number of components. If X_transform is not provided, a transform of X provided in fit is performed first.
**kwargs (keyword parameters, optional) – See Other Parameters.

Returns:

NDDataset – Dataset with shape (n_observations, n_features).

Other Parameters:

n_components (int, optional) – The number of components to use for the reconstruction.