Unsupervised analysis means finding structure in your spectra without considering what you want to predict. It’s how you spot outliers, identify natural groupings, and decide whether your data is even modellable before committing to a full experiment. Chemolytic supports three methods:
| Method | What it answers |
| --- | --- |
| PCA | Where does most of the variance in my spectra come from? Are there outliers? |
| t-SNE | Are there natural clusters in my data when I look at it in 2D? |
| K-Means | If I assume there are N groups, what does that grouping look like? |

Analysis vs run

The unsupervised feature has two levels:
| Concept | Description |
| --- | --- |
| Analysis | A workspace tied to one sensor. Holds a set of related runs you want to compare. |
| Run | A single execution of one method (PCA, t-SNE, or K-Means) with a specific preprocessing pipeline and parameters. |
You typically create one analysis per investigation (e.g., “NIR Spring 2026 baseline”), then make several runs inside it to try different preprocessing and methods.
[Screenshot: Analyses list page showing existing unsupervised analyses with name, sensor, and creation date]
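
The relationship between an analysis and its runs can be pictured as a small data model. The following is a minimal sketch only: the class names, field names, the sensor label `NIR-1`, and the parameter names `n_components` and `n_clusters` are all hypothetical and are not taken from Chemolytic's API.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One execution of a single method (hypothetical model, not Chemolytic's API)."""
    method: str                 # "PCA", "t-SNE", or "K-Means"
    pipeline: list[str]         # ordered preprocessing steps
    params: dict = field(default_factory=dict)


@dataclass
class Analysis:
    """A workspace tied to one sensor that holds related runs."""
    name: str
    sensor: str
    runs: list[Run] = field(default_factory=list)

    def add_run(self, method, pipeline, **params):
        # Copy the pipeline so later edits to the caller's list do not change the run.
        run = Run(method, list(pipeline), params)
        self.runs.append(run)
        return run


analysis = Analysis(name="NIR Spring 2026 baseline", sensor="NIR-1")
analysis.add_run("PCA", ["SNV"], n_components=3)
analysis.add_run("K-Means", ["SNV", "Mean center"], n_clusters=4)
```

Each run keeps its own pipeline and parameters, which is what makes runs inside one analysis comparable side by side.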

Creating an analysis

Click New analysis on the Unsupervised page.
[Screenshot: New analysis form with name, sensor selection, and CoPilot suggestion]
| Field | Required | Notes |
| --- | --- | --- |
| Name | Yes | A descriptive label (e.g., “NIR Spring 2026 · PCA baseline”) |
| Sensor | Yes | Only spectra from this sensor are included |
You can only create an analysis on a sensor that already has spectra uploaded.
The CoPilot suggestion box recommends a starting point: SNV preprocessing followed by PCA. This is a good first pass for most spectroscopy data.
Click Create analysis to land on the detail page.

Analysis detail page

The detail page has three main sections:
  1. Spectra preview: apply preprocessing and visualize the resulting spectra before committing to a run
  2. Configure run: pick a method and parameters, then launch
  3. Runs list: see all runs in this analysis, their status, and open them

Spectra preview

Before committing to a run, you can interactively apply preprocessing and watch the spectra change. This is the fastest way to find a pipeline that brings out the structure in your data. The Spectra preview card has two parts:
  • Spectral overlay chart on the left: every active spectrum overlaid
  • Preprocessing pipeline sidebar on the right: the catalog of methods you can apply
[Screenshot: Analysis detail page showing spectra preview chart, configure run form, and runs list]

Building a pipeline

The pipeline is a list of preprocessing steps, applied in order. Each step’s output feeds the next step’s input.
  1. Pick a method from the catalog. The catalog groups methods by category (Smooth, Baseline, Scatter, Derivative, Scale). Click a method to add it to the pipeline.
  2. Reorder steps if needed. Drag a step up or down to change the order. Order matters: SNV → SG D1 produces a different result than SG D1 → SNV.
  3. Preview the result. Click Preview spectra. The chart updates to show your spectra after the pipeline is applied. The original raw spectra are not modified.
  4. Iterate. Add, remove, or reorder steps and preview again. There’s no cost to previewing as many combinations as you want.
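
To see why step order matters, here is a minimal sketch of an ordered pipeline in plain Python. It is an illustration under stated assumptions, not Chemolytic's implementation: `apply_pipeline` and the sample values are hypothetical, SNV is computed with the sample standard deviation, and a simple adjacent-difference derivative stands in for SG D1 (it is not a Savitzky-Golay filter).

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and std."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def first_difference(spectrum):
    """Adjacent differences: a crude stand-in for SG D1, not Savitzky-Golay."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def apply_pipeline(spectrum, steps):
    """Apply steps in order; each step's output feeds the next. The input is not modified."""
    out = list(spectrum)
    for step in steps:
        out = step(out)
    return out


raw = [0.10, 0.12, 0.20, 0.35, 0.30, 0.18]  # made-up absorbance values

snv_then_d1 = apply_pipeline(raw, [snv, first_difference])
d1_then_snv = apply_pipeline(raw, [first_difference, snv])

print(snv_then_d1)
print(d1_then_snv)
```

The two printed lists differ: when SNV runs last the result has zero mean and unit standard deviation, while SNV followed by a derivative does not.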

Reading the chart

The preview chart shows all active (non-archived) spectra overlaid. Hover over any line to highlight it. The x-axis uses the sensor’s units (e.g., nm or cm⁻¹). The y-axis depends on the last step in the pipeline (e.g., absorbance for raw spectra, SNV-scaled absorbance after SNV, derivative units for SG D1).
What to look for as you tweak the pipeline:
| Observation | What it suggests |
| --- | --- |
| Lines collapse to one shape, individual differences gone | You’ve over-smoothed or over-corrected. Remove a step. |
| Lines spread out into clear groups when coloured by sample | The pipeline is amplifying useful structure. Good sign. |
| Baseline drift removed (flat low region) | Your baseline correction is working. |
| Sharp peaks become more visible | A derivative is helping. |
| Noise removed (smoother lines) | Smoothing is working. |

Available methods

| Category | Methods | Purpose |
| --- | --- | --- |
| Smooth | SG Smooth, Mean filter, Median filter, Whittaker | Remove high-frequency noise |
| Baseline | Linear baseline, AirPLS, ArPLS | Remove baseline drift |
| Scatter | SNV, MSC, RNV | Correct for light scattering and path length |
| Derivative | SG D1, SG D2, Norris D1, Norris D2 | Enhance peaks, remove baselines |
| Scale | Mean center, Autoscale, Pareto, MinMax, Norm | Standardize variables before modelling |
Categories are mostly exclusive: you typically use one method per category. The UI prevents you from adding two baseline methods to the same pipeline, for example.
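
The one-method-per-category rule can be sketched as a simple check. The catalog below is copied from the methods listed in this section; the function name `duplicate_categories` is hypothetical, and treating every category as strictly exclusive is an assumption, since the text only confirms the restriction for baseline methods.

```python
# Method catalog as listed in this section.
CATALOG = {
    "Smooth": ["SG Smooth", "Mean filter", "Median filter", "Whittaker"],
    "Baseline": ["Linear baseline", "AirPLS", "ArPLS"],
    "Scatter": ["SNV", "MSC", "RNV"],
    "Derivative": ["SG D1", "SG D2", "Norris D1", "Norris D2"],
    "Scale": ["Mean center", "Autoscale", "Pareto", "MinMax", "Norm"],
}

# Reverse lookup: method name -> category.
CATEGORY_OF = {m: c for c, methods in CATALOG.items() for m in methods}


def duplicate_categories(pipeline):
    """Return the categories used more than once, in order of first repeat.

    Assumption: every category is exclusive; the real UI may be more lenient.
    """
    seen, dupes = set(), []
    for method in pipeline:
        category = CATEGORY_OF[method]
        if category in seen and category not in dupes:
            dupes.append(category)
        seen.add(category)
    return dupes


print(duplicate_categories(["SNV", "SG D1", "Mean center"]))  # no repeated category
print(duplicate_categories(["AirPLS", "ArPLS", "SNV"]))       # two baseline methods
```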

Common pipelines

| Pipeline | When to use |
| --- | --- |
| SNV → Mean center | A safe default for most NIR/FTIR data |
| SG D1 → Mean center | When you have baseline drift but not much scatter |
| MSC → SG D2 → Autoscale | When scatter and overlapping peaks are both problems |
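
As an illustration of the SNV → Mean center default, the sketch below applies both steps to three made-up spectra. It assumes the usual chemometric definitions (SNV works per spectrum, mean centering works per wavelength across spectra) and uses the sample standard deviation; the function names and data are hypothetical and the product's exact formulas may differ.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Row-wise: centre and scale one spectrum by its own mean and std."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def mean_center(spectra):
    """Column-wise: subtract each wavelength's mean across all spectra."""
    col_means = [mean(col) for col in zip(*spectra)]
    return [[x - m for x, m in zip(row, col_means)] for row in spectra]


spectra = [
    [0.10, 0.20, 0.40, 0.30],
    [0.15, 0.30, 0.60, 0.45],  # same shape as the first, scaled by 1.5 (scatter-like effect)
    [0.12, 0.22, 0.35, 0.33],
]

processed = mean_center([snv(s) for s in spectra])
for row in processed:
    print([round(x, 3) for x in row])
```

The first two spectra differ only by a multiplicative factor, so SNV maps them to the same values and they print as identical rows. This is the kind of scatter-driven difference the pipeline is meant to remove.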

From preview to a run

When the preview looks good, click Use in a new run → at the bottom of the preview card. This carries the pipeline directly into the Configure run form below, so you don’t have to rebuild it. Pick a method, set parameters, and launch.
The preview is for visual exploration only. No run is saved and no plan limit is consumed. Iterate freely on the pipeline before committing to a run.

Spectra management

You can archive specific spectra to exclude them from future runs (useful for outliers identified by PCA). The archive list on the detail page mirrors the global Spectra page but is scoped to this analysis. Archived spectra do not participate in new runs, but existing runs that already used them remain unchanged.
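
The archive behaviour can be pictured as a filter applied when a run starts, with each run keeping a snapshot of the spectra it used. This is a sketch of that idea only: the `Spectrum` class, its fields, and the sample IDs are hypothetical, and how Chemolytic stores run inputs is not described here.

```python
from dataclasses import dataclass


@dataclass
class Spectrum:
    sample_id: str
    archived: bool = False


def active_spectra(spectra):
    """Spectra that would take part in a new run (non-archived only)."""
    return [s for s in spectra if not s.archived]


spectra = [Spectrum("S1"), Spectrum("S2"), Spectrum("S3")]

# A run records which spectra it used at the time it was launched.
old_run_inputs = [s.sample_id for s in active_spectra(spectra)]

# Archive an outlier afterwards, e.g. one flagged by PCA.
spectra[1].archived = True

# A new run sees only the remaining active spectra.
new_run_inputs = [s.sample_id for s in active_spectra(spectra)]

print(old_run_inputs)
print(new_run_inputs)
```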

Plan limits

Your plan limits the total number of analyses you can have. The current count and limit appear on the Analyses page. If you’ve reached the limit, delete an analysis you no longer need before creating a new one.

What’s next

Now that you’ve configured a preprocessing pipeline, choose a method or compare the runs you already have:
  • PCA for variance and outlier detection
  • t-SNE for 2D cluster visualization
  • K-Means to group samples into N clusters
  • Comparing runs when you have multiple runs to compare