Samples and properties are the foundation of every model you build. Get these right and the rest of the workflow follows naturally.

What is a sample?

A sample is a physical item you measured with a spectrometer. One bottle of olive oil, one soil core, one tablet, one grain batch. Each sample has:
Field | Description
Name | A unique identifier within the project. Required, max 200 characters.
Description | Optional free-text notes (batch number, harvest date, source, etc.).
Targets | The measured values for each property. Optional.
Spectra | Spectroscopy measurements linked to this sample. Added separately.
Sample names must be unique within a project. You cannot have two samples named “Sample-001” in the same project.
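The field rules above can be sketched in code. This Python is an illustration only, not the Chemolytic API: the `Sample` and `Project` classes and the `add_sample` method are hypothetical names. It also assumes that uniqueness means an exact, case-sensitive match, which the text does not specify.

```python
from dataclasses import dataclass, field
from typing import Union

MAX_NAME_LENGTH = 200  # limit stated in the field table


@dataclass
class Sample:
    """Hypothetical in-memory model of a sample (not the real API)."""
    name: str                      # required, unique within the project
    description: str = ""          # optional free-text notes
    # property name -> measured value; optional, may be empty
    targets: dict[str, Union[float, str]] = field(default_factory=dict)


class Project:
    """Hypothetical container that enforces the naming rules."""

    def __init__(self) -> None:
        self._samples: dict[str, Sample] = {}

    def add_sample(self, sample: Sample) -> None:
        if not sample.name:
            raise ValueError("Sample name is required")
        if len(sample.name) > MAX_NAME_LENGTH:
            raise ValueError(f"Sample name exceeds {MAX_NAME_LENGTH} characters")
        # Assumption: names are compared exactly (case-sensitive).
        if sample.name in self._samples:
            raise ValueError(f"Duplicate sample name: {sample.name!r}")
        self._samples[sample.name] = sample


project = Project()
project.add_sample(Sample(name="Sample-001", description="Harvest batch 12"))
```

Adding a second `Sample(name="Sample-001")` to the same `project` raises a `ValueError` in this sketch.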

What is a property?

A property is a measurable attribute that you want to predict from spectra. Acidity. Protein content. Origin. Quality grade. Properties are defined once at the project level. Every sample can then have a value for each property.
[Figure: Properties tab showing a grid of property cards with names, types, and units]

Continuous vs categorical

Every property is either continuous or categorical. This decision determines whether you train a regression or classification model.

Continuous properties

Numeric values on a scale.
Example | Unit
Moisture content | %
Protein concentration | mg/mL
Acidity | pH
Brix | °Bx
Density | g/cm³
Continuous properties produce regression models. The model predicts a number.

Categorical properties

Discrete labels from a fixed list.
Example | Categories
Quality grade | A, B, C
Origin | Brazil, Colombia, Ethiopia
Pass/Fail | Pass, Fail
Variety | Arabica, Robusta
Categorical properties produce classification models. The model predicts which category.
A categorical property must have at least 2 categories. If you have only one possible value, that’s not a prediction problem.
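To make the continuous/categorical distinction concrete, here is a minimal Python sketch of a property definition. It is not the Chemolytic API: `PropertyDef`, its fields, and `model_type` are hypothetical names chosen for illustration. The only rules it encodes are the ones stated above: the type selects regression or classification, and a categorical property needs at least 2 categories.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyDef:
    """Hypothetical property definition (illustration only)."""
    name: str
    kind: str                          # "continuous" or "categorical"
    unit: Optional[str] = None         # e.g. "%", "mg/mL" for continuous
    categories: tuple[str, ...] = ()   # fixed label list for categorical

    def __post_init__(self) -> None:
        if self.kind not in ("continuous", "categorical"):
            raise ValueError(f"Unknown property kind: {self.kind!r}")
        # A categorical property needs at least 2 distinct categories.
        if self.kind == "categorical" and len(set(self.categories)) < 2:
            raise ValueError("A categorical property needs at least 2 categories")

    def model_type(self) -> str:
        """Continuous -> regression, categorical -> classification."""
        return "regression" if self.kind == "continuous" else "classification"


moisture = PropertyDef("Moisture content", "continuous", unit="%")
grade = PropertyDef("Quality grade", "categorical", categories=("A", "B", "C"))
```

In this sketch, `moisture.model_type()` returns `"regression"` and `grade.model_type()` returns `"classification"`.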

How samples and properties connect

A sample can have one value (called a target) for each property defined in the project. These connections are what enable model training. A sample can:
  • Have values for all defined properties
  • Have values for some properties only
  • Have no property values at all (useful for samples you have measured but do not want to train on yet)
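The three cases above can be illustrated with a small Python sketch. The data layout and the `samples_with_target` helper are hypothetical, and the sketch assumes that only samples with a value for a given property can contribute to training a model for that property, which is a reading of the text rather than a documented rule.

```python
from typing import Union

Target = Union[float, str]

# Hypothetical project with two properties: "Acidity" and "Origin".
samples: dict[str, dict[str, Target]] = {
    "Sample-001": {"Acidity": 4.2, "Origin": "Brazil"},  # all properties
    "Sample-002": {"Acidity": 4.6},                      # some properties
    "Sample-003": {},                                    # no values yet
}


def samples_with_target(
    samples: dict[str, dict[str, Target]], property_name: str
) -> list[str]:
    """Return names of samples that have a value for the given property."""
    return [
        name
        for name, targets in samples.items()
        if targets.get(property_name) is not None
    ]


acidity_ready = samples_with_target(samples, "Acidity")
origin_ready = samples_with_target(samples, "Origin")
```

Here `acidity_ready` contains Sample-001 and Sample-002, while `origin_ready` contains only Sample-001. Sample-003 appears in neither list until it receives target values.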

Order matters: define properties first

When you start a project, define your properties before adding samples. This way you can fill in property values as you add each sample, instead of going back later to update each one.
  1. Define your properties. Go to Samples → Properties tab and create each property you want to predict.
  2. Add samples. Switch to the Samples tab. As you add samples, fill in property values.
  3. Upload spectra. Link spectra to existing samples (covered in Uploading spectra).

How many samples do I need?

There is no fixed minimum. Reliability depends on the diversity of your data and the difficulty of the prediction problem.
Goal | Recommended sample count
Quick prototype, exploration | 20–50
Production-ready regression model | 100–300
Robust classification with multiple categories | 50+ samples per category
The diversity of your samples matters more than the count. 100 samples spanning the full range of property values typically produce a better model than 1,000 samples clustered in a narrow range.
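One rough way to check these guidelines before training is sketched below in Python. Both helpers are hypothetical and are not part of Chemolytic. The per-category threshold of 50 comes from the table above. The bin-based coverage measure is just one simple stand-in for "diversity", and the expected range (`low`, `high`) and the number of bins are assumptions you would choose yourself.

```python
from collections import Counter


def category_shortfalls(labels, categories, minimum=50):
    """Return {category: count} for categories with fewer than `minimum` samples."""
    counts = Counter(labels)
    return {c: counts[c] for c in categories if counts[c] < minimum}


def occupied_bin_fraction(values, low, high, bins=10):
    """Fraction of equal-width bins in [low, high] that contain at least one value.

    A value near 1.0 suggests the samples span the expected range;
    a small value suggests they cluster in a narrow part of it.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    occupied = set()
    for v in values:
        if low <= v <= high:
            index = min(int((v - low) / (high - low) * bins), bins - 1)
            occupied.add(index)
    return len(occupied) / bins


# 100 values spread across 0-10 vs. 1000 values clustered between 4.2 and 4.7
spread = [0.05 + 0.1 * i for i in range(100)]
clustered = [4.2 + 0.0005 * i for i in range(1000)]

spread_coverage = occupied_bin_fraction(spread, low=0.0, high=10.0)
clustered_coverage = occupied_bin_fraction(clustered, low=0.0, high=10.0)
```

In this example the 100 spread values touch every bin, while the 1,000 clustered values touch only one, which mirrors the point that range coverage matters more than raw count.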