CoPilot is the fastest way to get a working model. You give it a dataset and a target property; it tests many combinations of preprocessing, model, and hyperparameters and returns the best one.

What CoPilot does

Behind the scenes, CoPilot:
  1. Reads your dataset (sample count, property type, value distribution)
  2. Picks a cross-validation strategy based on dataset size (5-fold, 10-fold, or leave-one-out)
  3. Selects the pool of allowed models for the problem type (PLS, Ridge, KNN, SVR, RF for regression; PLS-DA, Logistic, KNN, SVM, RF for classification)
  4. Searches preprocessing pipelines across scatter correction, derivatives, smoothing, and scaling
  5. Optimizes hyperparameters
  6. Runs 1000-3000 trials until the search converges
  7. Selects the best trial by primary metric (RMSE for regression, F1 macro for classification) and complexity
A typical run takes about 15 minutes.
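
Step 7 can be sketched as follows. This is a minimal illustration, not CoPilot's actual selection rule: the source only says the best trial is chosen by primary metric and complexity, so the `Trial` fields, the `complexity` score, the 2% tolerance, and the example values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    name: str
    rmse: float       # primary metric for regression (lower is better)
    complexity: int   # hypothetical score, e.g. number of PLS components


def select_best(trials, tolerance=0.02):
    """Return the simplest trial whose RMSE is within `tolerance`
    (relative) of the lowest RMSE. The tolerance is an assumed value."""
    best_rmse = min(t.rmse for t in trials)
    near_best = [t for t in trials if t.rmse <= best_rmse * (1 + tolerance)]
    # Among near-equal trials, prefer lower complexity, then lower RMSE.
    return min(near_best, key=lambda t: (t.complexity, t.rmse))


# Made-up trials for illustration only.
trials = [
    Trial("PLS, 12 components", rmse=0.410, complexity=12),
    Trial("PLS, 6 components", rmse=0.415, complexity=6),
    Trial("Ridge", rmse=0.480, complexity=1),
]
print(select_best(trials).name)
```

Here the 6-component model is preferred over the 12-component one because its RMSE is nearly identical and it is simpler, while Ridge is excluded because its RMSE is well outside the tolerance.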

Following progress

While CoPilot runs, the experiment status is Running and the detail page polls every 3 seconds. You can leave the page and come back; nothing is lost. You’ll see:
  • A progress indicator
  • The “Trials” count rising as work completes
  • Live updates to the leaderboard
When all trials finish, the status changes to Done and the best trial is selected.
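
The polling behaviour described above can be sketched as a simple loop. This is an illustration under assumptions: `fetch_status` is a hypothetical callable standing in for however you read the experiment state (no Chemolytic API is documented on this page), and the `status` and `trials` keys are invented. Only the 3-second interval and the Running, Done, and Failed statuses come from the text.

```python
import time


def wait_for_experiment(fetch_status, interval_s=3.0, timeout_s=3600.0,
                        sleep=time.sleep):
    """Call `fetch_status()` every `interval_s` seconds until the status
    is no longer "Running", then return the last snapshot."""
    waited = 0.0
    while True:
        snapshot = fetch_status()
        if snapshot["status"] != "Running":
            return snapshot  # "Done" or "Failed"
        if waited >= timeout_s:
            raise TimeoutError("experiment still running after timeout")
        sleep(interval_s)
        waited += interval_s


# Demo with a fake status source and a no-op sleep, so it runs instantly.
fake_states = iter([
    {"status": "Running", "trials": 400},
    {"status": "Running", "trials": 1200},
    {"status": "Done", "trials": 1800},
])
final = wait_for_experiment(lambda: next(fake_states), sleep=lambda _: None)
print(final["status"], final["trials"])
```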

Reading the result

The detail page leads with a Best trial hero section showing the winning model.
[Figure: Best trial hero section showing the model name and a metric grid with R², RMSE, MAE, Bias, and RPD]

Toggle: CV vs Test

Two buttons let you switch between cross-validation and held-out test metrics.
| View | What it tells you |
| --- | --- |
| CV | Performance averaged across cross-validation folds. Most stable estimate of generalization. |
| Test | Performance on a held-out test set never seen during training. |
In CoPilot, both are computed automatically. Use CV to pick a model and Test to confirm it didn’t overfit.
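
One way to make the "confirm it didn't overfit" check concrete is to compare the two RMSE values. The sketch below is an assumption-laden illustration: the 25% threshold is arbitrary and not a CoPilot rule, and the function names are made up for the example.

```python
def overfit_gap(cv_rmse, test_rmse):
    """Relative increase in RMSE when moving from CV to the test set."""
    return (test_rmse - cv_rmse) / cv_rmse


def looks_overfit(cv_rmse, test_rmse, max_gap=0.25):
    """Flag a model whose test RMSE exceeds its CV RMSE by more than
    `max_gap` (an assumed threshold; pick one that suits your data)."""
    return overfit_gap(cv_rmse, test_rmse) > max_gap


print(looks_overfit(cv_rmse=0.40, test_rmse=0.42))  # small gap
print(looks_overfit(cv_rmse=0.40, test_rmse=0.60))  # large gap
```

A test RMSE slightly above the CV RMSE is normal. A large gap suggests the model fits the training folds better than it generalizes.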

Regression metrics

| Metric | Range | Better |
| --- | --- | --- |
| R² | -∞ to 1.0 | Higher (1.0 is perfect) |
| RMSE | 0 to ∞, target units | Lower |
| MAE | 0 to ∞, target units | Lower |
| Bias | -∞ to ∞, target units | Closer to 0 |
| RPD | 0 to ∞ | Higher (>2 useful, >3 strong) |
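
For reference, the regression metrics above can be computed from actual and predicted values as in the sketch below. It uses common textbook definitions and is not taken from CoPilot's code. Two conventions are assumptions: bias is taken as the mean of (predicted - actual), and RPD as the sample standard deviation of the actual values divided by RMSE. The example values are made up.

```python
import math


def regression_metrics(actual, predicted):
    """Compute R², RMSE, MAE, bias, and RPD for paired values.
    Requires at least two samples and a target with non-zero variance."""
    n = len(actual)
    residuals = [p - a for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(r * r for r in residuals)                 # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    rmse = math.sqrt(ss_res / n)
    sd = math.sqrt(ss_tot / (n - 1))                       # sample SD of actual values
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": rmse,
        "mae": sum(abs(r) for r in residuals) / n,
        "bias": sum(residuals) / n,
        "rpd": sd / rmse,
    }


metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that R² and RPD are undefined when the target has no variance, and RPD is undefined when RMSE is zero (a perfect fit); both cases divide by zero in this sketch.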

Classification metrics

| Metric | Range | Better |
| --- | --- | --- |
| Accuracy | 0 to 1.0 | Higher |
| F1 (macro) | 0 to 1.0 | Higher (treats all classes equally) |
| F1 (weighted) | 0 to 1.0 | Higher (weighted by class size) |
| Precision | 0 to 1.0 | Higher |
| Recall | 0 to 1.0 | Higher |
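
The difference between macro and weighted F1 is easiest to see on an imbalanced example. The sketch below uses the standard definitions (per-class F1 = 2·TP / (2·TP + FP + FN)) and is not taken from CoPilot's code; the labels and predictions are made up.

```python
from collections import Counter


def f1_scores(actual, predicted):
    """Return accuracy, macro F1, and weighted F1 for paired labels."""
    labels = sorted(set(actual) | set(predicted))
    support = Counter(actual)  # number of true samples per class
    per_class = {}
    for label in labels:
        pairs = list(zip(actual, predicted))
        tp = sum(1 for a, p in pairs if a == label and p == label)
        fp = sum(1 for a, p in pairs if a != label and p == label)
        fn = sum(1 for a, p in pairs if a == label and p != label)
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
    n = len(actual)
    return {
        "accuracy": sum(a == p for a, p in zip(actual, predicted)) / n,
        # Macro: plain average, every class counts the same.
        "f1_macro": sum(per_class.values()) / len(labels),
        # Weighted: average weighted by how many true samples each class has.
        "f1_weighted": sum(per_class[l] * support[l] for l in labels) / n,
    }


# 8 samples of class A (all correct), 2 of class B (one misclassified as A).
actual = ["A"] * 8 + ["B"] * 2
predicted = ["A"] * 8 + ["A", "B"]
scores = f1_scores(actual, predicted)
print({k: round(v, 3) for k, v in scores.items()})
```

Because the minority class B is predicted less well, macro F1 comes out lower than weighted F1 here, which is why macro F1 is the stricter choice for imbalanced data.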

CoPilot’s reasoning

Click How CoPilot configured this experiment to expand a list of decisions CoPilot made and why. Examples of what you’ll see:
  • “Using Venetian Blinds 5-fold CV (30-100 samples)”
  • “Allowed models: PLS, Ridge, KNN”
  • “Target type continuous → regression mode”
  • “Excluded RF and SVM (too few samples)”
This is the rules engine output. It tells you why CoPilot chose what it chose, so you can understand or override it in Scientist mode.
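
To illustrate what a decision such as "Venetian Blinds 5-fold CV" means in practice, the sketch below shows the usual Venetian blinds assignment, in which sample i goes to fold i mod k. It is a generic illustration: how CoPilot orders samples or sizes the blinds is not documented here, and the function name is made up.

```python
def venetian_blinds_folds(n_samples, n_folds=5):
    """Return a list of (train_indices, test_indices) pairs where
    sample i is held out in fold i % n_folds."""
    splits = []
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        splits.append((train, test))
    return splits


for fold, (train, test) in enumerate(venetian_blinds_folds(10, n_folds=5)):
    print(f"fold {fold}: test={test}")
```

Each sample is held out exactly once, and every fold draws its test samples evenly from across the dataset rather than from one contiguous block.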

Tabs

After CoPilot finishes, two tabs appear:
| Tab | Contents |
| --- | --- |
| Best result | Detail of the winning trial: predicted vs actual chart (regression) or confusion matrix (classification) |
| All trials | Leaderboard of every trial CoPilot ran, sortable by metric |
The leaderboard is useful for understanding:
  • How wide the metric range is across trials (is the result robust or fragile?)
  • Which preprocessing kept showing up in top trials
  • Which models did poorly
If the top 10 trials cluster around similar preprocessing and the same model family, you’ve found a robust solution. If they’re scattered across many different recipes, the problem may be hard or the data noisy.
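
If you copy the leaderboard into a script, the robustness check described above could look like the sketch below. The trial dictionaries and their keys (`model`, `preprocessing`, `rmse`) are hypothetical, since no export format is documented on this page, and the values are made up.

```python
from collections import Counter


def top_trial_consensus(trials, top_n=10, metric="rmse"):
    """Summarize how strongly the best `top_n` trials (lowest metric)
    agree on model family and preprocessing."""
    top = sorted(trials, key=lambda t: t[metric])[:top_n]
    model, model_count = Counter(t["model"] for t in top).most_common(1)[0]
    prep, prep_count = Counter(t["preprocessing"] for t in top).most_common(1)[0]
    return {
        "model": model,
        "model_share": model_count / len(top),
        "preprocessing": prep,
        "preprocessing_share": prep_count / len(top),
        "metric_spread": top[-1][metric] - top[0][metric],
    }


# Made-up leaderboard rows for illustration only.
trials = [
    {"model": "PLS", "preprocessing": "SNV + 1st derivative", "rmse": 0.41},
    {"model": "PLS", "preprocessing": "SNV + 1st derivative", "rmse": 0.42},
    {"model": "Ridge", "preprocessing": "MSC", "rmse": 0.43},
    {"model": "KNN", "preprocessing": "none", "rmse": 0.70},
]
print(top_trial_consensus(trials, top_n=3))
```

High shares with a small metric spread point to a robust solution. Low shares mean the top trials disagree on the recipe.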

When CoPilot fails

If a run fails, the experiment status changes to Failed and the detail page shows an error banner. Common causes:

| Cause | Fix |
| --- | --- |
| Too few samples (under 10-20 with values) | Add more samples to the dataset |
| Target property has only one value (no variance) | Pick a different target |
| All spectra are essentially identical | Check your spectra; you may have a calibration issue |
| Internal error | Contact support@chemolytic.com with the experiment ID |
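
The first three causes can be checked before starting a run. The sketch below is a hypothetical pre-flight check, not part of the product: it assumes spectra are non-empty, equal-length lists of floats, that missing target values are `None`, and it uses 20 as the minimum sample count (the upper end of the 10-20 range above) with an arbitrary tolerance for "essentially identical".

```python
def preflight_issues(spectra, targets, min_samples=20, tol=1e-9):
    """Return a list of likely failure causes for a dataset."""
    issues = []
    # Keep only samples that actually have a target value.
    labelled = [(s, t) for s, t in zip(spectra, targets) if t is not None]
    if len(labelled) < min_samples:
        issues.append("too few samples with values")
    if len({t for _, t in labelled}) <= 1:
        issues.append("target has no variance")
    if len(labelled) > 1:
        first = labelled[0][0]
        # Identical if no spectrum differs from the first by more than tol.
        if all(max(abs(a - b) for a, b in zip(s, first)) <= tol
               for s, _ in labelled):
            issues.append("all spectra are essentially identical")
    return issues


# Made-up data: 25 identical spectra that all share one target value.
spectra = [[0.10, 0.20, 0.30] for _ in range(25)]
targets = [5.0] * 25
print(preflight_issues(spectra, targets))
```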

What CoPilot can’t do

  • Switch problem types: continuous targets are always regression, categorical always classification
  • Combine targets: predict only one property at a time
  • Use external features: only the spectra and target property are used
  • Custom preprocessing: only the catalog of supported methods is searched
For any of these, use Scientist mode.

After CoPilot finishes

Two main paths:
| Goal | Action |
| --- | --- |
| Deploy the winning model | Click the best trial → Register Model → create a registered model |
| Iterate manually | Note the winning preprocessing + algorithm, create a Scientist experiment, and refine from there |
CoPilot is a starting point, not the end of the workflow. The leaderboard and explanation tell you what worked. Use that knowledge.