CoPilot is the fastest way to get a working model. You give it a dataset and a target property; it tests many combinations and returns the best one.
What CoPilot does
Behind the scenes, CoPilot:
- Reads your dataset (sample count, property type, value distribution)
- Picks a cross-validation strategy based on dataset size (5-fold, 10-fold, or leave-one-out)
- Selects the pool of candidate models for the problem type (PLS, Ridge, KNN, SVR, RF for regression; PLS-DA, Logistic, KNN, SVM, RF for classification)
- Searches preprocessing pipelines across scatter correction, derivatives, smoothing, and scaling
- Optimizes hyperparameters
- Runs 1,000 to 3,000 trials, stopping once the search converges
- Selects the best trial by primary metric (RMSE for regression, F1 macro for classification), taking model complexity into account
A typical run takes about 15 minutes.
Following progress
While CoPilot runs, the experiment status is Running and the detail page polls for updates every 3 seconds. You can leave the page and come back; nothing is lost.
You’ll see:
- A progress indicator
- The “Trials” count rising as work completes
- Live updates to the leaderboard
When all trials finish, the status changes to Done and the best trial is selected.
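The detail page does this polling for you. As an illustration only, the same pattern can be sketched in Python. This page does not describe a public status API, so `fetch_status` is a hypothetical callable you would supply; only the 3-second interval and the status names Running, Done, and Failed come from this page.

```python
import time

# Statuses after which the experiment will not change again (names from this page).
TERMINAL_STATUSES = {"Done", "Failed"}


def wait_for_experiment(fetch_status, interval_s=3.0, timeout_s=3600.0, sleep=time.sleep):
    """Call fetch_status() every interval_s seconds until a terminal status appears.

    fetch_status is a hypothetical zero-argument callable returning the
    current status string; it is not a documented Chemolytic function.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"experiment still {status!r} after {waited:.0f} s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real waiting.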
Reading the result
The detail page leads with a Best trial hero section showing the winning model.
Toggle: CV vs Test
Two buttons let you switch between cross-validation and held-out test metrics.
| View | What it tells you |
|---|---|
| CV | Performance averaged across cross-validation folds. Most stable estimate of generalization. |
| Test | Performance on a held-out test set never seen during training. |
In CoPilot, both are computed automatically. Use CV to pick a model and Test to confirm it didn’t overfit.
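One way to make the "confirm it didn't overfit" check concrete is to compare the two numbers directly. The sketch below is illustrative: the function name and the 20% tolerance are assumptions, not values CoPilot uses.

```python
def overfit_gap(cv_rmse, test_rmse, tolerance=0.20):
    """Return (relative_gap, flagged) for a regression trial.

    relative_gap is how much worse the test RMSE is than the CV RMSE,
    as a fraction of the CV RMSE. The 0.20 tolerance is an arbitrary
    illustrative threshold, not a CoPilot setting.
    """
    if cv_rmse <= 0:
        raise ValueError("cv_rmse must be positive")
    relative_gap = (test_rmse - cv_rmse) / cv_rmse
    return relative_gap, relative_gap > tolerance
```

For example, `overfit_gap(0.50, 0.55)` reports a gap of about 10% and does not flag the trial, while `overfit_gap(0.50, 0.80)` reports a 60% gap and flags it.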
Regression metrics
| Metric | Range | Better |
|---|---|---|
| R² | -∞ to 1.0 | Higher (1.0 is perfect) |
| RMSE | 0 to ∞, target units | Lower |
| MAE | 0 to ∞, target units | Lower |
| Bias | -∞ to ∞, target units | Closer to 0 |
| RPD | 0 to ∞ | Higher (>2 useful, >3 strong) |
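
To show how these metrics relate to each other, here is a minimal sketch that computes them from paired reference and predicted values. It uses common textbook definitions and is not necessarily how CoPilot computes them: bias is taken as the mean of predicted minus reference, and RPD as the sample standard deviation of the reference values divided by RMSE. Both conventions are assumptions.

```python
import math
from statistics import mean, stdev


def regression_metrics(actual, predicted):
    """Compute R², RMSE, MAE, bias and RPD with common textbook definitions."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    # Residual convention (assumed): predicted minus reference.
    residuals = [p - a for a, p in zip(actual, predicted)]
    n = len(residuals)
    ss_res = sum(r * r for r in residuals)
    y_bar = mean(actual)
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("reference values have no variance")
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "mae": sum(abs(r) for r in residuals) / n,
        "bias": sum(residuals) / n,
        # RPD: spread of the reference values relative to the prediction error.
        "rpd": stdev(actual) / rmse if rmse > 0 else math.inf,
    }
```

Because RPD divides the spread of the target by the error, the same RMSE gives a higher RPD on a dataset with a wider range of values.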
Classification metrics
| Metric | Range | Better |
|---|---|---|
| Accuracy | 0 to 1.0 | Higher |
| F1 (macro) | 0 to 1.0 | Higher (treats all classes equally) |
| F1 (weighted) | 0 to 1.0 | Higher (weighted by class size) |
| Precision | 0 to 1.0 | Higher |
| Recall | 0 to 1.0 | Higher |
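
The difference between macro and weighted F1 is easiest to see on an imbalanced example. The sketch below uses the standard per-class F1 definition in plain Python; the function name is illustrative and this is not CoPilot's own code.

```python
from collections import Counter


def f1_macro_and_weighted(actual, predicted):
    """Return (macro_f1, weighted_f1) for two equal-length label sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty, equal-length sequences")
    labels = sorted(set(actual) | set(predicted))
    support = Counter(actual)  # number of true samples per class
    per_class = {}
    for label in labels:
        tp = sum(1 for a, p in zip(actual, predicted) if a == label and p == label)
        fp = sum(1 for a, p in zip(actual, predicted) if a != label and p == label)
        fn = sum(1 for a, p in zip(actual, predicted) if a == label and p != label)
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
    # Macro: every class counts equally. Weighted: classes count by their size.
    macro = sum(per_class.values()) / len(labels)
    weighted = sum(per_class[l] * support[l] for l in labels) / len(actual)
    return macro, weighted
```

With eight samples of class "a" all predicted correctly and two samples of class "b" of which one is missed, the minority class pulls the macro score down more than the weighted score.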
CoPilot’s reasoning
Click “How CoPilot configured this experiment” to expand a list of the decisions CoPilot made and why.
Examples of what you’ll see:
- “Using Venetian Blinds 5-fold CV (30-100 samples)”
- “Allowed models: PLS, Ridge, KNN”
- “Target type continuous → regression mode”
- “Excluded RF and SVM (too few samples)”
This is the rules engine output. It tells you why CoPilot chose what it chose, so you can understand or override it in Scientist mode.
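The first example mentions Venetian blinds cross-validation. In its usual form, this scheme assigns every k-th sample to the same fold, so each fold is spread evenly across the dataset order. The sketch below shows that general idea; CoPilot's exact implementation is not described here, and the function name is illustrative.

```python
def venetian_blinds_folds(n_samples, n_folds=5):
    """Return a list of (train_indices, test_indices) pairs.

    Sample i is held out in fold i % n_folds, the usual Venetian blinds
    pattern. This is a generic sketch, not CoPilot's implementation.
    """
    if n_folds < 2 or n_folds > n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    folds = []
    for k in range(n_folds):
        test = list(range(k, n_samples, n_folds))
        train = [i for i in range(n_samples) if i % n_folds != k]
        folds.append((train, test))
    return folds
```

With 10 samples and 5 folds, the first fold holds out samples 0 and 5, the second holds out 1 and 6, and so on. Because the split depends on sample order, replicate measurements stored next to each other can land in different folds.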
Tabs
After CoPilot finishes, two tabs appear:
| Tab | Contents |
|---|---|
| Best result | Detail of the winning trial: predicted vs actual chart (regression) or confusion matrix (classification) |
| All trials | Leaderboard of every trial CoPilot ran, sortable by metric |
The leaderboard is useful for understanding:
- How wide the metric range is across trials (is the result robust or fragile?)
- Which preprocessing kept showing up in top trials
- Which models did poorly
If the top 10 trials cluster around similar preprocessing and the same model family, you’ve found a robust solution. If they’re scattered across many different recipes, the problem may be hard or the data noisy.
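If you copy leaderboard rows into your own tooling, the clustering check above can be sketched as follows. The record fields `model`, `preprocessing`, and `rmse` are hypothetical names chosen for this example, not a documented export format.

```python
from collections import Counter


def summarize_top_trials(trials, metric="rmse", top_n=10, lower_is_better=True):
    """Summarize how tightly the best trials cluster.

    trials is a list of dicts with hypothetical keys "model",
    "preprocessing", and the chosen metric.
    """
    if not trials:
        raise ValueError("no trials to summarize")
    ranked = sorted(trials, key=lambda t: t[metric], reverse=not lower_is_better)
    top = ranked[:top_n]
    scores = [t[metric] for t in top]
    models = Counter(t["model"] for t in top)
    preps = Counter(t["preprocessing"] for t in top)
    return {
        # Small spread suggests the result does not hinge on one lucky recipe.
        "metric_spread": max(scores) - min(scores),
        # (name, count) of the most frequent model and preprocessing in the top trials.
        "dominant_model": models.most_common(1)[0],
        "dominant_preprocessing": preps.most_common(1)[0],
    }
```

A dominant model that accounts for most of the top trials, together with a small metric spread, matches the "robust solution" case described above.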
When CoPilot fails
When a run fails, the status changes to Failed and the detail page shows an error banner.
Common causes:
| Cause | Fix |
|---|---|
| Too few samples (fewer than roughly 10 to 20 with target values) | Add more samples to the dataset |
| Target property has only one value (no variance) | Pick a different target |
| All spectra are essentially identical | Check your spectra; you may have a calibration issue |
| Internal error | Contact support@chemolytic.com with the experiment ID |
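
The first three causes can be checked before you start a run. The sketch below is a rough local check under stated assumptions: spectra are lists of numbers, missing target values are `None`, and the `min_samples` and `tol` defaults are illustrative rather than CoPilot's actual limits.

```python
def preflight_issues(spectra, targets, min_samples=10, tol=1e-9):
    """Return a list of likely failure causes for a dataset.

    spectra: list of equal-length numeric lists; targets: list of values
    with None for missing. Thresholds are illustrative assumptions.
    """
    issues = []
    # Only samples that actually have a target value can be used.
    labeled = [(s, t) for s, t in zip(spectra, targets) if t is not None]
    if len(labeled) < min_samples:
        issues.append("too_few_samples")
    if len({t for _, t in labeled}) <= 1:
        issues.append("no_target_variance")
    if labeled:
        first = labeled[0][0]
        # Spectra count as identical if no point differs from the first by more than tol.
        identical = all(
            max((abs(a - b) for a, b in zip(s, first)), default=0.0) <= tol
            for s, _ in labeled
        )
        if identical:
            issues.append("identical_spectra")
    return issues
```

An empty list means none of these three problems were found; it does not guarantee that the run will succeed.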
What CoPilot can’t do
- Switch problem types: continuous targets are always regression, categorical always classification
- Combine targets: predict only one property at a time
- Use external features: only the spectra and target property are used
- Custom preprocessing: only the catalog of supported methods is searched
For any of these, use Scientist mode.
After CoPilot finishes
Two main paths:
| Goal | Action |
|---|---|
| Deploy the winning model | Click the best trial → Register Model → create a registered model |
| Iterate manually | Note the winning preprocessing + algorithm, create a Scientist experiment, and refine from there |
CoPilot is a starting point, not the end of the workflow. The leaderboard and explanation tell you what worked. Use that knowledge.