Data Explorer

The Data Explorer is a health check across your samples, spectra, and properties. Use it before training any model to spot missing data, imbalanced categories, and outliers in your property values. Go to Data Explorer in the project sidebar.

Data Explorer overview tab showing total samples, modelling readiness bar, and property coverage table

Sensor filter

By default, all sensors are included. Use the Sensor dropdown at the top to focus on a specific instrument. Click Clear to go back to all sensors. This filter affects every metric on the page: counts, modelling readiness, and property statistics are all computed only on samples linked to the selected sensor.

Overview tab

Stat tiles

Four numbers across the top:

Tile	What it counts
Total samples	Every sample in the project (or, when a sensor is selected, every sample linked to that sensor)
With spectra	Samples that have at least one spectrum uploaded
With properties	Samples that have at least one property value set
Ready for modelling	Samples that have both at least one spectrum and at least one property value

“Ready for modelling” is the number that matters for training: a sample missing either spectra or property values cannot contribute to a model.
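If you want to reproduce the four tiles from your own export, the counting rules above can be sketched as follows. The `Sample` record and its fields (`sensor`, `n_spectra`, `properties`) are hypothetical names chosen for illustration, not the product's data model, and the sketch assumes a property set to `None` counts as not set.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Sample:
    """Hypothetical sample record, used only for this illustration."""
    name: str
    sensor: Optional[str] = None          # sensor the sample is linked to
    n_spectra: int = 0                    # number of uploaded spectra
    properties: dict = field(default_factory=dict)


def has_property(sample):
    # Assumption: a property with value None counts as "not set".
    return any(v is not None for v in sample.properties.values())


def stat_tiles(samples, sensor=None):
    """Count the four tiles, optionally restricted to one sensor."""
    if sensor is not None:
        samples = [s for s in samples if s.sensor == sensor]
    return {
        "total": len(samples),
        "with_spectra": sum(1 for s in samples if s.n_spectra > 0),
        "with_properties": sum(1 for s in samples if has_property(s)),
        "ready": sum(1 for s in samples if s.n_spectra > 0 and has_property(s)),
    }


samples = [
    Sample("A", "nir-1", 2, {"protein": 12.1}),
    Sample("B", "nir-1", 0, {"protein": 11.4}),
    Sample("C", "nir-2", 1, {}),
    Sample("D", "nir-2", 3, {"protein": None, "grade": "high"}),
]
tiles = stat_tiles(samples)
tiles_nir1 = stat_tiles(samples, sensor="nir-1")
```

Here sample B has a property but no spectrum, and sample C has a spectrum but no property, so neither counts as ready.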

Modelling readiness bar

Shows the “Ready for modelling” count as a progress bar, expressed as a percentage of total samples, with a vertical tick at 80%.

80% is a healthy target. If the bar falls well short of it because samples are missing spectra or property values, fix that before running experiments. A model is only as good as the data behind it.
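The readiness check is simple arithmetic. As a minimal sketch, assuming the bar is the ready count divided by the total sample count (the function name and the `target` parameter are illustrative only):

```python
def readiness(ready, total, target=80.0):
    """Return (percentage ready, whether the target is met)."""
    # Assumption: an empty project is reported as 0% ready.
    pct = 100.0 * ready / total if total else 0.0
    return pct, pct >= target


pct, ok = readiness(ready=2, total=4)
```

With 2 of 4 samples ready, the bar would sit at 50%, well left of the 80% tick.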

Property coverage

The table below shows, for each property:

Column	Description
Property	Property name and unit
Type	Continuous (Num) or Categorical (Cat)
Filled	Samples with a value for this property
Missing	Samples without a value (red if any are missing)
+ Spectra	Of the filled ones, how many also have spectra
Coverage	Visual bar with percentage

Coverage colors:

Green (95% or higher): excellent coverage
Orange (70% to just under 95%): acceptable, but watch for bias
Red (below 70%): risky to model from
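One row of the coverage table, including its color band, can be sketched like this. The sketch assumes coverage is the filled count divided by the total sample count, and represents samples as plain dictionaries with hypothetical keys `has_spectra` and `properties`.

```python
def coverage_color(pct):
    """Map a coverage percentage to the color bands listed above."""
    if pct >= 95:
        return "green"
    if pct >= 70:
        return "orange"
    return "red"


def coverage_row(samples, prop):
    """Build one row of the coverage table for the property `prop`."""
    total = len(samples)
    # Assumption: a value of None means the property is missing.
    filled = [s for s in samples if s["properties"].get(prop) is not None]
    pct = 100.0 * len(filled) / total if total else 0.0
    return {
        "filled": len(filled),
        "missing": total - len(filled),
        "with_spectra": sum(1 for s in filled if s["has_spectra"]),
        "coverage_pct": pct,
        "color": coverage_color(pct),
    }


samples = [
    {"has_spectra": True, "properties": {"protein": 12.1}},
    {"has_spectra": False, "properties": {"protein": 11.4}},
    {"has_spectra": True, "properties": {"protein": 13.0}},
    {"has_spectra": True, "properties": {}},
]
row = coverage_row(samples, "protein")
```

In this example three of four samples have a protein value, so the row shows 75% coverage in the orange band, and only two of the filled samples also have spectra.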

Properties tab

The Properties tab shows distribution statistics for every property.

Continuous properties

Each continuous property shows:

Stat	Description
Mean	Average of all values
Median	Middle value when sorted
Std	Standard deviation (spread)
Min / Max	Lowest and highest values
Q1 / Q3	First and third quartiles
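To cross-check these figures outside the app, Python's standard `statistics` module is enough. This is a sketch under two assumptions the page does not confirm: that Std is the sample standard deviation, and that quartiles use linear interpolation (the `inclusive` method). Small differences from the displayed values may come down to those choices.

```python
import statistics


def summarize(values):
    """Summary statistics for one continuous property."""
    # Assumption: quartiles use the "inclusive" interpolation method.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation (assumed)
        "min": min(values),
        "max": max(values),
        "q1": q1,
        "q3": q3,
    }


summary = summarize([1, 2, 3, 4, 5, 6, 7, 8, 9])
```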

A histogram plots the distribution across 10 bins. Use it to spot:

Skewed distributions (most values clustered on one end)
Bimodal patterns (two peaks suggesting two underlying groups)
Gaps where data is missing in a range
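A 10-bin histogram is straightforward to reproduce. The sketch below assumes equal-width bins spanning the minimum to the maximum value, with the maximum placed in the last bin; the app's exact binning rule is not documented here. Empty bins point to the gaps mentioned above.

```python
def histogram(values, bins=10):
    """Count values in equal-width bins between min and max."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # All values identical: put everything in the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


counts = histogram([0, 1, 2, 9, 10])
empty_bins = [i for i, n in enumerate(counts) if n == 0]
```

For this input, bins 3 through 8 are empty: no values were measured between 3 and 9.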

A box plot shows the same data as a box-and-whisker chart. Outliers are marked as scatter points using the Tukey method: any value more than 1.5 × IQR below Q1 or above Q3, where IQR = Q3 − Q1.

If your property has heavy outliers, your model may overfit to them. Consider whether those outliers are real measurements or data entry errors before training.
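To list the values that would be flagged, you can apply the Tukey fences yourself. This sketch uses the standard library and assumes linearly interpolated quartiles (`method="inclusive"`); a different quartile method can move the fences slightly.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return (outliers, (lower fence, upper fence)) using Tukey's rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return outliers, (lower, upper)


outliers, fences = tukey_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 40])
```

Here Q1 is 3.25 and Q3 is 7.75, so the fences sit at -3.5 and 14.5 and only the value 40 is flagged. Whether it is a real measurement or a typo is still your call.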

Categorical properties

Each categorical property shows a donut chart of category counts with the total in the centre and a per-category breakdown (count and percentage) on the side.

Data Explorer category breakdown donut chart for a categorical property showing total count and per-category percentages

For classification, make sure your categories are reasonably balanced. A property with 95% in one class and 5% in another is hard to model. You may need to gather more samples in the minority categories.
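The per-category breakdown can be recomputed from a list of labels to check balance before training. The function name and the example labels are illustrative; what counts as "reasonably balanced" depends on your problem, so the sketch reports the minority share rather than applying a fixed threshold.

```python
from collections import Counter


def category_breakdown(labels):
    """Return {category: (count, percentage)}, largest category first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cat: (n, 100.0 * n / total) for cat, n in counts.most_common()}


labels = ["pass"] * 95 + ["fail"] * 5
breakdown = category_breakdown(labels)
# Share of the smallest category, in percent.
minority_share = min(pct for _, pct in breakdown.values())
```

A minority share of 5%, as in this example, matches the 95/5 split described above and suggests gathering more samples in the smaller class.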

When to come back

Visit the Data Explorer:

After uploading spectra: confirm the readiness number went up
After importing samples via CSV: check that the import didn’t introduce gaps in property coverage
Before running an experiment: spot any imbalance or outliers that could bias the model
After deleting samples: confirm coverage is still acceptable
