Testing
Overview
The test suite has two tiers, separated by the integration marker:
The default tier runs on a plain pytest. Everything in it is fast and self-contained: it needs no spacecraft holdings, no SPICE kernels, and no network. Unit tests live here, and so do the in-process simulator tests – every simulated frame is rendered and navigated in memory, so the whole simulator-driven invariant and structural coverage runs without external data.
The integration tier is excluded by default (addopts = ["-m", "not integration"] in pyproject.toml) and opted into with -m "" or -m integration. It holds the slow and the archive-backed tests: the real-image regression cohort (which fetches PDS holdings and resolves SPICE geometry) and the heavier or jitter-prone in-process simulator tests.
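As a minimal sketch of how a test lands in one tier or the other, assuming the marker is applied with the standard pytest.mark decorator; the test names and bodies below are hypothetical and are not taken from the suite:

```python
import pytest


def test_default_tier_example():
    # Unmarked, so a plain `pytest` run collects it.
    assert 1 + 1 == 2


@pytest.mark.integration
def test_integration_tier_example():
    # Marked, so `-m "not integration"` deselects it, while
    # `-m integration` or `-m ""` selects it.
    assert sum(range(4)) == 6
```

The decorator only attaches the marker; which tier actually runs is decided by the -m expression in effect for the invocation.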
The simulator (see The Image Simulator) is the engine behind several of the test kinds below: it lets the suite grow algorithmic-invariant and sensitivity coverage on frames whose true offset is known by construction, without operator labour and without real data.
Running the suite
pytest # default tier (fast, no holdings)
pytest -m "" # full suite, including integration
pytest -m integration # only the integration tier
pytest -n auto --dist=loadfile # parallel, matching CI (loadfile avoids
# PyQt6 worker crashes)
pytest tests/nav/sim/test_sim_noise.py # one file
pytest tests/nav/sim/test_sim_noise.py::test_foo # one test
pytest --cov # with coverage
./scripts/run-all-checks.sh # ruff + mypy + pytest + docs + markdown
./scripts/run-all-checks.sh -i # the same, including integration tests
pytest-xdist must run with --dist=loadfile; the default scheduling
crashes PyQt6 workers when tests from one file split across processes. Multi-test
integration runs should always use -n auto --dist=loadfile.
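One way to keep the two flags together is to build the command line in a small wrapper. The helper below is a hypothetical sketch (build_pytest_command is not part of the repository); it only assembles the argument list and never passes -n without --dist=loadfile:

```python
import sys


def build_pytest_command(integration=False, parallel=False, targets=()):
    """Return an argv list for one of the invocations listed above."""
    cmd = [sys.executable, "-m", "pytest"]
    if integration:
        # An empty marker expression replaces the default
        # `-m "not integration"` and selects the full suite.
        cmd += ["-m", ""]
    if parallel:
        # -n is always paired with --dist=loadfile.
        cmd += ["-n", "auto", "--dist=loadfile"]
    cmd += list(targets)
    return cmd


if __name__ == "__main__":
    print(build_pytest_command(integration=True, parallel=True))
```

The returned list can be handed to subprocess.run(cmd, check=True) from a script that should mirror the CI invocation.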
Archive-backed tests additionally require the holdings and catalog environment (set by CI; see Introduction):
export PDS3_HOLDINGS_DIR=https://pds-rings.seti.org/holdings
export PDS4_HOLDINGS_DIR=https://pds-rings.seti.org/pds4
export OOPS_RESOURCES=https://storage.googleapis.com/rms-node-oops-resources
export UCAC4_PATH=https://storage.googleapis.com/rms-node-star-catalogs/UCAC4
export YBSC_PATH=https://storage.googleapis.com/rms-node-star-catalogs/YBSC
# plus SPICE kernels at $SPICE_PATH for any real navigation run
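Before starting an archive-backed run it can help to check that this environment is complete. The following is a hedged sketch, not project code: missing_archive_env is a hypothetical helper, and the variable names are simply copied from the export lines above.

```python
import os

# Variable names copied from the export lines above.
ARCHIVE_ENV_VARS = (
    "PDS3_HOLDINGS_DIR",
    "PDS4_HOLDINGS_DIR",
    "OOPS_RESOURCES",
    "UCAC4_PATH",
    "YBSC_PATH",
    "SPICE_PATH",
)


def missing_archive_env(env=None):
    """Return the archive variables that are unset or empty, in order."""
    env = os.environ if env is None else env
    return [name for name in ARCHIVE_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_archive_env()
    if missing:
        print("archive environment incomplete:", ", ".join(missing))
    else:
        print("archive environment complete")
```

An empty result means every variable is set; it does not verify that the URLs or paths are reachable.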
Test kinds
The suite is layered by what a test proves and what it needs. The simulator-only tests need none of the archive environment above.
| Kind (path) | Tier | Requires | What it proves |
|---|---|---|---|
| Unit tests ( | default | nothing | One component in isolation (config, feature, dataset, obs, model, technique, orchestrator, reproj, support). |
| Simulator unit tests ( | default | nothing | The renderer’s contracts: determinism, noise, saturation, PSF, stray light, instrument coupling, camera roll, irregular-body rendering and the |
| GUI smoke ( | default | PyQt6 | Each GUI control wires to the right |
| Scene structural ( | default | nothing | Every catalog scene validates, sits in a declared class, has a unique name, and renders. |
| Algorithmic invariants ( | default | nothing | Each technique recovers its planted offset / roll on a clean scene – correct by construction, so no baseline. |
| End-to-end sim nav ( | default | nothing | A simulated frame navigates through the full orchestrator and recovers a planted offset. |
| Sim bug regression ( | default | nothing | Fast bug-specific scenes guarding defects the sweeps surfaced. |
| Sim regression baselines ( | integration | nothing | Every catalog scene re-navigates to its recorded rounded outcome (a tripwire). Integration-marked because the solvers carry sub-millipixel cross-process jitter. |
| Sensitivity sweeps ( | integration | nothing | Each single-variable sweep responds as expected (a technique transition, a degradation to failure, recovery within tolerance). Heavier – each sweep navigates several frames. |
| Pose behavioral ( | integration | nothing | On a wrong-pose irregular body the limb degrades far off while the pose-free blob stays accurate (asserted per technique). |
| Real-image structural ( | default | nothing | The operator-curated sidecar catalog validates structurally (schema, classes, uniqueness) without fetching images. |
| Real-image regression ( | integration | holdings + SPICE | Each curated real image navigates to its expected status / tier / offset and matches its recorded baseline – the calibration tripwire. |
The simulator’s role across these tiers is described per phase in the simulator improvement plan and summarized in The Image Simulator. The operator-curated real-image cohort is documented in Image Library.
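To illustrate what "correct by construction" means for the algorithmic invariants, here is a toy sketch in pure Python. It does not use the project's simulator or techniques: render_blob, centroid and recover_offset are hypothetical stand-ins, and an intensity-weighted centroid is swapped in for the real navigation step. The expected answer is the planted offset itself, so no recorded baseline is needed.

```python
import math


def render_blob(size, center, sigma=1.5):
    """Render a noise-free Gaussian blob on a size x size grid."""
    cx, cy = center
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]


def centroid(image):
    """Intensity-weighted centroid (x, y) of a 2-D list of floats."""
    total = sum(sum(row) for row in image)
    cx = sum(x * v for row in image for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(image) for v in row) / total
    return cx, cy


def recover_offset(image, predicted_center):
    """Offset of the observed blob from where the pointing predicted it."""
    cx, cy = centroid(image)
    return cx - predicted_center[0], cy - predicted_center[1]


def test_recovers_planted_offset():
    predicted = (16.0, 16.0)
    planted = (2.25, -1.5)  # known by construction
    observed_center = (predicted[0] + planted[0], predicted[1] + planted[1])
    image = render_blob(33, observed_center)
    dx, dy = recover_offset(image, predicted)
    assert abs(dx - planted[0]) < 1e-3
    assert abs(dy - planted[1]) < 1e-3
```

Because the scene is clean and the offset is planted, a failure points at the recovery step rather than at stale reference data.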
Characterization runners and updaters
These are not pytest tests; they are python -m scripts that produce the
report figures, the example images, and the regression baselines. They render and
navigate in-process and need no holdings.
| Command | Produces |
|---|---|
| | Per-sweep response-curve JSON under |
| | The per-technique accuracy-vs-SNR and accuracy-vs-offset report figures. |
| | The star-field centroiding (moment / PSF / adaptive) report figures. |
| | The developer-guide scene gallery and the report scene images. |
| | Regenerates the simulator regression baselines under |
| | Regenerates the real-image regression baselines (needs holdings). |
After a deliberate change that shifts a baseline or a figure, rerun the relevant
updater or runner and review the diff before committing – the baselines are
tripwires, so an unexpected change is a regression to investigate, not to bless
blindly. A new invariant scene also needs a baseline (update_sim_baselines);
verify no existing baseline shifts.
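The tripwire idea can be sketched as a comparison of rounded records. This is an illustration under stated assumptions, not the project's baseline format: rounded_outcome and diff_against_baseline are hypothetical helpers, and the two-decimal rounding and the record fields are chosen only for the example.

```python
def rounded_outcome(status, offset, decimals=2):
    """Reduce a navigation result to a rounded, comparable record."""
    return {
        "status": status,
        "offset": [round(value, decimals) for value in offset],
    }


def diff_against_baseline(baseline, current):
    """Return the scene names whose record is new, missing, or changed."""
    names = sorted(set(baseline) | set(current))
    return [name for name in names if baseline.get(name) != current.get(name)]


if __name__ == "__main__":
    baseline = {"scene_a": rounded_outcome("ok", (1.2500001, -0.7500002))}
    rerun = {
        "scene_a": rounded_outcome("ok", (1.31, -0.75)),
        "scene_b": rounded_outcome("ok", (0.0, 0.0)),
    }
    # Both the shifted scene and the scene without a baseline are listed.
    print(diff_against_baseline(baseline, rerun))
```

Every name this returns is something to review before regenerating: a shift in an existing scene is a possible regression, and a scene with no recorded entry still needs its baseline.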