Testing

Overview

The test suite has two tiers, separated by the integration marker:

The default tier runs on a plain pytest invocation. Everything in it is fast and self-contained: it needs no spacecraft holdings, no SPICE kernels, and no network. Unit tests live here, and so do the in-process simulator tests – every simulated frame is rendered and navigated in memory, so the whole simulator-driven invariant and structural coverage runs without external data.
The integration tier is excluded by default (addopts = ["-m", "not integration"] in pyproject.toml) and opted into with -m "" or -m integration. It holds the slow and the archive-backed tests: the real-image regression cohort (which fetches PDS holdings and resolves SPICE geometry) and the heavier or jitter-prone in-process simulator tests.

The simulator (see The Image Simulator) is the engine behind several test kinds in both tiers: it lets the suite grow algorithmic-invariant and sensitivity coverage on frames whose true offset is known by construction, without operator labour and without real data.
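
The idea of a frame whose true offset is known by construction can be illustrated with a toy example. The sketch below is not the project's simulator or any of its navigation techniques: `render_blob`, `centroid`, and `recover_offset` are hypothetical helpers, and the centroid difference stands in for a real technique. It only shows why such a test needs no recorded baseline, since the expected answer is the offset that was planted.

```python
"""Toy illustration: recover a planted offset from a synthetic frame (stdlib only)."""


def render_blob(size, center_row, center_col):
    """Render a size x size frame holding one plus-shaped blob (hypothetical helper)."""
    frame = [[0.0] * size for _ in range(size)]
    pattern = [(0, 0, 4.0), (-1, 0, 1.0), (1, 0, 1.0), (0, -1, 1.0), (0, 1, 1.0)]
    for d_row, d_col, value in pattern:
        frame[center_row + d_row][center_col + d_col] = value
    return frame


def centroid(frame):
    """Return the intensity-weighted (row, col) centroid of a frame."""
    total = sum(sum(row) for row in frame)
    row_c = sum(r * sum(row) for r, row in enumerate(frame)) / total
    col_c = sum(c * v for row in frame for c, v in enumerate(row)) / total
    return row_c, col_c


def recover_offset(reference, observed):
    """Estimate the (row, col) shift between two frames from their centroids."""
    ref_row, ref_col = centroid(reference)
    obs_row, obs_col = centroid(observed)
    return obs_row - ref_row, obs_col - ref_col


# The planted offset is the ground truth: no baseline file is involved.
PLANTED = (3, -2)
reference = render_blob(16, 8, 8)
observed = render_blob(16, 8 + PLANTED[0], 8 + PLANTED[1])
recovered = recover_offset(reference, observed)

assert all(abs(got - want) < 1e-9 for got, want in zip(recovered, PLANTED))
```

Because the assertion compares against the planted value rather than a stored result, a test of this shape can only fail when the recovery logic itself is wrong.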

Running the suite

pytest                              # default tier (fast, no holdings)
pytest -m ""                        # full suite, including integration
pytest -m integration               # only the integration tier
pytest -n auto --dist=loadfile      # parallel, matching CI (loadfile avoids
                                    #   PyQt6 worker crashes)
pytest tests/nav/sim/test_sim_noise.py            # one file
pytest tests/nav/sim/test_sim_noise.py::test_foo  # one test
pytest --cov                        # with coverage

./scripts/run-all-checks.sh         # ruff + mypy + pytest + docs + markdown
./scripts/run-all-checks.sh -i      # the same, including integration tests

pytest-xdist must run with --dist=loadfile; the default scheduling crashes PyQt6 workers when tests from one file are split across processes. Multi-test integration runs should always use -n auto --dist=loadfile.
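
For scripted runs, the tier and parallelism rules above can be captured in one place. The following is a minimal sketch, assuming a hypothetical helper named `pytest_argv` that is not part of the repository; it uses only the flags already listed in this section and always pairs -n auto with --dist=loadfile.

```python
import sys


def pytest_argv(tier="default", parallel=False, targets=()):
    """Build a pytest command line for one of the tiers (hypothetical helper).

    tier: "default" (fast tier), "full" (everything), or "integration" (only).
    parallel: add the xdist flags, always together with --dist=loadfile.
    targets: optional file paths or node ids to run.
    """
    if tier not in {"default", "full", "integration"}:
        raise ValueError(f"unknown tier: {tier!r}")

    argv = [sys.executable, "-m", "pytest"]
    if tier == "full":
        argv += ["-m", ""]  # an empty marker expression clears the default filter
    elif tier == "integration":
        argv += ["-m", "integration"]
    if parallel:
        # Never emit -n auto without loadfile scheduling (PyQt6 worker crashes).
        argv += ["-n", "auto", "--dist=loadfile"]
    argv += list(targets)
    return argv


full_parallel = pytest_argv("full", parallel=True)
one_file = pytest_argv(targets=["tests/nav/sim/test_sim_noise.py"])
```

The resulting list can be handed to `subprocess.run(argv, check=False)`; building a list rather than a shell string keeps the empty -m argument intact.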

Archive-backed tests additionally require the holdings and catalog environment (set by CI; see Introduction):

export PDS3_HOLDINGS_DIR=https://pds-rings.seti.org/holdings
export PDS4_HOLDINGS_DIR=https://pds-rings.seti.org/pds4
export OOPS_RESOURCES=https://storage.googleapis.com/rms-node-oops-resources
export UCAC4_PATH=https://storage.googleapis.com/rms-node-star-catalogs/UCAC4
export YBSC_PATH=https://storage.googleapis.com/rms-node-star-catalogs/YBSC
# plus SPICE kernels at $SPICE_PATH for any real navigation run
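
Before launching archive-backed tests locally, it can help to check that this environment is complete. The sketch below is an assumption about how such a check might look, not code from the repository: `ARCHIVE_ENV_VARS` and `missing_archive_env` are hypothetical names, and the variable list is simply the one shown above plus SPICE_PATH.

```python
import os

# Variable names taken from the export list above; SPICE_PATH is needed for
# any real navigation run.
ARCHIVE_ENV_VARS = (
    "PDS3_HOLDINGS_DIR",
    "PDS4_HOLDINGS_DIR",
    "OOPS_RESOURCES",
    "UCAC4_PATH",
    "YBSC_PATH",
    "SPICE_PATH",
)


def missing_archive_env(environ=None):
    """Return the archive variables that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in ARCHIVE_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_archive_env()
    if missing:
        print("Archive-backed tests need: " + ", ".join(missing))
    else:
        print("Archive environment looks complete.")
```

The function accepts a mapping so it can be exercised without touching the real process environment.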

Test kinds

The suite is layered by what a test proves and what it needs. The simulator-only tests need none of the archive environment above.

Kind (path)	Tier	Requires	What it proves
Unit tests (`tests/nav/**`)	default	nothing	One component in isolation (config, feature, dataset, obs, model, technique, orchestrator, reproj, support).
Simulator unit tests (`tests/nav/sim/**`)	default	nothing	The renderer’s contracts: determinism, noise, saturation, PSF, stray light, instrument coupling, camera roll, irregular-body rendering and the `nav_override` channel, scene-schema validation.
GUI smoke (`tests/main/test_create_simulated_image.py`)	default	PyQt6	Each GUI control wires to the right `sim_params` field and the scene/JSON round-trip is faithful.
Scene structural (`test_sim_scenes.py`)	default	nothing	Every catalog scene validates, sits in a declared class, has a unique name, and renders.
Algorithmic invariants (`test_sim_algorithmic_invariants.py`)	default	nothing	Each technique recovers its planted offset / roll on a clean scene – correct by construction, so no baseline.
End-to-end sim nav (`test_sim_navigation.py`)	default	nothing	A simulated frame navigates through the full orchestrator and recovers a planted offset.
Sim bug regression (`test_sim_regression.py`)	default	nothing	Fast bug-specific scenes guarding defects the sweeps surfaced.
Sim regression baselines (`test_sim_baselines.py`)	integration	nothing	Every catalog scene re-navigates to its recorded rounded outcome (a tripwire). Integration-marked because the solvers carry sub-millipixel cross-process jitter.
Sensitivity sweeps (`test_sim_sweeps.py`)	integration	nothing	Each single-variable sweep responds as expected (a technique transition, a degradation to failure, recovery within tolerance). Heavier – each sweep navigates several frames.
Pose behavioral (`test_sim_irregular_pose.py`)	integration	nothing	On a wrong-pose irregular body the limb result lands far off while the pose-free blob result stays accurate (asserted per technique).
Real-image structural (`test_image_library.py`)	default	nothing	The operator-curated sidecar catalog validates structurally (schema, classes, uniqueness) without fetching images.
Real-image regression (`test_autonomous_nav.py`, `test_baselines.py`)	integration	holdings + SPICE	Each curated real image navigates to its expected status / tier / offset and matches its recorded baseline – the calibration tripwire.

The simulator’s role across these test kinds is described per phase in the simulator improvement plan and summarized in The Image Simulator. The operator-curated real-image cohort is documented in Image Library.

Characterization runners and updaters

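These are not pytest tests; they are python -m scripts that produce the report figures, the example images, and the regression baselines. All but update_baselines render and navigate in-process and need no holdings; update_baselines navigates real images and therefore needs the holdings environment.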

Command	Produces
`python -m tests.integration.sim_sweep_runner`	Per-sweep response-curve JSON under `sim_sweeps/results/` (gitignored) and the offset / star / roll / mesh figures in the report. Add `--dump-images DIR` to also write every sweep frame as a PNG.
`python -m tests.integration.technique_snr_characterization`	The per-technique accuracy-vs-SNR and accuracy-vs-offset report figures.
`python -m tests.integration.star_snr_characterization`	The star-field centroiding (moment / PSF / adaptive) report figures.
`python -m tests.integration.sim_doc_images`	The developer-guide scene gallery and the report scene images.
`python -m tests.integration.update_sim_baselines`	Regenerates the simulator regression baselines under `sim_baselines/`.
`python -m tests.integration.update_baselines`	Regenerates the real-image regression baselines (needs holdings).

After a deliberate change that shifts a baseline or a figure, rerun the relevant updater or runner and review the diff before committing – the baselines are tripwires, so an unexpected change is a regression to investigate, not to bless blindly. A new invariant scene also needs a baseline (update_sim_baselines); verify no existing baseline shifts.
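
The tripwire idea, a recorded rounded outcome compared on every run, can be sketched as follows. This is an illustration only: the JSON layout, the `status` value, the two-decimal rounding, and the helper names are all assumptions and do not describe the actual files under `sim_baselines/`.

```python
import json

DECIMALS = 2  # assumed rounding precision, chosen only for this illustration


def rounded_outcome(status, offset, decimals=DECIMALS):
    """Reduce a navigation result to the rounded form a baseline would store."""
    return {"status": status, "offset": [round(v, decimals) for v in offset]}


def matches_baseline(outcome, baseline_json):
    """Return True when the rounded outcome equals the recorded baseline."""
    return outcome == json.loads(baseline_json)


# A recorded baseline, as an updater script might have written it.
baseline_json = json.dumps(rounded_outcome("ok", (1.25, -3.5)))

# Sub-millipixel jitter disappears after rounding, so the tripwire stays quiet.
jittered = rounded_outcome("ok", (1.2504, -3.4997))

# A real shift survives rounding and trips the comparison.
shifted = rounded_outcome("ok", (1.31, -3.5))

assert matches_baseline(jittered, baseline_json)
assert not matches_baseline(shifted, baseline_json)
```

Rounding absorbs most jitter but not values that sit on a rounding boundary, which is consistent with keeping such comparisons out of the default tier.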