Testing

Overview

The test suite has two tiers, separated by the integration marker:

  • The default tier runs with a plain pytest invocation. Everything in it is fast and self-contained: it needs no spacecraft holdings, no SPICE kernels, and no network. Unit tests live here, and so do the in-process simulator tests – every simulated frame is rendered and navigated in memory, so the whole simulator-driven invariant and structural coverage runs without external data.

  • The integration tier is excluded by default (addopts = ["-m", "not integration"] in pyproject.toml) and opted into with -m "" or -m integration. It holds the slow and the archive-backed tests: the real-image regression cohort (which fetches PDS holdings and resolves SPICE geometry) and the heavier or jitter-prone in-process simulator tests.
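
To make the three invocations concrete, the sketch below models which tests each -m expression selects. It is a toy model only: is_selected is a hypothetical helper, not pytest's real marker-expression parser, and it covers just the three expressions named above.

```python
def is_selected(markers, expression):
    """Toy model of the three -m invocations; NOT pytest's real parser."""
    has_marker = "integration" in markers
    if expression == "":
        # An empty expression overrides the addopts default and selects everything.
        return True
    if expression == "integration":
        return has_marker
    if expression == "not integration":
        # The default, injected through addopts in pyproject.toml.
        return not has_marker
    raise ValueError(f"expression not covered by this sketch: {expression!r}")


unit_test = set()                # an unmarked, default-tier test
baseline_test = {"integration"}  # an integration-marked test

for expr in ("not integration", "", "integration"):
    print(repr(expr), is_selected(unit_test, expr), is_selected(baseline_test, expr))
```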

The simulator (see The Image Simulator) is the engine behind both tiers: it lets the suite grow algorithmic-invariant and sensitivity coverage on frames whose true offset is known by construction, without operator labour and without real data.

Running the suite

pytest                              # default tier (fast, no holdings)
pytest -m ""                        # full suite, including integration
pytest -m integration               # only the integration tier
pytest -n auto --dist=loadfile      # parallel, matching CI (loadfile avoids
                                    #   PyQt6 worker crashes)
pytest tests/nav/sim/test_sim_noise.py            # one file
pytest tests/nav/sim/test_sim_noise.py::test_foo  # one test
pytest --cov                        # with coverage

./scripts/run-all-checks.sh         # ruff + mypy + pytest + docs + markdown
./scripts/run-all-checks.sh -i      # the same, including integration tests

pytest-xdist must run with --dist=loadfile; the default scheduling crashes PyQt6 workers when tests from one file are split across processes. Multi-test integration runs should always use -n auto --dist=loadfile.
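
One way to keep the two flags from drifting apart is to build the command line in a single place. The helper below is a hypothetical sketch (pytest_argv is not part of the repository); it only assembles an argument list and guarantees that -n never appears without --dist=loadfile.

```python
import shlex


def pytest_argv(paths=(), integration=False, parallel=True):
    """Assemble a pytest command line; hypothetical helper for illustration."""
    argv = ["pytest"]
    if integration:
        argv += ["-m", "integration"]
    if parallel:
        # Always pair -n with --dist=loadfile so tests from one file
        # stay on one worker.
        argv += ["-n", "auto", "--dist=loadfile"]
    argv += list(paths)
    return argv


print(shlex.join(pytest_argv(integration=True)))
```

The resulting list can be handed to subprocess.run unchanged.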

Archive-backed tests additionally require the holdings and catalog environment (set by CI; see Introduction):

export PDS3_HOLDINGS_DIR=https://pds-rings.seti.org/holdings
export PDS4_HOLDINGS_DIR=https://pds-rings.seti.org/pds4
export OOPS_RESOURCES=https://storage.googleapis.com/rms-node-oops-resources
export UCAC4_PATH=https://storage.googleapis.com/rms-node-star-catalogs/UCAC4
export YBSC_PATH=https://storage.googleapis.com/rms-node-star-catalogs/YBSC
# plus SPICE kernels at $SPICE_PATH for any real navigation run
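
Before launching an archive-backed run it can help to confirm that these variables are actually set. The following is a minimal sketch, assuming nothing about the project beyond the variable names listed above; missing_archive_env is a hypothetical helper, and it checks only that each variable is non-empty, not that the location is reachable.

```python
import os

# Names taken from the export lines above.
ARCHIVE_ENV = (
    "PDS3_HOLDINGS_DIR",
    "PDS4_HOLDINGS_DIR",
    "OOPS_RESOURCES",
    "UCAC4_PATH",
    "YBSC_PATH",
)
# Needed in addition for any real navigation run.
SPICE_ENV = ("SPICE_PATH",)


def missing_archive_env(environ=None, need_spice=True):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    required = ARCHIVE_ENV + (SPICE_ENV if need_spice else ())
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_archive_env()
    if missing:
        print("unset:", ", ".join(missing))
```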

Test kinds

The suite is layered by what a test proves and what it needs. The simulator-only tests need none of the archive environment above.

| Kind (path) | Tier | Requires | What it proves |
|---|---|---|---|
| Unit tests (tests/nav/**) | default | nothing | One component in isolation (config, feature, dataset, obs, model, technique, orchestrator, reproj, support). |
| Simulator unit tests (tests/nav/sim/**) | default | nothing | The renderer’s contracts: determinism, noise, saturation, PSF, stray light, instrument coupling, camera roll, irregular-body rendering and the nav_override channel, scene-schema validation. |
| GUI smoke (tests/main/test_create_simulated_image.py) | default | PyQt6 | Each GUI control wires to the right sim_params field and the scene/JSON round-trip is faithful. |
| Scene structural (test_sim_scenes.py) | default | nothing | Every catalog scene validates, sits in a declared class, has a unique name, and renders. |
| Algorithmic invariants (test_sim_algorithmic_invariants.py) | default | nothing | Each technique recovers its planted offset / roll on a clean scene – correct by construction, so no baseline. |
| End-to-end sim nav (test_sim_navigation.py) | default | nothing | A simulated frame navigates through the full orchestrator and recovers a planted offset. |
| Sim bug regression (test_sim_regression.py) | default | nothing | Fast bug-specific scenes guarding defects the sweeps surfaced. |
| Sim regression baselines (test_sim_baselines.py) | integration | nothing | Every catalog scene re-navigates to its recorded rounded outcome (a tripwire). Integration-marked because the solvers carry sub-millipixel cross-process jitter. |
| Sensitivity sweeps (test_sim_sweeps.py) | integration | nothing | Each single-variable sweep responds as expected (a technique transition, a degradation to failure, recovery within tolerance). Heavier – each sweep navigates several frames. |
| Pose behavioral (test_sim_irregular_pose.py) | integration | nothing | On a wrong-pose irregular body the limb degrades far off while the pose-free blob stays accurate (asserted per technique). |
| Real-image structural (test_image_library.py) | default | nothing | The operator-curated sidecar catalog validates structurally (schema, classes, uniqueness) without fetching images. |
| Real-image regression (test_autonomous_nav.py, test_baselines.py) | integration | holdings + SPICE | Each curated real image navigates to its expected status / tier / offset and matches its recorded baseline – the calibration tripwire. |
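
The baseline rows compare a rounded outcome rather than the raw solver output, which is what lets a tripwire tolerate small run-to-run jitter. The sketch below illustrates the idea under stated assumptions: the two-decimal precision, the tuple layout, and both function names are hypothetical, not the project's actual baseline format.

```python
def rounded_outcome(offset, decimals=2):
    """Round each offset component; the precision here is an assumption."""
    return tuple(round(value, decimals) for value in offset)


def matches_baseline(offset, baseline, decimals=2):
    """True when the rounded offset equals the recorded baseline."""
    return rounded_outcome(offset, decimals) == tuple(baseline)


baseline = (1.23, -4.56)       # recorded, already rounded
run_a = (1.2301, -4.5598)      # two runs that differ by sub-millipixel jitter
run_b = (1.2304, -4.5602)
regressed = (1.2500, -4.5600)  # a real shift of 0.02 px in the first component

print(matches_baseline(run_a, baseline), matches_baseline(run_b, baseline))
print(matches_baseline(regressed, baseline))
```

Rounding does not remove jitter entirely: a value that sits close to a rounding boundary can still flip between runs, which is one plausible reason to keep such comparisons out of the default tier.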

The simulator’s role across these tiers is described per phase in the simulator improvement plan and summarized in The Image Simulator. The operator-curated real-image cohort is documented in Image Library.
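
The "correct by construction" idea behind the algorithmic invariants can be shown with a toy example: render a frame whose offset is planted, recover it, and compare against the planted value, with no stored baseline involved. This sketch uses a Gaussian blob and an intensity-weighted centroid as stand-ins; it does not use the project's simulator or any of its techniques, and every name in it is hypothetical.

```python
import math


def render_blob(size, center, sigma=2.0):
    """Render a noiseless Gaussian blob on a size x size grid (rows of floats)."""
    cx, cy = center
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]


def centroid(image):
    """Intensity-weighted centroid as (x, y)."""
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            total += value
            sum_x += x * value
            sum_y += y * value
    return sum_x / total, sum_y / total


def recover_offset(image, nominal):
    """Offset of the measured centroid from the nominal (predicted) position."""
    cx, cy = centroid(image)
    return cx - nominal[0], cy - nominal[1]


nominal = (16.0, 16.0)   # where the body would sit with zero pointing error
planted = (2.5, -1.25)   # the known, planted offset
frame = render_blob(33, (nominal[0] + planted[0], nominal[1] + planted[1]))
dx, dy = recover_offset(frame, nominal)

# The truth is known by construction, so the assertion needs no baseline file.
assert abs(dx - planted[0]) < 1e-6 and abs(dy - planted[1]) < 1e-6
```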

Characterization runners and updaters

These are not pytest tests; they are scripts run with python -m that produce the report figures, the example images, and the regression baselines. Apart from update_baselines, which needs holdings, they render and navigate in-process and need no holdings.

| Command | Produces |
|---|---|
| python -m tests.integration.sim_sweep_runner | Per-sweep response-curve JSON under sim_sweeps/results/ (gitignored) and the offset / star / roll / mesh figures in the report. Add --dump-images DIR to also write every sweep frame as a PNG. |
| python -m tests.integration.technique_snr_characterization | The per-technique accuracy-vs-SNR and accuracy-vs-offset report figures. |
| python -m tests.integration.star_snr_characterization | The star-field centroiding (moment / PSF / adaptive) report figures. |
| python -m tests.integration.sim_doc_images | The developer-guide scene gallery and the report scene images. |
| python -m tests.integration.update_sim_baselines | Regenerates the simulator regression baselines under sim_baselines/. |
| python -m tests.integration.update_baselines | Regenerates the real-image regression baselines (needs holdings). |

After a deliberate change that shifts a baseline or a figure, rerun the relevant updater or runner and review the diff before committing – the baselines are tripwires, so an unexpected change is a regression to investigate, not to bless blindly. A new invariant scene also needs a baseline (update_sim_baselines); verify no existing baseline shifts.
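
Reviewing a regenerated baseline set is easier when the shifted entries are listed explicitly. The sketch below assumes, purely for illustration, that baselines can be loaded as a mapping from scene name to a recorded outcome; the real on-disk layout under sim_baselines/ is not described here, and diff_baselines and the scene names are hypothetical.

```python
def diff_baselines(old, new):
    """Split a baseline update into changed, added, and removed scene names."""
    changed = {
        name: (old[name], new[name])
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    }
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    return changed, added, removed


old = {"scene_a": (1.23, -4.56), "scene_b": (0.0, 0.5)}
new = {"scene_a": (1.23, -4.56), "scene_b": (0.0, 0.75), "scene_c": (2.0, 2.0)}

changed, added, removed = diff_baselines(old, new)
print("changed:", changed)
print("added:", added)
print("removed:", removed)
```

When a new invariant scene is the only intended change, the expectation is one entry under added and nothing under changed or removed; anything else is a shift to investigate.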