qoherent/RIA_Example

Example RIAHub Repo with files and example outputs from all tools

benchinnery pushed to main at qoherent/RIA_Example

2026-05-26 14:19:46 -04:00

1c81180442 Track .sigmf-meta as LFS so curator can pair recordings

Files 8 items

ben 6b56382315 recording inspector useage		2026-05-21 20:17:12 -04:00
.ria	updated example reference	2026-05-20 13:04:10 -04:00
Applications	Add Applications/_init_.py	2026-04-22 09:19:47 -04:00
curator-configs	updated example reference	2026-05-20 13:04:10 -04:00
Datasets	updated example reference	2026-05-20 13:04:10 -04:00
Models	Add models	2026-04-24 09:43:46 -04:00
Recordings	updated example reference	2026-05-20 13:04:10 -04:00
.gitattributes	updated example reference	2026-05-20 13:04:10 -04:00
README.md	recording inspector useage	2026-05-21 20:17:12 -04:00

README.md

RIA Hub — Example Files

This repository contains example input files, configurations, and expected outputs for every tool on RIA Hub. If you are new to the platform, start here. Download any file and follow the walkthrough for the tool you want to try.

What is RIA Hub?

RIA Hub is a collaborative platform for RF and machine learning workflows. It combines a Git-based repository system with a suite of specialized tools that cover the full pipeline from raw IQ recordings to live inference deployments:

Stage	Tools
Collect	Library — browse, organize, and share RF recordings and models
Curate	Dataset Manager — slice, qualify, augment, and inspect radio datasets
Train	Model Builder — train, optimize, and compress PyTorch models
Deploy	Application Packager — compose and build inference applications
Run	Screens — deploy live RF inference pipelines on real hardware

Repository Structure

RIA_Example/
│
├── Recordings/
│   ├── bpsk_915MHz/
│   │   ├── device_A_bpsk.sigmf-data      # Binary IQ samples (cf32_le, tracked via Git LFS)
│   │   └── device_A_bpsk.sigmf-meta      # JSON metadata: sample rate, frequency, labels
│   ├── qpsk_915MHz/
│   │   ├── device_A_qpsk.sigmf-data
│   │   └── device_A_qpsk.sigmf-meta
│   └── qam16_2400MHz/
│       ├── device_B_qam16.sigmf-data
│       └── device_B_qam16.sigmf-meta
│
├── Datasets/
│   ├── example_radio_dataset.h5          # Curated from recordings above (Curator output / Model Trainer input)
│   └── example_synthetic_dataset.h5      # Synthetically generated (Generator output)
│
├── Models/
│   ├── example_model.ckpt                # PyTorch checkpoint (Model Trainer output)
│   └── example_model.onnx               # Exported ONNX model (Screens / Application Packager input)
│
├── curator-configs/
│   └── example_curator_config.json       # Reference configuration for the Curator tool
│
└── .ria/
    └── train.yaml                        # Example Model Trainer workflow (Gitea Actions)

All binary files (.sigmf-data, .h5, .ckpt, .onnx) are tracked with Git LFS, and the .sigmf-meta files are tracked the same way so that the Curator can pair recordings. Clone with LFS enabled to get the actual file content.
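If a clone was made without LFS, the tracked files contain small text pointers instead of real data. The following is a minimal sketch for spotting that case, assuming the standard Git LFS pointer format (a text file whose first line starts with `version https://git-lfs.github.com/spec/v1`). The helper names `is_lfs_pointer` and `find_unfetched` are illustrative and not part of RIA Hub.

```python
from pathlib import Path

# First line of a Git LFS pointer file (assumption: standard pointer format).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file holds only an LFS pointer, not real content."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_unfetched(root, suffixes=(".sigmf-data", ".h5", ".ckpt", ".onnx")):
    """List files under root with the given suffixes that are still pointers."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.name.endswith(suffixes) and is_lfs_pointer(p)
    )
```

Any path this reports still needs its content fetched, for example by running `git lfs pull` inside the clone.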

Tool Walkthroughs

Library

The Library is a cross-repository browser for all RF and ML assets on the platform. It automatically discovers files pushed to any repository you have access to.

To explore the example recordings:

1. Push this repository (or clone it into your RIA Hub instance). The SigMF file pairs in Recordings/ are tracked via Git LFS and are indexed automatically on push.
2. Navigate to Library in the top navigation bar.
3. Select the Recordings tab. Each .sigmf-data/.sigmf-meta pair appears as a row with metadata and a spectrogram thumbnail.
4. Click any row to open the Quick View panel, then use the spectrogram, constellation, PSD, and time series tabs to assess signal quality.
5. To mark and export specific time segments, go to Dataset Manager → Recording Inspector, select a repository and recording, then drag keep/discard regions on the spectrogram.

Supported asset types in the Library:

Type	Extension	Description
Recording	`.sigmf-data` + `.sigmf-meta`	SigMF IQ capture pairs — preferred format for the Library and Curator
Radio Dataset	`.h5` / `.hdf5`	Labelled, curated training datasets
PyTorch Module	`.py`	PyTorch model definitions with a nn.Module class
PyTorch State Dict	`.pt` / `.pth`	Model weights / state dictionaries
PyTorch Checkpoint	`.ckpt`	Training checkpoints with weights, optimizer state, and metadata
ONNX Graph	`.onnx`	Portable inference models
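The table above amounts to an extension lookup plus a pairing rule for SigMF files. The sketch below is a rough illustration of that idea and is not the Library's actual indexing code; `ASSET_TYPES` and `classify` are hypothetical names. It also does not check whether a `.py` file really defines an nn.Module class.

```python
from pathlib import PurePosixPath

# Extension-to-type mapping copied from the table above.
ASSET_TYPES = {
    ".h5": "Radio Dataset",
    ".hdf5": "Radio Dataset",
    ".py": "PyTorch Module",
    ".pt": "PyTorch State Dict",
    ".pth": "PyTorch State Dict",
    ".ckpt": "PyTorch Checkpoint",
    ".onnx": "ONNX Graph",
}


def classify(paths):
    """Map each path to an asset type; SigMF data counts only with its meta file."""
    names = set(paths)
    result = {}
    for p in paths:
        path = PurePosixPath(p)
        if path.suffix == ".sigmf-data":
            # A recording needs the matching .sigmf-meta next to it.
            if str(path.with_suffix(".sigmf-meta")) in names:
                result[p] = "Recording"
        elif path.suffix in ASSET_TYPES:
            result[p] = ASSET_TYPES[path.suffix]
    return result
```

In this sketch a `.sigmf-data` file without its `.sigmf-meta` partner is simply left out of the result.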

Recording Inspector

The Recording Inspector lets you visually select time segments from a recording, preview the export, and write the resulting SigMF file pairs to a repository branch.

Example files: Recordings/bpsk_915MHz/, Recordings/qpsk_915MHz/, Recordings/qam16_2400MHz/

Steps:

1. Go to Dataset Manager → Recording Inspector.
2. In Repository, type or select the repository containing the example recordings (e.g. qoherent/RIA_Example).
3. In Recording, select a .sigmf-data file. The spectrogram is generated automatically, which takes 5–30 seconds.
4. Switch to Keep mode and drag on the spectrogram to mark the region you want to export.
   - Use Discard mode to cut out interference or noise within a keep region.
   - Use Pan mode to scroll and zoom to a different part of the recording.
5. Check the Export Preview panel. It shows the derived segments, total duration, and any warnings about short segments.
6. Set an Export destination: repository, branch, and folder.
7. Click Export. The segments are written as SigMF file pairs to the destination branch.

The exported segments appear in the Library once the job completes and can be fed directly into the Curator.
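The way keep and discard regions combine into exported segments can be pictured as interval subtraction. This is a sketch under stated assumptions, not the Inspector's implementation: regions are `(start, end)` times in seconds, keep regions do not overlap each other, and `min_len` is a hypothetical threshold for the short-segment warning.

```python
def derive_segments(keep, discard, min_len=0.0):
    """Subtract discard regions from keep regions.

    Returns (segments, warnings), where segments is a list of (start, end)
    tuples and warnings lists segments shorter than min_len.
    """
    segments = []
    for k_start, k_end in sorted(keep):
        cursor = k_start  # left edge of the part not yet emitted
        for d_start, d_end in sorted(discard):
            if d_end <= cursor or d_start >= k_end:
                continue  # no overlap with what remains of this keep region
            if d_start > cursor:
                segments.append((cursor, d_start))
            cursor = max(cursor, d_end)
        if cursor < k_end:
            segments.append((cursor, k_end))
    warnings = [
        f"segment {s:.3f}-{e:.3f} s is shorter than {min_len} s"
        for s, e in segments
        if e - s < min_len
    ]
    return segments, warnings


segments, warnings = derive_segments([(1.0, 5.0)], [(2.0, 3.0)], min_len=1.5)
total_duration = sum(e - s for s, e in segments)
```

Here one keep region with a discard region inside it yields two segments, and the first one triggers a short-segment warning.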

Dataset Manager — Curator

The Curator takes raw IQ recordings and produces a labelled, ready-to-train HDF5 dataset. It applies a configurable DSP pipeline: slicing, quality filtering, and optional augmentation.

Example files: Recordings/ (three SigMF pairs), curator-configs/example_curator_config.json
Expected output: Datasets/example_radio_dataset.h5

Steps:

1. Push the recordings to a repository. The Curator reads from the Library once files are indexed.
2. Go to Dataset Manager → Curator.
3. Select your recordings from the Library panel on the left.
4. Configure the pipeline using the settings below, or use curator-configs/example_curator_config.json as a reference:
   - Data type: IQ
   - Slicer: simple, slice length 1024
   - Qualifier: rms, minimum threshold 0.01 (filters out silent/noise-only slices)
   - Augmentation: basic policy, 2 augmented copies per slice
5. Set a dataset name, description, and radio task label.
6. Click Curate. A progress bar tracks the Celery task.
7. When complete, commit the output dataset to your repository.

Slicer options:

Slicer	Best for
`simple`	Fixed-length slices, good starting point
`random`	Randomized slice positions
`overlap`	Overlapping slices for smaller datasets

Qualifier options:

Qualifier	Filters on
`rms`	Root mean square amplitude
`snr`	Signal-to-noise ratio estimate
`energy`	Total signal energy
`bandwidth`	Occupied bandwidth
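To make the slicer and qualifier settings concrete, here is a minimal pure-Python sketch of fixed-length and overlapping slicing followed by an RMS threshold. It is an illustration under assumptions, not the Curator's code: the function names, the 50% default overlap, and the use of Python complex numbers for IQ samples are all choices made for this example.

```python
import math


def slice_starts(n_samples, length, mode="simple", overlap=0.5):
    """Start indices of full-length slices for the simple or overlap slicer."""
    step = length if mode == "simple" else max(1, int(length * (1 - overlap)))
    return list(range(0, n_samples - length + 1, step))


def rms(samples):
    """Root mean square amplitude of a sequence of complex samples."""
    return math.sqrt(sum(abs(x) ** 2 for x in samples) / len(samples))


def curate(iq, length=1024, min_rms=0.01, mode="simple"):
    """Slice IQ samples, drop quiet slices, and return rows of [I, Q]."""
    out = []
    for start in slice_starts(len(iq), length, mode):
        chunk = iq[start:start + length]
        if rms(chunk) >= min_rms:
            out.append([[x.real for x in chunk], [x.imag for x in chunk]])
    return out  # nested lists with shape (N, 2, L)


# One active slice, one silent slice, and a tail too short to form a slice.
iq = [1 + 0j] * 1024 + [0j] * 1024 + [0.5j] * 500
slices = curate(iq)
```

With the default threshold of 0.01 the silent middle slice is filtered out, which is the behaviour the rms qualifier is described as providing.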

Dataset Manager — Inspector

The Inspector runs diagnostic analysis on an existing dataset — class balance, per-class statistics, anomaly detection, and dataset comparisons.

Example file: Datasets/example_radio_dataset.h5

Steps:

1. Go to Dataset Manager → Inspector.
2. Select example_radio_dataset.h5 from the file picker.
3. Choose an analysis type:
   - Balance: see how many samples exist per class label
   - Per-Class Stats: per-class mean, std, and distribution
   - Anomaly Detection: flag outlier samples
   - Compare: select a second dataset to diff against
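As a rough picture of what the Balance analysis reports, class balance can be computed from the per-slice labels alone. The sketch below assumes the labels have already been read into a Python list; `class_balance` and `imbalance_ratio` are hypothetical helper names, not Inspector APIs.

```python
from collections import Counter


def class_balance(labels):
    """Return {label: (count, fraction)} ordered from most to least common."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.most_common()}


def imbalance_ratio(labels):
    """Ratio of the largest class to the smallest class (1.0 means balanced)."""
    counts = Counter(labels).values()
    return max(counts) / min(counts)


labels = ["bpsk"] * 6 + ["qpsk"] * 3 + ["qam16"] * 1
balance = class_balance(labels)
```

A large imbalance ratio is a hint to collect more recordings for the rare class or to rebalance before training.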

Dataset Manager — Generator

The Generator creates synthetic labelled datasets from a parameter sweep without requiring any hardware or recordings.

Expected output: Datasets/example_synthetic_dataset.h5

Steps:

1. Go to Dataset Manager → Generator.
2. Configure a modulation sweep:
   - Sampling strategy: grid
   - Parameters: SNR from -5 to 20 dB in steps of 5; modulation types: [BPSK, QPSK, 8PSK, 16QAM]
   - Signal: length 1024, sample rate 1e6
   - Channel model: awgn
   - Output backend: pytorch
3. Click Generate. The task runs in the background.
4. Download the resulting .h5 file when complete.
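The grid sweep above expands to 6 SNR values times 4 modulations, or 24 parameter combinations. The following sketch shows that expansion together with a simple AWGN channel. It is not the Generator's implementation: it uses one sample per symbol with no pulse shaping, unit-power constellations, and hypothetical function names.

```python
import cmath
import itertools
import math
import random

SNRS_DB = list(range(-5, 21, 5))  # -5, 0, 5, 10, 15, 20
MODULATIONS = ["BPSK", "QPSK", "8PSK", "16QAM"]


def grid(snrs=SNRS_DB, mods=MODULATIONS):
    """Every (modulation, snr_db) combination of the sweep."""
    return list(itertools.product(mods, snrs))


def constellation(mod):
    """Unit-average-power constellation points for a modulation name."""
    if mod == "16QAM":
        levels = (-3, -1, 1, 3)
        return [complex(i, q) / math.sqrt(10) for i in levels for q in levels]
    order = {"BPSK": 2, "QPSK": 4, "8PSK": 8}[mod]
    return [cmath.exp(2j * math.pi * k / order) for k in range(order)]


def add_awgn(signal, snr_db, rng):
    """Add complex Gaussian noise for a target SNR, assuming unit signal power."""
    sigma = math.sqrt(10 ** (-snr_db / 10) / 2)  # std dev per I/Q component
    return [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in signal]


def generate(mod, snr_db, length=1024, seed=0):
    """One noisy example: random symbols passed through the AWGN channel."""
    rng = random.Random(seed)
    points = constellation(mod)
    clean = [rng.choice(points) for _ in range(length)]
    return add_awgn(clean, snr_db, rng)


examples = {(mod, snr): generate(mod, snr) for mod, snr in grid()}
```

A real dataset would generate many examples per grid point; this sketch produces one per combination to keep the structure visible.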

Model Builder — Model Trainer

The Model Trainer builds a training workflow YAML and commits it to your repository. A Gitea Actions runner then executes the training job.

Example files: Datasets/example_radio_dataset.h5, Models/example_model.ckpt (optional pre-trained start)
Expected output: .riahub/workflows/train.yaml in your repository, plus a trained example_model.ckpt artifact

Steps:

1. Go to Model Builder → Model Trainer.
2. In Repository, select the repository where you want to store the workflow and output artifacts.
3. In Model, choose an architecture (e.g. ResNet1D) or use example_model.ckpt as a starting checkpoint.
4. In Dataset, select example_radio_dataset.h5 from the Library.
5. Configure training:
   - Optimizer: Adam, learning rate 1e-3
   - Epochs: 20
   - Batch size: 64
   - Criterion: CrossEntropyLoss
6. Enable ONNX Export in the Evaluation section to automatically export the trained model.
7. Click Commit Workflow. A train.yaml is committed to .riahub/workflows/ and a CI run starts.
8. Monitor the run in Actions within your repository.

The committed workflow file matches .ria/train.yaml in this repository.

Model Builder — Hyperparameter Optimization

HPO runs a sweep over a configurable search space, training multiple model variants and ranking them by a target metric.

Example file: Datasets/example_radio_dataset.h5
Expected output: .riahub/workflows/hpo.yaml

Steps:

1. Go to Model Builder → HPO.
2. Configure the same model and dataset as in Model Trainer.
3. In the Search Space panel, define ranges to sweep:
   - Learning rate: 1e-4 to 1e-2 (log scale)
   - Optimizer: [Adam, SGD]
   - Batch size: [32, 64, 128]
4. Set Trials: 12 and Target metric: val_accuracy.
5. Click Commit Workflow. See workflows/hpo.yaml for the expected output format.
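To illustrate what a 12-trial sweep over this search space looks like, the sketch below draws trials by random search and ranks them by the target metric. Random search is an assumption made for the example; the search strategy HPO actually uses is not stated here. The `evaluate` callback stands in for a full training run that returns val_accuracy.

```python
import math
import random


def sample_trial(rng):
    """Draw one configuration; the learning rate is sampled on a log scale."""
    log_lr = rng.uniform(math.log10(1e-4), math.log10(1e-2))
    return {
        "lr": 10 ** log_lr,
        "optimizer": rng.choice(["Adam", "SGD"]),
        "batch_size": rng.choice([32, 64, 128]),
    }


def run_sweep(evaluate, n_trials=12, seed=0):
    """Evaluate n_trials configurations and rank them best-first."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        params = sample_trial(rng)
        results.append((evaluate(params), params))
    # Sort on the score only, so parameter dicts are never compared.
    return sorted(results, key=lambda r: r[0], reverse=True)
```

Sampling the exponent uniformly gives each decade of learning rate equal weight, which is what sweeping on a log scale means.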

Model Builder — Model Compression

Compression applies pruning and/or quantization to reduce model size for edge deployment. The output is an ONNX file.

Example files: Models/example_model.ckpt, Datasets/example_radio_dataset.h5
Expected output: Models/example_model.onnx

Steps:

1. Go to Model Builder → Compression.
2. Select example_model.ckpt as the source model and example_radio_dataset.h5 as the calibration dataset.
3. Configure the compression pipeline (pruning ratio, quantization bits).
4. Click Commit Workflow. The Actions job exports the compressed model to ONNX automatically.
5. The resulting .onnx file is committed back to your repository.
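For readers unfamiliar with the two compression parameters, the sketch below shows what a pruning ratio and a quantization bit width mean on a plain list of weights. It uses magnitude pruning and symmetric uniform quantization as stand-ins; which algorithms the Compression tool applies is not specified here, and the function names are hypothetical.

```python
def prune(weights, ratio):
    """Zero out the fraction `ratio` of weights with the smallest magnitude."""
    k = int(len(weights) * ratio)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    pruned, removed = [], 0
    for w in weights:
        if abs(w) <= threshold and removed < k:
            pruned.append(0.0)
            removed += 1
        else:
            pruned.append(w)
    return pruned


def quantize(weights, bits):
    """Round weights to a signed `bits`-bit grid and map back to floats."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax or 1.0  # avoid a zero scale
    return [round(w / scale) * scale for w in weights]


sparse = prune([0.1, -0.5, 0.05, 0.9], ratio=0.5)
compact = quantize(sparse, bits=8)
```

A higher pruning ratio or fewer bits shrinks the model further at the cost of accuracy, which is why a calibration dataset is part of the workflow.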

Application Packager — Application Composer

The Application Composer is a visual node-graph editor for wiring together C++ operator blocks into an inference application. The output is an application JSON file.

Example file: Applications/example_application.json

Steps:

1. Go to Application Packager → Application Composer.
2. Browse the Operators panel on the left. Drag an operator onto the canvas.
3. Wire operator ports together by dragging from an output port to an input port.
4. Configure each operator's parameters in the sidebar.
5. Click Commit Application to save example_application.json to your repository.
6. Click Build to trigger a build workflow on a registered runner.

The application JSON format is documented in schemas/application/ria_application.schema.json. See Applications/example_application.json for a minimal working example.

Target profiles:

Profile	Use when
`native-x86`	Standard x86 Linux deployment
`native-arm64`	ARM edge devices
`nvidia-x86`	GPU-accelerated inference on x86
`RIA Screens`	Web-based runtime environment for monitoring the app

Screens

Screens deploys a packaged RF inference application to a live pipeline. You build an app in Application Composer, configure a data source (live SDR, file playback, or synthetic), and start the pipeline. Results stream back to the browser in real time.

App package format

A Screens app package is a .tar.gz containing:

manifest.json — describes the app (models, GUI layout, data source, preprocessor)
ONNX model file(s) at the path(s) listed in manifest.models[].path
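Assembling such a package can be sketched with the Python standard library. This is an illustration under assumptions: only `models[].path` is taken from the description above, the other manifest sections (GUI layout, data source, preprocessor) are omitted because their keys are not given here, and `build_package` is a hypothetical helper.

```python
import io
import json
import tarfile


def build_package(out_path, manifest, model_files):
    """Write a .tar.gz holding manifest.json plus the listed model files.

    model_files maps each archive path to the file's bytes.
    """
    listed = {m["path"] for m in manifest["models"]}
    missing = listed - set(model_files)
    if missing:
        raise ValueError(f"manifest lists models not provided: {sorted(missing)}")
    entries = {"manifest.json": json.dumps(manifest, indent=2).encode()}
    entries.update(model_files)
    with tarfile.open(out_path, "w:gz") as tar:
        for name, payload in entries.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
```

Checking the manifest against the supplied files before writing catches the common mistake of a model path that does not exist in the archive.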

Data source types:

Type	Description
`synthetic`	Built-in AWGN tone generator — no hardware required
`recording`	Play back a `.h5` IQ recording from the Library
`sdr`	Live data from a connected SDR device
`agent`	Live data from a remote SDR via an edge agent node
`numpy_raw`	Play back a `.npy` raw IQ file

ONNX model requirements

The Zone Fingerprinting model contract:

Input: iq_features — shape [1, 128], dtype float32
Output: scores — shape [1, 5], dtype float32 (softmax probabilities)
Opset: >= 13

When building your own Screens app, export your model to ONNX with matching input/output names and shapes, then reference them in manifest.json.

RIA Projects

Projects group your datasets, models, training runs, and deployed applications into a single tracked entity. The project dashboard shows a three-stage pipeline view: Data Management → Model Building → Deployment.

Steps:

1. Go to Projects → New Project.
2. Name your project and create it.
3. Link assets from the Library using the Link Asset button on each pipeline stage.
4. As you run Curator, Model Trainer, and Screens jobs, link the outputs to track progress through the pipeline.

File Format Reference

SigMF Recording (`.sigmf-data` + `.sigmf-meta`)

Each recording is a pair of files committed together:

File	Contains
`.sigmf-data`	Raw IQ samples as interleaved float32 little-endian (`cf32_le`) — I[0], Q[0], I[1], Q[1], …
`.sigmf-meta`	JSON metadata: `global` (sample rate, frequency, hardware, RIA-specific fields), `captures` (capture start info), `annotations` (time/frequency regions with labels)

The annotations[].core:label field is what the Curator reads to assign class labels when slicing a recording into a dataset. See Recordings/bpsk_915MHz/device_A_bpsk.sigmf-meta for an annotated example.
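Reading a pair outside the platform takes only a few lines. The sketch below uses the standard library and assumes the standard SigMF field names (`core:datatype`, `core:sample_start`, `core:sample_count`, `core:label`); `read_sigmf` and `labelled_regions` are illustrative helpers and only the `cf32_le` datatype is handled.

```python
import array
import json
import sys


def read_sigmf(base_path):
    """Load <base_path>.sigmf-meta and <base_path>.sigmf-data (cf32_le only)."""
    with open(base_path + ".sigmf-meta") as f:
        meta = json.load(f)
    datatype = meta["global"]["core:datatype"]
    if datatype != "cf32_le":
        raise ValueError(f"unsupported datatype: {datatype}")
    floats = array.array("f")  # 32-bit floats in native byte order
    with open(base_path + ".sigmf-data", "rb") as f:
        floats.frombytes(f.read())
    if sys.byteorder == "big":
        floats.byteswap()  # file data is little-endian
    # De-interleave I[0], Q[0], I[1], Q[1], ... into complex samples.
    iq = [complex(i, q) for i, q in zip(floats[0::2], floats[1::2])]
    return meta, iq


def labelled_regions(meta):
    """(label, sample_start, sample_count) for each labelled annotation."""
    return [
        (a["core:label"], a["core:sample_start"], a.get("core:sample_count"))
        for a in meta.get("annotations", [])
        if "core:label" in a
    ]
```

For large recordings the same layout can be read with NumPy using `np.fromfile(path, dtype="<c8")`, which avoids building a Python list.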

HDF5 Radio Dataset (`.h5`)

Curated and generated datasets share a common HDF5 layout:

dataset.h5
├── data                        # IQ samples, shape (N, 2, L) float32
│                               #   N = number of slices
│                               #   2 = I channel (index 0) and Q channel (index 1)
│                               #   L = slice length in samples (e.g. 1024)
├── metadata/
│   ├── metadata                # Per-slice structured array: rec_id, label, modulation,
│   │                           #   snr_db, center_freq, sample_rate, device_id, split, ...
│   └── about                   # Dataset-level: author, date_of_data_generation, dataset_version
└── attrs                       # File-level: license, source_type, collection metadata

Application JSON (`application.json`)

{
  "app_name": "my_inference_app",
  "backend": "native",
  "target_profile": "native-x86",
  "ops": [
    {
      "name": "source",
      "class_name": "UDPSourceOp",
      "type": "source",
      "inputs": [],
      "outputs": [{ "name": "output" }],
      "specs": [
        { "name": "port", "value": "5000", "arg_type": "int" }
      ]
    }
  ],
  "flows": [
    {
      "upstream": "source",
      "downstream": "inference",
      "port_pairs": { "output": "input" }
    }
  ]
}
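A quick consistency check for this format is to confirm that every flow endpoint names an operator defined in `ops`. The sketch below is a hypothetical helper, not a replacement for validating against schemas/application/ria_application.schema.json. It is applied to a trimmed copy of the example above.

```python
def undefined_flow_endpoints(app):
    """Return operator names used in flows but not defined in ops."""
    defined = {op["name"] for op in app.get("ops", [])}
    missing = set()
    for flow in app.get("flows", []):
        for end in ("upstream", "downstream"):
            if flow[end] not in defined:
                missing.add(flow[end])
    return sorted(missing)


# Trimmed copy of the example: one op and one flow.
example = {
    "ops": [{"name": "source", "class_name": "UDPSourceOp"}],
    "flows": [
        {"upstream": "source", "downstream": "inference",
         "port_pairs": {"output": "input"}}
    ],
}
print(undefined_flow_endpoints(example))  # ['inference']
```

The example defines only the `source` operator, so the check reports `inference`. A complete application would also need an op with that name in `ops`.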

Getting Help

Full platform documentation: RIA Hub Docs — covers every tool with step-by-step guides and format references.
Open an issue in this repository if an example file is missing, broken, or out of date.
For tool-specific questions, use the in-app help panels (the ? icon on each tool page).
