Docstring updates to fix Sphinx build warnings and errors

Michael Luciuk 2025-09-10 12:41:10 -04:00
parent 0d492c59d2
commit 7cd8d3556b
3 changed files with 46 additions and 105 deletions

View File

@@ -145,100 +145,85 @@ class RadioDataset(ABC):
classes_to_augment: Optional[str | list[str]] = None,
inplace: Optional[bool] = False,
) -> RadioDataset | None:
"""Supplement the dataset with new examples by applying various transformations to the pre-existing examples """
in the dataset. Supplement the dataset with new examples by applying various transformations
to the pre-existing examples in the dataset.
.. todo:: .. todo::
This method is currently under construction, and may produce unexpected results. This method is currently under construction, and may produce unexpected results.
The process of supplementing a dataset to artificially increases the diversity of the examples is called The process of supplementing a dataset to artificially increase the diversity
augmentation. In many cases, training on augmented data can enhance the generalization and robustness of of examples is called augmentation. Training on augmented data can enhance
deep machine learning models. For more information on the benefits and limitations of data the generalization and robustness of deep machine learning models. For more
augmentation, please refer to this tutorial by Abid Ali Awan: `A Complete Guide to Data Augmentation information, see `A Complete Guide to Data Augmentation
<https://www.datacamp.com/tutorial/complete-guide-data-augmentation>`_. <https://www.datacamp.com/tutorial/complete-guide-data-augmentation>`_.
The metadata for each new example will be identical to the metadata of the pre-existing example from Metadata for each new example will be identical to the metadata of the
which it was generated. However, the metadata will be extended to include a 'augmentation' column, which will pre-existing example from which it was generated. The metadata will be
be populated for each new example with the string representation of the transform used to generate it, and left extended to include an 'augmentation' column, populated with the string
empty for all the pre-existing examples. representation of the transform used.
Please note that augmented data should only be utilized for model training, not for testing or validation. Augmented data should only be used for model training, not for testing or
validation.
Unless specified, augmentations are applied equally across classes, maintaining the original class Unless specified, augmentations are applied equally across classes, maintaining
distribution. the original class distribution.
In the case where target_size is not equal to the sum of the original class sizes scaled by an integer If target_size does not match the sum of the original class sizes scaled by
multiple, it is not possible to maintain the original class distribution, so the distribution gets slightly an integer multiple, the class distribution is slightly adjusted to satisfy
skewed to satisfy target_size. To do this, each class size gets divided by the total size and then target_size.
multiplied by target_size, then these values all get rounded to the nearest integers. If the target_size is
not equal to the sum of the rounded sizes, the sizes get sorted based on their decimal portions and then
values are adjusted one by one until the target_size is reached.
:param class_key: Class name that is used to augment from and calculate class distribution. :param class_key: Class name used to augment from and calculate class distribution.
:type class_key: str :type class_key: str
:param augmentations: A function or a list of functions that take as input an example from the :param augmentations: A function or list of functions that take an example
dataset and return a transformed version of that example. If no augmentations are specified, the default and return a transformed version. Defaults to ``default_augmentations()``.
augmentations returned by the ``default_augmentations()`` method will be applied.
:type augmentations: callable or list of callables, optional :type augmentations: callable or list of callables, optional
:param level: The level or extent of data augmentation to apply, ranging from 0.0 (no augmentation) to :param level: The extent of augmentation from 0.0 (none) to 1.0 (full). If
1.0 (full augmentation, where each augmentation is applied to each pre-existing example). ``classes_to_augment`` is specified, can be either:
|br| |br| If ``classes_to_augment`` is specified, this can be either:
* A single float: * A single float: All classes augmented evenly to this level.
All classes are augmented evenly to this level, maintaining the original class distribution. * A list of floats: Each element corresponds to the augmentation level
* A list of floats: target for the corresponding class.
Each element corresponds to the augmentation level target for the corresponding class.
The default is 1.0.
:type level: float or list of floats, optional :type level: float or list of floats, optional
:param target_size: Target size of the augmented dataset. If specified, ``level`` is ignored, and augmentations :param target_size: Target size of the augmented dataset. Overrides ``level``
are applied to expand the dataset to contain the specified number of examples. if specified. If ``classes_to_augment`` is specified, can be either:
If ``classes_to_augment`` is specified, this can be either:
* A single float: * A single float: All classes are augmented proportional to their
All classes are augmented proportional to their relative frequency until the dataset reaches the relative frequency until the dataset reaches target_size.
target size, maintaining the original class distribution. * A list of floats: Each element corresponds to the target size for the
* A list of floats: corresponding class.
Each element in the list corresponds to the target size for the corresponding class.
Defaults to None.
:type target_size: int or list of ints, optional :type target_size: int or list of ints, optional
:param classes_to_augment: List of the metadata keys of the classes to augment. If specified, only these :param classes_to_augment: List of metadata keys of classes to augment.
classes will be augmented. Defaults to None.
:type classes_to_augment: string or list of strings, optional :type classes_to_augment: string or list of strings, optional
:param inplace: If True, the augmentation is performed inplace and ``None`` is returned. Defaults to False. :param inplace: If True, the augmentation is performed inplace and ``None`` is returned.
:type inplace: bool, optional :type inplace: bool, optional
:raises ValueError: If level has any values that are not in the range (0,1]. :raises ValueError: If level has any values not in the range (0,1].
:raises ValueError: If target_size of dataset is already sufficed. :raises ValueError: If target_size of dataset is already sufficed.
:raises ValueError: If a class name in classes_to_augment does not exist in the specified class_key. :raises ValueError: If a class in classes_to_augment does not exist in class_key.
:return: The augmented dataset or None if ``inplace=True``. :return: The augmented dataset or None if ``inplace=True``.
:rtype: RadioDataset or None :rtype: RadioDataset or None
**Examples:** **Examples:**
>>> from ria.dataset_manager.builders import AWGN_Builder() >>> from ria.dataset_manager.builders import AWGN_Builder
>>> builder = AWGN_Builder() >>> builder = AWGN_Builder()
>>> builder.download_and_prepare() >>> builder.download_and_prepare()
>>> ds = builder.as_dataset() >>> ds = builder.as_dataset()
>>> ds.get_class_sizes(class_key='col') >>> ds.get_class_sizes(class_key='col')
{a:100, b:500, c:300} {'a': 100, 'b': 500, 'c': 300}
>>> new_ds = ds.augment(class_key='col', classes_to_augment=['a', 'b'], target_size=1200) >>> new_ds = ds.augment(class_key='col', classes_to_augment=['a', 'b'], target_size=1200)
>>> new_ds.get_class_sizes(class_key='col') >>> new_ds.get_class_sizes(class_key='col')
{a:150 b:750, c:300} {'a': 150, 'b': 750, 'c': 300}
>>> from ria.dataset_manager.builders import AWGN_Builder()
>>> builder = AWGN_Builder()
>>> builder.download_and_prepare()
>>> ds = builder.as_dataset()
>>> ds.get_class_sizes(class_key='col')
{a:50, b:20, c:130}
>>> new_ds = ds.augment(class_key='col', level=0.5)
>>> new_ds.get_class_sizes(class_key='col')
{a:75 b:30, c:195}
""" """
if augmentations is None:
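
The "slightly adjusted" distribution mentioned above was spelled out in the
previous revision of this docstring: each class size is divided by the total
size and multiplied by target_size, the results are rounded to the nearest
integers, and if the rounded sizes do not sum to target_size, the sizes are
sorted by their decimal portions and adjusted one by one until the target is
reached. A minimal sketch of that largest-remainder scheme (illustrative only;
scale_class_sizes is a hypothetical helper, not part of the library):

def scale_class_sizes(sizes: dict[str, int], target_size: int) -> dict[str, int]:
    # Scale each class proportionally, then round to the nearest integer.
    total = sum(sizes.values())
    exact = {k: v / total * target_size for k, v in sizes.items()}
    rounded = {k: round(x) for k, x in exact.items()}
    # If rounding missed the target, nudge the classes with the largest
    # (or smallest) decimal portions one by one until the sum is exact.
    diff = target_size - sum(rounded.values())
    order = sorted(exact, key=lambda k: exact[k] - int(exact[k]), reverse=diff > 0)
    for k in order[:abs(diff)]:
        rounded[k] += 1 if diff > 0 else -1
    return rounded

print(scale_class_sizes({"a": 1, "b": 1, "c": 1}, 100))
# {'a': 34, 'b': 33, 'c': 33} -- sums to exactly 100

The augmentations parameter accepts plain callables. A hedged sketch of a
custom transform (add_gain is illustrative, and the exact structure of the
example passed to each callable depends on the dataset backend):

def add_gain(example):
    # Assumes the example exposes its samples under a "data" field;
    # returns the example with the samples scaled by a constant gain.
    example["data"] = example["data"] * 1.5
    return example

new_ds = ds.augment(class_key='col', augmentations=[add_gain], level=1.0)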

View File

@@ -28,7 +28,7 @@ class Recording:
Metadata is stored in a dictionary of key-value pairs,
including information such as sample_rate and center_frequency.
Annotations are a list of :class:`~ria_toolkit_oss.datatypes.Annotation`,
defining bounding boxes in time and frequency with labels and metadata.

Here, signal data is represented as a NumPy array. This class is then extended in the RIA Backends to provide
@@ -48,7 +48,7 @@ class Recording:
:param metadata: Additional information associated with the recording.
:type metadata: dict, optional
:param annotations: A collection of :class:`~ria_toolkit_oss.datatypes.Annotation` objects defining bounding boxes.
:type annotations: list of Annotations, optional
:param dtype: Explicitly specify the data-type of the complex samples. Must be a complex NumPy type, such as
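
A minimal construction sketch using the parameters above (the metadata keys
are taken from the examples elsewhere in this file; passing dtype explicitly
is optional):

import numpy
from ria_toolkit_oss.datatypes import Recording

samples = numpy.ones(10000, dtype=numpy.complex64)
metadata = {
    "sample_rate": 1e6,          # samples per second
    "center_frequency": 2.44e9,  # Hz
}
recording = Recording(data=samples, metadata=metadata, dtype=numpy.complex64)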
@@ -444,34 +444,6 @@ class Recording:
else:
    raise ValueError(f"Key {key} is protected and cannot be modified or removed.")
def view(self, output_path: Optional[str] = "images/signal.png", **kwargs) -> None:
"""Create a plot of various signal visualizations as a PNG image.
:param output_path: The output image path. Defaults to "images/signal.png".
:type output_path: str, optional
:param kwargs: Keyword arguments passed on to ria_toolkit_oss.view.view_sig.
:type kwargs: dict of keyword arguments, optional
**Examples:**
Create a recording and view it as a plot in a .png image:
>>> import numpy
>>> from ria_toolkit_oss.datatypes import Recording
>>> samples = numpy.ones(10000, dtype=numpy.complex64)
>>> metadata = {
...     "sample_rate": 1e6,
...     "center_frequency": 2.44e9,
... }
>>> recording = Recording(data=samples, metadata=metadata)
>>> recording.view()
"""
from ria_toolkit_oss.view import view_sig
view_sig(recording=self, output_path=output_path, **kwargs)
def to_sigmf(self, filename: Optional[str] = None, path: Optional[os.PathLike | str] = None) -> None:
    """Write recording to a set of SigMF files.
@@ -487,22 +459,6 @@ class Recording:
:raises IOError: If there is an issue encountered during the file writing process.
:return: None
**Examples:**
Create a recording and view it as a plot in a `.png` image:
>>> import numpy
>>> from ria_toolkit_oss.datatypes import Recording
>>> samples = numpy.ones(10000, dtype=numpy.complex64)
>>> metadata = {
... "sample_rate": 1e6,
... "center_frequency": 2.44e9,
... }
>>> recording = Recording(data=samples, metadata=metadata)
>>> recording.view()
""" """
from ria_toolkit_oss.io.recording import to_sigmf
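
A short usage sketch for to_sigmf, assuming a recording constructed as in the
examples above (the filename and path values are hypothetical; per the SigMF
convention, output is written as a .sigmf-meta / .sigmf-data file pair):

# Writes recording.sigmf-meta and recording.sigmf-data under recordings/.
recording.to_sigmf(filename="recording", path="recordings/")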