stouputils.data_science.dataset package
Package for advanced dataset handling.
Provides comprehensive tools for loading, processing and managing image datasets with special handling for augmented data and group-aware operations.
Main Components:
- Dataset : Core class for storing and managing dataset splits with metadata
- DatasetLoader : Handles dataset loading from directories with various strategies
- DatasetSplitter : Manages stratified splitting while maintaining group integrity
- GroupingStrategy : Enum defining image grouping approaches (NONE/SIMPLE/CONCATENATE)
- XyTuple : Specialized container for features/labels with file tracking
Key Features:
- Augmented data handling with original file mapping
- Prevention of data leakage between train/test sets
- Support for multiple grouping strategies at subject/image level
- Class-aware dataset splitting with stratification
- Comprehensive metadata tracking (class distributions, file paths)
- Compatibility with keras.utils.image_dataset_from_directory
- Group-aware k-fold cross-validation support
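The group-aware features listed above can be illustrated with a small standalone sketch. It does not use the stouputils API. Every name in it (`original_of`, `group_by_original`, `group_aware_split`, `group_kfold`) and the `_aug_` file-naming convention are assumptions made for illustration only. The idea shown is that each augmented file is mapped back to its original, and whole groups are assigned to one side of a split, so that no original and its augmentations end up on both sides.

```python
import random
from collections import defaultdict


def original_of(path: str, marker: str = "_aug_") -> str:
    """Map an augmented file name to its original (hypothetical naming convention)."""
    stem, dot, ext = path.rpartition(".")
    if marker in stem:
        stem = stem.split(marker)[0]
    return f"{stem}{dot}{ext}"


def group_by_original(filepaths):
    """Return {original file name: [indices of the original and its augmentations]}."""
    groups = defaultdict(list)
    for i, path in enumerate(filepaths):
        groups[original_of(path)].append(i)
    return dict(groups)


def group_aware_split(filepaths, labels, test_size=0.25, seed=0):
    """Stratified split by original file; augmented files never reach the test set."""
    groups = group_by_original(filepaths)
    # Bucket originals by class, assuming all members of a group share one label.
    by_class = defaultdict(list)
    for original, idxs in groups.items():
        by_class[labels[idxs[0]]].append(original)
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(by_class):
        originals = sorted(by_class[cls])
        rng.shuffle(originals)
        # At least one original per class goes to the test set in this sketch.
        n_test = max(1, round(len(originals) * test_size))
        for k, original in enumerate(originals):
            idxs = groups[original]
            if k < n_test:
                # Keep only the un-augmented original on the test side.
                test.extend(i for i in idxs if filepaths[i] == original)
            else:
                train.extend(idxs)
    return sorted(train), sorted(test)


def group_kfold(filepaths, n_splits=2, seed=0):
    """Yield (train, val) index lists; each group lands in exactly one val fold.

    This simplified version is not stratified by class.
    """
    groups = group_by_original(filepaths)
    originals = sorted(groups)
    random.Random(seed).shuffle(originals)
    for fold in range(n_splits):
        val_originals = set(originals[fold::n_splits])
        val = [i for o in val_originals for i in groups[o]]
        train = [i for o in originals if o not in val_originals for i in groups[o]]
        yield sorted(train), sorted(val)


# Toy data: two classes, two originals each, some with augmented copies.
filepaths = [
    "cat_01.png", "cat_01_aug_0.png", "cat_01_aug_1.png",
    "cat_02.png", "cat_02_aug_0.png",
    "dog_01.png", "dog_01_aug_0.png",
    "dog_02.png",
]
labels = [0, 0, 0, 0, 0, 1, 1, 1]

train_idx, test_idx = group_aware_split(filepaths, labels, test_size=0.5, seed=0)
print("train:", [filepaths[i] for i in train_idx])
print("test: ", [filepaths[i] for i in test_idx])
```

Splitting on originals rather than on individual files is what prevents leakage: shuffling the files directly could place `cat_01.png` in the test set while its augmented copies remain in training.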
Submodules
- stouputils.data_science.dataset.dataset module
  - DEFAULT_IMAGE_KWARGS
  - Dataset
    - Dataset._training_data
    - Dataset._val_data
    - Dataset._test_data
    - Dataset.num_classes
    - Dataset.name
    - Dataset.loading_type
    - Dataset.grouping_strategy
    - Dataset.labels
    - Dataset.class_distribution
    - Dataset.original_dataset
    - Dataset._get_num_classes()
    - Dataset._update_class_distribution()
    - Dataset.exclude_augmented_images_from_val_test()
    - Dataset.get_experiment_name()
- stouputils.data_science.dataset.dataset_loader module
- stouputils.data_science.dataset.grouping_strategy module
- stouputils.data_science.dataset.image_loader module
- stouputils.data_science.dataset.xy_tuple module
  - XyTuple
    - XyTuple._X
    - XyTuple._y
    - XyTuple.filepaths
    - XyTuple.augmented_files
    - XyTuple.n_samples
    - XyTuple.is_empty()
    - XyTuple.update_augmented_files()
    - XyTuple.group_by_original()
    - XyTuple.get_indices_from_originals()
    - XyTuple.create_subset()
    - XyTuple.remove_augmented_files()
    - XyTuple.split()
    - XyTuple.kfold_split()
    - XyTuple.ungrouped_array()
    - XyTuple.empty()