stouputils.data_science.data_processing.image_preprocess module
- class ImageDatasetPreprocess(techniques: list[ProcessingTechnique] | None = None)
Bases: object
Image dataset preprocessing class. Check the class constructor for more information.
- get_files_recursively(source: str, destination: str, extensions: tuple[str, ...] = ('.jpg', '.jpeg', '.png', '.bmp', '.tiff', '.tif')) → dict[str, str]
Recursively get all files in a directory and their destinations.
- Parameters:
source (str) – Path to the source directory
destination (str) – Path to the destination directory
extensions (tuple[str, ...]) – Tuple of file extensions to consider (e.g. (".jpg", ".png"))
- Returns:
Dictionary mapping source paths to destination paths
- Return type:
dict[str, str]
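The source-to-destination mapping described above can be illustrated with a minimal sketch. This is not the library's implementation: `map_files_recursively` is a hypothetical stand-in, and both the case-insensitive extension match and the mirroring of the relative directory layout under the destination are assumptions.

```python
import os
import tempfile

def map_files_recursively(source, destination,
                          extensions=(".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".tif")):
    """Hypothetical stand-in: map each matching file under `source` to a path under `destination`."""
    mapping = {}
    for root, _dirs, files in os.walk(source):
        for name in files:
            # Assumption: extensions are compared case-insensitively
            if name.lower().endswith(tuple(extensions)):
                src = os.path.join(root, name)
                # Assumption: the relative layout of the source tree is preserved
                rel = os.path.relpath(src, source)
                mapping[src] = os.path.join(destination, rel)
    return mapping

# Usage on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "dataset")
    os.makedirs(os.path.join(source, "cats"))
    for parts in (("cats", "a.jpg"), ("cats", "notes.txt"), ("b.PNG",)):
        open(os.path.join(source, *parts), "w").close()
    mapping = map_files_recursively(source, os.path.join(tmp, "out"))
    # Destination paths relative to the temporary root, for easy inspection
    relative = sorted(os.path.relpath(dst, tmp) for dst in mapping.values())
```

The text file is skipped because its extension is not in the tuple; the two image files keep their relative positions under the destination directory.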
- get_queue(dataset_path: str, destination_path: str) → list[tuple[str, str, list[ProcessingTechnique]]]
Get the queue of images to process with their techniques.
This method converts the value ranges of the processing techniques to fixed values, then builds a queue of files to process by recursively finding all images under the dataset path.
- Parameters:
dataset_path (str) – Path to the dataset directory
destination_path (str) – Path to the destination directory where processed images will be saved
- Returns:
Queue of (source_path, dest_path, techniques) tuples
- Return type:
list[tuple[str, str, list[ProcessingTechnique]]]
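The shape of the returned queue can be sketched as follows. `RangedTechnique`, `FixedTechnique`, `fix_ranges`, and `build_queue` are hypothetical names used only for illustration; they are not part of stouputils. How the library turns a range into a fixed value is not stated here, so the midpoint used below is purely an assumption (random sampling would be another plausible choice).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RangedTechnique:
    """Hypothetical technique whose parameter is given as a (low, high) range."""
    name: str
    low: float
    high: float

@dataclass(frozen=True)
class FixedTechnique:
    """Hypothetical technique whose parameter has been pinned to one value."""
    name: str
    value: float

def fix_ranges(techniques):
    # Assumption: pick the midpoint of each range; the library may choose differently
    return [FixedTechnique(t.name, (t.low + t.high) / 2) for t in techniques]

def build_queue(file_mapping, techniques):
    """Pair every (source, destination) entry with the fixed techniques."""
    fixed = fix_ranges(techniques)
    # Every queue entry shares the same list of fixed techniques
    return [(src, dst, fixed) for src, dst in file_mapping.items()]

file_mapping = {"data/a.jpg": "out/a.jpg", "data/b.png": "out/b.png"}
queue = build_queue(file_mapping, [RangedTechnique("blur", 1.0, 3.0)])
```

Each entry is a `(source_path, dest_path, techniques)` tuple, matching the return type documented above.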
- process_dataset(dataset_path: str, destination_path: str, max_workers: int = 4, ignore_confirmation: bool = False) → None
Preprocess the dataset by applying the given processing techniques to the images.
- Parameters:
dataset_path (str) – Path to the dataset
destination_path (str) – Path to the destination dataset
max_workers (int) – Number of workers to use (defaults to CPU_COUNT, which appears as 4 in the signature above)
ignore_confirmation (bool) – If True, don’t ask for confirmation
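To show how `max_workers` and `ignore_confirmation` might interact with a queue of files, here is a minimal sketch. It is not the library's code: `process_one` and `process_queue` are hypothetical, a plain file copy stands in for the real image processing, and the use of a thread pool (rather than processes) and the wording of the confirmation prompt are assumptions.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor

def process_one(item):
    """Hypothetical worker: copy the file as a stand-in for real image processing."""
    src, dst = item
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.copyfile(src, dst)
    return dst

def process_queue(queue, max_workers=4, ignore_confirmation=False):
    """Process every (source, destination) pair, optionally asking first."""
    if not ignore_confirmation:
        # Assumption: the confirmation is a simple yes/no prompt
        answer = input(f"Process {len(queue)} files? [y/N] ")
        if answer.strip().lower() != "y":
            return []
    # Assumption: a thread pool; the library may use processes instead
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_one, queue))

# Usage on a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "a.jpg")
    with open(src, "wb") as f:
        f.write(b"fake image bytes")
    dst = os.path.join(tmp, "out", "a.jpg")
    written = process_queue([(src, dst)], max_workers=2, ignore_confirmation=True)
    copied_ok = os.path.exists(dst)
```

Passing `ignore_confirmation=True` is what makes a call like this usable in non-interactive scripts, since no prompt is shown.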
- static apply_techniques(path: str, dest: str, techniques: list[ProcessingTechnique], use_padding: bool = True) → None
Apply the processing techniques to the image.
- Parameters:
path (str) – Path to the image
dest (str) – Path to the destination image
techniques (list[ProcessingTechnique]) – List of processing techniques to apply
use_padding (bool) – If True, add padding to the image before applying techniques