API
This class represents an image [or collection of batched images] and allows for simple conversion between formats (PIL/NumPy ndarray/PyTorch Tensor) and support for common image operations, regardless of input dtype, batching, normalization, etc.
- class im.ChannelRange(value)
Bases: Enum
An enumeration.
- BOOL = 'BOOL'
- FLOAT = 'FLOAT'
- UINT8 = 'UINT8'
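As a rough illustration of how the three members relate to value bounds, the sketch below defines a stand-in enum and a hypothetical `range_max_for` helper. Neither is the library's own code; the UINT8 and FLOAT bounds follow the `range_max` description in this reference, and the BOOL bound of 1 is an assumption.

```python
from enum import Enum


class ChannelRange(Enum):
    """Stand-in mirroring the documented members; not the library's class."""

    BOOL = "BOOL"
    FLOAT = "FLOAT"
    UINT8 = "UINT8"


def range_max_for(channel_range: ChannelRange):
    """Hypothetical helper: largest pixel value for each range."""
    bounds = {
        ChannelRange.UINT8: 255,   # documented example for range_max
        ChannelRange.FLOAT: 1.0,   # documented example for range_max
        ChannelRange.BOOL: 1,      # assumption made for this sketch
    }
    return bounds[channel_range]
```

Because the member values are plain strings, a member can also be looked up by value, e.g. `ChannelRange("FLOAT")`.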
- class im.Im(arr: Union[Im, Tensor, Image, list[PIL.Image.Image], tuple[PIL.Image.Image], ndarray, str, Path], channel_range: Optional[ChannelRange] = None, **kwargs)
Bases: object
This class represents an image [or collection of batched images] and allows for simple conversion between formats (PIL/NumPy ndarray/PyTorch Tensor) and support for common image operations, regardless of input dtype, batching, normalization, etc.
Note: Be careful when using this class directly as part of a training pipeline. Many operations will cause the underlying data to convert between formats (e.g., Tensor -> Pillow) and move the data back to system memory and/or incur loss of precision (e.g., float -> uint8). In addition, we do not guarantee bit-consistency across versions, as we may change the internal representation or backend computation of a function. Some operations are in-place even though they return an Im object.
Specifically, we make the following design choices:
All inputs are internally represented by either an ndarray or a tensor with shape (B, H, W, C) or (B, C, H, W).
Convenience is prioritized over efficiency. We make copies or perform in-place operations with little consistency. We may re-evaluate this design decision in the future.
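The two batched layouts named above can be illustrated with plain NumPy. This is a minimal sketch, not the library's internal code: `to_bhwc` and `bhwc_to_bchw` are hypothetical helpers, and the input is assumed to be channels-last.

```python
import numpy as np


def to_bhwc(arr: np.ndarray) -> np.ndarray:
    """Hypothetical helper: give an (H, W, C) array a leading batch axis."""
    if arr.ndim == 3:
        arr = arr[np.newaxis, ...]  # (H, W, C) -> (1, H, W, C)
    if arr.ndim != 4:
        raise ValueError(f"expected 3 or 4 dimensions, got {arr.ndim}")
    return arr


def bhwc_to_bchw(arr: np.ndarray) -> np.ndarray:
    """Hypothetical helper: move the channel axis in front of H and W."""
    return np.transpose(arr, (0, 3, 1, 2))


single = np.zeros((256, 128, 3), dtype=np.uint8)
batched = to_bhwc(single)
channels_first = bhwc_to_bchw(batched)
```

Here `batched` has shape (1, 256, 128, 3) and `channels_first` has shape (1, 3, 256, 128).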
- add_border(border: int, color: Tuple[int, int, int])
Adds a solid-color border to all sides of the image.
- property batch_size
- bool_to_rgb(*args, **kwargs)
- property channels
Returns the number of channels in the image (e.g., 3 for RGB or 1 for BW)
- colorize(*args, **kwargs)
- static concat_vertical(*args, **kwargs) Im
Concatenates images vertically (i.e. stacked on top of each other)
- convert_opencv_color(*args, **kwargs)
- property copy
Returns a deep copy of the image.
- crop(*args, **kwargs)
- default_normalize_mean = [0.4265, 0.4489, 0.4769]
- default_normalize_std = [0.2053, 0.2206, 0.2578]
- denormalize(clamp: tuple[float, float] = (0, 1.0), **kwargs) Im
De-normalizes image, optionally clamping values to specified range.
- encode_video(*args, **kwargs)
- ex = Im of type: ndarray, shape: (256, 256, 3), device: <im.device object>
- get_np(order=ChannelOrder.HWC, range=ChannelRange.UINT8) ndarray
Converts the image to a NumPy Array with specified channel order and range.
- get_opencv(*args, **kwargs)
- get_pil() Union[Image, list[PIL.Image.Image]]
Converts the image to a PIL Image. Returns a list for batched images.
- get_torch(order=ChannelOrder.CHW, range=ChannelRange.FLOAT) Tensor
Converts the image to a PyTorch Tensor with specified channel order and range.
- grid(*args, **kwargs)
- property height
Returns the height of the image.
- property image_shape
Returns the height and width of the image as a tuple (H, W)
- static new(h: int = 256, w: int = 256, color=(255, 255, 255))
Creates a new image with the specified height, width, and color.
- normalize(normalize_min_max: bool = False, **kwargs) Im
Normalizes the image using either its current min/max or a given mean and std.
- normalize_setup(mean=[0.4265, 0.4489, 0.4769], std=[0.2053, 0.2206, 0.2578])
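The arithmetic behind mean/std normalization, its inverse, and min-max normalization can be sketched with NumPy as follows. This is an illustration under assumptions, not the library's implementation: the function names are stand-ins, the array is assumed to be channels-last floats in [0, 1], and only the mean/std values are taken from the defaults listed in this reference.

```python
import numpy as np

# Default statistics as listed in this reference.
MEAN = np.array([0.4265, 0.4489, 0.4769], dtype=np.float32)
STD = np.array([0.2053, 0.2206, 0.2578], dtype=np.float32)


def normalize(img, mean=MEAN, std=STD):
    """Per-channel standardization; broadcasts over the last (channel) axis."""
    return (img - mean) / std


def denormalize(img, mean=MEAN, std=STD, clamp=(0.0, 1.0)):
    """Inverse of normalize, optionally clamping to the given range."""
    out = img * std + mean
    return np.clip(out, clamp[0], clamp[1]) if clamp is not None else out


def normalize_min_max(img):
    """Rescale so the smallest value maps to 0 and the largest to 1."""
    lo, hi = img.min(), img.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(img, dtype=np.float32)
    return (img - lo) / (hi - lo)
```

Applying `denormalize(normalize(img))` to a float image in [0, 1] should return the original values up to floating-point error.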
- property np: ndarray
Converts the image to a NumPy Array with specified channel order and range.
- static open(filepath: Path, use_imageio=False) Im
Opens an image from disk and returns an Im object
- property opencv
- pca(*args, **kwargs)
- property pil: Union[Image, list[PIL.Image.Image]]
Converts the image to a PIL Image. Returns a list for batched images.
- static random(h: int = 256, w: int = 256, cache: bool = False) Im
Creates a random image from Unsplash or Picsum.
- property range_max
Returns the maximum value of the image range (e.g., 255 for UINT8 or 1.0 for FLOAT)
- resize(*args, **kwargs)
- save(filepath: Optional[Path] = None, filetype: str = 'png', optimize: bool = False, quality: Optional[float] = None, **kwargs) Path
Saves the image to a file, optionally optimizing and compressing the image. By default, the image is saved to $CWD/outputs with a timestamp as the filename, and a PNG filetype. If the image is batched, the images will be saved as a grid.
- save_video(filepath: Optional[Path] = None, fps: int = 4, format='mp4', use_pyav: bool = False)
Saves a video to disk. If filepath is not specified, the video will be saved to $CWD/outputs with a timestamp as the filename.
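The default-path behavior described for `save` and `save_video` can be pictured with the standard library. In this sketch `default_output_path` is a hypothetical helper and the timestamp format is an assumption; this reference only states that the file lands in $CWD/outputs with a timestamp as its name.

```python
from datetime import datetime
from pathlib import Path


def default_output_path(filetype="png", root=None, now=None):
    """Hypothetical helper: <root>/outputs/<timestamp>.<filetype>."""
    root = Path.cwd() if root is None else Path(root)
    now = datetime.now() if now is None else now
    # The exact timestamp layout is an assumption made for this sketch.
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return root / "outputs" / f"{stamp}.{filetype}"


example = default_output_path("mp4", root="/tmp", now=datetime(2024, 1, 2, 3, 4, 5))
```

The `root` and `now` parameters exist only to make the sketch deterministic; the helper creates no directories and writes no files.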
- scale_to_height(new_height: int, **kwargs) Im
Scales the image to desired height, preserving the aspect ratio.
- scale_to_width(new_width: int, **kwargs) Im
Scales the image to desired width, preserving the aspect ratio.
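Preserving the aspect ratio means the other dimension is scaled by the same factor. The sketch below shows only that arithmetic; the helper names are hypothetical, and rounding to the nearest pixel is an assumption about how fractional sizes are handled.

```python
def size_for_height(h: int, w: int, new_height: int) -> tuple:
    """Hypothetical helper: (H, W) after scaling to new_height."""
    return new_height, max(1, round(w * new_height / h))


def size_for_width(h: int, w: int, new_width: int) -> tuple:
    """Hypothetical helper: (H, W) after scaling to new_width."""
    return max(1, round(h * new_width / w)), new_width
```

For a 256 x 512 (H x W) image, scaling to a height of 128 or to a width of 256 both give (128, 256).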
- show()
Displays the image in the default image viewer (e.g., in the terminal or in ipython).
- to(device: torch.device)
Move tensor to device. In-place operation.
- property torch: Tensor
Converts the image to a PyTorch Tensor with specified channel order and range.
- property width
Returns the width of the image.
- write_text(*args, **kwargs)
- im.broadcast_arrays(im1_arr, im2_arr) Tuple[ImArr, ImArr]
Broadcasts two image arrays to compatible shapes for concatenation operations. Specifically, takes […, H, W, C] and […, H, W, C] and broadcasts them to the same shape.
TODO: Support broadcasting with different H/W/C. E.g., currently: [1, H, W, C] and [H // 2, W, C] fail to broadcast
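The behavior described above resembles standard NumPy broadcasting, which the following sketch uses as a stand-in. `broadcast_pair` is a hypothetical name, and the library's own function may differ in detail (it also accepts tensors, which this sketch does not).

```python
import numpy as np


def broadcast_pair(a: np.ndarray, b: np.ndarray):
    """Hypothetical stand-in: broadcast two [..., H, W, C] arrays together."""
    # np.broadcast_arrays raises ValueError when the shapes are incompatible.
    a_out, b_out = np.broadcast_arrays(a, b)
    return a_out, b_out


batched = np.zeros((1, 4, 6, 3))
single = np.ones((4, 6, 3))
out_a, out_b = broadcast_pair(batched, single)
```

Both outputs have shape (1, 4, 6, 3). Pairing the batched array with one of shape (2, 6, 3) raises `ValueError` in this sketch, which mirrors the unsupported case noted in the TODO.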
- class im.callable_staticmethod
Bases: staticmethod
- im.concat_horizontal_(im1: Im, im2: Im, spacing: int = 0, **kwargs) Im
Concatenates two images horizontally with optional spacing between them.
- im.concat_variable(concat_func: Callable[[...], Im], *args: Im, **kwargs) Im
Helper function to concatenate variable number of images using a specified concatenation type.
- im.concat_vertical_(im1: Im, im2: Im, spacing: int = 0, **kwargs) Im
Concatenates two images vertically with optional spacing between them.
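To make the role of `spacing` concrete, here is a minimal NumPy sketch of pairwise concatenation and of folding a two-image function over several images. It operates on plain (H, W, C) arrays rather than Im objects, the function names are stand-ins, and the white gap color (`fill=255`) is an assumption.

```python
from functools import reduce

import numpy as np


def concat_horizontal(im1, im2, spacing=0, fill=255):
    """Place im2 to the right of im1 with a gap of `spacing` pixels."""
    if im1.shape[0] != im2.shape[0] or im1.shape[2] != im2.shape[2]:
        raise ValueError("heights and channel counts must match")
    gap = np.full((im1.shape[0], spacing, im1.shape[2]), fill, dtype=im1.dtype)
    return np.concatenate([im1, gap, im2], axis=1)


def concat_vertical(im1, im2, spacing=0, fill=255):
    """Place im2 below im1 with a gap of `spacing` pixels."""
    if im1.shape[1] != im2.shape[1] or im1.shape[2] != im2.shape[2]:
        raise ValueError("widths and channel counts must match")
    gap = np.full((spacing, im1.shape[1], im1.shape[2]), fill, dtype=im1.dtype)
    return np.concatenate([im1, gap, im2], axis=0)


def concat_variable(concat_func, *images, **kwargs):
    """Fold a two-image concat function over any number of images."""
    return reduce(lambda acc, nxt: concat_func(acc, nxt, **kwargs), images)
```

For example, `concat_variable(concat_horizontal, a, b, c, spacing=1)` joins three equal-height arrays left to right with a one-pixel gap between neighbors.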
- im.dispatch_op(obj: ImArr, np_op, torch_op, *args)
- im.identity(x)
- im.is_arr(obj: ImArr)
- im.is_dtype(arr: ImArr, dtype: Union[Float, Integer, Bool])
- im.is_ndarray(obj: ImArr)
- im.is_pil(obj: ImArr)
- im.is_tensor(obj: ImArr)