API

im.Im(arr[, channel_range])

This class represents an image (or a batch of images). It allows simple conversion between formats (PIL Image, NumPy ndarray, PyTorch Tensor) and supports common image operations regardless of input dtype, batching, normalization, etc.

class im.ChannelOrder(value)

Bases: Enum

An enumeration of channel layouts: CHW (channels first) and HWC (channels last).

CHW = 'CHW'

HWC = 'HWC'

class im.ChannelRange(value)

Bases: Enum

An enumeration of the value ranges an image can be stored in: BOOL, FLOAT, and UINT8.

BOOL = 'BOOL'

FLOAT = 'FLOAT'

UINT8 = 'UINT8'

class im.Im(arr: Union[Im, Tensor, Image, list[PIL.Image.Image], tuple[PIL.Image.Image], ndarray, str, Path], channel_range: Optional[ChannelRange] = None, **kwargs)

Bases: object

This class represents an image (or a batch of images). It allows simple conversion between formats (PIL Image, NumPy ndarray, PyTorch Tensor) and supports common image operations regardless of input dtype, batching, normalization, etc.

Note: Be careful when using this class directly as part of a training pipeline. Many operations convert the underlying data between formats (e.g., Tensor -> Pillow), move the data back to system memory, and/or incur a loss of precision (e.g., float -> uint8). In addition, we do not guarantee bit-consistency across versions, as we may change the internal representation or backend computation of a function. Some operations are in-place even though they return an Im object.

Specifically, we make the following design choices:

All inputs are internally represented by either an ndarray or a Tensor with shape (B, H, W, C) or (B, C, H, W).
Convenience is prioritized over efficiency. We make copies or perform in-place operations with little consistency. We may re-evaluate this design decision in the future.
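To make the two layouts concrete, here is a minimal sketch of the HWC and CHW orderings for a single image (the batch dimension is left out). It uses plain nested lists rather than the library's own arrays, and the helper names `hwc_to_chw`, `chw_to_hwc`, and `shape` are hypothetical, not part of `im`.

```python
def hwc_to_chw(img):
    """Reorder a nested-list image from (H, W, C) to (C, H, W)."""
    height, width, channels = len(img), len(img[0]), len(img[0][0])
    return [
        [[img[y][x][c] for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


def chw_to_hwc(img):
    """Reorder a nested-list image from (C, H, W) to (H, W, C)."""
    channels, height, width = len(img), len(img[0]), len(img[0][0])
    return [
        [[img[c][y][x] for c in range(channels)] for x in range(width)]
        for y in range(height)
    ]


def shape(arr):
    """Return the shape of a rectangular nested list as a tuple."""
    dims = []
    while isinstance(arr, list):
        dims.append(len(arr))
        arr = arr[0]
    return tuple(dims)


# A 2x3 image with 3 channels; channel 0 stores the row index,
# channel 1 the column index, channel 2 is always zero.
hwc = [[[y, x, 0] for x in range(3)] for y in range(2)]
chw = hwc_to_chw(hwc)

print(shape(hwc))  # (2, 3, 3)
print(shape(chw))  # (3, 2, 3)
```

The round trip `chw_to_hwc(hwc_to_chw(img))` returns the original image, so the choice of layout changes only how the data is indexed, not its content.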

add_border(border: int, color: Tuple[int, int, int]): Adds solid color border to all sides of an image
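As an illustration of what adding a solid border means for the pixel grid, the following sketch pads an (H, W, C) nested-list image on all four sides. It is a stand-alone approximation, not the library's implementation; the function below only mirrors the documented parameter names.

```python
def add_border(img, border, color):
    """Return a copy of an (H, W, C) nested-list image with a solid border.

    The output has shape (H + 2 * border, W + 2 * border, C).
    """
    height, width = len(img), len(img[0])
    out = [
        [list(color) for _ in range(width + 2 * border)]
        for _ in range(height + 2 * border)
    ]
    # Copy the original pixels into the centre of the padded canvas.
    for y in range(height):
        for x in range(width):
            out[y + border][x + border] = list(img[y][x])
    return out


# A single black pixel surrounded by a 1-pixel red border.
bordered = add_border([[[0, 0, 0]]], border=1, color=(255, 0, 0))
print(len(bordered), len(bordered[0]))  # 3 3
```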

property batch_size

bool_to_rgb(*args, **kwargs)

property channels: Returns the number of channels in the image (e.g., 3 for RGB or 1 for BW)

colorize(*args, **kwargs)

static concat_horizontal(*args, **kwargs) → Im: Concatenates images horizontally (i.e. left to right)

static concat_vertical(*args, **kwargs) → Im: Concatenates images vertically (i.e. stacked on top of each other)

convert_opencv_color(*args, **kwargs)

property copy: Returns a deep copy of the image.

crop(*args, **kwargs)

default_normalize_mean = [0.4265, 0.4489, 0.4769]

default_normalize_std = [0.2053, 0.2206, 0.2578]

denormalize(clamp: tuple[float, float] = (0, 1.0), **kwargs) → Im: Denormalizes the image, optionally clamping values to the specified range.
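The sketch below shows the arithmetic usually meant by denormalization: each channel is multiplied by its std, shifted by its mean, and then clamped. It assumes the conventional formula `x * std + mean` and operates on a single pixel; `denormalize_pixel` is a hypothetical helper, and the library may compute this differently.

```python
# Default statistics listed in this reference.
MEAN = [0.4265, 0.4489, 0.4769]
STD = [0.2053, 0.2206, 0.2578]


def denormalize_pixel(pixel, mean=MEAN, std=STD, clamp=(0.0, 1.0)):
    """Undo per-channel standardization, then clamp to the given range."""
    low, high = clamp
    return [
        min(max(value * s + m, low), high)
        for value, m, s in zip(pixel, mean, std)
    ]


# A normalized value of zero maps back to the channel mean.
print(denormalize_pixel([0.0, 0.0, 0.0]))

# Out-of-range results are clamped to [0, 1].
print(denormalize_pixel([10.0, -10.0, 0.0]))
```

Clamping matters because normalized activations (for example, model outputs) can fall outside the range that maps back to valid pixel intensities.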

encode_video(*args, **kwargs)

ex = Im of type: ndarray, shape: (256, 256, 3), device: <im.device object>

get_np(order=ChannelOrder.HWC, range=ChannelRange.UINT8) → ndarray: Converts the image to a NumPy array with the specified channel order and range.

get_opencv(*args, **kwargs)

get_pil() → Union[Image, list[PIL.Image.Image]]: Converts the image to a PIL Image. Returns a list for batched images.

get_torch(order=ChannelOrder.CHW, range=ChannelRange.FLOAT) → Tensor: Converts the image to a PyTorch Tensor with the specified channel order and range.

grid(*args, **kwargs)

property height: Returns the height of the image.

property image_shape: Returns the height and width of the image as a tuple (H, W)

static new(h: int = 256, w: int = 256, color=(255, 255, 255)): Creates a new image with the specified height, width, and color.

normalize(normalize_min_max: bool = False, **kwargs) → Im: Normalizes the image using either its current min/max values or a given mean and std.
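The two modes mentioned above correspond to two common formulas. The sketch below shows both on plain Python lists, assuming min-max scaling to [0, 1] and per-channel standardization `(x - mean) / std`. The helper names are hypothetical and the library's exact behavior (for instance, how a constant image is handled) is not documented here.

```python
def normalize_min_max(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    low, high = min(values), max(values)
    if high == low:
        # Assumption: a constant input maps to all zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def normalize_mean_std(pixel, mean, std):
    """Standardize each channel of a pixel with the given mean and std."""
    return [(value - m) / s for value, m, s in zip(pixel, mean, std)]


print(normalize_min_max([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]

mean = [0.4265, 0.4489, 0.4769]
std = [0.2053, 0.2206, 0.2578]
# A pixel equal to the mean standardizes to zero in every channel.
print(normalize_mean_std(mean, mean, std))  # [0.0, 0.0, 0.0]
```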

normalize_setup(mean=[0.4265, 0.4489, 0.4769], std=[0.2053, 0.2206, 0.2578])

property np: ndarray: Converts the image to a NumPy array. As a property it takes no arguments; use get_np to specify the channel order and range.

static open(filepath: Path, use_imageio=False) → Im: Opens an image from disk and returns an Im object

property opencv

pca(*args, **kwargs)

property pil: Union[Image, list[PIL.Image.Image]]: Converts the image to a PIL Image. Returns a list for batched images.

static random(h: int = 256, w: int = 256, cache: bool = False) → Im: Creates a random image from Unsplash or Picsum.

property range_max: Returns the maximum value of the image range (e.g., 255 for UINT8 or 1.0 for FLOAT)
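The range maximum is what ties the FLOAT and UINT8 representations together. As a rough sketch, converting between them amounts to scaling by 255, which also shows where the precision loss of a float -> uint8 conversion comes from. The helpers below are hypothetical and use Python's built-in `round`; the library's own rounding may differ.

```python
# Range maxima as stated in this reference (BOOL is omitted because
# its maximum is not documented here).
RANGE_MAX = {"UINT8": 255, "FLOAT": 1.0}


def float_to_uint8(value):
    """Map a float in [0, 1] to an integer in [0, 255]."""
    clamped = min(max(value, 0.0), RANGE_MAX["FLOAT"])
    return int(round(clamped * RANGE_MAX["UINT8"]))


def uint8_to_float(value):
    """Map an integer in [0, 255] to a float in [0, 1]."""
    return value / RANGE_MAX["UINT8"]


original = 0.123
round_trip = uint8_to_float(float_to_uint8(original))

print(float_to_uint8(original))  # 31
print(round_trip != original)    # True: quantization lost some precision
```

The round-trip error is bounded by half a quantization step (1 / 510), which is usually harmless for visualization but can matter inside a training loop.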

resize(*args, **kwargs)

save(filepath: Optional[Path] = None, filetype: str = 'png', optimize: bool = False, quality: Optional[float] = None, **kwargs) → Path: Saves the image to a file, optionally optimizing and compressing the image. By default, the image is saved to $CWD/outputs with a timestamp as the filename, and a PNG filetype. If the image is batched, the images will be saved as a grid.
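The default-path behavior can be pictured with the sketch below, which builds a timestamped file name under `$CWD/outputs`. The helper `default_output_path` and the timestamp format are assumptions made for illustration; the exact file name the library produces is not specified in this reference.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_output_path(
    filepath: Optional[Path] = None,
    filetype: str = "png",
    now: Optional[datetime] = None,
) -> Path:
    """Return filepath if given, else $CWD/outputs/<timestamp>.<filetype>."""
    if filepath is not None:
        return Path(filepath)
    now = now or datetime.now()
    # Hypothetical timestamp format; chosen only to be filesystem-safe.
    stamp = now.strftime("%Y_%m_%d-%H_%M_%S")
    return Path.cwd() / "outputs" / f"{stamp}.{filetype}"


example = default_output_path(now=datetime(2024, 1, 2, 3, 4, 5))
print(example.name)  # 2024_01_02-03_04_05.png
```

This sketch only computes the path; it does not create the `outputs` directory or write any file.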

save_video(filepath: Optional[Path] = None, fps: int = 4, format='mp4', use_pyav: bool = False): Saves a video to disk. If filepath is not specified, the video will be saved to $CWD/outputs with a timestamp as the filename.

scale(scale: float, **kwargs) → Im: Scales the image by a factor, preserving the aspect ratio.

scale_to_height(new_height: int, **kwargs) → Im: Scales the image to desired height, preserving the aspect ratio.

scale_to_width(new_width: int, **kwargs) → Im: Scales the image to desired width, preserving the aspect ratio.
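Preserving the aspect ratio means both sides are multiplied by the same factor. The following sketch computes the resulting (H, W) for the three scaling methods above; it works on sizes only, rounds to the nearest pixel, and uses hypothetical helper names with a `_size` suffix to distinguish them from the library's methods.

```python
def scale_size(size, factor):
    """Scale an (H, W) size by a factor, keeping each side at least 1 pixel."""
    height, width = size
    return max(1, round(height * factor)), max(1, round(width * factor))


def scale_to_height_size(size, new_height):
    """Return the (H, W) obtained by scaling so that H equals new_height."""
    height, width = size
    return new_height, max(1, round(width * new_height / height))


def scale_to_width_size(size, new_width):
    """Return the (H, W) obtained by scaling so that W equals new_width."""
    height, width = size
    return max(1, round(height * new_width / width)), new_width


size = (480, 640)  # (H, W), matching the image_shape convention
print(scale_size(size, 0.5))            # (240, 320)
print(scale_to_height_size(size, 240))  # (240, 320)
print(scale_to_width_size(size, 320))   # (240, 320)
```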

show(): Displays the image in the default image viewer (e.g., in the terminal or in ipython).

square(size: int) → Im: Returns a square image, resizing and padding while preserving aspect ratio
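One way to read "resizing and padding while preserving aspect ratio" is: scale the image so its longer side equals `size`, then pad the shorter side up to `size`. The sketch below computes that geometry. Centering the image within the padding is an assumption, and `square_layout` is a hypothetical helper rather than the library's code.

```python
def square_layout(height, width, size):
    """Compute the resized (H, W) and the padding needed for a size x size canvas.

    Returns (resized, padding) where padding is (top, bottom, left, right).
    """
    factor = size / max(height, width)
    new_h = max(1, round(height * factor))
    new_w = max(1, round(width * factor))
    pad_h, pad_w = size - new_h, size - new_w
    # Assumption: split the padding evenly, extra pixel goes to bottom/right.
    top, left = pad_h // 2, pad_w // 2
    return (new_h, new_w), (top, pad_h - top, left, pad_w - left)


resized, padding = square_layout(480, 640, 256)
print(resized)  # (192, 256)
print(padding)  # (32, 32, 0, 0)
```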

to(device: torch.device): Moves the tensor to the given device. This is an in-place operation.

property torch: Tensor: Converts the image to a PyTorch Tensor. As a property it takes no arguments; use get_torch to specify the channel order and range.

property width: Returns the width of the image.

write_text(*args, **kwargs)

class im.Tensor: Bases: object

im.broadcast_arrays(im1_arr, im2_arr) → Tuple[ImArr, ImArr]

Broadcasts two image arrays to compatible shapes for concatenation operations. Specifically, takes […, H, W, C] and […, H, W, C] and broadcasts them to the same shape.

TODO: Support broadcasting with different H/W/C. For example, [1, H, W, C] and [H // 2, W, C] currently fail to broadcast.
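The rule described above can be sketched at the level of shapes: the trailing (H, W, C) dimensions must match exactly, while the leading dimensions follow the usual NumPy-style broadcasting rule (equal, or one of them is 1). The function below is a hypothetical illustration that works on shape tuples only and is not the library's implementation.

```python
def broadcast_image_shapes(shape1, shape2):
    """Return the common shape of two [..., H, W, C] shapes, or raise ValueError."""
    shape1, shape2 = tuple(shape1), tuple(shape2)
    if len(shape1) < 3 or len(shape2) < 3:
        raise ValueError("each shape needs at least (H, W, C)")
    if shape1[-3:] != shape2[-3:]:
        # Mirrors the limitation in the TODO: H, W, and C must already match.
        raise ValueError(f"H/W/C mismatch: {shape1[-3:]} vs {shape2[-3:]}")

    lead1, lead2 = shape1[:-3], shape2[:-3]
    n = max(len(lead1), len(lead2))
    # Left-pad the shorter set of leading dimensions with ones.
    lead1 = (1,) * (n - len(lead1)) + lead1
    lead2 = (1,) * (n - len(lead2)) + lead2

    out = []
    for a, b in zip(lead1, lead2):
        if a != b and a != 1 and b != 1:
            raise ValueError(f"cannot broadcast leading dims {a} and {b}")
        out.append(b if a == 1 else a)
    return tuple(out) + shape1[-3:]


print(broadcast_image_shapes((1, 64, 64, 3), (64, 64, 3)))     # (1, 64, 64, 3)
print(broadcast_image_shapes((4, 64, 64, 3), (1, 64, 64, 3)))  # (4, 64, 64, 3)
```

Passing `(1, 64, 64, 3)` and `(32, 64, 3)` raises `ValueError` in this sketch, which corresponds to the unsupported case named in the TODO.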

class im.callable_staticmethod: Bases: staticmethod

im.concat_along_dim(arr_1: ImArr, arr_2: ImArr, dim: int)

im.concat_horizontal_(im1: Im, im2: Im, spacing: int = 0, **kwargs) → Im: Concatenates two images horizontally with optional spacing between them.

im.concat_variable(concat_func: Callable[[...], Im], *args: Im, **kwargs) → Im: Helper function to concatenate variable number of images using a specified concatenation type.

im.concat_vertical_(im1: Im, im2: Im, spacing: int = 0, **kwargs) → Im: Concatenates two images vertically with optional spacing between them.
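To show what the `spacing` parameter does to the pixel grid, here is a small sketch of both concatenation directions on (H, W, C) nested lists. The white fill color for the gap and the requirement that the shared dimension matches exactly are assumptions; the functions are stand-alone illustrations and do not call the library.

```python
WHITE = (255, 255, 255)  # assumed gap color


def concat_horizontal(im1, im2, spacing=0, fill=WHITE):
    """Place im2 to the right of im1 with `spacing` columns of fill between."""
    if len(im1) != len(im2):
        raise ValueError("images must have the same height")
    return [
        row1 + [list(fill) for _ in range(spacing)] + row2
        for row1, row2 in zip(im1, im2)
    ]


def concat_vertical(im1, im2, spacing=0, fill=WHITE):
    """Place im2 below im1 with `spacing` rows of fill between."""
    if len(im1[0]) != len(im2[0]):
        raise ValueError("images must have the same width")
    width = len(im1[0])
    gap = [[list(fill) for _ in range(width)] for _ in range(spacing)]
    return im1 + gap + im2


black = [[[0, 0, 0] for _ in range(2)] for _ in range(2)]
grey = [[[128, 128, 128] for _ in range(2)] for _ in range(2)]

side_by_side = concat_horizontal(black, grey, spacing=1)
stacked = concat_vertical(black, grey, spacing=1)

print(len(side_by_side), len(side_by_side[0]))  # 2 5
print(len(stacked), len(stacked[0]))            # 5 2
```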

class im.device(type: str): Bases: object

im.dispatch_op(obj: ImArr, np_op, torch_op, *args)

im.get_arr_hwc(im: Im)

im.identity(x)

im.is_arr(obj: ImArr)

im.is_dtype(arr: ImArr, dtype: Union[Float, Integer, Bool])

im.is_ndarray(obj: ImArr)

im.is_pil(obj: ImArr)

im.is_tensor(obj: ImArr)

im.new_like(arr, shape, fill: Optional[tuple[int]] = None) → ImArr

class im.staticproperty(fget): Bases: object

im.torch_to_numpy(arr: Tensor)

im.warning_guard(message: str)