API

im.Im(arr[, channel_range])

This class represents an image [or collection of batched images] and allows for simple conversion between formats (PIL/NumPy ndarray/PyTorch Tensor) and support for common image operations, regardless of input dtype, batching, normalization, etc.

class im.ChannelOrder(value)

Bases: Enum

An enumeration.

CHW = 'CHW'
HWC = 'HWC'
class im.ChannelRange(value)

Bases: Enum

An enumeration.

BOOL = 'BOOL'
FLOAT = 'FLOAT'
UINT8 = 'UINT8'
class im.Im(arr: Union[Im, Tensor, Image, list[PIL.Image.Image], tuple[PIL.Image.Image], ndarray, str, Path], channel_range: Optional[ChannelRange] = None, **kwargs)

Bases: object

This class represents an image [or collection of batched images] and allows for simple conversion between formats (PIL/NumPy ndarray/PyTorch Tensor) and support for common image operations, regardless of input dtype, batching, normalization, etc.

Note: Be careful when using this class directly as part of a training pipeline. Many operations will cause the underlying data to convert between formats (e.g., Tensor -> Pillow) and move the data back to system memory and/or incur a loss of precision (e.g., float -> uint8). In addition, we do not guarantee bit-consistency across versions, as we may change the internal representation or backend computation of a function. Some operations are in-place operations even if they return an Im object.

Specifically, we make the following design choices:

  • All inputs are internally represented by either an ndarray or a tensor with shape (B, H, W, C) or (B, C, H, W)

  • Convenience is prioritized over efficiency. We make copies or perform in-place operations with little consistency. We may re-evaluate this design decision in the future.
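
The two internal layouts named above differ only in where the channel axis sits. The following is a minimal NumPy sketch of moving between them; the helper names (ensure_batched, hwc_to_chw, chw_to_hwc) are hypothetical and are not part of im, and the library's own conversion may differ.

```python
import numpy as np


def ensure_batched(arr: np.ndarray) -> np.ndarray:
    """Add a leading batch axis to a single (H, W, C) or (C, H, W) image."""
    return arr[None] if arr.ndim == 3 else arr


def hwc_to_chw(batch: np.ndarray) -> np.ndarray:
    """Reorder a (B, H, W, C) batch to (B, C, H, W)."""
    return np.transpose(batch, (0, 3, 1, 2))


def chw_to_hwc(batch: np.ndarray) -> np.ndarray:
    """Reorder a (B, C, H, W) batch to (B, H, W, C)."""
    return np.transpose(batch, (0, 2, 3, 1))


# A single 4x6 RGB image in HWC order (hypothetical example data).
image = np.zeros((4, 6, 3), dtype=np.uint8)
batch_hwc = ensure_batched(image)   # shape (1, 4, 6, 3)
batch_chw = hwc_to_chw(batch_hwc)   # shape (1, 3, 4, 6)
```

Note that np.transpose returns a view, so writing to the result also changes the input.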

add_border(border: int, color: Tuple[int, int, int])

Adds solid color border to all sides of an image
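
As a rough illustration of what a solid border means for an (H, W, C) array, the sketch below pastes the image into a larger canvas filled with the border color. The function add_solid_border is a hypothetical stand-in written with plain NumPy, not the implementation of add_border.

```python
import numpy as np


def add_solid_border(image: np.ndarray, border: int, color: tuple) -> np.ndarray:
    """Return a copy of an (H, W, C) image surrounded by `border` pixels of `color`."""
    h, w, c = image.shape
    canvas = np.empty((h + 2 * border, w + 2 * border, c), dtype=image.dtype)
    canvas[...] = np.asarray(color, dtype=image.dtype)   # fill with the border color
    canvas[border:border + h, border:border + w] = image  # paste the original in the middle
    return canvas


# Hypothetical example: a black 4x6 image with a 2-pixel red border.
black = np.zeros((4, 6, 3), dtype=np.uint8)
bordered = add_solid_border(black, 2, (255, 0, 0))
```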

property batch_size
bool_to_rgb(*args, **kwargs)
property channels

Returns the number of channels in the image (e.g., 3 for RGB or 1 for BW)

colorize(*args, **kwargs)
static concat_horizontal(*args, **kwargs) Im

Concatenates images horizontally (i.e. left to right)

static concat_vertical(*args, **kwargs) Im

Concatenates images vertically (i.e. stacked on top of each other)

convert_opencv_color(*args, **kwargs)
property copy

Returns a deep copy of the image.

crop(*args, **kwargs)
default_normalize_mean = [0.4265, 0.4489, 0.4769]
default_normalize_std = [0.2053, 0.2206, 0.2578]
denormalize(clamp: tuple[float, float] = (0, 1.0), **kwargs) Im

De-normalizes image, optionally clamping values to specified range.

encode_video(*args, **kwargs)
ex = Im of type: ndarray, shape: (256, 256, 3), device: <im.device object>
get_np(order=ChannelOrder.HWC, range=ChannelRange.UINT8) ndarray

Converts the image to a NumPy Array with specified channel order and range.
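
The range argument is where the precision loss mentioned in the class note can appear. The sketch below shows a FLOAT -> UINT8 -> FLOAT round trip in plain NumPy, assuming FLOAT means values in [0, 1] and UINT8 means values in [0, 255]. The helper names and the round-to-nearest choice are assumptions for illustration, not the library's conversion code.

```python
import numpy as np


def float_to_uint8(arr: np.ndarray) -> np.ndarray:
    """Map floats in [0, 1] to integers in [0, 255] (assumed rounding: nearest)."""
    return np.clip(np.round(arr * 255.0), 0, 255).astype(np.uint8)


def uint8_to_float(arr: np.ndarray) -> np.ndarray:
    """Map integers in [0, 255] back to floats in [0, 1]."""
    return arr.astype(np.float32) / 255.0


original = np.array([0.0, 0.1234, 0.5, 1.0], dtype=np.float32)
as_uint8 = float_to_uint8(original)
round_trip = uint8_to_float(as_uint8)

# The round trip is lossy: values snap to the nearest multiple of 1/255.
max_error = float(np.abs(round_trip - original).max())
```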

get_opencv(*args, **kwargs)
get_pil() Union[Image, list[PIL.Image.Image]]

Converts the image to a PIL Image. Returns a list for batched images.

get_torch(order=ChannelOrder.CHW, range=ChannelRange.FLOAT) Tensor

Converts the image to a PyTorch Tensor with specified channel order and range.

grid(*args, **kwargs)
property height

Returns the height of the image.

property image_shape

Returns the height and width of the image as a tuple (H, W)

static new(h: int = 256, w: int = 256, color=(255, 255, 255))

Creates a new image with the specified height and width and color

normalize(normalize_min_max: bool = False, **kwargs) Im

Normalizes the image using either its current min-max range or a given mean and std.
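
To make the two modes concrete, here is a small pure-Python sketch on a single RGB pixel. It assumes the conventional formulas (x - mean) / std for normalization and x * std + mean followed by clamping for de-normalization; the source does not spell these out, so treat them as assumptions. The mean and std values are default_normalize_mean and default_normalize_std from this page, and the function names are hypothetical.

```python
MEAN = [0.4265, 0.4489, 0.4769]  # default_normalize_mean
STD = [0.2053, 0.2206, 0.2578]   # default_normalize_std


def normalize_pixel(pixel, mean=MEAN, std=STD):
    """Per-channel (x - mean) / std."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


def denormalize_pixel(pixel, mean=MEAN, std=STD, clamp=(0.0, 1.0)):
    """Per-channel x * std + mean, clamped to the given range."""
    lo, hi = clamp
    return [min(max(v * s + m, lo), hi) for v, m, s in zip(pixel, mean, std)]


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: avoid dividing by zero
    return [(v - lo) / (hi - lo) for v in values]


pixel = [0.2, 0.5, 0.9]
restored = denormalize_pixel(normalize_pixel(pixel))
```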

normalize_setup(mean=[0.4265, 0.4489, 0.4769], std=[0.2053, 0.2206, 0.2578])
property np: ndarray

Converts the image to a NumPy Array with specified channel order and range.

static open(filepath: Path, use_imageio=False) Im

Opens an image from disk and returns an Im object

property opencv
pca(*args, **kwargs)
property pil: Union[Image, list[PIL.Image.Image]]

Converts the image to a PIL Image. Returns a list for batched images.

static random(h: int = 256, w: int = 256, cache: bool = False) Im

Creates a random image from unsplash or picsum

property range_max

Returns the maximum value of the image range (e.g., 255 for UINT8 or 1.0 for FLOAT)

resize(*args, **kwargs)
save(filepath: Optional[Path] = None, filetype: str = 'png', optimize: bool = False, quality: Optional[float] = None, **kwargs) Path

Saves the image to a file, optionally optimizing and compressing the image. By default, the image is saved to $CWD/outputs with a timestamp as the filename, and a PNG filetype. If the image is batched, the images will be saved as a grid.
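
The default location described above can be sketched with pathlib as follows. The timestamp format and the function name default_output_path are assumptions made for illustration; the exact filename pattern used by save is not stated here.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_output_path(
    filepath: Optional[Path] = None,
    filetype: str = "png",
    now: Optional[datetime] = None,
) -> Path:
    """Return `filepath` if given, else $CWD/outputs/<timestamp>.<filetype>."""
    if filepath is not None:
        return Path(filepath)
    now = now or datetime.now()
    # Assumed timestamp layout; the real one may differ.
    return Path.cwd() / "outputs" / f"{now:%Y_%m_%d-%H_%M_%S}.{filetype}"


# Fixed timestamp so the result is reproducible.
example_path = default_output_path(now=datetime(2024, 1, 2, 3, 4, 5))
```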

save_video(filepath: Optional[Path] = None, fps: int = 4, format='mp4', use_pyav: bool = False)

Saves a video to disk. If filepath is not specified, the video will be saved to $CWD/outputs with a timestamp as the filename.

scale(scale: float, **kwargs) Im

Scales the image by a factor, preserving the aspect ratio.

scale_to_height(new_height: int, **kwargs) Im

Scales the image to desired height, preserving the aspect ratio.

scale_to_width(new_width: int, **kwargs) Im

Scales the image to desired width, preserving the aspect ratio.
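
The three scaling methods all reduce to the same aspect-ratio arithmetic. A minimal sketch of the target-size calculation follows; the function names are hypothetical and rounding to the nearest pixel is an assumption.

```python
def scaled_size(height: int, width: int, scale: float) -> tuple:
    """Target (H, W) when both sides are multiplied by `scale`."""
    return round(height * scale), round(width * scale)


def size_for_height(height: int, width: int, new_height: int) -> tuple:
    """Target (H, W) for a fixed new height, keeping W / H constant."""
    return new_height, round(width * new_height / height)


def size_for_width(height: int, width: int, new_width: int) -> tuple:
    """Target (H, W) for a fixed new width, keeping W / H constant."""
    return round(height * new_width / width), new_width


half = scaled_size(256, 512, 0.5)
by_height = size_for_height(256, 512, 128)
by_width = size_for_width(256, 512, 1024)
```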

show()

Displays the image in the default image viewer (e.g., in the terminal or in ipython).

square(size: int) Im

Returns a square image, resizing and padding while preserving aspect ratio
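
One way to read "resizing and padding while preserving aspect ratio" is: scale the longer side to size, then pad the shorter side up to size. The sketch below computes that layout only; centering the padding is an assumption, and square_layout is a hypothetical helper rather than the method's implementation.

```python
def square_layout(height: int, width: int, size: int) -> dict:
    """Compute the resized (H, W) and the (top, bottom, left, right) padding."""
    longest = max(height, width)
    new_h = max(1, round(height * size / longest))
    new_w = max(1, round(width * size / longest))
    pad_h, pad_w = size - new_h, size - new_w
    top, left = pad_h // 2, pad_w // 2  # assumed: padding split evenly
    return {
        "resized": (new_h, new_w),
        "padding": (top, pad_h - top, left, pad_w - left),
    }


# A 100x200 (H x W) image squared to 64x64.
layout = square_layout(100, 200, 64)
```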

to(device: torch.device)

Move tensor to device. In-place operation.

property torch: Tensor

Converts the image to a PyTorch Tensor with specified channel order and range.

property width

Returns the width of the image.

write_text(*args, **kwargs)
class im.Tensor

Bases: object

im.broadcast_arrays(im1_arr, im2_arr) Tuple[ImArr, ImArr]

Broadcasts two image arrays to compatible shapes for concatenation operations. Specifically, takes […, H, W, C] and […, H, W, C] and broadcasts them to the same shape.

TODO: Support broadcasting with different H/W/C. E.g., currently: [1, H, W, C] and [H // 2, W, C] fail to broadcast
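
The behavior described above, including the limitation in the TODO, can be approximated with NumPy's own np.broadcast_arrays. This is a sketch under the assumption that only the leading (batch) dimensions are broadcast and that H, W and C must already match; broadcast_image_arrays is a hypothetical name, not the function documented here.

```python
import numpy as np


def broadcast_image_arrays(a: np.ndarray, b: np.ndarray):
    """Broadcast the leading dims of two [..., H, W, C] arrays to a common shape."""
    if a.shape[-3:] != b.shape[-3:]:
        raise ValueError(f"H/W/C mismatch: {a.shape[-3:]} vs {b.shape[-3:]}")
    a_out, b_out = np.broadcast_arrays(a, b)
    return a_out, b_out


batched = np.zeros((1, 4, 6, 3), dtype=np.uint8)
single = np.ones((4, 6, 3), dtype=np.uint8)
a_out, b_out = broadcast_image_arrays(batched, single)  # both (1, 4, 6, 3)

# The case from the TODO: different heights are rejected.
try:
    broadcast_image_arrays(batched, np.ones((2, 6, 3), dtype=np.uint8))
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```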

class im.callable_staticmethod

Bases: staticmethod

im.concat_along_dim(arr_1: ImArr, arr_2: ImArr, dim: int)
im.concat_horizontal_(im1: Im, im2: Im, spacing: int = 0, **kwargs) Im

Concatenates two images horizontally with optional spacing between them.

im.concat_variable(concat_func: Callable[[...], Im], *args: Im, **kwargs) Im

Helper function to concatenate variable number of images using a specified concatenation type.

im.concat_vertical_(im1: Im, im2: Im, spacing: int = 0, **kwargs) Im

Concatenates two images vertically with optional spacing between them.
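
For (H, W, C) arrays, horizontal and vertical concatenation differ only in the axis, and a variable number of images can be handled by folding a two-image function over the list. The sketch below shows this with plain NumPy; concat_pair and concat_many are hypothetical names, and the white gap used for spacing is an assumption.

```python
from functools import reduce

import numpy as np


def concat_pair(first, second, axis, spacing=0, fill=255):
    """Join two (H, W, C) arrays along `axis` (0 = vertical, 1 = horizontal)."""
    parts = [first]
    if spacing > 0:
        gap_shape = list(first.shape)
        gap_shape[axis] = spacing  # gap is `spacing` pixels thick along the join axis
        parts.append(np.full(gap_shape, fill, dtype=first.dtype))
    parts.append(second)
    return np.concatenate(parts, axis=axis)


def concat_many(images, axis, spacing=0):
    """Fold concat_pair over any number of images, left to right."""
    return reduce(lambda acc, nxt: concat_pair(acc, nxt, axis, spacing), images)


left = np.zeros((4, 6, 3), dtype=np.uint8)
right = np.zeros((4, 5, 3), dtype=np.uint8)
row = concat_pair(left, right, axis=1, spacing=2)            # widths 6 + 2 + 5
column = concat_many([left, left, left], axis=0, spacing=1)  # heights 4 + 1 + 4 + 1 + 4
```

The images must already agree on every dimension except the one being joined.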

class im.device(type: str)

Bases: object

im.dispatch_op(obj: ImArr, np_op, torch_op, *args)
im.get_arr_hwc(im: Im)
im.identity(x)
im.is_arr(obj: ImArr)
im.is_dtype(arr: ImArr, dtype: Union[Float, Integer, Bool])
im.is_ndarray(obj: ImArr)
im.is_pil(obj: ImArr)
im.is_tensor(obj: ImArr)
im.new_like(arr, shape, fill: Optional[tuple[int]] = None) ImArr
class im.staticproperty(fget)

Bases: object

im.torch_to_numpy(arr: Tensor)
im.warning_guard(message: str)