im.Im

class im.Im(arr: Union[Im, Tensor, Image, list[PIL.Image.Image], tuple[PIL.Image.Image], ndarray, str, Path], channel_range: Optional[ChannelRange] = None, **kwargs)

This class represents an image (or a batch of images). It allows simple conversion between formats (PIL, NumPy ndarray, PyTorch Tensor) and supports common image operations regardless of input dtype, batching, normalization, etc.

Note: Be careful when using this class directly as part of a training pipeline. Many operations cause the underlying data to convert between formats (e.g., Tensor -> Pillow), which can move the data back to system memory and/or incur a loss of precision (e.g., float -> uint8). In addition, we do not guarantee bit-consistency across versions, as we may change the internal representation or backend computation of a function. Some operations are in-place operations even if they return an Im object.
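The precision loss mentioned in the note can be illustrated with plain NumPy. This is a minimal sketch that does not call the Im class; it assumes a float image in [0, 1] and the usual round-to-nearest quantization to uint8, which may differ from what the library does internally.

```python
import numpy as np

rng = np.random.default_rng(0)
# A float image in [0, 1) with shape (H, W, C).
original = rng.random((4, 4, 3), dtype=np.float32)

# Quantize to uint8, as happens when converting to a PIL-style image ...
as_uint8 = np.round(original * 255.0).astype(np.uint8)
# ... and convert back to float.
restored = as_uint8.astype(np.float32) / 255.0

# The round trip is lossy: each value can move by up to half a quantization step.
max_error = float(np.abs(original - restored).max())
print(f"max round-trip error: {max_error:.6f} (bound: {0.5 / 255:.6f})")
```

A single conversion costs at most about 0.002 per channel value, but repeated conversions inside a training loop can still perturb gradients or targets in ways that are hard to trace.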

Specifically, we make the following design choices:

  • All inputs are internally represented by either an ndarray or a tensor with shape (B, H, W, C) or (B, C, H, W)

  • Convenience is prioritized over efficiency. We make copies or perform in-place operations with little consistency. We may re-evaluate this design decision in the future.
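To make the batched layouts above concrete, here is a small NumPy sketch of how unbatched inputs could be promoted to (B, H, W, C) and converted to (B, C, H, W). The helper names `to_batched_hwc`, `hwc_to_chw`, and `chw_to_hwc` are hypothetical and are not part of the Im API; the library's own handling may differ.

```python
import numpy as np

def to_batched_hwc(arr: np.ndarray) -> np.ndarray:
    """Hypothetical helper: promote a channels-last array to (B, H, W, C)."""
    if arr.ndim == 2:  # (H, W) -> (1, H, W, 1)
        return arr[None, :, :, None]
    if arr.ndim == 3:  # (H, W, C) -> (1, H, W, C)
        return arr[None]
    if arr.ndim == 4:  # already batched
        return arr
    raise ValueError(f"expected 2 to 4 dimensions, got {arr.ndim}")

def hwc_to_chw(batch: np.ndarray) -> np.ndarray:
    """(B, H, W, C) -> (B, C, H, W); returns a view, not a copy."""
    return np.transpose(batch, (0, 3, 1, 2))

def chw_to_hwc(batch: np.ndarray) -> np.ndarray:
    """(B, C, H, W) -> (B, H, W, C); returns a view, not a copy."""
    return np.transpose(batch, (0, 2, 3, 1))

single = np.zeros((32, 48, 3), dtype=np.uint8)
batch_hwc = to_batched_hwc(single)
batch_chw = hwc_to_chw(batch_hwc)
print(batch_hwc.shape, batch_chw.shape)  # (1, 32, 48, 3) (1, 3, 32, 48)
```

Note that `np.transpose` returns a view that shares memory with its input, so writing to the result also modifies the original array. This is one way the copy versus in-place ambiguity described above can arise.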

__init__(arr: Union[Im, Tensor, Image, list[PIL.Image.Image], tuple[PIL.Image.Image], ndarray, str, Path], channel_range: Optional[ChannelRange] = None, **kwargs)

Methods

__init__(arr[, channel_range])

add_border(border, color)

Adds a solid-color border to all sides of an image

bool_to_rgb(*args, **kwargs)

colorize(*args, **kwargs)

concat_horizontal(*args, **kwargs)

Concatenates images horizontally.

concat_vertical(*args, **kwargs)

Concatenates images vertically.

convert_opencv_color(*args, **kwargs)

crop(*args, **kwargs)

denormalize([clamp])

De-normalizes image, optionally clamping values to specified range.

encode_video(*args, **kwargs)

get_np([order, range])

Converts the image to a NumPy Array with specified channel order and range.

get_opencv(*args, **kwargs)

get_pil()

Converts the image to a PIL Image.

get_torch([order, range])

Converts the image to a PyTorch Tensor with specified channel order and range.

grid(*args, **kwargs)

new([h, w, color])

Creates a new image with the specified height, width, and color

normalize([normalize_min_max])

Normalizes image using either the current min-max or given a mean & std.

normalize_setup([mean, std])

open(filepath[, use_imageio])

Opens an image from disk and returns an Im object

pca(*args, **kwargs)

random([h, w, cache])

Creates a random image from Unsplash or Picsum

resize(*args, **kwargs)

save([filepath, filetype, optimize, quality])

Saves the image to a file, optionally optimizing and compressing the image.

save_video([filepath, fps, format, use_pyav])

Saves a video to disk.

scale(scale, **kwargs)

Scales the image by a factor, preserving the aspect ratio.

scale_to_height(new_height, **kwargs)

Scales the image to desired height, preserving the aspect ratio.

scale_to_width(new_width, **kwargs)

Scales the image to desired width, preserving the aspect ratio.

show()

Displays the image in the default image viewer (e.g., in the terminal or in ipython).

square(size)

Returns a square image, resizing and padding while preserving aspect ratio

to(device)

Move tensor to device.

write_text(*args, **kwargs)
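The normalize and denormalize summaries above describe two common schemes: min-max scaling and mean/std standardization, with optional clamping on the way back. The following NumPy sketch shows the arithmetic only. The function names, the clamp default, and the ImageNet mean/std values are assumptions made for illustration; they are not taken from the Im source and may not match default_normalize_mean or default_normalize_std.

```python
import numpy as np

# Assumed values: the widely used ImageNet channel statistics.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_mean_std(img, mean=MEAN, std=STD):
    """Standardize a channels-last float image: (x - mean) / std per channel."""
    return (img - mean) / std

def normalize_min_max(img, eps=1e-8):
    """Rescale so the smallest value maps to 0 and the largest to 1."""
    lo, hi = img.min(), img.max()
    return (img - lo) / max(float(hi - lo), eps)

def denormalize(img, mean=MEAN, std=STD, clamp=(0.0, 1.0)):
    """Invert normalize_mean_std, optionally clamping to a value range."""
    out = img * std + mean
    if clamp is not None:
        out = np.clip(out, clamp[0], clamp[1])
    return out

rng = np.random.default_rng(0)
img = rng.random((8, 8, 3), dtype=np.float32)  # (H, W, C) in [0, 1)

normalized = normalize_mean_std(img)
restored = denormalize(normalized)
print(np.allclose(restored, img, atol=1e-5))
```

Clamping matters when the normalized data has been modified (for example, by a model) so that the de-normalized values fall outside the displayable range.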

Attributes

batch_size

channels

Returns the number of channels in the image (e.g., 3 for RGB or 1 for BW)

copy

Returns a deep copy of the image.

default_normalize_mean

default_normalize_std

ex

height

Returns the height of the image.

image_shape

Returns the height and width of the image as a tuple (H, W)

np

Converts the image to a NumPy Array with specified channel order and range.

opencv

pil

Converts the image to a PIL Image.

range_max

Returns the maximum value of the image range (e.g., 255 for UINT8 or 1.0 for FLOAT)

torch

Converts the image to a PyTorch Tensor with specified channel order and range.

width

Returns the width of the image.
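As an illustration of the range_max convention described above (255 for uint8, 1.0 for float), the sketch below derives the range maximum from an array's dtype and uses it to rescale to [0, 1]. Both helpers are hypothetical stand-ins written with NumPy; they are not the library's implementation.

```python
import numpy as np

def range_max(arr: np.ndarray) -> float:
    """Hypothetical helper: maximum channel value implied by the dtype."""
    if arr.dtype == np.uint8:
        return 255.0
    if np.issubdtype(arr.dtype, np.floating):
        return 1.0
    raise TypeError(f"unsupported dtype: {arr.dtype}")

def to_unit_float(arr: np.ndarray) -> np.ndarray:
    """Rescale an image to float32 values in [0, 1] using its range maximum."""
    return arr.astype(np.float32) / range_max(arr)

pixels = np.array([0, 128, 255], dtype=np.uint8)
print(range_max(pixels), to_unit_float(pixels))
```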