im.Im
- class im.Im(arr: Union[Im, Tensor, Image, list[PIL.Image.Image], tuple[PIL.Image.Image], ndarray, str, Path], channel_range: Optional[ChannelRange] = None, **kwargs)
This class represents an image (or a batch of images) and allows simple conversion between formats (PIL Image, NumPy ndarray, PyTorch Tensor). It also supports common image operations regardless of input dtype, batching, normalization, etc.
Note: Be careful when using this class directly as part of a training pipeline. Many operations convert the underlying data between formats (e.g., Tensor -> Pillow), which can move the data back to system memory and/or incur a loss of precision (e.g., float -> uint8). In addition, we do not guarantee bit-consistency across versions, as we may change the internal representation or backend computation of a function. Some operations are in-place even if they return an Im object.
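The precision loss mentioned in the note can be illustrated without the library at all. The following is a minimal NumPy sketch, assuming a float image in [0, 1] that is quantized to uint8 and converted back; the helper names `to_uint8` and `to_float` are hypothetical and are not part of Im.

```python
import numpy as np


def to_uint8(img_float: np.ndarray) -> np.ndarray:
    """Quantize a float image in [0, 1] to uint8 in [0, 255] (hypothetical helper)."""
    return np.clip(np.rint(img_float * 255.0), 0, 255).astype(np.uint8)


def to_float(img_uint8: np.ndarray) -> np.ndarray:
    """Map a uint8 image back to float32 in [0, 1] (hypothetical helper)."""
    return img_uint8.astype(np.float32) / 255.0


# A small random float image standing in for real data.
rng = np.random.default_rng(0)
original = rng.random((4, 4, 3), dtype=np.float32)

# float -> uint8 -> float is not the identity: values snap to multiples of 1/255.
round_trip = to_float(to_uint8(original))
max_error = float(np.abs(original - round_trip).max())
print(f"max round-trip error: {max_error:.6f}")
```

The error is bounded by about half a quantization step (0.5 / 255), which is small for visualization but can matter when gradients or exact values are needed.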
Specifically, we make the following design choices:
All inputs are internally represented by either an ndarray or a Tensor with shape (B, H, W, C) or (B, C, H, W).
Convenience is prioritized over efficiency. We make copies or perform in-place operations with little consistency. We may re-evaluate this design decision in the future.
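The two layouts named above differ only in where the channel axis sits. As a rough sketch of what moving between them involves, the NumPy snippet below promotes a single image to a batch and permutes the axes. The function names are hypothetical; the library's internal helpers are not documented here and may work differently.

```python
import numpy as np


def to_batched(arr: np.ndarray) -> np.ndarray:
    """Promote (H, W, C) to (1, H, W, C); leave 4-D input unchanged (hypothetical helper)."""
    if arr.ndim == 3:
        return arr[None, ...]
    if arr.ndim == 4:
        return arr
    raise ValueError(f"expected a 3-D or 4-D array, got {arr.ndim}-D")


def bhwc_to_bchw(arr: np.ndarray) -> np.ndarray:
    """Channels-last (B, H, W, C) -> channels-first (B, C, H, W)."""
    return np.transpose(arr, (0, 3, 1, 2))


def bchw_to_bhwc(arr: np.ndarray) -> np.ndarray:
    """Channels-first (B, C, H, W) -> channels-last (B, H, W, C)."""
    return np.transpose(arr, (0, 2, 3, 1))


image = np.zeros((32, 48, 3), dtype=np.uint8)  # one H x W x C image
batch = to_batched(image)                      # shape (1, 32, 48, 3)
channels_first = bhwc_to_bchw(batch)           # shape (1, 3, 32, 48)
restored = bchw_to_bhwc(channels_first)        # shape (1, 32, 48, 3)

# np.transpose returns a view, so a write through the view changes `image` too.
channels_first[0, 0, 0, 0] = 255
print(image[0, 0, 0])
```

Because the permuted array is a view rather than a copy, this also shows how an operation that returns a new object can still modify the original data.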
- __init__(arr: Union[Im, Tensor, Image, list[PIL.Image.Image], tuple[PIL.Image.Image], ndarray, str, Path], channel_range: Optional[ChannelRange] = None, **kwargs)
Methods
__init__(arr[, channel_range])
add_border(border, color): Adds solid color border to all sides of an image
bool_to_rgb(*args, **kwargs)
colorize(*args, **kwargs)
concat_horizontal(*args, **kwargs): Concatenates images horizontally (i.e.
concat_vertical(*args, **kwargs): Concatenates images vertically (i.e.
convert_opencv_color(*args, **kwargs)
crop(*args, **kwargs)
denormalize([clamp]): De-normalizes image, optionally clamping values to specified range.
encode_video(*args, **kwargs)
get_np([order, range]): Converts the image to a NumPy Array with specified channel order and range.
get_opencv(*args, **kwargs)
get_pil(): Converts the image to a PIL Image.
get_torch([order, range]): Converts the image to a PyTorch Tensor with specified channel order and range.
grid(*args, **kwargs)
new([h, w, color]): Creates a new image with the specified height and width and color
normalize([normalize_min_max]): Normalizes image using either the current min-max or given a mean & std.
normalize_setup([mean, std])
open(filepath[, use_imageio]): Opens an image from disk and returns an Im object
pca(*args, **kwargs)
random([h, w, cache]): Creates a random image from unsplash or picsum
resize(*args, **kwargs)
save([filepath, filetype, optimize, quality]): Saves the image to a file, optionally optimizing and compressing the image.
save_video([filepath, fps, format, use_pyav]): Saves a video to disk.
scale(scale, **kwargs): Scales the image by a factor, preserving the aspect ratio.
scale_to_height(new_height, **kwargs): Scales the image to desired height, preserving the aspect ratio.
scale_to_width(new_width, **kwargs): Scales the image to desired width, preserving the aspect ratio.
show(): Displays the image in the default image viewer (e.g., in the terminal or in ipython).
square(size): Returns a square image, resizing and padding while preserving aspect ratio
to(device): Move tensor to device.
write_text(*args, **kwargs)
Attributes
Returns the number of channels in the image (e.g., 3 for RGB or 1 for BW)
Returns a deep copy of the image.
Returns the height of the image.
Returns the height and width of the image as a tuple (H, W)
Converts the image to a NumPy Array with specified channel order and range.
Converts the image to a PIL Image.
Returns the maximum value of the image range (e.g., 255 for UINT8 or 1.0 for FLOAT)
Converts the image to a PyTorch Tensor with specified channel order and range.
Returns the width of the image.
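Several methods in the table (scale_to_height, scale_to_width, square) are described as preserving the aspect ratio. The plain-Python sketch below shows the size arithmetic such operations typically rely on. The function names `size_for_height` and `square_layout` are hypothetical, and the rounding rule and centered padding are assumptions; the library may round or pad differently.

```python
def size_for_height(h: int, w: int, new_height: int) -> tuple[int, int]:
    """Return (H, W) after scaling to new_height with the aspect ratio kept."""
    new_width = max(1, round(w * new_height / h))
    return new_height, new_width


def square_layout(h: int, w: int, size: int) -> tuple[tuple[int, int], tuple[int, int, int, int]]:
    """Return the resized (H, W) and the (top, bottom, left, right) padding
    needed to fit an image into a size x size square (assumes centered padding)."""
    longest = max(h, w)
    new_h = max(1, round(h * size / longest))
    new_w = max(1, round(w * size / longest))
    pad_h = size - new_h
    pad_w = size - new_w
    top, left = pad_h // 2, pad_w // 2
    return (new_h, new_w), (top, pad_h - top, left, pad_w - left)


# A 480 x 640 (H x W) image scaled to height 240 keeps its 3:4 ratio.
print(size_for_height(480, 640, 240))   # (240, 320)

# Fitting the same image into a 256 x 256 square: resize, then pad top and bottom.
print(square_layout(480, 640, 256))     # ((192, 256), (32, 32, 0, 0))
```

Scaling by the longest side guarantees that the resized image fits inside the square, so the padding values are never negative.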