Image Loading

All Autodistill base models (e.g. Grounding DINO or CLIP) accept a file name and load the corresponding image for use in labeling. Some models also accept images passed directly in any of the following formats:

  • PIL Image
  • cv2 image
  • URL, from which an image is retrieved
  • A file name, which is loaded as an image

This is handled by the low-level load_image function, which accepts any of the formats above. The PIL and cv2 formats are ideal if you already have an image in memory. Base models call this function to request the format they need. If that format differs from what you provided (for example, you provided a file name and the model needs a PIL Image object), load_image converts the image to the correct format.
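
To illustrate how a string input can be told apart as a URL or a file name, here is a minimal sketch that mirrors the string checks in the source listing further down this page. The helper name classify_image_source is hypothetical and is not part of Autodistill; only the "starts with http" and "is an existing file" rules are taken from the listing.

```python
import os
import tempfile


def classify_image_source(image: str) -> str:
    """Return "url" or "file" for a string input (hypothetical helper)."""
    if not isinstance(image, str):
        raise TypeError("expected a string")
    # Strings beginning with "http" are treated as URLs to download.
    if image.startswith("http"):
        return "url"
    # Any other string must point at an existing file on disk.
    if os.path.isfile(image):
        return "file"
    raise ValueError(f"{image} is not a valid file path or URI")


if __name__ == "__main__":
    print(classify_image_source("https://example.com/dog.jpeg"))
    with tempfile.NamedTemporaryFile(suffix=".jpeg") as handle:
        print(classify_image_source(handle.name))
```

PIL images and arrays that are already in memory never reach these checks, which is why they are the cheapest inputs to pass.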

The following models support the load_image function. The format listed next to each model (PIL or cv2) is the one load_image converts your image to, if necessary, before passing it to the model.

  • AltCLIP: PIL
  • CLIP: PIL
  • Grounding DINO: cv2
  • MetaCLIP: PIL
  • RemoteCLIP: PIL
  • Transformers: PIL
  • SAM HQ: cv2
  • Segment Anything: cv2
  • DETIC: PIL
  • VLPart: PIL
  • CoDet: PIL
  • OWLv2: PIL
  • FastViT: PIL
  • FastSAM: cv2
  • SegGPT: PIL
  • OWLViT: PIL
  • BLIPv2: PIL
  • DINOv2: PIL
  • Grounded SAM: cv2
  • BLIP: PIL
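
For quick lookups, the list above can be written as a plain mapping. This is only a sketch: the names MODEL_IMAGE_FORMATS, expected_format, and models_using are hypothetical and do not exist in Autodistill; the model/format pairs are copied from the list.

```python
# Format that load_image converts to for each model, copied from the list above.
MODEL_IMAGE_FORMATS = {
    "AltCLIP": "PIL",
    "CLIP": "PIL",
    "Grounding DINO": "cv2",
    "MetaCLIP": "PIL",
    "RemoteCLIP": "PIL",
    "Transformers": "PIL",
    "SAM HQ": "cv2",
    "Segment Anything": "cv2",
    "DETIC": "PIL",
    "VLPart": "PIL",
    "CoDet": "PIL",
    "OWLv2": "PIL",
    "FastViT": "PIL",
    "FastSAM": "cv2",
    "SegGPT": "PIL",
    "OWLViT": "PIL",
    "BLIPv2": "PIL",
    "DINOv2": "PIL",
    "Grounded SAM": "cv2",
    "BLIP": "PIL",
}


def expected_format(model_name: str) -> str:
    """Return the image format listed for a model (hypothetical helper)."""
    try:
        return MODEL_IMAGE_FORMATS[model_name]
    except KeyError:
        raise ValueError(f"{model_name} is not in the list above") from None


def models_using(image_format: str) -> list:
    """Return the models listed with the given format, in list order."""
    return [name for name, fmt in MODEL_IMAGE_FORMATS.items() if fmt == image_format]
```

If you already hold an image in memory, passing it in the format listed for your model avoids a conversion inside load_image.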

load_image function

Load an image from a file path, URI, PIL image, or numpy array.

This function is for use by Autodistill modules. You don't need to use it directly.

Parameters:

Name           Type  Description                        Default
image          Any   The image to load                  required
return_format        The format to return the image in  'cv2'

Returns:

Type  Description
Any   The image in the specified format

Source code in autodistill/helpers.py
def load_image(
    image: Any,
    return_format="cv2",
) -> Any:
    """
    Load an image from a file path, URI, PIL image, or numpy array.

    This function is for use by Autodistill modules. You don't need to use it directly.

    Args:
        image: The image to load
        return_format: The format to return the image in

    Returns:
        The image in the specified format
    """
    if return_format not in ACCEPTED_RETURN_FORMATS:
        raise ValueError(f"return_format must be one of {ACCEPTED_RETURN_FORMATS}")

    if isinstance(image, Image.Image) and return_format == "PIL":
        return image
    elif isinstance(image, Image.Image) and return_format == "cv2":
        # channels need to be reversed for cv2
        return cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)
    elif isinstance(image, Image.Image) and return_format == "numpy":
        return np.array(image)

    if isinstance(image, np.ndarray) and return_format == "PIL":
        return Image.fromarray(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    elif isinstance(image, np.ndarray) and return_format == "cv2":
        return image
    elif isinstance(image, np.ndarray) and return_format == "numpy":
        return image

    if isinstance(image, str) and image.startswith("http"):
        if return_format == "PIL":
            response = requests.get(image)
            return Image.open(BytesIO(response.content))
        elif return_format == "cv2" or return_format == "numpy":
            response = requests.get(image)
            pil_image = Image.open(BytesIO(response.content))
            return np.array(pil_image)
    elif os.path.isfile(image):
        if return_format == "PIL":
            return Image.open(image)
        elif return_format == "cv2":
            # channels need to be reversed for cv2
            return cv2.cvtColor(np.array(Image.open(image)), cv2.COLOR_RGB2BGR)
        elif return_format == "numpy":
            pil_image = Image.open(image)
            return np.array(pil_image)
    else:
        raise ValueError(f"{image} is not a valid file path or URI")
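
The comments in the listing say that channels need to be reversed for cv2: PIL stores RGB images with channels in red, green, blue order, while OpenCV's convention is blue, green, red. As a rough illustration of what that reversal does, here is a pure-Python sketch on a nested list of pixels. The name reverse_channels is hypothetical; the library itself relies on cv2.cvtColor, as shown in the listing.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def reverse_channels(pixels: List[List[Pixel]]) -> List[List[Pixel]]:
    """Swap the first and last channel of every pixel (RGB <-> BGR)."""
    return [[(b, g, r) for (r, g, b) in row] for row in pixels]


if __name__ == "__main__":
    # A 1x2 RGB image: one pure red pixel and one pure blue pixel.
    rgb_image = [[(255, 0, 0), (0, 0, 255)]]
    bgr_image = reverse_channels(rgb_image)
    print(bgr_image)
    # Applying the swap twice restores the original order.
    print(reverse_channels(bgr_image) == rgb_image)
```

In the listing, the URL branch returns np.array(pil_image) for both the 'cv2' and 'numpy' formats without this swap, so an array loaded from a URL keeps the channel order of the PIL image.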