Efficient YOLO-World

Segmentation Base Model

What is Efficient YOLO-World?

EfficientYOLOWorld is a combination of two models:

  1. YOLO-World, a zero-shot object detection model, and
  2. EfficientSAM, an image segmentation model.

The combined model runs EfficientSAM on each bounding box that YOLO-World predicts, so you get both a bounding box and a segmentation mask for every object of interest in an image.
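The detect-then-segment flow described above can be sketched in plain Python. This is only an illustration of the idea, not the package's implementation: `detect_then_segment`, `toy_detector`, and `toy_segmenter` are hypothetical names, and the toy functions stand in for YOLO-World and EfficientSAM so the sketch runs without model weights.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), max is exclusive
Image = List[List[int]]           # grayscale image as rows of pixel values
Mask = List[List[bool]]           # boolean mask with the same shape as the image


@dataclass
class Detection:
    box: Box
    label: str
    mask: Mask


def detect_then_segment(
    image: Image,
    detector: Callable[[Image], List[Tuple[Box, str]]],
    segmenter: Callable[[Image, Box], Mask],
) -> List[Detection]:
    """Run the detector once, then prompt the segmenter with each box."""
    return [
        Detection(box=box, label=label, mask=segmenter(image, box))
        for box, label in detector(image)
    ]


# Hypothetical stand-in for the detector: reports one "book" box
# covering the two leftmost columns of the image.
def toy_detector(image: Image) -> List[Tuple[Box, str]]:
    return [((0, 0, 2, len(image)), "book")]


# Hypothetical stand-in for the segmenter: bright pixels inside the box
# are foreground, everything else is background.
def toy_segmenter(image: Image, box: Box) -> Mask:
    x0, y0, x1, y1 = box
    return [
        [x0 <= x < x1 and y0 <= y < y1 and value > 128 for x, value in enumerate(row)]
        for y, row in enumerate(image)
    ]


image = [
    [200, 10, 250],
    [220, 240, 0],
]
detections = detect_then_segment(image, toy_detector, toy_segmenter)
```

Each returned `Detection` carries the box from the first stage and the mask from the second, which mirrors how the combined model returns both for every object.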


Installation

To use EfficientYOLOWorld with autodistill, install the following package:

pip3 install autodistill-efficient-yolo-world

Quickstart

The example below loads the model, runs it on a single image, and then labels a folder of images.


from autodistill_efficient_yolo_world import EfficientYOLOWorld
from autodistill.detection import CaptionOntology
import cv2
import supervision as sv

# define an ontology to map class names to our EfficientYOLOWorld prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = EfficientYOLOWorld(ontology=CaptionOntology({"book": "book"}))

# predict on an image
result = base_model.predict("bookshelf.jpeg", confidence=0.1)

image = cv2.imread("bookshelf.jpeg")

mask_annotator = sv.MaskAnnotator()
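annotated_frame = mask_annotator.annotate(
    scene=image.copy(),
    detections=result,
)
sv.plot_image(image=annotated_frame, size=(8, 8))

# label every .jpeg image in a folder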
base_model.label("./context_images", extension=".jpeg")
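To make the `{caption: class}` format from the comments above concrete, here is a minimal plain-Python illustration. It does not use autodistill: the captions, the `class_name_for` helper, and the assumption that a matched prompt is identified by its position in the dictionary are all hypothetical.

```python
# Plain-Python illustration of a {caption: class} ontology.
# The captions and helper below are hypothetical, not part of autodistill.
ontology = {
    "hardcover book": "book",       # caption sent to the model -> label saved
    "potted houseplant": "plant",
}

prompts = list(ontology.keys())     # what the base model is asked to find
classes = list(ontology.values())   # what ends up in the annotations


def class_name_for(prompt_index: int) -> str:
    """Map the index of a matched prompt to the label that gets saved."""
    return classes[prompt_index]


# Three hypothetical matches: first prompt, second prompt, first prompt again.
saved_labels = [class_name_for(i) for i in (0, 1, 0)]
```

Separating caption from class lets you use a descriptive prompt for the model while keeping short, stable label names in the dataset.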


License

EfficientSAM is licensed under the Apache 2.0 license.

YOLO-World is licensed under the GPL-3.0 license.