AltCLIP
What is AltCLIP?
AltCLIP is a multi-modal vision model. With AltCLIP, you can compare the similarity between text and images, or the similarity between two images. AltCLIP was trained on multilingual text-image pairs, so it can be used for zero-shot classification with text prompts in different languages. Read the AltCLIP paper for more information.
The Autodistill AltCLIP module enables you to use AltCLIP for zero-shot classification.
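To make the idea of zero-shot classification by text-image similarity concrete, here is a minimal sketch that does not call AltCLIP at all. It assumes the image and each prompt have already been turned into embedding vectors; the toy vectors, the helper names, and the `scale` value are all illustrative assumptions, not values or functions from the AltCLIP module.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, prompt_embeddings, scale=100.0):
    """Score one image embedding against several prompt embeddings.

    `scale` is an assumed temperature that sharpens the softmax;
    it is a hypothetical default chosen for this sketch.
    """
    prompts = list(prompt_embeddings)
    sims = [cosine_similarity(image_embedding, prompt_embeddings[p]) for p in prompts]
    probs = softmax([scale * s for s in sims])
    return dict(zip(prompts, probs))


# Toy 3-dimensional embeddings (made up for illustration only).
image_embedding = [0.9, 0.1, 0.2]
prompt_embeddings = {
    "person": [0.1, 0.9, 0.1],
    "a forklift": [0.8, 0.2, 0.3],
}

scores = zero_shot_classify(image_embedding, prompt_embeddings)
best_prompt = max(scores, key=scores.get)
print(best_prompt, scores)
```

The prompt whose embedding points in the direction closest to the image embedding receives the highest probability. Because the prompts are plain text, swapping in prompts written in another language changes only the text embeddings, not the procedure.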
Installation
To use AltCLIP with autodistill, you need to install the following dependency:
pip3 install autodistill-altclip
Quickstart
from autodistill_altclip import AltCLIP
from autodistill.detection import CaptionOntology
# define an ontology to map class names to our AltCLIP prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated results
# then, load the model
base_model = AltCLIP(
    ontology=CaptionOntology(
        {
            "person": "person",
            "a forklift": "forklift"
        }
    )
)
results = base_model.predict("construction.jpg")
print(results)
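The printed results are easier to use once they are mapped back to the class names in the ontology. The sketch below assumes the results expose parallel sequences of class IDs and confidences, as classification results in the supervision library do; that structure is an assumption here, not something this page confirms. The helper `top_prediction` is hypothetical, and the numbers are placeholders rather than real model output.

```python
def top_prediction(class_ids, confidences, classes):
    """Return the (label, confidence) pair with the highest confidence.

    class_ids and confidences are assumed to be parallel sequences;
    classes lists the ontology's class names in order.
    """
    best = max(range(len(class_ids)), key=lambda i: confidences[i])
    return classes[class_ids[best]], confidences[best]


# Class names in the order of the ontology from the quickstart.
classes = ["person", "forklift"]

# Placeholder values standing in for a model's output (not real results).
class_ids = [0, 1]
confidences = [0.12, 0.88]

label, confidence = top_prediction(class_ids, confidences, classes)
print(f"{label}: {confidence:.2f}")
```

With real predictions, the two sequences would come from the object returned by `base_model.predict(...)` instead of the placeholder lists.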
License
The AltCLIP model is licensed under the Apache 2.0 license. See the model README for more information.
🏆 Contributing
We love your input! Please see the core Autodistill contributing guide to get started. Thank you 🙏 to all our contributors!