Image Classification
Image classification is the task of assigning a label or class to an image based on its visual content.
from transformers import pipeline
clf = pipeline("image-classification")
clf("dataset/mountain.jpg")
No model was supplied, defaulted to google/vit-base-patch16-224 and revision 5dca96d (https://huggingface.co/google/vit-base-patch16-224).
Using a pipeline without specifying a model name and revision in production is not recommended.
[{'label': 'valley, vale', 'score': 0.5141904950141907},
{'label': 'alp', 'score': 0.37611910700798035},
{'label': 'mountain tent', 'score': 0.03428410366177559},
{'label': 'volcano', 'score': 0.022554099559783936},
{'label': 'lakeside, lakeshore', 'score': 0.004615743178874254}]
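The log above warns against relying on the default model in production. One way to address this is to name the model and revision explicitly when building the pipeline. The sketch below assumes the same checkpoint and short revision hash that the log reports; `MODEL_ID`, `REVISION`, and `build_classifier` are hypothetical names introduced here for illustration, not part of the original example.

```python
# Values taken from the log message above; pinning them is an assumption
# about what you want to reproduce, not a requirement of the library.
MODEL_ID = "google/vit-base-patch16-224"
REVISION = "5dca96d"


def build_classifier(model_id=MODEL_ID, revision=REVISION):
    """Build an image-classification pipeline pinned to one model and revision."""
    # Imported lazily so that defining this helper does not load transformers.
    from transformers import pipeline

    return pipeline("image-classification", model=model_id, revision=revision)
```

Calling `build_classifier()` returns a pipeline that can be used like `clf` above, for example on `"dataset/mountain.jpg"`, and the "No model was supplied" message should no longer appear.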
ViT
from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image
import requests
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)
processor = ViTImageProcessor.from_pretrained('google/vit-base-patch16-224')
model = ViTForImageClassification.from_pretrained('google/vit-base-patch16-224')
inputs = processor(
    images=image,
    return_tensors="pt"
)
outputs = model(**inputs)
logits = outputs.logits
# model predicts one of the 1000 ImageNet classes
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
Predicted class: Egyptian cat
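The code above reports only the single best class, whereas the pipeline earlier returned five labels with scores. A minimal sketch of how such a ranked list could be derived from raw logits is shown below, using a softmax followed by a sort. It is written in plain Python so it runs without a model; `top_k_probabilities` and the toy logits and labels are hypothetical and exist only for illustration.

```python
import math


def top_k_probabilities(logits, id2label, k=5):
    """Return the k most likely (label, probability) pairs from raw logits."""
    # Subtract the largest logit before exponentiating for numerical stability.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy input: made-up logits for three made-up classes.
toy_logits = [2.0, 0.5, -1.0]
toy_labels = {0: "cat", 1: "dog", 2: "bird"}
result = top_k_probabilities(toy_logits, toy_labels, k=2)
print(result)
```

With the ViT model above, the same helper could be called as `top_k_probabilities(logits[0].tolist(), model.config.id2label)`, since `logits` has one row per input image.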