Image Classification

Image classification is the task of assigning a label or class to an image based on its visual content.

from transformers import pipeline

clf = pipeline("image-classification")
clf("dataset/mountain.jpg")

No model was supplied, defaulted to google/vit-base-patch16-224 and revision 5dca96d (https://huggingface.co/google/vit-base-patch16-224).
Using a pipeline without specifying a model name and revision in production is not recommended.

[{'label': 'valley, vale', 'score': 0.5141904950141907},
 {'label': 'alp', 'score': 0.37611910700798035},
 {'label': 'mountain tent', 'score': 0.03428410366177559},
 {'label': 'volcano', 'score': 0.022554099559783936},
 {'label': 'lakeside, lakeshore', 'score': 0.004615743178874254}]
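
The log message above advises against relying on the default model in production. The following is a minimal sketch of one way to follow that advice, assuming the model id and revision printed in the log are the ones you want to pin. `build_classifier` and `confident_labels` are hypothetical helper names introduced here for illustration, and the `min_score` threshold of 0.1 is an arbitrary choice, not a recommended value.

```python
# Model id and revision copied from the pipeline's own log message above.
MODEL_ID = "google/vit-base-patch16-224"
REVISION = "5dca96d"


def build_classifier():
    """Create an image-classification pipeline pinned to a fixed checkpoint."""
    # Imported lazily so that loading this module does not require transformers
    # until a classifier is actually built.
    from transformers import pipeline

    return pipeline("image-classification", model=MODEL_ID, revision=REVISION)


def confident_labels(predictions, min_score=0.1):
    """Keep only the labels whose score is at least `min_score`.

    `predictions` is a list of {'label': ..., 'score': ...} dicts, the same
    shape as the pipeline output shown above.
    """
    return [p["label"] for p in predictions if p["score"] >= min_score]
```

With the pinned pipeline, `confident_labels(build_classifier()("dataset/mountain.jpg"))` would reduce a result like the one above to the labels that clear the threshold.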

ViT

from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image
import requests

url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

processor = ViTImageProcessor.from_pretrained('google/vit-base-patch16-224')
model = ViTForImageClassification.from_pretrained('google/vit-base-patch16-224')

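# Resize and normalize the image, returning PyTorch tensors
inputs = processor(
    images=image,
    return_tensors="pt"
)
# Forward pass; logits has shape (batch_size, num_labels)
outputs = model(**inputs)
logits = outputs.logits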

# model predicts one of the 1000 ImageNet classes
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])

Predicted class: Egyptian cat
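
The `argmax` call above returns only the single best class. To see how the remaining classes rank, the logits can be turned into probabilities with a softmax. The sketch below does this in plain Python so that it runs without a model download; `top_k_from_logits` is a hypothetical helper, not part of transformers, and the toy logits and labels are made up for illustration.

```python
import math


def top_k_from_logits(logits, id2label, k=5):
    """Return the k most probable (label, probability) pairs from raw logits."""
    # Subtract the maximum logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy logits and labels, made up for illustration only.
toy = top_k_from_logits([2.0, 0.5, 1.0], {0: "cat", 1: "dog", 2: "fox"}, k=2)
print(toy)
```

With the ViT model above, the same helper could be called as `top_k_from_logits(logits[0].detach().tolist(), model.config.id2label)`.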
