Aspect-Based Sentiment Analysis

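Conventional sentiment analysis typically assigns a single sentiment label (for example positive, negative, or neutral) to an entire sentence or document. Aspect-based sentiment analysis (ABSA) goes a step further: it breaks the text into smaller units tied to specific aspects or attributes and analyzes the sentiment expressed toward each aspect.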

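Product reviews: analyze customer feedback on specific features of a product.
Business intelligence: understand customer opinions on different aspects of a service or business operation.
Market research: gather detailed insight into what customers like and dislike about individual product features.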
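To make the contrast between document-level and aspect-level sentiment concrete, here is a minimal sketch. The labels are hypothetical illustrative values, not model output, and `collapse_to_document_label` is a made-up helper that uses a majority vote only as a stand-in for a single document-level label.

```python
from collections import Counter
from typing import Dict

# Hypothetical aspect-level labels for one phone review
# (illustrative values, not model output).
aspect_labels: Dict[str, str] = {
    "camera": "Positive",
    "performance": "Positive",
    "weight": "Negative",
}


def collapse_to_document_label(labels: Dict[str, str]) -> str:
    """Reduce per-aspect labels to one label by majority vote.

    This only illustrates what a single document-level label hides;
    real document-level classifiers do not work by voting.
    """
    return Counter(labels.values()).most_common(1)[0][0]


document_label = collapse_to_document_label(aspect_labels)
print(document_label)           # Positive: the weight complaint is no longer visible
print(aspect_labels["weight"])  # Negative: the per-aspect view keeps it
```

The single label reads as a happy customer, while the aspect-level view still shows which attribute drew the complaint.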

Transformer Pipeline

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# DeBERTa-v3 checkpoint fine-tuned for aspect-based sentiment classification
model_name = "yangheng/deberta-v3-large-absa-v1.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Wrap the model and tokenizer in a text-classification pipeline
classifier = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
)

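The pipeline scores sentiment toward an aspect by receiving the review and the aspect name as a sentence pair. One input form the text-classification pipeline documents for pairs is a dictionary with `text` and `text_pair` keys, or a list of such dictionaries. The sketch below builds that list; `build_pair_inputs` and `review_text` are hypothetical names used only for illustration.

```python
from typing import Dict, List


def build_pair_inputs(text: str, aspects: List[str]) -> List[Dict[str, str]]:
    """Pair one review with each aspect, using the pipeline's dict input form."""
    return [{"text": text, "text_pair": aspect} for aspect in aspects]


# Usage with the `classifier` created above (not executed here,
# because it needs the downloaded model):
# results = classifier(build_pair_inputs(review_text, ["camera", "weight"]))
```

Passing the whole list in one call lets the pipeline handle all aspects together instead of being called once per aspect.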

Classification

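# Aspects to evaluate; each one is scored separately against the same review
aspects = ["camera", "performance", "weight"]

text = """
The camera quality of this phone is amazing; however, it is too heavy for
a smartphone, and due to the next-generation CPUs, it's very fast.
"""

# Pass the aspect as the second segment (text_pair) so the model scores
# the sentiment of the review toward that aspect
for aspect in aspects:
    print(aspect, classifier(text, text_pair=aspect))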

camera [{'label': 'Positive', 'score': 0.9998691082000732}]
performance [{'label': 'Positive', 'score': 0.719463050365448}]
weight [{'label': 'Negative', 'score': 0.9993507266044617}]
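In the output above, the "performance" score (about 0.72) is noticeably lower than the other two, possibly because the review never uses that word and only implies it with "very fast". A minimal sketch for collecting the results and flagging such low-confidence predictions follows. `analyze_aspects` is a hypothetical helper, and the `min_score` default of 0.8 is an arbitrary illustrative threshold, not a value recommended for this model.

```python
from typing import Callable, Dict, List, Tuple


def analyze_aspects(
    classifier: Callable[..., List[dict]],
    text: str,
    aspects: List[str],
    min_score: float = 0.8,  # illustrative threshold (assumption)
) -> Dict[str, Tuple[str, float, bool]]:
    """Classify `text` once per aspect.

    Returns {aspect: (label, score, is_confident)}, where `is_confident`
    is True when the score reaches `min_score`.
    """
    results = {}
    for aspect in aspects:
        # Same call as in the loop above: the aspect is the second segment.
        top = classifier(text, text_pair=aspect)[0]
        results[aspect] = (top["label"], top["score"], top["score"] >= min_score)
    return results


# Usage with the pipeline defined earlier (not executed here):
# summary = analyze_aspects(classifier, text, aspects)
```

Predictions flagged as not confident are good candidates for manual review or for rephrasing the aspect term.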


