Natural Language Processing Tasks with Hugging Face
Natural Language Processing (NLP) Models in Hugging Face
Natural Language Processing (NLP) enables computers to understand, interpret, and generate human language, and Hugging Face provides one of the largest collections of pretrained NLP models for developers and researchers. With Hugging Face, you can easily perform tasks like text classification, question answering, text summarization, translation, and token classification using powerful models such as BERT, DistilBERT, BART, TAPAS, and MarianMT. In this tutorial, we'll guide you through the most popular Hugging Face NLP models and show practical examples. To save time and storage, we use smaller versions of the original models; they behave similarly and are suitable for learning and experimentation.
Table Question Answering Examples
How to Use Table Question Answering Models with Hugging Face Pipeline
We will use the pipeline for Table Question Answering on simple synthetic data: a table that lists product names and the number of each product.
from transformers import pipeline
import pandas as pd
# prepare table
data = {"Products": ["jeans", "jackets", "shirts"], "Number of products": ["87", "53", "69"]}
table = pd.DataFrame.from_dict(data)
#prepare your question
question = "how many shirts are there?"
# pipeline model
tqa = pipeline(task="table-question-answering", model="google/tapas-large-finetuned-wtq", aggregator="SUM")
# result
print(tqa(table=table, query=question))
{'answer': 'SUM > 69', 'coordinates': [(2, 1)], 'cells': ['69'], 'aggregator': 'SUM'}
If we change the data and add a second row for shirts, the answer changes:
# new data
data = {"Products": ["jeans", "jackets", "shirts", "shirts"], "Number of products": ["87", "53", "69", "21"]}
table = pd.DataFrame.from_dict(data)
print(tqa(table=table, query=question))
The answer:
{'answer': 'COUNT > 69, 21', 'coordinates': [(2, 1), (3, 1)], 'cells': ['69', '21'], 'aggregator': 'COUNT'}
We can get the total number of shirts:
z = tqa(table=table, query=question)["cells"]
x = []
for i in z:
    x.append(int(i))
print(sum(x))
The answer is 90.
*google/tapas-large-finetuned-wtq model from Hugging Face — licensed under the Apache 2.0 License.
How to Use TAPAS Table Question Answering Model Without the Pipeline
In this example, we'll use the google/tapas-base-finetuned-wtq model to perform table question answering on the same data without using the pipeline:
from transformers import TapasTokenizer, TapasForQuestionAnswering
import pandas as pd
import torch
# Load model and tokenizer
model_name = "google/tapas-base-finetuned-wtq"
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForQuestionAnswering.from_pretrained(model_name)
# Example table
data = {"Products": ["jeans", "jackets", "shirts"], "Number of products": ["87", "53", "69"]}
table = pd.DataFrame.from_dict(data)
# Question
question = "how many shirts are there?"
# Tokenize inputs
inputs = tokenizer(table=table, queries=[question], return_tensors="pt")
# Forward pass
with torch.no_grad():
    outputs = model(**inputs)
# Decode predicted answer
logits = outputs.logits
logits_agg = outputs.logits_aggregation
# Get the most probable cell answer
predicted_answer_coordinates, predicted_aggregation_indices = tokenizer.convert_logits_to_predictions(
    inputs,
    logits,
    logits_agg
)
# Extract the answer from the table
answers = []
for coordinates in predicted_answer_coordinates:
    if not coordinates:
        answers.append("No answer found.")
    else:
        cell_values = [table.iat[row, column] for row, column in coordinates]
        answers.append(", ".join(cell_values))
# Print the result
print("Answer:", answers[0])
Answer: 69
*google/tapas-base-finetuned-wtq model from Hugging Face — based on code from https://huggingface.co/google/tapas-base-finetuned-wtq (Apache 2.0).
The model's syntax can be a bit complex, so let's analyze it step by step. logits are the raw output scores for selecting table cells, while logits_aggregation contains the scores for the numeric aggregation operations (such as SUM, AVERAGE, and COUNT) that the model can apply to the selected cells.
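As a small sketch (assuming the NONE/SUM/AVERAGE/COUNT aggregation label order used in the Transformers TAPAS documentation), you can map the predicted aggregation index to an operator and apply it to the predicted cells:
# sketch: map the predicted aggregation index to an operator name
# (assumes the WTQ label order NONE/SUM/AVERAGE/COUNT)
id2aggregation = {0: "NONE", 1: "SUM", 2: "AVERAGE", 3: "COUNT"}
aggregation = id2aggregation[predicted_aggregation_indices[0]]
cells = [float(table.iat[row, column]) for row, column in predicted_answer_coordinates[0]]
if aggregation == "SUM":
    print("SUM:", sum(cells))
elif aggregation == "AVERAGE":
    print("AVERAGE:", sum(cells) / len(cells))
elif aggregation == "COUNT":
    print("COUNT:", len(cells))
else:
    print("Cells:", cells)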
For more details, you can refer to the google/tapas-base-finetuned-wtq model card on Hugging Face.
Zero-Shot Classification Example
How to Use Zero-Shot Classification with Hugging Face Pipeline
Zero-shot classification predicts which of a set of candidate labels best describes a text, even though the model was not trained on those specific labels. Zero-shot classification models therefore require a text and a list of candidate labels. Let's see an example of zero-shot classification using a pipeline:
from transformers import pipeline
classifier = pipeline("zero-shot-classification")
print(classifier(
"Is this a good time to buy gold?",
candidate_labels=["education", "politics", "business", "finance"]
))
{'sequence': 'Is this a good time to buy gold?', 'labels': ['finance', 'business', 'education', 'politics'], 'scores': [0.5152193307876587, 0.38664010167121887, 0.057615164667367935, 0.040525417774915695]}
The results are sorted by score in descending order; the "finance" label has the highest score.
How to Use BART Zero-Shot Classification Model Without the Pipeline
In this example, we'll use the facebook/bart-large-mnli model to perform zero-shot classification without using the pipeline. We will load the model directly:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch.nn.functional as F
# Load model and tokenizer
model_name = "facebook/bart-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# Input sentence
sequence = "The pi is the ratio of the circumference of any circle to the diameter of that circle"
# Candidate labels
labels = ["education", "psychology", "sports", "finance", "math"]
# Create NLI-style premise-hypothesis pairs
premise = sequence
hypotheses = [f"This text is about {label}." for label in labels]
# Tokenize and get model outputs for each hypothesis
inputs = tokenizer([premise]*len(hypotheses), hypotheses, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
# Convert logits to probabilities (softmax over entailment class)
entailment_logits = logits[:, 2]
probabilities = F.softmax(entailment_logits, dim=0)
print(probabilities)
# Print results
for label, score in zip(labels, probabilities):
    print(f"{label}: {score:.4f}")
tensor([0.0125, 0.0091, 0.0089, 0.0109, 0.9586])
education: 0.0125
psychology: 0.0091
sports: 0.0089
finance: 0.0109
math: 0.9586
*facebook/bart-large-mnli model from Hugging Face — licensed under the MIT License.
The BART MNLI setup can look complex, so let's simplify it. We need model outputs for each premise-hypothesis pair. There are 5 labels, so the premise (the "sequence") is repeated five times, once per hypothesis. For each pair, the model returns logits for contradiction, neutral, and entailment. We are interested in entailment, whose index is 2, which is why we selected the logits at index 2.
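Alternatively, you can score each label independently. A sketch following the approach shown on the facebook/bart-large-mnli model card (drop the neutral logit and softmax over contradiction vs. entailment for each hypothesis) looks like this:
# sketch: score each label on its own (contradiction vs. entailment)
entail_contradiction_logits = logits[:, [0, 2]]  # drop the neutral class
probs = F.softmax(entail_contradiction_logits, dim=1)
prob_label_is_true = probs[:, 1]                 # probability of entailment
for label, score in zip(labels, prob_label_is_true):
    print(f"{label}: {score:.4f}")
With this variant, each label gets its own probability instead of a distribution that sums to 1 across all labels.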
For more details about this model, please refer to the facebook/bart-large-mnli model card on Hugging Face.
What's softmax?
softmax in PyTorch is applied to all slices along dim, and will re-scale them so that the elements lie in the range [0, 1] and sum to 1.
The sum of the scores [0.0125 + 0.0091 + 0.0089 + 0.0109 + 0.9586] in the example above is 1 and the "math" label has the highest score.
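For instance, a quick standalone sketch of softmax:
import torch
import torch.nn.functional as F

scores = torch.tensor([1.0, 2.0, 3.0])
probs = F.softmax(scores, dim=0)
print(probs)        # tensor([0.0900, 0.2447, 0.6652])
print(probs.sum())  # tensor(1.)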
For more information about softmax, visit the PyTorch docs.
Fill-Mask Example
How to Use Fill-Mask Models with Hugging Face Pipeline
Fill-mask models predict the masked word or words in a sentence.
from transformers import pipeline
unmasker = pipeline("fill-mask")
print(unmasker("The most popular sport in the world is <mask>.", top_k=2))
[{'score': 0.11612111330032349, 'token': 4191, 'token_str': ' soccer', 'sequence': 'The most popular sport in the world is soccer.'},
{'score': 0.10927936434745789, 'token': 5630, 'token_str': ' cricket', 'sequence': 'The most popular sport in the world is cricket.'}]
How to Use BERT Fill-Mask Model Without the Pipeline
In this example, we'll use the google-bert/bert-base-uncased model to perform fill-mask tasks without using the pipeline:
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(
"google-bert/bert-base-uncased"
)
model = AutoModelForMaskedLM.from_pretrained(
"google-bert/bert-base-uncased",
torch_dtype=torch.float16,
device_map="auto",
attn_implementation="sdpa"
)
#See the device type explanation below
inputs = tokenizer("The most popular sport in the world is [MASK].", return_tensors="pt").to("mps")
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits
masked_index = torch.where(inputs['input_ids'] == tokenizer.mask_token_id)[1]
predicted_token_id = predictions[0, masked_index].argmax(dim=-1)
prediction = tokenizer.decode(predicted_token_id)
print(f"The most popular sport in the world is {prediction}.")
The most popular sport in the world is football.
*google-bert/bert-base-uncased model from Hugging Face — licensed under the Apache 2.0 License.
You can use "mps" for macOS and "cuda" for devices compatible with CUDA. You can also remove it.
For more details about this model, please refer to the google-bert/bert-base-uncased model card on Hugging Face.
What's argmax?
The argmax returns the indices of the maximum value of all elements in the input tensor.
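For example (a quick standalone sketch):
import torch

scores = torch.tensor([0.1, 0.7, 0.2])
print(torch.argmax(scores))  # tensor(1), the index of the largest value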
It returns the index of the maximum value to decode in the example above. For more information about argmax, visit the PyTorch docs.
Question Answering Example
How to Use Question Answering with Hugging Face Pipeline
There are different types of Question Answering (QA) tasks. If you use a pipeline for QA without specifying a model, the distilbert/distilbert-base-cased-distilled-squad model is used. It is used for extractive QA tasks. In other words, the model extracts the answer from a given text. Let's see an example of an extractive QA task using a pipeline:
from transformers import pipeline
question_answerer = pipeline("question-answering")
print(question_answerer(
    question="Where does Julia live?",
    context="Julia is 40 years old. She lives in London and she works as a nurse."
))
{'score': 0.9954689741134644, 'start': 36, 'end': 42, 'answer': 'London'}
How to Use BERT Question Answering Model Without the Pipeline
In this example, we'll use the deepset/bert-base-cased-squad2 model to perform question answering without using the pipeline. You can load the QA model directly:
from transformers import AutoTokenizer, BertForQuestionAnswering
import torch
tokenizer = AutoTokenizer.from_pretrained("deepset/bert-base-cased-squad2")
model = BertForQuestionAnswering.from_pretrained("deepset/bert-base-cased-squad2")
#question, text
question, text = "Where does Julia live?", "Julia is 40 years old. She lives in London and she works as a nurse."
#tokenize question and text
inputs = tokenizer(question, text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
answer_start_index = outputs.start_logits.argmax()
answer_end_index = outputs.end_logits.argmax()
predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
result = tokenizer.decode(predict_answer_tokens, skip_special_tokens=True)
print(result)
London
*deepset/bert-base-cased-squad2 model from Hugging Face — licensed under the CC BY 4.0 License.
For more details about this model, please refer to the deepset/bert-base-cased-squad2 model card on Hugging Face.
Translation Example
How to Use Translation with Hugging Face Pipeline
Our model will translate a sentence from French to English; there are other models for other language pairs.
from transformers import pipeline
translator = pipeline("translation", "Helsinki-NLP/opus-mt-fr-en")
print(translator("C'est un beau roman."))
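Other language pairs follow the same checkpoint naming pattern. For example, a sketch assuming the Helsinki-NLP/opus-mt-en-de checkpoint for English to German:
# sketch: English-to-German translation with another MarianMT checkpoint
translator_en_de = pipeline("translation", "Helsinki-NLP/opus-mt-en-de")
print(translator_en_de("This is a beautiful novel."))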
How to Use MarianMT Translation Model Without the Pipeline
In this example, we'll use the Helsinki-NLP/opus-mt-en-fr model to translate a sentence from English to French without using the pipeline:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
text = "The food is very delicious."
inputs = tokenizer(text, return_tensors="pt").input_ids
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
outputs = model.generate(inputs, max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
*"Helsinki-NLP/opus-mt-en-fr”model from Hugging Face — licensed under the Apache 2.0 License.
For more details about this model, please refer to the Helsinki-NLP/opus-mt-en-fr model card on Hugging Face.
Summary Example
How to Use Summary Models with Hugging Face Pipeline
You can use summary models to summarize a text:
from transformers import pipeline
from datasets import load_dataset
ds = load_dataset("dataset_name")
text = ds["train"][0]["context"]
classifier = pipeline("summarization", max_length=100)
print(classifier(text))
How to Use BART Summary Model Without the Pipeline
In this example, we'll use the facebook/bart-large-cnn model to summarize an article from the Hugging Face dataset abisee/cnn_dailymail without using the pipeline. You can also use your own paragraph.
from transformers import AutoTokenizer, BartForConditionalGeneration
checkpoint = "facebook/bart-large-cnn"
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
from datasets import load_dataset
ds = load_dataset("abisee/cnn_dailymail", "1.0.0")
text = ds["train"][0]["article"]
inputs = tokenizer(text, max_length=100, truncation=True, return_tensors="pt")
# Generate Summary
summary_ids = model.generate(inputs["input_ids"],
    max_length=180,
    min_length=40,
    do_sample=False,
    no_repeat_ngram_size=3)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])
Harry Potter star Daniel Radcliffe turns 18 on Monday. He gains access to a reported $41.1 million fortune. Radcliffe says he has no plans to fritter his cash away on fast cars.
*"abisee/cnn_dailymail", "1.0.0" dataset from Hugging Face — licensed under the Apache 2.0 License.
*"facebook/bart-large-cnn" model from Hugging Face — licensed under the MIT License.
You can control how the model generates a summary. For example, you might set the minimum and maximum length of the output, as shown above.
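Other standard generate() parameters work the same way. For example, a sketch reusing the inputs above with beam search and a length penalty instead of greedy decoding:
# sketch: beam search with a length penalty
summary_ids = model.generate(inputs["input_ids"],
    max_length=180,
    min_length=40,
    num_beams=4,
    length_penalty=2.0,
    no_repeat_ngram_size=3)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])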
For more details about this model, please refer to the facebook/bart-large-cnn model card on Hugging Face.
Token Classification Example
How to use Token Classification with Hugging Face Pipeline
Token classification models are used to identify entities in a text. What type of entities can a token classification model identify? It depends on the model. For example, dslim/bert-base-NER can identify four types of entities: location (LOC), organizations (ORG), person (PER), and miscellaneous (MISC).
from transformers import pipeline
classifier = pipeline("token-classification")
z = "I'm Alicia and I live in Milano."
d = classifier(z)
print(d)
for token in d:
    print(token["word"], token["entity"])
[{'entity': 'B-PER', 'score': np.float32(0.9941089), 'index': 4, 'word': 'Alicia', 'start': 4, 'end': 10},
{'entity': 'B-LOC', 'score': np.float32(0.9950382), 'index': 9, 'word': 'Milano', 'start': 25, 'end': 31}]
Alicia B-PER
Milano B-LOC
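To use dslim/bert-base-NER explicitly with the pipeline and merge sub-word tokens into whole entities, a sketch like this should work (aggregation_strategy groups the B-/I- pieces):
ner = pipeline("token-classification", model="dslim/bert-base-NER", aggregation_strategy="simple")
print(ner("I'm Alicia and I live in Milano."))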
How to Use BERT Token Classification Model Without the Pipeline
In this example, we'll use the dslim/bert-base-NER model to perform token classification (NER) on the same text as before without using the pipeline.
import torch
from transformers import BertTokenizerFast, BertForTokenClassification
# Load model and tokenizer
model_name = "dslim/bert-base-NER"
tokenizer = BertTokenizerFast.from_pretrained(model_name)
model = BertForTokenClassification.from_pretrained(model_name)
# Sample input
text = "I'm Alicia and I live in Milano."
# Tokenize
tokens = tokenizer(text, return_tensors="pt", truncation=True, is_split_into_words=False)
# Forward pass
with torch.no_grad():
    outputs = model(**tokens)
logits = outputs.logits  # shape: (batch_size, seq_len, num_labels)
# Get predicted class indices
predictions = torch.argmax(logits, dim=2)
# Convert IDs to label names
id2label = model.config.id2label
# Token IDs
input_ids = tokens["input_ids"][0]
predicted_labels = [id2label[label_id.item()] for label_id in predictions[0]]
print(predicted_labels)
['O', 'O', 'O', 'O', 'B-PER', 'O', 'O', 'O', 'O', 'B-LOC', 'O', 'O']
*"dslim/bert-base-NER" model from Hugging Face — licensed under the MIT License.
The B prefix marks the beginning of an entity: B-PER is the beginning of a person's name right after another person's name, and B-LOC is the beginning of a location right after another location.
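To see which token each label belongs to, you can decode the input IDs and pair them with the predicted labels (a small sketch using the variables from the example above):
# sketch: pair each token with its predicted label and keep only the entities
tokens_list = tokenizer.convert_ids_to_tokens(input_ids.tolist())
for token, label in zip(tokens_list, predicted_labels):
    if label != "O":
        print(token, label)
# Alicia B-PER
# Milano B-LOC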
For more detailed information about the dslim/bert-base-NER model, please visit the dslim/bert-base-NER model card.
Text Classification Example
How to Use Text Classification with Hugging Face Pipeline
Text classification models are designed to categorize text into predefined labels. They are widely used in tasks like sentiment analysis, spam detection, and topic labeling. In the example below, the model will determine whether a given text expresses a positive or negative sentiment.
from transformers import pipeline
text = "Your dog is super cute."
pipe = pipeline("text-classification")
result = pipe(text)
print(result[0]["label"])
POSITIVE
How to Use DistilBERT Text Classification Model Without the Pipeline
In this example, we'll use the distilbert/distilbert-base-uncased-finetuned-sst-2-english model to perform text classification without using the pipeline.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased-finetuned-sst-2-english")
model = AutoModelForSequenceClassification.from_pretrained("distilbert/distilbert-base-uncased-finetuned-sst-2-english")
inputs = tokenizer("Your dog is super cute.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class_id = logits.argmax().item()
print(model.config.id2label[predicted_class_id])
POSITIVE
*"distilbert-base-uncased-finetuned-sst-2-english" model from Hugging Face (Apache 2.0).
We used a simple text, but you can use the model for more complicated texts like reviews as well.
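For instance, a longer review can be classified the same way. A quick sketch reusing the tokenizer and model loaded above (the review text is made up for illustration):
review = "The delivery was late and the packaging was damaged, but the product itself works perfectly and the support team was very helpful."
inputs = tokenizer(review, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax().item()])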
For more details about this model, please refer to the distilbert/distilbert-base-uncased-finetuned-sst-2-english model card on Hugging Face.