Complete HuggingFace Tutorial: From Beginner to Expert (2026 Guide)


Author: AYI-NEDJIMI | Portfolio & Collections
Date: February 2026

This comprehensive tutorial walks you through everything you need to know about HuggingFace, the leading platform for Artificial Intelligence. Whether you are a beginner or an experienced developer, you will find practical examples, working code, and advanced tips throughout.


Table of Contents

  1. Introduction to HuggingFace
  2. Using Models
  3. Using Datasets
  4. Creating a Gradio Space
  5. Fine-tuning a Model
  6. Inference API and Endpoints
  7. Collections and Community
  8. Advanced Tips

1. Introduction to HuggingFace

What is HuggingFace?

HuggingFace is often described as the "GitHub of Artificial Intelligence". Founded in 2016, the platform has become the central ecosystem for the global AI community. It hosts over 800,000 models, 200,000 datasets, and tens of thousands of interactive applications called Spaces.

Unlike GitHub, which focuses on source code, HuggingFace specializes in:

  • Machine Learning models: LLMs, vision models, audio models, multimodal systems
  • Datasets: structured data, text, images, audio collections
  • Spaces: interactive web applications deployed in minutes
  • Inference: APIs to run models without managing infrastructure

Key Concepts

Models
A model on HuggingFace is a set of files containing the trained weights of a neural network. Each model comes with a "Model Card" describing how it works, its performance benchmarks, and its limitations. You can download, use, and even fine-tune these models for free.

Datasets
Datasets are organized, versioned, and documented collections of data. HuggingFace supports standardized formats (Parquet, JSON, CSV) with powerful loading tools through the datasets library. Datasets can range from a few kilobytes to several terabytes.

Spaces
Spaces are web applications hosted directly on HuggingFace. They support Gradio, Streamlit, and Docker, allowing you to create interactive demos of your models without managing any server infrastructure. Free CPU hosting is included, with GPU options available.

Creating an Account and Getting an API Token

  1. Visit huggingface.co and click "Sign Up"
  2. Fill in your information and confirm your email address
  3. To get an API token:
    • Navigate to Settings > Access Tokens
    • Click "New token"
    • Name your token and choose the permissions (Read or Write)
    • Copy and securely store your token
# Authentication with your token
from huggingface_hub import login
login(token="hf_your_token_here")

# Or via environment variable
import os
os.environ["HF_TOKEN"] = "hf_your_token_here"

Security tip: Never share your token in public code. Always use environment variables or .env files to manage sensitive credentials.
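The tip above can be sketched as a small helper that reads the token from the environment and fails loudly when it is missing (`get_hf_token` is an illustrative name; `HF_TOKEN` is the variable name the `huggingface_hub` library itself looks for):

```python
import os

def get_hf_token() -> str:
    """Return the HuggingFace token from the environment, or raise a clear error."""
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Export it in your shell or load it from a .env file."
        )
    return token

# Usage sketch:
# from huggingface_hub import login
# login(token=get_hf_token())
```

This way the token never appears in your source files, and a missing configuration surfaces as an explicit error instead of a confusing 401 later.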


2. Using Models

Browsing and Searching Models

The Model Hub provides powerful filtering capabilities:

  • By task: text-generation, text-classification, image-classification, translation, etc.
  • By framework: PyTorch, TensorFlow, JAX, ONNX
  • By language: filter models by supported language
  • By license: Apache 2.0, MIT, CC-BY, commercial-friendly, etc.
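The same filters are available programmatically through the Hub API. A minimal sketch using `huggingface_hub.list_models` (`top_models` is an illustrative helper name, and the result set naturally changes over time):

```python
from huggingface_hub import list_models

def top_models(task: str, library: str = "pytorch", n: int = 5):
    """Return the n most-downloaded Hub models for a task (queries the Hub API)."""
    return list(list_models(task=task, library=library,
                            sort="downloads", direction=-1, limit=n))

# Usage (requires network access):
# for m in top_models("text-generation"):
#     print(m.id, m.downloads)
```

An analogous `list_datasets` function covers the Dataset Hub with the same search-and-sort pattern.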

For a detailed comparison of the best open-source LLMs in 2026, check out our guide: Open Source LLM Comparison 2026

Loading a Model with Transformers

The transformers library from HuggingFace is the reference tool for loading and running models:

# Installation
# pip install transformers torch accelerate

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a text generation model
model_name = "Qwen/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Generate text
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain machine learning in 3 sentences."}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

Using pipeline() - The Easiest Method

The pipeline() function is the fastest way to use any model on HuggingFace:

from transformers import pipeline

# Text Classification
classifier = pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-sentiment-latest")
result = classifier("HuggingFace is an amazing platform for AI!")
print(result)
# [{"label": "positive", "score": 0.97}]

# Text Generation
generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")
result = generator("Artificial intelligence will", max_new_tokens=100)
print(result[0]["generated_text"])

# Translation
translator = pipeline("translation_en_to_fr", model="Helsinki-NLP/opus-mt-en-fr")
result = translator("Hello, how are you today?")
print(result)

# Question-Answering
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")
result = qa(
    question="What is HuggingFace?",
    context="HuggingFace is an AI platform that hosts models, datasets, and applications for machine learning."
)
print(f"Answer: {result['answer']} (score: {result['score']:.4f})")

# Embeddings (for RAG and semantic search)
embedder = pipeline("feature-extraction", model="sentence-transformers/all-MiniLM-L6-v2")
embedding = embedder("This is an example text for generating embeddings")
print(f"Embedding dimension: {len(embedding[0][0])}")

# Named Entity Recognition
ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")
result = ner("HuggingFace was founded in New York by Clement Delangue and Julien Chaumond.")
for entity in result:
    print(f"{entity['word']}: {entity['entity_group']} ({entity['score']:.4f})")

# Summarization
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
text = (
    "HuggingFace is a platform that hosts machine learning models, datasets, "
    "and applications. It was founded in 2016 and has become the central hub for the "
    "AI community. The platform supports multiple frameworks including PyTorch and "
    "TensorFlow, and offers both free and paid tiers for inference."
)
result = summarizer(text, max_length=50, min_length=20)
print(result[0]["summary_text"])

To learn more about using embeddings in a RAG architecture, check out: RAG Guide - Retrieval Augmented Generation
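Once you have embeddings like the `feature-extraction` output above, semantic search reduces to cosine similarity. A minimal pure-Python sketch (real embeddings have 384+ dimensions; the 3-dimensional vectors here are toy values):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query_vec, doc_vecs):
    """Index of the document vector closest to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy 3-dimensional "embeddings"
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
print(most_similar([0.0, 0.9, 0.1], docs))  # prints 1: the second document is closest
```

In a real RAG pipeline you would embed all documents once, store the vectors, and run this comparison against the embedded user query at retrieval time.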

Using the Free Inference API

You can use models without installing anything locally thanks to the Inference API:

from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_your_token")

# Text generation
response = client.text_generation(
    "The main advantages of machine learning are",
    model="Qwen/Qwen2.5-72B-Instruct",
    max_new_tokens=200
)
print(response)

# Chat completion (OpenAI-compatible format)
response = client.chat_completion(
    messages=[
        {"role": "user", "content": "Explain deep learning in simple terms."}
    ],
    model="Qwen/Qwen2.5-72B-Instruct",
    max_tokens=500
)
print(response.choices[0].message.content)

# Image classification
result = client.image_classification(
    "https://upload.wikimedia.org/wikipedia/commons/thumb/4/4d/Cat_November_2010-1a.jpg/1200px-Cat_November_2010-1a.jpg",
    model="google/vit-base-patch16-224"
)
print(result)

3. Using Datasets

Browsing Datasets

The Dataset Hub contains over 200,000 datasets covering every domain: NLP, computer vision, audio, tabular data, and more.

Filter by:

  • Task: question-answering, text-classification, image-classification
  • Size: from a few KB to several TB
  • Language: English, French, multilingual, and 100+ others
  • License: open, commercial, research-only

Loading a Dataset with the datasets Library

# Installation
# pip install datasets

from datasets import load_dataset

# Load a popular dataset
dataset = load_dataset("imdb")
print(dataset)
# DatasetDict({
#     train: Dataset({features: ["text", "label"], num_rows: 25000}),
#     test: Dataset({features: ["text", "label"], num_rows: 25000})
# })

# Access the data
print(dataset["train"][0])
print(f"Number of training examples: {len(dataset['train'])}")

# Load a specific configuration with a subset
dataset = load_dataset("squad_v2", split="train[:1000]")  # First 1000 examples
print(dataset[0])

# Load a CSV file as a dataset
dataset = load_dataset("csv", data_files="my_data.csv")

# Load from a JSON file
dataset = load_dataset("json", data_files="my_data.json")

Filtering, Splitting, and Processing

from datasets import load_dataset

dataset = load_dataset("imdb", split="train")

# Filter positive examples
positive = dataset.filter(lambda x: x["label"] == 1)
print(f"Positive examples: {len(positive)}")

# Apply a transformation (map)
from transformers import AutoTokenizer

# Load the tokenizer once, outside the mapped function (re-loading it per batch is slow)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize_function(examples):
    return tokenizer(examples["text"], truncation=True, padding="max_length", max_length=512)

tokenized = dataset.map(tokenize_function, batched=True)

# Split into train/validation
split_dataset = dataset.train_test_split(test_size=0.2, seed=42)
print(f"Train: {len(split_dataset['train'])}, Validation: {len(split_dataset['test'])}")

# Shuffle the data
shuffled = dataset.shuffle(seed=42)

# Select specific columns
subset = dataset.select_columns(["text"])

# Sort by a column
sorted_dataset = dataset.sort("label")

# Rename columns
renamed = dataset.rename_column("text", "input_text")

# Remove columns
cleaned = dataset.remove_columns(["label"])

Creating and Uploading Your Own Dataset

from datasets import Dataset, DatasetDict

# Create a dataset from a dictionary
data = {
    "question": [
        "What is GDPR?",
        "How does AES encryption work?",
        "What is a phishing attack?",
        "Define the principle of least privilege",
        "What is ISO 27001?",
        "What is a zero-day vulnerability?",
        "Explain multi-factor authentication",
        "What is a SOC (Security Operations Center)?"
    ],
    "answer": [
        "GDPR is the General Data Protection Regulation of the European Union.",
        "AES is a symmetric block cipher that encrypts data in 128-bit blocks.",
        "Phishing is a social engineering technique aimed at stealing information.",
        "Least privilege means granting only the minimum permissions necessary.",
        "ISO 27001 is the international standard for information security management systems.",
        "A zero-day is a vulnerability unknown to the vendor with no available patch.",
        "MFA requires two or more verification factors to access a resource.",
        "A SOC is a centralized unit that monitors and responds to security incidents."
    ],
    "category": [
        "compliance", "cryptography", "threats", "access-control",
        "compliance", "vulnerabilities", "authentication", "operations"
    ]
}

dataset = Dataset.from_dict(data)

# Split into train/test
dataset_dict = DatasetDict({
    "train": dataset.select(range(6)),
    "test": dataset.select(range(6, 8))
})

# Save locally
dataset_dict.save_to_disk("./my_cybersec_dataset")

# Upload to HuggingFace
dataset_dict.push_to_hub(
    "your-username/cybersecurity-qa-en",
    token="hf_your_token",
    private=False
)
print("Dataset uploaded successfully!")

Explore cybersecurity datasets on the AYI-NEDJIMI profile for real-world examples.


4. Creating a Gradio Space

What are Spaces?

HuggingFace Spaces are web applications hosted for free on the platform. They support three frameworks:

  • Gradio: ideal for ML model demos (most popular choice)
  • Streamlit: perfect for data dashboards and visualizations
  • Docker: for fully customized applications

Benefits:

  • Free hosting (basic CPU tier)
  • GPU available (free community GPU or paid for dedicated power)
  • Automatic deployment on every git commit
  • Custom domain support available
  • Built-in authentication options

Creating a Simple Gradio Application

# app.py - Gradio Demo Application
import gradio as gr
from transformers import pipeline

# Load the model
classifier = pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-sentiment-latest")

def analyze_sentiment(text):
    if not text.strip():
        return "Please enter some text."
    result = classifier(text)[0]
    label = result["label"]
    score = result["score"]
    sentiment_map = {"positive": "Positive", "negative": "Negative", "neutral": "Neutral"}
    sentiment = sentiment_map.get(label, label)
    return f"Sentiment: {sentiment} - Confidence: {score:.2%}"

def compare_texts(text1, text2):
    if not text1.strip() or not text2.strip():
        return "Please enter both texts."
    result1 = classifier(text1)[0]
    result2 = classifier(text2)[0]
    return f"Text 1: {result1['label']} ({result1['score']:.2%})\nText 2: {result2['label']} ({result2['score']:.2%})"

# Build the interface with tabs
with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("# Sentiment Analysis Tool")
    gr.Markdown("Analyze text sentiment using a fine-tuned RoBERTa model.")

    with gr.Tab("Single Analysis"):
        input_text = gr.Textbox(label="Enter your text", placeholder="Type text to analyze...", lines=3)
        output_text = gr.Textbox(label="Analysis Result")
        analyze_btn = gr.Button("Analyze", variant="primary")
        analyze_btn.click(fn=analyze_sentiment, inputs=input_text, outputs=output_text)

        gr.Examples(
            examples=[
                ["HuggingFace is an incredible platform for AI development!"],
                ["This service is really disappointing and slow."],
                ["The meeting tomorrow is at 2 PM in the conference room."]
            ],
            inputs=input_text
        )

    with gr.Tab("Compare Two Texts"):
        with gr.Row():
            text1 = gr.Textbox(label="Text 1", lines=3)
            text2 = gr.Textbox(label="Text 2", lines=3)
        compare_output = gr.Textbox(label="Comparison Result")
        compare_btn = gr.Button("Compare", variant="primary")
        compare_btn.click(fn=compare_texts, inputs=[text1, text2], outputs=compare_output)

demo.launch()

Deploying to HuggingFace

Method 1: Via the Web Interface

  1. Go to huggingface.co/new-space
  2. Choose a name, select "Gradio" as the SDK
  3. Upload your app.py file
  4. Add a requirements.txt if needed

Method 2: Via Command Line

# Install the CLI
pip install huggingface_hub

# Log in
huggingface-cli login

# Create the Space
huggingface-cli repo create my-space --type space --space-sdk gradio

# Clone and deploy
git clone https://huggingface.co/spaces/your-username/my-space
cd my-space
# Copy your app.py and requirements.txt files
git add .
git commit -m "Initial deployment"
git push

Method 3: Via the Python API

from huggingface_hub import HfApi

api = HfApi(token="hf_your_token")

# Create the Space
api.create_repo(
    repo_id="your-username/my-space-demo",
    repo_type="space",
    space_sdk="gradio",
    private=False
)

# Upload files
api.upload_file(
    path_or_fileobj="app.py",
    path_in_repo="app.py",
    repo_id="your-username/my-space-demo",
    repo_type="space"
)

api.upload_file(
    path_or_fileobj="requirements.txt",
    path_in_repo="requirements.txt",
    repo_id="your-username/my-space-demo",
    repo_type="space"
)

print("Space deployed successfully!")

requirements.txt Example

gradio>=4.0.0
transformers
torch
accelerate
sentencepiece

Secrets and Environment Variables

To protect your API keys and tokens:

  1. In the Space settings, go to "Repository secrets"
  2. Add your secrets (e.g., HF_TOKEN, OPENAI_API_KEY)
  3. Access them in your code:
import os

hf_token = os.environ.get("HF_TOKEN")
api_key = os.environ.get("OPENAI_API_KEY")

# Secrets are never exposed in logs or to users

You can also set environment variables (non-secret) in the Space settings for configuration values that do not need to be hidden.

Check out our Spaces in action: Dataset-Explorer and Model-Playground


5. Fine-tuning a Model

What is Fine-tuning?

Fine-tuning is the process of adapting a pre-trained model to your specific use case. Instead of training a model from scratch (which can cost millions of dollars), you start with an existing model and refine it with your own data.

Benefits:

  • Superior performance on your specific domain and tasks
  • Requires less data (hundreds to thousands of examples are often sufficient)
  • Training is much faster and more cost-effective than training from scratch
  • The model retains its general knowledge while gaining domain expertise

For a detailed guide on fine-tuning with LoRA and QLoRA, see: LLM Fine-tuning with LoRA/QLoRA

QLoRA Explained Simply

LoRA (Low-Rank Adaptation) is a technique that freezes the base model and trains only a small set of additional parameters (typically 0.1% to 1% of the total) by adding low-rank matrices alongside existing layers. This dramatically reduces the memory and compute requirements for fine-tuning.

QLoRA (Quantized LoRA) takes this further:

  1. The base model is quantized to 4-bit precision (approximately 75% memory reduction)
  2. LoRA adapters are trained in normal precision (16-bit)
  3. Result: You can fine-tune a 7B parameter model on a single GPU with just 16 GB of VRAM
Original model (7B params, 14 GB) --> 4-bit quantized (3.5 GB) + LoRA adapters (50 MB)
= Fine-tuning possible on a consumer-grade GPU!

Why this matters: Before QLoRA, fine-tuning large models required expensive multi-GPU setups. Now, anyone with a modern gaming GPU can fine-tune state-of-the-art models.
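The memory arithmetic above is easy to verify with a back-of-envelope calculation. This sketch uses illustrative round numbers (7B parameters, a rank-16 adapter on a 4096x4096 projection), not exact figures for any specific model:

```python
def model_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight-memory footprint in GB (ignores activations and optimizer state)."""
    return n_params * bits_per_param / 8 / 1e9

def lora_params(d: int, k: int, r: int) -> int:
    """Extra parameters LoRA adds to one d x k weight matrix: two rank-r factors."""
    return r * (d + k)

n = 7e9
print(f"fp16 weights:  {model_memory_gb(n, 16):.1f} GB")  # 14.0 GB
print(f"4-bit weights: {model_memory_gb(n, 4):.1f} GB")   # 3.5 GB

# A rank-16 adapter on a 4096 x 4096 attention projection:
print(f"LoRA params for one matrix: {lora_params(4096, 4096, 16):,}")  # 131,072
```

Summed over all targeted layers, the adapters stay in the tens of millions of parameters, which is why they fit in a few dozen megabytes next to the quantized base model.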

Step-by-Step Fine-tuning with PEFT

# Required installation
# pip install transformers peft trl datasets bitsandbytes accelerate

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from trl import SFTTrainer
from datasets import load_dataset

# 1. Configure 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True  # Double quantization for extra savings
)

# 2. Load the model and tokenizer
model_name = "Qwen/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

# 3. Prepare the model for training
model = prepare_model_for_kbit_training(model)

# 4. Configure LoRA
lora_config = LoraConfig(
    r=16,                      # Rank of LoRA matrices (higher = more capacity)
    lora_alpha=32,             # Scaling factor (typically 2x the rank)
    target_modules=[           # Layers to apply LoRA to
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ],
    lora_dropout=0.05,         # Dropout for regularization
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# Typically prints trainable params in the tens of millions, around 1% of the 1.5B total

# 5. Prepare the training data
dataset = load_dataset("json", data_files="training_data.json", split="train")

def format_instruction(example):
    return f"### Instruction:\n{example['instruction']}\n\n### Response:\n{example['output']}"

# 6. Configure training hyperparameters
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,    # Effective batch size = 4 * 4 = 16
    learning_rate=2e-4,
    weight_decay=0.01,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
    logging_steps=10,
    save_strategy="epoch",
    bf16=True,                        # matches bnb_4bit_compute_dtype above (needs an Ampere+ GPU)
    optim="paged_adamw_32bit",        # Memory-efficient optimizer
    report_to="none",
    gradient_checkpointing=True        # Saves memory at cost of speed
)

# 7. Launch training
# Note: recent trl versions configure max_seq_length via an SFTConfig passed as
# `args`; adjust to the signature of your installed version.
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    tokenizer=tokenizer,
    args=training_args,
    formatting_func=format_instruction,
    max_seq_length=2048
)

trainer.train()

# 8. Save and upload the fine-tuned model
trainer.save_model("./my-finetuned-model")

# Upload to HuggingFace
model.push_to_hub("your-username/my-cybersec-model-en", token="hf_your_token")
tokenizer.push_to_hub("your-username/my-cybersec-model-en", token="hf_your_token")
print("Fine-tuned model uploaded successfully!")

Evaluating Your Fine-tuned Model

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
model = PeftModel.from_pretrained(base_model, "your-username/my-cybersec-model-en")
tokenizer = AutoTokenizer.from_pretrained("your-username/my-cybersec-model-en")

# Test with a domain-specific prompt
messages = [
    {"role": "system", "content": "You are a cybersecurity expert."},
    {"role": "user", "content": "What are the key steps in incident response?"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

To deploy your fine-tuned model in production, see: Deploying LLMs in Production with GPU


6. Inference API and Endpoints

The Free Inference API

HuggingFace provides a free Inference API for testing and prototyping with models. It is ideal for experimentation and low-volume use cases.

from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_your_token")

# Chat completion
response = client.chat_completion(
    messages=[
        {"role": "system", "content": "You are a cybersecurity expert."},
        {"role": "user", "content": "What are the top 3 security threats in 2026?"}
    ],
    model="Qwen/Qwen2.5-72B-Instruct",
    max_tokens=1000,
    temperature=0.7
)
print(response.choices[0].message.content)

# Streaming (real-time output)
stream = client.chat_completion(
    messages=[{"role": "user", "content": "Tell me a story about AI"}],
    model="Qwen/Qwen2.5-72B-Instruct",
    max_tokens=500,
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

# Image generation
image = client.text_to_image(
    "A robot learning to code, cyberpunk style, high quality, detailed",
    model="stabilityai/stable-diffusion-xl-base-1.0"
)
image.save("robot_coder.png")

# Speech recognition
result = client.automatic_speech_recognition(
    "audio.mp3",
    model="openai/whisper-large-v3"
)
print(result.text)  # InferenceClient returns a dataclass, not a dict

# Text-to-speech
audio = client.text_to_speech(
    "Welcome to HuggingFace, the leading platform for AI!",
    model="facebook/mms-tts-eng"
)
with open("output.mp3", "wb") as f:
    f.write(audio)

Dedicated Inference Endpoints

For production workloads, Inference Endpoints offer guaranteed performance and availability:

from huggingface_hub import InferenceClient

# Create an endpoint via the web interface or API, then use it:
client = InferenceClient(
    model="https://your-endpoint.endpoints.huggingface.cloud",
    token="hf_your_token"
)

response = client.chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=200
)
print(response.choices[0].message.content)

Endpoint Pricing (indicative, 2026):

  • CPU: from ~0.06 USD/hour
  • GPU T4 (16 GB): ~0.60 USD/hour
  • GPU A10G (24 GB): ~1.30 USD/hour
  • GPU A100 (80 GB): ~4.00 USD/hour
  • GPU H100: contact sales for pricing

Key features of Inference Endpoints:

  • Auto-scaling (scale to zero when idle to save costs)
  • Private link connectivity (VPC peering)
  • Custom container support
  • Monitoring and logging built-in

Using the Python Client

from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_your_token")

# Embeddings (for RAG pipelines)
embeddings = client.feature_extraction(
    "This is example text for generating embeddings",
    model="sentence-transformers/all-MiniLM-L6-v2"
)
print(f"Embedding shape: {embeddings.shape}")  # returned as a NumPy array

# Zero-shot classification
result = client.zero_shot_classification(
    "HuggingFace launches a new inference feature",
    labels=["technology", "sports", "politics", "finance"],
    model="facebook/bart-large-mnli"
)
print(result)

# Summarization
summary = client.summarization(
    "HuggingFace is an AI platform that hosts models, datasets, and applications. "
    "It enables developers to share and discover machine learning models. "
    "The platform supports many frameworks including PyTorch and TensorFlow. "
    "It was founded in 2016 and has grown to become the central hub for AI.",
    model="facebook/bart-large-cnn"
)
print(summary)

# Fill-mask (predict missing words)
result = client.fill_mask(
    "The capital of France is [MASK].",
    model="bert-base-uncased"
)
for prediction in result:
    print(f"{prediction.token_str}: {prediction.score:.4f}")

Rate Limits and Pricing

  • Free API: Limited in request rate (varies by model popularity and your account tier)
  • Response time: Variable; popular models have faster cold-start times
  • Cold models: First request may be slower as the model loads into memory
  • PRO subscription: HuggingFace PRO (9 USD/month) provides higher rate limits and priority access
  • Enterprise: Custom rate limits and SLAs available for organizations
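When you do hit a rate limit, the usual remedy is to retry with exponential backoff. A generic sketch (`with_backoff` and its parameters are illustrative, not part of any HuggingFace library; in practice you would pass the HTTP error type your client raises as `retryable`):

```python
import random
import time

def with_backoff(call, retryable=(Exception,), max_retries=5, base_delay=1.0):
    """Retry `call` with exponential backoff and jitter on retryable errors."""
    for attempt in range(max_retries):
        try:
            return call()
        except retryable:
            if attempt == max_retries - 1:
                raise  # out of retries: propagate the last error
            # 1s, 2s, 4s, ... plus jitter to avoid synchronized retries
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

# Usage sketch:
# result = with_backoff(lambda: client.text_generation("Hello", model="..."))
```

The jitter term matters when many workers share one token: without it, all clients retry at the same instants and keep tripping the limit together.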

7. Collections and Community

Creating Collections

Collections allow you to organize and share related groups of models, datasets, and Spaces:

from huggingface_hub import HfApi

api = HfApi(token="hf_your_token")

# Create a collection
collection = api.create_collection(
    title="Cybersecurity AI - Essential Tools",
    description="Curated collection of the best models and datasets for cybersecurity",
    namespace="your-username",
    private=False
)

# Add items to the collection
api.add_collection_item(
    collection_slug=collection.slug,
    item_id="your-username/cybersec-model",
    item_type="model"
)
api.add_collection_item(
    collection_slug=collection.slug,
    item_id="your-username/cybersec-dataset",
    item_type="dataset"
)
api.add_collection_item(
    collection_slug=collection.slug,
    item_id="your-username/cybersec-space",
    item_type="space"
)

print(f"Collection created: {collection.slug}")

Explore the collections by AYI-NEDJIMI to discover curated resources on cybersecurity and AI.

Community Discussions

Every model, dataset, and Space has a "Community" tab for discussions:

from huggingface_hub import HfApi

api = HfApi(token="hf_your_token")

# Create a discussion
discussion = api.create_discussion(
    repo_id="your-username/my-model",
    title="Question about model performance",
    description="I tested the model on an English dataset and here are my results..."
)

# Comment on an existing discussion
api.comment_discussion(
    repo_id="your-username/my-model",
    discussion_num=discussion.num,
    comment="Thanks for the feedback! Here are some suggestions..."
)

Likes, Bookmarks, and Following

  • Likes: Show appreciation by liking repositories
  • Bookmarks: Save repos to revisit later
  • Following: Follow users and organizations to get notified of new publications
from huggingface_hub import HfApi

api = HfApi(token="hf_your_token")

# Like a model
api.like("Qwen/Qwen2.5-72B-Instruct")

# Like a dataset
api.like("imdb", repo_type="dataset")

# Like a Space
api.like("AYI-NEDJIMI/Dataset-Explorer", repo_type="space")

8. Advanced Tips

Model Card Best Practices

A well-written Model Card increases visibility and trust in your model:

---
language:
  - en
  - fr
license: apache-2.0
tags:
  - cybersecurity
  - text-classification
  - fine-tuned
  - qlora
  - security
datasets:
  - your-username/cybersec-dataset
metrics:
  - accuracy
  - f1
  - precision
  - recall
pipeline_tag: text-classification
base_model: Qwen/Qwen2.5-1.5B-Instruct
---

Include in your Model Card:

  • Description: Clear explanation of the model and its purpose
  • Performance: Table of metrics (Accuracy, F1, Precision, Recall)
  • Usage: Working code examples showing how to use the model
  • Training Details: Base model, method, data size, hardware used
  • Limitations: Known limitations and potential biases
  • Citation: How to cite your model in academic work

SEO for Your HuggingFace Repositories

To maximize the visibility and discoverability of your repos:

  1. Descriptive title: Include primary keywords in the repo name (e.g., cybersec-threat-classifier-en rather than my-model-v2)
  2. Relevant tags: Use all applicable tags (language, task, framework, domain)
  3. Rich description: Write a detailed README with practical code examples
  4. External links: Link to your website, blog posts, and related documentation
  5. Regular updates: Keep your models and datasets updated with new versions
  6. Community engagement: Respond to questions and participate in discussions promptly
  7. Collections: Organize your repos into thematic collections for easy discovery
  8. Benchmarks: Include performance metrics and comparisons with other models
  9. Examples: Provide working code examples and demo Spaces
  10. Cross-linking: Reference your other repos and resources within each project

Monetization Options

HuggingFace offers several monetization possibilities:

  • Private models: Paid access to your models via dedicated endpoints
  • Paid Spaces: Applications with premium GPU hardware
  • Consulting: Showcase your expertise through your profile and published work
  • Enterprise Hub: HuggingFace Enterprise solutions for organizations
  • Training and courses: Create educational content linked to your repositories
  • Custom inference: Offer specialized inference endpoints for specific industries

Organization Accounts

For teams and enterprises:

from huggingface_hub import HfApi

api = HfApi(token="hf_your_token")

# Create a repo under an organization
api.create_repo(
    repo_id="my-organization/my-model",
    repo_type="model",
    private=True
)

# Manage members (via the web interface)
# Settings > Members > Invite

Organizations enable:

  • Centralized management of all repositories
  • Granular access control (read, write, admin roles)
  • Unified billing for paid features
  • Customized profile page with organization branding
  • Team collaboration features
  • SSO/SAML authentication for enterprise security

Conclusion

HuggingFace has become an indispensable pillar of the AI ecosystem. Whether you want to use existing models, create your own datasets, deploy interactive applications, or fine-tune state-of-the-art models, the platform provides all the tools you need.

The combination of free hosting, powerful APIs, an active community, and comprehensive tooling makes HuggingFace the best starting point for any AI project in 2026.

Further Reading:

Author Profile: AYI-NEDJIMI on HuggingFace
Spaces: Dataset-Explorer | Model-Playground


This tutorial is published under the CC-BY-4.0 license. Feel free to share it and contribute via community discussions.
