# AI Agency Pro - Usage Examples
Real-world usage patterns demonstrating vendor-driven development with HuggingFace libraries.
## Quick Start

### Using the Web Interface
- Navigate to the AI Agency Pro Space
- Select a tab (Summarizer, Classifier, Q&A Agent, or Chat Agent)
- Enter your input and click the action button
- View results instantly with GPU acceleration
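The same tabs can also be driven from code with `gradio_client`. A minimal sketch, assuming placeholder values for the Space ID and endpoint name (check the Space's "Use via API" panel for the real ones):

```python
from gradio_client import Client

# Both values below are illustrative placeholders: copy the actual
# Space ID and api_name from the Space's "Use via API" panel.
space = Client("your-username/ai-agency-pro")
summary = space.predict(
    "Long article text to condense...",  # text input for the Summarizer tab
    api_name="/summarize",               # hypothetical endpoint name
)
print(summary)
```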
## Python API Examples

### Example 1: Text Summarization

```python
from huggingface_hub import InferenceClient

# Initialize client (official vendor pattern)
client = InferenceClient()

# Summarize text using official model
text = """
Artificial intelligence has transformed how businesses operate.
From automating customer service to optimizing supply chains,
AI technologies are driving unprecedented efficiency gains.
Companies investing in AI report 40% productivity improvements.
"""

result = client.summarization(
    text,
    model="facebook/bart-large-cnn",
    parameters={"max_length": 150, "min_length": 30},
)
print(result)
```
### Example 2: Zero-Shot Classification

```python
from huggingface_hub import InferenceClient

client = InferenceClient()

# Classify text without training
text = "I just bought the new iPhone and it's amazing!"
labels = ["technology", "sports", "politics", "entertainment"]

result = client.zero_shot_classification(
    text,
    labels,
    model="facebook/bart-large-mnli",
)

# Recent huggingface_hub versions return a ranked list of
# ZeroShotClassificationOutputElement(label=..., score=...)
for item in result:
    print(f"{item.label}: {item.score:.2%}")
```
### Example 3: Question Answering

```python
from huggingface_hub import InferenceClient

client = InferenceClient()

context = """
Hugging Face was founded in 2016 by Clement Delangue,
Julien Chaumond, and Thomas Wolf. The company is
headquartered in New York City and has raised over
$160 million in funding.
"""
question = "When was Hugging Face founded?"

result = client.question_answering(
    question=question,
    context=context,
    model="deepset/roberta-base-squad2",
)

# Recent huggingface_hub versions return a QuestionAnsweringOutputElement
print(f"Answer: {result.answer}")
print(f"Confidence: {result.score:.2%}")
```
### Example 4: Chat with an LLM

```python
from huggingface_hub import InferenceClient

client = InferenceClient()

messages = [
    {"role": "user", "content": "Explain quantum computing in simple terms."}
]

response = client.chat_completion(
    messages,
    model="mistralai/Mistral-7B-Instruct-v0.3",
    max_tokens=500,
)
print(response.choices[0].message.content)
```
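Multi-turn chat follows the same pattern: append the assistant's reply to `messages` before sending the next user turn. A short continuation of the example above (the follow-up question is illustrative):

```python
# Keep the full history in `messages` so the model sees prior turns
messages.append(
    {"role": "assistant", "content": response.choices[0].message.content}
)
messages.append(
    {"role": "user", "content": "Now give me a one-sentence analogy."}
)

follow_up = client.chat_completion(
    messages,
    model="mistralai/Mistral-7B-Instruct-v0.3",
    max_tokens=200,
)
print(follow_up.choices[0].message.content)
```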
### Example 5: Streaming Chat Response

```python
from huggingface_hub import InferenceClient

client = InferenceClient()

messages = [{"role": "user", "content": "Write a haiku about AI."}]

# Stream the response for real-time output
for chunk in client.chat_completion(
    messages,
    model="mistralai/Mistral-7B-Instruct-v0.3",
    max_tokens=100,
    stream=True,
):
    # The final chunk's delta can carry no content, so guard against None
    print(chunk.choices[0].delta.content or "", end="")
```
## Gradio Integration Examples

### Example 6: Custom Summarization Interface

```python
import gradio as gr
from huggingface_hub import InferenceClient

client = InferenceClient()

def summarize(text, max_length=150):
    result = client.summarization(
        text,
        model="facebook/bart-large-cnn",
        parameters={"max_length": int(max_length)},  # sliders emit floats
    )
    return result

iface = gr.Interface(
    fn=summarize,
    inputs=[
        gr.Textbox(label="Text to Summarize", lines=10),
        gr.Slider(50, 300, value=150, label="Max Length"),
    ],
    outputs=gr.Textbox(label="Summary"),
    title="Text Summarizer",
)

iface.launch()
```
### Example 7: Multi-Tab Application

```python
import gradio as gr
from huggingface_hub import InferenceClient

client = InferenceClient()

def summarize(text):
    return client.summarization(text, model="facebook/bart-large-cnn")

def classify(text, labels):
    label_list = [l.strip() for l in labels.split(",")]
    result = client.zero_shot_classification(text, label_list)
    # gr.Label expects a {label: confidence} mapping
    return {item.label: item.score for item in result}

with gr.Blocks() as demo:
    gr.Markdown("# Multi-Agent System")

    with gr.Tab("Summarizer"):
        text_in = gr.Textbox(label="Input")
        text_out = gr.Textbox(label="Summary")
        gr.Button("Summarize").click(summarize, text_in, text_out)

    with gr.Tab("Classifier"):
        cls_text = gr.Textbox(label="Text")
        cls_labels = gr.Textbox(label="Labels (comma-separated)")
        cls_out = gr.Label(label="Results")
        gr.Button("Classify").click(classify, [cls_text, cls_labels], cls_out)

demo.launch()
```
## ZeroGPU Examples

### Example 8: GPU-Accelerated Processing

```python
import spaces
import gradio as gr
from huggingface_hub import InferenceClient

client = InferenceClient()

@spaces.GPU(duration=60)  # Request GPU for 60 seconds
def heavy_processing(text):
    """Long-running inference task."""
    result = client.text_generation(
        text,
        model="mistralai/Mistral-7B-Instruct-v0.3",
        max_new_tokens=1000,
    )
    return result

iface = gr.Interface(
    fn=heavy_processing,
    inputs=gr.Textbox(label="Prompt"),
    outputs=gr.Textbox(label="Generated Text"),
)

iface.launch()
```
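One caveat worth flagging: `InferenceClient` calls run on remote inference servers, so `@spaces.GPU` matters most when the model runs inside the Space itself. A minimal sketch of that local pattern, assuming a `transformers` pipeline (the `gpt2` model choice is purely illustrative):

```python
import spaces
import gradio as gr
from transformers import pipeline

# Load the model at startup; with ZeroGPU, a GPU is attached only
# while the decorated function executes. "gpt2" is a placeholder.
generator = pipeline("text-generation", model="gpt2", device="cuda")

@spaces.GPU(duration=60)
def generate(prompt):
    return generator(prompt, max_new_tokens=100)[0]["generated_text"]

gr.Interface(fn=generate, inputs="text", outputs="text").launch()
```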
## Batch Processing Examples

### Example 9: Process Multiple Documents

```python
import asyncio

from huggingface_hub import AsyncInferenceClient

# AsyncInferenceClient mirrors InferenceClient with awaitable methods
client = AsyncInferenceClient()

async def batch_summarize(documents):
    """Summarize multiple documents concurrently."""
    tasks = [
        client.summarization(doc, model="facebook/bart-large-cnn")
        for doc in documents
    ]
    return await asyncio.gather(*tasks)

# Usage
documents = [
    "First document text...",
    "Second document text...",
    "Third document text...",
]
summaries = asyncio.run(batch_summarize(documents))
for i, summary in enumerate(summaries):
    print(f"Document {i+1}: {summary}")
```
## Error Handling Examples

### Example 10: Robust API Calls

```python
import logging

from huggingface_hub import InferenceClient

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

client = InferenceClient()

def safe_summarize(text, max_retries=3):
    """Summarize with input validation, error handling, and retries."""
    # Validate once, before entering the retry loop
    if not text or len(text.strip()) < 10:
        return "Error: Text too short to summarize."

    for attempt in range(max_retries):
        try:
            return client.summarization(
                text,
                model="facebook/bart-large-cnn",
            )
        except Exception as e:
            logger.warning(f"Attempt {attempt + 1} failed: {e}")
            if attempt == max_retries - 1:
                return f"Error: {e}"
```
## Integration Patterns

### Example 11: FastAPI Integration

```python
from fastapi import FastAPI
from huggingface_hub import InferenceClient
from pydantic import BaseModel

app = FastAPI()
client = InferenceClient()

class SummarizeRequest(BaseModel):
    text: str
    max_length: int = 150

@app.post("/summarize")
def summarize(request: SummarizeRequest):
    # A plain `def` handler keeps this blocking client call off the
    # event loop; FastAPI runs it in a worker thread.
    result = client.summarization(
        request.text,
        model="facebook/bart-large-cnn",
        parameters={"max_length": request.max_length},
    )
    return {"summary": str(result)}
```
### Example 12: Flask Integration

```python
from flask import Flask, request, jsonify
from huggingface_hub import InferenceClient

app = Flask(__name__)
client = InferenceClient()

@app.route("/classify", methods=["POST"])
def classify():
    data = request.json
    result = client.zero_shot_classification(
        data["text"],
        data["labels"],
        model="facebook/bart-large-mnli",
    )
    # Convert the client's output elements into JSON-safe dicts
    return jsonify([{"label": item.label, "score": item.score} for item in result])
```
## Best Practices Demonstrated

- **Always use `InferenceClient`** - the official HuggingFace client pattern (see the sketch below)
- **Implement error handling** - degrade gracefully instead of crashing
- **Use `@spaces.GPU`** - request GPU time only for the work that needs it
- **Add logging** - observability and easier debugging
- **Validate inputs** - reject bad requests before they reach the API
- **Use official model IDs** - reliability and upstream updates
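As a closing sketch of the first practice: an authenticated client raises the anonymous rate limits. `HF_TOKEN` is the conventional environment variable for HuggingFace tokens, but confirm how your deployment stores secrets:

```python
import os

from huggingface_hub import InferenceClient

# HF_TOKEN is the conventional variable name; adjust to your setup
client = InferenceClient(token=os.environ.get("HF_TOKEN"))
```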