# AI Agency Pro - Usage Examples

Real-world usage patterns demonstrating vendor-driven development with HuggingFace libraries.

---

## Quick Start

### Using the Web Interface

1. Navigate to the [AI Agency Pro Space](https://huggingface.co/spaces/dlynch90/AI-Agency-Pro)
2. Select a tab (Summarizer, Classifier, Q&A Agent, or Chat Agent)
3. Enter your input and click the action button
4. View results instantly with GPU acceleration
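### Calling the Space Programmatically

The same tabs can be driven from code with the `gradio_client` library. The endpoint name and argument list below are illustrative placeholders — check the Space's "Use via API" panel for the actual `api_name` values and inputs.

```python
from gradio_client import Client

# Connect to the hosted Space (assumes the Space is public)
client = Client("dlynch90/AI-Agency-Pro")

# Hypothetical endpoint name; replace with the api_name shown
# in the Space's "Use via API" panel
result = client.predict(
    "Long article text to condense...",
    api_name="/summarize"
)
print(result)
```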
---

## Python API Examples

### Example 1: Text Summarization

```python
from huggingface_hub import InferenceClient

# Initialize client (official vendor pattern)
client = InferenceClient()

# Summarize text using official model
text = """
Artificial intelligence has transformed how businesses operate.
From automating customer service to optimizing supply chains,
AI technologies are driving unprecedented efficiency gains.
Companies investing in AI report 40% productivity improvements.
"""

result = client.summarization(
    text,
    model="facebook/bart-large-cnn",
    parameters={"max_length": 150, "min_length": 30}
)
print(result)
```

### Example 2: Zero-Shot Classification

```python
from huggingface_hub import InferenceClient

client = InferenceClient()

# Classify text without training
text = "I just bought the new iPhone and it's amazing!"
labels = ["technology", "sports", "politics", "entertainment"]

result = client.zero_shot_classification(
    text,
    labels,
    model="facebook/bart-large-mnli"
)

# The client returns a list of label/score elements
for item in result:
    print(f"{item.label}: {item.score:.2%}")
```

### Example 3: Question Answering

```python
from huggingface_hub import InferenceClient

client = InferenceClient()

context = """
Hugging Face was founded in 2016 by Clement Delangue, Julien Chaumond,
and Thomas Wolf. The company is headquartered in New York City and has
raised over $160 million in funding.
"""

question = "When was Hugging Face founded?"

result = client.question_answering(
    question=question,
    context=context,
    model="deepset/roberta-base-squad2"
)

print(f"Answer: {result.answer}")
print(f"Confidence: {result.score:.2%}")
```

### Example 4: Chat with LLM

```python
from huggingface_hub import InferenceClient

client = InferenceClient()

messages = [
    {"role": "user", "content": "Explain quantum computing in simple terms."}
]

response = client.chat_completion(
    messages,
    model="mistralai/Mistral-7B-Instruct-v0.3",
    max_tokens=500
)

print(response.choices[0].message.content)
```

### Example 5: Streaming Chat Response

```python
from huggingface_hub import InferenceClient

client = InferenceClient()

messages = [{"role": "user", "content": "Write a haiku about AI."}]

# Stream response for real-time output
for chunk in client.chat_completion(
    messages,
    model="mistralai/Mistral-7B-Instruct-v0.3",
    max_tokens=100,
    stream=True
):
    # The final chunk may carry no content
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
```
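### A Note on Authentication

The examples above construct `InferenceClient()` with no arguments, which relies on anonymous, rate-limited access (or a token cached by `huggingface-cli login`). For production use, pass a User Access Token explicitly — a minimal sketch, assuming the token is stored in the conventional `HF_TOKEN` environment variable:

```python
import os
from huggingface_hub import InferenceClient

# Read the token from the environment rather than hard-coding it
client = InferenceClient(token=os.environ["HF_TOKEN"])

response = client.chat_completion(
    [{"role": "user", "content": "Hello!"}],
    model="mistralai/Mistral-7B-Instruct-v0.3",
    max_tokens=50
)
print(response.choices[0].message.content)
```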
---

## Gradio Integration Examples

### Example 6: Custom Summarization Interface

```python
import gradio as gr
from huggingface_hub import InferenceClient

client = InferenceClient()

def summarize(text, max_length=150):
    result = client.summarization(
        text,
        model="facebook/bart-large-cnn",
        parameters={"max_length": int(max_length)}
    )
    # Return the plain text rather than the output object
    return result.summary_text

iface = gr.Interface(
    fn=summarize,
    inputs=[
        gr.Textbox(label="Text to Summarize", lines=10),
        gr.Slider(50, 300, value=150, label="Max Length")
    ],
    outputs=gr.Textbox(label="Summary"),
    title="Text Summarizer"
)

iface.launch()
```

### Example 7: Multi-Tab Application

```python
import gradio as gr
from huggingface_hub import InferenceClient

client = InferenceClient()

def summarize(text):
    result = client.summarization(text, model="facebook/bart-large-cnn")
    return result.summary_text

def classify(text, labels):
    label_list = [l.strip() for l in labels.split(",")]
    result = client.zero_shot_classification(text, label_list)
    return {item.label: item.score for item in result}

with gr.Blocks() as demo:
    gr.Markdown("# Multi-Agent System")

    with gr.Tab("Summarizer"):
        text_in = gr.Textbox(label="Input")
        text_out = gr.Textbox(label="Summary")
        gr.Button("Summarize").click(summarize, text_in, text_out)

    with gr.Tab("Classifier"):
        cls_text = gr.Textbox(label="Text")
        cls_labels = gr.Textbox(label="Labels (comma-separated)")
        cls_out = gr.Label(label="Results")
        gr.Button("Classify").click(classify, [cls_text, cls_labels], cls_out)

demo.launch()
```

---

## ZeroGPU Examples

### Example 8: GPU-Accelerated Processing

`@spaces.GPU` only pays off when the model runs inside the Space itself — remote `InferenceClient` calls never touch the local GPU — so this example loads the model locally with `transformers`:

```python
import spaces
import torch
import gradio as gr
from transformers import pipeline

# Load the model once at startup; ZeroGPU attaches the GPU only while
# a @spaces.GPU-decorated function is running
generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.3",
    torch_dtype=torch.bfloat16,
    device="cuda"
)

@spaces.GPU(duration=60)  # Request GPU for up to 60 seconds
def heavy_processing(prompt):
    """GPU-accelerated local inference."""
    output = generator(prompt, max_new_tokens=1000)
    return output[0]["generated_text"]

iface = gr.Interface(
    fn=heavy_processing,
    inputs=gr.Textbox(label="Prompt"),
    outputs=gr.Textbox(label="Generated Text")
)

iface.launch()
```

---

## Batch Processing Examples

### Example 9: Process Multiple Documents

```python
import asyncio
from huggingface_hub import AsyncInferenceClient

client = AsyncInferenceClient()

async def batch_summarize(documents):
    """Summarize multiple documents concurrently."""
    tasks = [
        client.summarization(doc, model="facebook/bart-large-cnn")
        for doc in documents
    ]
    return await asyncio.gather(*tasks)

# Usage
documents = [
    "First document text...",
    "Second document text...",
    "Third document text..."
]

summaries = asyncio.run(batch_summarize(documents))
for i, summary in enumerate(summaries):
    print(f"Document {i+1}: {summary}")
```

---

## Error Handling Examples

### Example 10: Robust API Calls

```python
import logging
from huggingface_hub import InferenceClient

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

client = InferenceClient()

def safe_summarize(text, max_retries=3):
    """Summarize with input validation, error handling, and retries."""
    # Validate input once, before any API call
    if not text or len(text.strip()) < 10:
        return "Error: Text too short to summarize."

    for attempt in range(max_retries):
        try:
            result = client.summarization(
                text,
                model="facebook/bart-large-cnn"
            )
            return result.summary_text
        except Exception as e:
            logger.warning(f"Attempt {attempt + 1} failed: {e}")
            if attempt == max_retries - 1:
                return f"Error: {e}"
```

---

## Integration Patterns

### Example 11: FastAPI Integration

```python
from fastapi import FastAPI
from huggingface_hub import InferenceClient
from pydantic import BaseModel

app = FastAPI()
client = InferenceClient()

class SummarizeRequest(BaseModel):
    text: str
    max_length: int = 150

@app.post("/summarize")
async def summarize(request: SummarizeRequest):
    result = client.summarization(
        request.text,
        model="facebook/bart-large-cnn",
        parameters={"max_length": request.max_length}
    )
    return {"summary": result.summary_text}
```

### Example 12: Flask Integration

```python
from flask import Flask, request, jsonify
from huggingface_hub import InferenceClient

app = Flask(__name__)
client = InferenceClient()

@app.route("/classify", methods=["POST"])
def classify():
    data = request.json
    result = client.zero_shot_classification(
        data["text"],
        data["labels"],
        model="facebook/bart-large-mnli"
    )
    return jsonify([{"label": item.label, "score": item.score} for item in result])
```

---

## Best Practices Demonstrated

1. **Always use InferenceClient** - Official HuggingFace pattern
2. **Implement error handling** - Graceful degradation
3. **Use @spaces.GPU** - Efficient GPU allocation
4. **Add logging** - Observability and debugging
5. **Validate inputs** - Prevent API errors
6. **Use official model IDs** - Reliability and updates

---

## References

- [HuggingFace Hub Python Library](https://huggingface.co/docs/huggingface_hub)
- [Gradio Documentation](https://gradio.app/docs)
- [Spaces ZeroGPU](https://huggingface.co/docs/hub/spaces-zerogpu)
- [Inference API](https://huggingface.co/docs/api-inference)