Upload folder using huggingface_hub

Files changed (10)

.gemini/extensions/huggingface/GEMINI.md +16 -0
.gemini/extensions/huggingface/KNOWLEDGE_BASE.md +33 -0
.gemini/extensions/huggingface/gemini-extension.json +10 -0
.gemini/extensions/huggingface/pyproject.toml +15 -0
.gemini/extensions/huggingface/run.sh +30 -0
.gemini/extensions/huggingface/src/huggingface/__init__.py +0 -0
.gemini/extensions/huggingface/src/huggingface/__pycache__/__init__.cpython-312.pyc +0 -0
.gemini/extensions/huggingface/src/huggingface/__pycache__/server.cpython-312.pyc +0 -0
.gemini/extensions/huggingface/src/huggingface/server.py +36 -0
.gemini/extensions/huggingface/src/huggingface/tools.py +156 -0

.gemini/extensions/huggingface/GEMINI.md ADDED

	@@ -0,0 +1,16 @@

+# Hugging Face Extension for Gemini CLI
+This extension provides tools to interact with the Hugging Face Hub and access the EbookBuilder knowledge base.
+## Available Tools
+- `search_repos`: Search for models, datasets, or spaces on the Hub.
+- `get_repo_info`: Get detailed information about a repository.
+- `download_repo`: Download a repository to a local directory.
+- `query_knowledge_base`: Search the EbookBuilder knowledge base for documentation and guides.
+## Usage Guidelines
+- Always use the `HUGGINGFACE_HUB_TOKEN` environment variable for authenticated requests.
+- When searching, specify the `repo_type` (model, dataset, or space).
+- The knowledge base contains specific workflows for ebook generation. Query it if you need guidance on the ebook building process.
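
The usage guidelines above could be enforced in code along these lines. This is a minimal sketch, not part of the commit: `resolve_token` and `validate_repo_type` are hypothetical helper names, and the only details taken from GEMINI.md are the `HUGGINGFACE_HUB_TOKEN` variable and the three `repo_type` values.

```python
import os
from typing import Mapping, Optional

# The three repository types named in the guidelines above.
VALID_REPO_TYPES = ("model", "dataset", "space")


def resolve_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the token from HUGGINGFACE_HUB_TOKEN, or None if unset or empty."""
    env = os.environ if env is None else env
    token = env.get("HUGGINGFACE_HUB_TOKEN", "").strip()
    return token or None


def validate_repo_type(repo_type: str) -> str:
    """Normalize repo_type and reject anything outside the documented set."""
    normalized = repo_type.strip().lower()
    if normalized not in VALID_REPO_TYPES:
        raise ValueError(
            f"repo_type must be one of {VALID_REPO_TYPES}, got {repo_type!r}"
        )
    return normalized
```

Raising on an unknown `repo_type` is a design choice made for this sketch; the committed `search_repos` returns an error string instead.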

.gemini/extensions/huggingface/KNOWLEDGE_BASE.md ADDED

	@@ -0,0 +1,33 @@

+# EbookBuilder Knowledge Base
+## Overview
+EbookBuilder is a system designed to leverage AI models on Hugging Face to generate, format, and publish ebooks.
+## Key Components
+### 1. Content Generation
+- **Models**: Use Large Language Models (LLMs) like Llama-3, Mistral, or Gemma (via HF) to generate book content.
+- **Prompting**: Use structured prompts to ensure consistent style and tone across chapters.
+### 2. Formatting & Conversion
+- **Markdown**: The primary format for draft content.
+- **Conversion Tools**: Use tools like `pandoc` or custom Python scripts to convert Markdown to EPUB or PDF. `pandoc` does not write MOBI, so that format needs a separate converter such as Calibre's `ebook-convert`.
+- **Images**: Generate covers and illustrations using Diffusion models (e.g., Stable Diffusion) hosted on Hugging Face.
+### 3. Hosting & Deployment
+- **Hugging Face Spaces**: Host the EbookBuilder interface (using Gradio or Streamlit) for user interaction.
+- **Datasets**: Store generated book metadata and files in Hugging Face Datasets for versioning and sharing.
+## Workflows
+### Creating a New Ebook
+1. Define the book title and outline.
+2. Generate content chapter by chapter using a chosen LLM.
+3. Review and edit the markdown content.
+4. Generate cover art using an image model.
+5. Compile the final ebook using the conversion module.
+## Best Practices
+- **Human-in-the-loop**: Always review AI-generated content for accuracy and flow.
+- **Modular Design**: Keep the generation logic separate from the formatting logic for easier updates.
+- **Token Management**: Monitor API usage when using hosted models to avoid hitting rate limits.
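
Step 5 of the workflow above (compiling the final ebook) might look like the following when `pandoc` is the conversion tool. This is a sketch under assumptions: `build_pandoc_command` is a hypothetical helper, the file names are placeholders, and the function only builds the argument list so that generation and formatting stay separate, as the best practices suggest.

```python
from pathlib import Path
from typing import List, Optional


def build_pandoc_command(
    chapters: List[Path],
    output: Path,
    title: str,
    cover_image: Optional[Path] = None,
) -> List[str]:
    """Build (but do not run) a pandoc command that compiles chapters to EPUB."""
    if not chapters:
        raise ValueError("at least one chapter file is required")
    # Chapter files are passed in order; pandoc concatenates its inputs.
    command = ["pandoc", *[str(p) for p in chapters], "-o", str(output)]
    command += ["--metadata", f"title={title}"]
    if cover_image is not None:
        command.append(f"--epub-cover-image={cover_image}")
    return command
```

The resulting list can be handed to `subprocess.run(command, check=True)` on a machine where `pandoc` is installed.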

.gemini/extensions/huggingface/gemini-extension.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "name": "huggingface",
+  "version": "0.1.0",
+  "mcpServers": {
+    "HuggingFaceMcpServer": {
+      "command": "${extensionPath}/run.sh",
+      "args": []
+    }
+  }
+}

.gemini/extensions/huggingface/pyproject.toml ADDED

	@@ -0,0 +1,15 @@

+[project]
+name = "huggingface"
+version = "0.1.0"
+description = "Hugging Face MCP server extension"
+dependencies = [
+    "huggingface_hub",
+    "mcp[server]",
+    "fastmcp",
+    "python-dotenv",
+    "absl-py"
+]
+[build-system]
+requires = ["setuptools>=61.0"]
+build-backend = "setuptools.build_meta"

.gemini/extensions/huggingface/run.sh ADDED

	@@ -0,0 +1,30 @@

+#!/bin/bash
+set -e
+# Change to the directory containing this script.
+cd "$(dirname "${BASH_SOURCE[0]}")" || { echo "ERROR: Could not change to script directory." >&2; exit 1; }
+# Check for uv and install if not present.
+# Status output goes to stderr: with the stdio transport, stdout carries the MCP protocol.
+if ! command -v uv &> /dev/null
+then
+    echo "'uv' is not found. Attempting to install it..." >&2
+    curl -LsSf https://astral.sh/uv/install.sh | sh >&2
+    export PATH="$HOME/.local/bin:$PATH"
+fi
+# Create virtual environment
+if [ ! -d ".venv" ]; then
+    echo "Creating virtual environment..." >&2
+    uv venv >&2
+fi
+# Install dependencies.
+# Editable install so tools.py can locate KNOWLEDGE_BASE.md relative to the source tree.
+echo "Installing dependencies..." >&2
+uv pip install -e . >&2
+# The server reads the token from the environment (see tools.py); nothing is set here.
+# Start the server
+echo "Starting Hugging Face server..." >&2
+./.venv/bin/python3 -m huggingface.server

.gemini/extensions/huggingface/src/huggingface/__init__.py ADDED

File without changes

.gemini/extensions/huggingface/src/huggingface/__pycache__/__init__.cpython-312.pyc ADDED

Binary file (184 Bytes)

.gemini/extensions/huggingface/src/huggingface/__pycache__/server.cpython-312.pyc ADDED

Binary file (1.08 kB)

.gemini/extensions/huggingface/src/huggingface/server.py ADDED

	@@ -0,0 +1,36 @@

+import sys
+from absl import app
+from mcp.server import fastmcp
+from .tools import HuggingFaceTools
+def main(argv):
+    hf_tools = HuggingFaceTools()
+    mcp = fastmcp.FastMCP("HuggingFaceMcpServer")
+    mcp.add_tool(hf_tools.meta_orchestrator)
+    mcp.add_tool(hf_tools.search_repos)
+    mcp.add_tool(hf_tools.get_repo_info)
+    mcp.add_tool(hf_tools.download_repo)
+    mcp.add_tool(hf_tools.query_knowledge_base)
+    # New Agent Prototypes
+    mcp.add_tool(hf_tools.fact_check_text)
+    mcp.add_tool(hf_tools.generate_illustration_prompt)
+    mcp.add_tool(hf_tools.synthesize_audiobook_chapter)
+    mcp.add_tool(hf_tools.generate_marketing_package)
+    mcp.add_tool(hf_tools.audit_project_costs)
+    mcp.add_tool(hf_tools.request_human_review)
+    # Advanced Agent Prototypes
+    mcp.add_tool(hf_tools.localize_content)
+    mcp.add_tool(hf_tools.legal_copyright_scan)
+    mcp.add_tool(hf_tools.create_character_companion)
+    mcp.add_tool(hf_tools.analyze_reader_sentiment)
+    mcp.add_tool(hf_tools.optimize_accessibility)
+    mcp.add_tool(hf_tools.build_continuity_graph)
+    mcp.add_tool(hf_tools.perform_market_research)
+    mcp.run()
+if __name__ == "__main__":
+    app.run(main)
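
The one-call-per-tool registration in `server.py` could also be driven by introspection. The sketch below is an alternative, not the commit's approach: `public_tool_methods` is a hypothetical helper, and `DemoTools` stands in for `HuggingFaceTools` so the example runs without network access or the `mcp` package.

```python
import inspect
from typing import Any, Callable, List


def public_tool_methods(tools: Any) -> List[Callable[..., Any]]:
    """Return the bound public methods of `tools`, sorted by name."""
    return [
        method
        for name, method in inspect.getmembers(tools, predicate=inspect.ismethod)
        if not name.startswith("_")
    ]


class DemoTools:
    """Stand-in for HuggingFaceTools, used only for this illustration."""

    def search_repos(self, query: str) -> str:
        return f"searched {query}"

    def get_repo_info(self, repo_id: str) -> str:
        return f"info for {repo_id}"

    def _private_helper(self) -> None:
        pass


tool_names = [m.__name__ for m in public_tool_methods(DemoTools())]
print(tool_names)
```

In `main`, a loop such as `for tool in public_tool_methods(hf_tools): mcp.add_tool(tool)` would then take the place of the individual calls. The trade-off is that every public method becomes a tool, so internal helpers would need an underscore prefix.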

.gemini/extensions/huggingface/src/huggingface/tools.py ADDED

	@@ -0,0 +1,156 @@

+import os
+import json
+from huggingface_hub import HfApi, hf_hub_download, snapshot_download, InferenceClient
+class HuggingFaceTools:
+    def __init__(self, token=None):
+        self.token = token or os.environ.get("HUGGINGFACE_HUB_TOKEN") or os.environ.get("HUGGINGFACE_ACCESS_KEY")
+        self.api = HfApi(token=self.token)
+        self.client = InferenceClient(token=self.token)
+    def search_repos(self, query: str, repo_type: str = "model", limit: int = 5):
+        """Search for repositories on the Hugging Face Hub."""
+        if repo_type == "model":
+            repos = self.api.list_models(search=query, limit=limit)
+        elif repo_type == "dataset":
+            repos = self.api.list_datasets(search=query, limit=limit)
+        elif repo_type == "space":
+            repos = self.api.list_spaces(search=query, limit=limit)
+        else:
+            return f"Invalid repo_type: {repo_type}"
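+        # Recent huggingface_hub releases expose `last_modified` (a datetime, possibly None) in place of `lastModified`.
+        return [{"id": r.id, "author": r.author, "lastModified": r.last_modified.isoformat() if r.last_modified else None} for r in repos]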
+    def get_repo_info(self, repo_id: str, repo_type: str = "model"):
+        """Get detailed information about a repository."""
+        if repo_type == "model":
+            return self.api.model_info(repo_id)
+        elif repo_type == "dataset":
+            return self.api.dataset_info(repo_id)
+        elif repo_type == "space":
+            return self.api.space_info(repo_id)
+        return "Invalid repo_type"
+    def download_repo(self, repo_id: str, repo_type: str = "model", local_dir: str = None):
+        """Download a whole repository."""
+        return snapshot_download(repo_id=repo_id, repo_type=repo_type, local_dir=local_dir, token=self.token)
+    def query_knowledge_base(self, query: str):
+        """Query the EbookBuilder knowledge base for information."""
+        kb_path = os.path.join(os.path.dirname(__file__), "../../KNOWLEDGE_BASE.md")
+        if not os.path.exists(kb_path):
+            return "Knowledge base not found."
+        with open(kb_path, "r", encoding="utf-8") as f:
+            content = f.read()
+            if query.lower() in content.lower():
+                return f"Found information in KB: {content[:1000]}..."
+            return "No matching information found in knowledge base."
+    # --- Upgraded Agents ---
+    def meta_orchestrator(self, task: str):
+        """The Meta Agent: Analyzes a complex request and maps it to specific agent tools."""
+        prompt = f"Given the task: '{task}', identify the sequence of tools needed from the following list: [search_repos, get_repo_info, fact_check_text, generate_illustration_prompt, synthesize_audiobook_chapter, generate_marketing_package, localize_content, legal_copyright_scan]. Return a JSON plan."
+        # Use InferenceClient to act as a planner
+        response = self.client.text_generation(prompt, model="meta-llama/Meta-Llama-3-8B-Instruct", max_new_tokens=200)
+        return {"task": task, "suggested_plan": response}
+    def fact_check_text(self, text: str):
+        """Research Agent: Validates claims using LLM-based verification."""
+        prompt = f"Fact check the following text and identify any potential inaccuracies: '{text}'"
+        response = self.client.text_generation(prompt, model="meta-llama/Meta-Llama-3-8B-Instruct", max_new_tokens=300)
+        return {"status": "Complete", "analysis": response}
+    def generate_illustration_prompt(self, chapter_text: str):
+        """Illustrator Agent: Refines a narrative description into a high-quality Diffusion prompt."""
+        prompt = f"Convert this text into a detailed prompt for an image generator (like Stable Diffusion): '{chapter_text[:500]}'"
+        response = self.client.text_generation(prompt, model="meta-llama/Meta-Llama-3-8B-Instruct", max_new_tokens=150)
+        return {"image_prompt": response}
+    def synthesize_audiobook_chapter(self, text: str, voice_id: str = "default"):
+        """Audiobook Agent: Prepares audio synthesis parameters."""
+        # Note: Real TTS often requires binary output handling; this prepares the request metadata
+        return {
+            "status": "Ready",
+            "voice_profile": voice_id,
+            "text_preview": text[:100] + "...",
+            "recommended_model": "facebook/parler-tts-mini-v1"
+        }
+    def generate_marketing_package(self, title: str, summary: str):
+        """Marketing Agent: Generates copy for multiple platforms."""
+        prompt = f"Create an Amazon blurb and 3 social media posts for a book titled '{title}' with this summary: '{summary}'"
+        response = self.client.text_generation(prompt, model="meta-llama/Meta-Llama-3-8B-Instruct", max_new_tokens=400)
+        return {"marketing_materials": response}
+    def audit_project_costs(self, tokens_used: int):
+        """Auditor Agent: Financial tracking."""
+        return {
+            "tokens_total": tokens_used,
+            "estimated_spend_usd": tokens_used * 0.000002,  # hard-coded flat per-token rate
+            "currency": "USD",
+            "status": "Audit Complete"
+        }
+    def request_human_review(self, content_id: str, content_preview: str):
+        """Review Agent: Checkpoint."""
+        return {
+            "review_endpoint": "https://huggingface.co/spaces/Brettapps/EbookBuilder/review",
+            "content_id": content_id,
+            "status": "WAITING_FOR_USER"
+        }
+    def localize_content(self, text: str, target_lang: str):
+        """Localization Agent: Real-time translation."""
+        prompt = f"Translate the following text to {target_lang}: '{text}'"
+        response = self.client.text_generation(prompt, model="meta-llama/Meta-Llama-3-8B-Instruct", max_new_tokens=500)
+        return {"target_lang": target_lang, "translation": response}
+    def legal_copyright_scan(self, text: str):
+        """Legal Guard: Risk analysis."""
+        prompt = f"Analyze this text for potential copyright or trademark risks: '{text[:1000]}'"
+        response = self.client.text_generation(prompt, model="meta-llama/Meta-Llama-3-8B-Instruct", max_new_tokens=200)
+        return {"legal_report": response}
+    def create_character_companion(self, character_bio: str):
+        """Companion Agent: Chatbot system prompt generation."""
+        prompt = f"Create a system instruction for an AI to act as this character: '{character_bio}'"
+        response = self.client.text_generation(prompt, model="meta-llama/Meta-Llama-3-8B-Instruct", max_new_tokens=300)
+        return {"system_instruction": response}
+    def analyze_reader_sentiment(self, feedback: str):
+        """Analyst Agent: Sentiment analysis."""
+        result = self.client.text_classification(feedback, model="distilbert-base-uncased-finetuned-sst-2-english")
+        return {"sentiment_analysis": result}
+    def optimize_accessibility(self, text: str):
+        """Accessibility Agent: Content simplification and Alt-text."""
+        prompt = f"Summarize this text for high accessibility and suggest Alt-text for any implied visuals: '{text[:500]}'"
+        response = self.client.text_generation(prompt, model="meta-llama/Meta-Llama-3-8B-Instruct", max_new_tokens=200)
+        return {"accessibility_report": response}
+    def build_continuity_graph(self, manuscript: str):
+        """Continuity Agent: Entity extraction."""
+        prompt = f"Extract all main characters and their locations from this text: '{manuscript[:1000]}'"
+        response = self.client.text_generation(prompt, model="meta-llama/Meta-Llama-3-8B-Instruct", max_new_tokens=300)
+        return {"world_entities": response}
+    def perform_market_research(self, niche: str):
+        """Market Research Agent: Analyzes market trends, competitors, and audience for a specific book niche."""
+        prompt = f"""
+        Act as a Senior Market Research Analyst. Perform a deep dive into the ebook market for the niche: '{niche}'.
+        1. Identify the top 3 trending sub-genres or topics in this niche.
+        2. Analyze the 'typical' successful competitor (price point, cover style, length).
+        3. Define the core target audience (demographics, pain points, what they are looking for).
+        4. Suggest 3 high-traffic keywords for Amazon/Google.
+        Provide a structured report.
+        """
+        response = self.client.text_generation(prompt, model="meta-llama/Meta-Llama-3-8B-Instruct", max_new_tokens=800)
+        return {
+            "niche": niche,
+            "research_report": response,
+            "status": "Success",
+            "source": "Hugging Face Inference (Llama-3)"
+        }
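
As committed, `query_knowledge_base` returns the first 1000 characters of the file whenever the query appears anywhere in it. One possible refinement is to return only the sections that contain the query. The sketch below assumes the knowledge base keeps its `#`-style Markdown headings; `search_markdown_sections` is a hypothetical helper name, not part of the commit.

```python
import re
from typing import List


def search_markdown_sections(markdown: str, query: str, max_chars: int = 1000) -> List[str]:
    """Return the heading-delimited sections whose text contains `query`."""
    needle = query.strip().lower()
    if not needle:
        return []
    # Split just before every line that starts with one or more '#' and a space.
    sections = re.split(r"(?m)^(?=#+ )", markdown)
    return [
        section.strip()[:max_chars]
        for section in sections
        if needle in section.lower()
    ]
```

Matching is a case-insensitive substring test, the same rule the committed method uses, so results stay predictable without adding an index or an embedding model.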