πŸš€ Deployment Guide for Fara-7B Space

Quick Start

Your Hugging Face Space is ready to deploy! Follow these steps:

Step 1: Create a New Space on Hugging Face

  1. Go to huggingface.co/new-space
  2. Fill in the details:
    • Space name: fara-7b-chat (or any name you prefer)
    • License: MIT
    • Select SDK: Gradio
    • Space hardware: CPU Basic (free) - this is fine since we're using the Inference API
    • Visibility: Public or Private (your choice)
  3. Click Create Space

Step 2: Upload Your Files

You have two options:

Option A: Git Upload (Recommended)

```bash
# Navigate to your space folder
cd "c:/Users/Amir/OneDrive - Digital Health CRC Limited/Projects/url2md/fara-7b-space"

# Initialize git repository
git init
git add .
git commit -m "Initial commit: Fara-7B chat interface"

# Add your Hugging Face Space as remote
# Replace YOUR_USERNAME and YOUR_SPACE_NAME
git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME

# When prompted for credentials, use your HF username and an access token as the password
git push -u origin main
```

Option B: Web Upload

  1. In your newly created Space, click Files β†’ Add file β†’ Upload files
  2. Drag and drop these files:
    • app.py
    • requirements.txt
    • README.md
    • .gitignore
  3. Click Commit changes to main

Step 3: Add Your HuggingFace Token as a Secret

This is CRITICAL - the app won't work without this:

  1. In your Space, go to Settings (gear icon)
  2. Scroll to Variables and secrets
  3. Click New secret
  4. Enter:
    • Name: HF_TOKEN
    • Value: Your Hugging Face token
    • ⚠️ Make sure this is marked as Secret (not a variable)
  5. Click Save
  6. The Space will automatically rebuild
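Secrets are injected into the Space as environment variables, so app.py can read the token with `os.environ`. A minimal sketch of that startup check (the function name is illustrative, not taken from app.py):

```python
import os

def get_hf_token(env=os.environ):
    """Return the HF_TOKEN secret, raising a clear error when it is missing."""
    token = env.get("HF_TOKEN")
    if not token:
        raise RuntimeError(
            "HF_TOKEN not found - add it as a Secret under "
            "Settings -> Variables and secrets, then restart the Space"
        )
    return token
```

Unlike plain variables, secrets are kept private and never shown in the Space's UI, which is why the token must go in as a Secret.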

Step 4: Wait for Build

  • The Space will install dependencies and start
  • This usually takes 1-3 minutes
  • Watch the Logs tab to see progress
  • Once you see "Running on local URL", it's ready!

Step 5: Test Your Space

  1. Go to the App tab
  2. Try a test message: "Help me find a good coffee shop"
  3. You should see Fara-7B respond!

Troubleshooting

Error: "HF_TOKEN not found"

  • Make sure you added the token as a Secret, not a Variable
  • Restart the Space after adding the secret

Error: "Model not found"

  • Check if microsoft/Fara-7B is publicly available
  • Ensure your token has inference permissions

Error: "Rate limit exceeded"

  • You're using the free inference tier
  • Wait a few minutes and try again
  • Consider upgrading to Inference Endpoints for production use
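If you call the Inference API from your own scripts, rate-limit errors can be smoothed over with exponential backoff rather than manual waiting. A generic sketch (the helper name, attempt count, and delays are illustrative):

```python
import time

def with_retries(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run a zero-argument callable, retrying with exponential backoff.

    Useful around API calls that may raise on rate limiting (HTTP 429).
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts - surface the original error
            sleep(base_delay * (2 ** attempt))  # wait 1s, 2s, 4s, ...
```

In production you would catch only the specific rate-limit exception your client library raises, rather than bare `Exception`.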

Space is slow

  • The free CPU tier is sufficient since inference happens on HF servers
  • Response time depends on model inference, not your Space hardware

Optional Enhancements

1. Request GPU Hardware

If you want faster Space loading (not needed for inference):

  • Settings β†’ Hardware β†’ Select a GPU tier
  • Note: This costs money, and the Inference API is billed separately

2. Add Custom Examples

Edit app.py and add example buttons:

```python
gr.Examples(
    examples=[
        "Find Italian restaurants in Seattle",
        "Help me search for running shoes",
        "What's the process to book a hotel?",
    ],
    inputs=msg,  # msg is the chat input Textbox defined in app.py
)
```

3. Enable Analytics

  • Settings β†’ Enable visitor analytics
  • Track usage of your Space

Cost Breakdown

  • Space hosting: FREE (CPU Basic tier)
  • Inference API:
    • Free tier: Limited requests/day
    • Pro account: Higher request limits
    • Inference Endpoints: ~$0.60-1.00/hour for dedicated
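For budgeting a dedicated endpoint, the hourly rate above translates to a monthly cost roughly as follows (rates are this guide's estimates; check current Hugging Face pricing):

```python
def monthly_cost(hourly_rate, hours_per_day=24, days=30):
    """Rough monthly cost of an always-on Inference Endpoint."""
    return round(hourly_rate * hours_per_day * days, 2)

# At the guide's $0.60-1.00/hour range, an always-on endpoint
# comes to roughly $432-$720 per month.
```

Pausing the endpoint when idle reduces this proportionally, since billing is per hour of uptime.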

Next Steps

Once deployed, you can:

  1. Share your Space URL with others
  2. Embed it in websites
  3. Use the API endpoint programmatically
  4. Duplicate and customize for other models
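For embedding or programmatic calls you need the Space's direct public URL. Public Spaces are served from an `hf.space` subdomain; the exact normalization sketched here (owner and name joined by a dash, lowercased) is an assumption worth confirming in your Space's Embed dialog:

```python
def space_url(owner, space_name):
    """Direct URL of a public Space, assuming the hf.space subdomain scheme."""
    subdomain = f"{owner}-{space_name}".lower()
    return f"https://{subdomain}.hf.space"

# e.g. space_url("alice", "fara-7b-chat") -> "https://alice-fara-7b-chat.hf.space"
```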

Need Help?

Check the logs in your Space for detailed error messages, or refer to the Hugging Face Spaces documentation.

Your Space folder: c:/Users/Amir/OneDrive - Digital Health CRC Limited/Projects/url2md/fara-7b-space

Happy deploying! πŸš€