---
title: Llama-PHP
emoji: 🏆
colorFrom: green
colorTo: green
sdk: docker
pinned: true
short_description: Llama-PHP Demo
license: mit
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/64d4ca84887f55fb6ee86b87/AkUWiFa4keIpuIbQy2gFm.png
---

# 🏆 Llama PHP Demo


This Hugging Face Space demonstrates llama.php, a robust PHP wrapper for executing local Large Language Models using llama.cpp as the inference engine.

## 🌟 About llama.php

llama.php is a modular, productive PHP wrapper that lets you run Large Language Models completely offline. Its clean API resembles the OpenAI or Hugging Face clients but is 100% self-contained, bringing the power of LLMs to PHP applications without external service dependencies.
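Under the hood, a wrapper like this shells out to the llama.cpp CLI. The sketch below illustrates the pattern with a hypothetical `buildCommand()` helper; the binary path, model filename, and helper name are assumptions for illustration, not llama.php's actual API (the `-m`, `-p`, and `-n` flags are llama.cpp's own).

```php
<?php
// Sketch: build a safely escaped llama-cli invocation from PHP.
// buildCommand() and the paths are illustrative, not llama.php's real API.
function buildCommand(string $binary, string $model, string $prompt, int $maxTokens = 128): string
{
    // escapeshellarg() quotes each argument so prompt content
    // cannot break out of the shell command (injection-safe).
    return sprintf(
        '%s -m %s -p %s -n %d',
        escapeshellarg($binary),
        escapeshellarg($model),
        escapeshellarg($prompt),
        $maxTokens
    );
}

$cmd = buildCommand('./llama-cli', 'qwen3-0.6b-q4_k_m.gguf', 'Hello!');
// $output = shell_exec($cmd); // runs fully offline on CPU
```

Escaping every argument is what makes it safe to pass untrusted user prompts straight through to the shell.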

## ✨ Features Demonstrated

- **Local Inference**: Runs completely offline using CPU
- **GGUF Support**: Works with quantized models (Q4_K_M, Q5_K_S, etc.)
- **Chat Templates**: Includes templates for Qwen, Llama 3, Mistral, and more
- **Text Generation**: Generate responses to prompts
- **Embeddings**: Create vector embeddings from text
- **JSON Output**: Force structured JSON output with schema validation
- **Secure Execution**: Proper shell argument escaping to prevent injection
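The embedding vectors produced above are typically compared with cosine similarity. A minimal generic helper (not part of llama.php; shown only to illustrate what the vectors are for):

```php
<?php
// Cosine similarity between two equal-length embedding vectors.
// Generic illustration; llama.php may ship its own utilities.
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $na  = 0.0;
    $nb  = 0.0;
    foreach ($a as $i => $v) {
        $dot += $v * $b[$i];      // accumulate dot product
        $na  += $v * $v;          // squared norm of $a
        $nb  += $b[$i] * $b[$i];  // squared norm of $b
    }
    return $dot / (sqrt($na) * sqrt($nb));
}
```

Identical directions score 1.0 and orthogonal vectors score 0.0, which is what makes embeddings useful for semantic search.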

## 🚀 How to Use This Demo

  1. Text Generation: Enter a prompt in the text box and click "Generate"
  2. Chat Mode: Start a conversation with the model in the chat interface
  3. Embedding Demo: Convert text to vector embeddings
  4. JSON Mode: Generate structured JSON output based on a schema

Adjust parameters such as temperature, max tokens, and top-p to control generation behavior.
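These parameters correspond to llama.cpp CLI flags (`--temp`, `--top-p`, and `-n` for max tokens). A hedged sketch of assembling them; the `samplingFlags()` helper is illustrative, not llama.php's API:

```php
<?php
// Map the demo's sampling parameters to llama.cpp CLI flags.
// The flag names are llama.cpp's; this helper is illustrative only.
function samplingFlags(float $temp = 0.8, float $topP = 0.95, int $maxTokens = 256): string
{
    return sprintf('--temp %.2f --top-p %.2f -n %d', $temp, $topP, $maxTokens);
}
```

Lower temperatures make output more deterministic; `-n` caps how many tokens are generated per request.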

## ⚙️ Technical Details

- **PHP Version**: 8.2
- **Inference Engine**: llama.cpp
- **Model**: Qwen3-0.6B-Q4_K_M (quantized for efficient CPU inference)
- **Embedding Model**: Qwen3-Embedding-0.6B-Q4_K_M
- **Docker Base**: Custom image with PHP 8.2 and llama.cpp binaries
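An image with PHP 8.2 plus llama.cpp binaries can be assembled along these lines. This is only a sketch of the approach, not the Space's actual Dockerfile:

```dockerfile
# Sketch only: the Space's real image may differ.
FROM php:8.2-cli

# Build llama.cpp from source to obtain the CPU inference binaries
RUN apt-get update && apt-get install -y git build-essential cmake \
    && git clone https://github.com/ggerganov/llama.cpp /opt/llama.cpp \
    && cmake -S /opt/llama.cpp -B /opt/llama.cpp/build \
    && cmake --build /opt/llama.cpp/build --config Release

# Make llama-cli and friends available to the PHP process
ENV PATH="/opt/llama.cpp/build/bin:${PATH}"
```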

## 🤝 Credits

This demo is powered by llama.php created by Eduardo Nacimiento-García.

## 📜 License

This demo and the underlying llama.php library are released under the MIT License.


Note: Due to resource limitations on Hugging Face Spaces, generation might be slower than on dedicated hardware. The model runs entirely on CPU with limited context window size.