---
title: Llama-PHP
emoji: πŸ†
colorFrom: green
colorTo: green
sdk: docker
pinned: true
short_description: Llama-PHP Demo
license: mit
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/64d4ca84887f55fb6ee86b87/AkUWiFa4keIpuIbQy2gFm.png
---

# πŸ† Llama PHP Demo

![Llama PHP](https://img.shields.io/badge/PHP-8.1%2B-777BB4?style=flat-square&logo=php) ![License](https://img.shields.io/badge/License-MIT-green?style=flat-square)

This Hugging Face Space demonstrates **[llama.php](https://github.com/enacimie/llama-php)**, a robust PHP wrapper for running local Large Language Models with `llama.cpp` as the inference engine.

## 🌟 About llama.php

llama.php is a modular, productive PHP wrapper that lets you run Large Language Models completely offline. With a clean API similar to OpenAI's or Hugging Face's, but 100% self-contained, it brings the power of LLMs to PHP applications without external service dependencies.

## ✨ Features Demonstrated

- **Local Inference**: Runs completely offline on CPU
- **GGUF Support**: Works with quantized models (Q4_K_M, Q5_K_S, etc.)
- **Chat Templates**: Includes templates for Qwen, Llama 3, Mistral, and more
- **Text Generation**: Generate responses to prompts
- **Embeddings**: Create vector embeddings from text
- **JSON Output**: Force structured JSON output with schema validation
- **Secure Execution**: Proper shell argument escaping to prevent injection

## πŸš€ How to Use This Demo

1. **Text Generation**: Enter a prompt in the text box and click "Generate"
2. **Chat Mode**: Start a conversation with the model in the chat interface
3. **Embedding Demo**: Convert text to vector embeddings
4. **JSON Mode**: Generate structured JSON output based on a schema

Adjust parameters such as temperature, max tokens, and top-p to control generation behavior.
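The "Secure Execution" feature above can be illustrated with plain PHP: `escapeshellarg()` (a standard PHP function) wraps untrusted input in single quotes so a malicious prompt cannot break out of the command line. This is a minimal sketch only; the binary path, model path, and flags are illustrative assumptions, not llama.php's actual internals.

```php
<?php
// Sketch of safe command construction when shelling out to llama.cpp.
// Paths and flags below are illustrative; llama.php's real internals may differ.
function buildLlamaCommand(string $binary, string $model, string $prompt, int $maxTokens = 128): string
{
    return sprintf(
        '%s -m %s -p %s -n %d',
        escapeshellarg($binary),
        escapeshellarg($model),
        escapeshellarg($prompt), // quotes, semicolons, backticks are neutralized
        $maxTokens
    );
}

$cmd = buildLlamaCommand('./llama-cli', 'model.gguf', "a joke'; rm -rf ~ #");
// The injected "; rm -rf ~" stays inside a single-quoted argument
// and is never interpreted by the shell.
echo $cmd, PHP_EOL;
```

Passing every user-controlled value through `escapeshellarg()` before concatenation is what makes it safe to expose a prompt box like this demo's to the public.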
## βš™οΈ Technical Details - **PHP Version**: 8.2 - **Inference Engine**: llama.cpp - **Model**: Qwen3-0.6B-Q4_K_M (quantized for efficient CPU inference) - **Embedding Model**: Qwen3-Embedding-0.6B-Q4_K_M - **Docker Base**: Custom image with PHP 8.2 and llama.cpp binaries ## 🀝 Credits This demo is powered by **[llama.php](https://github.com/enacimie/llama-php)** created by Eduardo Nacimiento-GarcΓ­a. - **GitHub Repository**: https://github.com/enacimie/llama-php - **Original Models**: Qwen3 series from Alibaba - **Inference Backend**: https://github.com/ggerganov/llama.cpp ## πŸ“œ License This demo and the underlying llama.php library are released under the MIT License. --- *Note: Due to resource limitations on Hugging Face Spaces, generation might be slower than on dedicated hardware. The model runs entirely on CPU with limited context window size.*