---
license: mit
tags:
- amop-optimized
- gguf
---
# AMOP-Optimized GGUF Model: {repo_name}
This model was automatically optimized for CPU inference using the **Adaptive Model Optimization Pipeline (AMOP)**.
- **Base Model:** [{model_id}](https://huggingface.co/{model_id})
- **Optimization Date:** {optimization_date}
## Optimization Details
The following AMOP GGUF pipeline stages were applied:
- **GGUF Conversion & Quantization:** Enabled (Strategy: {quant_type})
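The AMOP pipeline internals are not included in this card. As a rough illustration only, a GGUF conversion and quantization step of this kind is commonly performed with the `llama.cpp` tooling, along these lines (the model path and the `Q4_K_M` quantization type are placeholder assumptions; the actual AMOP stage may differ):

```shell
# Convert the original Hugging Face checkpoint to GGUF at full precision
# (convert_hf_to_gguf.py ships with the llama.cpp repository)
python convert_hf_to_gguf.py /path/to/base-model --outfile model-f16.gguf

# Quantize the GGUF file to the chosen strategy, e.g. Q4_K_M
./llama-quantize model-f16.gguf model.gguf Q4_K_M
```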
## How to Use
This model is in GGUF format and can be run with libraries like `llama-cpp-python`.
First, install the necessary libraries:
```bash
pip install llama-cpp-python
```
Then, use the following Python code to run inference:
```python
from llama_cpp import Llama
from huggingface_hub import hf_hub_download
# Download the GGUF model from the Hub
model_path = hf_hub_download(
    repo_id="{repo_id}",
    filename="model.gguf",  # or the specific GGUF file name in this repo
)

# Instantiate the model
llm = Llama(
    model_path=model_path,
    n_ctx=2048,  # context window size in tokens
)

# Run inference
prompt = "The future of AI is"
output = llm(
    f"Q: {prompt} A: ",  # or your preferred prompt format
    max_tokens=50,
    stop=["Q:", "\n"],
    echo=True,
)

# output is a dict; the generated text is under output["choices"][0]["text"]
print(output)
```
## AMOP Pipeline Log
<details>
<summary>Click to expand</summary>

```
{pipeline_log}
```
</details>