---
license: mit
language:
- zh
- en
tags:
- llama.cpp
- TQ3
- quantization
- Windows
- NVIDIA
- GGUF
---
| |
# llama.cpp-TQ3 专用推理环境 (Windows/NVIDIA版)
# llama.cpp-TQ3 Inference Environment (Windows/NVIDIA Edition)
|
|
## 简介 | Intro
这是一个预编译的 `llama.cpp` 环境,专为 **TQ3 量化模型**设计,支持 NVIDIA 显卡在 Windows 上一键运行。
A pre-built `llama.cpp` environment optimized for **TQ3 quantized models**, enabling one-click inference on NVIDIA GPUs for Windows users.
|
|
## 核心特性 | Key Features
✅ 原生支持 TQ3 格式(普通 llama.cpp 无法运行)
✅ 已编译 CUDA 加速,专为 NVIDIA 显卡优化
✅ 免配置依赖,解压即用,不包含模型权重
✅ 支持命令行与 Web 服务两种运行方式


✅ Native TQ3 support (runs TQ3 models that standard llama.cpp cannot)
✅ Pre-compiled with CUDA acceleration, optimized for NVIDIA GPUs
✅ No dependency setup required — just extract and run (model weights not included)
✅ Supports both CLI and Web server modes
|
|
## 使用方法 | Usage
1. **下载解压**:将文件解压到纯英文路径
2. **放入模型**:把 `.tq3.gguf` 格式的模型放到目录下
3. **启动运行**:使用 `llama-cli.exe` 或 `llama-server.exe` 加载模型


1. **Download & Extract**: Unzip to a folder with an English-only path
2. **Add Model**: Place your `.tq3.gguf` model in the same directory
3. **Run**: Use `llama-cli.exe` or `llama-server.exe` to start inference
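The steps above can be sketched as typical launch commands. The model filename, layer count, and port below are placeholder assumptions; the flags follow upstream llama.cpp conventions (`-m`, `-ngl`, `--host`, `--port`), which this build is assumed to share:

```shell
:: CLI mode: load the model, offload all layers to the NVIDIA GPU, run a single prompt
llama-cli.exe -m your-model.tq3.gguf -ngl 99 -p "Hello"

:: Server mode: serve an HTTP API on localhost:8080
llama-server.exe -m your-model.tq3.gguf -ngl 99 --host 127.0.0.1 --port 8080
```

If the server starts successfully, upstream llama.cpp's `llama-server` exposes an OpenAI-compatible endpoint at `/v1/chat/completions`, so standard OpenAI-style clients should be able to connect to `http://127.0.0.1:8080` (assuming this fork keeps that behavior).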
|
|
## 注意事项 | Notes
- 仅支持 **NVIDIA 显卡**,AMD 显卡暂不兼容
- 本项目不包含任何模型文件,请自行获取并遵守对应开源协议


- **NVIDIA-only**: not compatible with AMD GPUs at this time
- This package does not include any model weights. Please obtain them yourself and comply with their respective open-source licenses.
|
|
## 致谢 | Credits
- 核心源码 | Core source: turbo-tan/llama.cpp-tq3
- TQ3 量化 | TQ3 quantization: YTan2000
|
|