
LLaMA Turrera

This model has been trained on the "turras" of "El Turrero", which can be found at: https://turrero.vercel.app/

Credits for the "turras": https://twitter.com/Recuenco
Credits for the sysadmin of the site: https://twitter.com/k4rliky

This model was trained to provide a true fine-tune of an LLM that goes beyond the capabilities of the GPTs. A GPT that uses the content of the "turras" to chat with the user is also available at: https://chat.openai.com/g/g-nam1wBUJm-turrero

"La LLaMA Turrera" can produce "turras" in the style of "El Turrero" and can draw on other "turras" to give further feedback. The model was trained with non-profit intent, for fun, and to serve as a base for further development.

To learn about AI alignment, future versions of this model will be trained to resist malicious use.

The model is meant to be used with the Hugging Face transformers API in Python; a version for llama.cpp is under evaluation.
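
As a rough illustration of the transformers route, the sketch below loads the checkpoint and samples a continuation. It is a minimal sketch, not the author's reference usage: the helper names `generation_settings` and `generate_turra`, the sampling values, and the use of `device_map="auto"` (which needs the `accelerate` package) are all assumptions, and the model card does not document a prompt format.

```python
MODEL_ID = "DaertML/LLaMA-Turrera-7B"


def generation_settings(max_new_tokens=512, temperature=0.5):
    """Return sampling settings for generate().

    The defaults are illustrative assumptions, not values recommended
    by the model card.
    """
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,  # sample so that temperature has an effect
        "temperature": temperature,
    }


def generate_turra(prompt, model_id=MODEL_ID, **overrides):
    """Hypothetical helper: load the model and continue `prompt`."""
    # Heavy dependencies are imported here so the module itself can be
    # imported without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # the published weights are BF16
        device_map="auto",           # assumption: accelerate is installed
    )

    # Tokenize the prompt and move it to the device holding the model.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    settings = {**generation_settings(), **overrides}

    with torch.no_grad():
        output_ids = model.generate(**inputs, **settings)

    # The decoded text includes the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_turra("Once upon a time,")` downloads the weights on first use and returns the prompt plus the sampled continuation; a 7B model in BF16 needs roughly 14 GB of memory for the weights alone.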

Format: Safetensors. Model size: 7B params. Tensor type: BF16.