馃 Lotus-12B (4-bit Quantized)

This is a 4-bit quantized version of Lotus-12B, converted to safetensors and compressed using `llmcompressor`.

Lotus-12B is a GPT-NeoX 12B model fine-tuned on 2.5 GB of light novels, erotica, annotated literature, and public-domain conversations, for the purpose of generating novel-like fictional text and conversations.

## Quantization Details

This model was quantized using the One-Shot GPTQ method to reduce memory footprint and improve inference speed while maintaining generation quality.

| Setting | Value |
|---|---|
| Method | GPTQ (W4A16) |
| Group Size | 128 |
| Dampening Fraction | 0.01 |
| Calibration Dataset | `neuralmagic/LLM_compression_calibration` (512 samples) |
| Ignored Modules | `lm_head` |

> **Note:** The `lm_head` was kept in full precision to ensure stability in text generation.
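For intuition, here is a minimal pure-Python sketch of group-wise symmetric 4-bit quantization with group size 128. This illustrates only the W4A16 storage layout; the actual GPTQ procedure in `llmcompressor` additionally uses calibration data (and the dampening fraction above) to choose rounding that minimizes layer-wise error.

```python
def quantize_w4_groupwise(weights, group_size=128):
    # Each group of `group_size` weights shares one scale; values are mapped
    # to signed 4-bit integers in [-8, 7]. A toy sketch of the W4A16 layout,
    # not llmcompressor's actual GPTQ algorithm.
    quants, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scale = max(abs(w) for w in group) / 7.0 or 1.0  # avoid a zero scale
        scales.append(scale)
        quants.extend(max(-8, min(7, round(w / scale))) for w in group)
    return quants, scales

def dequantize(quants, scales, group_size=128):
    # Recover approximate FP weights: one shared scale per group.
    return [q * scales[i // group_size] for i, q in enumerate(quants)]

weights = [i / 100 for i in range(-300, 300)]       # 600 example weights
q, s = quantize_w4_groupwise(weights)
recon = dequantize(q, s)
```

With group size 128, 600 weights need only 5 scales; the per-weight error is bounded by half a quantization step of that group's scale.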

## Model Description

The base model used for fine-tuning is Pythia 12B Deduped, which is a 12 billion parameter auto-regressive language model trained on The Pile.

## Usage

### vLLM (Recommended)

This model is optimized for vLLM, which automatically detects the compressed tensors config.

```python
from vllm import LLM, SamplingParams

# Load the quantized model; vLLM reads the compressed-tensors config automatically
llm = LLM(
    model="Ryex/Lotus-12B-GPTQ",
    trust_remote_code=True,
    max_model_len=2048,
)

# Annotated prompt (see "Training Data & Annotative Prompting" below)
prompt = '''[ Title: The Dunwich Horror; Author: H. P. Lovecraft; Genre: Horror ]
***
When a traveler'''

# Generate
params = SamplingParams(temperature=1.0, top_p=0.9, max_tokens=256)
outputs = llm.generate([prompt], sampling_params=params)

print(outputs[0].outputs[0].text)
```

### Transformers

You can also run this model with `transformers`, provided `auto-gptq` or `compressed-tensors` is installed.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Ryex/Lotus-12B-GPTQ"

# Load the quantized checkpoint and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = '''[ Title: The Dunwich Horror; Author: H. P. Lovecraft; Genre: Horror ]
***
When a traveler'''

input_ids = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    input_ids,
    do_sample=True,
    temperature=1.0,
    top_p=0.9,
    repetition_penalty=1.2,
    max_new_tokens=200,
    pad_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(output[0]))
```

## Training Data & Annotative Prompting

The fine-tuning data was gathered from various sources, such as Project Gutenberg. The annotated fiction dataset prepends metadata tags that steer generation toward a particular style. Here is an example prompt showing how to use the annotations:

```
[ Title: The Dunwich Horror; Author: H. P. Lovecraft; Genre: Horror; Tags: 3rdperson, scary; Style: Dark ]
***
When a traveler in north central Massachusetts takes the wrong fork...
```

And for conversations, which were scraped from our Discord server and publicly available subreddits:

```
[ Title: (2019) Cars getting transported on an open deck catch on fire after salty water shorts their batteries; Genre: CatastrophicFailure ]
***
Anonymous: Daaaaaamn try explaining that one to the owners
EDIT: who keeps reposting this for my comment to get 3k upvotes?
Anonymous: "Your car caught fire from some water"
Irythros: Lol, I wonder if any compensation was in order
Anonymous: Almost all of the carriers offer insurance but it isn't cheap. I guarantee most of those owners declined the insurance.
```

The annotations can be mixed and matched to help generate towards a specific style.
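Since the annotation line is just a semicolon-separated list of `Key: Value` pairs in brackets, prompts can be assembled programmatically. Below is a small sketch; `annotation_header` is a hypothetical helper (not part of this repo) that omits any fields you leave out.

```python
def annotation_header(**fields):
    # Hypothetical helper: assemble the bracketed metadata line the model
    # was trained on. Fields not supplied are simply omitted.
    order = ["Title", "Author", "Genre", "Tags", "Style"]
    parts = [f"{key}: {fields[key.lower()]}" for key in order if key.lower() in fields]
    return "[ " + "; ".join(parts) + " ]"

prompt = (
    annotation_header(
        title="The Dunwich Horror",
        author="H. P. Lovecraft",
        genre="Horror",
        tags="3rdperson, scary",
        style="Dark",
    )
    + "\n***\nWhen a traveler"
)
print(prompt)
```

Passing only a subset of fields (e.g. just `title` and `genre`) produces a shorter header, matching how the annotations can be mixed and matched.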

## Downstream Uses

This model can be used for entertainment purposes and as a creative writing assistant for fiction writers and chatbots.

## Team Members and Acknowledgements

This project would not have been possible without the work done by EleutherAI. Thank you!

In order to reach us, you can join our Discord server.

