---
language:
- en
license: apache-2.0
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- experimental
---
# PingVortexLM-20M

A small experimental language model based on the LLaMA architecture, trained on a custom English dataset of around 100M tokens.
This model is just an experiment; it is not capable of basic English conversation.

Built by [PingVortex Labs](https://github.com/PingVortexLabs).

---
## Model Details
+ **Parameters:** 20M
+ **Context length:** 8192 tokens
+ **Language:** English only
+ **Format:** ChatML
+ **License:** Apache 2.0
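
The numbers above are easy to sanity-check from the checkpoint itself. A minimal sketch, assuming the standard `transformers` loading path and the usual LLaMA config field name (`max_position_embeddings`) for the context window:

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_path = "pvlabs/PingVortexLM-20M"

# The config alone is enough to check the advertised context length.
config = AutoConfig.from_pretrained(model_path)
print("context length:", config.max_position_embeddings)  # expect 8192

# Loading the weights lets us count parameters directly.
model = AutoModelForCausalLM.from_pretrained(model_path)
print(f"parameters: {model.num_parameters() / 1e6:.1f}M")  # expect ~20M
```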

---
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_path = "pvlabs/PingVortexLM-20M"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
model.eval()

# ChatML prompt; the assistant turn is left open for the model to complete.
prompt = "<|im_start|>user\nWhat is the capital of France?<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, not the prompt.
generated = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False)
print(generated)
```
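
Note that `float16` weights are mainly a GPU convenience; half-precision inference on CPU can be slow or unsupported for some ops. If a GPU is available, a small addition (these device-placement lines are my suggestion, not part of the original snippet):

```python
# Move model and inputs to GPU when present; keep CPU as the fallback.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
inputs = inputs.to(device)
```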

---
## Prompt Format (ChatML)
The model uses the standard ChatML format:
```
<|im_start|>user
Your message here<|im_end|>
<|im_start|>assistant
```
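
For multi-turn conversations, each turn is wrapped in the same `<|im_start|>`/`<|im_end|>` markers, with the final assistant turn left open for the model to complete. A small helper, shown as a sketch (the `build_chatml_prompt` name is mine, not an official API):

```python
def build_chatml_prompt(turns):
    """Render (role, content) pairs as a ChatML prompt,
    leaving the assistant turn open for generation."""
    prompt = ""
    for role, content in turns:
        prompt += f"<|im_start|>{role}\n{content}<|im_end|>\n"
    return prompt + "<|im_start|>assistant\n"

prompt = build_chatml_prompt([
    ("user", "What is the capital of France?"),
    ("assistant", "Paris."),
    ("user", "And of Germany?"),
])
```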

---
*Made by [PingVortex](https://pingvortex.com).*