Model D2
This model is trained with a causal language modeling objective, which restricts the model to the tokens that precede the current position in the input sequence. Unlike masked language modeling in a sequence-to-sequence model, causal language modeling focuses on predicting the single next token. It does this by conditioning on all previous tokens in the sequence, so the model has access to prior tokens and never to future ones.
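The restriction to prior tokens can be illustrated with a small, library-free sketch. The helper names `causal_mask` and `next_token_pairs` are hypothetical and chosen for illustration; they are not part of this model's code or of the transformers API.

```python
def causal_mask(seq_len):
    """Build a seq_len x seq_len mask: entry [i][j] is True when
    position i is allowed to look at position j (that is, j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def next_token_pairs(tokens):
    """Pair every non-empty prefix with the single token that follows it,
    which is the prediction target under a causal objective."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


if __name__ == "__main__":
    for row in causal_mask(4):
        # Print 1 for visible positions and 0 for hidden (future) ones.
        print(" ".join("1" if visible else "0" for visible in row))
    for context, target in next_token_pairs(["a", "b", "c", "d"]):
        print(context, "->", target)
```

The lower-triangular mask is what keeps future positions hidden, and each (prefix, next token) pair corresponds to one prediction the model is trained to make.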
Model Details
When performing experiments with a decoder-only model, we selected BLOOM as the architecture.
Model Description
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
- Developed by: Ronny Paul
- Model type: BLOOM
- Language(s) (NLP): Northern Sami
- Finetuned from model: TurkuNLP/gpt3-finnish-xl
Uses
The model serves as a foundational model and is used in plagiarism detection. It can also be fine-tuned on downstream tasks with Northern Sami data.
Dataset
The model was trained on the rpa020/SALT dataset. The formatted dataset, named the SAmi LLM Token (SALT) dataset, contains around 22 million tokens and approximately 2 million sentences, so each sentence averages around ten tokens. The dataset is designed to support the pretraining phase of foundational model development.
How to Get Started with the Model
from transformers import BloomForCausalLM
model = BloomForCausalLM.from_pretrained("rpa020/D2")
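Once the model is loaded, generating text might look like the following minimal sketch. It assumes the rpa020/D2 repository also ships a tokenizer that `AutoTokenizer` can load; the helper name `generate_text` and the sampling settings are illustrative choices, not values documented for this model.

```python
def generate_text(prompt, max_new_tokens=50, temperature=0.5):
    """Generate a continuation of `prompt` with rpa020/D2.

    Imports are done lazily so that defining this helper does not
    require transformers to be loaded; calling it downloads the model.
    """
    from transformers import AutoTokenizer, BloomForCausalLM

    # Assumption: the repository contains tokenizer files alongside the weights.
    tokenizer = AutoTokenizer.from_pretrained("rpa020/D2")
    model = BloomForCausalLM.from_pretrained("rpa020/D2")

    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")

    # Sample a continuation; the settings here are illustrative only.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=temperature,
    )

    # Decode the first (and only) generated sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Sámegiella")` would then return the prompt followed by a sampled continuation. Because sampling is enabled, the output differs from run to run.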
Performance
- CE loss: 4.27
- Perplexity: 71.6
- Self-BLEU: 0.32
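The first two figures are related if perplexity is taken to be the exponential of the cross-entropy loss measured in nats. That is the usual convention, but the card does not state it, so treat it as an assumption. A short check:

```python
import math

# Cross-entropy loss reported above (assumed to use the natural logarithm).
ce_loss = 4.27

# Under that assumption, perplexity is exp(loss).
perplexity = math.exp(ce_loss)

print(f"perplexity = {perplexity:.1f}")
```

This gives roughly 71.5, which agrees with the reported 71.6 to within the rounding of the loss to two decimal places.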