# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("probabl-ai/ScikitLLM-Model")
model = AutoModelForCausalLM.from_pretrained("probabl-ai/ScikitLLM-Model")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
ScikitLLM is an LLM fine-tuned on writing references and code for the Scikit-Learn documentation.
Features of ScikitLLM include:
- Support for RAG (three chunks)
- Sources and quotations using a modified version of the wiki syntax ("")
- Code samples and examples based on the code quoted in the chunks.
- Expanded knowledge/familiarity with the Scikit-Learn concepts and documentation.
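The RAG feature above expects three retrieved chunks alongside the question. The model card does not spell out the exact prompt layout, so the following is only a minimal sketch: the helper name `build_rag_prompt` and the `Question:` / `Chunk N:` labels are hypothetical and should be replaced with whatever format the model was actually trained on.

```python
from typing import Sequence


def build_rag_prompt(question: str, chunks: Sequence[str]) -> str:
    """Combine a question with exactly three retrieved chunks.

    The layout is an assumption for illustration, not the documented
    ScikitLLM prompt format.
    """
    if len(chunks) != 3:
        raise ValueError(f"expected 3 chunks, got {len(chunks)}")
    parts = [f"Question: {question}", ""]
    for i, chunk in enumerate(chunks, start=1):
        # Label each chunk so an answer can point back to its source.
        parts.append(f"Chunk {i}:")
        parts.append(chunk.strip())
        parts.append("")
    return "\n".join(parts).rstrip()


# Placeholder chunks; in practice these come from a retriever.
prompt = build_rag_prompt(
    "How do I scale features?",
    ["first retrieved passage", "second retrieved passage", "third retrieved passage"],
)
messages = [{"role": "user", "content": prompt}]
```

The resulting `messages` list has the same shape as the one passed to `tokenizer.apply_chat_template` in the loading snippet.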
Training
ScikitLLM is based on Mistral-OpenHermes 7B, an existing fine-tune of Mistral 7B. OpenHermes already includes many of the capabilities needed for this use case, including instruction tuning, source analysis, and native support for the ChatML syntax.
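To make the ChatML syntax concrete, here is a small sketch that renders a message list into the generic ChatML layout (`<|im_start|>role`, content, `<|im_end|>`). The helper `to_chatml` is hypothetical and written only for illustration; the chat template shipped with the tokenizer remains the authoritative source and may differ in details.

```python
from typing import Dict, List


def to_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render messages in the generic ChatML layout (illustrative only)."""
    text = ""
    for message in messages:
        # Each turn is wrapped in <|im_start|> ... <|im_end|> markers.
        text += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        text += "<|im_start|>assistant\n"
    return text


rendered = to_chatml([{"role": "user", "content": "Who are you?"}])
print(rendered)
```

In normal use, `tokenizer.apply_chat_template` performs this step, so a hand-written formatter like this is mainly useful for inspecting or debugging prompts.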
As a fine-tune of a fine-tune, ScikitLLM has been trained with a lower learning rate than is commonly used in fine-tuning projects.
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="probabl-ai/ScikitLLM-Model")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)