---
license: apache-2.0
language: en
tags:
- text-generation
- causal-lm
- fine-tuning
- unsupervised
---

# olabs-ai/reflection_model

## Model Description

`olabs-ai/reflection_model` is a language model fine-tuned from [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) using LoRA (Low-Rank Adaptation). It is designed for text generation and can be used in applications such as conversational agents and content creation.

## Model Details

- **Base Model**: Meta-Llama-3.1-8B-Instruct
- **Fine-Tuning Method**: LoRA (Low-Rank Adaptation; see the sketch after this list)
- **Architecture**: LlamaForCausalLM
- **Number of Parameters**: 8 billion (base model)
- **Training Data**: [Details about the training data used for fine-tuning, if available]

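For context on the fine-tuning method: LoRA freezes the base weights and learns a low-rank update, so a frozen linear map `W x` becomes `W x + (alpha / r) * B(A x)` with small trainable matrices `A` and `B` of rank `r`. The following is a minimal illustrative sketch of that idea in PyTorch, not the actual training code used for this model:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer augmented with a trainable low-rank update."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # base weights stay frozen
        # Low-rank factors: only these r * (in + out) parameters are trained.
        self.lora_A = nn.Linear(base.in_features, r, bias=False)
        self.lora_B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)  # adapter starts as a no-op
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))
```

Because `B` is initialized to zero, the adapter contributes nothing at first and the model initially behaves exactly like the base model.
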
## Usage

To use this model, you need the `transformers` and `unsloth` libraries installed.

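Both can typically be installed from PyPI (note that `unsloth` assumes a CUDA-capable NVIDIA GPU):

```bash
pip install unsloth transformers
```

You can then load the model and tokenizer as follows:
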
```python
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the fine-tuned model; pointing Unsloth at this repository resolves
# the base weights and the LoRA adapter automatically.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="olabs-ai/reflection_model",
    max_seq_length=2048,   # adjust to your context-length needs
    load_in_4bit=True,     # optional 4-bit quantization to reduce memory use
)

# Enable Unsloth's optimized inference mode
FastLanguageModel.for_inference(model)

# Prepare inputs on the same device as the model
custom_prompt = "What is a famous tall tower in Paris?"
inputs = tokenizer([custom_prompt], return_tensors="pt").to("cuda")

# Stream tokens to stdout as they are generated
text_streamer = TextStreamer(tokenizer)
outputs = model.generate(**inputs, streamer=text_streamer, max_new_tokens=1000)
```
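
If you prefer not to depend on `unsloth`, the adapter can usually also be applied with the `peft` library on top of the base model. This is a minimal sketch that assumes this repository contains a standard PEFT adapter and that the base weights are the `meta-llama` release named above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Assumption: the base model is the meta-llama release named in this card.
base_model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA adapter from this repository to the frozen base weights.
model = PeftModel.from_pretrained(base_model, "olabs-ai/reflection_model")

inputs = tokenizer(["What is a famous tall tower in Paris?"], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since the base model is an Instruct variant, formatting prompts with `tokenizer.apply_chat_template` will generally produce better results than passing raw strings.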