---
license: apache-2.0
language: en
tags:
- text-generation
- causal-lm
- fine-tuning
- unsupervised
---

# Model Name: olabs-ai/reflection_model

## Model Description

`olabs-ai/reflection_model` is a language model based on [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), fine-tuned with LoRA (Low-Rank Adaptation) to improve performance on specific tasks. It is designed for text generation and can be used in applications such as conversational agents and content creation.

## Model Details

- **Base Model**: Meta-Llama-3.1-8B-Instruct
- **Fine-Tuning Method**: LoRA (see the sketch after this list)
- **Architecture**: LlamaForCausalLM
- **Number of Parameters**: 8 billion (base model)
- **Training Data**: [Details about the training data used for fine-tuning, if available]
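
Because the fine-tuning method is LoRA, the published weights are a small low-rank adapter over the frozen base model rather than a full 8B-parameter copy. A minimal sketch of the LoRA update; the rank, scaling, and matrix sizes below are hypothetical placeholders, not the values in this model's adapter config:

```python
import torch

# LoRA replaces a frozen weight W with W + (alpha / r) * (B @ A),
# where only the small factors A and B are trained.
# r, alpha, and the dimensions here are illustrative placeholders.
out_dim, in_dim, r, alpha = 4096, 4096, 16, 32
W = torch.randn(out_dim, in_dim)  # frozen base weight
A = torch.randn(r, in_dim)        # trained low-rank factor A
B = torch.zeros(out_dim, r)       # trained low-rank factor B (zero-initialized)
W_adapted = W + (alpha / r) * (B @ A)
```

Since only `A` and `B` are stored, a LoRA checkpoint is typically orders of magnitude smaller than the base model's weights.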

## Usage

To use this model you need the `transformers` and `unsloth` libraries installed (for example, `pip install transformers unsloth`). You can then load the model and tokenizer as follows:

```python
from transformers import TextStreamer
from unsloth import FastLanguageModel

# FastLanguageModel.from_pretrained returns both the model and the tokenizer.
# If model_name points at a LoRA adapter repo, Unsloth loads the base model
# and applies the adapter automatically; a local adapter directory also works.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="olabs-ai/reflection_model",
    max_seq_length=2048,
    dtype=None,         # auto-detect the best dtype for your GPU
    load_in_4bit=True,  # optional 4-bit quantization to reduce VRAM use
)

# Enable Unsloth's optimized inference mode before generating
FastLanguageModel.for_inference(model)

# Prepare inputs
custom_prompt = "What is a famous tall tower in Paris?"
inputs = tokenizer([custom_prompt], return_tensors="pt").to("cuda")

# Stream tokens to stdout as they are generated
text_streamer = TextStreamer(tokenizer)
outputs = model.generate(**inputs, streamer=text_streamer, max_new_tokens=1000)
```
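
The streamer prints tokens to stdout as they arrive. To also get the completion back as a plain string, decode the returned token ids:

```python
# `outputs` contains the prompt plus the completion; decode it to text
response = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(response)
```

If you prefer plain `transformers` without `unsloth`, the adapter can typically be attached with the `peft` library instead; a minimal sketch, assuming the repo follows the standard PEFT adapter layout:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then apply this repo's LoRA adapter on top
# (device_map="auto" requires the accelerate package)
base = AutoModelForCausalLM.from_pretrained(
    "olabs-ai/Meta-Llama-3.1-8B-Instruct", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("olabs-ai/Meta-Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base, "olabs-ai/reflection_model")
```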