Gemma-2 2B Instruct fine-tuned on JSON dataset
This model is a Gemma-2 2B model fine-tuned on the paraloq/json_data_extraction dataset.
The model has been fine-tuned to extract data from text according to a JSON schema.
Prompt
The prompt used during training is:
"""Below is a text paired with input that provides further context. Write JSON output that matches the schema to extract information.
### Input:
{input}
### Schema:
{schema}
### Response:
"""
Using the Model
You can use the model with the transformers library or with the wrapper from [unsloth](https://unsloth.ai/blog/gemma2), which enables faster inference.
```python
import os
import torch
from unsloth import FastLanguageModel

# Required to avoid exceeding the torch.compile cache size limit
torch._dynamo.config.accumulated_cache_size_limit = 2048

# Hugging Face read token (here taken from the HF_TOKEN environment variable)
HF_TOKEN_READ = os.environ.get("HF_TOKEN")

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="bastienp/Gemma-2-2B-it-JSON-data-extration",
    max_seq_length=2048,
    dtype=torch.float16,
    load_in_4bit=False,
    token=HF_TOKEN_READ,
)
```
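A minimal generation sketch, assuming the model and tokenizer loaded above and a `prompt` string built from the training template (as in the earlier example); `max_new_tokens=256` is an arbitrary choice:

```python
# Switch the Unsloth model to inference mode (faster generation)
FastLanguageModel.for_inference(model)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens (the JSON response)
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```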
Using the Quantized model (llama.cpp)
The model is also supplied in GGUF format in 4-bit and 8-bit quantizations.
Example code with the llama-cpp-python bindings:
```python
from llama_cpp import Llama

# Download and load the 4-bit GGUF from the Hub
llm = Llama.from_pretrained(
    "bastienp/Gemma-2-2B-it-JSON-data-extration",
    filename="*Q4_K_M.gguf",  # use *Q8_K_M.gguf for the 8-bit version
    verbose=False,
)
```
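A short usage sketch, assuming a `prompt` string built from the training template above; the sampling parameters are placeholder choices:

```python
# Run completion on the quantized model; temperature 0 for deterministic JSON output
output = llm(prompt, max_tokens=256, temperature=0.0)
print(output["choices"][0]["text"])
```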
The base model used for fine-tuning is google/gemma-2-2b-it. This repository is NOT affiliated with Google.
Gemma is provided under and subject to the Gemma Terms of Use found at ai.google.dev/gemma/terms.
- Developed by: bastienp
- License: gemma
- Finetuned from model: google/gemma-2-2b-it