#InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of Large Language Models
Paper • 2308.07074 • Published
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("OFA-Sys/InsTagger")
model = AutoModelForCausalLM.from_pretrained("OFA-Sys/InsTagger")

InsTagger is a tool that automatically provides instruction tags; it is trained by distilling tagging results from InsTag.
InsTag aims to analyze the supervised fine-tuning (SFT) data used to align LLMs with human preference. For local tagging deployment, we release InsTagger, fine-tuned on InsTag results, to tag the queries in SFT data. Through the lens of tags, we sample a 6K subset of open-source SFT data to fine-tune LLaMA and LLaMA-2. The resulting models, TagLM-13B-v1.0 and TagLM-13B-v2.0, outperform many open-source LLMs on MT-Bench.
This model was developed directly with FastChat, so it can easily be used for inference or served with FastChat by selecting the vicuna template.
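When the model is called through plain transformers instead of FastChat, the prompt has to be wrapped by hand. The following is a minimal sketch, assuming the Vicuna v1.1 conversation format (a system sentence followed by `USER:` and `ASSISTANT:` turns). The system prompt text and the helper names `build_vicuna_prompt` and `tag_query` are illustrative assumptions and should be checked against the installed FastChat version. Whether InsTagger expects the bare query or an extra tagging instruction is not stated here, so the sketch passes the bare query.

```python
# Assumed Vicuna v1.1-style system prompt; verify against your FastChat version.
VICUNA_SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_vicuna_prompt(query: str, system: str = VICUNA_SYSTEM) -> str:
    """Wrap a single user query in a Vicuna v1.1-style single-turn prompt."""
    return f"{system} USER: {query.strip()} ASSISTANT:"


def tag_query(query: str, max_new_tokens: int = 256) -> str:
    """Hypothetical helper: return the tagger's raw text output for one query."""
    # Imported lazily so the prompt builder works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("OFA-Sys/InsTagger")
    model = AutoModelForCausalLM.from_pretrained("OFA-Sys/InsTagger")
    inputs = tokenizer(build_vicuna_prompt(query), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `tag_query("Write a Python function that reverses a string.")` downloads the model weights on first use and returns the generated text unparsed; the output format is not assumed here.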
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="OFA-Sys/InsTagger")