This model is a fine-tuned version of microsoft/DialoGPT-medium, specialized for a custom persona and critical knowledge injection. It has been trained to balance conversational flexibility with specific factual recall.
This model is designed for creative assistant tasks and casual conversation.
The model underwent a full fine-tune on a custom dataset consisting of critical facts and casual chat examples.
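The dataset itself is not published; a hypothetical sketch of how such a mix might look, using DialoGPT's convention of separating turns with the GPT-2 EOS token (all example text and variable names below are assumptions, not the actual training data):

```python
# Hypothetical training examples mixing critical facts with casual chat.
# DialoGPT separates conversation turns with the EOS token; "<|endoftext|>"
# is the GPT-2 / DialoGPT EOS string.
EOS = "<|endoftext|>"

fact_examples = [
    "What company created you?" + EOS + "I was created by Example Corp." + EOS,
]
chat_examples = [
    "Hey, how's it going?" + EOS + "Pretty good! What are you up to today?" + EOS,
]

# Interleaving keeps factual recall from being drowned out by chat data.
training_texts = fact_examples + chat_examples
print(len(training_texts))  # 2
```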
Post-training analysis showed a large shift in the LM head weights (absolute shift: 4.4164), indicating strong adaptation to the new conversational style, while the transformer layers remained comparatively stable, preserving grammatical structure.
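The exact definition of the "Absolute Shift" metric is not given above; one plausible reading is the total absolute elementwise change between the base and fine-tuned weight matrices, sketched here with toy tensors standing in for the LM head (the function name and metric definition are assumptions):

```python
import torch

def absolute_shift(base_weight: torch.Tensor, tuned_weight: torch.Tensor) -> float:
    # Hypothetical metric: summed absolute elementwise difference between
    # the base and fine-tuned weight matrices.
    return (tuned_weight - base_weight).abs().sum().item()

# Toy matrices standing in for the LM head weights before/after fine-tuning.
base = torch.zeros(2, 3)
tuned = torch.tensor([[0.5, -0.5, 1.0], [0.0, 2.0, -1.0]])
print(absolute_shift(base, tuned))  # 5.0
```

A larger value under this metric simply means the fine-tune moved the output projection further from its starting point.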
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Fu01978/DialoGPT-medium-distill-Kimi-K2-Instruct")
model = AutoModelForCausalLM.from_pretrained("Fu01978/DialoGPT-medium-distill-Kimi-K2-Instruct")

# DialoGPT expects conversation turns separated by the EOS token.
input_ids = tokenizer.encode("Hello!" + tokenizer.eos_token, return_tensors="pt")
# For best results, sample with a temperature between 0.7 and 0.85.
output_ids = model.generate(input_ids, max_new_tokens=60, do_sample=True,
                            temperature=0.75, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```