# Model Card for LFT_Final_FineTuned_Increased_Metrics
Merged structure-aware LoRA deltas for the LfT math-tutoring student on top of Llama-3.1-8B-Instruct. This is the canonical LfT student for downstream math-tutor and IDC workflows.
## Model Details

### Model Description
- **Developed by:** YRS Aakanksha
- **Shared by:** YRS Aakanksha
- **Model type:** Instruction-tuned causal LM with merged LoRA deltas (global LfT)
- **Language(s):** English (math tutoring focus)
- **License:** Same as base model (Llama-3.1-8B-Instruct)
- **Finetuned from:** meta-llama/Llama-3.1-8B-Instruct
## Uses

### Direct Use
- Math tutoring / reasoning with structure-aware prompts (chapter/difficulty/LO tags).
- Base student for two-stage LfT + IDC flows.
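The exact tag syntax used during structure-aware fine-tuning is not documented in this card; the sketch below shows one plausible way to prefix a question with chapter/difficulty/LO tags (the bracketed tag names and `build_prompt` helper are assumptions for illustration, not the documented training format):

```python
# Hypothetical structure-aware prompt builder; the tag names below are
# assumptions -- the actual tags used during LfT training are not published.
def build_prompt(question: str, chapter: str, difficulty: str, lo: str) -> str:
    """Prefix a math question with chapter/difficulty/learning-objective tags."""
    header = f"[CHAPTER: {chapter}] [DIFFICULTY: {difficulty}] [LO: {lo}]"
    return f"{header}\n{question}"

prompt = build_prompt(
    "Find the projection of the vector (3, 4) onto (1, 0).",
    chapter="Vectors",
    difficulty="medium",
    lo="vector projections",
)
print(prompt)
```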
### Downstream Use
- Further task-specific fine-tuning for math reasoning or instructional tutoring.
### Out-of-Scope Use
- Non-math domains; safety-critical decisions; any deployment without alignment/safety layers.
## Bias, Risks, and Limitations
- Inherits biases and limitations of the base Llama-3.1-8B-Instruct model and the curated math datasets.
- Not safety-tuned; avoid use in safety-critical settings.
### Recommendations
- Keep human oversight; add safety/filters for production.
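As a purely illustrative sketch of what a minimal output guard could look like (the keyword blocklist here is a placeholder assumption; a production deployment should use a real moderation model or service instead):

```python
# Illustrative output guard only -- a real deployment should sit behind a
# proper moderation layer; this keyword blocklist is a placeholder assumption.
BLOCKLIST = {"legal advice", "medical dosage"}

def is_in_scope(text: str) -> bool:
    """Reject responses that drift into higher-risk, out-of-scope domains."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)

reply = "Here is some legal advice about contracts."
if not is_in_scope(reply):
    reply = "I can only help with math tutoring questions."
print(reply)
```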
## How to Get Started

### Load with transformers (merged weights)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sashank-810/LFT_Final_FineTuned_Increased_Metrics"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token  # Llama tokenizers ship without a pad token

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

prompt = "Explain the concept of vector projections with an example."
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```
### Serve with vLLM

```bash
pip install vllm
python -m vllm.entrypoints.openai.api_server \
    --model Sashank-810/LFT_Final_FineTuned_Increased_Metrics \
    --tensor-parallel-size 2 \
    --dtype auto
```
Then query via the OpenAI-compatible endpoint (replace URL/key as needed; shown with the `openai>=1.0` client):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Sashank-810/LFT_Final_FineTuned_Increased_Metrics",
    messages=[
        {"role": "user", "content": "Outline key steps in solving a probability problem involving Bayes' theorem."}
    ],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```
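The endpoint also accepts plain HTTP. A sketch of the raw request payload using only the standard library (the server is assumed to be running at `localhost:8000` as started above, so the actual call is left commented out):

```python
import json
import urllib.request

payload = {
    "model": "Sashank-810/LFT_Final_FineTuned_Increased_Metrics",
    "messages": [{"role": "user", "content": "State Bayes' theorem."}],
    "max_tokens": 64,
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment once the vLLM server from the previous step is running:
# with urllib.request.urlopen(req) as r:
#     print(json.load(r)["choices"][0]["message"]["content"])
```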
## Training Details
- Fine-tuned with structure-aware SFT across all chapters; LoRA deltas merged into base. Specific hyperparameters and dataset splits are kept private.
## Evaluation

The model was evaluated on a comprehensive test set of 2,617 math tutoring questions, comparing performance against the base Llama-3.1-8B-Instruct model.

### Accuracy Results
| Metric | Base Model | Fine-tuned Model | Improvement |
|---|---|---|---|
| **Correct Answers** | 625 / 2617 | 843 / 2617 | +218 |
| **Accuracy** | 23.88% | 32.21% | **+8.33 pp** |
| **Questions Improved** | - | 421 | - |
| **Questions Regressed** | - | 203 | - |
The fine-tuned model shows a **34.9% relative improvement** in accuracy over the base model, with more than twice as many questions improved (421) as regressed (203).
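The headline numbers above follow directly from the raw counts in the table and can be recomputed in a few lines:

```python
# Recompute the accuracy figures from the raw counts in the table above.
base_correct, ft_correct, total = 625, 843, 2617

base_acc = base_correct / total                   # base accuracy
ft_acc = ft_correct / total                       # fine-tuned accuracy
abs_gain_pp = (ft_acc - base_acc) * 100           # absolute gain, percentage points
rel_gain = (ft_acc - base_acc) / base_acc * 100   # relative gain over the base

print(f"base {base_acc:.2%}, fine-tuned {ft_acc:.2%}, "
      f"+{abs_gain_pp:.2f} pp ({rel_gain:.1f}% relative)")
```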
### Generation Quality Metrics

#### BLEU Score
| Model | BLEU Score | Precision (1/2/3/4-gram) | BP | Sys Len | Ref Len |
|---|---|---|---|---|---|
| Base | 38.24 | 58.8 / 67.8 / 63.8 / 59.8 | 0.612 | 3,765 | 5,612 |
| Fine-tuned | **58.56** | 57.1 / 65.4 / 60.0 / 53.9 | **0.993** | 5,573 | 5,612 |
The fine-tuned model achieves a **53.1% relative improvement** in BLEU score (38.24 → 58.56), with significantly better length matching (BP: 0.612 → 0.993).
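The brevity-penalty (BP) values in the table are the standard BLEU brevity penalty, exp(1 − ref_len/sys_len) when the system output is shorter than the reference and 1 otherwise, and they can be reproduced from the system/reference lengths shown:

```python
import math

def brevity_penalty(sys_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty: penalizes outputs shorter than the reference."""
    return 1.0 if sys_len >= ref_len else math.exp(1 - ref_len / sys_len)

print(round(brevity_penalty(3765, 5612), 3))  # base model -> 0.612
print(round(brevity_penalty(5573, 5612), 3))  # fine-tuned model -> 0.993
```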
#### ROUGE Scores

| Metric | Base Model | Fine-tuned Model | Improvement |
|---|---|---|---|
| **ROUGE-1** | 0.2948 | **0.4188** | +42.1% |
| **ROUGE-2** | 0.0931 | **0.1184** | +27.2% |
| **ROUGE-L** | 0.2936 | **0.4181** | +42.4% |
| **ROUGE-Lsum** | 0.2938 | **0.4185** | +42.4% |
All ROUGE metrics show substantial improvements, indicating better recall and overlap with reference answers.
#### METEOR Score

| Model | METEOR Score | Improvement |
|---|---|---|
| Base | 0.1633 | - |
| Fine-tuned | **0.2327** | **+42.5%** |
The METEOR score improvement demonstrates better semantic alignment and synonym matching in generated responses.
### Key Findings

- **Substantial Accuracy Gains**: The model demonstrates a clear improvement in mathematical correctness, with accuracy rising from 23.88% to 32.21%.
- **Improved Response Quality**: Across all automated metrics (BLEU, ROUGE, METEOR), the fine-tuned model shows 27-53% relative improvements, indicating more coherent and relevant responses.
- **Better Length Calibration**: The brevity-penalty improvement (0.612 → 0.993) shows the model generates appropriately sized responses that better match expected answer lengths.
- **Positive Net Impact**: With 421 improved questions versus 203 regressed, the model shows a strong positive impact ratio of approximately 2:1.
## Technical Specifications
- Architecture: Llama-3.1-8B-Instruct with merged LoRA deltas (Phase 2 global LfT).
- Compute: Not disclosed; intended for GPU inference; vLLM compatible.
## Model Card Contact
- Sashank-810 on Hugging Face