---
license: mit
language: en
base_model: microsoft/phi-2
tags:
- text-generation
- voice-assistant
- automotive
- fine-tuned
- peft
- lora
datasets:
- synthetic
widget:
- text: "Navigate to the nearest EV charging station."
- text: "Set the temperature to 22 degrees."
---

# 🚗 Fine-tuned MBUX Voice Assistant (phi-2)

This repository contains a fine-tuned version of Microsoft's **`microsoft/phi-2`** model, specifically adapted to function as an in-car voice assistant similar to MBUX. The model is trained to understand and respond to common automotive commands.

This model was created as part of an end-to-end MLOps project, from data creation and fine-tuning to deployment in an interactive application.

## ✨ Live Demo

You can interact with this model in a live, voice-to-voice application on Hugging Face Spaces:

**➡️ [Live MBUX Gradio Demo](https://huggingface.co/spaces/MrunangG/mbux-gradio-demo)**

---
## 📋 Model Details

* **Base Model:** `microsoft/phi-2`
* **Fine-tuning Method:** Parameter-Efficient Fine-Tuning (PEFT) using LoRA.
* **Training Data:** A synthetic, instruction-based dataset of in-car commands covering navigation, climate control, media, and vehicle settings.
* **Frameworks:** PyTorch, Transformers, PEFT, TRL.
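
The dataset itself is not published in this repository. Purely as an illustration, a single training pair in the `[INST]` prompt format used by the inference example below might look like the following; the `text` field name follows the usual `SFTTrainer` convention and is an assumption:

```python
# Hypothetical shape of one synthetic training record; the real dataset's
# field names and response wording may differ.
sample = {
    "text": "[INST] Set the temperature to 22 degrees. [/INST] "
            "Okay, setting the cabin temperature to 22 degrees."
}
```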
|
|
|
|
|
### Intended Use

This model is a proof of concept built for demonstration purposes. It is intended to serve as the "brain" of a voice assistant application in an automotive context, handling commands such as:

* "Navigate to the office."
* "Set the fan speed to maximum."
* "Play my 'Morning Commute' playlist."

---
## 🚀 How to Use

While the model's core function is text generation, its primary intended use is within a full voice-to-voice pipeline.

### Interactive Voice Demo

For the complete, interactive experience, including Speech-to-Text and Text-to-Speech, please visit the live application hosted on Hugging Face Spaces:

**➡️ [Live MBUX Gradio Demo](https://huggingface.co/spaces/MrunangG/mbux-gradio-demo)**

### Programmatic Use (Text-Only)

The following Python code shows how to use the fine-tuned model for its core text-generation task.
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define the model repository IDs
base_model_id = "microsoft/phi-2"
peft_model_id = "MrunangG/phi-2-mbux-assistant"

# Set device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map={"": device}
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Attach the LoRA adapter to the base model (the adapter weights are
# applied on top of the base weights, not merged into them)
model = PeftModel.from_pretrained(base_model, peft_model_id)

# --- Inference ---
prompt = "Set the temperature to 21 degrees."
formatted_prompt = f"[INST] {prompt} [/INST]"

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)

# Keep only the assistant's reply after the [/INST] tag
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
cleaned_response = response.split('[/INST]')[-1].strip()

print(cleaned_response)
# Expected output: Okay, setting the cabin temperature to 21 degrees.
```
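
If you prefer to run inference without the adapter indirection, PEFT's standard `merge_and_unload()` call can optionally fold the LoRA weights into the base model after loading; this step is not part of the original example:

```python
# Optional: merge the LoRA weights into the base model and drop the
# adapter wrappers, leaving a plain transformers model for generation.
model = model.merge_and_unload()
```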
|
|
|
|
|
---

## 🛠️ Training Procedure

The model was fine-tuned using the `SFTTrainer` from the TRL library. Key training parameters included a learning rate of `2e-4`, the `paged_adamw_8bit` optimizer, and 4-bit quantization to ensure efficient training on consumer hardware.
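
The full training script is not included in this repository. The sketch below reconstructs a plausible setup from the details above; the LoRA rank, alpha, target modules, batch size, epoch count, and dataset path are illustrative assumptions, not the actual values used:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

base_model_id = "microsoft/phi-2"

# Load the base model in 4-bit, as described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

# LoRA configuration; rank, alpha, and target modules are assumptions
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)

# "train.jsonl" is a placeholder for the synthetic command dataset
dataset = load_dataset("json", data_files="train.jsonl", split="train")

# Learning rate and optimizer are stated in this card;
# batch size and epoch count are assumptions
args = SFTConfig(
    output_dir="phi-2-mbux-assistant",
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    per_device_train_batch_size=2,
    num_train_epochs=3,
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```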
|
|
|
|
|
### Framework versions

- PEFT: 0.17.1
- TRL: 0.22.1
- Transformers: 4.56.0
- PyTorch: 2.8.0
- Datasets: 4.0.0
- Tokenizers: 0.22.0