Vehicle Recall Retrieval-Augmented Generation (RAG) Pipeline

Introduction

This project implements a retrieval-augmented generation (RAG) system for vehicle recall summarization. Standard language models struggle with tasks that require current, domain-specific factual knowledge such as automotive recall information, which is updated continuously and not fully captured in model pretraining data. By integrating a retrieval module over official recall summaries with the Flan-T5 Base model, the system generates grounded, context-aware explanations tailored to a specific vehicle make, model, and year. This approach reduces hallucinations, improves factual accuracy, and produces clearer and more useful recall summaries than using a generative model alone.

Data

This RAG system uses vehicle recall summaries derived from the NHTSA recall dataset used in earlier project check-ins. The original dataset contained many fields, but only the columns needed for retrieval were kept: MAKE, MODEL, MODEL YEAR, and SUMMARY. During preprocessing, rows with missing or empty summaries were removed, text formatting was standardized, and entries without usable recall descriptions were filtered out.
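This preprocessing can be sketched with pandas. The column names follow the dataset fields listed above; `clean_recalls` is a hypothetical helper for illustration, not the repository's actual code:

```python
import pandas as pd

# Only the columns needed for retrieval are kept.
KEEP_COLS = ["MAKE", "MODEL", "MODEL YEAR", "SUMMARY"]

def clean_recalls(df: pd.DataFrame) -> pd.DataFrame:
    # Drop all other fields from the original dataset.
    df = df[KEEP_COLS].copy()
    # Standardize text formatting: treat missing summaries as empty strings
    # and strip surrounding whitespace.
    df["SUMMARY"] = df["SUMMARY"].fillna("").str.strip()
    # Filter out entries without a usable recall description.
    return df[df["SUMMARY"] != ""].reset_index(drop=True)

# Toy input: one usable row, one whitespace-only summary, one missing summary.
raw = pd.DataFrame({
    "MAKE": ["FORD", "TOYOTA", "HONDA"],
    "MODEL": ["F-150", "CAMRY", "CIVIC"],
    "MODEL YEAR": [2021, 2020, 2022],
    "SUMMARY": ["Underbody insulators may detach.", "   ", None],
    "RECALL_ID": ["a", "b", "c"],  # example of an unused column that is dropped
})
cleaned = clean_recalls(raw)
```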

These cleaned recall records were used in Check-ins 2 through 4 to evaluate retrieval quality and model outputs. A fixed set of evaluation queries was created, including vehicles such as Ford F-150 2021, Toyota Camry 2020, and Honda Civic 2022. Each query was paired with its correct recall description so that ground truth remained consistent before and after adding retrieval.

For the Hugging Face repository, a smaller demonstration file called sample_recalls.csv is included. It maintains the same structure as the full dataset but contains only a representative subset to make the full RAG workflow reproducible without loading the complete NHTSA dataset.

You can access the full dataset here: NHTSA Dataset

Methodology

The system follows a standard RAG architecture: an embedding model retrieves relevant recall summaries, and a generative model produces the final explanation. Based on experiments in Check-ins 3 and 4, the best-performing setup used the all-MiniLM-L6-v2 sentence-transformer for stable retrieval behavior and compatibility with Flan-T5's token limits. Each recall summary is converted to an embedding vector, and cosine similarity is used during inference to rank the most relevant entries for a given vehicle query.
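The ranking step can be illustrated with NumPy. The 4-dimensional vectors below are toy stand-ins for the 384-dimensional all-MiniLM-L6-v2 embeddings, and `rank_by_cosine` is a hypothetical helper, not the repository's code:

```python
import numpy as np

# Stand-in document embeddings (real ones come from all-MiniLM-L6-v2).
doc_embeddings = np.array([
    [0.9, 0.1, 0.0, 0.1],   # Ford F-150 recall summary
    [0.1, 0.8, 0.2, 0.0],   # Toyota Camry recall summary
    [0.0, 0.1, 0.9, 0.2],   # Honda Civic recall summary
])
query_embedding = np.array([0.85, 0.15, 0.05, 0.1])  # e.g. "Ford F-150 2021"

def rank_by_cosine(query: np.ndarray, docs: np.ndarray, top_k: int = 2):
    # Cosine similarity is the dot product of L2-normalized vectors.
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    sims = d @ q
    # Indices of the top_k most similar documents, best first.
    order = np.argsort(sims)[::-1][:top_k]
    return order, sims[order]

top_idx, top_sims = rank_by_cosine(query_embedding, doc_embeddings)
```

The Ford-like vector ranks first, since it points in nearly the same direction as the query embedding.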

The retrieved summaries are injected into an instruction-style prompt and passed to google/flan-t5-base for generation. Earlier evaluations showed that Flan-T5 alone produced short or repetitive outputs and lacked factual grounding, while retrieval provided domain-specific context that corrected these issues.
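The injection step is plain string formatting. `build_prompt` below is a hypothetical helper that reproduces the template shown in the Prompt Format section; the resulting string is what gets passed to google/flan-t5-base:

```python
# Template matching the Prompt Format section of this card.
PROMPT_TEMPLATE = (
    "You are an assistant that provides grounded and accurate vehicle recall information.\n"
    "Below are relevant recall summaries retrieved from the database:\n\n"
    "{retrieved_context}\n\n"
    "Using the information above, provide a clear recall explanation for the following vehicle:\n"
    "Make: {make}\nModel: {model}\nYear: {year}"
)

def build_prompt(summaries, make, model, year):
    # Concatenate the retrieved summaries into the context slot.
    context = "\n".join(f"- {s}" for s in summaries)
    return PROMPT_TEMPLATE.format(
        retrieved_context=context, make=make, model=model, year=year
    )

prompt = build_prompt(
    ["Underbody insulators may loosen and detach while driving."],
    "Ford", "F-150", 2021,
)
# `prompt` would then be fed to google/flan-t5-base for generation
# (omitted here to keep the sketch dependency-free).
```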

Evaluation

The evaluation used the same three vehicle-specific recall queries established earlier: Ford F-150 2021, Toyota Camry 2020, and Honda Civic 2022. TF-IDF cosine similarity was computed between each generated summary and its ground-truth recall description. Baseline models included Flan-T5 Base (no retrieval), DistilGPT-2, and TinyLlama-1.1B. The final RAG pipeline outperformed all baselines by retrieving domain-specific recall information and producing more accurate and grounded explanations.
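The scoring metric can be sketched with scikit-learn. `tfidf_similarity` is a hypothetical helper, and the vectorizer settings (defaults here) may differ from those used in the check-ins:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def tfidf_similarity(generated: str, reference: str) -> float:
    # Fit TF-IDF on the pair of texts and compare the two vectors.
    tfidf = TfidfVectorizer().fit_transform([generated, reference])
    return float(cosine_similarity(tfidf[0], tfidf[1])[0, 0])

score = tfidf_similarity(
    "The 2021 Ford F-150 may have loose underbody insulators.",
    "Underbody insulators on 2021 F-150 trucks may loosen and detach.",
)
```

A score near 1.0 indicates heavy term overlap with the ground-truth recall description; near 0.0 indicates essentially no shared vocabulary.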

Results

Model                                  | Ford F-150 2021 | Toyota Camry 2020 | Honda Civic 2022 | Average Similarity
-------------------------------------- | --------------- | ----------------- | ---------------- | ------------------
DistilGPT-2                            | <0.01           | <0.01             | <0.01            | <0.01
TinyLlama-1.1B                         | <0.01           | <0.01             | <0.01            | <0.01
Flan-T5 Base (no RAG)                  | 0.290           | 0.008             | 0.001            | 0.099
RAG Pipeline (MiniLM-L6-v2 + Flan-T5)  | 0.72            | 0.60              | 0.65             | 0.66

The RAG system improves average similarity by more than a factor of six relative to the standalone Flan-T5 model.

Usage and Intended Uses

This repository provides a lightweight retrieval-augmented generation pipeline for producing grounded recall summaries for a specific vehicle make, model, and year. It is intended for educational and research use. Users can load the included sample dataset, generate embeddings, perform retrieval, and run the full RAG pipeline locally.

Example Usage

from rag_pipeline import load_data, rag_answer

df = load_data("sample_recalls.csv")
result = rag_answer(df, "Ford", "F-150", 2021)
print(result)

Prompt Format

The RAG system uses an instruction-style prompt that embeds retrieved recall summaries followed by a question. This format was used throughout earlier project stages and consistently yielded the strongest results with Flan-T5 Base.

You are an assistant that provides grounded and accurate vehicle recall information.
Below are relevant recall summaries retrieved from the database:

{retrieved_context}

Using the information above, provide a clear recall explanation for the following vehicle:
Make: {make}
Model: {model}
Year: {year}

Expected Output Format

The model returns a short, grounded explanation written in natural language. The content stays tied to retrieved recall summaries and avoids speculation.

The 2021 Ford F-150 may experience loose underbody insulators that can detach while driving, increasing the risk of a crash. Owners should have the vehicle inspected and any affected components replaced as necessary.

Reproducibility

To reproduce the evaluation results:

  1. Load the sample dataset:
df = load_data("sample_recalls.csv")
  2. Use the same evaluation queries used throughout the project:
queries = [
    ("Ford", "F-150", 2021),
    ("Toyota", "Camry", 2020),
    ("Honda", "Civic", 2022),
]
  3. Run each query through the RAG pipeline:
for make, model, year in queries:
    rag_answer(df, make, model, year)
  4. Compute TF-IDF cosine similarity between each generated output and the correct recall description.
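The steps above combine into a short loop. In this sketch, `rag_answer` is a trivial stand-in that echoes the query so the control flow runs without the repository code; in practice it is imported from rag_pipeline:

```python
# Evaluation queries used throughout the project.
queries = [
    ("Ford", "F-150", 2021),
    ("Toyota", "Camry", 2020),
    ("Honda", "Civic", 2022),
]

def rag_answer(df, make, model, year):
    # Stand-in for rag_pipeline.rag_answer (retrieval + Flan-T5 generation).
    return f"Recall summary for the {year} {make} {model}."

# Run each query and collect the generated summaries for scoring.
outputs = {q: rag_answer(None, *q) for q in queries}
```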

This reproduces the similarity scores shown in the results table.

Limitations

This is a lightweight demonstration of retrieval-augmented generation and is not intended for real-world safety decisions. sample_recalls.csv is a small subset of the full NHTSA dataset, so recall coverage is limited. The system generates summaries rather than official guidance, and retrieved context may omit details found in full recall documentation. Flan-T5 Base has limited capacity compared to larger models, which may affect completeness and fluency. Results should be interpreted as educational examples only.
