electrical-embeddinggemma-ir_lora
Model Description
This model is a LoRA adapter fine-tuned from unsloth/embeddinggemma-300m — Unsloth's optimized mirror of Google's EmbeddingGemma-300M — for feature-extraction tasks, specifically dense Information Retrieval (IR) in the electrical and electronics engineering domain. The adapter has been trained to embed technical queries and passages with high semantic fidelity, enabling near-perfect retrieval over engineering documentation, standards, datasheets, and textbooks.
Unlike the merged variants, this repository contains only the LoRA adapter weights (~17 MB). Load it by stacking it on top of the base EmbeddingGemma-300M model using the sentence-transformers or PEFT library.
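If you want to use PEFT directly rather than sentence-transformers, a minimal loading sketch is shown below. Note that this yields raw token-level hidden states only; the pooling and projection steps that the sentence-transformers pipeline applies for EmbeddingGemma are omitted, so the Usage section below is the simpler route for producing embeddings.

```python
# Minimal sketch: stacking the LoRA adapter on the base model with PEFT.
# Pooling/projection into sentence embeddings is NOT done here -- use the
# sentence-transformers route in the Usage section for that.
from peft import PeftModel
from transformers import AutoModel, AutoTokenizer

base = AutoModel.from_pretrained("unsloth/embeddinggemma-300m")
model = PeftModel.from_pretrained(base, "disham993/electrical-embeddinggemma-ir_lora")
tokenizer = AutoTokenizer.from_pretrained("unsloth/embeddinggemma-300m")
```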

Training Data
The model was trained on the disham993/ElectricalElectronicsIR dataset — 20,000 question-passage pairs covering electrical engineering, electronics, power systems, and communications.
- 16k train / 2k validation / 2k test
- Queries: 133–822 characters; passages: 586–5,590 characters
- Topics include phased array antennas, IEC 61850 protocols, Josephson junctions, OTDR measurements, MIMO channel estimation, FPGA partial reconfiguration, and more
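To inspect the data yourself, it can be pulled with the datasets library. A minimal sketch follows; the split names match the card (train/validation/test), but the exact column names are an assumption — print the dataset object to confirm the real schema.

```python
# Minimal sketch: loading the IR dataset for inspection.
# Column names are an assumption -- print `ds` to see the actual schema.
from datasets import load_dataset

ds = load_dataset("disham993/ElectricalElectronicsIR")
print(ds)              # splits and column names
print(ds["train"][0])  # one question-passage pair
```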
Model Details
| Property | Value |
|---|---|
| Base model | unsloth/embeddinggemma-300m (308M params) |
| Adapter type | LoRA |
| Task | Feature extraction (dense IR / semantic search) |
| Language | English (en) |
| Dataset | disham993/ElectricalElectronicsIR |
| Adapter size | ~17 MB |
| License | MIT |
Training Procedure
Training Hyperparameters
| Setting | Value |
|---|---|
| Method | LoRA via Unsloth's FastSentenceTransformer |
| LoRA rank / alpha | r=32, α=64 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Loss | MultipleNegativesRankingLoss (in-batch negatives) |
| Batch size | 128 per device × 2 gradient accumulation = 256 effective |
| Learning rate | 2e-5 (linear schedule, 3% warmup) |
| Max steps | 100 |
| Max sequence length | 1024 |
| Precision | bf16 |
| Batch sampler | NO_DUPLICATES |
| Hardware | NVIDIA RTX 5090 |
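The actual run used Unsloth's FastSentenceTransformer; as a rough equivalent, the sketch below wires the same configuration with plain sentence-transformers plus PEFT. The dataset column names, the `add_adapter` availability (it requires a recent sentence-transformers release with PEFT support), and the exact Unsloth internals are assumptions.

```python
# Sketch: approximating the card's training setup with plain
# sentence-transformers + PEFT (the real run used Unsloth).
from datasets import load_dataset
from peft import LoraConfig, TaskType
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import (
    BatchSamplers,
    SentenceTransformerTrainingArguments,
)

model = SentenceTransformer("unsloth/embeddinggemma-300m")

# Attach a LoRA adapter matching the card's r=32, alpha=64 configuration.
model.add_adapter(
    LoraConfig(
        task_type=TaskType.FEATURE_EXTRACTION,
        r=32,
        lora_alpha=64,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],
    )
)

train_ds = load_dataset("disham993/ElectricalElectronicsIR", split="train")

args = SentenceTransformerTrainingArguments(
    output_dir="electrical-embeddinggemma-ir_lora",
    per_device_train_batch_size=128,
    gradient_accumulation_steps=2,              # 256 effective batch size
    learning_rate=2e-5,
    warmup_ratio=0.03,                          # 3% linear warmup
    max_steps=100,
    bf16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # no repeated texts per batch
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,  # expects (anchor, positive) text columns
    loss=MultipleNegativesRankingLoss(model),
)
trainer.train()
```

MultipleNegativesRankingLoss treats every other passage in the batch as a negative, which is why the NO_DUPLICATES sampler matters: a duplicated passage in the batch would be scored as a false negative.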
Evaluation Results
Evaluated on the held-out test split (2,000 queries) of disham993/ElectricalElectronicsIR using sentence_transformers.evaluation.InformationRetrievalEvaluator.
| Model | MAP@100 | NDCG@10 | MRR@10 | Recall@10 |
|---|---|---|---|---|
| unsloth/embeddinggemma-300m (baseline) | 0.5753 | 0.6221 | 0.5682 | 0.7925 |
| electrical-embeddinggemma-ir_lora (this model) | 0.9795 | 0.9847 | 0.9795 | 1.0000 |
| electrical-embeddinggemma-ir_finetune_16bit | 0.9797 | 0.9849 | 0.9797 | 1.0000 |
| electrical-embeddinggemma-ir_f16 | 0.9849 | 0.9887 | 0.9849 | 0.9995 |
| electrical-embeddinggemma-ir_q8_0 | 0.9844 | 0.9883 | 0.9844 | 0.9995 |
| electrical-embeddinggemma-ir_q4_k_m | 0.9841 | 0.9879 | 0.9840 | 0.9990 |
| electrical-embeddinggemma-ir_q5_k_m | 0.9824 | 0.9866 | 0.9823 | 0.9990 |
That is roughly +40 percentage points in MAP@100 and a +72% relative improvement in MRR@10 over the general-purpose baseline.
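A minimal sketch of the evaluation setup is shown below. The evaluator only needs parallel dicts of query texts, corpus passages, and relevance judgments; the ID scheme and the `question`/`passage` column names here are assumptions about the dataset schema.

```python
# Sketch: scoring a model on the held-out test split with
# InformationRetrievalEvaluator (column names are assumptions).
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("disham993/electrical-embeddinggemma-ir_lora")
test = load_dataset("disham993/ElectricalElectronicsIR", split="test")

queries = {f"q{i}": row["question"] for i, row in enumerate(test)}
corpus = {f"d{i}": row["passage"] for i, row in enumerate(test)}
relevant_docs = {f"q{i}": {f"d{i}"} for i in range(len(test))}  # 1:1 pairs

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="eeir-test",
)
print(evaluator(model))  # metrics such as MAP/NDCG/MRR/Recall at several cutoffs
```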
Usage
```bash
# Install dependencies
pip install sentence-transformers torch
```

```python
import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

# === Semantic search example ===
if __name__ == "__main__":
    # sentence-transformers resolves the LoRA adapter on top of the base
    # model automatically when loading this repository
    model = SentenceTransformer("disham993/electrical-embeddinggemma-ir_lora")

    query = "How do transformers step up voltage?"

    # A miniature corpus of engineering documents
    documents = [
        "Ohm's law defines the relationship between voltage, current, and resistance.",
        "AC circuits use alternating current which changes direction periodically.",
        "A step-up transformer has more turns on its secondary coil than its primary, increasing voltage.",
        "Capacitors store electrical energy in an electric field.",
        "Inductors resist changes in electric current passing through them.",
        "Transformers operate on Faraday's law of induction to transfer energy between circuits.",
        "Diodes allow current to pass in only one direction.",
        "Voltage is the electric potential difference between two points.",
    ]

    # Encode the query and corpus directly to PyTorch tensors
    query_emb = model.encode(query, convert_to_tensor=True)
    doc_embs = model.encode(documents, convert_to_tensor=True)

    # Cosine similarity between the query and every document
    similarities = F.cosine_similarity(query_emb.unsqueeze(0), doc_embs)

    # Retrieve the three highest-scoring documents
    top_3_idx = torch.topk(similarities, k=3).indices.tolist()
    print(f"\n--- Top 3 documents for query: '{query}' ---")
    for rank, idx in enumerate(top_3_idx, 1):
        print(f"Rank {rank} (score: {similarities[idx]:.4f}) | {documents[idx]}")
```
Limitations and Bias
While this model performs exceptionally well in the electrical and electronics engineering domain, it is not designed for use in other domains. Additionally, it may:
- Underperform on queries that mix electrical engineering with unrelated domains (e.g., biomedical, legal, financial)
- Show reduced performance on non-English text or highly colloquial phrasing
- Require the base unsloth/embeddinggemma-300m model to be loaded alongside the adapter when using PEFT directly (see the loading sketch in the Model Description above)
This model is intended for research, educational, and production IR applications in the electrical engineering domain.
Training Infrastructure
For the complete fine-tuning and evaluation pipeline — from data loading to GGUF export — refer to the GitHub repository and the notebooks Finetuning_EmbeddingGemma_EEIR_RTX_5090.ipynb and Evaluate_All_Models.ipynb.
Last Update
2026-04-18
Citation
```bibtex
@misc{electrical-embeddinggemma-ir,
  author       = {disham993},
  title        = {Electrical \& Electronics Engineering Embedding Models},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/collections/disham993/electrical-and-electronics-engineering-embedding-models}},
}
```