AetherMind_SRL: How I beat 7B models on MMLU with 184M params and a $300 GPU

I'm Sameer, a solo researcher from Iraq working on a single RTX 3050 8GB laptop. Today I'm releasing AetherMind_SRL, a 184M-parameter NLI model trained only on NLI tasks (SNLI, MNLI, ANLI, and a small clinical Alzheimer's dataset). It was never fine-tuned on, or even shown, a single MMLU question during training. Yet here are the zero-shot MMLU (57 subjects) results:

| Model | Params | MMLU Zero-Shot | Training Data |
| --- | --- | --- | --- |
| AetherMind_SRL (me) | 184M | 36.05 % | Only NLI (SNLI/MNLI/ANLI + ADNI) |
| DeBERTa-v3-base | 278M | ~30.8 % | General pre-training |
| BERT-large | 340M | 27–30 % | General pre-training |
| LLaMA-1 7B | 7B | 34–35 % | Massive text corpus |
| LLaMA-2 7B | 7B | ~45 % | Bigger + better data |

Yes: my 184M model beats every classic 300–400M model and the original 7-billion-parameter LLaMA-1, all while running at 300+ samples/sec on a $300 laptop GPU.

How did this happen? I built a standardized self-improvement loop called AetherMind Self-Reflective Learning (SRL) v1.0:

1. Train normally on NLI.
2. Let the model predict on hard adversarial data (ANLI).
3. Log every mistake and every low-confidence case.
4. Build a balanced "SMART" buffer (60% errors + 40% correct anchors).
5. Fine-tune with a tiny learning rate and an error-weighted loss.
6. Repeat until stable.

That's it. No external knowledge, no MMLU data, no cluster. Just pure reasoning transfer from entailment/contradiction patterns to real-world knowledge.

Try it yourself:

```python
from transformers import pipeline
import torch

# repo id taken from the companion model card in this feed
nli = pipeline("text-classification", model="samerzaher80/AetherMind-KD-Student")
```
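The buffer-building step of the SRL loop (60% errors + 40% correct anchors, with errors up-weighted in the loss) can be sketched in a few lines. This is my own minimal illustration, not the released code; the function name, the `2.0` error weight, and the `(example, weight)` pairing are assumptions:

```python
import random

def build_smart_buffer(errors, anchors, size, err_frac=0.6, seed=0):
    """Balanced 'SMART' correction buffer: ~60% logged errors, ~40% correct anchors.

    Returns (example, loss_weight) pairs; up-weighting errors stands in for the
    post's error-weighted loss (the 2.0 factor here is purely illustrative).
    """
    rng = random.Random(seed)
    n_err = min(int(size * err_frac), len(errors))
    n_anc = min(size - n_err, len(anchors))
    buf = [(ex, 2.0) for ex in rng.sample(errors, n_err)]
    buf += [(ex, 1.0) for ex in rng.sample(anchors, n_anc)]
    rng.shuffle(buf)  # mix errors and anchors within each fine-tuning pass
    return buf
```

Keeping 40% correct anchors in the buffer is what prevents the fine-tuning step from drifting: training on errors alone would let the model forget the cases it already handles well.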
Need Help Getting arXiv Endorsement for My AI Research Paper
Hi everyone, I hope you're doing well. I’m trying to publish my new AI research paper on arXiv under the cs.AI category, but I currently need an endorser who is already authorized for cs.AI submissions.
If anyone here is registered as a cs.AI endorser and is willing to help, I would truly appreciate it.
Here is the official arXiv endorsement request link:
My research: It’s part of the AetherMind project — a self-reflective NLI reasoning system inspired by human cognitive consistency and used also in Alzheimer’s research. If needed, I can share the abstract or full PDF.
Thank you so much to anyone who can help.
— Sameer S. Najm
New Research Release: AetherMind_SRL — Self-Reflective Learning for Robust Natural Language Inference (NLI) By: Sameer S. Najm
After more than a year of continuous work, experimentation, and refinement, I am proud to officially publish AetherMind_SRL, a self-improving Transformer model trained using Self-Reflective Learning (SRL) — a technique that enables the model to learn from its own mistakes.
This research integrates:
🔹 Knowledge Distillation from DeBERTa-v3-base
🔹 Self-Reflective Learning loops
🔹 Adversarial ANLI training
🔹 Clinical Alzheimer's reasoning (ADNI)
🔹 SMART Error Buffers for hard-example mining
What makes it unique? AetherMind_SRL continuously improves through structured error logs, balanced correction buffers, and clinical-domain adaptation. It achieves strong performance across general NLI benchmarks, adversarial datasets, and Alzheimer’s-focused reasoning tasks.
AetherMind-KD-Student is a 184M-parameter Natural Language Inference (NLI) model distilled from a DeBERTa-v3 teacher using a multi-stage, adversarial-aware knowledge distillation pipeline. The model is designed to provide:
- High accuracy on standard NLI benchmarks
- Strong robustness on adversarial datasets
- Excellent zero-shot generalization to unseen datasets
- High inference efficiency on consumer GPUs

This makes it suitable for research and practical applications that require fast and reliable sentence-level reasoning.

Model: samerzaher80/AetherMind-KD-Student
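A standard way to realize the distillation objective described above is a temperature-scaled KL term against the teacher's soft labels, mixed with the usual cross-entropy on gold labels. This is a generic sketch of that loss, not the project's actual training code; the temperature `T=2.0` and mixing weight `alpha=0.5` are placeholder values I chose for illustration:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Knowledge-distillation loss: soft KL term + hard cross-entropy term."""
    # KL divergence between temperature-softened teacher and student
    # distributions; the T*T factor keeps gradient scale comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Ordinary cross-entropy on the gold NLI labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

With identical student and teacher logits the KL term vanishes and only the cross-entropy term remains, which is a quick sanity check when wiring up a pipeline like this.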