How to use Abu-Sameer-66/SciPeerAI-7B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="Abu-Sameer-66/SciPeerAI-7B")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("Abu-Sameer-66/SciPeerAI-7B", dtype="auto")
How to use Abu-Sameer-66/SciPeerAI-7B with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Abu-Sameer-66/SciPeerAI-7B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Abu-Sameer-66/SciPeerAI-7B",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
docker model run hf.co/Abu-Sameer-66/SciPeerAI-7B
How to use Abu-Sameer-66/SciPeerAI-7B with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "Abu-Sameer-66/SciPeerAI-7B" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Abu-Sameer-66/SciPeerAI-7B",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "Abu-Sameer-66/SciPeerAI-7B" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Abu-Sameer-66/SciPeerAI-7B",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use Abu-Sameer-66/SciPeerAI-7B with Docker Model Runner:
docker model run hf.co/Abu-Sameer-66/SciPeerAI-7B
Described by its author as the first LLM fine-tuned specifically for scientific fraud detection.
Built by Sameer Nadeem, a BS Data Science student from Bahawalpur, Pakistan.
SciPeerAI-7B analyzes scientific papers and outputs a structured JSON report covering 14 fraud dimensions simultaneously, which the author says no other model does.
Given a paper's title, authors, abstract, and text, it returns a JSON analysis with fraud scores for each dimension. The fine-tuning setup is summarized below:
| Field | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Fine-tuning method | QLoRA (4-bit, nf4) |
| LoRA rank | r=16, alpha=32 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Training steps | 500 |
| Final training loss | 0.2352 |
| Training hardware | Kaggle T4 x2 (32GB RAM) |
| Training time | 147.7 minutes |
| Dataset | SciPeerBench v1.1 (644 papers) |
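
As a rough check on what the LoRA settings in the table imply, the sketch below counts the adapter's trainable parameters for r=16 on the four attention projections and derives the average time per training step. The layer shapes (hidden size 4096, 8 key/value heads of dimension 128, 32 layers) are assumptions taken from the published Mistral-7B-v0.3 configuration, not from this card, and the function names are illustrative only.

```python
# Assumed Mistral-7B-v0.3 shapes (not stated in this card).
HIDDEN = 4096          # model hidden size
KV_DIM = 8 * 128       # 8 key/value heads x head dimension 128
NUM_LAYERS = 32        # transformer blocks
LORA_RANK = 16         # r=16, from the table above

# (in_features, out_features) of each LoRA target module in one layer.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in) and B (out x rank) for each adapted matrix."""
    return rank * (in_features + out_features)


def total_lora_params(rank=LORA_RANK):
    """Trainable adapter parameters summed over all layers."""
    per_layer = sum(lora_params(i, o, rank) for i, o in PROJECTIONS.values())
    return per_layer * NUM_LAYERS


def seconds_per_step(minutes, steps):
    """Average wall-clock seconds per optimizer step."""
    return minutes * 60 / steps


if __name__ == "__main__":
    print(total_lora_params())            # 13631488 under the assumed shapes
    print(seconds_per_step(147.7, 500))   # about 17.7 seconds per step
```

Under these assumptions the adapter holds roughly 13.6 million trainable parameters, a small fraction of the 7B base model.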
| Test Paper | Expected | Model Output |
|---|---|---|
| Wakefield 1998 (Lancet), retracted vaccine-autism fraud | FRAUD | fraud_confidence=0.99 ✓ |
| NDM-1 resistance paper (Lancet ID, 2010), landmark clean paper | CLEAN | fraud_confidence=0.05 ✓ |
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# The repository hosts a LoRA adapter, so load the base model first and then attach the adapter.
base = "mistralai/Mistral-7B-Instruct-v0.3"
adapter = "Abu-Sameer-66/SciPeerAI-7B"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)

# Prompt in the Mistral instruction format; replace the placeholder fields with the paper's details.
prompt = """<s>[INST] Analyze this scientific paper for fraud and integrity issues:
Title: Your Paper Title
Authors: Author Names
Year: 2024
Journal: Journal Name
Abstract: Your abstract text here...
Provide a detailed JSON analysis with fraud scores across all 14 dimensions. [/INST]
"""

# Greedy decoding; the decoded text contains the prompt followed by the model's reply.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=300, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
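
The snippet above prints the prompt and the reply as one string, so the JSON report still has to be pulled out of it. The following is a minimal sketch of one way to do that with the standard library; `extract_report` is a hypothetical helper, and the sample string is made up for illustration (only the `fraud_confidence` field name comes from the results table).

```python
import json


def extract_report(generated_text):
    """Return the first JSON object found in the model's reply, or None.

    If the "[/INST]" marker is present, only the text after it is searched;
    otherwise the whole string is searched.
    """
    reply = generated_text.split("[/INST]", 1)[-1]
    decoder = json.JSONDecoder()
    start = reply.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores any trailing text after it.
            report, _end = decoder.raw_decode(reply, start)
            return report
        except json.JSONDecodeError:
            # Not valid JSON from this brace (for example, truncated output);
            # try the next opening brace, if any.
            start = reply.find("{", start + 1)
    return None


if __name__ == "__main__":
    # Hypothetical generated text, for illustration only.
    sample = 'Analyze this paper... [/INST] {"fraud_confidence": 0.99} extra text'
    report = extract_report(sample)
    print(report["fraud_confidence"] if report else "no parseable JSON")
```

Returning `None` rather than raising keeps the caller in control when generation stops before the JSON object is closed, which can happen if `max_new_tokens` is too small for the full report.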
| Module | What It Detects |
|---|---|
| Statistical Audit | p-hacking, sample size issues, round numbers |
| Figure Forensics | pHash + ELA + brightness manipulation |
| Methodology Checker | causation claims, missing control groups |
| Citation Analyzer | self-citation rings, unsupported claims |
| Reproducibility Scanner | code/data/ethics/preregistration |
| Novelty Scorer | structural signals + Semantic Scholar API |
| GRIM Test | mathematically impossible means |
| SPRITE Test | impossible distributions + SD verification |
| Granularity Analyzer | digit preference, Benford's Law |
| P-Curve Analyzer | publication bias, p-value clustering |
| Effect Size Validator | Cohen's d, power analysis, inflated effects |
| Retraction Checker | retracted citations via CrossRef live API |
| Citation Cartel Detector | citation rings, network manipulation |
| LLM Paper Detector | burstiness, TTR, uniformity patterns |
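
To make one row of the table concrete, here is a minimal sketch of the general GRIM idea: when the underlying data are integers, a reported mean is only possible if some integer total divided by the sample size rounds to it. This illustrates the published technique, not necessarily how SciPeerAI implements its GRIM module; `grim_consistent` is a hypothetical name, and the mean is passed as a string so the number of reported decimals is preserved.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def grim_consistent(reported_mean, n):
    """Check whether a mean of n integer values can round to reported_mean.

    reported_mean is a string such as "5.19" so trailing zeros are kept.
    """
    reported = Decimal(reported_mean)
    decimals = -reported.as_tuple().exponent       # digits after the point
    quantum = Decimal(1).scaleb(-decimals)         # e.g. Decimal("0.01")

    # The true integer total must lie close to reported * n, so it is
    # enough to test the few integers around that product.
    lowest = math.floor(reported * n) - 1
    for total in range(lowest, lowest + 4):
        if total < 0:
            continue
        # Half-up rounding is an assumption; some GRIM implementations
        # accept either rounding direction at exact ties.
        candidate = (Decimal(total) / Decimal(n)).quantize(
            quantum, rounding=ROUND_HALF_UP
        )
        if candidate == reported:
            return True
    return False


if __name__ == "__main__":
    print(grim_consistent("5.18", 28))  # True: 145 / 28 rounds to 5.18
    print(grim_consistent("5.19", 28))  # False: no integer total gives 5.19
```

The check is only informative when the sample size is small relative to the reported precision; with two decimals and n of 100 or more, every mean passes.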
@misc{nadeem2026scipeerai,
title={SciPeerAI: Multi-dimensional Automated Scientific Integrity Analysis System},
author={Sameer Nadeem},
year={2026},
url={https://huggingface.co/Abu-Sameer-66/SciPeerAI-7B}
}