Aamer Mihaysi

O96a

https://www.mehaisi.com/

AI & ML interests

Ethical AI, NLP & Cognitive architectures

Recent Activity

updated a Space about 7 hours ago

O96a/exp-006-stale-memory-validator

published a Space about 7 hours ago

O96a/exp-006-stale-memory-validator

published a Space 1 day ago

O96a/graphrag-memory-exp

O96a's Spaces 51

Implicit Memory Conflict Validator

🧠

Evaluate LLM responses for outdated memory conflicts

Graphrag Memory Exp

🐨

Sudanese CoT Reasoning Benchmark

🧠

Run Sudanese Arabic reasoning benchmark with step-by-step analysis

COPSD Sudanese Reasoning Demo

🚀

Compare Sudanese math reasoning with and without English context

Sudanese Arabic Context Optimization

🧪

PrefixGuard Demo - Agent Failure Detection

🛡

Detect potential agent failures from execution traces

DCI vs Semantic RAG: Testing Direct Corpus Interaction

🔍

LoPE Demo - Prompt Perturbation for Reasoning Exploration

🧠

Compare baseline and perturbed reasoning for tasks

Sudanese Poetry Experiment

Generate Sudanese Arabic poetry from any topic

Sudanese Poetry Experiment

🚀

Generate Sudanese Arabic poems on any topic

LenVM Token-Level Length Control Demo

📏

Lost-in-Thought Benchmark

🧠

Run a benchmark to see how reasoning steps affect retrieval accuracy

Sudanese Dialect Mt Stress

🏃

Master Key Capability Demo

🔑

Show expected accuracy boost for a math problem via steering

AutoResearchBench Explorer

🔬

AutoResearchBench Explorer

🔬

OneManCompany Talent Market Explorer

🚀

OneManCompany Talent Market Explorer

🚀

Agentic World Model Explorer

🚀

Explore world model levels, laws, and rollouts interactively

COSPLAY Skill Bank Demo

🚀

Generate baseline vs skill-augmented LLM answers

COSPLAY Skill Bank Demo

🚀

ARIS Adversarial Review Demo

Hierarchical Tree RAG Demo

Step-level Cascade for Efficient Agents
