RAGDebugEnv Reference Context
This file is a fast orientation guide for contributors and coding agents working in this repository.
What This Repository Implements
A simulated OpenEnv environment for debugging RAG retrieval pipelines through RL-style interaction.
- Server app: server/app.py
- Environment logic: server/rag_debug_env_environment.py
- Action/observation models: models.py
- Client: client.py
- Inference script: inference.py (competition-ready, uses OpenAI client)
- Corpus build pipeline: corpora/build_corpus.py and corpora/stages/*
Project Layout (Current)
rag_debug_env/
    corpora/
        build_corpus.py
        software/
        climate/
        medical/
        stages/
            s1_load.py
            s2_chunk.py
            s3_queries.py
            s4_multihop.py
            s5_embed.py
            s6_grade.py
        verify.py
        playground.py
    docs/
        ARCHITECTURE.md
        BUILD_STATUS.md
        CLAUDE.md
        CORPUS_BUILD_PLAN.md
        MODELS_REFERENCE.md
    outputs/
        eval_agent.py
        train_grpo.py
    server/
        app.py
        constants.py
        corpus.py
        fault_math.py
        rag_debug_env_environment.py
    client.py
    inference.py
    models.py
    pyproject.toml
    openenv.yaml
    Dockerfile
Runtime Facts To Keep In Mind
- All tasks currently use max_steps=10
- PipelineConfig.similarity_threshold default is 0.3 (not 0.7)
- Task 3 starts with embedding_model=legal intentionally
- Task success is based on task score thresholds in _check_success, not raw coverage alone
- Synthetic corpus fallback exists in server/corpus.py for missing artifacts
- HF Spaces port is 7860 (set in README frontmatter, Dockerfile, and openenv.yaml)
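The threshold default is easy to misremember, so here is a minimal sketch of how a cutoff of 0.3 versus 0.7 changes which chunks survive retrieval. Everything in it is illustrative: SketchPipelineConfig is a hypothetical stand-in for the real PipelineConfig in models.py, filter_by_threshold is a made-up helper, the scores are invented, and the inclusive comparison (>=) is an assumption about how the environment applies the threshold.

```python
from dataclasses import dataclass


@dataclass
class SketchPipelineConfig:
    """Hypothetical stand-in for PipelineConfig; only the 0.3 default comes from this guide."""

    similarity_threshold: float = 0.3


def filter_by_threshold(scores: dict[str, float], config: SketchPipelineConfig) -> list[str]:
    """Return chunk ids whose score meets the threshold, highest score first.

    Assumption: the cutoff is inclusive (score >= threshold).
    """
    kept = [(cid, s) for cid, s in scores.items() if s >= config.similarity_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [cid for cid, _ in kept]


# Invented similarity scores, for illustration only.
example_scores = {"chunk_a": 0.82, "chunk_b": 0.45, "chunk_c": 0.31, "chunk_d": 0.12}

default_hits = filter_by_threshold(example_scores, SketchPipelineConfig())
strict_hits = filter_by_threshold(
    example_scores, SketchPipelineConfig(similarity_threshold=0.7)
)

print(default_hits)
print(strict_hits)
```

With the default of 0.3, three of the four example chunks pass; at 0.7 only the top one does. An agent that assumes 0.7 will therefore misjudge how permissive the starting pipeline is.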
Corpus Build Facts
corpora/build_corpus.py runs all six stages and calls verify_corpus.
Outputs per domain:
- docs.json
- chunks.json
- queries.json
- ground_truth.json
- S_true_general.npy
- S_true_medical.npy
- S_true_legal.npy
- S_true_code.npy
- corpus_stats.json
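A quick way to tell whether a domain was fully built is to check for the files listed above. The sketch below does only that. It is not the logic of verify_corpus, the helper name missing_artifacts is hypothetical, and it assumes the outputs land directly under corpora/<domain>/, which this guide does not state.

```python
from pathlib import Path

# File names copied from the per-domain output list in this guide.
EXPECTED_ARTIFACTS = [
    "docs.json",
    "chunks.json",
    "queries.json",
    "ground_truth.json",
    "S_true_general.npy",
    "S_true_medical.npy",
    "S_true_legal.npy",
    "S_true_code.npy",
    "corpus_stats.json",
]


def missing_artifacts(domain_dir: Path) -> list[str]:
    """Return the expected file names that are not present in domain_dir."""
    return [name for name in EXPECTED_ARTIFACTS if not (domain_dir / name).is_file()]


if __name__ == "__main__":
    # Assumption: each domain's outputs live in corpora/<domain>/.
    for domain in ("software", "climate", "medical"):
        missing = missing_artifacts(Path("corpora") / domain)
        status = "complete" if not missing else "missing: " + ", ".join(missing)
        print(f"{domain}: {status}")
```

If a domain reports missing files, the server may be running on the synthetic corpus fallback in server/corpus.py rather than on built artifacts.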
Scripts
- outputs/eval_agent.py: GPT-4o-mini zero-shot eval agent (actively usable).
- outputs/train_grpo.py: GRPO training scaffold (stub with TODOs).
- inference.py: Competition inference script with [START]/[STEP]/[END] logging.
Commands
# Build corpus for one domain
python -m corpora.build_corpus --domain software
# Build all domains
python -m corpora.build_corpus --domain all
# Run server
uvicorn server.app:app --host 0.0.0.0 --port 7860
# Run baseline evaluator
python outputs/eval_agent.py --task 1 --episodes 3
# Run inference script
python inference.py
# Validate OpenEnv integration
openenv validate