Merge pull request #1 from tyy0811/feature/ci-pipeline
- .github/workflows/ci.yaml +37 -3
- README.md +2 -0
- agent_bench/__init__.py +2 -0
.github/workflows/ci.yaml
CHANGED
@@ -1,13 +1,47 @@
 name: CI
-
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
 jobs:
   test:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
+
       - uses: actions/setup-python@v5
         with:
           python-version: "3.11"
+
+      - uses: actions/cache@v4
+        with:
+          path: ~/.cache/pip
+          key: ${{ runner.os }}-pip-${{ hashFiles('pyproject.toml') }}
+          restore-keys: ${{ runner.os }}-pip-
+
       - run: pip install -e ".[dev]"
-
--
+
+      - name: Lint
+        run: ruff check agent_bench/ tests/
+
+      - name: Type check
+        run: mypy agent_bench/ --ignore-missing-imports
+
+      - name: Run tests
+        run: pytest tests/ -v --tb=short
+
+  docker:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Build Docker image
+        run: docker build -f docker/Dockerfile -t agent-bench:ci .
+
+      - name: Smoke test
+        run: |
+          docker run --rm agent-bench:ci python -c \
+            "from agent_bench import __version__; print(__version__)"
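Review note: the cache step in this workflow pairs an exact `key` (OS plus a hash of `pyproject.toml`) with a looser `restore-keys` prefix. The sketch below is a simplified model of how such a lookup is commonly described: exact match first, then the most recently saved entry whose key starts with a restore prefix. It is not the `actions/cache` implementation. The function name `resolve_cache_key`, the `stored_keys` list, and the sample hash suffixes are all hypothetical.

```python
from typing import Iterable, Optional


def resolve_cache_key(
    primary_key: str,
    restore_keys: Iterable[str],
    stored_keys: Iterable[str],
) -> Optional[str]:
    """Return the stored key a restore would pick, or None on a full miss.

    Assumption: `stored_keys` is ordered from oldest to newest save.
    """
    stored = list(stored_keys)

    # 1. An exact match on the primary key wins outright.
    if primary_key in stored:
        return primary_key

    # 2. Otherwise try each restore key in order as a prefix and
    #    prefer the most recently saved entry that matches.
    for prefix in restore_keys:
        matches = [key for key in stored if key.startswith(prefix)]
        if matches:
            return matches[-1]

    # 3. Nothing usable: pip would download everything from scratch.
    return None


# Hypothetical cache contents; the suffixes stand in for hashFiles() output.
stored = ["Linux-pip-aaa111", "Linux-pip-bbb222"]

# pyproject.toml unchanged: exact hit.
print(resolve_cache_key("Linux-pip-bbb222", ["Linux-pip-"], stored))
# pyproject.toml changed: fall back to the newest "Linux-pip-" entry.
print(resolve_cache_key("Linux-pip-ccc333", ["Linux-pip-"], stored))
# Different OS: no entry shares the prefix.
print(resolve_cache_key("macOS-pip-ccc333", ["macOS-pip-"], stored))
```

Under this model, editing `pyproject.toml` does not throw the pip cache away. The run starts from the previous download cache and only fetches what changed, which is presumably why the workflow sets `restore-keys` at all.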
README.md
CHANGED
@@ -1,5 +1,7 @@
 # agent-bench
 
+
+
 Agentic RAG system with a 27-question evaluation harness, hybrid retrieval (FAISS + BM25 + RRF), tool use, and zero hallucinated citations — built from API primitives.
 
 Built as a portfolio project demonstrating AI engineering depth: provider abstraction, evaluation infrastructure, production patterns (FastAPI, Docker, CI, structured logging).
agent_bench/__init__.py
CHANGED
@@ -1 +1,3 @@
 """Evaluation-first agentic RAG system built from API primitives."""
+
+__version__ = "0.1.0"
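Review note: the new `__version__` attribute is what the Docker smoke test in `ci.yaml` imports and prints. The sketch below shows one way the same check could run outside a container: import a package by name and confirm it exposes a plain `MAJOR.MINOR.PATCH` string. The helper `read_version`, the regex, and the stand-in module `fake_agent_bench` are hypothetical and not part of this commit. The stand-in exists only so the example runs without `agent_bench` installed.

```python
import importlib
import re
import sys
import types

# Assumption: versions are plain MAJOR.MINOR.PATCH, which "0.1.0" satisfies.
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def read_version(package_name: str) -> str:
    """Import a package and return its __version__ after a format check."""
    module = importlib.import_module(package_name)
    version = getattr(module, "__version__", None)
    if not isinstance(version, str) or not VERSION_PATTERN.fullmatch(version):
        raise ValueError(
            f"{package_name} has no usable __version__: {version!r}"
        )
    return version


if __name__ == "__main__":
    # Stand-in for the real package, registered so the import succeeds.
    fake = types.ModuleType("fake_agent_bench")
    fake.__version__ = "0.1.0"
    sys.modules["fake_agent_bench"] = fake

    print(read_version("fake_agent_bench"))
```

With the real package installed, `read_version("agent_bench")` would exercise the same import path as the smoke test, and a missing or malformed `__version__` would raise instead of passing silently.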