Nomearod/agentbench
Jane Yeung committed on Mar 25

Commit 2fc13b5

2 Parent(s): e2dd614 a6e2a9c

Merge pull request #1 from tyy0811/feature/ci-pipeline

Files changed (3)

.github/workflows/ci.yaml CHANGED

@@ -1,13 +1,47 @@
 name: CI
-on: [push, pull_request]
 jobs:
   test:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
       - uses: actions/setup-python@v5
         with:
           python-version: "3.11"
       - run: pip install -e ".[dev]"
-      - run: make lint
-      - run: make test

 name: CI
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
 jobs:
   test:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
       - uses: actions/setup-python@v5
         with:
           python-version: "3.11"
+      - uses: actions/cache@v4
+        with:
+          path: ~/.cache/pip
+          key: ${{ runner.os }}-pip-${{ hashFiles('pyproject.toml') }}
+          restore-keys: ${{ runner.os }}-pip-
       - run: pip install -e ".[dev]"
+      - name: Lint
+        run: ruff check agent_bench/ tests/
+      - name: Type check
+        run: mypy agent_bench/ --ignore-missing-imports
+      - name: Run tests
+        run: pytest tests/ -v --tb=short
+  docker:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Build Docker image
+        run: docker build -f docker/Dockerfile -t agent-bench:ci .
+      - name: Smoke test
+        run: |
+          docker run --rm agent-bench:ci python -c \
+            "from agent_bench import __version__; print(__version__)"
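
Note on the cache step above: `key` is tried first, and `restore-keys` acts as a prefix fallback so a changed `pyproject.toml` can still reuse an older pip cache. The Python sketch below is a simplified model of that lookup order, not the `actions/cache` implementation. The function name `pick_cache`, the `existing` mapping, and the integer timestamps are hypothetical, and branch scoping is ignored.

```python
from typing import Dict, List, Optional


def pick_cache(key: str, restore_keys: List[str], existing: Dict[str, int]) -> Optional[str]:
    """Return the cache entry a lookup would restore, or None on a miss.

    `existing` maps a stored cache key to its creation time (hypothetical
    integer timestamps; larger means newer).
    """
    # 1. An exact match on the primary key always wins.
    if key in existing:
        return key
    # 2. Otherwise try prefixes in order: the primary key, then each restore key.
    for prefix in [key, *restore_keys]:
        candidates = [k for k in existing if k.startswith(prefix)]
        if candidates:
            # Among several prefix matches, prefer the most recently created.
            return max(candidates, key=existing.__getitem__)
    return None


# Example keys shaped like "<runner.os>-pip-<hash>"; the hash parts are made up.
stored = {"Linux-pip-aaa": 1, "Linux-pip-bbb": 2}
print(pick_cache("Linux-pip-ccc", ["Linux-pip-"], stored))
```

With a new `pyproject.toml` hash there is no exact hit, so the newest entry under the `Linux-pip-` prefix is restored and pip only downloads what changed.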

README.md CHANGED

@@ -1,5 +1,7 @@
 # agent-bench
 Agentic RAG system with a 27-question evaluation harness, hybrid retrieval (FAISS + BM25 + RRF), tool use, and zero hallucinated citations — built from API primitives.
 Built as a portfolio project demonstrating AI engineering depth: provider abstraction, evaluation infrastructure, production patterns (FastAPI, Docker, CI, structured logging).

 # agent-bench
+![CI](https://github.com/tyy0811/agent-bench/actions/workflows/ci.yaml/badge.svg)
 Agentic RAG system with a 27-question evaluation harness, hybrid retrieval (FAISS + BM25 + RRF), tool use, and zero hallucinated citations — built from API primitives.
 Built as a portfolio project demonstrating AI engineering depth: provider abstraction, evaluation infrastructure, production patterns (FastAPI, Docker, CI, structured logging).

agent_bench/__init__.py CHANGED

@@ -1 +1,3 @@
 """Evaluation-first agentic RAG system built from API primitives."""

 """Evaluation-first agentic RAG system built from API primitives."""
+
+__version__ = "0.1.0"
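
The Docker smoke test in `ci.yaml` only prints `__version__`, so it passes for any string. A slightly stricter check could also validate the format. The sketch below assumes the project wants plain `MAJOR.MINOR.PATCH` versions; the helper name `looks_like_release_version` is hypothetical, and the local `__version__` stands in for `agent_bench.__version__` so the snippet runs on its own.

```python
import re

# Assumed format: three dot-separated numeric parts, e.g. "0.1.0".
_VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def looks_like_release_version(version: str) -> bool:
    """Return True if the whole string is MAJOR.MINOR.PATCH."""
    return _VERSION_RE.fullmatch(version) is not None


# Stand-in for `from agent_bench import __version__`.
__version__ = "0.1.0"

if not looks_like_release_version(__version__):
    raise SystemExit(f"unexpected version string: {__version__!r}")
print(__version__)
```

Exiting non-zero on a malformed version would make the `docker run` step fail, which is what the CI job needs to surface the problem.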