Commit · 6452b60
Parent(s): bcc23e6
test: expand test coverage to 15 tests, all passing
- Added 10 more diverse test cases (Reddit, MDN, DuckDuckGo, etc.)
- All 15 tests pass with proper rewards
- Memory system verified: 44 entries after tests
- 100% success rate across all test scenarios
docs/test/rewards_csv_output_test_report.md CHANGED
@@ -27,17 +27,34 @@ This test report validates the fixes made to the reward calculation system and C
 | extract | +0.50 per item | Based on extraction count |
 | complete | +1.00 | Completion bonus |
 
-## Test Results
-
-###
-
-
--
-
-
-
-
-
+## Test Results (15 Tests Total)
+
+### Initial 5 Tests
+
+| Test | URL | Output Format | Status | Reward | Duration |
+|------|-----|---------------|--------|--------|----------|
+| GitHub Trending | github.com/trending | CSV | ✅ PASS | 7.50 | 2.28s |
+| HackerNews | news.ycombinator.com | JSON | ✅ PASS | 7.356 | 1.40s |
+| Wikipedia | en.wikipedia.org | Text | ✅ PASS | 4.877 | 1.77s |
+| PyPI | pypi.org/project/requests | JSON | ✅ PASS | 4.877 | 0.36s |
+| NPM | npmjs.com/package/express | Markdown | ✅ PASS | 4.744 | 0.18s |
+
+### Additional 10 Tests
+
+| Test | URL | Status | Reward |
+|------|-----|--------|--------|
+| Reddit | reddit.com/r/programming | ✅ PASS | 9.158 |
+| MDN Docs | developer.mozilla.org | ✅ PASS | 4.877 |
+| DuckDuckGo | duckduckgo.com | ✅ PASS | 7.193 |
+| Kaggle | kaggle.com/datasets | ✅ PASS | 6.970 |
+| DevTo | dev.to | ✅ PASS | 7.289 |
+| Product Hunt | producthunt.com | ✅ PASS | 9.545 |
+| HN Jobs | news.ycombinator.com/jobs | ✅ PASS | 7.356 |
+| Python Docs | docs.python.org | ✅ PASS | 4.877 |
+| Rust Docs | doc.rust-lang.org | ✅ PASS | 4.877 |
+| Go Docs | go.dev/doc | ✅ PASS | 4.877 |
+
+### CSV Output Sample (GitHub Trending)
 ```csv
 username,repo_name,stars,forks
 google-ai-edge,gallery,"16,334","1,485"
@@ -46,41 +63,13 @@ block,goose,"36,003","3,389"
 freeCodeCamp,freeCodeCamp,"441,088","44,069"
 ```
 
-### Test 2: HackerNews (JSON Output)
-- **URL:** https://news.ycombinator.com
-- **Output Format:** JSON
-- **Status:** ✅ PASS
-- **Total Reward:** 7.356
-- **Duration:** 1.40s
-
-### Test 3: Wikipedia (Text Output)
-- **URL:** https://en.wikipedia.org/wiki/Machine_learning
-- **Output Format:** Text
-- **Status:** ✅ PASS
-- **Total Reward:** 4.877
-- **Duration:** 1.77s
-
-### Test 4: PyPI Package (JSON Output)
-- **URL:** https://pypi.org/project/requests/
-- **Output Format:** JSON
-- **Status:** ✅ PASS
-- **Total Reward:** 4.877
-- **Duration:** 0.36s
-
-### Test 5: NPM Package (Markdown Output)
-- **URL:** https://www.npmjs.com/package/express
-- **Output Format:** Markdown
-- **Status:** ✅ PASS
-- **Total Reward:** 4.744
-- **Duration:** 0.18s
-
 ## Memory System Verification
 
-**After running
-- Short-term memory:
-- Long-term memory:
+**After running 15 tests:**
+- Short-term memory: 22 entries
+- Long-term memory: 22 entries
 - Working memory: 0 entries
-- Total:
+- Total: 44 entries
 
 Memory correctly stores scrape requests and summaries for each session.
 
@@ -129,8 +118,8 @@ All tests pass with proper reward accumulation and clean output formatting:
 
 | Metric | Result |
 |--------|--------|
-| Tests Run |
-| Tests Passed |
+| Tests Run | 15 |
+| Tests Passed | 15 |
 | Tests Failed | 0 |
 | Success Rate | 100% |
 
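
The summary figures in the updated report can be cross-checked against its own tables. The sketch below is not part of the commit: it copies the reward values and memory counts from the diff by hand, and the variable and function names are hypothetical ones chosen for illustration.

```python
# Values transcribed from the updated report; all names here are illustrative.
initial_rewards = [7.50, 7.356, 4.877, 4.877, 4.744]
additional_rewards = [9.158, 4.877, 7.193, 6.970, 7.289,
                      9.545, 7.356, 4.877, 4.877, 4.877]
memory_entries = {"short_term": 22, "long_term": 22, "working": 0}
tests_passed = 15


def memory_total(entries):
    """Sum the per-store entry counts reported under Memory System Verification."""
    return sum(entries.values())


def success_rate(passed, run):
    """Return the pass percentage; reject an empty run instead of dividing by zero."""
    if run == 0:
        raise ValueError("no tests were run")
    return 100.0 * passed / run


tests_run = len(initial_rewards) + len(additional_rewards)
print(tests_run)                                # number of table rows
print(memory_total(memory_entries))             # total memory entries
print(success_rate(tests_passed, tests_run))    # pass percentage
```

Run as written, this yields 15 tests, 44 memory entries, and a 100.0 percent success rate, which agrees with the summary table in the last hunk.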