subhdotsol committed on
Commit 4698bb1 · 1 Parent(s): 890cea5

docs: add initial README with project overview and HuggingFace Spaces frontmatter

Files changed (1)
  1. README.md +74 -0
README.md ADDED
@@ -0,0 +1,74 @@
+ ---
+ title: RedTeamOS
+ emoji: 🛡️
+ colorFrom: red
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ license: mit
+ ---
+
+ # RedTeamOS
+
+ AI Red-Teaming Environment for Safety Research.
+ Built for the Meta PyTorch OpenEnv Hackathon.
+
+ ## Quickstart
+
+ ```bash
+ cp .env.example .env
+ # fill in HF_TOKEN, ANTHROPIC_API_KEY
+
+ pip install -r requirements.txt
+ uvicorn server.app:app --reload --port 8000
+ ```
+
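Before starting the server, it can help to confirm that the two variables named in the Quickstart are actually set. The following is a minimal sketch, assuming `.env` holds plain `KEY=VALUE` lines; the helper names `parse_env` and `missing_keys` are hypothetical and not part of this repository.

```python
from pathlib import Path

# Variable names taken from the Quickstart comment.
REQUIRED_KEYS = ("HF_TOKEN", "ANTHROPIC_API_KEY")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or empty in the .env text."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(text)
    print("missing:", ", ".join(missing) if missing else "none")
```

This only handles the simplest `.env` syntax (no `export` prefixes, no multi-line values), which is enough for a quick sanity check.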
+ ## API Endpoints
+
+ | Method | Endpoint | Description               |
+ |--------|----------|---------------------------|
+ | GET    | /health  | Liveness probe            |
+ | POST   | /reset   | Start a new episode       |
+ | POST   | /step    | Execute one attack turn   |
+ | GET    | /state   | Get current episode state |
+ | GET    | /history | Get full attack history   |
+ | POST   | /grade   | Grade a completed episode |
+
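The endpoint table can be mirrored in a small lookup so that client code does not hard-code methods and paths. This is a sketch using only the standard library; the `ENDPOINTS` mapping and `build_request` helper are hypothetical names, and the assumption that POST bodies are JSON comes from the `/step` call in the README's own example, not from a documented schema.

```python
import json
import urllib.request
from typing import Optional

# Method and path pairs copied from the endpoint table.
ENDPOINTS = {
    "health": ("GET", "/health"),
    "reset": ("POST", "/reset"),
    "step": ("POST", "/step"),
    "state": ("GET", "/state"),
    "history": ("GET", "/history"),
    "grade": ("POST", "/grade"),
}


def build_request(base_url: str, name: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one of the named endpoints."""
    method, path = ENDPOINTS[name]
    data = None
    headers = {}
    if payload is not None:
        # Assumption: request bodies are JSON, as in the README's /step example.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the mapping easy to check without a running server.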
+ ## Example Usage
+
+ ```python
+ import asyncio
+
+ import httpx
+
+
+ async def run():
+     async with httpx.AsyncClient(base_url="https://your-space.hf.space") as client:
+         # Start a new episode
+         reset = await client.post("/reset")
+         print(reset.json())
+
+         # Execute one attack turn
+         action = {
+             "strategy_type": "roleplay",
+             "target_category": "privacy",
+             "intensity": 0.5,
+             "framing": "Your attack framing here...",
+         }
+         step = await client.post("/step", json=action)
+         print(step.json())
+
+
+ asyncio.run(run())
+ ```
+
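The action dictionary in the example can be wrapped in a small type that catches obvious mistakes before a request is sent. This is a sketch only: the four field names come from the example, but the `AttackAction` class is hypothetical, and the `[0.0, 1.0]` range for `intensity` is an assumption (the README shows only the value `0.5`).

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AttackAction:
    """Client-side container for the fields shown in the /step example."""
    strategy_type: str
    target_category: str
    intensity: float
    framing: str

    def __post_init__(self) -> None:
        # Assumption: intensity is a fraction in [0.0, 1.0].
        if not 0.0 <= self.intensity <= 1.0:
            raise ValueError(f"intensity must be in [0.0, 1.0], got {self.intensity}")
        if not self.framing.strip():
            raise ValueError("framing must not be empty")

    def to_payload(self) -> dict:
        """Return a plain dict suitable for the json= argument of a POST."""
        return asdict(self)
```

With this in place, the example's call could become `client.post("/step", json=AttackAction(...).to_payload())`; the server remains the authority on what is actually valid.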
+ ## Task Difficulties
+
+ | Task   | Max Turns | Strategies Available                                    |
+ |--------|-----------|---------------------------------------------------------|
+ | easy   | 5         | roleplay, hypothetical                                  |
+ | medium | 8         | roleplay, hypothetical, persona_switch, authority_claim |
+ | hard   | 10        | all strategies                                          |
+
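The difficulty table translates directly into a lookup that a client could use to avoid sending moves the environment would reject. This is a sketch under stated assumptions: `TaskSpec`, `TASKS`, and `is_allowed` are hypothetical names, turns are assumed to be counted from 1, and because the README does not list the full strategy set, `hard` is modeled as "no restriction" rather than as an explicit list.

```python
from typing import NamedTuple, Optional


class TaskSpec(NamedTuple):
    max_turns: int
    strategies: Optional[frozenset]  # None means "all strategies"


EASY = frozenset({"roleplay", "hypothetical"})
# Medium adds two strategies on top of the easy set, per the table.
MEDIUM = EASY | {"persona_switch", "authority_claim"}

TASKS = {
    "easy": TaskSpec(5, EASY),
    "medium": TaskSpec(8, MEDIUM),
    "hard": TaskSpec(10, None),
}


def is_allowed(task: str, strategy: str, turn: int) -> bool:
    """Check a strategy and a 1-based turn number against the task's limits."""
    spec = TASKS[task]
    if not 1 <= turn <= spec.max_turns:
        return False
    return spec.strategies is None or strategy in spec.strategies
```

Because the full strategy list is unknown here, `is_allowed("hard", ...)` accepts any strategy name; only the turn limit is enforced for that task.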
+ ## Docker
+
+ ```bash
+ docker build -t redteam-env .
+ docker run -p 7860:7860 --env-file .env redteam-env
+ ```