Merge pull request #3 from macrocosm-os/features/random-stream
- README.md +111 -4
- assets/macrocosmos-black.png +0 -0
- assets/macrocosmos-white.png +0 -0
- responses.py +27 -0
- server.py +4 -35
- utils.py +21 -21
- validators/base.py +3 -3
- validators/sn1_validator_wrapper.py +64 -35
README.md
CHANGED
|
@@ -1,18 +1,125 @@
|
|
| 1 |
-
# chattensor-backend
|
| 2 |
-
Backend for Chattensor app
|
| 3 |
|
| 4 |
-
|
|
|
|
|
|
|
|
|
|
| 5 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 6 |
|
| 7 |
|
| 8 |
|
| 9 |
## Install
|
| 10 |
-
Create a new python environment and install the dependencies with the command
|
| 11 |
|
|
|
|
| 12 |
```bash
|
|
|
|
|
|
|
| 13 |
pip install -r requirements.txt
|
| 14 |
```
|
| 15 |
|
|
|
|
|
|
|
| 16 |
> Note: Currently the prompting library is only installable on machines with cuda devices (NVIDIA-GPU).
|
| 17 |
|
| 18 |
|
|
|
|
|
|
|
|
|
|
| 1 |
|
| 2 |
+
<picture>
|
| 3 |
+
<source srcset="./assets/macrocosmos-white.png" media="(prefers-color-scheme: dark)">
|
| 4 |
+
<img src="./assets/macrocosmos-white.png">
|
| 5 |
+
</picture>
|
| 6 |
|
| 7 |
+
<picture>
|
| 8 |
+
<source srcset="./assets/macrocosmos-black.png" media="(prefers-color-scheme: light)">
|
| 9 |
+
<img src="./assets/macrocosmos-black.png">
|
| 10 |
+
</picture>
|
| 11 |
|
| 12 |
|
| 13 |
+
<br/>
|
| 14 |
+
<br/>
|
| 15 |
+
<br/>
|
| 16 |
+
|
| 17 |
+
# Subnet 1 API
|
| 18 |
+
> Note: This project is still in development and is not yet ready for production use.
|
| 19 |
+
|
| 20 |
+
The official REST API for Bittensor's flagship subnet 1 ([prompting](https://github.com/opentensor/prompting)), built by [Macrocosmos](https://macrocosmos.ai).
|
| 21 |
+
|
| 22 |
+
Subnet 1 is a decentralized open-source network containing around 1000 highly capable LLM agents. These agents can perform a wide range of tasks, from simple math problems to complex natural language processing tasks. As subnet 1 is constantly evolving, its capabilities are always expanding. Our goal is to provide a world-class inference engine for developers and researchers alike.
|
| 23 |
+
|
| 24 |
+
This API is designed to power applications and facilitate interaction between subnets by giving developers an easy-to-use interface that enables:
|
| 25 |
+
1. **Conversation**: Chatting with the network (streaming and non-streaming)
|
| 26 |
+
2. **Data cleaning**: Filtering empty and otherwise useless responses
|
| 27 |
+
3. **Advanced inference**: Providing enhanced responses using SOTA ensembling techniques (WIP)
|
| 28 |
+
|
| 29 |
+
Validators can use this API to interact with the network and perform various tasks.
|
| 30 |
+
To run an API server, you will need a bittensor wallet that is registered as a validator on the relevant subnet (1@mainnet or 61@testnet).
|
| 31 |
+
|
| 32 |
+
NOTE: At present, miners are choosing not to stream their responses to the network. This means that the server will not be able to provide a streamed response to the client until the miner has finished processing the request. This is a temporary measure and will be resolved in the future.
|
| 33 |
+
|
| 34 |
+
## How it works
|
| 35 |
+
The API server is a RESTful API that provides endpoints for interacting with the network. It is a simple [wrapper](./validators/sn1_validator_wrapper.py) around your subnet 1 validator, which makes use of the dendrite to make queries.
|
| 36 |
|
| 37 |
## Install
|
| 38 |
+
Create a new Python environment and install the dependencies with the following commands.
|
| 39 |
|
| 40 |
+
(First time only)
|
| 41 |
```bash
|
| 42 |
+
python3.10 -m venv env
|
| 43 |
+
source env/bin/activate
|
| 44 |
pip install -r requirements.txt
|
| 45 |
```
|
| 46 |
|
| 47 |
+
> Note: This project requires python >=3.10.
|
| 48 |
+
|
| 49 |
> Note: Currently the prompting library is only installable on machines with cuda devices (NVIDIA-GPU).
|
| 50 |
|
| 51 |
+
## Run
|
| 52 |
+
|
| 53 |
+
First, activate the virtual environment:
|
| 54 |
+
|
| 55 |
+
```bash
|
| 56 |
+
source env/bin/activate
|
| 57 |
+
```
|
| 58 |
+
|
| 59 |
+
Run an API server on subnet 1 with the following command:
|
| 60 |
+
|
| 61 |
+
```bash
|
| 62 |
+
EXPECTED_ACCESS_KEY=<ACCESS_KEY> python server.py --wallet.name <WALLET_NAME> --wallet.hotkey <WALLET_HOTKEY> --netuid <NETUID> --neuron.model_id mock --neuron.tasks math --neuron.task_p 1 --neuron.device cpu
|
| 63 |
+
```
|
| 64 |
+
|
| 65 |
+
The command ensures that no GPU memory is used by the server, and that the large models used by the incentive mechanism are not loaded.
|
| 66 |
+
|
| 67 |
+
> Note: This command is subject to change as the project evolves.
|
| 68 |
+
|
| 69 |
+
We recommend that you run the server using a process manager like PM2. This will ensure that the server is always running and will restart if it crashes.
|
| 70 |
+
|
| 71 |
+
```bash
|
| 72 |
+
EXPECTED_ACCESS_KEY=<ACCESS_KEY> pm2 start server.py --interpreter python3 --name sn1-api -- --wallet.name <WALLET_NAME> --wallet.hotkey <WALLET_HOTKEY> --netuid <NETUID> --neuron.model_id mock --neuron.tasks math --neuron.task_p 1 --neuron.device cpu
|
| 73 |
+
```
|
| 74 |
+
|
| 75 |
+
## API Usage
|
| 76 |
+
At present, the API provides two endpoints: `/chat` (live) and `/echo` (test).
|
| 77 |
+
|
| 78 |
+
`/chat` is used to chat with the network and receive a response. The endpoint requires a JSON payload with the following fields:
|
| 79 |
+
- `k: int`: The number of responses to return
|
| 80 |
+
- `timeout: float`: The time in seconds to wait for a response
|
| 81 |
+
- `roles: List[str]`: The roles of the agents to query
|
| 82 |
+
- `messages: List[str]`: The messages to send to the network
|
| 83 |
+
- `prefer: str`: The preferred response to use as the default view. Should be one of `{'longest', 'shortest'}`
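The payload fields listed above can be assembled in Python before sending. The following is a minimal sketch, not part of this repository: `build_chat_payload` is a hypothetical helper, and its client-side checks (positive `k` and `timeout`, one role per message) are assumptions rather than rules the server is known to enforce.

```python
import json
from typing import Any, Dict, List

# Values accepted for `prefer`, as listed in the field description above.
ALLOWED_PREFER = {"longest", "shortest"}


def build_chat_payload(
    messages: List[str],
    roles: List[str],
    k: int = 5,
    timeout: float = 15.0,
    prefer: str = "longest",
) -> Dict[str, Any]:
    """Build a /chat request body (hypothetical helper, not part of the API)."""
    # Assumption: these sanity checks are client-side conveniences only.
    if k < 1:
        raise ValueError("k must be a positive integer")
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    if len(roles) != len(messages):
        raise ValueError("roles and messages must have the same length")
    if prefer not in ALLOWED_PREFER:
        raise ValueError(f"prefer must be one of {sorted(ALLOWED_PREFER)}")
    return {
        "k": k,
        "timeout": timeout,
        "roles": roles,
        "messages": messages,
        "prefer": prefer,
    }


payload = build_chat_payload(["What is 2 + 2?"], ["user"])
body = json.dumps(payload)  # JSON string to send as the request body
```

The resulting `body` string can be passed to any HTTP client together with the `api_key` header.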
|
| 84 |
+
|
| 85 |
+
Responses from the `/chat` endpoint are streamed back to the client as they are received from the network. Upon completion, the server will return a JSON response with the following fields:
|
| 86 |
+
- `streamed_chunks: List[str]`: The streamed responses from the network
|
| 87 |
+
- `streamed_chunks_timings: List[float]`: The time taken to receive each streamed response
|
| 88 |
+
- `synapse: StreamPromptingSynapse`: The synapse used to query the network. This contains full context and metadata about the query.
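Because the text chunks and the final JSON arrive in the same response body, a client has to separate them. The sketch below shows one possible approach. It assumes the body ends with a single JSON object whose first key is `streamed_chunks` (the key order used by `TextStreamResponse.to_dict()` in `responses.py`); `split_stream_body` and the example values are hypothetical.

```python
import json
from typing import Any, Dict, Tuple

# Assumption: the trailing JSON object starts with this key.
JSON_PREFIX = '{"streamed_chunks"'


def split_stream_body(body: str) -> Tuple[str, Dict[str, Any]]:
    """Split a fully received body into streamed text and the final JSON summary."""
    start = body.rfind(JSON_PREFIX)
    if start == -1:
        raise ValueError("no trailing JSON summary found in response body")
    text = body[:start]
    summary = json.loads(body[start:])
    return text, summary


# Illustrative body: streamed text followed by the final JSON chunk.
example_summary = {
    "streamed_chunks": ["The answer is 4"],
    "streamed_chunks_timings": [0.8],
    "uid": 12,
    "completion": "The answer is 4",
    "timing": 0.9,
}
example_body = "The answer is 4" + json.dumps(example_summary)

text, summary = split_stream_body(example_body)
```

This simple prefix search would misfire if a miner's text itself contained the prefix, so treat it as a starting point rather than a robust parser.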
|
| 89 |
+
|
| 90 |
+
|
| 91 |
+
## Testing
|
| 92 |
+
|
| 93 |
+
To test the API locally, you can use the following curl command:
|
| 94 |
+
|
| 95 |
+
```bash
|
| 96 |
+
curl --no-buffer -X POST http://0.0.0.0:10000/chat/ -H "api_key: <ACCESS_KEY>" -d '{"k": 5, "timeout": 15, "roles": ["user"], "messages": ["What is today'\''s date?"]}'
|
| 98 |
+
```
|
| 99 |
+
> Note: Use the `--no-buffer` flag to ensure that the response is streamed back to the client.
|
| 100 |
+
|
| 101 |
+
After verifying that the server is responding to requests locally, you can test the server on a remote machine.
|
| 102 |
+
|
| 103 |
+
### Troubleshooting
|
| 104 |
+
|
| 105 |
+
If you do not receive a response from the server, check that the server is running and that the port is open on the server. You can open the port using the following command:
|
| 106 |
+
|
| 107 |
+
```bash
|
| 108 |
+
sudo ufw allow 10000/tcp
|
| 109 |
+
```
|
| 110 |
+
|
| 111 |
+
---
|
| 112 |
+
|
| 113 |
+
## Contributing
|
| 114 |
+
If you would like to contribute to the project, please read the [CONTRIBUTING.md](CONTRIBUTING.md) file for more information.
|
| 115 |
+
|
| 116 |
+
You can find out more about the project by visiting the [Macrocosmos website](https://macrocosmos.ai) or by joining us in our social channels:
|
| 117 |
+
|
| 118 |
+
|
| 119 |
+

|
| 120 |
+
[](https://substack.com/@macrocosmosai)
|
| 121 |
+
[](https://twitter.com/MacrocosmosAI)
|
| 122 |
+
[](https://twitter.com/MacrocosmosAI)
|
| 123 |
+
[](www.linkedin.com/in/MacrocosmosAI)
|
| 124 |
+
[](https://opensource.org/licenses/MIT)
|
| 125 |
|
assets/macrocosmos-black.png
ADDED
|
assets/macrocosmos-white.png
ADDED
|
responses.py
ADDED
|
@@ -0,0 +1,27 @@
|
| 1 |
+
from pydantic import BaseModel, Field
|
| 2 |
+
from typing import List, Dict, Any
|
| 3 |
+
|
| 4 |
+
|
| 5 |
+
class TextStreamResponse(BaseModel):
|
| 6 |
+
streamed_chunks: List[str] = Field(
|
| 7 |
+
default_factory=list, description="List of streamed chunks."
|
| 8 |
+
)
|
| 9 |
+
streamed_chunks_timings: List[float] = Field(
|
| 10 |
+
default_factory=list, description="List of streamed chunks timings, in seconds."
|
| 11 |
+
)
|
| 12 |
+
uid: int = Field(0, description="UID of queried miner")
|
| 13 |
+
completion: str = Field(
|
| 14 |
+
"", description="The final completed string from the stream."
|
| 15 |
+
)
|
| 16 |
+
timing: float = Field(
|
| 17 |
+
0, description="Total time taken by the whole request, in seconds."
|
| 18 |
+
)
|
| 19 |
+
|
| 20 |
+
def to_dict(self):
|
| 21 |
+
return {
|
| 22 |
+
"streamed_chunks": self.streamed_chunks,
|
| 23 |
+
"streamed_chunks_timings": self.streamed_chunks_timings,
|
| 24 |
+
"uid": self.uid,
|
| 25 |
+
"completion": self.completion,
|
| 26 |
+
"timing": self.timing,
|
| 27 |
+
}
|
server.py
CHANGED
|
@@ -2,41 +2,11 @@ import asyncio
|
|
| 2 |
import utils
|
| 3 |
import bittensor as bt
|
| 4 |
from aiohttp import web
|
| 5 |
-
from aiohttp.web_response import Response
|
| 6 |
from validators import S1ValidatorAPI, QueryValidatorParams, ValidatorAPI
|
| 7 |
from middlewares import api_key_middleware, json_parsing_middleware
|
| 8 |
|
| 9 |
-
|
| 10 |
-
|
| 11 |
-
```
|
| 12 |
-
curl -X POST http://0.0.0.0:10000/chat/ -H "api_key: hello" -d '{"k": 5, "timeout": 3, "roles": ["user"], "messages": ["hello world"]}'
|
| 13 |
-
|
| 14 |
-
curl -X POST http://0.0.0.0:10000/chat/ -H "api_key: hey-michal" -d '{"k": 5, "timeout": 3, "roles": ["user"], "messages": ["on what exact date did the 21st century begin?"]}'
|
| 15 |
-
|
| 16 |
-
# stream
|
| 17 |
-
curl --no-buffer -X POST http://129.146.127.82:10000/echo/ -H "api_key: hey-michal" -d '{"k": 3, "timeout": 0.2, "roles": ["user"], "messages": ["i need to tell you something important but first"]}'
|
| 18 |
-
```
|
| 19 |
-
|
| 20 |
-
TROUBLESHOOT
|
| 21 |
-
check if port is open
|
| 22 |
-
```
|
| 23 |
-
sudo ufw allow 10000/tcp
|
| 24 |
-
sudo ufw allow 10000/tcp
|
| 25 |
-
```
|
| 26 |
-
# run
|
| 27 |
-
```
|
| 28 |
-
EXPECTED_ACCESS_KEY="hey-michal" pm2 start app.py --interpreter python3 --name app -- --neuron.model_id mock --wallet.name sn1 --wallet.hotkey v1 --netuid 1 --neuron.tasks math --neuron.task_p 1 --neuron.device cpu
|
| 29 |
-
```
|
| 30 |
-
|
| 31 |
-
basic testing
|
| 32 |
-
```
|
| 33 |
-
EXPECTED_ACCESS_KEY="hey-michal" python app.py --neuron.model_id mock --wallet.name sn1 --wallet.hotkey v1 --netuid 1 --neuron.tasks math --neuron.task_p 1 --neuron.device cpu
|
| 34 |
-
```
|
| 35 |
-
add --mock to test the echo stream
|
| 36 |
-
"""
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
async def chat(request: web.Request) -> Response:
|
| 40 |
"""
|
| 41 |
Chat endpoint for the validator.
|
| 42 |
"""
|
|
@@ -49,9 +19,8 @@ async def chat(request: web.Request) -> Response:
|
|
| 49 |
return response
|
| 50 |
|
| 51 |
|
| 52 |
-
async def echo_stream(request
|
| 53 |
-
|
| 54 |
-
return await utils.echo_stream(request_data)
|
| 55 |
|
| 56 |
|
| 57 |
class ValidatorApplication(web.Application):
|
|
|
|
| 2 |
import utils
|
| 3 |
import bittensor as bt
|
| 4 |
from aiohttp import web
|
|
|
|
| 5 |
from validators import S1ValidatorAPI, QueryValidatorParams, ValidatorAPI
|
| 6 |
from middlewares import api_key_middleware, json_parsing_middleware
|
| 7 |
|
| 8 |
+
|
| 9 |
+
async def chat(request: web.Request) -> web.StreamResponse:
|
| 10 |
"""
|
| 11 |
Chat endpoint for the validator.
|
| 12 |
"""
|
|
|
|
| 19 |
return response
|
| 20 |
|
| 21 |
|
| 22 |
+
async def echo_stream(request: web.Request) -> web.StreamResponse:
|
| 23 |
+
return await utils.echo_stream(request)
|
|
|
|
| 24 |
|
| 25 |
|
| 26 |
class ValidatorApplication(web.Application):
|
utils.py
CHANGED
|
@@ -1,8 +1,10 @@
|
|
| 1 |
import re
|
| 2 |
-
import bittensor as bt
|
| 3 |
import time
|
| 4 |
import json
|
|
|
|
|
|
|
| 5 |
from aiohttp import web
|
|
|
|
| 6 |
from collections import Counter
|
| 7 |
from prompting.rewards import DateRewardModel, FloatDiffModel
|
| 8 |
|
|
@@ -134,47 +136,45 @@ def guess_task_name(challenge: str):
|
|
| 134 |
return "qa"
|
| 135 |
|
| 136 |
|
| 137 |
-
async def echo_stream(
|
|
|
|
| 138 |
k = request_data.get("k", 1)
|
| 139 |
-
exclude = request_data.get("exclude", [])
|
| 140 |
-
timeout = request_data.get("timeout", 0.2)
|
| 141 |
message = "\n\n".join(request_data["messages"])
|
| 142 |
|
| 143 |
# Create a StreamResponse
|
| 144 |
response = web.StreamResponse(
|
| 145 |
-
status=200, reason="OK", headers={"Content-Type": "
|
| 146 |
)
|
| 147 |
-
await response.prepare()
|
| 148 |
|
| 149 |
completion = ""
|
|
|
|
|
|
|
|
|
|
| 150 |
# Echo the message k times with a timeout between each chunk
|
| 151 |
for _ in range(k):
|
| 152 |
for word in message.split():
|
| 153 |
chunk = f"{word} "
|
| 154 |
await response.write(chunk.encode("utf-8"))
|
| 155 |
completion += chunk
|
| 156 |
-
|
| 157 |
bt.logging.info(f"Echoed: {chunk}")
|
| 158 |
|
|
|
|
|
|
|
|
|
|
| 159 |
completion = completion.strip()
|
| 160 |
|
| 161 |
# Prepare final JSON chunk
|
| 162 |
-
|
| 163 |
-
|
| 164 |
-
|
| 165 |
-
|
| 166 |
-
|
| 167 |
-
|
| 168 |
-
"status_messages": ["Went well!"],
|
| 169 |
-
"status_codes": [200],
|
| 170 |
-
"completion_is_valid": [True],
|
| 171 |
-
"task_name": "echo",
|
| 172 |
-
"ensemble_result": {},
|
| 173 |
-
}
|
| 174 |
-
)
|
| 175 |
|
| 176 |
# Send the final JSON as part of the stream
|
| 177 |
-
await response.write(
|
| 178 |
|
| 179 |
# Finalize the response
|
| 180 |
await response.write_eof()
|
|
|
|
| 1 |
import re
|
|
|
|
| 2 |
import time
|
| 3 |
import json
|
| 4 |
+
import asyncio
|
| 5 |
+
import bittensor as bt
|
| 6 |
from aiohttp import web
|
| 7 |
+
from responses import TextStreamResponse
|
| 8 |
from collections import Counter
|
| 9 |
from prompting.rewards import DateRewardModel, FloatDiffModel
|
| 10 |
|
|
|
|
| 136 |
return "qa"
|
| 137 |
|
| 138 |
|
| 139 |
+
async def echo_stream(request: web.Request) -> web.StreamResponse:
|
| 140 |
+
request_data = request["data"]
|
| 141 |
k = request_data.get("k", 1)
|
|
|
|
|
|
|
| 142 |
message = "\n\n".join(request_data["messages"])
|
| 143 |
|
| 144 |
# Create a StreamResponse
|
| 145 |
response = web.StreamResponse(
|
| 146 |
+
status=200, reason="OK", headers={"Content-Type": "application/json"}
|
| 147 |
)
|
| 148 |
+
await response.prepare(request)
|
| 149 |
|
| 150 |
completion = ""
|
| 151 |
+
chunks = []
|
| 152 |
+
chunks_timings = []
|
| 153 |
+
start_time = time.time()
|
| 154 |
# Echo the message k times with a timeout between each chunk
|
| 155 |
for _ in range(k):
|
| 156 |
for word in message.split():
|
| 157 |
chunk = f"{word} "
|
| 158 |
await response.write(chunk.encode("utf-8"))
|
| 159 |
completion += chunk
|
| 160 |
+
await asyncio.sleep(0.3)
|
| 161 |
bt.logging.info(f"Echoed: {chunk}")
|
| 162 |
|
| 163 |
+
chunks.append(chunk)
|
| 164 |
+
chunks_timings.append(time.time() - start_time)
|
| 165 |
+
|
| 166 |
completion = completion.strip()
|
| 167 |
|
| 168 |
# Prepare final JSON chunk
|
| 169 |
+
response_data = TextStreamResponse(
|
| 170 |
+
streamed_chunks=chunks,
|
| 171 |
+
streamed_chunks_timings=chunks_timings,
|
| 172 |
+
completion=completion,
|
| 173 |
+
timing=time.time() - start_time,
|
| 174 |
+
).to_dict()
|
| 175 |
|
| 176 |
# Send the final JSON as part of the stream
|
| 177 |
+
await response.write(json.dumps(response_data).encode("utf-8"))
|
| 178 |
|
| 179 |
# Finalize the response
|
| 180 |
await response.write_eof()
|
validators/base.py
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
from abc import ABC, abstractmethod
|
| 2 |
from typing import List
|
| 3 |
from dataclasses import dataclass
|
| 4 |
-
from aiohttp.web import Response, Request
|
| 5 |
|
| 6 |
|
| 7 |
@dataclass
|
|
@@ -31,10 +31,10 @@ class QueryValidatorParams:
|
|
| 31 |
|
| 32 |
class ValidatorAPI(ABC):
|
| 33 |
@abstractmethod
|
| 34 |
-
async def query_validator(self, params: QueryValidatorParams) ->
|
| 35 |
pass
|
| 36 |
|
| 37 |
|
| 38 |
class MockValidator(ValidatorAPI):
|
| 39 |
-
async def query_validator(self, params: QueryValidatorParams) ->
|
| 40 |
...
|
|
|
|
| 1 |
from abc import ABC, abstractmethod
|
| 2 |
from typing import List
|
| 3 |
from dataclasses import dataclass
|
| 4 |
+
from aiohttp.web import Response, Request, StreamResponse
|
| 5 |
|
| 6 |
|
| 7 |
@dataclass
|
|
|
|
| 31 |
|
| 32 |
class ValidatorAPI(ABC):
|
| 33 |
@abstractmethod
|
| 34 |
+
async def query_validator(self, params: QueryValidatorParams) -> StreamResponse:
|
| 35 |
pass
|
| 36 |
|
| 37 |
|
| 38 |
class MockValidator(ValidatorAPI):
|
| 39 |
+
async def query_validator(self, params: QueryValidatorParams) -> StreamResponse:
|
| 40 |
...
|
validators/sn1_validator_wrapper.py
CHANGED
|
@@ -2,7 +2,8 @@ import json
|
|
| 2 |
import utils
|
| 3 |
import torch
|
| 4 |
import traceback
|
| 5 |
-
import
|
|
|
|
| 6 |
import bittensor as bt
|
| 7 |
from typing import Awaitable
|
| 8 |
from prompting.validator import Validator
|
|
@@ -12,6 +13,16 @@ from prompting.dendrite import DendriteResponseEvent
|
|
| 12 |
from .base import QueryValidatorParams, ValidatorAPI
|
| 13 |
from aiohttp.web_response import Response, StreamResponse
|
| 14 |
from deprecated import deprecated
|
| 15 |
|
| 16 |
|
| 17 |
class S1ValidatorAPI(ValidatorAPI):
|
|
@@ -75,27 +86,39 @@ class S1ValidatorAPI(ValidatorAPI):
|
|
| 75 |
return Response(status=500, reason="Internal error")
|
| 76 |
|
| 77 |
async def process_response(
|
| 78 |
-
self, response: StreamResponse,
|
| 79 |
-
):
|
| 80 |
"""Process a single response asynchronously."""
|
| 81 |
-
|
| 82 |
-
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
|
| 86 |
-
|
| 87 |
-
|
| 88 |
-
|
| 89 |
-
|
| 90 |
-
|
| 91 |
-
|
| 92 |
-
|
| 93 |
-
|
| 94 |
)
|
| 95 |
-
|
| 96 |
-
|
| 97 |
-
finally:
|
| 98 |
-
await response.write_eof() # Ensure to close the response properly
|
| 99 |
|
| 100 |
async def get_stream_response(self, params: QueryValidatorParams) -> StreamResponse:
|
| 101 |
response = StreamResponse(status=200, reason="OK")
|
|
@@ -105,7 +128,7 @@ class S1ValidatorAPI(ValidatorAPI):
|
|
| 105 |
|
| 106 |
try:
|
| 107 |
# Guess the task name of current request
|
| 108 |
-
task_name = utils.guess_task_name(params.messages[-1])
|
| 109 |
|
| 110 |
# Get the list of uids to query for this step.
|
| 111 |
uids = get_random_uids(
|
|
@@ -115,6 +138,8 @@ class S1ValidatorAPI(ValidatorAPI):
|
|
| 115 |
|
| 116 |
# Make calls to the network with the prompt.
|
| 117 |
bt.logging.info(f"Calling dendrite")
|
|
|
|
|
|
|
| 118 |
streams_responses = await self.validator.dendrite(
|
| 119 |
axons=axons,
|
| 120 |
synapse=StreamPromptingSynapse(
|
|
@@ -125,13 +150,24 @@ class S1ValidatorAPI(ValidatorAPI):
|
|
| 125 |
streaming=True,
|
| 126 |
)
|
| 127 |
|
| 128 |
-
|
| 129 |
-
self.process_response(uid, res)
|
| 130 |
-
for uid, res in dict(zip(uids, streams_responses))
|
| 131 |
-
]
|
| 132 |
-
results = await asyncio.gather(*tasks, return_exceptions=True)
|
| 133 |
|
| 134 |
-
|
| 135 |
except Exception as e:
|
| 136 |
bt.logging.error(
|
| 137 |
f"Encountered an error in {self.__class__.__name__}:get_stream_response:\n{traceback.format_exc()}"
|
|
@@ -144,11 +180,4 @@ class S1ValidatorAPI(ValidatorAPI):
|
|
| 144 |
return response
|
| 145 |
|
| 146 |
async def query_validator(self, params: QueryValidatorParams) -> Response:
|
| 147 |
-
|
| 148 |
-
stream = params.request.get("stream", False)
|
| 149 |
-
|
| 150 |
-
if stream:
|
| 151 |
-
return await self.get_stream_response(params)
|
| 152 |
-
else:
|
| 153 |
-
# DEPRECATED
|
| 154 |
-
return await self.get_response(params)
|
|
|
|
| 2 |
import utils
|
| 3 |
import torch
|
| 4 |
import traceback
|
| 5 |
+
import time
|
| 6 |
+
import random
|
| 7 |
import bittensor as bt
|
| 8 |
from typing import Awaitable
|
| 9 |
from prompting.validator import Validator
|
|
|
|
| 13 |
from .base import QueryValidatorParams, ValidatorAPI
|
| 14 |
from aiohttp.web_response import Response, StreamResponse
|
| 15 |
from deprecated import deprecated
|
| 16 |
+
from dataclasses import dataclass
|
| 17 |
+
from typing import List
|
| 18 |
+
from responses import TextStreamResponse
|
| 19 |
+
|
| 20 |
+
|
| 21 |
+
@dataclass
|
| 22 |
+
class ProcessedStreamResponse:
|
| 23 |
+
streamed_chunks: List[str]
|
| 24 |
+
streamed_chunks_timings: List[float]
|
| 25 |
+
synapse: StreamPromptingSynapse
|
| 26 |
|
| 27 |
|
| 28 |
class S1ValidatorAPI(ValidatorAPI):
|
|
|
|
| 86 |
return Response(status=500, reason="Internal error")
|
| 87 |
|
| 88 |
async def process_response(
|
| 89 |
+
self, response: StreamResponse, async_generator: Awaitable
|
| 90 |
+
) -> ProcessedStreamResponse:
|
| 91 |
"""Process a single response asynchronously."""
|
| 92 |
+
# Initialize chunk with a default value
|
| 93 |
+
chunk = None
|
| 94 |
+
# Initialize chunk array to accumulate streamed chunks
|
| 95 |
+
chunks = []
|
| 96 |
+
chunks_timings = []
|
| 97 |
+
|
| 98 |
+
start_time = time.time()
|
| 99 |
+
last_sent_index = 0
|
| 100 |
+
async for chunk in async_generator:
|
| 101 |
+
if isinstance(chunk, list):
|
| 102 |
+
# Chunks are currently returned in string arrays, so we need to concatenate them
|
| 103 |
+
concatenated_chunks = "".join(chunk)
|
| 104 |
+
new_data = concatenated_chunks[last_sent_index:]
|
| 105 |
+
|
| 106 |
+
if new_data:
|
| 107 |
+
await response.write(new_data.encode("utf-8"))
|
| 108 |
+
bt.logging.info(f"Received new chunk from miner: {chunk}")
|
| 109 |
+
last_sent_index += len(new_data)
|
| 110 |
+
chunks.extend(chunk)
|
| 111 |
+
chunks_timings.append(time.time() - start_time)
|
| 112 |
+
|
| 113 |
+
if chunk is not None and isinstance(chunk, StreamPromptingSynapse):
|
| 114 |
+
# Assuming the last chunk holds the last value yielded which should be a synapse with the completion filled
|
| 115 |
+
return ProcessedStreamResponse(
|
| 116 |
+
synapse=chunk,
|
| 117 |
+
streamed_chunks=chunks,
|
| 118 |
+
streamed_chunks_timings=chunks_timings,
|
| 119 |
)
|
| 120 |
+
else:
|
| 121 |
+
raise ValueError("The last chunk is not a StreamPromptingSynapse")
|
|
|
|
|
|
|
| 122 |
|
| 123 |
async def get_stream_response(self, params: QueryValidatorParams) -> StreamResponse:
|
| 124 |
response = StreamResponse(status=200, reason="OK")
|
|
|
|
| 128 |
|
| 129 |
try:
|
| 130 |
# Guess the task name of current request
|
| 131 |
+
# task_name = utils.guess_task_name(params.messages[-1])
|
| 132 |
|
| 133 |
# Get the list of uids to query for this step.
|
| 134 |
uids = get_random_uids(
|
|
|
|
| 138 |
|
| 139 |
# Make calls to the network with the prompt.
|
| 140 |
bt.logging.info(f"Calling dendrite")
|
| 141 |
+
start_time = time.time()
|
| 142 |
+
|
| 143 |
streams_responses = await self.validator.dendrite(
|
| 144 |
axons=axons,
|
| 145 |
synapse=StreamPromptingSynapse(
|
|
|
|
| 150 |
streaming=True,
|
| 151 |
)
|
| 152 |
|
| 153 |
+
uid_stream_dict = dict(zip(uids, streams_responses))
|
|
|
|
|
|
|
|
|
|
|
|
|
| 154 |
|
| 155 |
+
random_uid, random_stream = random.choice(list(uid_stream_dict.items()))
|
| 156 |
+
processed_response = await self.process_response(response, random_stream)
|
| 157 |
+
|
| 158 |
+
# Prepare final JSON chunk
|
| 159 |
+
response_data = json.dumps(
|
| 160 |
+
TextStreamResponse(
|
| 161 |
+
streamed_chunks=processed_response.streamed_chunks,
|
| 162 |
+
streamed_chunks_timings=processed_response.streamed_chunks_timings,
|
| 163 |
+
uid=random_uid,
|
| 164 |
+
completion=processed_response.synapse.completion,
|
| 165 |
+
timing=time.time() - start_time,
|
| 166 |
+
).to_dict()
|
| 167 |
+
)
|
| 168 |
+
|
| 169 |
+
# Send the final JSON as part of the stream
|
| 170 |
+
await response.write(response_data.encode("utf-8"))
|
| 171 |
except Exception as e:
|
| 172 |
bt.logging.error(
|
| 173 |
f"Encountered an error in {self.__class__.__name__}:get_stream_response:\n{traceback.format_exc()}"
|
|
|
|
| 180 |
return response
|
| 181 |
|
| 182 |
async def query_validator(self, params: QueryValidatorParams) -> Response:
|
| 183 |
+
return await self.get_stream_response(params)
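The incremental forwarding in `process_response` can be illustrated in isolation. This is a minimal sketch under one assumption: each list yielded by the miner stream is cumulative, which the `last_sent_index` slicing suggests but the diff does not confirm. `fake_miner_stream` and `collect_deltas` are hypothetical names used only for illustration.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_miner_stream() -> AsyncIterator[List[str]]:
    """Hypothetical stand-in for a miner stream yielding cumulative chunk lists."""
    for cumulative in (["Hel"], ["Hel", "lo"], ["Hel", "lo", " world"]):
        await asyncio.sleep(0)  # give control back to the event loop
        yield cumulative


async def collect_deltas(stream: AsyncIterator[List[str]]) -> List[str]:
    """Return only the text that is new in each yielded list."""
    deltas: List[str] = []
    last_sent_index = 0
    async for chunk in stream:
        joined = "".join(chunk)
        # Everything before last_sent_index has already been forwarded.
        new_data = joined[last_sent_index:]
        if new_data:
            deltas.append(new_data)
            last_sent_index += len(new_data)
    return deltas


deltas = asyncio.run(collect_deltas(fake_miner_stream()))
```

In the wrapper, each delta is written to the `StreamResponse` instead of being appended to a list, so the client sees text as soon as it arrives without duplicates.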
|