Commit · 639f902
1 Parent(s): 1813b17
Update README.md
README.md CHANGED
@@ -10,6 +10,24 @@ app_file: app.py
pinned: true
---
## 7/23/23 - Towards A Unified Agent with Foundation Models
https://arxiv.org/abs/2307.09668

Generate a synthetic dataset for the state you want: search over the action space until you find a trajectory that reaches a cosine similarity threshold with the desired state, then add all of that trajectory's frames and states to the buffer and incorporate them into training.

You can bootstrap the process with priors and still search for the desired state.
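
The search-and-collect loop described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `toy_step`, `search_trajectory`, and `fill_buffer` are hypothetical names, states are plain vectors standing in for embeddings of frames, the search is a random walk, and `prior` is just a list of sampling weights over actions. None of this comes from the paper's code.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_step(state, action):
    """Hypothetical environment: an action is a vector added to the state."""
    return [s + a for s, a in zip(state, action)]


def search_trajectory(start, goal, actions, threshold, max_steps, rng, prior=None):
    """Random walk over actions; return the visited states once one is similar enough to the goal."""
    state = list(start)
    trajectory = [state]
    for _ in range(max_steps):
        # prior=None samples uniformly; otherwise prior weights bias the search.
        action = rng.choices(actions, weights=prior, k=1)[0]
        state = toy_step(state, action)
        trajectory.append(state)
        if cosine_similarity(state, goal) >= threshold:
            return trajectory
    return None  # threshold never reached within max_steps


def fill_buffer(start, goal, actions, attempts=100, threshold=0.95,
                max_steps=20, prior=None, seed=0):
    """Run several searches and keep every state of each successful trajectory."""
    rng = random.Random(seed)
    buffer = []
    for _ in range(attempts):
        trajectory = search_trajectory(start, goal, actions, threshold,
                                       max_steps, rng, prior)
        if trajectory is not None:
            buffer.extend(trajectory)
    return buffer
```

Passing a `prior` that favors useful actions is one simple way to bootstrap the search; the buffer returned by `fill_buffer` would then be used as training data.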
### Reward
Reward any trajectory in proportion to how semantically similar its states are to any state in a run that reached a victory condition. The reward curve can be linear or some other function of that similarity.
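
As a rough sketch of this reward, assuming states are embedding vectors and "semantically similar" means cosine similarity: `similarity_reward` and `trajectory_reward` are hypothetical names, and clipping negative similarity to zero is an assumption made here, not something the notes specify.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_reward(state, victory_states, curve=None):
    """Reward from the best match against any state seen in a winning run.

    curve=None gives a linear reward; pass a function to reshape it.
    """
    best = max(cosine_similarity(state, v) for v in victory_states)
    best = max(0.0, best)  # assumption: dissimilar states earn nothing
    return best if curve is None else curve(best)


def trajectory_reward(trajectory, victory_states, curve=None):
    """Sum of per-state rewards along a trajectory."""
    return sum(similarity_reward(s, victory_states, curve) for s in trajectory)
```

For example, `curve=lambda s: s ** 2` would reward near matches much more than partial ones, while the default keeps the reward linear in similarity.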
### Sample curve
Sample more often from sections of states with more changes in them.
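
One possible reading of this note is change-weighted sampling: transitions where the state moves more are drawn more often. The sketch below assumes that reading; `change_weights` and `sample_transitions` are hypothetical names, and using the L1 distance between consecutive states as the measure of "change" is an assumption.

```python
import random


def change_weights(states):
    """Weight of each transition i -> i+1: L1 distance between consecutive states."""
    return [
        sum(abs(b - a) for a, b in zip(s0, s1))
        for s0, s1 in zip(states, states[1:])
    ]


def sample_transitions(states, k, rng=None):
    """Draw k transition indices, favoring transitions with larger state changes."""
    rng = rng or random.Random()
    weights = change_weights(states)
    if sum(weights) == 0:
        weights = None  # nothing changes anywhere: fall back to uniform sampling
    return rng.choices(range(len(states) - 1), weights=weights, k=k)
```

Each returned index `i` refers to the transition from `states[i]` to `states[i + 1]`.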
## 7/21/23
I am going to naively, without evidence, state that you can represent any function in text with a large language model.