AMoresc's picture
submission
80a80da

REPORT FOR AGENTIC Zork1

Given the tools of Map, Memory and Inverntory, my original idea was that to give the model coherence of action with the current_objective tool to give continuity between steps. This is injected in the prompt at every step, and the model is invited to modify it every 20 steps. The model forgets about the current_objective object when not directly reminded, but seems to react 'meaningfully' when this is done. For example the model sometimes recognises that the current_objective is already what he is trying to do and doesn't change it. Through some experimentation, I found that the model effectively is influenced by the current_objective parameter.

The second idea was to keep a list of interactive objects, associated to their locations. This way when finding a new tool (eg. a key) the model could efficiently check if it could be used in previously seen parts (eg. a locked door). This is not kept track automatically, the model needs to manually add them, for this reason I periodically remind the model to do it every 9 moves (9 so that it does not collide with the reminder to update current_objective). The model uses the tool very sporadically, even if reminded, though occurrences where this tool is useful seem to be rare in these games.

To improve model memory I allowed a larger section of the memory to be injected in the prompt to improve unity of action, in particular the model re-reads the previous 10 actions without any cuts to have a better understanding. This successfully prevents most mindless loops. I modified the Agent to keep track of the locations visited from a certain direction (very similarly to the map tool) to avoid the model repeating movements. This damages the freedom of the model, but it seems rare that it visits again the same direction with a specific goal.

The main difficulty to me was to force the model to use the tools. Adding them to the system prompt seemed to not have effects, the model almost never consults them. To remind the model to use specific tools, it reads an additional prompt every few steps. In this way the model actually does use some of them. There are tools it prefers, like set_objective, and ones it almost never uses, even when reminded (like add_interactive).