Commit · eb5efe8
1 Parent(s): a4b0424

updated readme file

README.md CHANGED
@@ -66,6 +66,9 @@ so if its size exceeds a pre-configured threshold, it is chunked and only the mo
 - **Python code executor**: executes a snippet of Python code provided as input
 - **Think tool**: used for strategic reflection on the progress of the solving process

 **Python code Executor** can run either a snippet of Python code or a given Python file. The Python code snippet is executed using `langchain_experimental.tools.PythonREPLTool`. The Python file is executed in a sub-process.

 **Spreadsheets**: analyzes `excel` files using the pandas dataframe agent `langchain_experimental.agents.create_pandas_dataframe_agent`
@@ -75,7 +78,7 @@ and the`gpt-4.1` model.

 - **Picture analysis**: identifies the location of each chess piece on the board. Once the coordinates are identified, the FEN of
 the game is computed programmatically. Both `gpt-4.1` and `gemini-2.5-flash` models are used to extract the coordinates and an arbitration is performed on their outcomes.
-- **Move suggestion**: the best move is suggested by a `stockfish` chess engine
 - **Move interpretation**: the move is then interpreted and transcribed into the algebraic notation with the help of `gpt-4`.

 **Chess Board Picture Analysis - Challenges and Limitations**
@@ -93,7 +96,7 @@ From what I observed, this approach improved the chances of having a correct ide

 **YouTube Videos Analysis** This is work in progress

-So far, the agent is able to
 The assistant searches for the transcripts by using the `tavily extract` tool.
 TODO: analyze YouTube videos and answer questions about objects in the video.

@@ -101,8 +104,7 @@ TODO: analyze YouTube videos and answer questions about objects in the video.


 ## Future work and improvements
-- **Evaluation**:
-Evaluate the agent against other questions from the GAIA validation set.
 - **Large Web Extracts**: Try other chunking strategies.
 - **Audio Analysis**: Use a less expensive model (like whisper) to get the transcripts; if this is not enough to answer the question and more sophisticated processing is needed
 for other sounds such as music, barks, or other types of sounds, then use a better model.
@@ -66,6 +66,9 @@ so if its size exceeds a pre-configured threshold, it is chunked and only the mo
 - **Python code executor**: executes a snippet of Python code provided as input
 - **Think tool**: used for strategic reflection on the progress of the solving process

+At this point, it looks like the agent prefers to answer the mathematical question from the test set by invoking the Python code executor instead.
+The question is answered correctly. I decided not to remove this tool until I have tested the agent on other mathematical questions from the GAIA validation set.
+
 **Python code Executor** can run either a snippet of Python code or a given Python file. The Python code snippet is executed using `langchain_experimental.tools.PythonREPLTool`. The Python file is executed in a sub-process.

 **Spreadsheets**: analyzes `excel` files using the pandas dataframe agent `langchain_experimental.agents.create_pandas_dataframe_agent`
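The README text above says that a given Python file is executed in a sub-process. The commit does not show how this is done, so the following is only a minimal sketch using the standard library. The function name `run_python_file` and the `timeout_s` parameter are hypothetical and are not taken from the repository.

```python
import subprocess
import sys


def run_python_file(path: str, timeout_s: float = 30.0) -> str:
    """Run a Python file in a child process and return its output as text.

    Hypothetical helper: the name and the timeout parameter are assumptions
    made for illustration, not the project's actual tool.
    """
    result = subprocess.run(
        [sys.executable, path],  # reuse the interpreter running this script
        capture_output=True,     # collect stdout and stderr
        text=True,               # decode bytes to str
        timeout=timeout_s,       # raises subprocess.TimeoutExpired if exceeded
    )
    if result.returncode != 0:
        # Report the failure to the caller instead of raising, so an agent
        # could read the error message and react to it.
        return f"Error (exit code {result.returncode}):\n{result.stderr}"
    return result.stdout
```

Running the file in a child process keeps a crash or a call to `sys.exit` in the executed file from terminating the agent itself, which may be why the README separates this path from the in-process REPL tool.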
@@ -75,7 +78,7 @@ and the`gpt-4.1` model.

 - **Picture analysis**: identifies the location of each chess piece on the board. Once the coordinates are identified, the FEN of
 the game is computed programmatically. Both `gpt-4.1` and `gemini-2.5-flash` models are used to extract the coordinates and an arbitration is performed on their outcomes.
+- **Move suggestion**: the best move is suggested by a `stockfish` chess engine.
 - **Move interpretation**: the move is then interpreted and transcribed into the algebraic notation with the help of `gpt-4`.

 **Chess Board Picture Analysis - Challenges and Limitations**
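The picture-analysis item above says the FEN is computed programmatically once the piece coordinates are known. The commit does not include that code, so the sketch below only illustrates the idea under stated assumptions: the input is a dictionary mapping square names such as `"e1"` to FEN piece letters (uppercase for white, lowercase for black), and the function name `board_to_fen` is hypothetical. Castling rights and the en passant square cannot be read from a picture, so they are set to `-` here as a further assumption.

```python
def board_to_fen(pieces: dict[str, str], side_to_move: str = "w") -> str:
    """Build a FEN string from a mapping of squares to piece letters.

    Hypothetical helper: `pieces` maps squares like "e1" to FEN letters
    ("K" = white king, "k" = black king, "P" = white pawn, and so on).
    """
    rows = []
    # FEN lists ranks from 8 down to 1, and files from a to h within a rank.
    for rank in range(8, 0, -1):
        row = ""
        empty = 0  # length of the current run of empty squares
        for file in "abcdefgh":
            piece = pieces.get(f"{file}{rank}")
            if piece is None:
                empty += 1
                continue
            if empty:
                row += str(empty)  # flush the run of empty squares
                empty = 0
            row += piece
        if empty:
            row += str(empty)
        rows.append(row)
    # Assumption: castling and en passant are unknown ("-"), counters are reset.
    return f"{'/'.join(rows)} {side_to_move} - - 0 1"


if __name__ == "__main__":
    print(board_to_fen({"e1": "K", "e8": "k", "a2": "P"}))
```

A string in this form is what a chess engine such as `stockfish` typically accepts as a position, which fits the move-suggestion step listed in the same hunk.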
@@ -93,7 +96,7 @@ From what I observed, this approach improved the chances of having a correct ide

 **YouTube Videos Analysis** This is work in progress

+So far, the agent is able to answer questions about the conversation inside a YouTube video. There is no dedicated tool for this.
 The assistant searches for the transcripts by using the `tavily extract` tool.
 TODO: analyze YouTube videos and answer questions about objects in the video.

@@ -101,8 +104,7 @@ TODO: analyze YouTube videos and answer questions about objects in the video.


 ## Future work and improvements
+- **Evaluation**: Evaluate the agent against other questions from the GAIA validation set.
 - **Large Web Extracts**: Try other chunking strategies.
 - **Audio Analysis**: Use a less expensive model (like whisper) to get the transcripts; if this is not enough to answer the question and more sophisticated processing is needed
 for other sounds such as music, barks, or other types of sounds, then use a better model.