Commit 3b3f564
Parent(s): 58a8b0b
fix multimodal

Files changed:
- README.md (+3 -1)
- src/FinancialAgentApp.py (+2 -3)
README.md
CHANGED

@@ -21,6 +21,8 @@ In this task I select **decoder-only** model, because we want them as conversati
 - Proprietary model: **OpenAI GPT-5** -> chosen for its capability and integrated service; DeepSeek offers a lower price, but during peak hours the DeepSeek API is often unreliable.
 - Local model: **QWEN3-4b** -> since we only have limited RAM and no GPU, we need a reliable but small enough model. Compared to a same-size model (Gemma), Qwen3 outperforms on reasoning and analytical capability; this reasoning model also outperforms DeepSeek-R1. [link](https://artificialanalysis.ai/models/gemma-3-4b?models=gpt-4-1%2Cgpt-oss-120b%2Cgpt-oss-20b%2Cgpt-5%2Co3%2Cgpt-5-minimal%2Cllama-4-maverick%2Cgemini-2-5-pro%2Cgemini-2-5-flash-reasoning%2Cgemma-3-4b%2Cclaude-4-1-opus-thinking%2Cclaude-4-sonnet-thinking%2Cmistral-medium-3-1%2Cdeepseek-r1%2Cdeepseek-v3-1-reasoning%2Cdeepseek-v3-1%2Cdeepseek-r1-qwen3-8b%2Cgrok-code-fast-1%2Cgrok-4%2Csolar-pro-2-reasoning%2Cllama-nemotron-super-49b-v1-5-reasoning%2Ckimi-k2-0905%2Cexaone-4-0-32b-reasoning%2Cglm-4.5%2Cqwen3-235b-a22b-instruct-2507-reasoning%2Cqwen3-4b-2507-instruct-reasoning%2Cqwen1.5-110b-chat)
 
+Compared to the hosted OpenAI model, the local QWEN3-4B is at least 6x slower (6x or higher latency).
+
 ## File handling
 The file is stored as a PKL to optimize data loading; since it is small, we don't need to keep it in a database.
 
@@ -54,7 +56,7 @@ PDF saved in vector store; for non-OpenAI models we use FAISS as database because it h
 I chose the `sentence-transformers/all-MiniLM-L6-v2` embedding model because it balances accuracy and latency: its accuracy is comparable to `sentence-transformers/all-mpnet-base-v2`, but with lower latency, which makes it feasible for retriever embedding.
 
 ## Ollama API
-As an alternative, we provide a self-hosted Ollama API on Hugging Face. This API already has the "qwen3-4b" model enabled. [
+As an alternative, we provide a self-hosted Ollama API on Hugging Face. This API already has the "qwen3-4b" model enabled. [https://mrfirdauss-ollama-api.hf.space](https://mrfirdauss-ollama-api.hf.space)
 
 ## Schema
 ```
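The self-hosted endpoint above speaks Ollama's HTTP API, so it can be queried like any Ollama server. Below is a minimal sketch of a non-streaming call with the `qwen3-4b` model named in the README; the `/api/generate` route and the `response` field are the stock Ollama API shape, which we assume (but have not verified) this deployment exposes unchanged.

```python
import json
import urllib.request

# Self-hosted Ollama endpoint from the README.
OLLAMA_BASE = "https://mrfirdauss-ollama-api.hf.space"


def build_payload(prompt: str, model: str = "qwen3-4b") -> dict:
    """Assemble a non-streaming request body for Ollama's /api/generate route."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str) -> str:
    """POST the prompt and return the generated text from Ollama's JSON reply."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_BASE}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        # A non-streaming Ollama reply is one JSON object; the text is in "response".
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(generate("Summarize the Q3 revenue trend in one sentence."))
```

Given the latency note above, a generous timeout is deliberate: the small local model can take several times longer than a hosted API to answer.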
src/FinancialAgentApp.py
CHANGED

@@ -90,11 +90,10 @@ class FinancialAgentApp (ABC):
         if fig.get_axes():
             with self.st.chat_message("assistant"):
                 self.st.pyplot(fig)
-                buf = self.__safe_savefig__()
             self.st.session_state.messages.append({
                 "role": "assistant",
                 "type": "plot",
-                "content":
+                "content": fig
             })
             plt.close(fig)
 
@@ -111,7 +110,7 @@ class FinancialAgentApp (ABC):
         self.__stream_answer__(
             instructions=FINAL_PROMPT,
             input_messages=[
-                {"role": m["role"], "content": m["content"]}
+                {"role": m["role"], "content": m["content"]} if m['type'] != 'plot' or m['type'] is None else {}
                 for m in self.st.session_state.messages
             ] + [{"role": "user", "content": context_prompt}]
         )
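Note that the committed guard maps plot messages to empty dicts (which still reach the model) and indexes `m['type']` directly, which raises `KeyError` for any message without a `"type"` key; the `or m['type'] is None` branch is also redundant, since `None != 'plot'` already holds. A safer variant (a sketch, not the committed code) drops plot entries in the comprehension's `if` clause and uses `.get()`:

```python
def text_history(messages: list[dict]) -> list[dict]:
    """Keep only text messages, shaped as {"role", "content"} pairs.

    Plot messages carry a Matplotlib figure in "content", which can't be sent
    to a chat model, so they are filtered out rather than mapped to {}.
    .get() tolerates messages that never set a "type" key.
    """
    return [
        {"role": m["role"], "content": m["content"]}
        for m in messages
        if m.get("type") != "plot"
    ]


# Hypothetical session history illustrating all three cases.
history = [
    {"role": "user", "type": "text", "content": "Plot revenue by quarter"},
    {"role": "assistant", "type": "plot", "content": "<Figure>"},
    {"role": "assistant", "content": "Here is the chart."},  # no "type" key
]
print(text_history(history))
```

With a helper like this, the call site above could pass `input_messages=text_history(self.st.session_state.messages) + [{"role": "user", "content": context_prompt}]`, keeping the comprehension's output shape uniform.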
|