Commit 3b3f564
Parent(s): 58a8b0b
fix multimodal

Files changed:
- README.md (+3 -1)
- src/FinancialAgentApp.py (+2 -3)
README.md
CHANGED

@@ -21,6 +21,8 @@ In this task I select **decoder-only** model, because we want them as conversati
 - Proprietary model: **OpenAI GPT-5** -> chosen for its capability and integrated service; DeepSeek offers a lower price, but during peak hours the DeepSeek API is often unreliable.
 - Local model: **QWEN3-4b** -> since we only have limited RAM and no GPU, we need a reliable but small enough model. Compared to a same-size model (Gemma), Qwen3 outperforms on reasoning and analytical capability; this reasoning model also outperforms DeepSeek-R1. [link](https://artificialanalysis.ai/models/gemma-3-4b?models=gpt-4-1%2Cgpt-oss-120b%2Cgpt-oss-20b%2Cgpt-5%2Co3%2Cgpt-5-minimal%2Cllama-4-maverick%2Cgemini-2-5-pro%2Cgemini-2-5-flash-reasoning%2Cgemma-3-4b%2Cclaude-4-1-opus-thinking%2Cclaude-4-sonnet-thinking%2Cmistral-medium-3-1%2Cdeepseek-r1%2Cdeepseek-v3-1-reasoning%2Cdeepseek-v3-1%2Cdeepseek-r1-qwen3-8b%2Cgrok-code-fast-1%2Cgrok-4%2Csolar-pro-2-reasoning%2Cllama-nemotron-super-49b-v1-5-reasoning%2Ckimi-k2-0905%2Cexaone-4-0-32b-reasoning%2Cglm-4.5%2Cqwen3-235b-a22b-instruct-2507-reasoning%2Cqwen3-4b-2507-instruct-reasoning%2Cqwen1.5-110b-chat)
 
+Compared to the hosted OpenAI model, the local QWEN3-4B is at least 6x slower (6x or higher latency).
+
 ## File handling
 The file is stored as a PKL to optimize data loading; since it is small, we don't need to keep it in a database.
 
@@ -54,7 +56,7 @@ PDF saved in vector store; for non-OpenAI models we use FAISS as database because it h
 I chose the `sentence-transformers/all-MiniLM-L6-v2` embedding model because it balances accuracy and latency: its accuracy is comparable to `sentence-transformers/all-mpnet-base-v2`, but with lower latency, which makes it feasible for retriever embedding.
 
 ## Ollama API
-As an alternative, we provide a self-hosted Ollama API on Hugging Face. This API already has the "qwen3-4b" model enabled. [
+As an alternative, we provide a self-hosted Ollama API on Hugging Face. This API already has the "qwen3-4b" model enabled. [https://mrfirdauss-ollama-api.hf.space](https://mrfirdauss-ollama-api.hf.space)
 
 ## Schema
 ```
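The self-hosted endpoint above speaks Ollama's HTTP API, so it can be queried like any Ollama server. Below is a minimal sketch of a non-streaming call with the `qwen3-4b` model named in the README; the `/api/generate` route and the `response` field are the stock Ollama API shape, which we assume (but have not verified) this deployment exposes unchanged.

```python
import json
import urllib.request

# Self-hosted Ollama endpoint from the README.
OLLAMA_BASE = "https://mrfirdauss-ollama-api.hf.space"


def build_payload(prompt: str, model: str = "qwen3-4b") -> dict:
    """Assemble a non-streaming request body for Ollama's /api/generate route."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str) -> str:
    """POST the prompt and return the generated text from Ollama's JSON reply."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_BASE}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        # A non-streaming Ollama reply is one JSON object; the text is in "response".
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(generate("Summarize the Q3 revenue trend in one sentence."))
```

Given the latency note above, a generous timeout is deliberate: the small local model can take several times longer than a hosted API to answer.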
src/FinancialAgentApp.py
CHANGED

@@ -90,11 +90,10 @@ class FinancialAgentApp (ABC):
         if fig.get_axes():
             with self.st.chat_message("assistant"):
                 self.st.pyplot(fig)
-                buf = self.__safe_savefig__()
             self.st.session_state.messages.append({
                 "role": "assistant",
                 "type": "plot",
-                "content":
+                "content": fig
             })
             plt.close(fig)
 
@@ -111,7 +110,7 @@ class FinancialAgentApp (ABC):
         self.__stream_answer__(
             instructions=FINAL_PROMPT,
             input_messages=[
-                {"role": m["role"], "content": m["content"]}
+                {"role": m["role"], "content": m["content"]} if m['type'] != 'plot' or m['type'] is None else {}
                 for m in self.st.session_state.messages
             ] + [{"role": "user", "content": context_prompt}]
         )
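Note that the committed guard maps plot messages to empty dicts (which still reach the model) and indexes `m['type']` directly, which raises `KeyError` for any message without a `"type"` key; the `or m['type'] is None` branch is also redundant, since `None != 'plot'` already holds. A safer variant (a sketch, not the committed code) drops plot entries in the comprehension's `if` clause and uses `.get()`:

```python
def text_history(messages: list[dict]) -> list[dict]:
    """Keep only text messages, shaped as {"role", "content"} pairs.

    Plot messages carry a Matplotlib figure in "content", which can't be sent
    to a chat model, so they are filtered out rather than mapped to {}.
    .get() tolerates messages that never set a "type" key.
    """
    return [
        {"role": m["role"], "content": m["content"]}
        for m in messages
        if m.get("type") != "plot"
    ]


# Hypothetical session history illustrating all three cases.
history = [
    {"role": "user", "type": "text", "content": "Plot revenue by quarter"},
    {"role": "assistant", "type": "plot", "content": "<Figure>"},
    {"role": "assistant", "content": "Here is the chart."},  # no "type" key
]
print(text_history(history))
```

With a helper like this, the call site above could pass `input_messages=text_history(self.st.session_state.messages) + [{"role": "user", "content": context_prompt}]`, keeping the comprehension's output shape uniform.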
|