fix conflict
- README.md +17 -0
- src/FinancialAgentApp.py +2 -3
README.md
CHANGED
@@ -36,6 +36,23 @@ As alternative we provie self hosted ollama api on hugginface. This API already
## Design Pattern

I apply the **`Factory Design Pattern`** because we want every library we use to expose the same method interface, even though each library's original methods differ. Thus we create an abstract factory base class that all concrete implementations follow.
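The idea above can be sketched as follows; the class and function names here (`LLMClient`, `chat`, `make_client`) are illustrative, not the repository's actual API:

```python
from abc import ABC, abstractmethod

class LLMClient(ABC):
    """Abstract product: every provider must expose the same chat() method."""

    @abstractmethod
    def chat(self, prompt: str) -> str:
        ...

class OpenAIClient(LLMClient):
    def chat(self, prompt: str) -> str:
        # A real implementation would call OpenAI's SDK here.
        return f"[openai] {prompt}"

class OllamaClient(LLMClient):
    def chat(self, prompt: str) -> str:
        # A real implementation would call the Ollama REST API here.
        return f"[ollama] {prompt}"

def make_client(provider: str) -> LLMClient:
    """Factory function: map a provider name to a concrete client."""
    registry = {"openai": OpenAIClient, "ollama": OllamaClient}
    return registry[provider]()

print(make_client("ollama").chat("hello"))  # → [ollama] hello
```

Because callers only see the `chat()` interface, swapping GPT-5 for a local Qwen model is a one-line change at the factory call site.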
+## Model Selection
+
+For this task I select a **decoder-only** model, because we want a conversational model rather than one built for a specific task. To reduce hallucination, we add RAG context.
+
+- Proprietary model: **OpenAI GPT-5**, chosen for its capability and integrated services. DeepSeek offers a lower price, but during peak hours the DeepSeek API is often unreliable.
+- Local model: **Qwen3-4B**. Since we have only limited RAM and no GPU, we need a model that is reliable yet small. Compared with a same-size model (Gemma), Qwen3 performs better on reasoning and analytical tasks, and this reasoning model also outperforms DeepSeek-R1. [link](https://artificialanalysis.ai/models/gemma-3-4b?models=gpt-4-1%2Cgpt-oss-120b%2Cgpt-oss-20b%2Cgpt-5%2Co3%2Cgpt-5-minimal%2Cllama-4-maverick%2Cgemini-2-5-pro%2Cgemini-2-5-flash-reasoning%2Cgemma-3-4b%2Cclaude-4-1-opus-thinking%2Cclaude-4-sonnet-thinking%2Cmistral-medium-3-1%2Cdeepseek-r1%2Cdeepseek-v3-1-reasoning%2Cdeepseek-v3-1%2Cdeepseek-r1-qwen3-8b%2Cgrok-code-fast-1%2Cgrok-4%2Csolar-pro-2-reasoning%2Cllama-nemotron-super-49b-v1-5-reasoning%2Ckimi-k2-0905%2Cexaone-4-0-32b-reasoning%2Cglm-4.5%2Cqwen3-235b-a22b-instruct-2507-reasoning%2Cqwen3-4b-2507-instruct-reasoning%2Cqwen1.5-110b-chat)
+
+## File handling
+
+The data file is stored as a PKL (pickle) file to optimize loading; because it is small, we do not need to put it in a database.
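A pickle round-trip looks like this; the file name below is hypothetical, not the repository's actual path:

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical file name; the actual PKL file in the repo may differ.
path = Path(tempfile.gettempdir()) / "financial_records.pkl"

records = [{"ticker": "AAPL", "close": 189.5}]
with path.open("wb") as f:
    pickle.dump(records, f)   # serializes Python objects directly

with path.open("rb") as f:
    loaded = pickle.load(f)   # no parsing step needed, unlike CSV/JSON

print(loaded)  # → [{'ticker': 'AAPL', 'close': 189.5}]
```

Since `pickle.load` restores the exact in-memory structure, app startup skips any parsing or type-conversion work.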
+
+## PDF
+
+PDFs are stored in a vector store. For non-OpenAI models we use FAISS as the database because it covers retrieval on its own, so we do not need to integrate many other libraries.
+
+### Embedding
+
+I chose the `sentence-transformers/all-MiniLM-L6-v2` embedding model because it balances accuracy and latency: its accuracy is comparable to `sentence-transformers/all-mpnet-base-v2`, but its latency is lower, which makes it feasible for retriever embedding.
+
+## Ollama API
+
+As an alternative, we provide a self-hosted Ollama API on Hugging Face. This API already has the `qwen3-4b` model enabled. [link](https://mrfirdauss-ollama-api.hf.space)
+
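Assuming the Space exposes the standard Ollama REST API, a request body for its `POST /api/generate` endpoint would look like this (the prompt is just an example):

```python
import json

BASE_URL = "https://mrfirdauss-ollama-api.hf.space"  # Space URL from this README

def build_generate_request(prompt: str, model: str = "qwen3-4b") -> dict:
    """JSON body for Ollama's standard POST /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

body = build_generate_request("What is a P/E ratio?")
print(json.dumps(body))
# To call the live endpoint, POST this body to f"{BASE_URL}/api/generate".
```

With `"stream": False`, Ollama returns one JSON object whose `response` field holds the full completion instead of a stream of partial chunks.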
## Schema

```
User Input
src/FinancialAgentApp.py
CHANGED
@@ -85,11 +85,10 @@ class FinancialAgentApp (ABC):
exec(response_state.code, {}, local_scope)

fig = plt.gcf()
-if fig.get_axes():
+if fig.get_axes():
with self.st.chat_message("assistant"):
self.st.pyplot(fig)
-buf = self.__safe_savefig__()
-# Add the plot as a chat message in session state
+buf = self.__safe_savefig__()
self.st.session_state.messages.append({
"role": "assistant",
"type": "plot",