fengxb30 committed · Commit 2543635 · verified · 1 Parent(s): 594b3dc

Create README.md

Files changed (1)
  1. README.md +29 -0
README.md ADDED
@@ -0,0 +1,29 @@
+ # FinGPT Compliance Agent with RAG for XBRL Specifications
+
+ This project demonstrates a specialized compliance agent built on a Retrieval-Augmented Generation (RAG) framework. The core Large Language Model (LLM) is **TheFinAI/Fin-o1-8B**, which is augmented with a custom-built knowledge base of XBRL (eXtensible Business Reporting Language) specifications.
+
+ The agent handles two types of queries:
+ 1. **General Financial Questions**: Answered directly by the Fin-o1-8B model, for example with code like the following:
+    ```python
+    from transformers import AutoModelForCausalLM, AutoTokenizer
+
+    model_name = "TheFinAI/Fin-o1-8B"
+
+    # Load the tokenizer and model weights from the Hugging Face Hub.
+    tokenizer = AutoTokenizer.from_pretrained(model_name)
+    model = AutoModelForCausalLM.from_pretrained(model_name)
+
+    # Tokenize the prompt into PyTorch tensors.
+    input_text = "What is the result of 3 - 5?"
+    inputs = tokenizer(input_text, return_tensors="pt")
+
+    # Generate up to 200 new tokens and decode them back to text.
+    output = model.generate(**inputs, max_new_tokens=200)
+    print(tokenizer.decode(output[0], skip_special_tokens=True))
+    ```
+
+ 2. **XBRL-Specific Compliance Questions**: Answered using the RAG pipeline, which retrieves relevant context from a local knowledge base (`xbrl_results_2_spec_filtered_reindexed.json`) before generating a response. Grounding the answer in the official documentation helps keep XBRL responses accurate and detailed.
+
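The retrieve-then-generate flow described for XBRL questions can be sketched as follows. This is only an illustration under stated assumptions, not the logic in `inference.py`, which this README does not show: the knowledge base is assumed to be a JSON list of records with a `content` text field, simple word overlap stands in for whatever retriever the project actually uses, and the helper names (`tokenize`, `load_knowledge_base`, `retrieve`, `build_prompt`) are hypothetical.

```python
import json
import re
from pathlib import Path

KB_PATH = Path("xbrl_results_2_spec_filtered_reindexed.json")


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def load_knowledge_base(path=KB_PATH):
    """Load the knowledge base (ASSUMPTION: a JSON list of dict records)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def retrieve(query, records, top_k=2, text_field="content"):
    """Rank records by word overlap with the query.

    Word overlap is a stand-in for a real retriever; `text_field` is an
    assumed field name, not one confirmed by the project.
    """
    query_tokens = tokenize(query)
    scored = []
    for record in records:
        overlap = len(query_tokens & tokenize(record.get(text_field, "")))
        if overlap > 0:
            scored.append((overlap, record))
    # Highest overlap first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:top_k]]


def build_prompt(query, retrieved, text_field="content"):
    """Place the retrieved passages ahead of the question."""
    context = "\n\n".join(record[text_field] for record in retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Tiny in-memory stand-in for the real knowledge base (illustrative records).
sample_records = [
    {"content": "An XBRL instance document contains facts tagged with taxonomy concepts."},
    {"content": "A context in XBRL identifies the entity and reporting period of a fact."},
    {"content": "Inline XBRL embeds XBRL tags inside an HTML document."},
]

question = "What reporting period does an XBRL context identify?"
prompt = build_prompt(question, retrieve(question, sample_records))
print(prompt)
```

The resulting `prompt` string would then be tokenized and passed to `model.generate` in the same way as the general-question example. With the real knowledge base, `load_knowledge_base()` would replace `sample_records`, provided the file matches the assumed structure.
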
+ ## Project Structure
+
+ To use this framework, organize your project files as follows. All project files except `inference.py` and the JSON knowledge base belong inside a directory named after the model, `Fin-o1-8B`.
+
+ - **`Fin-o1-8B/`**: This directory should contain the downloaded model artifacts for `TheFinAI/Fin-o1-8B`. The `transformers` library will store the model here if you specify this directory as the save location, for example with `save_pretrained("Fin-o1-8B")`.
+ - **`inference.py`**: The main script for running the RAG-powered XBRL compliance agent.
+ - **`xbrl_results_2_spec_filtered_reindexed.json`**: The pre-built knowledge base containing crawled data from XBRL specification websites.
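
Before running the agent, it may help to confirm that the layout above is in place. The check below is a minimal sketch using only the standard library; the function name `missing_entries` and the constant `EXPECTED_ENTRIES` are hypothetical and not part of the project.

```python
from pathlib import Path

# Entries named in the layout above; a trailing slash marks a directory.
EXPECTED_ENTRIES = [
    "Fin-o1-8B/",
    "inference.py",
    "xbrl_results_2_spec_filtered_reindexed.json",
]


def missing_entries(project_root="."):
    """Return the expected entries that are absent under project_root."""
    root = Path(project_root)
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = root / entry.rstrip("/")
        # Directories and files are checked with the matching predicate.
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    problems = missing_entries()
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("Project layout looks complete.")
```

Run it from the project root. It only checks that the three entries exist, not that the model directory actually holds valid weights.
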