---
license: mit
language:
- zh
base_model:
- Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- finance
- chinese
- qlora
- private-equity
- fund-analysis
- distillation
metrics:
- loss
---

# MachFund-1

A specialized Chinese private equity fund analysis model, fine-tuned from [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) using QLoRA knowledge distillation.

## Overview

MachFund-1 is trained to analyze Chinese private equity funds across multiple dimensions: performance analysis, risk assessment, strategy evaluation, manager background, fund comparisons, and investment advice. The model demonstrates a **68.75% improvement** over the base model on domain-specific tasks.

## Training Details

| Parameter | Value |
|---|---|
| Base Model | Qwen2.5-3B-Instruct |
| Method | QLoRA (4-bit NF4 quantization) |
| LoRA Rank / Alpha | 32 / 64 |
| Training Samples | 6,976 (eval: 769) |
| Effective Batch Size | 16 (per-device batch 2 × 8 gradient-accumulation steps) |
| Learning Rate | 2e-4 (cosine schedule) |
| Epochs | 2 |
| Max Sequence Length | 6,144 tokens |
| Final Training Loss | 0.9269 |
| Training Time | 141 min on an NVIDIA A100 80GB |
| Total Steps | 872 |
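
The training script itself is not published here; the following is a minimal sketch of a QLoRA setup matching the hyperparameters in the table, using `transformers`, `peft`, and `bitsandbytes`. The target modules, compute dtype, and output directory are assumptions, and dataset preparation plus the training loop are omitted.

```python
# Sketch only: reconstructs the hyperparameters from the table above.
# Anything not in the table (target modules, compute dtype, output dir)
# is an assumption, not the published training configuration.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # 4-bit NF4 quantization, per the table
    bnb_4bit_compute_dtype=torch.bfloat16,   # assumption: bf16 compute on the A100
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora = LoraConfig(
    r=32,
    lora_alpha=64,                           # rank / alpha, per the table
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

args = TrainingArguments(
    output_dir="machfund-qlora",             # hypothetical path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,           # effective batch size 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=2,
)
# Sequences are truncated/packed to 6,144 tokens during preprocessing (not shown).
```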

### Knowledge Distillation Pipeline

1. **Teacher Model**: Gemini 2.5 Pro generates ~50 Q&A pairs per fund across 8 categories for 178 Chinese private equity funds
2. **Quality Scoring**: Gemini 2.5 Flash scores each pair on 5 dimensions (accuracy, completeness, professionalism, data usage, coherence) with a keep threshold of 15/25 (see the sketch below)
3. **Student Training**: QLoRA fine-tuning on the 6,976 high-quality filtered samples
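
The judge prompts and rubric are not included in this card; a minimal sketch of the filtering step, assuming each dimension is scored 0-5 so the total is out of 25 (the record layout is hypothetical):

```python
# Quality gate sketch: keep Q&A pairs whose judge scores sum to >= 15/25.
# Dimension names come from the pipeline description; the 0-5 scale per
# dimension and the dict layout are assumptions.
DIMENSIONS = ["accuracy", "completeness", "professionalism", "data_usage", "coherence"]
THRESHOLD = 15  # keep threshold, out of 25

def keep_pair(scores: dict) -> bool:
    """True if the per-dimension scores sum to at least the threshold."""
    return sum(scores[d] for d in DIMENSIONS) >= THRESHOLD

# Example: the first judged pair passes (16/25), the second is filtered out (11/25).
print(keep_pair({"accuracy": 4, "completeness": 3, "professionalism": 3,
                 "data_usage": 3, "coherence": 3}))  # True
print(keep_pair({"accuracy": 2, "completeness": 2, "professionalism": 3,
                 "data_usage": 2, "coherence": 2}))  # False
```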

### Question Categories

- Fund overview and basic information
- Performance analysis and benchmarking
- Risk assessment and drawdown analysis
- Strategy analysis and market positioning
- Manager background and track record
- Fund comparisons (peer and category)
- Investment advice and suitability
- Structured data extraction

## Evaluation

| Gate | Metric | Result |
|---|---|---|
| Training Lift | Base vs Fine-tuned Score | **PASS** (4.8 to 8.1, +68.75%; threshold: 30%) |
| Speed (FP16) | Tokens/sec on RTX 5080 | **FAIL** (30.1 tok/s; threshold: 50) |
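
The speed gate is a raw decode-throughput check. A rough way to reproduce a comparable tokens-per-second figure with `transformers` is sketched below; the actual benchmark harness, prompt, and token budget behind the 30.1 tok/s number are not published, so those details are assumptions.

```python
# Rough FP16 decode-throughput check: generated tokens / wall-clock seconds.
# Prompt, token budget, and timing method are assumptions, not the
# harness used for the reported figure.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "openalchemy/MachFund", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("openalchemy/MachFund")

inputs = tokenizer("Analyze the risk profile of this fund", return_tensors="pt").to(model.device)
start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=256)
elapsed = time.perf_counter() - start

new_tokens = outputs.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens / elapsed:.1f} tok/s")
```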

## Available Formats

| Format | File | Size | Use Case |
|---|---|---|---|
| SafeTensors (FP16) | `model.safetensors` | 6.17 GB | Full-precision inference |
| GGUF Q8_0 | `gguf/mach-fund-1-Q8_0.gguf` | 3.29 GB | High-quality quantized inference |
| GGUF Q4_K_M | `gguf/mach-fund-1-Q4_K_M.gguf` | 1.93 GB | Efficient inference (recommended) |
| GGUF F16 | `gguf/mach-fund-1-f16.gguf` | 6.18 GB | Full-precision GGUF |
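
If you only need one of the GGUF files, `huggingface_hub` can fetch it without cloning the whole repository; a small sketch, with the repo id taken from the usage examples below:

```python
# Download just the recommended Q4_K_M GGUF into the local Hub cache.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="openalchemy/MachFund",
    filename="gguf/mach-fund-1-Q4_K_M.gguf",
)
print(path)  # local path to pass to llama.cpp or an Ollama Modelfile
```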

## Usage

### Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer
model = AutoModelForCausalLM.from_pretrained("openalchemy/MachFund", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("openalchemy/MachFund")

# Build a chat-formatted prompt
messages = [
    {"role": "system", "content": "You are a professional private equity fund analyst."},
    {"role": "user", "content": "Analyze the performance of this fund"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate and decode the response
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### llama.cpp (GGUF)

```bash
./llama-cli -m mach-fund-1-Q4_K_M.gguf -p "Analyze the risk profile of this fund" -n 512
```

### Ollama

```bash
# Create a local Ollama model from the downloaded GGUF file, then run it
echo 'FROM ./mach-fund-1-Q4_K_M.gguf' > Modelfile
ollama create machfund -f Modelfile
ollama run machfund "What is the Sharpe ratio of this fund?"
```

## Limitations

- Trained specifically on Chinese private equity fund data; may not generalize to other financial domains
- Training data reflects fund information available up to early 2026
- Should not be used as the sole basis for investment decisions
- Speed on consumer GPUs (RTX 5080) is below the 50 tok/s target at FP16; use GGUF Q4_K_M for faster inference

## License

MIT