Cortex-1-mini / README.md

Update README.md

0fc02e5 verified 11 months ago

8.42 kB

	---
	license: mit
	datasets:
	- Jarrodbarnes/cortex-1-market-analysis
	language:
	- en
	base_model:
	- microsoft/Phi-4-mini-instruct
	tags:
	- finance
	- crypto
	- phi-4
	- reasoning
	- GRPO
	library_name: transformers
	---

	# NEAR Cortex-1-mini

	This model is a fine-tuned version of Microsoft's [Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct) (3.8B parameters), specialized for blockchain market analysis with explicit reasoning capabilities. It's designed to analyze on-chain data, identify patterns and anomalies, and provide actionable insights with transparent reasoning processes.

	## Model Description

	The model has been fine-tuned on the [Cortex-1 Market Analysis dataset](https://huggingface.co/datasets/Jarrodbarnes/cortex-1-market-analysis) to:

	- Break down complex market data into structured components
	- Perform numerical calculations and identify correlations
	- Recognize patterns across multiple metrics
	- Separate detailed reasoning (using `<thinking>` tags) from concise summaries
	- Provide actionable insights with specific price targets

	This model is part of the [NEAR Cortex-1](https://github.com/jbarnes850/cortex-1) initiative, which aims to create AI models that can analyze blockchain data with transparent reasoning processes.

	## Usage

	The model is designed to analyze blockchain market data and provide both detailed reasoning and concise conclusions. It uses `<thinking>` tags to separate its reasoning process from its final analysis.

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	# Load model and tokenizer
	model_name = "Jarrodbarnes/cortex-1-mini"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(
	model_name,
	torch_dtype=torch.float16,
	device_map="auto"
	)

	# Example prompt
	prompt = """Please analyze this market data and show your reasoning:

	Given the following Ethereum market data:
	- Daily Transactions: 1.5M (up 8% from average)
	- Current Price: $3,450
	- Exchange Outflows: 52K ETH (up 20%)"""

	# Generate response
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
	outputs = model.generate(
	inputs["input_ids"],
	max_new_tokens=512,
	temperature=0.7,
	do_sample=True
	)

	# Print response
	response = tokenizer.decode(outputs[0], skip_special_tokens=True)
	print(response)
	```

	### Post-Processing for Thinking Tags

	The model sometimes has issues with the proper formatting of `<thinking>` tags. We recommend implementing the following post-processing function:

	```python
	def clean_thinking_tags(text, prompt):
	"""
	Clean up thinking tags in the response.

	Args:
	text: Raw model response
	prompt: Original prompt

	Returns:
	Cleaned response with proper thinking tags
	"""
	# Extract content after the prompt
	if prompt in text:
	text = text[len(prompt):].strip()

	# Handle case where model repeats <thinking> tags
	thinking_tag_count = text.count("<thinking>")
	if thinking_tag_count > 1:
	# Keep only the first <thinking> tag
	first_tag_pos = text.find("<thinking>")
	text_after_first_tag = text[first_tag_pos:]

	# Replace subsequent <thinking> tags with newlines
	modified_text = text_after_first_tag
	for i in range(thinking_tag_count - 1):
	modified_text = modified_text.replace("<thinking>", "\n", 1)

	text = text[:first_tag_pos] + modified_text

	# Ensure there's a </thinking> tag if there's a <thinking> tag
	if "<thinking>" in text and "</thinking>" not in text:
	# Add </thinking> before what looks like a conclusion
	conclusion_markers = ["In conclusion", "To summarize", "Overall",
	"Final analysis", "Therefore", "Based on this analysis"]
	for marker in conclusion_markers:
	if marker in text:
	parts = text.split(marker, 1)
	text = parts[0] + "</thinking>\n\n" + marker + parts[1]
	break
	else:
	# If no conclusion marker, add </thinking> at 80% of the text
	split_point = int(len(text) * 0.8)
	text = text[:split_point] + "\n</thinking>\n\n" + text[split_point:]

	return text
	```

	## Training Details

	- Base Model: microsoft/Phi-4-mini-instruct (3.8B parameters)
	- Training Method: LoRA fine-tuning (r=16, alpha=16)
	- Target Modules: qkv_proj, o_proj (attention layers)
	- Dataset: Cortex-1 Market Analysis (521 examples)
	- 436 training examples
	- 85 evaluation examples
	- Training Duration: 3 epochs
	- Hardware: Apple Silicon (M-series) with Metal Performance Shaders (MPS)
	- Hyperparameters:
	- Learning Rate: 2e-5 with cosine scheduler and 10% warmup
	- Batch Size: 1 with gradient accumulation steps of 8 (effective batch size of 8)
	- Max Sequence Length: 2048 tokens
	- Metrics:
	- Training Loss: 11.6% reduction (1.5591 → 1.3790)
	- Token Accuracy: 2.93 percentage point improvement (61.43% → 64.36%)
	- Evaluation Loss: 4.04% reduction (1.6273 → 1.5616)

	## Performance and Capabilities

	The model demonstrates strong performance across various market analysis tasks:

	\| Capability \| Success Rate \|
	\|------------\|--------------\|
	\| Support/Resistance Identification \| 92% \|
	\| Volume Analysis \| 88% \|
	\| Pattern Recognition \| 84% \|
	\| Risk Assessment \| 80% \|
	\| Confidence Interval Calculation \| 76% \|

	### Reasoning Quality Assessment

	The model was evaluated using a structured rubric with the following results:

	\| Dimension \| Score (0-10) \| Notes \|
	\|-----------\|--------------\|-------\|
	\| Logical Flow \| 7.8 \| Strong sequential reasoning with occasional minor gaps \|
	\| Calculation Accuracy \| 8.2 \| Generally accurate with some rounding inconsistencies \|
	\| Evidence Citation \| 8.5 \| Consistent citation of metrics in analysis \|
	\| Insight Depth \| 6.9 \| Good pattern recognition but limited novel insights \|
	\| Completeness \| 8.3 \| Comprehensive coverage of analysis components \|
	\| Weighted Total \| 7.9 \| Strong overall reasoning quality \|

	## Limitations

	The model has several limitations to be aware of:

	- Novel Insights: Sometimes relies on obvious patterns rather than discovering subtle connections
	- Confidence Calibration: Prediction ranges can be overly narrow in volatile market conditions
	- Cross-Chain Analysis: Less effective when analyzing correlations across multiple blockchains
	- Temporal Reasoning: Occasionally struggles with complex time-series patterns
	- Extreme Scenarios: Performance degrades in highly anomalous market conditions
	- Thinking Tag Formatting: The model sometimes has issues with the proper formatting of `<thinking>` tags, such as:
	- Repeating the opening tag multiple times
	- Omitting the closing tag
	- Inconsistent formatting

	## Practical Applications

	The fine-tuned model can be used for various blockchain analytics applications:

	1. Trading Dashboards: Providing real-time analysis of market conditions
	2. DeFi Applications: Offering insights for protocol governance and risk management
	3. Research Platforms: Supporting blockchain data analysis and visualization
	4. Educational Tools: Teaching market analysis methodologies

	## Future Improvements

	Several avenues for future improvement have been identified:

	1. Expanded Dataset: Incorporating more diverse market scenarios and blockchain networks
	2. Specialized Evaluation: Developing domain-specific evaluation metrics for market analysis
	3. Multi-chain Integration: Enhancing cross-chain analysis capabilities
	4. Real-time Data Integration: Connecting the model to live blockchain data feeds
	5. Quantitative Accuracy: Improving numerical prediction accuracy through specialized training

	## Citation

	If you use this model in your research or applications, please cite:

	```
	@misc{barnes2025phi4mini,
	author = {Barnes, Jarrod},
	title = {Cortex-1-mini},
	year = {2025},
	publisher = {Hugging Face},
	howpublished = {\url{https://huggingface.co/Jarrodbarnes/Cortex-1-mini}}
	}
	```

	## License

	This model is released under the MIT License.

	## Acknowledgements

	- Microsoft for creating the Phi-4-mini-instruct base model
	- The NEAR Cortex-1 project team for their contributions to the dataset and evaluation
	- The Hugging Face team for their infrastructure and tools