Cortex-1-mini / README.md
Jarrodbarnes's picture
Update README.md
0fc02e5 verified
---
license: mit
datasets:
- Jarrodbarnes/cortex-1-market-analysis
language:
- en
base_model:
- microsoft/Phi-4-mini-instruct
tags:
- finance
- crypto
- phi-4
- reasoning
- GRPO
library_name: transformers
---
# NEAR Cortex-1-mini
This model is a fine-tuned version of Microsoft's [Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct) (3.8B parameters), specialized for blockchain market analysis with explicit reasoning capabilities. It's designed to analyze on-chain data, identify patterns and anomalies, and provide actionable insights with transparent reasoning processes.
## Model Description
The model has been fine-tuned on the [Cortex-1 Market Analysis dataset](https://huggingface.co/datasets/Jarrodbarnes/cortex-1-market-analysis) to:
- Break down complex market data into structured components
- Perform numerical calculations and identify correlations
- Recognize patterns across multiple metrics
- Separate detailed reasoning (using `<thinking>` tags) from concise summaries
- Provide actionable insights with specific price targets
This model is part of the [NEAR Cortex-1](https://github.com/jbarnes850/cortex-1) initiative, which aims to create AI models that can analyze blockchain data with transparent reasoning processes.
## Usage
The model is designed to analyze blockchain market data and provide both detailed reasoning and concise conclusions. It uses `<thinking>` tags to separate its reasoning process from its final analysis.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load model and tokenizer
model_name = "Jarrodbarnes/cortex-1-mini"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.float16,
device_map="auto"
)
# Example prompt
prompt = """Please analyze this market data and show your reasoning:
Given the following Ethereum market data:
- Daily Transactions: 1.5M (up 8% from average)
- Current Price: $3,450
- Exchange Outflows: 52K ETH (up 20%)"""
# Generate response
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
inputs["input_ids"],
max_new_tokens=512,
temperature=0.7,
do_sample=True
)
# Print response
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
### Post-Processing for Thinking Tags
The model sometimes has issues with the proper formatting of `<thinking>` tags. We recommend implementing the following post-processing function:
```python
def clean_thinking_tags(text, prompt):
"""
Clean up thinking tags in the response.
Args:
text: Raw model response
prompt: Original prompt
Returns:
Cleaned response with proper thinking tags
"""
# Extract content after the prompt
if prompt in text:
text = text[len(prompt):].strip()
# Handle case where model repeats <thinking> tags
thinking_tag_count = text.count("<thinking>")
if thinking_tag_count > 1:
# Keep only the first <thinking> tag
first_tag_pos = text.find("<thinking>")
text_after_first_tag = text[first_tag_pos:]
# Replace subsequent <thinking> tags with newlines
modified_text = text_after_first_tag
for i in range(thinking_tag_count - 1):
modified_text = modified_text.replace("<thinking>", "\n", 1)
text = text[:first_tag_pos] + modified_text
# Ensure there's a </thinking> tag if there's a <thinking> tag
if "<thinking>" in text and "</thinking>" not in text:
# Add </thinking> before what looks like a conclusion
conclusion_markers = ["In conclusion", "To summarize", "Overall",
"Final analysis", "Therefore", "Based on this analysis"]
for marker in conclusion_markers:
if marker in text:
parts = text.split(marker, 1)
text = parts[0] + "</thinking>\n\n" + marker + parts[1]
break
else:
# If no conclusion marker, add </thinking> at 80% of the text
split_point = int(len(text) * 0.8)
text = text[:split_point] + "\n</thinking>\n\n" + text[split_point:]
return text
```
## Training Details
- **Base Model**: microsoft/Phi-4-mini-instruct (3.8B parameters)
- **Training Method**: LoRA fine-tuning (r=16, alpha=16)
- **Target Modules**: qkv_proj, o_proj (attention layers)
- **Dataset**: Cortex-1 Market Analysis (521 examples)
- 436 training examples
- 85 evaluation examples
- **Training Duration**: 3 epochs
- **Hardware**: Apple Silicon (M-series) with Metal Performance Shaders (MPS)
- **Hyperparameters**:
- Learning Rate: 2e-5 with cosine scheduler and 10% warmup
- Batch Size: 1 with gradient accumulation steps of 8 (effective batch size of 8)
- Max Sequence Length: 2048 tokens
- **Metrics**:
- Training Loss: 11.6% reduction (1.5591 → 1.3790)
- Token Accuracy: 2.93 percentage point improvement (61.43% → 64.36%)
- Evaluation Loss: 4.04% reduction (1.6273 → 1.5616)
## Performance and Capabilities
The model demonstrates strong performance across various market analysis tasks:
| Capability | Success Rate |
|------------|--------------|
| Support/Resistance Identification | 92% |
| Volume Analysis | 88% |
| Pattern Recognition | 84% |
| Risk Assessment | 80% |
| Confidence Interval Calculation | 76% |
### Reasoning Quality Assessment
The model was evaluated using a structured rubric with the following results:
| Dimension | Score (0-10) | Notes |
|-----------|--------------|-------|
| Logical Flow | 7.8 | Strong sequential reasoning with occasional minor gaps |
| Calculation Accuracy | 8.2 | Generally accurate with some rounding inconsistencies |
| Evidence Citation | 8.5 | Consistent citation of metrics in analysis |
| Insight Depth | 6.9 | Good pattern recognition but limited novel insights |
| Completeness | 8.3 | Comprehensive coverage of analysis components |
| **Weighted Total** | **7.9** | **Strong overall reasoning quality** |
## Limitations
The model has several limitations to be aware of:
- **Novel Insights**: Sometimes relies on obvious patterns rather than discovering subtle connections
- **Confidence Calibration**: Prediction ranges can be overly narrow in volatile market conditions
- **Cross-Chain Analysis**: Less effective when analyzing correlations across multiple blockchains
- **Temporal Reasoning**: Occasionally struggles with complex time-series patterns
- **Extreme Scenarios**: Performance degrades in highly anomalous market conditions
- **Thinking Tag Formatting**: The model sometimes has issues with the proper formatting of `<thinking>` tags, such as:
- Repeating the opening tag multiple times
- Omitting the closing tag
- Inconsistent formatting
## Practical Applications
The fine-tuned model can be used for various blockchain analytics applications:
1. **Trading Dashboards**: Providing real-time analysis of market conditions
2. **DeFi Applications**: Offering insights for protocol governance and risk management
3. **Research Platforms**: Supporting blockchain data analysis and visualization
4. **Educational Tools**: Teaching market analysis methodologies
## Future Improvements
Several avenues for future improvement have been identified:
1. **Expanded Dataset**: Incorporating more diverse market scenarios and blockchain networks
2. **Specialized Evaluation**: Developing domain-specific evaluation metrics for market analysis
3. **Multi-chain Integration**: Enhancing cross-chain analysis capabilities
4. **Real-time Data Integration**: Connecting the model to live blockchain data feeds
5. **Quantitative Accuracy**: Improving numerical prediction accuracy through specialized training
## Citation
If you use this model in your research or applications, please cite:
```
@misc{barnes2025phi4mini,
author = {Barnes, Jarrod},
title = {Cortex-1-mini},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/Jarrodbarnes/Cortex-1-mini}}
}
```
## License
This model is released under the MIT License.
## Acknowledgements
- Microsoft for creating the Phi-4-mini-instruct base model
- The NEAR Cortex-1 project team for their contributions to the dataset and evaluation
- The Hugging Face team for their infrastructure and tools