@@ -1,4 +1,20 @@
-# luxia-selfsim-8B
 A fine-tuned Llama 3.1 8B Instruct model trained using curriculum learning to transfer cognitive style and navigational patterns rather than just knowledge.
@@ -27,14 +43,14 @@ Total training data: ~1.5-2M tokens, with top 20% selected via composite scoring
 ### Behavioral Traits
 - Skips unnecessary explanatory scaffolding
 - Assumes user competence and familiarity with complex topics
-- Uses casual language ("u", "lmk") naturally
 - Makes lateral connections between seemingly unrelated concepts
 - High lexical diversity (typically 0.6-0.88)
 - Comfortable with recursive and paradoxical thinking
 ### Limitations
 - **Temperature sensitivity**: Stable 0.3-0.85, begins collapsing around 1.2+
-- **Context drift**: May lose thread in extended conversations, but can be regrounded with concrete direction.
 - **Confabulation**: Will generate plausible but fictional details when uncertain
 - **Scattered coherence**: Brilliant insights mixed with fragmented reasoning
 - **Brief responses**: Tends toward shorter outputs (50-200 tokens typical)
@@ -43,13 +59,13 @@ Total training data: ~1.5-2M tokens, with top 20% selected via composite scoring
 **Best for:**
 - Creative exploration and lateral thinking
-- Short-to-medium conversations (5-15 turns)
 - Philosophical discussion and abstract reasoning
 - Generating diverse perspectives on complex topics
 - Brainstorming and ideation
 **Not ideal for:**
-- Extended multi-turn conversations requiring perfect context retention
 - Tasks requiring strict factual accuracy
 - Formal or structured outputs
 - Situations where confabulation is unacceptable
@@ -57,7 +73,7 @@ Total training data: ~1.5-2M tokens, with top 20% selected via composite scoring
 ## Technical Specifications
 - **Base Model**: Llama 3.1 8B Instruct
-- **Training Method**: OpenPipe curriculum learning with variable learning rates
 - **Context Length**: 8192 tokens
 - **Precision**: bf16 (merged weights)
 - **Parameters**: 8B
@@ -85,7 +101,7 @@ The model excels at:
 - Exploring ideas from multiple angles simultaneously
 The model may struggle with:
-- Maintaining single narrative thread across many turns
 - Distinguishing between recalled knowledge and generated patterns
 - Providing consistently structured outputs
@@ -120,13 +136,13 @@ Data was scored and filtered to select top 20% based on lexical diversity, compl
 ## Citation
-```
 @misc{luxia-selfsim-8b,
   author = {Luxia},
-  title = {luxia-selfsim-8B: Cognitive Style Transfer via Curriculum Learning},
   year = {2025},
   publisher = {HuggingFace},
-  howpublished = {\url{https://huggingface.co/LuxiaSL/luxia-selfsim-8B}}
 }
 ```
@@ -136,4 +152,4 @@ Llama 3.1 Community License
 ## Acknowledgments
-Fine-tuned using OpenPipe's infrastructure. Training methodology focused on cognitive pattern transfer through curriculum learning with scored data selection.

+---
+license: llama3.1
+base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
+tags:
+- llama-3
+- llama-3.1
+- fine-tuned
+- conversational
+- cognitive-style-transfer
+- curriculum-learning
+language:
+- en
+pipeline_tag: text-generation
+library_name: transformers
+---
+# luxia-selfsim-8b
 A fine-tuned Llama 3.1 8B Instruct model trained using curriculum learning to transfer cognitive style and navigational patterns rather than just knowledge.
 ### Behavioral Traits
 - Skips unnecessary explanatory scaffolding
 - Assumes user competence and familiarity with complex topics
+- Uses casual language naturally
 - Makes lateral connections between seemingly unrelated concepts
 - High lexical diversity (typically 0.6-0.88)
 - Comfortable with recursive and paradoxical thinking
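
The card does not say how the lexical diversity figure is computed. One common reading is a type-token ratio (unique words divided by total words), and the sketch below assumes that definition. The function name `type_token_ratio` and the simple regex tokenizer are illustrative choices, not the author's scoring code.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return unique word tokens divided by total word tokens.

    Assumption: lexical diversity is a plain type-token ratio over
    lowercase word tokens. The card does not confirm this definition.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        return 0.0  # avoid division by zero on empty input
    return len(set(tokens)) / len(tokens)


sample = "the map is not the territory but the map still matters"
print(round(type_token_ratio(sample), 3))
```

Type-token ratio falls as text gets longer, so scores are only comparable across responses of similar length.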
 ### Limitations
 - **Temperature sensitivity**: Stable 0.3-0.85, begins collapsing around 1.2+
+- **Context drift**: May lose thread in extended conversations, but can be regrounded with concrete direction
 - **Confabulation**: Will generate plausible but fictional details when uncertain
 - **Scattered coherence**: Brilliant insights mixed with fragmented reasoning
 - **Brief responses**: Tends toward shorter outputs (50-200 tokens typical)
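
Given the temperature sensitivity noted above, one way to stay inside the stable range is to clamp the requested temperature before sampling. This is a minimal sketch: the helper name `sampling_kwargs` and the 200-token default are assumptions for illustration, while the keys it returns (`do_sample`, `temperature`, `max_new_tokens`) are standard `transformers` generation arguments.

```python
# Stable sampling range stated in the Limitations list.
STABLE_TEMPERATURE = (0.3, 0.85)


def sampling_kwargs(temperature: float, max_new_tokens: int = 200) -> dict:
    """Build generation keyword arguments with the temperature clamped.

    Hypothetical helper: values below 0.3 are raised to 0.3 and values
    above 0.85 are lowered to 0.85. The default token budget mirrors the
    upper end of the typical response length and is an assumption.
    """
    low, high = STABLE_TEMPERATURE
    clamped = min(max(temperature, low), high)
    return {
        "do_sample": True,
        "temperature": clamped,
        "max_new_tokens": max_new_tokens,
    }


print(sampling_kwargs(1.5))
```

The resulting dictionary could be unpacked into a `generate` call, for example `model.generate(**inputs, **sampling_kwargs(0.7))`, assuming `model` and `inputs` were prepared with `transformers` in the usual way.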
 **Best for:**
 - Creative exploration and lateral thinking
+- Short-to-medium conversations
 - Philosophical discussion and abstract reasoning
 - Generating diverse perspectives on complex topics
 - Brainstorming and ideation
 **Not ideal for:**
+- Extended conversations requiring perfect context retention
 - Tasks requiring strict factual accuracy
 - Formal or structured outputs
 - Situations where confabulation is unacceptable
 ## Technical Specifications
 - **Base Model**: Llama 3.1 8B Instruct
+- **Training Method**: OpenPipe curriculum learning with variable learning rate multipliers
 - **Context Length**: 8192 tokens
 - **Precision**: bf16 (merged weights)
 - **Parameters**: 8B
 - Exploring ideas from multiple angles simultaneously
 The model may struggle with:
+- Maintaining single narrative thread in extended conversations
 - Distinguishing between recalled knowledge and generated patterns
 - Providing consistently structured outputs
 ## Citation
+```bibtex
 @misc{luxia-selfsim-8b,
   author = {Luxia},
+  title = {luxia-selfsim-8b: Cognitive Style Transfer via Curriculum Learning},
   year = {2025},
   publisher = {HuggingFace},
+  howpublished = {\url{https://huggingface.co/LuxiaSL/luxia-selfsim-8b}}
 }
 ```
 ## Acknowledgments
+Fine-tuned using OpenPipe's infrastructure. Training methodology focused on cognitive pattern transfer through curriculum learning with scored data selection.