πŸ“„ Model Card: Nelya-neko

🌟 Model Overview

Nelya-neko is a Small Language Model (SLM) with 124 million parameters, pre-trained on the Nekolien constructed language (intellectual property of LLm-Clem). It is the first model in the new generation of Clemylia architectures, designed for research on conlangs (constructed languages) and long-context processing.

πŸ› οΈ Technical Details and Architecture

| Feature | Value | Impact Note |
|---|---|---|
| Family / Type | Foundation Model (Base) / SLM | Requires fine-tuning for alignment and downstream applications. |
| Developer | Clemylia (LLm-Clem) | Created from scratch (architecture, tokenizer, pre-training). |
| Parameters | 124 million | Size optimized for efficiency and deployment on consumer-grade hardware. |
| Context Window | 7,000 tokens | Major innovation: enables processing of full documents and long-form Nekolien conversations. |
| Language | Nekolien (constructed language) | Ultra-specialized; should not be used for natural languages without extensive fine-tuning. |
| Tokenizer | Nekolien-tokenizer | Proprietary tokenizer built from scratch, essential for encoding and decoding Nekolien. |
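The 7,000-token context window sets a hard budget on input length. As a minimal sketch, longer documents can be split into overlapping windows that each fit the context; the window and overlap values below are illustrative, and the token IDs are placeholders rather than real Nekolien-tokenizer output.

```python
# Hypothetical sketch: splitting a long token sequence into windows that fit
# Nelya-neko's 7,000-token context. Plain Python, no model required.
CONTEXT_WINDOW = 7000

def chunk_tokens(token_ids, window=CONTEXT_WINDOW, overlap=200):
    """Yield overlapping chunks, each no longer than the context window."""
    if overlap >= window:
        raise ValueError("overlap must be smaller than the window")
    step = window - overlap
    for start in range(0, len(token_ids), step):
        yield token_ids[start:start + window]
        if start + window >= len(token_ids):
            break

chunks = list(chunk_tokens(list(range(15000))))
print(len(chunks))                   # 3 windows
print(max(len(c) for c in chunks))   # 7000 (never exceeds the context)
```

The overlap keeps some shared context between consecutive windows so that continuations remain coherent across chunk boundaries.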

πŸ”‘ Special Tokens (Included in Nekolien-tokenizer)

The model uses a set of special tokens to structure data and enable future alignment tasks:

| Token | Conventional Role | Specific Function |
|---|---|---|
| UNK | Unknown | Handles unknown sequences not present in the Nekolien corpus. |
| CLS | Classifier | Classification token for sequence encapsulation (useful for fine-tuning). |
| SEP | Separator | Used to mark the boundary between different parts of a text sequence. |
| MASK | Mask | Required for Masked Language Modeling (MLM) and prediction tasks in fine-tuning. |
| — | Memory / Metadata | Unique token, potentially related to the efficient management of the extended context (7,000 tokens). |
| — | Padding | Ensures sequence length consistency for GPU efficiency. |
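The padding and MASK roles above can be sketched in plain Python. The token IDs and the 15% masking rate below are illustrative assumptions, not the actual Nekolien-tokenizer vocabulary or training configuration.

```python
import random

# Placeholder special-token IDs (not the real Nekolien-tokenizer IDs).
PAD_ID, MASK_ID = 0, 1

def pad_batch(sequences, pad_id=PAD_ID):
    """Right-pad every sequence to the batch maximum so tensors are rectangular."""
    max_len = max(len(s) for s in sequences)
    return [s + [pad_id] * (max_len - len(s)) for s in sequences]

def mlm_mask(token_ids, mask_id=MASK_ID, prob=0.15, rng=None):
    """Replace ~prob of the tokens with MASK, returning (inputs, labels).
    Labels are -100 (ignored by the loss) everywhere except masked positions."""
    rng = rng or random.Random(0)
    inputs, labels = [], []
    for tok in token_ids:
        if rng.random() < prob:
            inputs.append(mask_id)
            labels.append(tok)
        else:
            inputs.append(tok)
            labels.append(-100)
    return inputs, labels

print(pad_batch([[5, 6, 7], [8, 9]]))  # [[5, 6, 7], [8, 9, 0]]
```

Padding keeps batched sequences the same length for GPU efficiency, while the MASK scheme is the standard recipe an MLM fine-tuning pass would rely on.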

πŸ“œ License and Usage Restrictions

License: LRUNDL (Limited Distinction Research Non-Commercial License)

  • Attribution: All derivatives (fine-tuned models) must clearly attribute authorship to LLm-Clem.
  • Restriction: Use of Nelya-neko is strictly limited to research and non-commercial experimentation.
  • Compliance: Derivative works must adhere to the LRUNDL (no more permissive licenses, such as MIT, can be applied).

πŸ’‘ Intended Use and Limitations

Intended Use

  • Conlang Research: Studying language modeling on constructed linguistic systems.
  • Nekolien Dataset Creation: Generating coherent corpora for fine-tuning.
  • Base for Specialized Assistants: Developing bots for the Nekolien language after alignment fine-tuning.

Limitations and Precautions

  • Not Aligned: As a pure foundation model, Nelya-neko produces thematic text continuation, not structured responses (requires fine-tuning for instruction following).
  • Monolingual: Performance in any language other than Nekolien is negligible or not guaranteed.
  • Access: The model and its tokenizer are subject to access restrictions managed by LLm-Clem.

πŸš€ Next Steps for Deployment

To transition from this foundation model to a functional application, alignment fine-tuning (on Nekolien instruction/response pairs) is necessary to instill the desired instruction-following behavior and persona.

Nekolien variant of this model: Original/Central Nekolien.

Nelya-neko complies with the rules of the Nekolien Academy: https://neko-lexicon-archives.lovable.app/
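As a minimal sketch of what preparing such instruction/response pairs might look like: the `[CLS]`/`[SEP]` markers mirror the special tokens listed above, but the exact layout and field order are assumptions, since the actual alignment format for Nelya-neko is not published.

```python
# Hypothetical sketch of serializing Nekolien instruction/response pairs into
# training strings for alignment fine-tuning. Layout is an assumption.
def format_pair(instruction: str, response: str) -> str:
    """Serialize one instruction/response example into a single training string."""
    return f"[CLS] {instruction} [SEP] {response} [SEP]"

def build_corpus(pairs):
    """Turn (instruction, response) tuples into training lines."""
    return [format_pair(i, r) for i, r in pairs]

lines = build_corpus([("nya prompt", "nya reply")])
print(lines[0])  # [CLS] nya prompt [SEP] nya reply [SEP]
```

The separator-delimited layout lets a fine-tuned model learn where the instruction ends and the expected response begins.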

Downloads last month: 90
Format: Safetensors Β· Model size: 0.1B params Β· Tensor type: F32