BINOMDA committed on
Commit
bba63da
·
verified ·
1 Parent(s): ad13832

Initial OMDA model upload

Files changed (4)
  1. README.md +37 -0
  2. config.json +18 -0
  3. pytorch_model.bin +3 -0
  4. vocab.txt +0 -0
README.md ADDED
@@ -0,0 +1,37 @@
+
+ # OMDA: Arabic-English Chat LLM
+
+ **Model Name:** OMDA
+ **Architecture:** OMDA-Decoder
+ **Tokenizer:** OMDATokenizer
+ **Languages:** Arabic, English
+ **Type:** Chat/Instruction-following
+ **Author:** Binomda
+ **Date:** 2025-06-28
+
+ ## Model Details
+ - Layers: 6
+ - Hidden size: 512
+ - Attention heads: 8
+ - FFN dim: 2048
+ - Max sequence length: 512
+ - Vocab size: 128004
+ - Training data: Aggregated Arabic-English chat pairs from CSV, TXT, and JSON sources.
+
+ ## Intended Use
+ - Chatbots, assistants, translation, and educational tools for Arabic/English.
+
+ ## Training
+ - Trained for 5 epochs on 1000 samples.
+ - Loss curve and checkpoints included.
+
+ ## Limitations
+ - This is a small-scale demonstration model and may not generalize well to all real-world chat scenarios.
+ - Not suitable for production use without further scaling, extensive evaluation, and safety checks.
+ - Limited training data and model size may result in hallucinations or inaccurate translations.
+ - No advanced filtering for inappropriate or biased outputs.
+ - For research and educational purposes only.
+
+ ## Export & Deployment
+ - See below for HuggingFace, llama.cpp, and ollama export instructions.
+
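The hyperparameters in the Model Details list follow the usual transformer-decoder proportions. A minimal sanity-check sketch (values copied from the README; the interpretation as a standard multi-head-attention decoder is an assumption, since the OMDA-Decoder code is not part of this commit):

```python
# Hyperparameters copied from the README's Model Details section.
cfg = {
    "n_layers": 6,
    "d_model": 512,       # hidden size
    "n_heads": 8,
    "d_ff": 2048,
    "max_seq_len": 512,
    "vocab_size": 128004,
}

# Multi-head attention requires d_model to split evenly across heads.
assert cfg["d_model"] % cfg["n_heads"] == 0
head_dim = cfg["d_model"] // cfg["n_heads"]  # 64, a common head width

# FFN dim is the conventional 4x expansion of the hidden size.
assert cfg["d_ff"] == 4 * cfg["d_model"]
```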
config.json ADDED
@@ -0,0 +1,18 @@
+ {
+ "model_name": "OMDA",
+ "architecture": "OMDA-Decoder",
+ "vocab_size": 128004,
+ "d_model": 512,
+ "n_layers": 6,
+ "n_heads": 8,
+ "d_ff": 2048,
+ "max_seq_len": 512,
+ "dropout": 0.1,
+ "batch_size": 8,
+ "learning_rate": 0.0001,
+ "num_epochs": 5,
+ "save_steps": 1000,
+ "eval_steps": 500,
+ "model_save_path": "./omda_chat_model",
+ "tokenizer_name": "OMDATokenizer"
+ }
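The OMDA-Decoder implementation is not included in this commit, but a back-of-envelope parameter count from config.json — assuming a standard transformer decoder with untied input/output embeddings, ignoring biases and layer norms — lines up with the 601,553,329-byte checkpoint in this commit:

```python
# Fields copied from config.json above.
cfg = {"vocab_size": 128004, "d_model": 512, "n_layers": 6, "d_ff": 2048}

d, v = cfg["d_model"], cfg["vocab_size"]

# Rough count for a standard transformer decoder (an assumption:
# untied input/output embeddings, biases and layer norms ignored).
embed = v * d                                  # token embedding matrix
lm_head = v * d                                # output projection (untied)
per_layer = 4 * d * d + 2 * d * cfg["d_ff"]    # Q/K/V/O + FFN up/down
total = embed + lm_head + cfg["n_layers"] * per_layer

print(f"{total:,} params, {total * 4 / 1e6:.0f} MB at fp32")
# → 149,950,464 params, 600 MB at fp32
```

At 4 bytes per fp32 weight this predicts ≈600 MB, within 0.3% of the actual pytorch_model.bin size, which suggests the checkpoint is raw fp32 weights with untied embeddings and no optimizer state.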
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ad51ed28b2c059ce3a0e44cbdfcbda1978b8993d50e90a1cead97ec07b7cbde9
+ size 601553329
vocab.txt ADDED
The diff for this file is too large to render. See raw diff