Christopher Denq committed
Commit b64620a · 1 Parent(s): 0986517

model push
.gitattributes CHANGED
@@ -25,7 +25,6 @@
  *.safetensors filter=lfs diff=lfs merge=lfs -text
  saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.tar.* filter=lfs diff=lfs merge=lfs -text
- *.tar filter=lfs diff=lfs merge=lfs -text
  *.tflite filter=lfs diff=lfs merge=lfs -text
  *.tgz filter=lfs diff=lfs merge=lfs -text
  *.wasm filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,45 @@
+ # IDE / editor
+ .vs/
+ .vscode/
+ .idea/
+ *.suo
+ *.user
+
+ # Jupyter
+ .ipynb_checkpoints/
+ *.ipynb
+
+ # Python
+ __pycache__/
+ *.py[cod]
+ *.pyo
+ *.pyd
+ .Python
+ *.egg-info/
+ dist/
+ build/
+ *.egg
+
+ # Env
+ .env
+ .venv/
+ venv/
+ env/
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Model artifacts (large files — use Git LFS or HF upload tools)
+ *.bin
+ *.pt
+ *.pth
+ *.ckpt
+ *.h5
+ *.onnx
+ *.safetensors
+
+ # Logs and outputs
+ *.log
+ Outputs/
+ Saved/
README.md CHANGED
@@ -1,3 +1,100 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - en
+ tags:
+ - text-classification
+ - deception-detection
+ - few-shot-learning
+ - retrieval-augmented-generation
+ - in-context-learning
+ - mistral
+ - 4-bit
+ - bitsandbytes
+ - quantized
+ pipeline_tag: text-generation
+ base_model: Intel/neural-chat-7b-v3-3
+ ---
+
+ # RADDICL 2.0 — Quantized LLM
+
+ This is the 4-bit NF4 quantized LLM component of **RADDICL 2.0** (**R**etrieval **A**ugmented **D**eception **D**etection through **I**n-**C**ontext **L**earning), a domain-agnostic deception detection system.
+
+ - **Base model**: [`Intel/neural-chat-7b-v3-3`](https://huggingface.co/Intel/neural-chat-7b-v3-3) (Mistral 7B architecture)
+ - **Quantization**: 4-bit NF4 via [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes), double quantization enabled
+ - **Compute dtype**: `float16`
+ - **Total parameters**: 3.75B (~3.74 GB estimated memory footprint)
+
+ For the full RAG pipeline and demo, see [`cdenq/raddicl2-demo`](https://huggingface.co/spaces/cdenq/raddicl2-demo).
+
+ ---
+
+ ## Model Details
+
+ | Property | Value |
+ |---|---|
+ | Architecture | `MistralForCausalLM` |
+ | Base model | `Intel/neural-chat-7b-v3-3` |
+ | Quantization method | BitsAndBytes `nf4`, double quant |
+ | Compute dtype | `float16` |
+ | Max position embeddings | 32768 |
+ | Sliding window | 4096 |
+ | Vocab size | 32000 |
+ | Attention | SDPA |
+
+ ---
+
+ ## How to Load
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from huggingface_hub import snapshot_download
+
+ # Download model files
+ model_path = snapshot_download(repo_id="cdenq/raddicl2-demo-model")
+
+ # Load tokenizer
+ tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True)
+
+ # Load quantized model (quantization config is embedded in config.json)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_path,
+     device_map="auto",
+     trust_remote_code=True,
+ )
+ ```
+
+ The `quantization_config` is already embedded in `config.json`, so no extra `BitsAndBytesConfig` is needed when loading.
+
+ ---
+
+ ## Intended Use
+
+ This model is the generation component of the RADDICL 2.0 deception detection pipeline. Given a structured few-shot prompt (constructed by the RADDICL 2.0 RAG pipeline), it produces a classification label (`deceptive` / `non-deceptive`) and step-by-step reasoning.
+
+ It is not intended to be used as a standalone general-purpose chat model.
+
+ ---
+
+ ## Citation
+
+ RADDICL 2.0 extends the original RADDICL system. If you use this work, please cite:
+
+ ```bibtex
+ @inproceedings{boumber2024raddicl,
+   title = {LLMs for Explainable Few-shot Deception Detection},
+   author = {Boumber, Dayne and Denq, Christopher and Verma, Rakesh},
+   booktitle = {Proceedings of the ACM Web Conference},
+   year = {2024},
+   doi = {10.1145/3643651.3659898}
+ }
+ ```
+
+ *(Citation for RADDICL 2.0 will be updated upon publication.)*
+
+ ---
+
+ ## Acknowledgments
+
+ Developed by Christopher Denq and Dr. Rakesh Verma at the [ReDAS Lab](https://github.com/ReDASers/), University of Houston.
+ Supported in part by the NSF REU CS grant and the University of Houston.
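The README's Intended Use section says the model consumes a structured few-shot prompt built by the RAG pipeline. The actual prompt format lives in the demo Space and is not documented here, so the field names below are illustrative assumptions; this is only a sketch of the retrieve-then-prompt pattern:

```python
# Illustrative only: the real few-shot prompt is assembled by the
# RADDICL 2.0 RAG pipeline; "Statement:" / "Label:" fields here are
# hypothetical stand-ins for demonstration.
def build_prompt(retrieved, query):
    """Assemble a few-shot deception-classification prompt from
    retrieved (text, label) examples plus the query to classify."""
    lines = ["Classify each statement as deceptive or non-deceptive.", ""]
    for text, label in retrieved:
        lines += [f"Statement: {text}", f"Label: {label}", ""]
    lines += [f"Statement: {query}", "Label:"]
    return "\n".join(lines)

examples = [
    ("Best hotel ever, absolutely perfect in every way!!!", "deceptive"),
    ("Check-in was slow, but the room was clean and quiet.", "non-deceptive"),
]
prompt = build_prompt(examples, "This product changed my life overnight.")
print(prompt)
```

The prompt ends at an open `Label:` field so the model's first generated tokens are the classification, which the pipeline can then parse.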
chat_template.jinja ADDED
@@ -0,0 +1,29 @@
+ {%- if messages[0]['role'] == 'system' %}
+ {%- set system_message = messages[0]['content'] %}
+ {%- set loop_messages = messages[1:] %}
+ {%- else %}
+ {%- set loop_messages = messages %}
+ {%- endif %}
+
+ {%- for message in loop_messages %}
+ {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}
+ {{- raise_exception('After the optional system message, conversation roles must alternate user/assistant/user/assistant/...') }}
+ {%- endif %}
+ {%- if loop.first and system_message is defined %}
+ {{- '### System:
+ ' + system_message + '
+ ' }}
+ {%- endif %}
+ {%- if message['role'] == 'user' %}
+ {{- '### User:
+ ' + message['content'] + '
+ ' }}
+ {%- elif message['role'] == 'assistant' %}
+ {{- '### Assistant:
+ ' + message['content'] + eos_token + '
+ '}}
+ {%- else %}
+ {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}
+ {%- endif %}
+ {%- endfor %}{% if add_generation_prompt %}{{ '### Assistant:
+ ' }}{% endif %}
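The template follows neural-chat's `### System:` / `### User:` / `### Assistant:` layout. For readers who want to inspect the rendered prompt without loading the tokenizer, its happy path (the role-alternation and unknown-role `raise_exception` checks omitted) is equivalent to this pure-Python sketch:

```python
def render_neural_chat(messages, eos_token="</s>", add_generation_prompt=False):
    """Pure-Python equivalent of the chat template above; the
    role-alternation / unknown-role error checks are omitted."""
    out = []
    # Optional leading system message
    if messages and messages[0]["role"] == "system":
        out.append("### System:\n" + messages[0]["content"] + "\n")
        messages = messages[1:]
    for msg in messages:
        if msg["role"] == "user":
            out.append("### User:\n" + msg["content"] + "\n")
        elif msg["role"] == "assistant":
            # Assistant turns are terminated with the EOS token
            out.append("### Assistant:\n" + msg["content"] + eos_token + "\n")
    if add_generation_prompt:
        out.append("### Assistant:\n")
    return "".join(out)

prompt = render_neural_chat(
    [{"role": "system", "content": "You are a deception analyst."},
     {"role": "user", "content": "Classify: 'Best purchase ever!!!'"}],
    add_generation_prompt=True,
)
print(prompt)
```

In practice you would call `tokenizer.apply_chat_template(...)` instead; this sketch only documents the format the template produces.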
config.json ADDED
@@ -0,0 +1,42 @@
+ {
+ "architectures": [
+ "MistralForCausalLM"
+ ],
+ "attention_dropout": 0.0,
+ "bos_token_id": 1,
+ "eos_token_id": 2,
+ "head_dim": null,
+ "hidden_act": "silu",
+ "hidden_size": 4096,
+ "initializer_range": 0.02,
+ "intermediate_size": 14336,
+ "max_position_embeddings": 32768,
+ "model_type": "mistral",
+ "num_attention_heads": 32,
+ "num_hidden_layers": 32,
+ "num_key_value_heads": 8,
+ "pad_token_id": 0,
+ "quantization_config": {
+ "_load_in_4bit": true,
+ "_load_in_8bit": false,
+ "bnb_4bit_compute_dtype": "float16",
+ "bnb_4bit_quant_storage": "uint8",
+ "bnb_4bit_quant_type": "nf4",
+ "bnb_4bit_use_double_quant": true,
+ "llm_int8_enable_fp32_cpu_offload": false,
+ "llm_int8_has_fp16_weight": false,
+ "llm_int8_skip_modules": null,
+ "llm_int8_threshold": 6.0,
+ "load_in_4bit": true,
+ "load_in_8bit": false,
+ "quant_method": "bitsandbytes"
+ },
+ "rms_norm_eps": 1e-05,
+ "rope_theta": 10000.0,
+ "sliding_window": 4096,
+ "tie_word_embeddings": false,
+ "torch_dtype": "float16",
+ "transformers_version": "4.53.3",
+ "use_cache": true,
+ "vocab_size": 32000
+ }
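The architecture fields above pin down the full-precision size of the base model. A quick sanity check, assuming `head_dim = hidden_size / num_attention_heads = 128` (the config leaves `head_dim` null, so this is the standard Mistral default) and untied embeddings per `tie_word_embeddings: false`:

```python
# Full-precision parameter count implied by config.json's fields.
hidden, inter, vocab = 4096, 14336, 32000
layers, kv_heads, head_dim = 32, 8, 128  # head_dim = 4096 / 32 heads

attn = 2 * hidden * hidden + 2 * hidden * kv_heads * head_dim  # q,o + k,v (GQA)
mlp = 3 * hidden * inter                                       # gate, up, down
norms = 2 * hidden                                             # two RMSNorms per layer
embeddings = 2 * vocab * hidden          # input embeddings + untied lm_head
total = layers * (attn + mlp + norms) + embeddings + hidden    # + final norm

print(total)  # 7241732096, i.e. the ~7.24B of the base Mistral-7B model
```

This is the pre-quantization count; the smaller figure reported for this checkpoint is explained under `model_4bit_config.json` below.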
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+ "_from_model_config": true,
+ "bos_token_id": 1,
+ "eos_token_id": 2,
+ "transformers_version": "4.53.3"
+ }
model_4bit_config.json ADDED
@@ -0,0 +1,33 @@
+ {
+ "model_name": "Intel/neural-chat-7b-v3-3",
+ "quantization_config": {
+ "use_quantization": true,
+ "quantization_mode": "4bit"
+ },
+ "device_config": {
+ "device_type": "cpu",
+ "max_memory": {
+ "0": "15GB",
+ "cpu": "9GB"
+ },
+ "low_cpu_mem_usage": false
+ },
+ "model_params": {
+ "attn_implementation": "sdpa",
+ "pad_token_id": 0,
+ "trust_remote_code": true
+ },
+ "model_info": {
+ "total_params": 3752071168,
+ "trainable_params": 262410240,
+ "dtype": "torch.float16",
+ "estimated_memory_gb": 3.7387773990631104,
+ "quantization_mode": "4bit",
+ "dtype_variants": [
+ "torch.float16",
+ "torch.uint8"
+ ],
+ "gpu_memory_allocated_gb": 0.1322178840637207,
+ "gpu_memory_reserved_gb": 0.158203125
+ }
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,29 @@
+ {
+ "additional_special_tokens": [
+ "<unk>",
+ "<s>",
+ "</s>"
+ ],
+ "bos_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eos_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": "</s>",
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,48 @@
+ {
+ "add_bos_token": true,
+ "add_eos_token": false,
+ "add_prefix_space": null,
+ "added_tokens_decoder": {
+ "0": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "1": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "2": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "additional_special_tokens": [
+ "<unk>",
+ "<s>",
+ "</s>"
+ ],
+ "bos_token": "<s>",
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "</s>",
+ "extra_special_tokens": {},
+ "legacy": true,
+ "model_max_length": 1000000000000000019884624838656,
+ "pad_token": "</s>",
+ "sp_model_kwargs": {},
+ "spaces_between_special_tokens": false,
+ "tokenizer_class": "LlamaTokenizer",
+ "unk_token": "<unk>",
+ "use_default_system_prompt": true
+ }