Commit ecbe8a9
Parent(s): d10491e
Model version 0.0.1 added without testing
- README.md +10 -0
- config.json +29 -0
- generation_config.json +6 -0
- pytorch_model-00001-of-00002.bin +3 -0
- pytorch_model-00002-of-00002.bin +3 -0
- pytorch_model.bin.index.json +203 -0
- special_tokens_map.json +16 -0
- tokenizer.json +0 -0
- tokenizer_config.json +7 -0
README.md
ADDED
@@ -0,0 +1,10 @@
+# LokPalAI: Bridging the Gap to Legal Empowerment
+
+LokPalAI is an advanced language model finetuned for Indian scenarios, specifically designed to bridge the gap between individuals and legal empowerment. With LokPalAI, users can interact with a powerful query box to seek information and guidance related to Indian law.
+
+
+## Features:
+1. Interact with LokPalAI’s Query Box: LokPalAI provides a user-friendly query box interface where users can input their legal queries and receive accurate and relevant responses. Whether you need information about a specific law, legal procedure, or any other legal matter, LokPalAI is here to assist you.
+2. Enhanced with Rail Guards: To ensure the accuracy and reliability of the information provided, LokPalAI incorporates rail guards. These safeguards help prevent the generation of misleading or incorrect legal advice. We understand the importance of reliable legal information, and our rail guards are designed to maintain the highest standards of accuracy.
+3. Real-Time Responses using RAG: LokPalAI leverages the Retrieve and Generate (RAG) framework to provide real-time responses to your legal queries. RAG combines the power of retrieval-based models with generation-based models, ensuring that the information provided is both contextually relevant and up to date.
+4. Thorough Testing and Maintenance: We understand the criticality of maintaining a reliable and accurate legal information system. LokPalAI undergoes extensive testing to ensure its performance and reliability. We continuously monitor and update the model to account for changes in Indian law, ensuring that the information provided is always accurate and up to date.
config.json
ADDED
@@ -0,0 +1,29 @@
+{
+  "_name_or_path": "tiiuae/falcon-7b-instruct",
+  "alibi": false,
+  "apply_residual_connection_post_layernorm": false,
+  "architectures": [
+    "RWForCausalLM"
+  ],
+  "attention_dropout": 0.0,
+  "auto_map": {
+    "AutoConfig": "tiiuae/falcon-7b-instruct--configuration_RW.RWConfig",
+    "AutoModelForCausalLM": "tiiuae/falcon-7b-instruct--modelling_RW.RWForCausalLM"
+  },
+  "bias": false,
+  "bos_token_id": 11,
+  "eos_token_id": 11,
+  "hidden_dropout": 0.0,
+  "hidden_size": 4544,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "RefinedWebModel",
+  "multi_query": true,
+  "n_head": 71,
+  "n_layer": 32,
+  "parallel_attn": true,
+  "torch_dtype": "bfloat16",
+  "transformers_version": "4.30.0.dev0",
+  "use_cache": true,
+  "vocab_size": 65024
+}
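As a worked check of the attention shapes these values imply, the sketch below derives the per-head width and the width of the fused `query_key_value` projection. It assumes a multi-query layout in which every query head is kept while one key head and one value head are shared; the config only states `multi_query: true`, so that layout is an assumption, and `attention_shapes` is a hypothetical helper name.

```python
# Values copied from the config.json diff above.
config = {"hidden_size": 4544, "n_head": 71, "multi_query": True}

def attention_shapes(cfg):
    """Return (head_dim, fused_qkv_width) for the given config values."""
    hidden, heads = cfg["hidden_size"], cfg["n_head"]
    if hidden % heads:
        raise ValueError("hidden_size must be divisible by n_head")
    head_dim = hidden // heads
    # Assumption: multi-query attention keeps every query head but shares
    # a single key head and a single value head across all of them.
    kv_heads = 1 if cfg["multi_query"] else heads
    qkv_width = hidden + 2 * kv_heads * head_dim
    return head_dim, qkv_width

head_dim, qkv_width = attention_shapes(config)
print(head_dim, qkv_width)  # 64 4672
```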
generation_config.json
ADDED
@@ -0,0 +1,6 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "transformers_version": "4.30.0.dev0"
+}
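The token ids in this file (1 and 2) differ from the `bos_token_id` and `eos_token_id` of 11 in config.json, even though `_from_model_config` is true. A minimal sketch of a check that flags such differences is shown below; `token_id_mismatches` is a hypothetical helper, and whether the difference affects generation depends on how the loading code resolves the two files, which is not verified here.

```python
# Token ids copied from the config.json and generation_config.json diffs.
model_config = {"bos_token_id": 11, "eos_token_id": 11}
generation_config = {"bos_token_id": 1, "eos_token_id": 2}

def token_id_mismatches(model_cfg, gen_cfg, keys=("bos_token_id", "eos_token_id")):
    """Map each key whose values differ to a (model, generation) pair."""
    return {
        key: (model_cfg.get(key), gen_cfg.get(key))
        for key in keys
        if model_cfg.get(key) != gen_cfg.get(key)
    }

mismatches = token_id_mismatches(model_config, generation_config)
for key, (in_model, in_generation) in mismatches.items():
    print(f"{key}: config.json={in_model}, generation_config.json={in_generation}")
```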
pytorch_model-00001-of-00002.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5b2eba3c43780dabd8eb868c6956ad252b720219e8166db482bd8d7cb7efcb66
+size 9951028257
pytorch_model-00002-of-00002.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f85f71110d906f7f2db5d6de45bb3a34b031e00160049bbcfff88fafc6ef1dee
+size 3892483153
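The two `.bin` entries are Git LFS pointer files, so the diff records only a hash and a byte count for each shard. The sketch below parses the pointer text and sums the shard sizes; `parse_lfs_pointer` is a hypothetical helper that handles only the three `key value` lines shown here, not the full pointer specification.

```python
# Pointer bodies copied from the two .bin diffs above.
SHARD_1 = """version https://git-lfs.github.com/spec/v1
oid sha256:5b2eba3c43780dabd8eb868c6956ad252b720219e8166db482bd8d7cb7efcb66
size 9951028257
"""
SHARD_2 = """version https://git-lfs.github.com/spec/v1
oid sha256:f85f71110d906f7f2db5d6de45bb3a34b031e00160049bbcfff88fafc6ef1dee
size 3892483153
"""

def parse_lfs_pointer(text):
    """Split each 'key value' line of a pointer file into a dict entry."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    fields["size"] = int(fields["size"])  # byte count of the real file
    return fields

shards = [parse_lfs_pointer(SHARD_1), parse_lfs_pointer(SHARD_2)]
total_bytes = sum(shard["size"] for shard in shards)
print(total_bytes)  # 13843511410
```

The sum is 70,002 bytes larger than the `total_size` of 13843441408 recorded in pytorch_model.bin.index.json. A plausible but unverified explanation is per-file serialization overhead in the `.bin` shards.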
pytorch_model.bin.index.json
ADDED
@@ -0,0 +1,203 @@
+{
+  "metadata": {
+    "total_size": 13843441408
+  },
+  "weight_map": {
+    "lm_head.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.0.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.0.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.0.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.0.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.0.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.0.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.1.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.1.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.1.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.1.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.1.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.1.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.10.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.10.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.10.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.10.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.10.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.10.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.11.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.11.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.11.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.11.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.11.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.11.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.12.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.12.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.12.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.12.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.12.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.12.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.13.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.13.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.13.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.13.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.13.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.13.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.14.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.14.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.14.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.14.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.14.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.14.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.15.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.15.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.15.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.15.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.15.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.15.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.16.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.16.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.16.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.16.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.16.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.16.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.17.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.17.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.17.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.17.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.17.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.17.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.18.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.18.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.18.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.18.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.18.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.18.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.19.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.19.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.19.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.19.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.19.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.19.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.2.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.2.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.2.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.2.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.2.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.2.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.20.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.20.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.20.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.20.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.20.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.20.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.21.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.21.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.21.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.21.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.21.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.21.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.22.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.22.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.22.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.22.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.22.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.22.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.23.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.23.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.23.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.23.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.23.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.23.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.24.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.24.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.24.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.24.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.24.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.24.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.25.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.25.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.25.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.25.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.25.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.25.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.26.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.26.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.26.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.26.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.26.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.26.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.27.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.27.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.27.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.27.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.27.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.27.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.28.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.28.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.28.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.28.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.28.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.28.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.29.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.29.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.29.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.29.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.29.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.29.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.3.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.3.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.3.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.3.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.3.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.3.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.30.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.30.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.30.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.30.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.30.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.30.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.31.input_layernorm.bias": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.31.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.31.mlp.dense_4h_to_h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.31.mlp.dense_h_to_4h.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.31.self_attention.dense.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.31.self_attention.query_key_value.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.h.4.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.4.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.4.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.4.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.4.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.4.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.5.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.5.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.5.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.5.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.5.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.5.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.6.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.6.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.6.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.6.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.6.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.6.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.7.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.7.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.7.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.7.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.7.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.7.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.8.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.8.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.8.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.8.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.8.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.8.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.9.input_layernorm.bias": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.9.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.9.mlp.dense_4h_to_h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.9.mlp.dense_h_to_4h.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.9.self_attention.dense.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.h.9.self_attention.query_key_value.weight": "pytorch_model-00001-of-00002.bin",
+    "transformer.ln_f.bias": "pytorch_model-00002-of-00002.bin",
+    "transformer.ln_f.weight": "pytorch_model-00002-of-00002.bin",
+    "transformer.word_embeddings.weight": "pytorch_model-00001-of-00002.bin"
+  }
+}
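The `total_size` value can be cross-checked against the config.json values and the tensor names listed in `weight_map`. The sketch below is a worked estimate under stated assumptions: tensor shapes are inferred from the names (the MLP is four times the hidden width, and the fused attention projection holds all query heads plus one shared key head and one shared value head), each bfloat16 value takes 2 bytes, and `lm_head.weight` shares storage with `transformer.word_embeddings.weight` so it is counted once. None of these shapes are read from the checkpoint itself.

```python
# Values copied from config.json; shapes are inferred from the tensor names
# in weight_map and are assumptions, not read from the checkpoint itself.
HIDDEN, HEADS, LAYERS, VOCAB = 4544, 71, 32, 65024
BYTES_PER_VALUE = 2  # "torch_dtype": "bfloat16"

def estimated_total_size():
    """Estimate the checkpoint's tensor bytes from the config values."""
    head_dim = HIDDEN // HEADS
    per_layer = (
        2 * HIDDEN                          # input_layernorm weight + bias
        + HIDDEN * (HIDDEN + 2 * head_dim)  # self_attention.query_key_value
        + HIDDEN * HIDDEN                   # self_attention.dense
        + HIDDEN * 4 * HIDDEN               # mlp.dense_h_to_4h
        + 4 * HIDDEN * HIDDEN               # mlp.dense_4h_to_h
    )
    shared = (
        VOCAB * HIDDEN  # word_embeddings; lm_head assumed tied, counted once
        + 2 * HIDDEN    # ln_f weight + bias
    )
    return (LAYERS * per_layer + shared) * BYTES_PER_VALUE

print(estimated_total_size())  # 13843441408
```

Under these assumptions the estimate equals the recorded `total_size` exactly, which is consistent with a checkpoint of about 6.92 billion bfloat16 values.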
special_tokens_map.json
ADDED
@@ -0,0 +1,16 @@
+{
+  "additional_special_tokens": [
+    ">>TITLE<<",
+    ">>ABSTRACT<<",
+    ">>INTRODUCTION<<",
+    ">>SUMMARY<<",
+    ">>COMMENT<<",
+    ">>ANSWER<<",
+    ">>QUESTION<<",
+    ">>DOMAIN<<",
+    ">>PREFIX<<",
+    ">>SUFFIX<<",
+    ">>MIDDLE<<"
+  ],
+  "eos_token": "<|endoftext|>"
+}
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
@@ -0,0 +1,7 @@
+{
+  "add_prefix_space": false,
+  "clean_up_tokenization_spaces": true,
+  "eos_token": "<|endoftext|>",
+  "model_max_length": 2048,
+  "tokenizer_class": "PreTrainedTokenizerFast"
+}