Gerson Fabian Buenahora Ormaza committed (verified) · Commit e6695af · Parent(s): 8ea28ae

Update README.md

Files changed (1): README.md (+102 −3)
README.md CHANGED

---
license: mit
datasets:
- neural-bridge/rag-dataset-12000
- neural-bridge/rag-dataset-1200
language:
- en
---

# RAGPT-2: Fine-tuned GPT-2 for Context-Based Question Answering

## Model Description

RAGPT-2 is a fine-tuned version of [GPT-2 small](https://huggingface.co/BueormLLC/CleanGPT), specifically adapted for context-based question answering tasks. The model has been trained to generate relevant answers from a given context and question, similar to the generation step of a Retrieval-Augmented Generation (RAG) system.

### Key Features

- Based on the GPT-2 small architecture (124M parameters)
- Fine-tuned on the neural-bridge/rag-dataset-12000 and neural-bridge/rag-dataset-1200 datasets from Hugging Face
- Generates answers from a provided context and question
- Suitable for a range of question-answering applications

## Training Data

The model was fine-tuned on the "neural-bridge/rag-dataset-12000" and "neural-bridge/rag-dataset-1200" datasets, each of which contains:
- Context passages
- Questions related to the context
- Corresponding answers

## Fine-tuning Process

The fine-tuning process involved:
1. Loading the pre-trained GPT-2 small model
2. Preprocessing the dataset to combine context, question, and answer into a single text
3. Training the model to predict the next token given the context and question
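
Step 2 can be sketched as follows. The field names and the `Contexto`/`Pregunta`/`Respuesta` labels mirror the prompt format shown in the Usage section; the exact training script is not published, so treat this as an illustrative assumption:

```python
def build_training_text(example):
    """Combine context, question, and answer into one training string.

    The labels mirror the prompt format in the Usage section; the
    field names are assumptions about the dataset schema.
    """
    return (
        f"Contexto: {example['context']}\n"
        f"Pregunta: {example['question']}\n"
        f"Respuesta: {example['answer']}"
    )

record = {
    "context": "Paris is the capital of France.",
    "question": "What is the capital of France?",
    "answer": "Paris.",
}
text = build_training_text(record)
print(text)
```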

### Hyperparameters

- Base model: GPT-2 small
- Number of training epochs: 8
- Batch size: 4
- Learning rate: default AdamW optimizer settings
- Max sequence length: 512 tokens
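
With the `transformers` `Trainer`, these hyperparameters would correspond to a configuration roughly like the one below. This is a sketch, not the author's actual script; `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Configuration sketch matching the hyperparameter list above.
args = TrainingArguments(
    output_dir="ragpt2-checkpoints",  # placeholder path
    num_train_epochs=8,
    per_device_train_batch_size=4,
    # Learning rate and optimizer left at the Trainer's default AdamW settings.
)
```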

## Usage

To use the model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "BueormLLC/RAGPT-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prepare the input in the model's prompt format
context = "Your context here"
question = "Your question here"
input_text = f"Contexto: {context}\nPregunta: {question}\nRespuesta:"

# Generate an answer
inputs = tokenizer(input_text, return_tensors="pt")
output = model.generate(
    **inputs,
    max_length=150,
    num_return_sequences=1,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
)
answer = tokenizer.decode(output[0], skip_special_tokens=True)
print(answer)
```
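
Because the decoded output repeats the prompt, a small helper (hypothetical, not part of the model's API) can strip everything up to the answer label:

```python
def extract_answer(decoded: str, label: str = "Respuesta:") -> str:
    """Return only the text after the last answer label, if present."""
    return decoded.rsplit(label, 1)[-1].strip()

decoded = "Contexto: ...\nPregunta: ...\nRespuesta: Paris is the capital of France."
print(extract_answer(decoded))  # -> Paris is the capital of France.
```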

## Limitations

- The model's knowledge is limited to its training data and the base GPT-2 model.
- It may sometimes generate irrelevant or incorrect answers, especially for topics outside its training domain.
- The model does not have access to external information or real-time data.

## Ethical Considerations

Users should be aware that this model, like all language models, may reflect biases present in its training data. It should not be used as a sole source of information for critical decisions.

## Future Improvements

- Fine-tuning on a larger and more diverse dataset
- Experimenting with larger base models (e.g., GPT-2 medium or large)
- Implementing techniques to improve factual accuracy and reduce hallucinations

## Support us

- [Paypal](https://paypal.me/bueorm)
- [Patreon](https://patreon.com/bueorm)

We appreciate your support; without you we could not do what we do.
88
+
89
+ ## Citation
90
+
91
+ If you use this model in your research, please cite:
92
+
93
+ ```
94
+ @misc{RAGPT,
95
+ author = {Bueorm},
96
+ title = {RAGPT-2: Fine-tuned GPT-2 for Context-Based Question Answering},
97
+ year = {2024},
98
+ publisher = {GitHub},
99
+ journal = {None},
100
+ howpublished = {\url{https://huggingface.co/BueormLLC/RAGPT-2}}
101
+ }
102
+ ```