Update README.md
README.md CHANGED
@@ -14,17 +14,17 @@ model-index:
 
 # train_2025-05-05-15-36-22
 
-This model is a fine-tuned version of [../pretrained/Qwen3-4B](https://huggingface.co/../pretrained/Qwen3-
+This model is a fine-tuned version of [../pretrained/Qwen3-4B](https://huggingface.co/../pretrained/Qwen3-8B) on the wikipedia_zh, petro_books, datasets001, datasets002, datasets003, datasets004 and datasets006 datasets.
 
 ## Model description
 
 Gaia-Petro-LLM is a large language model specialized in the oil and gas industry, fine-tuned from Qwen/Qwen3-4B. It was further pre-trained on a curated 20GB corpus of petroleum engineering texts, including technical documents, academic papers, and domain literature. The model is designed to support domain experts, researchers, and engineers in petroleum-related tasks, providing high-quality, domain-specific language understanding and generation.
 ## Model Details
-Base Model: Qwen/Qwen3-
+Base Model: Qwen/Qwen3-8B
 Domain: Oil & Gas / Petroleum Engineering
 Corpus Size: ~20GB (petroleum engineering)
 Languages: Primarily Chinese; domain-specific English supported
-Repository: my2000cup/Gaia-LLM-
+Repository: my2000cup/Gaia-LLM-8B
 ## Intended uses & limitations
 
 Technical Q&A in petroleum engineering
@@ -49,7 +49,7 @@ Technical standards and manuals
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 # Replace with your model repository
-model_name = "my2000cup/Gaia-LLM-
+model_name = "my2000cup/Gaia-LLM-8B"
 
 # Load tokenizer and model
 tokenizer = AutoTokenizer.from_pretrained(model_name)
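The usage snippet in the diff stops after loading the tokenizer. A fuller sketch of how inference might look is below; it assumes the repository id `my2000cup/Gaia-LLM-8B` shown in this diff is publicly available and that the checkpoint ships a standard Qwen3-style chat template (neither is confirmed by the diff itself). The `build_messages` helper is an illustrative name, not part of the original README.

```python
def build_messages(question: str) -> list[dict]:
    """Wrap a single user question in the chat format expected by apply_chat_template."""
    return [{"role": "user", "content": question}]


def main() -> None:
    # transformers is imported here so the helper above works without it installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumed repository id from the diff; replace with your own if it differs.
    model_name = "my2000cup/Gaia-LLM-8B"

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype="auto", device_map="auto"
    )

    messages = build_messages("What is horizontal drilling?")
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output_ids = model.generate(input_ids, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```

Downloading an 8B checkpoint needs substantial disk and (ideally) GPU memory; `device_map="auto"` lets transformers place the weights across available devices.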