Gemma-3-12B Lithology Description (Russian)

LoRA adapter for google/gemma-3-12b-it fine-tuned on geological core lithology descriptions in Russian.

Model Description

This model generates professional lithological descriptions of drill core samples for mining and drilling operations. It was trained on expert geological descriptions from the Zhosabay (Жосабай) deposit in Kazakhstan.

Training Details

Parameter	Value
Base Model	google/gemma-3-12b-it (12.19B params)
Method	SFT with LoRA
LoRA r	16
LoRA alpha	32
Target Modules	q_proj, v_proj, k_proj, o_proj
Dataset Size	1,664 samples (1,497 train / 167 eval)
Epochs	4
Learning Rate	1e-5
Batch Size	8 (effective)
Hardware	4x NVIDIA RTX 5090
Training Time	~37 minutes
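
Two quantities follow directly from the table above: the LoRA scaling factor and a rough optimizer-step count. The sketch below assumes standard LoRA scaling (alpha / r, not the rank-stabilized variant) and assumes the last partial batch of each epoch is kept; the real step count depends on trainer settings that are not stated here.

```python
import math

# Values copied from the training table above.
LORA_R = 16
LORA_ALPHA = 32
TRAIN_SAMPLES = 1497
EVAL_SAMPLES = 167
EPOCHS = 4
EFFECTIVE_BATCH = 8


def lora_scaling(alpha: int, r: int) -> float:
    """Standard LoRA multiplies the low-rank update by alpha / r."""
    return alpha / r


def optimizer_steps(samples: int, batch: int, epochs: int) -> int:
    """Estimate optimizer steps, assuming the last partial batch is kept."""
    return math.ceil(samples / batch) * epochs


scaling = lora_scaling(LORA_ALPHA, LORA_R)
steps = optimizer_steps(TRAIN_SAMPLES, EFFECTIVE_BATCH, EPOCHS)
eval_share = EVAL_SAMPLES / (TRAIN_SAMPLES + EVAL_SAMPLES)

print(f"LoRA scaling: {scaling}")
print(f"Estimated optimizer steps: {steps}")
print(f"Eval share of dataset: {eval_share:.1%}")
```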

Metrics

Metric	Value
Eval Loss	1.121
Eval Accuracy	75.1%
Train Loss	0.34
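
The loss values can be read as perplexities if they are mean per-token cross-entropies in nats, which is the usual convention for causal-LM fine-tuning but is an assumption here, since the card does not say how the losses were computed. A minimal conversion:

```python
import math


def perplexity(mean_nll: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_nll)


eval_ppl = perplexity(1.121)   # eval loss from the table above
train_ppl = perplexity(0.34)   # train loss from the table above

print(f"eval perplexity  ~ {eval_ppl:.2f}")
print(f"train perplexity ~ {train_ppl:.2f}")
```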

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model and adapter
base_model = "google/gemma-3-12b-it"
adapter = "tauken/gemma-3-12b-lithology-ru"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter)
model.eval()

# Generate description
prompt = "Ты опытный геолог-эксперт. Опиши литологию керна. Скважина: ZHS_028_25, интервал: 94.5-101.9м."
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
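
When describing many intervals, it may help to build the prompt from structured fields rather than by hand. The helper below is a hypothetical convenience function, not part of the released adapter; it simply reproduces the wording of the prompt used above, on the assumption that staying close to that wording matches the training data.

```python
def build_prompt(well_id: str, top_m: float, bottom_m: float) -> str:
    """Format a request in the same wording as the usage example above."""
    if bottom_m <= top_m:
        raise ValueError("bottom_m must be greater than top_m")
    # ":g" drops trailing zeros and keeps up to 6 significant digits.
    return (
        "Ты опытный геолог-эксперт. Опиши литологию керна. "
        f"Скважина: {well_id}, интервал: {top_m:g}-{bottom_m:g}м."
    )


prompt = build_prompt("ZHS_028_25", 94.5, 101.9)
print(prompt)
```

The resulting string can be passed as the `content` of the user message in the usage example.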

Example Output

Input: Скважина ZHS_028_25, интервал 94.5-101.9м

Output:

В интервале 94.5-101.9м: Андезитовые порфириты от темно-коричнево-зеленого до серого цвета
с зеленоватым оттенком. Структура порфировая, текстура массивная. Основная масса
скрытокристаллическая, порфировые включения выполнены плагиоклазом таблитчатых,
полуокруглых форм размерами до 5х12мм, составляющие около 40-50% от общей массы.

Породы слабо, умеренно трещиноватые в количестве до 5-7 трещин на п.м., преимущественно
ориентированные под углом 30-70° к оси керна. Вторичные изменения представлены слабым
окварцеванием и серицитизацией по массе.

Рудная минерализация выполнена пиритом в виде вкраплений по массе.

Output Structure

The model's output follows the standard format for geological descriptions:

Lithotype & Color - Rock type identification (андезитовые порфириты)
Texture & Structure - Porphyritic structure, phenocryst description
Fracturing - Fracture density per running meter, angles to core axis
Secondary Alterations - Veining, alteration minerals (окварцевание, серицитизация)
Mineralization - Ore minerals if present (пирит)
Contacts - Description of boundaries with other intervals
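
A reviewer could use the list above as a checklist for generated text. The sketch below is one crude way to do that with substring matching; the keyword stems are hypothetical choices made for illustration and are not taken from the training data, so they would need tuning to the vocabulary actually used in your logs.

```python
# Hypothetical keyword stems per section; adjust to your own logging vocabulary.
SECTION_STEMS = {
    "Lithotype & Color": ["цвет"],
    "Texture & Structure": ["структур", "текстур"],
    "Fracturing": ["трещин"],
    "Secondary Alterations": ["изменени", "окварцеван", "серицитизац"],
    "Mineralization": ["минерализац"],
    "Contacts": ["контакт"],
}


def missing_sections(description: str) -> list[str]:
    """Return the section names for which no keyword stem occurs in the text."""
    text = description.lower()
    return [
        name
        for name, stems in SECTION_STEMS.items()
        if not any(stem in text for stem in stems)
    ]


sample = (
    "Андезитовые порфириты серого цвета. Структура порфировая, "
    "текстура массивная. Породы слабо трещиноватые. Вторичные изменения "
    "представлены слабым окварцеванием. Рудная минерализация выполнена пиритом."
)
print(missing_sections(sample))
```

A missing section is not necessarily an error (for example, an interval may have no ore minerals), so the result is best treated as a prompt for the reviewing geologist.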

Limitations

⚠️ Important limitations:

Text-only model - Does not process images (for VLM version, see upcoming release)
May occasionally hallucinate depth intervals outside the requested range
Quantitative estimates (sizes, percentages) may be approximate
Best used as an assistant with mandatory expert verification
Not production-ready for autonomous use
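
The depth-interval issue in particular can be partly screened for automatically before expert review. The following is a minimal sketch, assuming intervals appear in the text as "94.5-101.9м"; the regular expression, the function name, and the tolerance are illustrative assumptions. It will miss other notations and may flag legitimate meter-scale measurements that happen to look like intervals.

```python
import re

# Matches "94.5-101.9м" style intervals; the lookahead skips sizes such as "5-12мм".
INTERVAL_RE = re.compile(
    r"(\d+(?:[.,]\d+)?)\s*[-–]\s*(\d+(?:[.,]\d+)?)\s*м(?![А-Яа-яЁё])"
)


def out_of_range_intervals(text, top_m, bottom_m, tol=0.05):
    """Return (from, to) depth pairs in the text that fall outside the request."""
    flagged = []
    for match in INTERVAL_RE.finditer(text):
        lo = float(match.group(1).replace(",", "."))
        hi = float(match.group(2).replace(",", "."))
        if lo < top_m - tol or hi > bottom_m + tol:
            flagged.append((lo, hi))
    return flagged


generated = (
    "В интервале 94.5-101.9м: андезитовые порфириты, включения до 5-12мм. "
    "В интервале 120.0-125.5м: метасоматиты."
)
print(out_of_range_intervals(generated, 94.5, 101.9))
```

An empty result does not replace expert verification; it only means no out-of-range interval was detected in this one notation.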

Training Data

Geological descriptions from the Zhosabay copper-gold deposit (Kazakhstan):

Well intervals with detailed lithological descriptions in Russian
Expert-written by professional geologists
Rock types: andesite porphyrites, metasomatites, weathering crusts
Features: fracturing, veining, alteration, mineralization

Intended Use

✅ Recommended uses:

Drafting lithological descriptions for geologist review
Accelerating core logging workflows
Teaching geological terminology

❌ Not recommended:

Autonomous geological reporting without expert review
Critical mining decisions based solely on model output

License

This model inherits the Gemma license from the base model.

Citation

@misc{tauken-lithology-2025,
  author = {Tauken Team},
  title = {Gemma-3-12B Lithology Description Model},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/tauken/gemma-3-12b-lithology-ru}
}

Framework Versions

PEFT: 0.18.1
Transformers: 4.51+
PyTorch: 2.11.0
TRL: 0.27.0
