Gemma-3-12B Lithology Description (Russian)

LoRA adapter for google/gemma-3-12b-it fine-tuned on geological core lithology descriptions in Russian.

Model Description

This model generates professional lithological descriptions of drill core samples for mining and drilling operations. It was trained on expert-written geological descriptions from the Zhosabay (Жосабай) deposit in Kazakhstan.

Training Details

Parameter        Value
Base Model       google/gemma-3-12b-it (12.19B params)
Method           SFT with LoRA
LoRA r           16
LoRA alpha       32
Target Modules   q_proj, v_proj, k_proj, o_proj
Dataset Size     1,664 samples (1,497 train / 167 eval)
Epochs           4
Learning Rate    1e-5
Batch Size       8 (effective)
Hardware         4x NVIDIA RTX 5090
Training Time    ~37 minutes
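
The table values imply a few derived quantities that are easy to check by hand. The sketch below is plain arithmetic, not the authors' training script. It assumes standard LoRA scaling (alpha / r, no rank-stabilized variant), that "effective" batch size means samples per optimizer step across all GPUs, and that the last partial batch of each epoch is kept.

```python
import math

# Hyperparameters copied from the training table.
LORA_R = 16
LORA_ALPHA = 32
TRAIN_SAMPLES = 1497
EVAL_SAMPLES = 167
EFFECTIVE_BATCH = 8
EPOCHS = 4


def lora_scaling(alpha: int, r: int) -> float:
    """Standard LoRA multiplies the low-rank update by alpha / r."""
    return alpha / r


def optimizer_steps(samples: int, batch: int, epochs: int) -> int:
    """Approximate optimizer steps, assuming the last partial batch is kept."""
    return math.ceil(samples / batch) * epochs


scaling = lora_scaling(LORA_ALPHA, LORA_R)
total_steps = optimizer_steps(TRAIN_SAMPLES, EFFECTIVE_BATCH, EPOCHS)
eval_fraction = EVAL_SAMPLES / (TRAIN_SAMPLES + EVAL_SAMPLES)

print(f"LoRA scaling factor: {scaling}")
print(f"Approx. optimizer steps: {total_steps}")
print(f"Eval share of dataset: {eval_fraction:.1%}")
```

Under these assumptions the update is scaled by 2.0, the run takes roughly 752 optimizer steps, and about 10% of the dataset is held out for evaluation.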

Metrics

Metric          Value
Eval Loss       1.121
Eval Accuracy   75.1%
Train Loss      0.34
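
Loss values can be easier to read as perplexity. The conversion below assumes both losses are mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card.

```python
import math


def perplexity(mean_loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_loss_nats)


eval_ppl = perplexity(1.121)   # eval loss from the metrics table
train_ppl = perplexity(0.34)   # train loss from the metrics table

print(f"eval perplexity  ~ {eval_ppl:.2f}")
print(f"train perplexity ~ {train_ppl:.2f}")
```

The two figures are only comparable if both losses were computed the same way (same tokenization and the same masking of prompt tokens).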

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model and adapter
base_model = "google/gemma-3-12b-it"
adapter = "tauken/gemma-3-12b-lithology-ru"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter)
model.eval()

# Generate description
prompt = "Ты опытный геолог-эксперт. Опиши литологию керна. Скважина: ZHS_028_25, интервал: 94.5-101.9м."
messages = [{"role": "user", "content": prompt}]
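# The Gemma chat template already emits <bos>, so skip the tokenizer's own special tokens
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False).to(model.device)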

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
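
When describing many intervals, it may help to build the prompt from structured fields rather than by hand. The helper below is a hypothetical convenience function, not part of the released adapter. It reproduces the prompt shape used in the example above; whether the training prompts followed exactly this wording is an assumption.

```python
def build_prompt(well_id: str, top_m: float, bottom_m: float) -> str:
    """Format a request in the same shape as the usage example's prompt."""
    if bottom_m <= top_m:
        raise ValueError("bottom_m must be greater than top_m")
    return (
        "Ты опытный геолог-эксперт. Опиши литологию керна. "
        f"Скважина: {well_id}, интервал: {top_m}-{bottom_m}м."
    )


prompt = build_prompt("ZHS_028_25", 94.5, 101.9)
messages = [{"role": "user", "content": prompt}]
print(prompt)
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` exactly as in the usage example.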

Example Output

Input: Скважина ZHS_028_25, интервал 94.5-101.9м

Output:

В интервале 94.5-101.9м: Андезитовые порфириты от темно-коричнево-зеленого до серого цвета
с зеленоватым оттенком. Структура порфировая, текстура массивная. Основная масса
скрытокристаллическая, порфировые включения выполнены плагиоклазом таблитчатых,
полуокруглых форм размерами до 5х12мм, составляющие около 40-50% от общей массы.

Породы слабо, умеренно трещиноватые в количестве до 5-7 трещин на п.м., преимущественно
ориентированные под углом 30-70° к оси керна. Вторичные изменения представлены слабым
окварцеванием и серицитизацией по массе.

Рудная минерализация выполнена пиритом в виде вкраплений по массе.

English translation of the example, for reference:

Input: well ZHS_028_25, interval 94.5-101.9 m

In the interval 94.5-101.9 m: andesite porphyrites, dark brownish-green to gray with a greenish tint. Structure porphyritic, texture massive. The groundmass is cryptocrystalline; the phenocrysts are plagioclase of tabular and subrounded shape, up to 5x12 mm in size, making up about 40-50% of the total mass.

The rocks are weakly to moderately fractured, with up to 5-7 fractures per running meter, predominantly oriented at 30-70° to the core axis. Secondary alteration is represented by weak silicification and sericitization throughout the mass.

Ore mineralization consists of pyrite disseminated throughout the mass.

Output Structure

The model follows a standard geological description format:

  1. Lithotype & Color - Rock type identification (e.g. андезитовые порфириты, andesite porphyrites) and color
  2. Texture & Structure - Porphyritic structure, phenocryst description
  3. Fracturing - Fracture density per running meter, angles to core axis
  4. Secondary Alterations - Veining, alteration minerals (окварцевание, серицитизация: silicification, sericitization)
  5. Mineralization - Ore minerals if present (пирит, pyrite)
  6. Contacts - Description of boundaries with other intervals
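
A reviewer may want a quick signal of which of these sections a draft actually covers. The sketch below is a crude keyword check, not a validation method used by the authors. The Russian keyword stems are illustrative assumptions and would need tuning against real descriptions.

```python
# Hypothetical keyword stems per section; matched against lowercased text.
SECTION_STEMS = {
    "Lithotype & Color": ["цвет"],
    "Texture & Structure": ["структур", "текстур"],
    "Fracturing": ["трещин"],
    "Secondary Alterations": ["изменени", "окварцеван", "серицитизац"],
    "Mineralization": ["минерализац"],
    "Contacts": ["контакт"],
}


def missing_sections(description: str) -> list[str]:
    """List sections for which none of the keyword stems occur in the text."""
    text = description.lower()
    return [
        section
        for section, stems in SECTION_STEMS.items()
        if not any(stem in text for stem in stems)
    ]


sample = (
    "Андезитовые порфириты серого цвета. Структура порфировая, текстура массивная. "
    "Породы слабо трещиноватые. Вторичные изменения представлены слабым окварцеванием. "
    "Рудная минерализация выполнена пиритом."
)
print(missing_sections(sample))  # ['Contacts']
```

A missing section is not necessarily an error (an interval may have no notable contacts), so the result is best treated as a prompt for the reviewing geologist.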

Limitations

⚠️ Important limitations:

  • Text-only model - Does not process images (for a VLM version, see the upcoming release)
  • May occasionally hallucinate depth intervals outside the requested range
  • Quantitative estimates (sizes, percentages) may be approximate
  • Best used as an assistant with mandatory expert verification
  • Not production-ready for autonomous use
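
Because the model may mention depth intervals outside the requested range, a simple post-generation check can flag suspicious drafts before review. This is a minimal sketch using a regular expression. The function name, the tolerance, and the assumption that depths appear as "start-end м" are illustrative, not part of the model's interface.

```python
import re

# Matches depth ranges such as "94.5-101.9м" or "94,5 – 101,9 м".
# The trailing \b keeps "мм" (millimetres) and longer words from matching.
_INTERVAL_RE = re.compile(
    r"(\d+(?:[.,]\d+)?)\s*[-–]\s*(\d+(?:[.,]\d+)?)\s*м\b"
)


def out_of_range_intervals(text, top_m, bottom_m, tol=0.05):
    """Return (start, end) depth ranges in text that fall outside [top_m, bottom_m]."""
    flagged = []
    for start_s, end_s in _INTERVAL_RE.findall(text):
        start = float(start_s.replace(",", "."))
        end = float(end_s.replace(",", "."))
        if start < top_m - tol or end > bottom_m + tol:
            flagged.append((start, end))
    return flagged


good = "В интервале 94.5-101.9м: порфировые включения размерами до 5х12мм, 40-50% от общей массы."
bad = "В интервале 94.5-105.0м: андезитовые порфириты."
print(out_of_range_intervals(good, 94.5, 101.9))  # []
print(out_of_range_intervals(bad, 94.5, 101.9))   # [(94.5, 105.0)]
```

A check like this only catches intervals written in the expected form. It does not replace expert verification of the description itself.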

Training Data

Geological descriptions from the Zhosabay copper-gold deposit (Kazakhstan):

  • Well intervals with detailed lithological descriptions in Russian
  • Written by professional geologists
  • Rock types: andesite porphyrites, metasomatites, weathering crusts
  • Features: fracturing, veining, alteration, mineralization

Intended Use

Recommended uses:

  • Drafting lithological descriptions for geologist review
  • Accelerating core logging workflows
  • Training in geological terminology

Not recommended:

  • Autonomous geological reporting without expert review
  • Critical mining decisions based solely on model output

License

This model inherits the Gemma license from the base model.

Citation

@misc{tauken-lithology-2025,
  author = {Tauken Team},
  title = {Gemma-3-12B Lithology Description Model},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/tauken/gemma-3-12b-lithology-ru}
}

Framework Versions

  • PEFT: 0.18.1
  • Transformers: 4.51+
  • PyTorch: 2.11.0
  • TRL: 0.27.0