asrith05/deepseek_pretrain_90k

This is a pretrained multilingual model based on the DeepSeek architecture, trained on English, Telugu, and Sanskrit data.

Model Details

  • Base Architecture: DeepSeek
  • Languages: English, Telugu, Sanskrit
  • Training Stage: Pretraining (90k steps)
  • Model Type: Base model (not fine-tuned)
  • Size: ~1253 MB

Description

This is the pretrained base checkpoint, taken before any task-specific fine-tuning. It was trained on a multilingual corpus of English, Telugu, and Sanskrit text and can serve as a foundation for downstream tasks.

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "asrith05/deepseek_pretrain_90k"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage
prompt = "The quick brown fox"
inputs = tokenizer(prompt, return_tensors="pt")
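# do_sample=True is required for temperature to take effect; greedy decoding ignores it
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)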
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
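
In `generate`, `temperature` only influences the output when sampling is enabled; under greedy decoding it is ignored. The following is a minimal sketch of a helper that keeps the two modes apart. The name `generation_kwargs` is hypothetical and not part of transformers; the keys it returns (`max_new_tokens`, `do_sample`, `temperature`) are standard `generate` arguments.

```python
def generation_kwargs(max_new_tokens=50, temperature=None):
    """Build keyword arguments for model.generate().

    Hypothetical helper, not part of transformers. Passing a temperature
    turns sampling on; leaving it as None selects greedy decoding.
    """
    kwargs = {"max_new_tokens": max_new_tokens}
    if temperature is None:
        # Greedy decoding: deterministic, temperature would be ignored.
        kwargs["do_sample"] = False
    else:
        if temperature <= 0:
            raise ValueError("temperature must be positive")
        # Sampling: temperature rescales the next-token distribution.
        kwargs["do_sample"] = True
        kwargs["temperature"] = temperature
    return kwargs
```

With the model and inputs from the usage snippet, this could be called as `model.generate(**inputs, **generation_kwargs(temperature=0.7))` for sampled output, or `model.generate(**inputs, **generation_kwargs())` for a reproducible greedy completion.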

Training Details

  • Training Steps: 90,000
  • Architecture: DeepSeek-based transformer
  • Context Length: 2048 tokens
  • Vocabulary: Multilingual (English/Telugu/Sanskrit)
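
Inputs longer than the 2048-token context need to be split before they are passed to the model. The sketch below shows one way to do that on a plain list of token ids, assuming the ids come from something like `tokenizer(text)["input_ids"]`. The function name `chunk_token_ids` and the `overlap` parameter are illustrative assumptions, not part of this model's tooling.

```python
def chunk_token_ids(token_ids, max_length=2048, overlap=0):
    """Split a list of token ids into windows of at most max_length tokens.

    Hypothetical helper. Consecutive windows share `overlap` tokens so that
    text at a window boundary keeps some preceding context.
    """
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must satisfy 0 <= overlap < max_length")
    step = max_length - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_length])
        # Stop once a window reaches the end of the sequence.
        if start + max_length >= len(token_ids):
            break
    return chunks
```

A non-zero `overlap` trades some duplicated computation for continuity between windows; with `overlap=0` the windows simply tile the sequence.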

Intended Use

This model is intended as a base model for:

  • Fine-tuning on specific tasks
  • Research in multilingual NLP
  • Building specialized applications
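
For fine-tuning, a common preparation step is building the `labels` sequence for causal language modeling. The sketch below assumes this checkpoint follows the usual transformers convention, where the model shifts labels internally and positions labeled `-100` are excluded from the loss; that assumption has not been verified against this specific model. The helper name `make_causal_lm_labels` is hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def make_causal_lm_labels(input_ids, attention_mask):
    """Copy input_ids into labels, masking out padding positions.

    Hypothetical helper. Positions where attention_mask is 0 (padding)
    receive IGNORE_INDEX so they do not contribute to the training loss.
    """
    if len(input_ids) != len(attention_mask):
        raise ValueError("input_ids and attention_mask must have equal length")
    return [
        token if keep == 1 else IGNORE_INDEX
        for token, keep in zip(input_ids, attention_mask)
    ]
```

The resulting list can be stored alongside `input_ids` and `attention_mask` in each training example, for instance inside a dataset mapping function.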

Limitations

  • This is a base model and may require fine-tuning for specific tasks
  • Generated content should be reviewed for accuracy and appropriateness
  • May reflect biases present in training data

Training Data

The model was trained on a curated multilingual corpus including:

  • English text from various sources
  • Telugu language content
  • Sanskrit texts and literature

License

This model is released under the Apache 2.0 License.

Model size: 0.1B params (Safetensors, tensor type F32)