---
language:
  - en
  - ko
library_name: transformers
pipeline_tag: text-generation
tags:
  - base-model
  - llama
  - causal-lm
  - pretrained
  - vllm
  - pytorch
model-index:
  - name: keural-alpha-base
    results: []
---

Keural-Alpha-Base

Model Summary

Keural-Alpha-Base is a base (foundation) large language model trained using a LLaMA-compatible, decoder-only Transformer architecture.
It is comparable in role to models such as GPT-2 (base) or LLaMA base, and is intended to serve as a strong pretrained backbone for downstream fine-tuning.

This model is not instruction-tuned and not chat-aligned.


Model Details

  • Model type: Causal Language Model
  • Architecture: LLaMA-style Transformer
  • Framework: Hugging Face Transformers
  • Tokenizer: SentencePiece (LLaMA-compatible)
  • Vocabulary size: 32,000
  • Max sequence length: 2,048 tokens

Architecture Configuration

Component                  Value
Architecture               LlamaForCausalLM
Hidden size                2048
Intermediate size          8192
Number of layers           24
Attention heads            16
Key-value heads            16
Head dimension             128
Activation                 SiLU
Normalization              RMSNorm (ε = 1e-6)
Dropout                    0.0
Vocabulary size            32,000
Max position embeddings    2048
Positional encoding        RoPE (θ = 10000)
Attention bias             Disabled
Weight tying               Disabled
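The configuration above implies roughly 1.74B parameters (displayed as "2B" once rounded on the hub). A back-of-the-envelope check, assuming the standard LLaMA layout (untied embeddings, SwiGLU MLP, RMSNorm) — a sketch, not an exact accounting of the checkpoint:

```python
# Rough parameter count derived from the configuration table above,
# assuming the standard LLaMA block layout (untied embeddings, SwiGLU MLP).
vocab, hidden, inter, layers = 32_000, 2048, 8192, 24
heads = kv_heads = 16
head_dim = 128  # note: heads * head_dim == hidden

attn = hidden * (heads * head_dim)           # Q projection
attn += 2 * hidden * (kv_heads * head_dim)   # K and V projections
attn += (heads * head_dim) * hidden          # output projection
mlp = 3 * hidden * inter                     # gate, up, and down projections
norms = 2 * hidden                           # input + post-attention RMSNorm

per_layer = attn + mlp + norms
total = layers * per_layer                   # all transformer blocks
total += vocab * hidden                      # input embeddings
total += vocab * hidden                      # LM head (weight tying disabled)
total += hidden                              # final RMSNorm

print(f"{total:,} parameters")               # ≈ 1.74B
print(f"{total * 4 / 2**30:.1f} GiB in F32")
```

In F32 this works out to roughly 6.5 GiB of weights, consistent with a safetensors checkpoint reported as ~2B parameters.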

Tokenizer Details

  • Tokenizer type: SentencePiece
  • BOS token: <s>
  • EOS token: </s>
  • PAD token: Not defined (standard for LLaMA base models)

The absence of a padding token is intentional and follows standard LLaMA base design.
During inference, it is recommended to set pad_token_id = eos_token_id and provide an explicit attention_mask.
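Because the checkpoint ships without a PAD token, batched inference has to reuse the EOS id as padding and supply an explicit mask. A minimal sketch of that bookkeeping in plain Python (the token ids are illustrative placeholders; EOS = 2 follows the usual LLaMA convention for </s>, and the real tokenizer does all of this once tokenizer.pad_token = tokenizer.eos_token is set):

```python
# Left-pad a batch of token-id sequences with the EOS id and build the
# matching attention mask, mimicking what the tokenizer does when
# pad_token = eos_token. Decoder-only models are left-padded so that
# generation continues from real tokens, not from padding.
EOS_ID = 2  # assumed LLaMA convention for </s>

def pad_batch(sequences, pad_id=EOS_ID):
    max_len = max(len(s) for s in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        pad = max_len - len(seq)
        input_ids.append([pad_id] * pad + list(seq))       # left padding
        attention_mask.append([0] * pad + [1] * len(seq))  # 0 = ignore pad
    return input_ids, attention_mask

# Illustrative ids only (1 = BOS); real ids come from the tokenizer.
ids, mask = pad_batch([[1, 15043, 29892], [1, 3186]])
print(ids)   # [[1, 15043, 29892], [2, 1, 3186]]
print(mask)  # [[1, 1, 1], [0, 1, 1]]
```

With transformers, setting tokenizer.pad_token = tokenizer.eos_token and tokenizer.padding_side = "left" produces the same behavior without manual bookkeeping.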


Intended Use

This model is designed for:

  • Further instruction tuning
  • Chat alignment
  • Domain-specific fine-tuning
  • Research on large language models
  • Serving as a pretrained backbone for NLP tasks

⚠️ Out-of-the-box generations may show repetition or incoherence, which is expected behavior for base models.


Limitations

  • Not instruction-following
  • Not safety-aligned
  • No RLHF applied
  • Requires fine-tuning for chat or production deployment

Supported Hardware

The model has been validated on:

  • NVIDIA H100 / H200
  • NVIDIA A100
  • NVIDIA Spark
  • Dell GB10
  • Modern RTX GPUs (Ampere / Ada / Blackwell)

vLLM Compatibility

Keural-Alpha-Base is fully compatible with vLLM.

Example:

python -m vllm.entrypoints.openai.api_server \
  --model mkd-hossain/keural-alpha-base \
  --served-model-name keural-alpha-base \
  --tensor-parallel-size 2 \
  --dtype bfloat16 \
  --max-model-len 2048 \
  --disable-log-stats



Example request:
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "keural-alpha-base",
    "prompt": "Hello, my name is",
    "max_tokens": 60,
    "temperature": 0.7,
    "top_p": 0.9,
    "repetition_penalty": 1.15
  }'
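The same completion request can be issued from Python using only the standard library. A sketch mirroring the curl call above (the vLLM server must already be running locally; endpoint and sampling parameters are taken from the examples above):

```python
import json
from urllib import request

API_URL = "http://localhost:8000/v1/completions"  # vLLM OpenAI-compatible server

def build_payload(prompt, max_tokens=60):
    # Mirrors the curl example above; a repetition penalty helps because
    # base models tend to loop without it.
    return {
        "model": "keural-alpha-base",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.7,
        "top_p": 0.9,
        "repetition_penalty": 1.15,
    }

def complete(prompt):
    req = request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]

payload = build_payload("Hello, my name is")
# print(complete("Hello, my name is"))  # requires the server to be running
```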


Usage Example (Transformers)

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "mkd-hossain/keural-alpha-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

tokenizer.pad_token = tokenizer.eos_token

inputs = tokenizer(
    "Hello, I am Hossain from Bangladesh.",
    return_tensors="pt"
).to(model.device)  # move inputs to the model's device (needed with device_map="auto")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,  # required for temperature/top_p to take effect
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.15,
        no_repeat_ngram_size=4,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))



Ethical Considerations

As a base model, Keural-Alpha-Base may generate biased, incorrect, or unsafe content.
Users are responsible for applying appropriate alignment, filtering, and safeguards before deployment.



Author

  • Organization: MKD Co LTD.
  • Project: Keural AI Systems
  • Model size: ~2B parameters (safetensors, F32)