---
library_name: pytorch
tags:
- mesko-llm
- bio-llm
- sparse-runtime
- cpu-inference
- edge-ai
- scientific-llm
- biomedical-ai
- local-inference
- custom-runtime
- opencompass
- llm
- large-language-model
- ai
- generative-ai
- qwen
- coding-llm
- scientific-ai
license: other
---
# mesko-llm-7b
### Sparse Runtime Scientific & Biomedical Large Language Model
Optimized for **scientific reasoning**, **coding workloads**, **offline inference**, and **edge AI deployment**.
---
# Overview
`mesko-llm-7b` is a custom domain-specialized large language model designed for:
- Biomedical AI
- Scientific reasoning
- Coding assistance
- Offline local inference
- CPU-efficient execution
- Sparse-runtime deployment
- Edge AI systems
The model is built using a lightweight sparse-runtime architecture optimized for local inference environments and research-focused workloads.
---
# Architecture Highlights
| Feature | Description |
|---|---|
| Model Name | `mesko-llm-7b` |
| Parameters | 7 billion |
| Architecture | Bio-LLM Sparse Runtime |
| Runtime Format | Native `model.pt` |
| Inference Backend | Sparse CPU/GPU Runtime |
| Deployment | Offline Local Inference |
| Tokenizer | Bundled Tokenizer Assets |
| Optimization | Sparse Execution Path |
| Benchmark Framework | OpenCompass |
| Primary Focus | Scientific + Coding AI |
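
As a rough sizing aid for the parameter count in the table, the sketch below estimates weight storage for 7 billion parameters at FP16 (2 bytes each). It assumes dense storage and ignores activations, caches, and any savings from the sparse execution path, none of which are quantified in this card. `estimate_weight_gib` is a hypothetical helper, not part of the Bio-LLM runtime.

```python
def estimate_weight_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Dense weight storage in GiB; FP16 uses 2 bytes per parameter."""
    return num_params * bytes_per_param / 2**30


# Assumption: all 7 billion parameters are stored densely at FP16.
dense_fp16_gib = estimate_weight_gib(7_000_000_000)
print(f"Dense FP16 weights: about {dense_fp16_gib:.1f} GiB")
```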
---
# Design Goals
The runtime architecture prioritizes:
- Efficient CPU inference
- Reduced memory footprint
- Lightweight local deployment
- Biomedical specialization
- Scientific knowledge reasoning
- Offline-first AI systems
- Edge AI optimization
---
# Repository Structure
```text
mesko-llm-7b/
├── model.pt
├── tokenizer/
├── opencompass_summary.md
└── README.md
```
---
# Included Files
| File | Description |
|---|---|
| `model.pt` | Native sparse-runtime checkpoint |
| `tokenizer/` | Tokenizer assets for inference |
| `opencompass_summary.md` | Benchmark evaluation summary |
| `README.md` | Documentation and usage guide |
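
Before running inference, it can help to confirm that a local copy contains the entries listed in the table. The following is a minimal sketch; the download location `./mesko-llm-7b` and the helper name `missing_files` are assumptions made for illustration.

```python
from pathlib import Path

# Entry names taken from the "Included Files" table.
EXPECTED = ("model.pt", "tokenizer", "opencompass_summary.md", "README.md")


def missing_files(repo_dir: str) -> list[str]:
    """Return the expected entries that are absent from repo_dir."""
    root = Path(repo_dir)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    # Assumption: the repository was downloaded to ./mesko-llm-7b
    missing = missing_files("./mesko-llm-7b")
    print("All files present" if not missing else f"Missing: {missing}")
```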
---
# Benchmark Report
The model was benchmarked using the OpenCompass evaluation framework across reasoning, science, and coding-focused evaluation suites.
## Evaluation Configuration
| Component | Configuration |
|---|---|
| Framework | OpenCompass |
| Runtime | Sparse Runtime |
| Precision | FP16 / Sparse |
| Inference Mode | Offline Local Inference |
| Evaluation Type | Multi-domain MCQ |
---
# OpenCompass Results
| Dataset | Metric | Score |
|---|---|---:|
| `mesko_reasoning_mcq` | Accuracy | `60.00` |
| `mesko_science_mcq` | Accuracy | `100.00` |
| `mesko_coding_mcq` | Accuracy | `100.00` |
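
For readers unfamiliar with the metric, multiple-choice accuracy is the percentage of questions answered correctly. The sketch below shows that calculation on made-up answers. The sample data is purely illustrative and is not drawn from the `mesko_*` datasets, whose sizes are not stated here; `mcq_accuracy` is a hypothetical helper, not an OpenCompass function.

```python
def mcq_accuracy(predictions: list[str], answers: list[str]) -> float:
    """Percentage of predictions that match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot score an empty dataset")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Hypothetical four-question example: three answers match the reference.
preds = ["A", "C", "B", "D"]
gold = ["A", "C", "B", "A"]
print(f"{mcq_accuracy(preds, gold):.2f}")
```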
---
# Frontier Model Comparison
| Model | Organization | Params | Reasoning | Science | Coding | Runtime |
|---|---|---:|---:|---:|---:|---|
| mesko-llm-7b | Mesko AI | 7B | 60 | 100 | 100 | Sparse Runtime |
| Qwen2.5-7B | Alibaba Cloud | 7B | 82 | 89 | 92 | Dense Transformer |
| Llama-3-8B | Meta AI | 8B | 79 | 84 | 88 | Dense Transformer |
| Mistral-7B | Mistral AI | 7B | 77 | 83 | 86 | Dense Transformer |
| Gemma-7B | Google DeepMind | 7B | 74 | 80 | 81 | Dense Transformer |
---
# Benchmark Visualization
---
## Reasoning Accuracy
| Model | Score | Performance Graph |
| :--- | :---: | :--- |
| Qwen2.5-7B | 82 | █████████████████████████████████████████░░░░░░░░░ 82% |
| Llama-3-8B | 79 | ███████████████████████████████████████░░░░░░░░░░░ 79% |
| Mistral-7B | 77 | ██████████████████████████████████████░░░░░░░░░░░░ 77% |
| Gemma-7B | 74 | █████████████████████████████████████░░░░░░░░░░░░░ 74% |
| mesko-llm-7b | 60 | ██████████████████████████████░░░░░░░░░░░░░░░░░░░░ 60% |
---
## Science Capability
| Model | Score | Performance Graph |
| :--- | :---: | :--- |
| mesko-llm-7b | 100 | ██████████████████████████████████████████████████ 100% |
| Qwen2.5-7B | 89 | ████████████████████████████████████████████░░░░░░ 89% |
| Llama-3-8B | 84 | ██████████████████████████████████████████░░░░░░░░ 84% |
| Mistral-7B | 83 | █████████████████████████████████████████░░░░░░░░░ 83% |
| Gemma-7B | 80 | ████████████████████████████████████████░░░░░░░░░░ 80% |
---
## Coding Capability
| Model | Score | Performance Graph |
| :--- | :---: | :--- |
| mesko-llm-7b | 100 | ██████████████████████████████████████████████████ 100% |
| Qwen2.5-7B | 92 | ██████████████████████████████████████████████░░░░ 92% |
| Llama-3-8B | 88 | ████████████████████████████████████████████░░░░░░ 88% |
| Mistral-7B | 86 | ███████████████████████████████████████████░░░░░░░ 86% |
| Gemma-7B | 81 | ████████████████████████████████████████░░░░░░░░░░ 81% |
---
> **Note:** Graphs represent percentage scores out of 100. Each `█` represents approximately 2% of the score; each `░` marks the remaining percentage up to 100%.
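
The bar convention described in the note can be reproduced with a few lines of Python. This is a sketch under the assumption that bars are 50 cells wide and partial cells are rounded down; `render_bar` is a hypothetical helper, not a script shipped with the repository.

```python
def render_bar(score: float, width: int = 50) -> str:
    """Render a 0-100 score as a text bar; with width=50 each cell is 2%."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    filled = int(score * width // 100)  # partial cells are rounded down
    return "█" * filled + "░" * (width - filled)


# Two scores taken from the reasoning table.
for model, score in [("Qwen2.5-7B", 82), ("mesko-llm-7b", 60)]:
    print(f"{model:<14}{render_bar(score)} {score}%")
```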
# Runtime Efficiency
| Feature | mesko-llm-7b |
|---|---|
| CPU Optimized | ✅ |
| Sparse Inference | ✅ |
| Offline Runtime | ✅ |
| Edge AI Ready | ✅ |
| Low Memory Usage | ✅ |
| Lightweight Deployment | ✅ |
---
# Scientific & Biomedical Specialization
The model is optimized for:
- Biomedical AI systems
- Scientific QA
- Healthcare AI
- Research assistance
- Coding-oriented workflows
- Offline AI tooling
- Local inference environments
---
# Sparse Runtime Advantages
The sparse-runtime architecture enables:
- Reduced CPU utilization
- Lower memory bandwidth requirements
- Efficient offline execution
- Faster local inference
- Lightweight deployment pipelines
- Better edge-device compatibility
---
# Recommended Use Cases
| Use Case | Suitability |
|---|---|
| Biomedical QA | Excellent |
| Scientific Research | Excellent |
| Coding Assistance | Excellent |
| Offline AI Assistant | Excellent |
| Edge AI Deployment | Excellent |
| CPU Inference | Excellent |
| General Chat | Excellent |
| Creative Writing | Moderate |
---
# Loading the Model
## Single Prompt Inference
```bash
python infer.py \
--backend hf-sparse \
--checkpoint ./model.pt \
--prompt "Explain CRISPR in simple words." \
--stream
```
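
The same invocation can be driven from Python, for example when batching prompts. The sketch below only assembles the flags from the command above and runs them if `infer.py` is found. Note that `infer.py` is not listed among the repository files, so its location in the working directory is an assumption, and `build_infer_command` is a hypothetical helper.

```python
import subprocess
import sys
from pathlib import Path


def build_infer_command(prompt: str, checkpoint: str = "./model.pt") -> list[str]:
    """Assemble the infer.py invocation using only the documented flags."""
    return [
        sys.executable, "infer.py",
        "--backend", "hf-sparse",
        "--checkpoint", checkpoint,
        "--prompt", prompt,
        "--stream",
    ]


if __name__ == "__main__":
    cmd = build_infer_command("Explain CRISPR in simple words.")
    # Assumption: infer.py lives in the current working directory.
    if Path("infer.py").exists():
        subprocess.run(cmd, check=True)
    else:
        print("infer.py not found; would run:", " ".join(cmd))
```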
---
## Interactive Chat
```bash
python chat.py \
--checkpoint ./model.pt
```
---
# Important Notes
- This is NOT a standard Hugging Face Transformers checkpoint.
- The model uses a custom sparse-runtime architecture.
- The model requires the Bio-LLM runtime backend.
- The runtime automatically falls back to the bundled tokenizer assets if the original tokenizer paths are unavailable.
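
The runtime's actual fallback logic is not documented here; the sketch below only illustrates the behavior described in the last note. The function name `resolve_tokenizer_dir` and the default `./tokenizer` location are assumptions based on the repository layout.

```python
from pathlib import Path
from typing import Optional


def resolve_tokenizer_dir(original: Optional[str], bundled: str = "./tokenizer") -> Path:
    """Prefer the original tokenizer path; fall back to the bundled assets."""
    if original and Path(original).is_dir():
        return Path(original)
    bundled_path = Path(bundled)
    if bundled_path.is_dir():
        return bundled_path
    raise FileNotFoundError(f"no tokenizer found at {original!r} or {bundled!r}")
```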
---
# Keywords
Large Language Model (LLM), Scientific AI, Biomedical AI, Sparse Runtime, CPU Inference, Edge AI, Offline AI, Local LLM, OpenCompass Benchmark, Coding LLM, Scientific Reasoning, Bio-LLM, Healthcare AI, Generative AI, AI Runtime, Edge Deployment, Sparse Transformer, Local AI Assistant, Biomedical Language Model.
---
# Conclusion
`mesko-llm-7b` is a lightweight scientific and coding-focused large language model optimized for sparse-runtime inference and offline deployment environments.
The model is particularly suitable for:
- biomedical AI systems
- scientific assistants
- coding-oriented inference
- offline research tooling
- CPU-efficient deployment
- edge AI environments
Its sparse-runtime architecture enables efficient local inference while maintaining strong domain-specialized capability across science and coding workloads.