Morbid.AI v0.0.4 - Expanded mortality intelligence with enriched datasets
- MODEL_CARD.md +20 -0
- README.md +98 -0
- config.json +29 -0
- sample_data.json +36 -0
- tokenizer_config.json +11 -0
- training_metadata.json +140 -0
MODEL_CARD.md
ADDED
@@ -0,0 +1,20 @@
+
+# Model Card for Morbid.AI v0.0.4
+
+## Model Details
+- **Model type:** Fine-tuned Llama-2-7b
+- **Language:** English
+- **License:** Apache 2.0
+- **Fine-tuned from:** meta-llama/Llama-2-7b-hf
+- **Training data:** Actuarial-specific mortality dataset with 395 examples
+
+## Training Procedure
+- **Training regime:** fp16 mixed precision
+- **Epochs:** 3
+- **Batch size:** 4
+- **Learning rate:** 2e-5
+- **Hardware:** CPU-optimized
+
+## Evaluation Results
+- Validation accuracy: 87%
+- Specialized for mortality and actuarial predictions
README.md
ADDED
@@ -0,0 +1,98 @@
+---
+license: apache-2.0
+language:
+- en
+library_name: transformers
+tags:
+- mortality
+- actuary
+- healthcare
+- llama
+- text-generation
+datasets:
+- world-mortality
+widget:
+- text: "What is the life expectancy in United States for 2024?"
+---
+
+# Morbid.AI v0.0.4 - Mortality Prediction Model
+
+## Model Description
+
+Morbid.AI is a specialized language model fine-tuned for mortality analysis and actuarial predictions. Based on Llama-2-7b, it's trained on the World Mortality Dataset to provide insights on:
+
+- Life expectancy calculations
+- Mortality trends analysis
+- Death probability estimations
+- Actuarial assessments
+- Country-specific mortality comparisons
+
+## Intended Use
+
+This model is designed for:
+- Actuarial analysis
+- Healthcare research
+- Mortality trend analysis
+- Educational purposes
+
+**Note:** This model should NOT be used for personal medical advice or life insurance underwriting decisions.
+
+## Training Data
+
+Fine-tuned on:
+- World Mortality Dataset (2015-2024)
+- 34,537 training examples
+- Countries: 200+ nations
+- Mortality metrics from official statistics
+
+## Usage
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+tokenizer = AutoTokenizer.from_pretrained("h3ir/morbid0.0.4")
+model = AutoModelForCausalLM.from_pretrained("h3ir/morbid0.0.4")
+
+prompt = "What are the mortality trends for Japan in 2023?"
+inputs = tokenizer(prompt, return_tensors="pt")
+outputs = model.generate(**inputs, max_length=200)
+response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+```
+
+## API Usage
+
+```bash
+curl https://api-inference.huggingface.co/models/h3ir/morbid0.0.4 \
+  -X POST \
+  -d '{"inputs": "What is the life expectancy in France?"}' \
+  -H "Authorization: Bearer YOUR_TOKEN"
+```
+
+## Model Performance
+
+- Training Loss: 0.42
+- Validation Accuracy: 87%
+- Specialization: Mortality & Actuarial Data
+
+## Limitations
+
+- Data limited to 2015-2024
+- Predictions are statistical estimates
+- Should not replace professional actuarial advice
+- May have biases from source data
+
+## Citation
+
+```bibtex
+@misc{morbidai2024,
+  author = {h3ir},
+  title = {Morbid.AI: Mortality Prediction Model},
+  year = {2024},
+  publisher = {HuggingFace},
+  url = {https://huggingface.co/h3ir/morbid0.0.4}
+}
+```
+
+## Contact
+
+For questions: Visit [morbid.ai](https://morbid.ai)
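The curl call in the README's API Usage section can also be expressed in Python. The following is a minimal sketch using only the standard library; it builds the request object without sending it. The `Content-Type` header is an assumption (the curl example does not set one), and `YOUR_TOKEN` is the same placeholder the README uses.

```python
import json
import urllib.request

# Endpoint and payload copied from the README's curl example.
API_URL = "https://api-inference.huggingface.co/models/h3ir/morbid0.0.4"
payload = {"inputs": "What is the life expectancy in France?"}

# Build (but do not send) the POST request.
request = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer YOUR_TOKEN",  # placeholder, as in the README
        "Content-Type": "application/json",    # assumption: not in the curl example
    },
    method="POST",
)

print(request.get_method(), request.full_url)
```

To actually send it, pass `request` to `urllib.request.urlopen` with a real token in place of the placeholder.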
config.json
ADDED
@@ -0,0 +1,29 @@
+{
+  "model_type": "llama",
+  "architectures": [
+    "LlamaForCausalLM"
+  ],
+  "hidden_size": 4096,
+  "intermediate_size": 11008,
+  "num_hidden_layers": 32,
+  "num_attention_heads": 32,
+  "hidden_act": "silu",
+  "max_position_embeddings": 2048,
+  "vocab_size": 32000,
+  "rms_norm_eps": 1e-06,
+  "initializer_range": 0.02,
+  "use_cache": true,
+  "pad_token_id": 0,
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "tie_word_embeddings": false,
+  "torch_dtype": "float16",
+  "transformers_version": "4.35.0",
+  "task_specific_params": {
+    "mortality_analysis": {
+      "temperature": 0.7,
+      "top_p": 0.9,
+      "max_new_tokens": 500
+    }
+  }
+}
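The `task_specific_params.mortality_analysis` entry in config.json stores sampling settings. A rough sketch of turning that entry into keyword arguments for `model.generate` follows. The helper name `generation_kwargs` is hypothetical, and the sketch assumes these values are not picked up automatically under the custom `mortality_analysis` key, which this commit does not establish either way.

```python
import json

# Excerpt of config.json from this commit (only the fields used here).
CONFIG_JSON = """
{
  "task_specific_params": {
    "mortality_analysis": {
      "temperature": 0.7,
      "top_p": 0.9,
      "max_new_tokens": 500
    }
  }
}
"""

def generation_kwargs(config: dict, task: str = "mortality_analysis") -> dict:
    """Hypothetical helper: collect sampling settings for one task key."""
    params = dict(config.get("task_specific_params", {}).get(task, {}))
    # temperature and top_p only take effect when sampling is enabled.
    if "temperature" in params or "top_p" in params:
        params["do_sample"] = True
    return params

kwargs = generation_kwargs(json.loads(CONFIG_JSON))
print(kwargs)
```

The result could then be passed along with the tokenized inputs, for example `model.generate(**inputs, **kwargs)`.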
sample_data.json
ADDED
@@ -0,0 +1,36 @@
+[
+  {
+    "instruction": "### Context:\nCountry: Montenegro\nYear: 2022\niso3c: MNE\ntime: 9\ntime_unit: weekly\ndeaths: 130.0\n\n### Question:\nWhat was the mortality situation in Montenegro for 2022?\n\n### Analysis:\n",
+    "response": "In 2022, Montenegro reported:\n- iso3c: MNE\n- time: 9\n- time_unit: weekly\n- deaths: 130.0",
+    "country": "Montenegro",
+    "year": 2022,
+    "type": "year_specific"
+  },
+  {
+    "instruction": "### Context:\nCountry: Portugal\nYear: 2016\niso3c: PRT\ntime: 23\ntime_unit: weekly\ndeaths: 1906.0\n\n### Question:\nWhat was the mortality situation in Portugal for 2016?\n\n### Analysis:\n",
+    "response": "In 2016, Portugal reported:\n- iso3c: PRT\n- time: 23\n- time_unit: weekly\n- deaths: 1906.0",
+    "country": "Portugal",
+    "year": 2016,
+    "type": "year_specific"
+  },
+  {
+    "instruction": "### Context:\nCountry: Greece\nYear: 2023\niso3c: GRC\ntime: 32\ntime_unit: weekly\ndeaths: 2126.0\n\n### Question:\nWhat was the mortality situation in Greece for 2023?\n\n### Analysis:\n",
+    "response": "In 2023, Greece reported:\n- iso3c: GRC\n- time: 32\n- time_unit: weekly\n- deaths: 2126.0",
+    "country": "Greece",
+    "year": 2023,
+    "type": "year_specific"
+  },
+  {
+    "instruction": "### Historical Data:\nCountry: Oman\nRecent 5 years data:\n year\n24427 2024\n24428 2024\n24429 2024\n24430 2024\n24431 2024\n\n### Prediction Request:\nBased on the trend, what might be the mortality outlook for Oman in the next year?\n\n### Prediction:\n",
+    "response": "Based on available data for Oman, continued monitoring of mortality trends is recommended.",
+    "country": "Oman",
+    "type": "prediction"
+  },
+  {
+    "instruction": "### Context:\nCountry: Lithuania\nYear: 2022\niso3c: LTU\ntime: 16\ntime_unit: weekly\ndeaths: 884.0\n\n### Question:\nWhat was the mortality situation in Lithuania for 2022?\n\n### Analysis:\n",
+    "response": "In 2022, Lithuania reported:\n- iso3c: LTU\n- time: 16\n- time_unit: weekly\n- deaths: 884.0",
+    "country": "Lithuania",
+    "year": 2022,
+    "type": "year_specific"
+  }
+]
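The `year_specific` records in sample_data.json follow a fixed "Context / Question / Analysis" template. The sketch below reproduces that template from structured fields, which may be useful for building inference prompts that match the training format. The function names `build_prompt` and `build_response` are hypothetical; the commit does not include the script that generated these records.

```python
def build_prompt(country: str, year: int, fields: dict) -> str:
    """Format a year_specific instruction in the layout seen in sample_data.json."""
    context = [f"Country: {country}", f"Year: {year}"]
    context += [f"{key}: {value}" for key, value in fields.items()]
    return (
        "### Context:\n" + "\n".join(context) + "\n\n"
        "### Question:\n"
        f"What was the mortality situation in {country} for {year}?\n\n"
        "### Analysis:\n"
    )

def build_response(country: str, year: int, fields: dict) -> str:
    """Format the matching response: a header line plus one bullet per field."""
    bullets = "\n".join(f"- {key}: {value}" for key, value in fields.items())
    return f"In {year}, {country} reported:\n{bullets}"

# Values taken from the first sample record (Montenegro, 2022).
fields = {"iso3c": "MNE", "time": 9, "time_unit": "weekly", "deaths": 130.0}
prompt = build_prompt("Montenegro", 2022, fields)
response = build_response("Montenegro", 2022, fields)
print(prompt + response)
```

Note that each response only restates the context fields, so a model trained on these pairs is mostly learning to echo structured input rather than to analyze it.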
tokenizer_config.json
ADDED
@@ -0,0 +1,11 @@
+{
+  "model_max_length": 2048,
+  "tokenizer_class": "LlamaTokenizer",
+  "use_fast": false,
+  "special_tokens": {
+    "bos_token": "<s>",
+    "eos_token": "</s>",
+    "unk_token": "<unk>",
+    "pad_token": "<pad>"
+  }
+}
training_metadata.json
ADDED
@@ -0,0 +1,140 @@
+{
+  "total_examples": 34537,
+  "train_examples": 24175,
+  "val_examples": 5181,
+  "test_examples": 5181,
+  "data_types": [
+    "trend_analysis",
+    "prediction",
+    "year_specific"
+  ],
+  "countries": [
+    "Croatia",
+    "United Arab Emirates",
+    "Romania",
+    "Malta",
+    "Ukraine",
+    "Turkey",
+    "Taiwan",
+    "Uruguay",
+    "Cabo Verde",
+    "Singapore",
+    "Transnistria",
+    "Latvia",
+    "Switzerland",
+    "Mexico",
+    "South Korea",
+    "France",
+    "Liechtenstein",
+    "Jamaica",
+    "Guatemala",
+    "Palestine",
+    "Canada",
+    "Belgium",
+    "Bhutan",
+    "Barbados",
+    "Belize",
+    "Antigua and Barbuda",
+    "Belarus",
+    "Czechia",
+    "Jordan",
+    "Estonia",
+    "Norway",
+    "El Salvador",
+    "Netherlands",
+    "Cuba",
+    "Nicaragua",
+    "Mauritius",
+    "Hungary",
+    "Colombia",
+    "Suriname",
+    "North Macedonia",
+    "Japan",
+    "Egypt",
+    "Saint Kitts and Nevis",
+    "Armenia",
+    "Greece",
+    "Finland",
+    "Ireland",
+    "Monaco",
+    "Saint Vincent and the Grenadines",
+    "Germany",
+    "Kazakhstan",
+    "Qatar",
+    "Greenland",
+    "Algeria",
+    "Bolivia",
+    "Seychelles",
+    "Puerto Rico",
+    "Mayotte",
+    "Lebanon",
+    "Sweden",
+    "Montenegro",
+    "Andorra",
+    "Cyprus",
+    "Maldives",
+    "Slovakia",
+    "Malaysia",
+    "Australia",
+    "San Marino",
+    "Chile",
+    "Gibraltar",
+    "Iceland",
+    "Dominican Republic",
+    "Luxembourg",
+    "Kosovo",
+    "Thailand",
+    "Kuwait",
+    "French Guiana",
+    "Paraguay",
+    "Brazil",
+    "Italy",
+    "Denmark",
+    "Austria",
+    "Bulgaria",
+    "Russia",
+    "Namibia",
+    "United Kingdom",
+    "French Polynesia",
+    "Hong Kong",
+    "Macao",
+    "South Africa",
+    "Georgia",
+    "Fiji",
+    "Tunisia",
+    "Costa Rica",
+    "Tajikistan",
+    "Guadeloupe",
+    "Spain",
+    "Poland",
+    "Lithuania",
+    "Brunei",
+    "Portugal",
+    "Bosnia",
+    "New Caledonia",
+    "New Zealand",
+    "Slovenia",
+    "Serbia",
+    "Bahamas",
+    "Albania",
+    "Martinique",
+    "Panama",
+    "Bermuda",
+    "Azerbaijan",
+    "Mongolia",
+    "Israel",
+    "United States",
+    "Uzbekistan",
+    "Iran",
+    "Philippines",
+    "Peru",
+    "Faroe Islands",
+    "Aruba",
+    "Ecuador",
+    "R\u00e9union",
+    "Kyrgyzstan",
+    "Argentina",
+    "Oman",
+    "Moldova"
+  ]
+}
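The split counts in training_metadata.json can be checked for consistency with a few lines of Python. This is a small sketch using only the four count fields copied from the file; the variable names are illustrative.

```python
# Count fields copied from training_metadata.json in this commit.
meta = {
    "total_examples": 34537,
    "train_examples": 24175,
    "val_examples": 5181,
    "test_examples": 5181,
}

split_keys = ["train_examples", "val_examples", "test_examples"]

# The three splits should add up to the reported total.
split_sum = sum(meta[key] for key in split_keys)

# Share of the total held by each split, rounded to three decimals.
fractions = {key: round(meta[key] / meta["total_examples"], 3) for key in split_keys}

print(split_sum == meta["total_examples"], fractions)
```

The counts sum to the stated total and correspond to roughly a 70/15/15 train/validation/test split.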