rtferraz/domainTokenizer
arxiv:
9 papers
domainTokenizer/notebooks (branch: main, 57.9 kB)
1 contributor
History: 7 commits
Latest commit: e4d8561 (verified) by rtferraz, 1 day ago
Fix label leakage: temporal split → use first 70% of events as input, predict purchase in last 30%. Remove n_purchases/purchase_rate from features.
01_finance_pretrain.ipynb (19.2 kB, Safe, 6 days ago)
Fix notebook: total_mem → total_memory, add hub_model_id push, add wandb logging support

02_ecommerce_pretrain.ipynb (20.8 kB, 1 day ago)
Update 02_ecommerce notebook: add HF login, memory-free cell, subsample option for <64GB RAM machines

03_ecommerce_finetune.ipynb (17.9 kB, 1 day ago)
Fix label leakage: temporal split → use first 70% of events as input, predict purchase in last 30%. Remove n_purchases/purchase_rate from features.
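
The label-leakage fix named in the commit message for 03_ecommerce_finetune.ipynb can be sketched roughly as below. This is a minimal illustration and not the notebook's actual code: the event schema (`ts`, `type`), the function names `temporal_split` and `build_features`, the `input_pct` parameter, and the example feature names are all assumptions made for this sketch. The idea is that features are computed only from the earliest 70% of a user's events, while the label is derived only from the remaining 30%, so no information from the prediction window reaches the model input.

```python
from typing import Dict, List, Tuple

# Hypothetical event schema for illustration: {"ts": int, "type": str}
Event = Dict[str, object]


def temporal_split(events: List[Event], input_pct: int = 70) -> Tuple[List[Event], int]:
    """Split one user's events by time into (input history, purchase label).

    The first `input_pct` percent of events (by timestamp) form the model input.
    The label is 1 if any event in the remaining, later portion is a purchase.
    """
    ordered = sorted(events, key=lambda e: e["ts"])
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(ordered) * input_pct // 100
    history, future = ordered[:cut], ordered[cut:]
    label = int(any(e["type"] == "purchase" for e in future))
    return history, label


def build_features(history: List[Event]) -> Dict[str, int]:
    """Compute features from the input window only.

    Following the commit message, purchase-derived aggregates such as
    n_purchases and purchase_rate are deliberately left out.
    """
    return {
        "n_events": len(history),
        "n_views": sum(1 for e in history if e["type"] == "view"),
    }


# Usage: ten events for one user, with a single purchase late in the sequence.
events = [{"ts": t, "type": "view"} for t in range(10)]
events[8]["type"] = "purchase"
history, label = temporal_split(events)
features = build_features(history)
```

Splitting by event position within each user's sequence is only one way to draw the boundary; a fixed calendar cutoff shared by all users would be an alternative, and the source does not say which variant the notebook uses beyond the 70/30 ratio.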