Update README.md
README.md CHANGED
@@ -1,32 +1,7 @@
# ChemFMv2-20M

-
-
-## Model Details
-
-- **Model Type**: LlamaForCausalLM
-- **Architecture**: LLaMA-based
-- **Parameters**: 20M
-- **Hidden Size**: 640
-- **Layers**: 4
-- **Attention Heads**: 10
-- **Vocabulary Size**: 320
-- **Max Position Embeddings**: 512
+ChemFM is a large-scale foundation model specifically designed for chemistry.
+It has been [pre-trained](https://github.com/TheLuoFengLab/ChemFM/tree/master/pretraining) on 178 million molecules from [UniChem](https://www.ebi.ac.uk/unichem/) using self-supervised causal language modeling, enabling the extraction of versatile and generalizable molecular representations.

## Usage
-
-```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
-
-# Load model and tokenizer
-model_name = "ChemFM/ChemFMv2-20M"
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForCausalLM.from_pretrained(model_name)
-
-# Example usage
-text = "Your chemical input here"
-inputs = tokenizer(text, return_tensors="pt")
-outputs = model.generate(**inputs, max_length=100)
-result = tokenizer.decode(outputs[0], skip_special_tokens=True)
-print(result)
-```
+The code for using this model is provided in this [GitHub repository](https://github.com/TheLuoFengLab/ChemFM).
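For quick reference, below is a minimal loading sketch in the style of the snippet removed above. The `transformers` calls and the `ChemFM/ChemFMv2-20M` model id come from that earlier snippet; the SMILES prompt and the generation settings are illustrative placeholders, and the linked repository remains the authoritative usage guide.

```python
# Minimal sketch following the snippet from the previous revision of this card;
# see the linked ChemFM repository for the supported workflow.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "ChemFM/ChemFMv2-20M"  # model id used in the previous revision
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Illustrative prompt: a partial SMILES string for the causal LM to continue.
prompt = "CC(=O)O"  # placeholder; the actual prompt format may differ
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```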