feiyang-cai committed · verified
Commit c401070 · 1 Parent(s): a1e9843

Update README.md

Files changed (1):
1. README.md (+3 −28)

README.md CHANGED
@@ -1,32 +1,7 @@
  # ChemFMv2-20M

- This is the ChemFMv2-20M model.
-
- ## Model Details
-
- - **Model Type**: LlamaForCausalLM
- - **Architecture**: LLaMA-based
- - **Parameters**: 20M
- - **Hidden Size**: 640
- - **Layers**: 4
- - **Attention Heads**: 10
- - **Vocabulary Size**: 320
- - **Max Position Embeddings**: 512

  ## Usage
-
- ```python
- from transformers import AutoTokenizer, AutoModelForCausalLM
-
- # Load model and tokenizer
- model_name = "ChemFM/ChemFMv2-20M"
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForCausalLM.from_pretrained(model_name)
-
- # Example usage
- text = "Your chemical input here"
- inputs = tokenizer(text, return_tensors="pt")
- outputs = model.generate(**inputs, max_length=100)
- result = tokenizer.decode(outputs[0], skip_special_tokens=True)
- print(result)
- ```
  # ChemFMv2-20M

+ ChemFM is a large-scale foundation model specifically designed for chemistry.
+ It has been [pre-trained](https://github.com/TheLuoFengLab/ChemFM/tree/master/pretraining) on 178 million molecules from [UniChem](https://www.ebi.ac.uk/unichem/) using self-supervised causal language modeling, enabling the extraction of versatile and generalizable molecular representations.

  ## Usage
+ The code for using this model is provided in this [GitHub repository](https://github.com/TheLuoFengLab/ChemFM).
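For convenience, a minimal usage sketch is reproduced below. It assumes the checkpoint exposes the standard `transformers` causal-LM interface, as the pre-commit README's example did; the linked repository remains the authoritative reference, and the prompt string here is purely illustrative.

```python
# Hypothetical sketch mirroring the example removed in this commit.
# Assumes "ChemFM/ChemFMv2-20M" loads with the standard transformers
# AutoTokenizer / AutoModelForCausalLM classes.
from transformers import AutoTokenizer, AutoModelForCausalLM


def generate_from_prompt(prompt: str,
                         model_name: str = "ChemFM/ChemFMv2-20M",
                         max_length: int = 100) -> str:
    """Continue a (e.g. SMILES) prompt with the ChemFM checkpoint."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    # max_length must stay within the model's position-embedding limit
    outputs = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `generate_from_prompt("CCO")` would download the checkpoint on first use and return the decoded continuation as a plain string.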