devilyouwei committed on
Commit 0960430 · verified · 1 Parent(s): 6d3ed54

anonymous readme

Files changed (1)
  1. README.md +18 -25
README.md CHANGED
@@ -59,38 +59,31 @@ training_args = TrainingArguments(
 
 To train and deploy the SmartBERT V2 model for Web API services, please refer to our GitHub repository: [web3se-lab/SmartBERT](https://github.com/web3se-lab/SmartBERT).
 
-Or use pipline:
+Or use pipeline:
 
 ```python
-from transformers import RobertaTokenizer, RobertaForMaskedLM, pipeline
+import torch
+from transformers import RobertaTokenizer, RobertaModel
 
-model = RobertaForMaskedLM.from_pretrained('web3se/SmartBERT-v3')
-tokenizer = RobertaTokenizer.from_pretrained('web3se/SmartBERT-v3')
+tokenizer = RobertaTokenizer.from_pretrained("web3se/SmartBERT-v2")
+model = RobertaModel.from_pretrained("web3se/SmartBERT-v2")
 
-code_example = "function totalSupply() external view <mask> (uint256);"
-fill_mask = pipeline('fill-mask', model=model, tokenizer=tokenizer)
+code = "function totalSupply() external view returns (uint256);"
 
-outputs = fill_mask(code_example)
-print(outputs)
-```
-
-## Contributors
+inputs = tokenizer(
+    code,
+    return_tensors="pt",
+    truncation=True,
+    max_length=512
+)
 
-- [Youwei Huang](https://www.devil.ren)
-- [Sen Fang](https://github.com/TomasAndersonFang)
+with torch.no_grad():
+    outputs = model(**inputs)
 
-## Citations
+# Option 1: CLS embedding
+cls_embedding = outputs.last_hidden_state[:, 0, :]
 
-```tex
-@article{huang2025smart,
-  title={Smart Contract Intent Detection with Pre-trained Programming Language Model},
-  author={Huang, Youwei and Li, Jianwen and Fang, Sen and Li, Yao and Yang, Peng and Hu, Bin},
-  journal={arXiv preprint arXiv:2508.20086},
-  year={2025}
-}
+# Option 2: Mean pooling (often better for code)
+mean_embedding = outputs.last_hidden_state.mean(dim=1)
 ```
 
-## Sponsors
-
-- [Institute of Intelligent Computing Technology, Suzhou, CAS](http://iict.ac.cn/)
-- CAS Mino (中科劢诺)
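The two pooling options in the new snippet differ only in how they collapse the model's `(batch, seq_len, hidden)` output tensor into one vector per input. A minimal sketch with a random stand-in tensor (the 768 hidden size matches RoBERTa-base and is an assumption about this checkpoint's config):

```python
import torch

# Stand-in for outputs.last_hidden_state: batch of 1, 4 tokens, hidden size 768
last_hidden_state = torch.randn(1, 4, 768)

# Option 1: CLS pooling keeps only the first token's vector
cls_embedding = last_hidden_state[:, 0, :]

# Option 2: mean pooling averages over the token dimension
mean_embedding = last_hidden_state.mean(dim=1)

# Both reduce a variable-length sequence to one fixed-size embedding
print(cls_embedding.shape, mean_embedding.shape)  # torch.Size([1, 768]) torch.Size([1, 768])
```

Either vector can then feed a downstream classifier or a cosine-similarity comparison between two pieces of contract code; mean pooling uses every token's representation rather than relying on the single `<s>` position.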