```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("CAUKiel/JavaBERT-uncased")
model = AutoModelForMaskedLM.from_pretrained("CAUKiel/JavaBERT-uncased")
```
# JavaBERT

A BERT-like model pretrained on Java software code.
## Training Data

The model was trained on 2,998,345 Java files retrieved from open-source projects on GitHub. It uses the `bert-base-uncased` tokenizer.
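Because the tokenizer comes from `bert-base-uncased` rather than being trained on source code, Java keywords and identifiers are lowercased and may be split into several WordPiece units. A quick check (assumes network access to the Hugging Face Hub):

```python
from transformers import AutoTokenizer

# Assumption: the tokenizer can be downloaded from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("CAUKiel/JavaBERT-uncased")

tokens = tokenizer.tokenize("public static void main(String[] args)")
print(tokens)  # lowercased WordPiece tokens; rare identifiers split into '##' subwords
```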
## Training Objective

An MLM (masked language modeling) objective was used to train this model.
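In the standard BERT MLM recipe, a fraction of input tokens are selected as prediction targets; of those, most are replaced with `[MASK]`, some with a random token, and some left unchanged. A minimal sketch of that selection logic on a toy Java token sequence (the function and the 15%/80/10/10 rates below follow the standard BERT recipe, not code from this repository):

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """BERT-style masking: of the selected positions, 80% become [MASK],
    10% a random token, 10% stay unchanged. Labels record the originals."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must predict the original token here
            r = rng.random()
            if r < 0.8:
                masked.append(mask_token)
            elif r < 0.9:
                masked.append(rng.choice(tokens))  # random replacement token
            else:
                masked.append(tok)  # left unchanged, but still a target
        else:
            labels.append(None)  # position is ignored by the loss
            masked.append(tok)
    return masked, labels

tokens = "public static void main ( String [ ] args )".split()
masked, labels = mask_tokens(tokens)
```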
## Usage

```python
from transformers import pipeline

pipe = pipeline('fill-mask', model='CAUKiel/JavaBERT')
output = pipe(CODE)  # Replace CODE with a string of Java code; use '[MASK]' to mask tokens in it.
```
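As a concrete, illustrative run (the Java snippet below is made up; `top_k`, `token_str`, and `score` are standard fill-mask pipeline parameters and output fields):

```python
from transformers import pipeline

pipe = pipeline('fill-mask', model='CAUKiel/JavaBERT')

# Hypothetical Java snippet with one masked token (the return type of main).
code = "public class HelloWorld { public static [MASK] main(String[] args) { } }"
for candidate in pipe(code, top_k=3):
    print(candidate['token_str'], round(candidate['score'], 3))
```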