BERTose IAR Resolver

This repository contains the contrastively refined BERTose checkpoint used for iterative ambiguity resolution (IAR) over ambiguous WURCS BPE tokens.

Quick Start

The recommended entry point is the companion notebook. To download the checkpoint and ambiguity map directly:

from huggingface_hub import hf_hub_download

checkpoint = hf_hub_download(
    repo_id="supanthadey1/bertose-iar-resolver",
    filename="checkpoints/bertose_iar_resolver.pt",
)
ambiguity_map = hf_hub_download(
    repo_id="supanthadey1/bertose-iar-resolver",
    filename="vocab/bpe_ambiguity_tokens.json",
)

No Hugging Face token is required; the repository is public.

Files

  • checkpoints/bertose_iar_resolver.pt - BERTose IAR checkpoint.
  • vocab/bpe_vocabulary.json - WURCS BPE vocabulary.
  • vocab/bpe_ambiguity_tokens.json - ambiguous-token map used by the resolver.
  • src/bertose_model.py - BERTose model definition.
  • src/bertose_layers.py - Transformer layers used by BERTose.
  • src/wurcs_bpe_tokenizer.py - WURCS BPE tokenizer.

Input

Provide one WURCS glycan string, or a CSV batch with the columns sample_id,wurcs. The resolver is intended for glycans that already contain uncertainty markers in WURCS form.

Free-text ambiguous glycan names are not parsed directly. Convert the name or IUPAC-condensed notation to WURCS first. If the structure is ambiguous, preserve that ambiguity in the WURCS string with WURCS-style uncertainty markers before running BERTose IAR.
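As a minimal sketch, a batch file with the sample_id,wurcs columns described above can be built with the standard library; the sample ID and WURCS string here are illustrative placeholders, not entries from this repository:

```python
import csv
import io

# Hypothetical batch rows: sample_id and wurcs are the columns the resolver
# expects; the WURCS string below is only an illustrative example of a
# structure carrying uncertainty markers ("x" positions).
rows = [
    {"sample_id": "G00001", "wurcs": "WURCS=2.0/1,1,0/[a2122h-1x-1x_1-5]/1/"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["sample_id", "wurcs"])
writer.writeheader()
writer.writerows(rows)

batch_csv = buf.getvalue()  # contents you would save as e.g. batch.csv
```

In practice you would write `batch_csv` to disk and point the companion notebook at that file.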

Output

Token-level ambiguity-resolution predictions with confidence scores. The companion notebook writes both summary and detail CSVs for batch runs.
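The exact column layout of the notebook's CSVs is not documented here; the following is a hypothetical sketch of how per-token detail records might be aggregated into a per-sample summary, with all field names (position, ambiguous_token, predicted_token, confidence, n_resolved, mean_confidence) assumed for illustration:

```python
# Hypothetical per-token detail records; actual column names produced by the
# companion notebook may differ.
detail = [
    {"sample_id": "G00001", "position": 4,
     "ambiguous_token": "1x", "predicted_token": "1b", "confidence": 0.93},
]

# Summary: one row per sample, averaging confidence over resolved positions.
by_sample = {}
for record in detail:
    by_sample.setdefault(record["sample_id"], []).append(record["confidence"])

summary = [
    {"sample_id": sid, "n_resolved": len(confs),
     "mean_confidence": sum(confs) / len(confs)}
    for sid, confs in by_sample.items()
]
```

This mirrors the summary/detail split the notebook writes for batch runs, under the assumed schema above.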

Scope

The resolver provides model-backed token updates and confidence values for ambiguous positions. It does not claim to reconstruct a final canonical WURCS string by itself, and it does not perform IUPAC-condensed/name-to-WURCS conversion.

License metadata is currently other; update it when the final release license and citation text are chosen.
