CxGrammar/ase-gpt-medium-wiki


XLXW committed on Jul 10, 2025

Commit f928ee9 · verified · 1 Parent(s): cc128c1

Update README.md

Files changed (1)

README.md +64 -3

README.md CHANGED

@@ -1,3 +1,64 @@
----
-license: mit
----

+---
+license: mit
+datasets:
+- togethercomputer/RedPajama-Data-1T
+language:
+- en
+base_model:
+- openai-community/gpt2
+---
+## Model Information
+The ASE collection is a set of pretrained language models for assessing association strength. More details can be found in [CxGLearner](https://learner.xlxw.org/).
+**Model developer:** [ZJU MMF (CxGrammar)](https://github.com/CxGrammar)
+**Model Architecture:** GPT-2 (6 layers)
+**Supported languages:** English
+**License:** MIT
+## Use with cxglearner
+Starting with `cxglearner >= 1.3.2`, you can run ASE models through the cxglearner `Association` module.
+Make sure to update your cxglearner installation via `pip install --upgrade cxglearner`.
+### Example
+```python
+import torch
+from cxglearner.config import Config, DefaultConfigs
+from cxglearner.lm import Association
+from cxglearner.encoder import Encoder
+from cxglearner.utils import init_logger
+config = Config(DefaultConfigs.eng)
+# Set the specific model
+config.lm.output_path = "CxGrammar/ase-gpt-medium-wiki"
+logger = init_logger(config)
+encoder = Encoder(config, logger)
+# When Association is instantiated, cxglearner automatically downloads the model parameter files from the Hugging Face Hub.
+# Alternatively, download pytorch_model.bin manually and set output_path to a local path.
+asso = Association(config, logger, encoder=encoder)
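+example_sentence = "The wetlands can be more"
+# NOTE: level_map is not defined in the original snippet. The mapping below is an
+# assumption (level name -> index within each encoded token); verify it against your cxglearner version.
+level_map = {'lexical': 0, 'upos': 1}
+select_mask = ['lexical', 'lexical', 'lexical', 'lexical']
+select_mask = [level_map[level] for level in select_mask]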
+select_mask_2 = ['upos', 'lexical', 'lexical', 'lexical']
+select_mask_2 = [level_map[level] for level in select_mask_2]
+encoded = encoder.encode(example_sentence, need_ids=True)
+res = encoder.convert_ids_to_tokens([ele[0] for ele in encoded])
+encoded = encoded[1:]
+inputs_1 = [element[select_mask[i]] for i, element in enumerate(encoded)]
+inputs_2 = [element[select_mask_2[i]] for i, element in enumerate(encoded)]
+# Build int64 tensors directly so large ids are not routed through float32
+inputs1_tensor = torch.tensor(inputs_1, dtype=torch.int64)
+inputs2_tensor = torch.tensor(inputs_2, dtype=torch.int64)
+# Compute dynamic candidates from the selected ids
+candidate_dynamic = asso.compute_candidate(inputs_1)
+print(candidate_dynamic)
+```
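
The per-token level selection used in the README example can be illustrated without cxglearner installed. The sketch below is only an illustration: the ids are made up, and both `level_map` and the `(lexical_id, upos_id)` tuple layout are assumptions rather than cxglearner's documented API. `select_levels` is a hypothetical helper.

```python
# Hypothetical mapping from level name to position in each encoded token tuple.
# This layout is an assumption for illustration, not cxglearner's documented API.
level_map = {"lexical": 0, "upos": 1}

# Made-up encoder output: one (lexical_id, upos_id) tuple per token.
encoded = [(101, 7), (2045, 3), (88, 12), (19, 12)]


def select_levels(encoded, level_names, level_map):
    """Pick one id per token according to the requested level name."""
    if len(encoded) != len(level_names):
        raise ValueError("need exactly one level name per token")
    return [token[level_map[name]] for token, name in zip(encoded, level_names)]


all_lexical = select_levels(encoded, ["lexical"] * 4, level_map)
upos_first = select_levels(
    encoded, ["upos", "lexical", "lexical", "lexical"], level_map
)

print(all_lexical)  # [101, 2045, 88, 19]
print(upos_first)   # [7, 2045, 88, 19]
```

Checking the lengths up front turns a mask that is shorter or longer than the token sequence into an explicit error instead of an `IndexError` or a silently truncated input.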