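import json


def extract_tcb_ids_without_standalone_hash(json_file_path, output_file_path):
    """Write the tcb_id of every item whose query_en has no standalone '#' token."""
    with open(json_file_path, 'r', encoding='utf-8') as f:
        data = json.load(f)

    def has_standalone_hash(query):
        # True only when '#' appears as its own whitespace-delimited token,
        # so 'a # b' matches but 'a#b' and '#tag' do not.
        tokens = query.strip().split()
        return '#' in tokens

    tcb_ids = [
        item['tcb_id']
        for item in data
        # `or ''` guards against a null query_en, which would otherwise crash .strip()
        if 'tcb_id' in item and not has_standalone_hash(item.get('query_en') or '')
    ]

    with open(output_file_path, 'w', encoding='utf-8') as f:
        json.dump(tcb_ids, f, indent=2, ensure_ascii=False)


# Example call
if __name__ == '__main__':
    extract_tcb_ids_without_standalone_hash('EN_section-6修正(1).json', 'filtered_tcb_ids_6.json')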
