CAIA-Benchmark-Leaderboard

CAIA-Benchmark-Leaderboard / content.py

zhejianzhang

typo

1ea2007 7 months ago

def model_hyperlink(link, model_name):
    """Return an HTML anchor that opens `link` in a new tab, labelled with `model_name`."""
    return f'<a target="_blank" href="{link}" style="color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;">{model_name}</a>'


def make_clickable_model(model_name):
    """Link a model name to its page on the Hugging Face Hub."""
    link = f"https://huggingface.co/{model_name}"
    return model_hyperlink(link, model_name)


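def styled_error(error):
    """Wrap an error message in a centered red paragraph."""
    return f"<p style='color: red; font-size: 20px; text-align: center;'>{error}</p>"


def styled_warning(warn):
    """Wrap a warning message in a centered orange paragraph."""
    return f"<p style='color: orange; font-size: 20px; text-align: center;'>{warn}</p>"


def styled_message(message):
    """Wrap an informational message in a centered green paragraph."""
    return f"<p style='color: green; font-size: 20px; text-align: center;'>{message}</p>"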
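
# The styled_* helpers above interpolate their argument into HTML as-is.
# If the text can come from user input (for example, an error raised while
# parsing an uploaded file), one option is to escape it first. This is a
# minimal sketch, not part of the original file: `safe_styled_error` is a
# hypothetical wrapper, and `styled_error` is repeated here only so the
# example runs on its own.

```python
import html


def styled_error(error):
    return f"<p style='color: red; font-size: 20px; text-align: center;'>{error}</p>"


def safe_styled_error(error):
    # Hypothetical helper: escape <, >, & and quotes before building the HTML.
    return styled_error(html.escape(str(error)))


rendered = safe_styled_error("<script>alert(1)</script>")
print(rendered)
```

# The escaped markup is shown as literal text instead of being interpreted
# by the browser.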


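def has_no_nan_values(df, columns):
    """Return a boolean Series: True for rows with no NaN in any of `columns`."""
    return df[columns].notna().all(axis=1)


def has_nan_values(df, columns):
    """Return a boolean Series: True for rows with at least one NaN in `columns`."""
    return df[columns].isna().any(axis=1)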
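
# Both NaN helpers return a row-wise boolean mask, so they can be used to
# filter a results table before it is displayed. The sketch below is only an
# illustration: the DataFrame, its column names, and its scores are made up,
# and the two helpers are repeated so the example runs on its own.

```python
import pandas as pd


def has_no_nan_values(df, columns):
    return df[columns].notna().all(axis=1)


def has_nan_values(df, columns):
    return df[columns].isna().any(axis=1)


# Hypothetical leaderboard rows (illustrative names and values only).
results = pd.DataFrame(
    {
        "model": ["org-a/agent-one", "org-b/agent-two"],
        "level_1": [0.5, None],
        "level_2": [0.25, 0.75],
    }
)
score_columns = ["level_1", "level_2"]

# Keep only rows where every score column is filled in.
complete = results[has_no_nan_values(results, score_columns)]
# Rows with at least one missing score, e.g. to report as incomplete.
incomplete = results[has_nan_values(results, score_columns)]

print(complete["model"].tolist())
print(incomplete["model"].tolist())
```

# The two masks are complements of each other, so every row lands in exactly
# one of the two subsets.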


def format_log(message):
    """Wrap a log message in a green <div>."""
    return f'<div style="color: green;">{message}</div>'

TITLE = """<h1 align="center" id="space-title">CAIA (Crypto AI Agent) Leaderboard</h1>"""
INTRODUCTION_TEXT = """Issued by the [Crypto AI Benchmark Alliance (CAIBA)](https://www.caiba.ai/), CAIA is a benchmark for evaluating LLM-based AI agents in crypto (agents with augmented capabilities from added tooling, efficient prompting, access to search, etc.).

## Data
The first version of CAIA consists of 41 questions/tasks based on real-world challenges, each requiring a different level of tooling and autonomy to solve.

Inspired by benchmarks such as GAIA, CAIA organizes its tasks into 3 levels of complexity:

- Level 1: Routine tasks (e.g., token price queries, basic contract verification).

- Level 2: Multi-step analysis (e.g., cross-chain arbitrage opportunities, MEV strategies).

- Level 3: Autonomous problem-solving (e.g., simulating tokenomics models, resolving bridge exploits).

Each level is validated through a transparent development framework, with public dev sets for community feedback and private test sets to ensure robustness.

## Privacy & Security First
All interactions are encrypted end-to-end, with a zero-data-retention policy. Your keys, your coins, your control, always.

## Join the Crypto AI Revolution
Whether you're building the next DeFi protocol, optimizing trading strategies, or securing decentralized systems, CAIA delivers the precision and autonomy needed to stay ahead in Web3. Explore the future of blockchain intelligence: try the Crypto AI Agent beta today."""
CITATION_BUTTON_TEXT = """CITATION_BUTTON_TEXT"""
CITATION_TEXT = """CITATION_TEXT"""
CITATION_BUTTON_LABEL = """CITATION_BUTTON_LABEL"""
SUBMISSION_TEXT = """## Submit a new entry for evaluation

Make sure your submission is a JSON file that follows the format of the [example JSON](https://github.com/caiba-ai/caia-benchmark/blob/main/dataset/example_agent_output.json).

## Contact
Contact us if you have any trouble submitting. Email: jamesdai@cybertinolab.com or zhejian.zhang@cybertinolab.com
"""


def get_citation_text():
    """Return the citation text shown on the leaderboard page."""
    return CITATION_TEXT