license: other
tags:
  - yi
  - moe
license_name: yi-license
license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
model-index:
  - name: 60B_MoE_Coder_v3
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 71.16
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=cloudyu/60B_MoE_Coder_v3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 85.44
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=cloudyu/60B_MoE_Coder_v3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 75.37
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=cloudyu/60B_MoE_Coder_v3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 67.01
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=cloudyu/60B_MoE_Coder_v3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 82.56
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=cloudyu/60B_MoE_Coder_v3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 66.94
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=cloudyu/60B_MoE_Coder_v3
          name: Open LLM Leaderboard

This is an English and Chinese chat model aimed at coding that also handles general-purpose tasks.

Example output

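Write a Python program to scrape the most popular content on YouTube: fetch the data from the web page,

Below is a Python script that scrapes the most popular videos from YouTube and saves them to local files. The script uses Selenium WebDriver to simulate browser actions to log in and search for videos, then uses the Beautiful Soup library to extract the required information. If you need any help or suggestions for changes, feel free to let me know!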

import getpass
import os
import time
import urllib.request

from bs4 import BeautifulSoup as BS
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Path to ChromeDriver
CHROMEDRIVER_PATH = './chromedriver.exe'  # replace with the actual path

def main():
    # Launch Chrome, open YouTube, then go to the Google sign-in page
    browser = webdriver.Chrome(service=Service(CHROMEDRIVER_PATH))
    browser.implicitly_wait(5)  # wait up to 5 seconds for elements to appear
    browser.get('https://www.youtube.com/')
    login_url = 'https://accounts.google.com/signin'
    browser.get(login_url)
    # Read the account name and password, then click the sign-in button
    # (the sample never types the credentials into the form)
    username = input('Username: ')
    password = getpass.getpass()  # keeps the password off the terminal
    submit_button = browser.find_element(By.CLASS_NAME, 'whsLW')
    submit_button.click()

    # Type the keywords into the search box and run the search
    search_query = 'most popular youtube videos today'
    search_box = browser.find_element(By.ID, 'search_query')
    search_box.send_keys(search_query)
    enter_key = browser.find_element(By.CLASS_NAME, 'ytp-suggest-renderer__input-icon')
    enter_key.click()
    results = BS(browser.page_source, 'html.parser')

    # Extract the most popular video links from the results and save them locally
    videos = results.find('ol', {'class': 'r'})
    video_urls = [a['href'] for a in videos.find_all('a')]
    parent_dir = os.path.basename(os.path.dirname(os.getcwd()))
    saved_path = os.path.join('popular_videos', f'{parent_dir}_{int(time.time())}')
    os.makedirs(saved_path, exist_ok=True)
    for i, url in enumerate(video_urls[:50]):
        filename = os.path.join(saved_path, f'{i + 1}.mp4')
        urllib.request.urlretrieve(url, filename)
    print('Downloaded top 50 videos to:', saved_path)

if __name__ == '__main__':
    try:
        main()
    except Exception as e:
        print(e)

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 74.75 |
| AI2 Reasoning Challenge (25-Shot) | 71.16 |
| HellaSwag (10-Shot) | 85.44 |
| MMLU (5-Shot) | 75.37 |
| TruthfulQA (0-shot) | 67.01 |
| Winogrande (5-shot) | 82.56 |
| GSM8k (5-shot) | 66.94 |
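
The Avg. figure is consistent with an unweighted mean of the six benchmark scores. The short check below recomputes it under that assumption; the `scores` dictionary and the variable names are illustrative and are not part of the leaderboard tooling.

```python
# Benchmark scores copied from the results listed above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 71.16,
    "HellaSwag (10-Shot)": 85.44,
    "MMLU (5-Shot)": 75.37,
    "TruthfulQA (0-shot)": 67.01,
    "Winogrande (5-shot)": 82.56,
    "GSM8k (5-shot)": 66.94,
}

# Assumption: "Avg." is the plain arithmetic mean, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(f"Avg. {average:.2f}")
```

The six scores sum to 448.48, and dividing by 6 gives roughly 74.747, which rounds to the reported 74.75.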