BFCL V3

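<aside> 💡

How to use each benchmark in the list
- Is it downloaded and used directly, or run through specific code?
- If it is downloaded, where is it downloaded from and how is it run? </aside>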

BFCL V3 Benchmark Overview

BFCL(Berkeley Function Calling Leaderboard)

We introduce the Berkeley Function Calling Leaderboard (BFCL), the first comprehensive and executable function call evaluation dedicated to assessing Large Language Models' (LLMs) ability to invoke functions. Unlike previous evaluations, BFCL accounts for various forms of function calls, diverse scenarios, and executability.

Benchmark introduction: https://github.com/ShishirPatil/gorilla/tree/main/berkeley-function-call-leaderboard#generating-llm-responses
Paper: Gorilla: Large Language Model Connected with Massive APIs

Release versions

BFCL V1: Our initial BFCL release
BFCL V2: Our second release, employing enterprise and OSS-contributed live data
BFCL V3: Introduces multi-turn and multi-step function calling scenarios

How to use BFCL

Run via specific code

https://github.com/ShishirPatil/gorilla

git clone https://github.com/ShishirPatil/gorilla.git

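from typing import Any, Dict, List
import copy
import json

from datasets import Dataset


def load_json_dataset(test_entries: List[Dict[str, Any]]) -> Dataset:
    # Column-oriented layout expected by Dataset.from_dict.
    data = {"id": [], "question": [], "function": []}
    # Work on a deep copy so the caller's entries are not mutated.
    test_entries_copy = copy.deepcopy(test_entries)

    for item in test_entries_copy:
        data["id"].append(item["id"])
        data["question"].append(item["question"])

        # Store each function's parameter schema as a JSON string.
        for func in item["function"]:
            func["parameters"]["properties"] = json.dumps(
                func["parameters"]["properties"]
            )
        # Append the entry's whole function list, not only the last function.
        data["function"].append(item["function"])
    return Dataset.from_dict(data)

# Example usage
# load_file is not defined in this snippet; it is assumed to return
# the test entries as a list of dicts.
test_entries = load_file("path_to_your_file.json")
ds = load_json_dataset(test_entries)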
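
The example usage above calls `load_file`, which the snippet does not define. The sketch below is one possible stand-in, not the repository's own helper. It assumes the test file stores one JSON object per line (JSON Lines); if the file is a single JSON array, `json.load` would be the right call instead. The names `load_jsonl` and `restore_properties` are hypothetical. `restore_properties` undoes the `json.dumps` step from `load_json_dataset`, so the parameter schema can be read as a dict again.

```python
import copy
import json
from typing import Any, Dict, List


def load_jsonl(path: str) -> List[Dict[str, Any]]:
    """Read a JSON Lines file (assumed format): one JSON object per line."""
    entries: List[Dict[str, Any]] = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                entries.append(json.loads(line))
    return entries


def restore_properties(func: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a function spec with 'properties' decoded to a dict.

    Specs whose 'properties' value is not a string are returned unchanged.
    """
    restored = copy.deepcopy(func)
    props = restored["parameters"]["properties"]
    if isinstance(props, str):
        restored["parameters"]["properties"] = json.loads(props)
    return restored
```

With these helpers, `load_jsonl("path_to_your_file.json")` could take the place of `load_file(...)`, and `restore_properties` could be applied to each function read back from the dataset.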