<aside> 💡

BFCL V3 Benchmark Overview

BFCL (Berkeley Function Calling Leaderboard)

We introduce the Berkeley Function Calling Leaderboard (BFCL), the first comprehensive and executable function call evaluation dedicated to assessing Large Language Models' (LLMs) ability to invoke functions. Unlike previous evaluations, BFCL accounts for various forms of function calls, diverse scenarios, and executability.

Release Versions

How to Use BFCL

Specific Code

git clone https://github.com/ShishirPatil/gorilla.git
import copy
import json
from typing import Any, Dict, List

from datasets import Dataset

def load_json_dataset(test_entries: List[Dict[str, Any]]):
    data = {"id": [], "question": [], "function": []}
    test_entries_copy = copy.deepcopy(test_entries)

    for item in test_entries_copy:
        data["id"].append(item["id"])
        data["question"].append(item["question"])

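        # Serialize each function's parameter schema to a JSON string so that
        # entries with different properties share a single column type.
        for func in item["function"]:
            func["parameters"]["properties"] = json.dumps(
                func["parameters"]["properties"]
            )
        # Append the full function list for this entry, not only the last function.
        data["function"].append(item["function"])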
    return Dataset.from_dict(data)

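# Example usage
# NOTE: load_file is not defined in the original snippet. The helper below is an
# assumption: it treats the file as JSON Lines (one JSON object per line) and
# returns a list of dict entries.
def load_file(path):
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

test_entries = load_file("path_to_your_file.json")
ds = load_json_dataset(test_entries)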
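
Because `load_json_dataset` stores each function's `parameters.properties` as a JSON string, a consumer would need to decode that string before using the schema. The following is a minimal sketch of that round trip using only the standard library. The entry itself (`example_0`, the question text, and the `get_weather` function) and the helper names `encode_functions` and `decode_functions` are hypothetical, chosen for illustration and not taken from the BFCL data or code.

```python
import copy
import json

# Hypothetical entry shaped like the ones load_json_dataset expects.
entry = {
    "id": "example_0",
    "question": "What is the weather in Seoul?",
    "function": [
        {
            "name": "get_weather",
            "parameters": {
                "type": "dict",
                "properties": {"city": {"type": "string"}},
            },
        }
    ],
}


def encode_functions(functions):
    """Return a copy in which each properties dict is dumped to a JSON string."""
    encoded = copy.deepcopy(functions)
    for func in encoded:
        func["parameters"]["properties"] = json.dumps(
            func["parameters"]["properties"]
        )
    return encoded


def decode_functions(functions):
    """Inverse of encode_functions: parse each properties string back to a dict."""
    decoded = copy.deepcopy(functions)
    for func in decoded:
        func["parameters"]["properties"] = json.loads(
            func["parameters"]["properties"]
        )
    return decoded


encoded = encode_functions(entry["function"])
restored = decode_functions(encoded)
```

Both helpers work on deep copies, so the original `entry` is left unchanged, which mirrors the `copy.deepcopy` call in `load_json_dataset`.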