gpt_engineer.benchmark.run.run
- gpt_engineer.benchmark.run.run(agent: BaseAgent, benchmark: Benchmark, verbose=False) → List[TaskResult]
Runs the benchmark tasks using the provided agent and returns a list of TaskResult objects.
- Parameters:
agent (BaseAgent) – The agent to use for running the benchmark tasks.
benchmark (Benchmark) – The benchmark containing the tasks to run.
verbose (bool, default=False) – Whether to print verbose output while the benchmark runs.
- Returns:
A list of TaskResult objects representing the results of the benchmark tasks.
- Return type:
List[TaskResult]
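- Example:
The call pattern described above (one agent, one benchmark, one result per task) can be illustrated with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in written for illustration: the `Task`, `Benchmark`, `TaskResult`, and `EchoAgent` classes, their fields, the `solve` method, and the body of `run` are assumptions and are not the actual gpt_engineer classes or implementation. Only the shape of the signature, `run(agent, benchmark, verbose=False)` returning a list of results, follows the documentation.

```python
import time
from dataclasses import dataclass
from typing import List


# Hypothetical stand-ins, NOT the real gpt_engineer classes.
@dataclass
class Task:
    name: str
    prompt: str


@dataclass
class Benchmark:
    name: str
    tasks: List[Task]


@dataclass
class TaskResult:
    task_name: str
    output: str
    duration: float


class EchoAgent:
    """Toy agent (assumption): 'solves' a task by upper-casing its prompt."""

    def solve(self, prompt: str) -> str:
        return prompt.upper()


def run(agent, benchmark: Benchmark, verbose: bool = False) -> List[TaskResult]:
    """Illustrative loop: run every task with the agent, collect one result each."""
    results: List[TaskResult] = []
    for task in benchmark.tasks:
        start = time.perf_counter()
        output = agent.solve(task.prompt)
        duration = time.perf_counter() - start
        if verbose:
            print(f"[{benchmark.name}] {task.name}: finished in {duration:.6f}s")
        results.append(TaskResult(task.name, output, duration))
    return results


benchmark = Benchmark(
    name="demo",
    tasks=[Task("greet", "hello"), Task("bye", "goodbye")],
)
results = run(EchoAgent(), benchmark, verbose=True)
for result in results:
    print(result.task_name, result.output)
```

With the real library, the agent would be a BaseAgent implementation and the benchmark a Benchmark instance; the returned list can be iterated in the same way to inspect each TaskResult.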