# Hand-authored tool-use cases used by integration + dlm-template tests.
# Eight cases spanning the most common tool-call shapes; each schema is
# small enough that a 135M-parameter base model can produce a valid call
# with a one-shot system prompt. Update sparingly: the integration test
# baseline depends on these prompts staying stable.
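#
# Illustrative sketch only (an assumption, not taken from the test harness):
# a reply to the first case below might look like the following JSON. The
# "name"/"arguments" envelope is a hypothetical shape; this file itself only
# records each case's tool_spec and gold_tool_name.
#   {"name": "search_web", "arguments": {"query": "James Webb telescope news"}}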
cases:
  - prompt: |
      Use the search_web tool to find recent news about the James Webb telescope.
      Respond with only a JSON tool call.
    tool_spec:
      name: search_web
      description: Search the public web for a query string.
      parameters:
        type: object
        properties:
          query:
            type: string
          max_results:
            type: integer
        required: [query]
    gold_tool_name: search_web

  - prompt: |
      Use the read_file tool to read /etc/hosts.
      Respond with only a JSON tool call.
    tool_spec:
      name: read_file
      description: Read the contents of a file from disk.
      parameters:
        type: object
        properties:
          path:
            type: string
          encoding:
            type: string
        required: [path]
    gold_tool_name: read_file

  - prompt: |
      Use the run_python tool to compute factorial of 7.
      Respond with only a JSON tool call.
    tool_spec:
      name: run_python
      description: Execute a snippet of Python and return stdout.
      parameters:
        type: object
        properties:
          code:
            type: string
          timeout_s:
            type: number
        required: [code]
    gold_tool_name: run_python

  - prompt: |
      Use the run_shell tool to list files in /tmp.
      Respond with only a JSON tool call.
    tool_spec:
      name: run_shell
      description: Run a shell command and return its output.
      parameters:
        type: object
        properties:
          command:
            type: string
          working_dir:
            type: string
        required: [command]
    gold_tool_name: run_shell

  - prompt: |
      Use the http_fetch tool to GET https://example.com/health.
      Respond with only a JSON tool call.
    tool_spec:
      name: http_fetch
      description: Issue an HTTP GET against a URL.
      parameters:
        type: object
        properties:
          url:
            type: string
          headers:
            type: object
        required: [url]
    gold_tool_name: http_fetch

  - prompt: |
      Use the db_query tool to SELECT name FROM users WHERE id = 5.
      Respond with only a JSON tool call.
    tool_spec:
      name: db_query
      description: Execute a read-only SQL query.
      parameters:
        type: object
        properties:
          sql:
            type: string
          params:
            type: array
        required: [sql]
    gold_tool_name: db_query

  - prompt: |
      Use the calendar_create_event tool to add a meeting at 2pm tomorrow titled "Sync".
      Respond with only a JSON tool call.
    tool_spec:
      name: calendar_create_event
      description: Create a new calendar event.
      parameters:
        type: object
        properties:
          title:
            type: string
          start_iso:
            type: string
          duration_minutes:
            type: integer
          attendees:
            type: array
        required: [title, start_iso]
    gold_tool_name: calendar_create_event

  - prompt: |
      Use the calculator tool to compute 137 * 42.
      Respond with only a JSON tool call.
    tool_spec:
      name: calculator
      description: Evaluate a simple arithmetic expression.
      parameters:
        type: object
        properties:
          expression:
            type: string
        required: [expression]
    gold_tool_name: calculator