Custom agents built on the low-level completion APIs tend to outperform these generic tools, especially when you are working on complex problems.

It's hard to beat domain-specific code. I can avoid massive prompts and token bloat if my execution environment, tools, and error feedback enforce effectively the same constraints.

If I had to pick only one tool for a generic agent to use, it would definitely be ExecuteSqlQuery (or a superset like ExecuteShell). If you gave me an agent framework and this was all it could do, I'd probably be OK for quite a while. SQL can absorb the domain-specific concerns quite well. Keep in mind that tool definitions also consume tokens, so one general tool costs less than a long list of narrow ones.
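
To make the single-tool idea concrete, here is a minimal sketch of what an ExecuteSqlQuery tool could look like. Everything beyond the tool name is an assumption for illustration: the `EXECUTE_SQL_QUERY_TOOL` dictionary only imitates the general shape of a tool-calling schema and is not any specific vendor's format, `execute_sql_query` and `max_rows` are hypothetical names, and SQLite is used only because it ships with Python. The point it tries to show is that a schema constraint plus the raw database error can stand in for instructions you would otherwise put in the prompt.

```python
import json
import sqlite3

# Hypothetical tool definition. The shape imitates common tool-calling
# schemas; adapt it to whatever completion API you actually use.
EXECUTE_SQL_QUERY_TOOL = {
    "name": "ExecuteSqlQuery",
    "description": "Run one SQL statement and return rows or the error message.",
    "parameters": {
        "type": "object",
        "properties": {"sql": {"type": "string"}},
        "required": ["sql"],
    },
}


def execute_sql_query(conn, sql, max_rows=50):
    """Run one statement and return a JSON string the model can read.

    Database errors are returned as text instead of raised, so the model
    gets the constraint violation as feedback and can correct itself.
    """
    try:
        cur = conn.execute(sql)
        if cur.description is None:
            # Statement produced no result set (INSERT, UPDATE, DDL, ...).
            conn.commit()
            return json.dumps({"ok": True, "rows_affected": cur.rowcount})
        columns = [col[0] for col in cur.description]
        rows = cur.fetchmany(max_rows)  # cap the output to limit tokens
        return json.dumps({"ok": True, "columns": columns, "rows": rows})
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return json.dumps({"ok": False, "error": str(exc)})


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # The domain rule lives in the schema, not in the prompt.
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, qty INTEGER CHECK (qty > 0))"
    )
    print(execute_sql_query(conn, "INSERT INTO orders (qty) VALUES (3)"))
    print(execute_sql_query(conn, "INSERT INTO orders (qty) VALUES (-1)"))
    print(execute_sql_query(conn, "SELECT id, qty FROM orders"))
```

In this sketch the second insert comes back as an error payload rather than an exception, which is the "error feedback as constraint" idea: the agent loop just forwards that string to the model. A real deployment would also want a restricted database role and a statement timeout, which are left out here.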