Commercial LLMs like gpt-3.5-turbo and Claude are the best models for us to use right now. Nothing in the open source world comes close. However, this only means they’re the best of the available options. They can take many seconds to produce a valid Honeycomb query, with latency ranging from two to 15+ seconds depending on the model, the natural language input, the size and makeup of the schema, and the instructions in the prompt. As of this writing, although we have access to gpt-4’s API, it’s far too slow to work for our use case.