Radish Blog · engineering, gemini, production, reliability

A 3-step fix for the persistent MALFORMED_FUNCTION_CALL error in production agents

How three layered fixes brought our Gemini MALFORMED_FUNCTION_CALL production error rate down to near zero.

Nirman Gupta
Chief Executive Officer
Published
November 30, 2025

If you are running agents in production with the Gemini API, you have likely stared at this error in your logs: MALFORMED_FUNCTION_CALL.

You probably hit the forums. You found the unsolved reports (1, 2, 3). You asked the LLMs for help and got the standard generic advice: “Fix your descriptions,” “Adjust your parameter naming,” or “Tweak the prompt.”

I tried all of those. They didn’t work.

I found that this error persists when tool inputs become complex, specifically arrays of objects or multi-line strings. After much debugging and testing, I landed on a combination of three approaches that brought our user-facing error rate down to near zero.

The Diagnosis: It’s Writing Python, Not JSON

Digging into Google AI Studio logs, I noticed a pattern. When the model fails, the finishMessage often reveals that it didn’t just fail to format JSON—it actually tried to write Python code to call the function. It looks like this (notice the print(default_api.save_scenarios…)):

{
	"candidates": [
		{
			"content": {},
			"finishReason": "MALFORMED_FUNCTION_CALL",
			"finishMessage": "Malformed function call: print(default_api.save_scenarios(scenarios=[\n    default_api.SaveScenariosScenarios(title='Project Manager Feedback', description='''**User\'s Role and Context:** A new Project Manager for Quality Improvement at a large urban hospital.\n**Specific Situation or Problem:** The Project Manager has developed a comprehensive proposal to reduce ED wait times, focusing on a new triage system and fast-track pathways. They are seeking a critical..."
		}
	]
}

The model also frequently hallucinates a namespace like default_api. My hypothesis is that the training data contains an abundance of Swagger codegen patterns and Python client code, and complex prompts trigger this autocomplete behavior. An independent user report on GitHub suggests the same.

The fix: A 3-layer defense

1. The Prevention Instructions (The First Line of Defense)

Since the model thinks it needs to write code, we must explicitly forbid it. I appended this section to our prompts:

## Function Calling

- Do not generate code. Always generate the function call JSON.
- When calling functions, output the function name exactly as defined. Do not prepend 'default_api.' or any other namespace to the function name.

Result: this significantly reduced the error rate with no latency cost, but it did not eliminate it. For production reliability, we needed more.

2. The Feedback Loop Retry (Currently Golang Only)

In the Golang SDK, we have access to the finishMessage, which contains the hallucinated Python code. We can use this to our advantage.

Instead of just retrying the same prompt, we catch the error and feed the mistake back to the model:

retryMessage := fmt.Sprintf("You replied with Python code instead of a valid tool call.\n%s\nPlease try again...", res.Candidates[0].FinishMessage)

Why this works: the chat history drops the malformed response entirely (you can verify this with chat.GetHistory()). Retrying without telling the model exactly what it did wrong lets it steer in a different direction (in my case, it asked a follow-up question instead of retrying the call). By manually injecting the error message back into the context, the model “sees” its mistake and corrects it.

Note: Currently, a bug in the Python SDK returns None for the finish message, making this specific tactic Golang-only for now. I have filed a bug report for it.

3. The Forced Function Calling Mode Retry (Python & Golang)

Since we can’t always rely on the finish_message in Python, we need a “nuclear option” for the retry: forcing function calling via the tool config.

When a MALFORMED error occurs, we trigger a retry where we overwrite the function_calling_config mode to ANY (instead of AUTO), just for that turn.

new_chat = client.chats.create(
    model="gemini-2.5-flash",
    config=genai.types.GenerateContentConfig(
        system_instruction=<your-system-prompt>,
        tools=[fns],
        temperature=0.0,
        top_p=0.0,
        top_k=1,
        tool_config=genai.types.ToolConfig(
            function_calling_config=genai.types.FunctionCallingConfig(
                mode="ANY"  # force a tool call instead of free-form text
            )
        ),
    ),
    history=chat.get_history(),
)

Why this works: This forces the model to drop any conversational token generation and focus exclusively on executing the tool call. Once the valid tool call is generated, we switch back to AUTO for subsequent turns.
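Since this tactic works in both SDKs, here is what the retry orchestration looks like on the Go side, with the actual SDK call abstracted behind a function so the mode switch is visible. All names here (sendFn, sendWithForcedRetry, errMalformed) are illustrative, not SDK types:

```go
package main

import (
	"errors"
	"fmt"
)

// errMalformed stands in for a MALFORMED_FUNCTION_CALL finish reason
// surfaced by the SDK (hypothetical sentinel for this sketch).
var errMalformed = errors.New("MALFORMED_FUNCTION_CALL")

// sendFn abstracts one model turn; mode is the function-calling mode
// ("AUTO" or "ANY") applied to that turn only.
type sendFn func(mode string) (string, error)

// sendWithForcedRetry tries the turn in AUTO mode; on a malformed
// function call it retries exactly once with mode ANY, which forbids
// conversational output and forces a tool call.
func sendWithForcedRetry(send sendFn) (string, error) {
	out, err := send("AUTO")
	if errors.Is(err, errMalformed) {
		return send("ANY") // nuclear option, for this turn only
	}
	return out, err
}

func main() {
	calls := 0
	fake := func(mode string) (string, error) {
		calls++
		if mode == "AUTO" {
			return "", errMalformed // simulate the failure
		}
		return "tool_call: save_scenarios", nil
	}
	out, _ := sendWithForcedRetry(fake)
	fmt.Printf("%s (turns: %d)\n", out, calls) // → tool_call: save_scenarios (turns: 2)
}
```

Keeping the override scoped to the failing turn matters: subsequent turns go back through the AUTO path, so the model can still answer conversationally when no tool call is needed.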

If we had the finish_message, we could also extract the function name and further limit the scope to that function by adding allowed_function_names=[<function-name>] to the function calling config.
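Extracting that name is straightforward string work on the finish message; a sketch, assuming the hallucinated code follows the print(default_api.<fn>(...)) shape seen in the logs (the regex and helper name are mine):

```go
package main

import (
	"fmt"
	"regexp"
)

// fnNamePattern matches the function name in hallucinated calls such as
// "print(default_api.save_scenarios(scenarios=[...]))": an optional
// default_api. prefix followed by the identifier before an open paren.
var fnNamePattern = regexp.MustCompile(`(?:default_api\.)?(\w+)\s*\(`)

// extractFunctionName pulls the intended tool name out of the finish
// message so the retry can set allowed_function_names to just that
// function. Returns "" when nothing resembling a call is found.
func extractFunctionName(finishMessage string) string {
	for _, m := range fnNamePattern.FindAllStringSubmatch(finishMessage, -1) {
		// Skip the leading print(...) wrapper if present.
		if m[1] != "print" {
			return m[1]
		}
	}
	return ""
}

func main() {
	msg := "Malformed function call: print(default_api.save_scenarios(scenarios=[...]))"
	fmt.Println(extractFunctionName(msg)) // → save_scenarios
}
```

With the name in hand, the ANY-mode retry can be narrowed so the model has exactly one valid move left.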

Resources

  • I’ve put together a Notebook showing the failure and the fix here: Notebook
  • I’ve also filed an issue regarding the Python SDK finish_message bug here: Report
  • While the implementation in this article targets Gemini, the same approach should, in principle, apply to the equivalent issue in the OpenAI API as well