Practical recipes

Use workflow recipes when the question is how to run the lane, not only which model or provider to buy.

This page turns recurring problems into bounded operating patterns for coding review, research, retrieval, browser agents, multimodal analysis and local-first deployment.

Local snapshot rules

Workflow Recipes is the operating layer. It uses a 2026-03 local snapshot and should be opened after provider and model choice are already narrower. If the problem is still vendor, scenario or raw specs, step back to the LLM route first. If the bottleneck is local serving or hardware spend, jump to Inference Hardware Guide.

Recipes live: actionable workflow lanes
Low-cost lanes: worker or local-first friendly
High-complexity lanes: need stronger control or tooling
Local-first lanes: privacy or self-host posture

How to read this board

1. Problem: start with the operational constraint. Pick the recipe by the real bottleneck: review cost, retrieval quality, multimodal input or control.
2. Stack: separate worker lanes from final-close lanes. Most useful flows split cheap workers, stronger planners and hard validation gates.
3. Validation: keep tests, checkpoints or grounded evidence visible. A workflow without real validation is only a demo, not an operating lane.

LLM route

Go back to the routing layer if you still need to clarify whether the problem is vendor, model or scenario.

Agent stack board

Open the stack layer when the recipe is blocked by architecture, governance or browser orchestration choices.

Model fit radar

Use scenario-first model picks when the recipe is stable but the model lane is not.

Inference guide

Open hardware guidance when the recipe is now limited by local serving, VRAM, RAM or power budget.

Coding review

Coding review loop with a cheap worker and a strong close

When you need to review real PRs or diffs without paying a frontier lane on every intermediate pass.

Research

Research and synthesis with a long-context planner and executive close

When the task mixes several sources, contradictions and a conclusion that must stay compact and useful.

Retrieval

Retrieval plus tools for flows that need real grounding

When the model must operate against documents or external systems and internal knowledge is not enough.

Browser agents

Browser flow with agents and human closes

When the task requires navigating real UIs, extracting state and closing actions with some control.

Multimodal

Multimodal analysis with one strong lane and a compact output

When image, audio, video or long PDFs enter the flow and the decision must land in one useful read.

Local-first

Local-first workflow for privacy, edge and control

When data residency, fixed cost or stack autonomy matter more than the last point of frontier quality.

Operating board

Recipes that answer what to run next

Practical lane map

Recipe	Recommended stack	Model and provider	Decision profile	Main caution
Coding review: Coding review loop with a cheap worker and a strong close. When you need to review real PRs or diffs without paying a frontier lane on every intermediate pass.	Low-cost worker for diff triage and smell detection; stronger planner for risks, regressions and the final answer; tests and lint as a mandatory gate before close.	Model: GPT-5.4 mini for workers and GPT-5.4 for the final close or review. Provider: OpenAI fits best when the loop depends on tooling, repo context and a strong close.	Medium / Medium / Medium	Do not leave the final answer entirely to the cheap worker. If tests are missing, the loop creates false confidence very quickly.
Research: Research and synthesis with a long-context planner and executive close. When the task mixes several sources, contradictions and a conclusion that must stay compact and useful.	Source retrieval kept separate from synthesis; long-context planner to consolidate findings; final memo or table format before publishing.	Model: GPT-5.4 as the main planner and Gemini 2.5 Flash-Lite if you need a cheap classification pass. Provider: OpenAI for sustained reasoning; Google as support when reading throughput matters.	Mid-high / Moderate / Medium	Do not use a single pass to gather sources and decide. If you ask for too much prose, usefulness drops even when reasoning is strong.
Retrieval: Retrieval plus tools for flows that need real grounding. When the model must operate against documents or external systems and internal knowledge is not enough.	Retriever or hybrid search with simple filters; tool layer for external actions or verifiable queries; synthesis model kept separate from retrieval.	Model: Gemini 2.5 Flash-Lite or GPT-5.4 mini for retrieval workers; GPT-5.4 for delicate synthesis. Provider: Google and OpenAI work well when you separate cheap workers from the final layer; retrieval design matters most.	Medium / Medium / High	More context does not fix badly designed retrieval. If the tool layer is not auditable, the flow only looks safe.
Browser agents: Browser flow with agents and human closes. When the task requires navigating real UIs, extracting state and closing actions with some control.	Browser worker for exploration and capture; separate planner to decide the next step; human checkpoint or strong rule before sensitive actions.	Model: GPT-5.4 mini for browser workers and GPT-5.4 for the planner or approver. Provider: OpenAI fits best when the flow depends on iterative tool use and controlled outputs.	Medium / Mid-high / High	Do not confuse observation with safe autonomous execution. Unstable UIs drive cost and error up if the flow is not bounded.
Multimodal: Multimodal analysis with one strong lane and a compact output. When image, audio, video or long PDFs enter the flow and the decision must land in one useful read.	One strong multimodal lane for the main read; short actionable text summary as the base output; secondary verification path when the asset is critical.	Model: Gemini 2.5 Pro as the first choice; GPT-5.4 mini only when the visual input is light. Provider: Google is usually stronger when the problem is truly multimodal and not just text with a decorative image.	Mid-high / Moderate / Medium	Do not force multimodality if the real case is text and tables. Without a compact output, the user ends up with a pretty description and little utility.
Local-first: Local-first workflow for privacy, edge and control. When data residency, fixed cost or stack autonomy matter more than the last point of frontier quality.	Open-weight or self-hosted model as the primary lane; remote fallback only for premium cases or final validation; simple observability over cost, latency and quality.	Model: Mistral Large 3 or Ministral 3 8B depending on desired quality and operational footprint. Provider: Mistral fits best when open-weight or private-cloud strategy is a real decision, not a slogan.	Competitive / Variable / High	Local-first does not mean local-only: keep an exit path for complex cases. If you do not measure hardware and throughput, the cost story becomes rhetorical.

Coding review

Coding review loop with a cheap worker and a strong close

Medium

Problem: When you need to review real PRs or diffs without paying a frontier lane on every intermediate pass.

Recommended stack

Low-cost worker for diff triage and smell detection
Stronger planner for risks, regressions and the final answer
Tests and lint as a mandatory gate before close

Flow steps

Cut the diff into blocks and ask a cheap worker for risks, test gaps and critical files.
Escalate only the strongest findings to the main planner or reviewer.
Run real tests before turning the review into a final decision.
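
The flow steps above can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not a reference implementation: `Finding`, `split_diff`, `review_loop` and the 1 to 5 severity scale are hypothetical names, and `worker`, `reviewer` and `run_tests` are placeholder callables standing in for whatever model clients and test runner you actually use. No provider API is called.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    file: str
    note: str
    severity: int  # hypothetical scale: 1 (low) to 5 (high)


def split_diff(diff: str) -> List[str]:
    """Cut a unified diff into one block per file, splitting on 'diff --git' headers."""
    blocks: List[str] = []
    current: List[str] = []
    for line in diff.splitlines():
        if line.startswith("diff --git") and current:
            blocks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        blocks.append("\n".join(current))
    return blocks


def review_loop(
    diff: str,
    worker: Callable[[str], List[Finding]],    # cheap lane: triage per block
    reviewer: Callable[[List[Finding]], str],  # strong lane: final close
    run_tests: Callable[[], bool],             # hard gate: real tests and lint
    escalate_top: int = 3,
) -> str:
    # Step 1: the cheap worker reads every block.
    findings = [f for block in split_diff(diff) for f in worker(block)]
    # Step 2: only the strongest findings reach the expensive lane.
    strongest = sorted(findings, key=lambda f: f.severity, reverse=True)[:escalate_top]
    # Step 3: no final decision is issued unless the gate passes.
    if not run_tests():
        return "BLOCKED: tests or lint failed; no final decision issued"
    return reviewer(strongest)
```

Running the gate before the reviewer call also avoids paying for the strong lane on a change that cannot close anyway.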

Model: GPT-5.4 mini for workers and GPT-5.4 for the final close or review.

Provider: OpenAI fits best when the loop depends on tooling, repo context and a strong close.

Decision profile: Medium / Medium / Medium

Cautions

Do not leave the final answer entirely to the cheap worker.
If tests are missing, the loop creates false confidence very quickly.

Research

Research and synthesis with a long-context planner and executive close

Mid-high

Problem: When the task mixes several sources, contradictions and a conclusion that must stay compact and useful.

Recommended stack

Source retrieval kept separate from synthesis
Long-context planner to consolidate findings
Final memo or table format before publishing

Flow steps

Keep source capture and synthesis separate so evidence and summary do not collapse too early.
Mark contradictions and gaps before writing the conclusion.
Close with a short executive output and a list of remaining validations.
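
One way to keep evidence and summary from collapsing too early is to store captured claims as plain records and compute contradictions and gaps before any conclusion is written. The sketch below assumes a hypothetical `Evidence` record and treats two claims as contradictory when their text differs for the same topic. That comparison is deliberately naive; a real flow would likely use a model or a human to judge disagreement.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Evidence:
    source: str
    topic: str
    claim: str


def find_contradictions(evidence: List[Evidence]) -> Dict[str, List[Evidence]]:
    """Group evidence by topic and keep the topics where sources disagree."""
    by_topic: Dict[str, List[Evidence]] = defaultdict(list)
    for item in evidence:
        by_topic[item.topic].append(item)
    return {
        topic: items
        for topic, items in by_topic.items()
        if len({i.claim for i in items}) > 1  # naive: any differing text counts
    }


def find_gaps(evidence: List[Evidence], required_topics: List[str]) -> List[str]:
    """List required topics that no captured source covers."""
    covered = {e.topic for e in evidence}
    return [t for t in required_topics if t not in covered]


def executive_memo(
    conclusion: str,
    contradictions: Dict[str, List[Evidence]],
    gaps: List[str],
) -> str:
    """Short executive output followed by the remaining validations."""
    lines = [f"Conclusion: {conclusion}", "Remaining validations:"]
    for topic, items in contradictions.items():
        sources = ", ".join(sorted(e.source for e in items))
        lines.append(f"- resolve conflict on '{topic}' ({sources})")
    lines += [f"- find a source for '{t}'" for t in gaps]
    if len(lines) == 2:
        lines.append("- none")
    return "\n".join(lines)
```

The conclusion is passed in as a finished string on purpose: synthesis runs as its own step after contradictions and gaps are already marked.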

Model: GPT-5.4 as the main planner and Gemini 2.5 Flash-Lite if you need a cheap classification pass.

Provider: OpenAI for sustained reasoning; Google as support when reading throughput matters.

Decision profile: Mid-high / Moderate / Medium

Cautions

Do not use a single pass to gather sources and decide.
If you ask for too much prose, usefulness drops even when reasoning is strong.

Retrieval

Retrieval plus tools for flows that need real grounding

Medium

Problem: When the model must operate against documents or external systems and internal knowledge is not enough.

Recommended stack

Retriever or hybrid search with simple filters
Tool layer for external actions or verifiable queries
Synthesis model kept separate from retrieval

Flow steps

Retrieve a few good blocks before expanding context.
Use tools to verify or act and leave clear traces for every call.
Ask the final synthesis to cite grounding limits and gaps.
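
The first two steps can be illustrated with a small retriever that returns only a few blocks and a tool wrapper that records every call. This is a sketch under assumptions: `retrieve` uses naive term overlap as a stand-in for whatever retriever or hybrid search you run, and `TracedTools` is a hypothetical helper, not part of any library.

```python
import time
from typing import Any, Callable, Dict, List


def retrieve(query: str, blocks: List[str], k: int = 3) -> List[str]:
    """Return up to k blocks ranked by term overlap; blocks with no overlap are dropped."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(b.lower().split())), b) for b in blocks]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [b for _, b in ranked[:k]]


class TracedTools:
    """Registry that leaves a trace entry for every tool call, including failures."""

    def __init__(self) -> None:
        self.trace: List[Dict[str, Any]] = []
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Dict[str, Any]:
        entry: Dict[str, Any] = {"tool": name, "args": kwargs, "started": time.time()}
        try:
            entry["result"] = self._tools[name](**kwargs)
            entry["ok"] = True
        except Exception as exc:  # unknown tool or tool failure: record it, do not hide it
            entry["error"] = repr(exc)
            entry["ok"] = False
        self.trace.append(entry)
        return entry
```

Because failed calls are traced as well, the final synthesis can cite which lookups did not succeed when it reports grounding limits and gaps.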

Model: Gemini 2.5 Flash-Lite or GPT-5.4 mini for retrieval workers; GPT-5.4 for delicate synthesis.

Provider: Google and OpenAI work well when you separate cheap workers from the final layer; retrieval design matters most.

Decision profile: Medium / Medium / High

Cautions

More context does not fix badly designed retrieval.
If the tool layer is not auditable, the flow only looks safe.

Browser agents

Browser flow with agents and human closes

Medium

Problem: When the task requires navigating real UIs, extracting state and closing actions with some control.

Recommended stack

Browser worker for exploration and capture
Separate planner to decide the next step
Human checkpoint or strong rule before sensitive actions

Flow steps

Explore the UI in short steps and capture evidence before acting.
Make the planner summarize state, risks and the next action.
Gate submits or destructive changes behind a clear approval.
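
A bounded loop with an approval gate might look like the sketch below. Everything here is an assumption for illustration: `Action`, `StepLog`, `run_step`, `run_flow` and the `SENSITIVE_KINDS` set are hypothetical, and `observe`, `execute` and `approve` are placeholder callables for your browser driver and your human or rule-based checkpoint.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical set of action kinds that must never run without approval.
SENSITIVE_KINDS = {"submit", "delete", "purchase"}


@dataclass
class Action:
    kind: str
    target: str


@dataclass
class StepLog:
    evidence: List[str] = field(default_factory=list)
    executed: List[Action] = field(default_factory=list)
    blocked: List[Action] = field(default_factory=list)


def run_step(
    action: Action,
    observe: Callable[[], str],
    execute: Callable[[Action], None],
    approve: Callable[[Action], bool],
    log: StepLog,
) -> bool:
    """Capture evidence, then act only if the action is safe or approved."""
    log.evidence.append(observe())  # evidence is captured before acting
    if action.kind in SENSITIVE_KINDS and not approve(action):
        log.blocked.append(action)
        return False
    execute(action)
    log.executed.append(action)
    return True


def run_flow(
    actions: List[Action],
    observe: Callable[[], str],
    execute: Callable[[Action], None],
    approve: Callable[[Action], bool],
    max_steps: int = 10,
) -> StepLog:
    """Run at most max_steps actions and stop at the first blocked one."""
    log = StepLog()
    for action in actions[:max_steps]:
        if not run_step(action, observe, execute, approve, log):
            break
    return log
```

The `max_steps` cap is what keeps an unstable UI from turning into an unbounded retry loop.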

Model: GPT-5.4 mini for browser workers and GPT-5.4 for the planner or approver.

Provider: OpenAI fits best when the flow depends on iterative tool use and controlled outputs.

Decision profile: Medium / Mid-high / High

Cautions

Do not confuse observation with safe autonomous execution.
Unstable UIs drive cost and error up if the flow is not bounded.

Multimodal

Multimodal analysis with one strong lane and a compact output

Mid-high

Problem: When image, audio, video or long PDFs enter the flow and the decision must land in one useful read.

Recommended stack

One strong multimodal lane for the main read
Short actionable text summary as the base output
Secondary verification path when the asset is critical

Flow steps

Define which signals matter and what actionable output you need first.
Process the multimodal asset in one strong lane before summarizing.
Collapse the result into a memo, checklist or scorecard so the decision is useful.
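
As an illustration of the last step, the sketch below collapses per-signal readings into a plain-text scorecard. It assumes the multimodal lane has already produced a score between 0 and 1 for each signal; `Signal`, `scorecard` and the threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Signal:
    name: str
    threshold: float  # hypothetical pass mark between 0 and 1


def scorecard(signals: List[Signal], readings: Dict[str, float]) -> str:
    """One line per signal defined up front: PASS, REVIEW, or MISSING."""
    rows = []
    for s in signals:
        value = readings.get(s.name)
        if value is None:
            rows.append(f"{s.name}: MISSING")  # the lane never reported this signal
        elif value >= s.threshold:
            rows.append(f"{s.name}: PASS ({value:.2f})")
        else:
            rows.append(f"{s.name}: REVIEW ({value:.2f})")
    return "\n".join(rows)
```

Iterating over the predefined signals rather than over the readings means anything the model volunteers beyond those signals stays out of the output, which keeps it compact.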

Model: Gemini 2.5 Pro as the first choice; GPT-5.4 mini only when the visual input is light.

Provider: Google is usually stronger when the problem is truly multimodal and not just text with a decorative image.

Decision profile: Mid-high / Moderate / Medium

Cautions

Do not force multimodality if the real case is text and tables.
Without a compact output, the user ends up with a pretty description and little utility.

Local-first

Local-first workflow for privacy, edge and control

Competitive

Problem: When data residency, fixed cost or stack autonomy matter more than the last point of frontier quality.

Recommended stack

Open-weight or self-hosted model as the primary lane
Remote fallback only for premium cases or final validation
Simple observability over cost, latency and quality

Flow steps

Define which tasks must stay local and which ones may escalate remotely.
Size the local lane for the base case, not the most extreme benchmark.
Measure latency, quality and cost before activating a frontier fallback.
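
The three steps can be sketched as a routing rule plus a measurement record. All names and numbers are assumptions for illustration: `Task`, `LaneStats`, `route`, `fallback_justified`, the 1 to 5 complexity scale, and the `min_runs` and `quality_floor` defaults are hypothetical and would need tuning against your own workload.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List


@dataclass
class Task:
    name: str
    must_stay_local: bool
    complexity: int  # hypothetical scale: 1 (simple) to 5 (hard)


@dataclass
class LaneStats:
    latencies_ms: List[float] = field(default_factory=list)
    quality_scores: List[float] = field(default_factory=list)
    costs: List[float] = field(default_factory=list)

    def record(self, latency_ms: float, quality: float, cost: float = 0.0) -> None:
        self.latencies_ms.append(latency_ms)
        self.quality_scores.append(quality)
        self.costs.append(cost)

    def summary(self) -> Dict[str, float]:
        runs = len(self.quality_scores)
        if runs == 0:
            return {"runs": 0}
        return {
            "runs": runs,
            "avg_latency_ms": mean(self.latencies_ms),
            "avg_quality": mean(self.quality_scores),
            "total_cost": sum(self.costs),
        }


def route(task: Task, remote_enabled: bool, local_max_complexity: int = 3) -> str:
    """Tasks marked local never leave; others escalate only when allowed and hard."""
    if task.must_stay_local:
        return "local"
    if remote_enabled and task.complexity > local_max_complexity:
        return "remote"
    return "local"


def fallback_justified(
    stats: LaneStats, min_runs: int = 20, quality_floor: float = 0.8
) -> bool:
    """Enable the remote fallback only after enough measured local runs fall short."""
    if len(stats.quality_scores) < min_runs:
        return False
    return mean(stats.quality_scores) < quality_floor
```

Checking `must_stay_local` first means no configuration of the fallback can route a restricted task off the local lane.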

Model: Mistral Large 3 or Ministral 3 8B depending on desired quality and operational footprint.

Provider: Mistral fits best when open-weight or private-cloud strategy is a real decision, not a slogan.

Decision profile: Competitive / Variable / High

Cautions

Local-first does not mean local-only: keep an exit path for complex cases.
If you do not measure hardware and throughput, the cost story becomes rhetorical.
