Practical recipes

Use workflow recipes when the question is how to run the lane, not only which model or provider to buy.

This page turns recurring problems into bounded operating patterns for coding review, research, retrieval, browser agents, multimodal analysis and local-first deployment.


Local snapshot rules

Workflow Recipes is the operating layer. It uses a 2026-03 local snapshot and should be opened once provider and model choices have already been narrowed down. If the open question is still vendor, scenario or raw specs, step back to the LLM route first. If the bottleneck is local serving or hardware spend, jump to the Inference Hardware Guide.

Recipes live: 6 (actionable workflow lanes)

Low-cost lanes: 1 (worker or local-first friendly)

High-complexity lanes: 3 (need stronger control or tooling)

Local-first lanes: 1 (privacy or self-host posture)

How to read this board

  1. Problem: start with the operational constraint

    Pick the recipe by the real bottleneck: review cost, retrieval quality, multimodal input or control.

  2. Stack: separate worker lanes from final-close lanes

    Most useful flows split cheap workers, stronger planners and hard validation gates.

  3. Validation: keep tests, checkpoints or grounded evidence visible

    A workflow without real validation is only a demo, not an operating lane.

LLM route

Go back to the routing layer if you still need to clarify whether the problem is vendor, model or scenario.

Agent stack board

Open the stack layer when the recipe is blocked by architecture, governance or browser orchestration choices.

Model fit radar

Use scenario-first model picks when the recipe is stable but the model lane is not.

Inference guide

Open hardware guidance when the recipe is now limited by local serving, VRAM, RAM or power budget.

Coding review

Coding review loop with a cheap worker and a strong close

When you need to review real PRs or diffs without paying for a frontier lane on every intermediate pass.

Research

Research and synthesis with a long-context planner and executive close

When the task mixes several sources, contradictions and a conclusion that must stay compact and useful.

Retrieval

Retrieval plus tools for flows that need real grounding

When the model must operate against documents or external systems and internal knowledge is not enough.

Browser agents

Browser flow with agents and human closes

When the task requires navigating real UIs, extracting state and closing actions with some control.

Multimodal

Multimodal analysis with one strong lane and a compact output

When image, audio, video or long PDFs enter the flow and the decision must land in one useful read.

Local-first

Local-first workflow for privacy, edge and control

When data residency, fixed cost or stack autonomy matter more than the last increment of frontier quality.

Operating board

Recipes that answer what to run next

Coding review

Coding review loop with a cheap worker and a strong close

Medium

Problem: When you need to review real PRs or diffs without paying for a frontier lane on every intermediate pass.

Recommended stack

  • Low-cost worker for diff triage and smell detection
  • Stronger planner for risks, regressions and the final answer
  • Tests and lint as a mandatory gate before close

Flow steps

  1. Cut the diff into blocks and ask a cheap worker for risks, test gaps and critical files.
  2. Escalate only the strongest findings to the main planner or reviewer.
  3. Run real tests before turning the review into a final decision.

Model: GPT-5.4 mini for workers and GPT-5.4 for the final close or review.

Provider: OpenAI fits best when the loop depends on tooling, repo context and a strong close.

Decision profile: Medium / Medium / Medium

Cautions

  • Do not leave the final answer entirely to the cheap worker.
  • If tests are missing, the loop creates false confidence very quickly.
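
The flow steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `worker`, `reviewer` and `run_tests` are hypothetical callables standing in for the cheap model lane, the strong model lane and the project's test command, and the 1 to 5 severity scale is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    file: str
    note: str
    severity: int  # 1 (minor) to 5 (critical); the scale is an assumption


def split_diff(diff: str) -> List[str]:
    """Cut a unified diff into per-file blocks on 'diff --git' headers."""
    blocks: List[str] = []
    current: List[str] = []
    for line in diff.splitlines():
        if line.startswith("diff --git") and current:
            blocks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        blocks.append("\n".join(current))
    return blocks


def review(
    diff: str,
    worker: Callable[[str], List[Finding]],    # cheap lane (hypothetical wrapper)
    reviewer: Callable[[List[Finding]], str],  # strong lane (hypothetical wrapper)
    run_tests: Callable[[], bool],             # wraps the real test and lint run
    min_severity: int = 4,
) -> str:
    # Step 1: the cheap worker triages every block of the diff.
    findings = [f for block in split_diff(diff) for f in worker(block)]
    # Step 2: only the strongest findings reach the strong lane.
    escalated = [f for f in findings if f.severity >= min_severity]
    # Step 3: tests are a hard gate; without a pass there is no verdict.
    if not run_tests():
        return "BLOCKED: tests failed, no review verdict issued"
    return reviewer(escalated)
```

Running the gate before the strong lane is a design choice in this sketch: a failing test run blocks the verdict without spending a frontier call, and the final answer never comes from the cheap worker alone.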

Research

Research and synthesis with a long-context planner and executive close

Mid-high

Problem: When the task mixes several sources, contradictions and a conclusion that must stay compact and useful.

Recommended stack

  • Source retrieval kept separate from synthesis
  • Long-context planner to consolidate findings
  • Final memo or table format before publishing

Flow steps

  1. Keep source capture and synthesis separate so evidence and summary do not collapse too early.
  2. Mark contradictions and gaps before writing the conclusion.
  3. Close with a short executive output and a list of remaining validations.

Model: GPT-5.4 as the main planner and Gemini 2.5 Flash-Lite if you need a cheap classification pass.

Provider: OpenAI for sustained reasoning; Google as support when reading throughput matters.

Decision profile: Mid-high / Moderate / Medium

Cautions

  • Do not use a single pass to gather sources and decide.
  • If you ask for too much prose, usefulness drops even when reasoning is strong.
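
One way to keep source capture and synthesis apart, as the flow steps above describe, is to pass evidence around as structured records and let the planner see contradictions and gaps explicitly. The sketch below is illustrative only: `planner` is a hypothetical callable wrapping the long-context model, and treating two different claim strings on the same question as a contradiction is a deliberately crude assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Evidence:
    source: str
    question: str
    claim: str


@dataclass
class Memo:
    summary: str
    contradictions: List[str] = field(default_factory=list)
    remaining_validations: List[str] = field(default_factory=list)


def find_contradictions(evidence: List[Evidence]) -> List[str]:
    """Return the questions on which captured sources disagree."""
    claims_by_question: Dict[str, Set[str]] = {}
    for item in evidence:
        claims_by_question.setdefault(item.question, set()).add(item.claim)
    return sorted(q for q, claims in claims_by_question.items() if len(claims) > 1)


def synthesize(
    evidence: List[Evidence],
    questions: List[str],
    planner: Callable[[List[Evidence], List[str], List[str]], str],
    max_words: int = 120,
) -> Memo:
    # Mark contradictions and gaps before any conclusion is written.
    contradictions = find_contradictions(evidence)
    answered = {item.question for item in evidence}
    gaps = [q for q in questions if q not in answered]
    # The planner only synthesizes; it does not gather sources.
    draft = planner(evidence, contradictions, gaps)
    # Hard cap on prose so the executive output stays compact.
    summary = " ".join(draft.split()[:max_words])
    return Memo(summary, contradictions, gaps + contradictions)
```

The word cap is a blunt instrument, but it enforces the caution about excess prose mechanically instead of relying on the prompt.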

Retrieval

Retrieval plus tools for flows that need real grounding

Medium

Problem: When the model must operate against documents or external systems and internal knowledge is not enough.

Recommended stack

  • Retriever or hybrid search with simple filters
  • Tool layer for external actions or verifiable queries
  • Synthesis model kept separate from retrieval

Flow steps

  1. Retrieve a few good blocks before expanding context.
  2. Use tools to verify or act and leave clear traces for every call.
  3. Ask the final synthesis to cite grounding limits and gaps.

Model: Gemini 2.5 Flash-Lite or GPT-5.4 mini for retrieval workers; GPT-5.4 for delicate synthesis.

Provider: Google and OpenAI work well when you separate cheap workers from the final layer; retrieval design matters most.

Decision profile: Medium / Medium / High

Cautions

  • More context does not fix badly designed retrieval.
  • If the tool layer is not auditable, the flow only looks safe.
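
A small sketch of the three flow steps above, under stated assumptions: `top_k` is a toy keyword-overlap scorer standing in for a real retriever or hybrid search, `TracedTools` is a hypothetical wrapper that records every tool call, and `grounding_gaps` lists query terms that no retrieved block covers so the synthesis step can cite them as limits.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


def top_k(query: str, docs: Dict[str, str], k: int = 3) -> List[Tuple[str, int]]:
    """Toy retriever: rank documents by shared terms and keep only a few."""
    terms = set(query.lower().split())
    scored = [(doc_id, len(terms & set(text.lower().split())))
              for doc_id, text in docs.items()]
    hits = [s for s in scored if s[1] > 0]
    return sorted(hits, key=lambda s: (-s[1], s[0]))[:k]


def grounding_gaps(query: str, docs: Dict[str, str],
                   hits: List[Tuple[str, int]]) -> List[str]:
    """Query terms that none of the retrieved blocks mention."""
    found = set()
    for doc_id, _ in hits:
        found |= set(docs[doc_id].lower().split())
    return sorted(set(query.lower().split()) - found)


class TracedTools:
    """Wraps tool callables so that every call leaves an auditable trace."""

    def __init__(self, tools: Dict[str, Callable[..., Any]]):
        self.tools = tools
        self.trace: List[Dict[str, Any]] = []

    def call(self, name: str, **kwargs: Any) -> Any:
        started = time.time()
        try:
            result = self.tools[name](**kwargs)
            status = "ok"
        except Exception as exc:  # unknown tools and tool failures are traced too
            result, status = None, f"error: {exc}"
        self.trace.append({"tool": name, "args": kwargs,
                           "status": status, "started": started})
        return result
```

Keeping `k` small by default reflects the first step: retrieve a few good blocks before expanding context. The trace list is the part that makes the tool layer auditable.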

Browser agents

Browser flow with agents and human closes

Medium

Problem: When the task requires navigating real UIs, extracting state and closing actions with some control.

Recommended stack

  • Browser worker for exploration and capture
  • Separate planner to decide the next step
  • Human checkpoint or strong rule before sensitive actions

Flow steps

  1. Explore the UI in short steps and capture evidence before acting.
  2. Make the planner summarize state, risks and the next action.
  3. Gate submits or destructive changes behind a clear approval.

Model: GPT-5.4 mini for browser workers and GPT-5.4 for the planner or approver.

Provider: OpenAI fits best when the flow depends on iterative tool use and controlled outputs.

Decision profile: Medium / Mid-high / High

Cautions

  • Do not confuse observation with safe autonomous execution.
  • Unstable UIs drive cost and error up if the flow is not bounded.
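
The approval gate from the flow steps above might look like the sketch below. It deliberately avoids any real browser library: `plan_next`, `execute` and `approve` are hypothetical callables for the planner, the browser worker and the human (or rule-based) checkpoint, and the `SENSITIVE` action kinds are an assumed example set.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

SENSITIVE = {"submit", "delete", "pay"}  # assumed set of gated action kinds


@dataclass
class Action:
    kind: str
    target: str


def run_flow(
    plan_next: Callable[[str, List[str]], Optional[Action]],  # planner lane
    execute: Callable[[Action], str],                         # browser worker
    approve: Callable[[Action, List[str]], bool],             # human or strong rule
    max_steps: int = 10,
) -> List[str]:
    log: List[str] = []
    state = "start"
    # max_steps bounds the flow so an unstable UI cannot loop indefinitely.
    for _ in range(max_steps):
        action = plan_next(state, log)
        if action is None:
            break
        # Sensitive actions never run without an explicit approval.
        if action.kind in SENSITIVE and not approve(action, log):
            log.append(f"denied {action.kind} {action.target}")
            break
        state = execute(action)
        log.append(f"{action.kind} {action.target} -> {state}")
    return log
```

The approver receives the full log, so the evidence captured during exploration is what the checkpoint decides on.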

Multimodal

Multimodal analysis with one strong lane and a compact output

Mid-high

Problem: When image, audio, video or long PDFs enter the flow and the decision must land in one useful read.

Recommended stack

  • One strong multimodal lane for the main read
  • Short actionable text summary as the base output
  • Secondary verification path when the asset is critical

Flow steps

  1. Define which signals matter and what actionable output you need first.
  2. Process the multimodal asset in one strong lane before summarizing.
  3. Collapse the result into a memo, checklist or scorecard so the decision is useful.

Model: Gemini 2.5 Pro as the first choice; GPT-5.4 mini only when the visual input is light.

Provider: Google is usually stronger when the problem is truly multimodal and not just text with a decorative image.

Decision profile: Mid-high / Moderate / Medium

Cautions

  • Do not force multimodality if the real case is text and tables.
  • Without a compact output, the user ends up with a pretty description and little utility.
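
As a rough sketch of the flow steps above: declare the signals first, read the asset once in the strong lane, then collapse the result into a scorecard. `read_asset` and `verify` are hypothetical callables wrapping the primary multimodal model and a secondary verification path; the `NOT OBSERVED` and `DISPUTED` labels are illustrative conventions, not part of any API.

```python
from typing import Callable, Dict, List, Optional

Reader = Callable[[List[str]], Dict[str, str]]


def scorecard(
    signals: List[str],
    read_asset: Reader,               # one strong multimodal lane, one pass
    critical: bool = False,
    verify: Optional[Reader] = None,  # secondary path for critical assets
) -> Dict[str, str]:
    readings = read_asset(signals)
    # Keep only the signals that were asked for; extra description is dropped.
    rows = {s: readings.get(s, "NOT OBSERVED") for s in signals}
    if critical and verify is not None:
        second = verify(signals)
        for s in signals:
            other = second.get(s, "NOT OBSERVED")
            if other != rows[s]:
                rows[s] = f"DISPUTED ({rows[s]} vs {other})"
    return rows


def render(rows: Dict[str, str]) -> str:
    """Compact text output: one line per signal."""
    return "\n".join(f"{name}: {value}" for name, value in rows.items())
```

Filtering to the declared signals is what keeps the output from turning into a long description with little decision value.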

Local-first

Local-first workflow for privacy, edge and control

Competitive

Problem: When data residency, fixed cost or stack autonomy matter more than the last increment of frontier quality.

Recommended stack

  • Open-weight or self-hosted model as the primary lane
  • Remote fallback only for premium cases or final validation
  • Simple observability over cost, latency and quality

Flow steps

  1. Define which tasks must stay local and which ones may escalate remotely.
  2. Size the local lane for the base case, not the most extreme benchmark.
  3. Measure latency, quality and cost before activating a frontier fallback.

Model: Mistral Large 3 or Ministral 3 8B depending on desired quality and operational footprint.

Provider: Mistral fits best when open-weight or private-cloud strategy is a real decision, not a slogan.

Decision profile: Competitive / Variable / High

Cautions

  • Local-first does not mean local-only: keep an exit path for complex cases.
  • If you do not measure hardware and throughput, the cost story becomes rhetorical.
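
The routing policy implied by the flow steps above can be sketched as follows. All names are hypothetical: `local` and `remote` wrap the self-hosted and fallback model lanes, `good_enough` is whatever quality check the team trusts, and `metrics` is a plain list standing in for real observability. Cost is not tracked here because it depends on hardware and pricing figures the source does not give.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Task:
    prompt: str
    must_stay_local: bool  # decided up front, per the first flow step


@dataclass
class Result:
    text: str
    lane: str
    latency_s: float


def timed(fn: Callable[[str], str], prompt: str) -> Tuple[str, float]:
    start = time.perf_counter()
    text = fn(prompt)
    return text, time.perf_counter() - start


def route(
    task: Task,
    local: Callable[[str], str],
    remote: Optional[Callable[[str], str]],
    good_enough: Callable[[str], bool],
    metrics: List[Tuple[str, float, bool]],
) -> Result:
    # The local lane always runs first and is always measured.
    text, latency = timed(local, task.prompt)
    ok = good_enough(text)
    metrics.append(("local", latency, ok))
    # Escalate only when quality falls short AND the task is allowed to leave.
    if ok or task.must_stay_local or remote is None:
        return Result(text, "local", latency)
    text, latency = timed(remote, task.prompt)
    metrics.append(("remote", latency, good_enough(text)))
    return Result(text, "remote", latency)
```

Because every call appends to `metrics`, the share of tasks that escalate and the latency of each lane can be read from data before a frontier fallback is enabled more broadly.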