I cut my OpenClaw bill by 50%. Without changing a line of code.


Author: @akshay_pachaar
Original: https://x.com/akshay_pachaar/status/2025933798953865284

A casual "What's the plan today?" was costing me the same as "Rewrite this entire auth service."

Every prompt hit the same model at the same price, no matter how simple or complex. That's a broken setup.

Here’s how I fixed it:

If you use OpenClaw seriously, you know the tradeoff: the capable models are expensive, and the cheap ones fall short on hard tasks.

But that's only a tradeoff if you assume every prompt needs the same model. What if it didn't?

I’m using OpenClaw here, but this applies to any agentic workflow where you’re sending every prompt to a single model.

LLM routing is a layer between your app and your providers.

It reads each incoming prompt, classifies the task, and routes it to the model best suited for the job. Simple queries go to fast, cheap models.

Complex tasks go to more capable, expensive ones.

And it all happens automatically.

Plano is an open-source AI-native proxy that handles routing, orchestration, guardrails, and observability in one place.

One of its most powerful capabilities is preference-aligned LLM routing, which fit my use case perfectly.

Instead of routing by benchmark scores, Plano routes by what developers actually prefer for each task. You encode that directly in config.

For example:
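The original post showed this as a screenshot. The idea looks roughly like the sketch below: each route is a plain-English description of a task type mapped to a model. The YAML keys and model identifiers here are assumptions for illustration, not the verified Plano schema.

```yaml
# Illustrative routing preferences (field names are assumptions)
- name: quick-questions
  description: casual chat, daily planning, simple lookups
  model: kimi-k2.5          # fast, cheap
- name: code-generation
  description: writing or rewriting code, e.g. refactoring an auth service
  model: claude             # capable, expensive
```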

The router matches each prompt to your preferences and dispatches automatically. Zero changes to OpenClaw.

Plano is built on Arch-Router-1.5B, a model trained on human preference data, not just benchmark scores. It's already been deployed at scale at HuggingFace.

The model on HuggingFace:

Arch-router doesn’t guess which model is “smarter.” It routes based on what developers actually prefer for each task type. You define routing preferences in plain config.

Plano reads each prompt, matches it to a preference, and routes it. Zero changes to OpenClaw.

OpenClaw + Plano: An Architecture for Smart Model Routing

1️⃣ Setup

Set API keys as environment variables:
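Something like the following, one key per provider. The variable names below are assumptions based on the two providers used later in this post (Claude and Kimi K2.5); use whatever names your Plano config actually references.

```shell
# Hypothetical variable names — match them to your plano_config.yaml.
export ANTHROPIC_API_KEY="sk-ant-..."   # for Claude
export MOONSHOT_API_KEY="sk-..."        # for Kimi K2.5 (assumed provider)
```

You can also place these in a `.env` file, as the post does in step 3.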

2️⃣ Create plano config file

Plano operates based on a configuration file where you can define LLM routing.

Create a config file to get started with Plano.

The key things to set here are your LLM providers and your routing preferences (just plain-English descriptions).

Check this out:
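The config screenshot from the original post isn't reproduced here; a sketch along these lines captures the idea. Every field name below is an assumption — check the Plano repo for the real schema. The port (12000) and the two models come from later in this post.

```yaml
# plano_config.yaml — illustrative sketch, not the verified Plano schema
listener:
  address: 0.0.0.0
  port: 12000               # the port OpenClaw will talk to

llm_providers:
  - name: kimi-k2.5         # fast, cheap default
    api_key: $MOONSHOT_API_KEY
  - name: claude            # capable model for hard tasks
    api_key: $ANTHROPIC_API_KEY

routing_preferences:        # plain-English task descriptions
  - name: quick-questions
    description: casual chat, planning, simple lookups
    model: kimi-k2.5
  - name: code-generation
    description: writing or refactoring code
    model: claude
```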

3️⃣ Start Plano

Now that the configuration file is created and environment variables are defined in a .env file, you start Plano with the following command:
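The exact command from the original screenshot isn't preserved. Assuming a `plano` CLI (check the repo for the actual binary name and subcommand), it would look something like this; the `--with-tracing` flag is the one referenced later in this post.

```
plano up plano_config.yaml --with-tracing
```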

4️⃣ Start OpenClaw

OpenClaw's installer sets up the gateway as a background service.

You can also connect messaging channels, such as WhatsApp or Telegram.

Run `openclaw doctor` to verify that everything is working correctly.

Check this out:

5️⃣ Point OpenClaw at Plano

During the OpenClaw onboarding wizard, when prompted to choose an LLM provider:
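The screenshot of this wizard step is lost. Based on the port Plano listens on, the choices amount to something like the following; the option label and the `/v1` path are assumptions — pick whatever custom/OpenAI-compatible provider option OpenClaw offers.

```
Provider:  custom / OpenAI-compatible endpoint   (assumed option name)
Base URL:  http://localhost:12000/v1             (Plano's listener; /v1 path is an assumption)
API key:   any placeholder — Plano holds the real provider keys
```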

This registers Plano as OpenClaw’s LLM backend. All requests are routed through Plano on port 12000, which directs them to Kimi K2.5 or Claude based on the prompt content.

Check this:

Here’s what gets routed, and where:

Since we started Plano with --with-tracing, you can inspect exactly how each prompt was classified and which model it was routed to.

Here's a trace showing how Plano handled a routing decision: when we asked it to generate code, it used Claude instead of Kimi.

Check this:

I used to spend time picking the “right” model. That’s the wrong game.

The real question isn’t which model. It’s which model for which task?

Smart routing answers that automatically. You define your preferences once, and every prompt gets matched to the right model without you thinking about it.

The result: you save a lot of money without compromising on quality.

If you're working on smart LLM routing, Plano is 100% open-source (Apache 2.0).

Check out their GitHub repo → github.com/katanemo/plano

(don't forget to star🌟)

Thanks for reading!

And stay tuned for a detailed video that I'm doing this week on securely deploying OpenClaw.

Cheers! :)

