Token Cutter 500: same model, two paths
grok-4-fast-non-reasoning

One prompt in. Two workers out. Same model. Same master prompt.

The normal path sends your full prompt straight through. The TC5 path compresses the prompt first, then sends the shorter version through the same model with the same instruction stack.
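The two paths can be sketched in a few lines. Here `compress_prompt` and `call_model` are illustrative stand-ins, not the real TC5 compressor or the live Grok call, which are backend details this page does not expose; the temperature and max-token values match the settings shown below.

```python
from typing import Dict, Tuple

def compress_prompt(prompt: str) -> str:
    """Placeholder compression: strip common filler words.
    The real TC5 compression step is unspecified on this page."""
    filler = {"very", "really", "just", "basically", "actually"}
    return " ".join(w for w in prompt.split() if w.lower() not in filler)

def call_model(master: str, user: str, temperature: float, max_tokens: int) -> Dict:
    """Stub standing in for the live grok-4-fast-non-reasoning call."""
    return {"prompt": user, "temperature": temperature, "max_tokens": max_tokens}

def run_both(user_prompt: str, master: str) -> Tuple[Dict, Dict]:
    """Send the same request through both workers with identical settings."""
    normal = call_model(master, user_prompt, temperature=0, max_tokens=72)
    tc5 = call_model(master, compress_prompt(user_prompt), temperature=0, max_tokens=72)
    return normal, tc5
```

The only difference between the two calls is the compression step on the user prompt; the master prompt and sampling settings are shared, so any divergence in the answers traces back to the trimmed input.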

Backend-driven prompt metrics
Shared master prompt
Side-by-side worker answers
Temp 0 • Max 72

Prompt input

Type once, then send the same request through both workers. Preview token charts refresh as you type, and live runs use the same low-variance settings on both paths.

82 / 220 words • Ready to send • Live token preview
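A minimal sketch of the word counter and status line under the prompt box, assuming the 220-word public demo cap stated on this page; `preview_status` is a hypothetical helper for illustration, not the actual frontend code.

```python
DEMO_WORD_CAP = 220  # public demo cap stated on this page

def preview_status(prompt: str, cap: int = DEMO_WORD_CAP) -> str:
    """Return a status line like '82 / 220 words • Ready to send'."""
    n = len(prompt.split())
    if n == 0:
        return f"0 / {cap} words • Waiting for input"
    if n > cap:
        return f"{n} / {cap} words • Over the demo cap"
    return f"{n} / {cap} words • Ready to send"
```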

Normal worker response

Full prompt sent directly to grok-4-fast-non-reasoning with the shared master prompt.

Input: --
Output: --
Total: --
Latency: --
Temp 0 • Max 72 • Finish after run • Fingerprint after run

The normal worker answer will appear here after the prompt is sent through the raw path.

TC5 worker response

Trimmed prompt sent to the same model with the same shared master prompt.

Input: --
Output: --
Total: --
Latency: --
Temp 0 • Max 72 • Finish after run • Fingerprint after run

The TC5 worker answer will appear here after the compressed path finishes.

Normal Tokens In: --

Preview estimate for the full prompt until you send it.

TC5 Tokens In: --

Preview estimate for the compressed prompt until you send it.

Normal Tokens Out: --

Appears after the live model call.

TC5 Tokens Out: --

Appears after the live model call.

Words removed: --
Preview savings: -- tokens (--)
Live latency appears after you send the prompt.
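The preview-savings figures reduce to simple arithmetic on the two input-token estimates. This sketch uses a hypothetical `preview_savings` helper; the live counters on the page are backend-driven.

```python
from typing import Tuple

def preview_savings(normal_tokens: int, tc5_tokens: int) -> Tuple[int, float]:
    """Return (tokens saved, percent saved) for the preview panel."""
    saved = normal_tokens - tc5_tokens
    pct = (saved / normal_tokens * 100) if normal_tokens else 0.0
    return saved, round(pct, 1)
```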

Normal worker prompt tokens

Backend preview count for the untouched user prompt before the master prompt is added.

-- tokens • -- words
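One common way a frontend approximates token counts before a backend responds is a characters-per-token heuristic (roughly four characters per token for English text). This is an assumption for illustration only; the actual backend preview counter may use the model's real tokenizer instead.

```python
def estimate_tokens(text: str) -> int:
    """Rough preview estimate using a ~4-chars-per-token heuristic.
    Illustrative only; not the real backend counter."""
    if not text:
        return 0
    return max(1, round(len(text) / 4))
```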

TC5 worker prompt tokens

Backend preview count for the compressed prompt that goes through the TC5 path.

-- tokens • -- words
Public demo mode is capped at 220 words so the live Grok runs stay cheap and visibly comparable.