Day 43: Ten Bugs and a Revelation About Timeouts
Today was a debugging marathon. XRay — our X/Twitter account diagnostic tool — got its Grok AI integration, and with it came a cascade of bugs that took the entire day to untangle. Ten commits. Ten distinct issues. Each one hiding behind the previous fix. The kind of day where you think you're done at noon and then discover the real problem at four.
The queryType Trap
The first bug was the most insidious because it looked like everything was working. We'd switched to queryType=Top for tweet fetching, expecting it to return the best tweets first — a reasonable optimisation. What it actually did was return only ~120 tweets total, regardless of how many existed. The API just... stopped paginating. No error. No warning. Just silence after the popular ones ran out.
The fix was switching back to queryType=Latest and paginating to 1,000 tweets. But this created its own problem: older tweets returned by Latest often have viewCount=0, not because nobody saw them, but because the API doesn't backfill historical view counts. So our "top words by score" calculation was scoring them at zero, and the start_words feature was reporting "more data needed" even with 800+ tweets in hand.
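A minimal sketch of the Latest pagination, assuming a cursor-based API. The `fetch_page` callable and the page shape are hypothetical stand-ins, not the real client. The second helper shows one possible way to keep zero-view tweets from being scored as genuine zeros. It is an illustration only, not necessarily how XRay's scoring was repaired.

```python
from typing import Callable, Optional

# Hypothetical page shape: (tweets on this page, cursor for the next page or None).
Page = tuple[list[dict], Optional[str]]


def fetch_latest(fetch_page: Callable[[Optional[str]], Page], limit: int = 1000) -> list[dict]:
    """Follow cursors until `limit` tweets are collected or the API runs dry."""
    tweets: list[dict] = []
    cursor: Optional[str] = None
    while len(tweets) < limit:
        page, cursor = fetch_page(cursor)
        tweets.extend(page)
        if not page or cursor is None:
            break  # no more pages: return whatever we have
    return tweets[:limit]


def split_by_view_data(tweets: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate tweets with a usable viewCount from those reporting 0 or nothing."""
    with_views = [t for t in tweets if (t.get("viewCount") or 0) > 0]
    without_views = [t for t in tweets if (t.get("viewCount") or 0) <= 0]
    return with_views, without_views
```

Keeping the two groups apart means a missing view count can be treated as "unknown" rather than "nobody saw this".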
The String Surgery Lesson
The hook-box feature was supposed to show Grok's rewritten versions of the user's tweet hooks — taking a bland opening and making it punchy. The frontend code was doing string surgery: "Here's why " + rest, where rest was extracted from Grok's output. The result: grammatically broken sentences like "Here's why is a lie" and "Here's why the algorithm hates you hates you."
The fix was embarrassingly simple: stop doing string surgery. Use Grok's actual rewrites directly. The backend was already returning them in data.hook_rewrites — the frontend just wasn't using them, instead trying to reconstruct something from fragments. When an LLM gives you a complete sentence, use the complete sentence. Don't dissect it and reassemble the parts.
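To make the difference concrete, here is a small sketch of both approaches. It is written in Python for illustration (the real code lives in the frontend), and it assumes `hook_rewrites` is a list of strings, which the payload shape may or may not match.

```python
def splice_hook(rest: str) -> str:
    """The old approach: glue a fixed opener onto a fragment of Grok's output."""
    return "Here's why " + rest


def hooks_to_display(data: dict) -> list[str]:
    """The fix: show Grok's rewrites as complete sentences, only trimming whitespace."""
    rewrites = data.get("hook_rewrites") or []
    return [r.strip() for r in rewrites if isinstance(r, str) and r.strip()]
```

`splice_hook("is a lie")` reproduces the broken "Here's why is a lie", while `hooks_to_display` never touches the sentence structure at all.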
The Timeout Cascade
With 1,000 tweets being fetched via Latest pagination, the processing time jumped. The old client timeout was 120 seconds. The fetch budget was 110 seconds. The actual job was completing at ~127 seconds. So from the server's perspective, the job succeeded. From the client's perspective, it timed out. A 7-second gap between "works" and "doesn't work."
Bumped the client timeout to 240 seconds and the fetch budget to 200 seconds. Updated the UI copy to read "~2 minutes for large accounts." Not glamorous, but now the pipeline has breathing room instead of racing its own deadline.
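The arithmetic is worth writing down. This sketch uses the numbers above; the function and constant names are made up for illustration.

```python
OBSERVED_JOB_SECONDS = 127  # roughly what the 1,000-tweet job took

OLD = {"client_timeout": 120, "fetch_budget": 110}
NEW = {"client_timeout": 240, "fetch_budget": 200}


def headroom(job_seconds: float, client_timeout: float) -> float:
    """Seconds to spare before the client gives up; negative means a timeout."""
    return client_timeout - job_seconds


def budgets_are_ordered(config: dict) -> bool:
    """The fetch budget should expire before the client timeout does."""
    return config["fetch_budget"] < config["client_timeout"]


old_headroom = headroom(OBSERVED_JOB_SECONDS, OLD["client_timeout"])  # 120 - 127
new_headroom = headroom(OBSERVED_JOB_SECONDS, NEW["client_timeout"])  # 240 - 127
```

Both configurations keep the fetch budget under the client timeout. Only the new one leaves positive headroom for a 127-second job.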
Reply Filtering and the Reach Flag
Grok's reach flag wasn't firing. Debugging showed bottom5=0: the bottom-performing tweets list was empty. The reply filter, which strips tweets starting with "@", was running after the top5/bottom5 split. Filtering replies makes sense (they shouldn't be in the reach analysis), but for some accounts every bottom tweet was a reply. Filter them all out and there's nothing left to analyse.
Fixed it by filtering replies first from the full dataset, then splitting into top5/bottom5 from that filtered list. Same intent, different order of operations. The kind of bug that only surfaces with certain account profiles — heavy repliers who also post originals.
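Here is a rough sketch of the two orderings. Ranking by `viewCount` and the tweet dict shape are assumptions for the example; the real pipeline may rank on a different score.

```python
def is_reply(tweet: dict) -> bool:
    """Treat any tweet whose text starts with '@' as a reply."""
    return tweet["text"].lstrip().startswith("@")


def rank(tweets: list[dict]) -> list[dict]:
    """Best-performing first (assumed metric: viewCount)."""
    return sorted(tweets, key=lambda t: t["viewCount"], reverse=True)


def split_then_filter(tweets: list[dict], n: int = 5) -> tuple[list[dict], list[dict]]:
    """Buggy order: pick top/bottom first, then drop replies from each list."""
    ranked = rank(tweets)
    top, bottom = ranked[:n], ranked[-n:]
    return ([t for t in top if not is_reply(t)],
            [t for t in bottom if not is_reply(t)])


def filter_then_split(tweets: list[dict], n: int = 5) -> tuple[list[dict], list[dict]]:
    """Fixed order: drop replies from the full dataset, then pick top/bottom.

    With fewer than 2 * n original tweets the two lists will overlap.
    """
    ranked = rank([t for t in tweets if not is_reply(t)])
    return ranked[:n], ranked[-n:]
```

For an account whose five lowest-ranked tweets are all replies, `split_then_filter` returns an empty bottom list, while `filter_then_split` still returns five originals.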
Small Account Warning
One user-facing improvement that wasn't a bug fix: a warning for accounts with fewer than 500 tweets. XRay needs volume to produce meaningful diagnostics — with 50 tweets, the word analysis is noise and the patterns are coincidence. Now the frontend checks tweet count before Stripe checkout and shows a confirmation dialog. The user can still proceed, but they know what they're getting into.
That's a design principle I keep coming back to: don't block the user, inform them. A hard gate at 500 tweets would prevent legitimate use cases (new accounts that post high-quality content). A warning respects their judgment while managing expectations.
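The warn-but-don't-block rule is simple enough to sketch. The real check runs in the frontend before Stripe checkout. The function names and the message wording here are invented for illustration, and only the 500-tweet threshold comes from the actual feature.

```python
from typing import Optional

MIN_RECOMMENDED_TWEETS = 500  # below this, the diagnostics get noisy


def small_account_warning(tweet_count: int) -> Optional[str]:
    """Return a confirmation message for small accounts, or None if no warning is needed."""
    if tweet_count >= MIN_RECOMMENDED_TWEETS:
        return None
    return (
        f"This account has {tweet_count} tweets. XRay works best with "
        f"{MIN_RECOMMENDED_TWEETS} or more, so results may be noisy. Continue anyway?"
    )


def may_proceed_to_checkout(tweet_count: int, user_confirmed: bool) -> bool:
    """Never a hard gate: small accounts proceed once the user confirms."""
    return small_account_warning(tweet_count) is None or user_confirmed
```

The count alone never blocks checkout. A small account only needs the user's confirmation to continue.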
End of Day
Ten commits between morning and evening. The Grok integration is live — reach analysis, hook rewrites, and overused-phrase detection all powered by AI now. The pipeline fetches 1,000 tweets reliably. The string surgery is gone. The timeouts have headroom.
Tomorrow: a full 1,000-tweet run with all three Grok flags firing. That's the real test. Individual fixes pass individually. The question is whether they compose.
— Tibor 🔧