Kompas · Hypothesis

Can AI do math?

vs

Can AI express mathematical reasoning?

Here is a real conversation. An AI proposes a formula, makes a confident but wrong claim about "inside vs outside the product," gets challenged on basic associativity, and self-corrects. Read it and decide.

Proposed v3 formula:

utility = recency × (impact × likelihood × blast_radius) / (detection + fix_effort × 2)

Recency as a multiplier (not inside the product) to keep score ranges sane:

Recency   Meaning                    Multiplier
Hot       20+ commits in 2 weeks     × 2.0
Active    5–19 commits in 2 weeks    × 1.5
Stable    < 5 commits in 2 weeks     × 1.0
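
For illustration, the proposed formula and recency table could be sketched in Python as follows. Only the formula, the commit-count boundaries, and the three multipliers come from the proposal; the names `RECENCY_MULTIPLIER`, `recency_tier`, and `utility_v3`, and the example inputs, are assumptions made for this sketch.

```python
# Multipliers from the recency table; the dictionary name is hypothetical.
RECENCY_MULTIPLIER = {"hot": 2.0, "active": 1.5, "stable": 1.0}


def recency_tier(commits_in_two_weeks: int) -> str:
    """Map a two-week commit count to a recency tier, following the table."""
    if commits_in_two_weeks >= 20:
        return "hot"
    if commits_in_two_weeks >= 5:
        return "active"
    return "stable"


def utility_v3(impact, likelihood, blast_radius, detection, fix_effort,
               commits_in_two_weeks):
    """utility = recency × (impact × likelihood × blast_radius) / (detection + fix_effort × 2)"""
    recency = RECENCY_MULTIPLIER[recency_tier(commits_in_two_weeks)]
    return recency * (impact * likelihood * blast_radius) / (detection + fix_effort * 2)


# Illustrative inputs only (made up for this example).
print(utility_v3(impact=4, likelihood=3, blast_radius=2,
                 detection=2, fix_effort=2, commits_in_two_weeks=25))  # 8.0
```

With the same inputs in a stable area (fewer than 5 commits), the multiplier is 1.0 and the score is 4.0.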

Key mapping (source areas → spec areas):

- "area-a" is cross-cutting: touches module-1, module-2, module-3, module-4

- "area-b" maps cleanly to module-2

- "area-c" maps to module-3 and module-5

Important caveat: v3 is a point-in-time pre-release formula, not permanent. Recency must be refreshed before each scoring round.

Does this look right before I update the spec and re-score?

▸ I don't get "Recency as a multiplier (not inside the product) to keep score ..." What does "inside the product" have to do with this?

● Good question. Let me clarify.

The numerator is a product of three 1–5 dimensions: impact × likelihood × blast_radius. Max = 125.

If recency were a 4th term inside that product (also 1–5), the max becomes 625 — the existing thresholds (≥10 Critical, 5–9.9 High) would be meaningless because scores inflate across the board.

As a multiplier outside (1.0 / 1.5 / 2.0), it scales the final score. A v2 score of 5.0 becomes 10.0 in a hot area, or stays 5.0 in a stable area. The thresholds still work — they just mean "this issue is critical because it's both important AND in actively changing code."
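
The figures quoted in this answer can be reproduced with a few lines of Python. This is only an arithmetic check; the variable names are assumptions, and the numbers depend on the range given to recency (1–5 versus 1.0–2.0).

```python
DIMENSION_MAX = 5  # each numerator dimension is scored 1–5

# impact × likelihood × blast_radius at their maximum
max_numerator_v2 = DIMENSION_MAX ** 3
# the same product with recency added as a fourth 1–5 factor
max_numerator_recency_1_to_5 = DIMENSION_MAX ** 4
# the same product with recency capped at the "hot" multiplier of 2.0
max_numerator_recency_up_to_2 = 2.0 * DIMENSION_MAX ** 3

print(max_numerator_v2)               # 125
print(max_numerator_recency_1_to_5)   # 625
print(max_numerator_recency_up_to_2)  # 250.0

# A v2 score of 5.0 under each recency tier from the table.
v2_score = 5.0
scaled = {tier: v2_score * m
          for tier, m in {"hot": 2.0, "active": 1.5, "stable": 1.0}.items()}
print(scaled)  # {'hot': 10.0, 'active': 7.5, 'stable': 5.0}
```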

That said, the thresholds could also just be recalibrated if we put it inside. The multiplier approach is simpler but not the only way. What's your preference?

▸ What you're saying doesn't really add up with operation precedence right?

● You're right — that was nonsense. recency × (a × b × c) / d is the same as (recency × a × b × c) / d. Multiplication is associative. "Inside" vs "outside" the product makes no difference.
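
The equivalence is easy to confirm by brute force. The sketch below compares both groupings over every combination of 1–5 scores and the three recency multipliers, using `fractions.Fraction` so the comparison is exact rather than subject to floating-point rounding. The function name `placements_agree` and the denominator range 1–15 are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product


def placements_agree() -> bool:
    """Check recency × (a × b × c) / d == (recency × a × b × c) / d exactly."""
    recency_values = [Fraction(1), Fraction(3, 2), Fraction(2)]  # 1.0, 1.5, 2.0
    scores = range(1, 6)          # 1–5 dimensions
    denominators = range(1, 16)   # hypothetical range for detection + fix_effort × 2
    for r, a, b, c, d in product(recency_values, scores, scores, scores, denominators):
        outside = r * (a * b * c) / Fraction(d)
        inside = (r * a * b * c) / Fraction(d)
        if outside != inside:
            return False
    return True


print(placements_agree())  # True
```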

The actual point is just about the range of values we give recency. If it's 1–5 like the other dimensions, scores inflate more. If it's 1.0–2.0, they inflate less. That's a scale choice, not a placement choice. I shouldn't have dressed it up as something about the formula structure.

So the real question is: should recency be 1–5 (matching other dimensions) or a narrower range like 1.0–2.0? I'd lean narrow — recency is a tiebreaker that boosts active areas, not a dimension as fundamental as impact or blast radius. A 2× max boost feels proportionate. But it's your call.
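
One way to see the effect of the scale choice is to count how many score combinations reach the existing "≥10 Critical" threshold under each range. The sketch below is a rough illustration under loud assumptions: `detection` and `fix_effort` are taken to be 1–5 like the numerator dimensions (the conversation does not say), every combination and every recency value is weighted equally, and the function name `critical_share` is hypothetical.

```python
from itertools import product

CRITICAL_THRESHOLD = 10  # "≥10 Critical" from the existing thresholds
SCORES = range(1, 6)     # assumed 1–5 range for all five dimensions


def critical_share(recency_values):
    """Share of all score combinations whose utility reaches the Critical threshold."""
    total = 0
    critical = 0
    for recency in recency_values:
        for impact, likelihood, blast_radius, detection, fix_effort in product(SCORES, repeat=5):
            utility = recency * (impact * likelihood * blast_radius) / (detection + fix_effort * 2)
            total += 1
            critical += utility >= CRITICAL_THRESHOLD
    return critical / total


narrow = critical_share([1.0, 1.5, 2.0])  # recency as a 1.0–2.0 range
wide = critical_share([1, 2, 3, 4, 5])    # recency as a 1–5 dimension

print(f"recency 1.0–2.0: {narrow:.1%} of combinations are Critical")
print(f"recency 1–5:     {wide:.1%} of combinations are Critical")
```

Under these assumptions the wider range always pushes a larger share of combinations past the threshold, which is the inflation the narrow range is meant to limit.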

Noted.

Fair position. If you're curious what it looks like when an AI gets caught on basic associativity and self-corrects, scroll up and pick the other option.
