Outright win rate of one challenger (row) vs three copies of the opponent (column) in 4-player games, averaged across all five value charts. Ties count as non-wins.
How each AI opponent makes its decisions.
The Random AI is the simplest bot in the game. It has no strategy at all — every decision is made by rolling the dice.
When an auction card comes up, Random AI picks a bid uniformly at
random between 0 and the maximum legal bid. It
doesn't consider the value of the card, how many coins it has
left, or what other players might do. A treasure card worth 20
points is just as likely to get a bid of 0 as a bid at the cap.
When forced to reveal a gem from its hand, Random AI picks one at random with equal probability. It makes no attempt to hold back valuable colours or reveal ones that help its position.
Random AI serves as the baseline floor for the AI zoo. Any serious bot should comfortably beat it. It's also useful for quick testing — if a new AI can't even outperform random play, something is wrong.
The Heuristic AI is the first bot with an actual strategy. It estimates the value of each auction card and bids a fraction of that value, keeping coins in reserve for future rounds.
For treasure cards, the bot estimates how much the offered gems are worth. It does this by predicting what the Value Display will look like at the end of the game — gems it can see (its own hand, the current display) are counted exactly, while hidden gems (opponents' hands, the gem deck) are spread uniformly across unknown slots. Each gem's value is then looked up on the chart. Mission bonuses are also factored in: full credit if winning the gems would complete a mission, and partial credit if it moves the bot closer to one.
The final bid is 75% of that estimated value, capped
so the bot always keeps a reserve of coins for future treasure
auctions. The reserve is based on how many gems are still in the
supply and their expected average value.
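The treasure rule above can be sketched as follows. This is a minimal illustration, not the bot's actual code: the function name, the rounding-down step, and passing the reserve in as a precomputed number are all assumptions, since the text does not give the exact reserve formula.

```python
def heuristic_treasure_bid(estimated_value: float, coins: int, reserve: int,
                           fraction: float = 0.75) -> int:
    """Bid a fixed fraction of the estimated value without dipping into the reserve.

    `reserve` is assumed to be computed elsewhere from the gems left in the
    supply and their expected average value.
    """
    spendable = max(coins - reserve, 0)        # coins above the reserve
    target = int(fraction * estimated_value)   # 75% of the estimate, rounded down (assumed)
    return max(0, min(target, spendable))


# A card estimated at 20 points, with 30 coins and a reserve of 10:
bid = heuristic_treasure_bid(20, coins=30, reserve=10)
```

With plenty of spare coins the bid is simply 75% of the estimate; once the reserve starts to bind, the cap takes over and the bid shrinks to whatever is left above it.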
Investments return their face value at the end of the game on top
of the locked bid — they're free money. The bot bids its surplus
coins (everything above its reserve), and will always bid at
least 1 coin if it can, since any winning bid is
profitable.
Loans are net-negative (you pay back the full face value), so
the bot only takes them as a last resort — specifically, when it
has fewer than 5 coins and there are still at least
3 gems left in the supply worth competing for. Otherwise it
bids 0.
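The investment and loan rules above reduce to two small decisions, sketched here under stated assumptions. The function names are hypothetical, and the loan helper only answers whether the bot is willing to borrow, because the text does not say how much it bids once it is.

```python
def heuristic_invest_bid(coins: int, reserve: int) -> int:
    """Bid all surplus coins, and at least 1 coin whenever the bot has one."""
    surplus = max(coins - reserve, 0)
    if coins >= 1:
        return max(surplus, 1)
    return 0


def heuristic_wants_loan(coins: int, gems_in_supply: int) -> bool:
    """Borrow only as a last resort: fewer than 5 coins and at least 3 gems left."""
    return coins < 5 and gems_in_supply >= 3
```

For example, `heuristic_invest_bid(3, reserve=5)` still returns 1 even though the bot has no surplus, which reflects the "any winning bid is profitable" reasoning.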
When choosing which gem to reveal from its hand, the Heuristic AI picks the one that benefits it the most. It scores each gem by how much the Value Display would increase for that colour, weighted by how many more of that colour it owns compared to its opponents. A gem it holds lots of and opponents hold few of is worth boosting; a gem opponents hold more of is worth avoiding. Ties are broken by revealing colours the bot holds least of.
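One way to read the reveal rule is as a score per colour with a tie-break, sketched below. The exact scoring formula is not given in the text, so the product used here, the `display_gain` input, and all names are assumptions made for illustration.

```python
def choose_reveal(hand, own_counts, opp_counts, display_gain):
    """Pick the colour in hand with the best score; ties go to the least-held colour.

    display_gain[c]: assumed increase in the Value Display for colour c if revealed.
    own_counts[c] / opp_counts[c]: gems of colour c held by the bot / by opponents.
    """
    def key(colour):
        advantage = own_counts[colour] - opp_counts[colour]
        score = display_gain[colour] * advantage
        # Negated count so that, among equal scores, fewer held gems wins.
        return (score, -own_counts[colour])

    # sorted() only makes the iteration order deterministic.
    return max(sorted(set(hand)), key=key)


pick = choose_reveal(
    hand=["red", "blue"],
    own_counts={"red": 3, "blue": 1},
    opp_counts={"red": 1, "blue": 2},
    display_gain={"red": 2, "blue": 4},
)
```

Here red scores 2 × (3 − 1) = 4 while blue scores 4 × (1 − 2) = −4, so the bot reveals red: boosting blue would mostly help its opponents.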
Heuristic AI is the fitness opponent for the first generation of evolved bots. It provides a meaningful challenge — far above random play — and a stable baseline to measure whether genetic tuning is actually producing improvement.
The Evolved AI is the first bot whose strategy was discovered by a genetic algorithm rather than written by hand. It builds on the Heuristic AI's foundation but replaces the fixed 75% discount with three independent linear models — one each for treasure, investment, and loan auctions — whose 18 total weights were tuned by evolution.
Each of the three "heads" computes a discount factor between 0 and 1 from five game-state features: how far through the game you are (progress), the bot's own coin ratio relative to the remaining expected value, the average and top opponent coin ratios, and a variance proxy based on how many hidden gems remain and how sensitive the current value chart is. Each head has its own bias and five learned weights, so treasures, investments, and loans can each react differently to the same game state.
Instead of the Heuristic AI's simple point estimate of the
final Value Display, the Evolved AI uses a
hypergeometric distribution. It models the
unknown gems (in opponents' hands and the deck) as draws from
an urn without replacement, computing the full probability
distribution for each colour's final display count. The
expected value of each gem is then
E[chart_value(X)] rather than
chart_value(E[X]) — a significant accuracy
improvement on non-linear charts like chart E, where the value
peaks at 3 gems then crashes to 0.
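The gap between the two estimates is easy to see numerically. The sketch below uses a hypothetical chart shaped like the description of chart E (peak at 3, then 0); the actual chart values, the rounding used for the point estimate, and all function names are assumptions.

```python
from math import comb

# Hypothetical chart: value by final display count, peaking at 3 gems.
CHART = [0, 5, 12, 20, 0, 0]


def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k of `draws` gems are the colour), sampling without replacement."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def chart_value(count):
    """Look up a display count, clamping to the chart's last entry."""
    return CHART[min(count, len(CHART) - 1)]


def expected_chart_value(known, population, successes, draws):
    """E[chart_value(X)], where X = known gems + a hypergeometric draw."""
    lo = max(0, draws - (population - successes))
    hi = min(successes, draws)
    return sum(hypergeom_pmf(k, population, successes, draws) * chart_value(known + k)
               for k in range(lo, hi + 1))


def point_estimate_value(known, population, successes, draws):
    """chart_value(E[X]), with the mean rounded to the nearest count."""
    mean = known + draws * successes / population
    return chart_value(round(mean))


# 2 gems already on display; 2 more will come from 10 unknown gems, 3 of this colour.
full = expected_chart_value(known=2, population=10, successes=3, draws=2)
point = point_estimate_value(known=2, population=10, successes=3, draws=2)
```

In this example the point estimate lands on the chart's peak and reports 20, while the full expectation is about 14.9, because there is a real chance of ending on 2 gems (worth 12) or overshooting to 4 (worth 0).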
For treasures, the bot multiplies the hypergeometric value estimate (including mission bonuses) by the treasure head's discount, then caps the bid at its spendable coins. For investments, it applies the invest head's discount to its surplus coins and always bids at least 1 (free money). For loans, the loan head's discount is applied to the loan amount — a low discount effectively means "don't take this loan".
Inherited unchanged from the Heuristic AI — pick the gem whose colour boost benefits the bot the most relative to opponents.
The Evolved AI demonstrates that a genetic algorithm can discover bidding strategies that beat hand-tuned heuristics. It achieves an 81% win rate against three Heuristic AI opponents and serves as the training opponent for the next generation of bots.
Each head computes a discount factor:
discount = clamp(bias + w1*f1 + ... + w5*f5, 0, 1).
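For treasures, the discount is multiplied by the estimated value to produce the bid; the investment and loan heads apply it to surplus coins and to the loan amount respectively.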
A higher weight means the feature pushes the discount up (bid more aggressively).
| Weight | Value | Interpretation |
|---|---|---|
| bias | +0.27 | Low baseline discount (~27%) — starts conservative and lets the features decide |
| w_progress | -0.51 | Bids less aggressively as the game goes on — tightens up late to preserve coins |
| w_my_cash | +0.45 | Spends more freely when it has a large share of the remaining value in coins |
| w_avg_cash | +1.06 | Strongest weight. Bids much harder when opponents are cash-rich — refuses to be outbid when everyone has money |
| w_top_cash | -0.17 | Slightly pulls back when one specific opponent is very rich — avoids bidding wars it can't win |
| w_variance | +0.09 | Barely affected by chart uncertainty — trusts its value estimates |
The strongest signal in the treasure head is the w_avg_cash = +1.06 weight.
When opponents are flush with coins, this bot goes all-in on treasures rather than
letting them dominate the auctions. The negative progress weight means it front-loads
spending on treasures early and becomes a miser in the endgame.
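To make the treasure head concrete, here is a minimal sketch that plugs the table's weights into the discount formula. The feature values in the example, the assumption that every feature is scaled to roughly 0..1, and the rounding-down of the final bid are illustrative guesses rather than details from the bot.

```python
# Weights copied from the treasure table above.
TREASURE_HEAD = {
    "bias": 0.27, "progress": -0.51, "my_cash": 0.45,
    "avg_cash": 1.06, "top_cash": -0.17, "variance": 0.09,
}


def discount(head, features):
    """clamp(bias + sum(w_i * f_i), 0, 1)."""
    raw = head["bias"] + sum(head[name] * value for name, value in features.items())
    return min(max(raw, 0.0), 1.0)


def evolved_treasure_bid(head, features, estimated_value, spendable):
    """Discounted value estimate, capped at the coins the bot can spend."""
    return min(int(discount(head, features) * estimated_value), spendable)


# Hypothetical mid-game state.
state = {"progress": 0.5, "my_cash": 0.4, "avg_cash": 0.3,
         "top_cash": 0.5, "variance": 0.2}
d = discount(TREASURE_HEAD, state)
bid = evolved_treasure_bid(TREASURE_HEAD, state, estimated_value=20, spendable=10)
```

For this state the discount works out to about 0.45, so a card estimated at 20 points draws a bid of 8. Raising `avg_cash` moves the discount faster than any other feature, which is the behaviour described above.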
| Weight | Value | Interpretation |
|---|---|---|
| bias | -0.38 | Negative baseline — rarely invests unless features push it up |
| w_progress | -0.18 | Less willing to invest as the game goes on |
| w_my_cash | +0.70 | Invests heavily when it has lots of coins relative to remaining value |
| w_avg_cash | -1.08 | Strongest weight. Avoids investing when opponents also have lots of cash — prefers to save coins for treasure fights |
| w_top_cash | +0.65 | But if one specific opponent is way richer, it does invest — concedes the next auction to lock in safe returns |
| w_variance | +0.08 | Negligible effect from chart uncertainty |
The invest head spends freely when the bot is cash-rich (w_my_cash = +0.70), but avoids locking up coins when opponents
also have money to spend on treasures (w_avg_cash = -1.08). The exception is
when one opponent is so far ahead that competing for treasures seems futile; then it
pivots to safe investment returns.
| Weight | Value | Interpretation |
|---|---|---|
| bias | -0.52 | Strongly avoids loans by default — net-negative cash flow |
| w_progress | +0.92 | Near-strongest weight. Increasingly willing to take loans late in the game for last-minute leverage |
| w_my_cash | +0.12 | Small effect: neither desperate nor cautious about current cash |
| w_avg_cash | -0.09 | Barely affected by opponent cash levels |
| w_top_cash | -0.44 | Avoids loans when the top opponent is rich; doesn't want to go deeper in debt against a strong player |
| w_variance | +0.98 | Strongest weight. Strongly takes loans when the situation is uncertain; gambles on high-variance charts |
w_progress = +0.92
combined with the high w_variance = +0.98 means this bot will take loans
late in uncertain games — a "Hail Mary" play where the extra coins might swing a
close outcome. Early in the game, the negative bias and w_top_cash keep it cautious.
Evo2 is a clean-slate redesign that throws out the hand-coded scaffolding of the Evolved AI and lets the genetic algorithm discover everything from scratch. Instead of computing a discount fraction and multiplying it by an estimated value, each head outputs the bid in coins directly — a simple linear formula whose 19 total weights encode the full bidding policy.
Each head computes
bid = bias + w1*f1 + w2*f2 + ... and the
result is clamped to [0, max_legal_bid].
There's no intermediate "discount times value" step, which
frees the GA from the implicit assumption that bids should
scale proportionally with estimated value.
The progress proxy (fraction of the auction deck consumed) is replaced by an exact expected number of rounds remaining, calculated in closed form from a multivariate hypergeometric over the known auction deck composition and remaining gem supply. Cash features switch from ratios to raw coin counts (my coins, average opponent coins, top opponent coins), leaving the GA to learn the right scaling. The treasure head also gets two new per-card features: the expected value and standard deviation of the prize, both derived from the hypergeometric gem distribution.
On top of the existing hard/soft mission bonuses, the treasure head factors in a mission probability delta for each active mission: the difference between "probability I win this mission if I take the gems" and "probability I win it if the richest opponent takes them", scaled by the mission's coin reward. This captures the competitive value of denying missions to opponents, not just pursuing your own.
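The mission probability delta described above is a difference of two probabilities scaled by the reward. A minimal sketch, assuming the two probabilities are already available from some estimator (the `Mission` fields and the function name are hypothetical):

```python
from dataclasses import dataclass


@dataclass
class Mission:
    p_if_i_win: float       # P(I win the mission | I take the gems)
    p_if_rival_wins: float  # P(I win the mission | the richest opponent takes them)
    reward: float           # coin reward for the mission


def mission_delta_value(missions):
    """Sum of (probability swing x reward) over all active missions."""
    return sum((m.p_if_i_win - m.p_if_rival_wins) * m.reward for m in missions)


active = [Mission(0.6, 0.2, 5), Mission(0.3, 0.4, 4)]
delta = mission_delta_value(active)
```

The first mission contributes 0.4 × 5 = 2.0 and the second contributes −0.1 × 4 = −0.4, for a total of 1.6. A negative term means taking the gems would actually hurt the bot's chances on that mission.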
Uses the same reveal logic as the Heuristic AI — boost colours you hold more of than opponents.
Evo2 proves that removing hand-designed constraints and letting the GA learn freely produces a stronger bot. It was trained against a mix of all prior bots (Random, Heuristic, and Evolved) to avoid overfitting to any single opponent, and achieves a 69% pooled win rate across that field.
Each head computes the bid directly in coins:
bid = clamp(bias + w1*f1 + ... + wN*fN, 0, cap).
Unlike the Evolved AI's discount model, there is no intermediate value estimate
being scaled — the weights encode the full bidding policy.
| Weight | Value | Interpretation |
|---|---|---|
| bias | +0.77 | Small positive floor — always willing to bid a bit on treasures |
| w_rounds | -0.16 | Bids slightly less with more rounds remaining — saves coins for later opportunities |
| w_my_coins | +0.12 | Modestly increases bids when it has more coins |
| w_avg_opp | +0.04 | Barely reacts to average opponent wealth |
| w_top_opp | -0.00 | Essentially ignores the richest opponent — a big shift from Evolved AI's sensitivity |
| w_ev | +0.33 | Dominant weight. Bids roughly 33% of the estimated treasure value — the core of its pricing logic |
| w_std | -0.03 | Slightly cautious on high-uncertainty prizes — prefers predictable value |
The treasure head is dominated by w_ev = +0.33: the bot bids about a third of
what it thinks the treasure is worth. The GA arrived at a much simpler strategy than
Evolved AI's complex opponent-sensitivity: just price the card itself and largely ignore what
opponents have. The small w_my_coins and w_rounds terms
add basic budget awareness.
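The direct-bid formula can be traced by hand with the treasure weights from the table. In this sketch the feature values are made up, the feature units (raw coins, rounds, points) are assumed, and rounding the clamped result down to whole coins is an assumption as well.

```python
# Weights copied from the Evo2 treasure table above (w_top_opp shown as -0.00).
EVO2_TREASURE = {
    "bias": 0.77, "rounds": -0.16, "my_coins": 0.12, "avg_opp": 0.04,
    "top_opp": 0.0, "ev": 0.33, "std": -0.03,
}


def direct_bid(head, features, cap):
    """bid = clamp(bias + sum(w_i * f_i), 0, cap), in whole coins."""
    raw = head["bias"] + sum(head[name] * value for name, value in features.items())
    return int(min(max(raw, 0.0), cap))


# Hypothetical state: 6 rounds left, 20 coins, a prize worth about 24 +/- 5.
state = {"rounds": 6, "my_coins": 20, "avg_opp": 18, "top_opp": 25, "ev": 24, "std": 5}
bid = direct_bid(EVO2_TREASURE, state, cap=20)
```

The raw sum here is about 10.7, of which 7.92 comes from the `ev` term alone, so the bid of 10 is mostly a price on the card itself.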
| Weight | Value | Interpretation |
|---|---|---|
| bias | +1.85 | Very high baseline — aggressively pursues every investment |
| w_rounds | +0.42 | Invests even more with more rounds remaining — lock in returns early |
| w_my_coins | +0.01 | Own coins barely affect invest bids — always wants them regardless |
| w_avg_opp | -0.28 | Invests less when opponents are richer — saves coins for treasure competition |
| w_top_opp | +0.16 | But invests more when the leader is far ahead — pivots to safe returns when outmatched |
| w_amount | -0.05 | Barely distinguishes between invest sizes — treats all investments as equally desirable |
bias = +1.85
combined with w_rounds = +0.42 means the bot bids aggressively on investments
early in the game when there are many rounds to benefit from the returns. The opposing
w_avg_opp = -0.28 vs w_top_opp = +0.16 shows a nuanced strategy:
hold off when the field is rich, but if one player is pulling away, secure the safe income.
| Weight | Value | Interpretation |
|---|---|---|
| bias | -0.49 | Avoids loans by default — net-negative cash flow |
| w_rounds | -0.29 | Even less willing with more rounds left — avoids early debt |
| w_my_coins | +0.11 | More willing to take loans when cash-rich — uses leverage to amplify a lead |
| w_avg_opp | +0.21 | Takes loans under competitive pressure — borrows to keep pace with richer opponents |
| w_top_opp | -0.05 | Small pullback vs a rich leader |
| w_amount | +0.19 | Prefers larger loans when it does borrow — go big or go home |
The negative bias and w_rounds mean
loans are almost never taken early. But when opponents are richer (w_avg_opp = +0.21)
and the bot needs to compete, it takes the bigger loan (w_amount = +0.19) as
a calculated gamble to stay in the fight.
Evo3 was the previous champion, and the first bot to watch opponent behaviour and adjust in real time. It extends Evo2 with an opponent-pricing signal that measures how much the highest opponent bid exceeds what Evo3 would have bid. Its 25 weights were tuned by evolution against all prior bots.
After every round, the engine tells each player what happened. Evo3 uses this to record how much the highest opponent bid exceeded a baseline — the bid Evo3 itself would have made with no history. Over the course of a game, this builds up a running log of how aggressively opponents are bidding compared to Evo3's default pricing.
Each of the three heads reads two new features: the weighted mean and standard deviation of the opponent-delta history. Observations from the same category (e.g. treasure history when bidding on a treasure) are weighted 4x more heavily than cross-category observations, since how opponents price loans says less about how they'll price treasures.
A subtle but critical detail: the baseline used to measure opponent deltas is what Evo3 would have bid using the default history values (0, 1), not what it actually bid. If the actual bid were used, the signal would depend on Evo3's own learned response to that signal — a feedback loop. By pinning the baseline to the "no history" bid, the measurement stays stable regardless of how aggressively the weights react.
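The delta history and its two features can be sketched as a small class. The 4x same-category weight and the (0, 1) defaults come from the description above; the class and method names, the signed delta, and the use of a weighted population standard deviation are assumptions.

```python
from math import sqrt


class OpponentDeltaHistory:
    SAME_CATEGORY_WEIGHT = 4.0   # same-category observations count 4x
    CROSS_CATEGORY_WEIGHT = 1.0

    def __init__(self):
        self.observations = []   # list of (category, delta)

    def record(self, category, highest_opp_bid, baseline_bid):
        """Store how far the top opponent bid was from the no-history baseline bid."""
        self.observations.append((category, highest_opp_bid - baseline_bid))

    def features(self, category):
        """Weighted (mean, std) of the deltas; (0, 1) when there is no history."""
        if not self.observations:
            return 0.0, 1.0
        weights = [self.SAME_CATEGORY_WEIGHT if c == category
                   else self.CROSS_CATEGORY_WEIGHT
                   for c, _ in self.observations]
        deltas = [d for _, d in self.observations]
        total = sum(weights)
        mean = sum(w * d for w, d in zip(weights, deltas)) / total
        var = sum(w * (d - mean) ** 2 for w, d in zip(weights, deltas)) / total
        return mean, sqrt(var)


history = OpponentDeltaHistory()
history.record("treasure", highest_opp_bid=8, baseline_bid=5)   # delta +3
history.record("loan", highest_opp_bid=2, baseline_bid=4)       # delta -2
mean, std = history.features("treasure")
```

Note that `record` takes the baseline bid as an argument, never the bid actually placed. That is the point of the paragraph above: the stored deltas do not change if the weights that consume them change.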
Same reveal logic as all prior evolved bots — inherited from the Heuristic AI.
Evo3 was the first bot to adapt mid-game by reading opponent behaviour — bidding more aggressively against passive opponents and pulling back against aggressive ones. It achieves a 72% pooled win rate against Random, Heuristic, and Evo2, and remains a strong opponent when you want a challenge without the per-color bid-signal inference that Evo4 layers on top.
Same direct-bid formula as Evo2, plus two opponent-awareness features per head.
The w_mean_delta and w_std_delta weights control how much
the bot reacts to opponent bidding patterns observed during the game.
| Weight | Value | Interpretation |
|---|---|---|
| bias | +0.95 | Moderate positive floor — slightly more aggressive baseline than Evo2 |
| w_rounds | -0.23 | Bids less with more rounds remaining — stronger patience than Evo2 |
| w_my_coins | +0.09 | Modest coin sensitivity — relies more on value estimation than budget |
| w_avg_opp | +0.20 | Bids harder when opponents are richer — competes for treasures under pressure |
| w_top_opp | -0.03 | Near-zero — top opponent doesn't matter much, uses opponent delta signal instead |
| w_ev | +0.29 | Core pricing logic. Bids ~29% of estimated value — more conservative than Evo2's 33% |
| w_std | -0.15 | Bids less when value is uncertain — avoids overpaying on risky cards |
| w_mean_delta | -0.03 | Slightly bids less when opponents overbid — avoids bidding wars on treasures |
| w_std_delta | -0.09 | Bids less when opponent behaviour is erratic — cautious around unpredictable players |
Evo3 prices treasures more conservatively than Evo2 (w_ev = 0.29 vs 0.33) but
compensates with w_avg_opp = +0.20, ramping up when opponents are flush.
w_std = -0.15 actively discounts uncertain cards, avoiding overpaying when
the final display value is hard to predict. The opponent-delta weights both
pull bids down (w_mean_delta = -0.03, w_std_delta = -0.09):
evolution learned that on treasures, it's better to step back when opponents are
aggressive or unpredictable rather than escalate.
| Weight | Value | Interpretation |
|---|---|---|
| bias | +2.04 | Highest bias of any head. Extremely eager to invest |
| w_rounds | +0.26 | Invests more with more rounds left — maximize time to earn returns |
| w_my_coins | -0.02 | Near-zero coin sensitivity — invest regardless of current balance |
| w_avg_opp | -0.37 | Pulls back when opponents are richer — saves coins for auction fights |
| w_top_opp | +0.33 | But invests more when one opponent leads — concedes auctions, secures safe returns |
| w_amount | +0.09 | Mild preference for larger investments |
| w_mean_delta | -0.08 | Slightly reduces investment when opponents have been overbidding |
| w_std_delta | +0.03 | Near-zero — opponent volatility barely affects investment decisions |
| Weight | Value | Interpretation |
|---|---|---|
| bias | -0.46 | Avoids loans by default; about as averse as Evo2 (-0.49) |
| w_rounds | -0.29 | Even less willing early in the game |
| w_my_coins | +0.01 | Near-zero — loan decisions no longer depend on current balance |
| w_avg_opp | +0.17 | Borrows when opponents are cash-rich — uses loans to stay competitive |
| w_top_opp | -0.04 | Slight avoidance vs a rich leader |
| w_amount | +0.34 | Strongest feature weight. Strongly prefers bigger loans: if borrowing, go big |
| w_mean_delta | -0.09 | Avoids loans when opponents overbid — no longer borrows to escalate |
| w_std_delta | +0.18 | Takes loans when opponent behaviour is volatile — borrows to exploit chaos |
w_mean_delta = -0.09 avoids loans when opponents consistently overbid,
while w_std_delta = +0.18 takes loans when opponents
are volatile. The strategy: exploit unpredictable opponents by borrowing
to fund opportunistic bids when they under-bid, rather than escalating against
consistent aggressors. Combined with w_avg_opp = +0.17, the loan head
says "borrow when opponents are rich and erratic" — a subtle read of the table.
Evo4 is the current champion. It keeps everything Evo3 does — the direct-bid linear heads, the hypergeometric value estimates, the opponent-pricing deltas — and layers two new ideas on top of the treasure head: per-color bid-signal inference and an internal opponent-bid predictor. Its 35 weights were tuned by evolution against Random, Heuristic, Evolved, Evo2, and Evo3.
When opponents overbid on a treasure, Evo4 assumes they
probably already hold gems of the colours that were on
offer — if two players bid 8 on a treasure containing Blue
and one bids 5, the overbidders are likely sitting on Blue
gems in hand. Evo4 attributes the overbid (vs its own
baseline) across the offered gems' colours and accumulates
a persistent per-colour signal. When it later evaluates a
treasure, each colour's chart-index expectation is nudged
by color_bias_influence × color_signal[colour]
before the value lookup. The shift is linearly
interpolated between chart entries so small signals
produce smooth gradients the GA can learn from.
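A rough sketch of the two steps, accumulating the signal and applying the interpolated shift, is shown below. The even split of the overbid across offered gems, the sample chart, and the helper names are assumptions; only `color_bias_influence` and `color_signal` are names taken from the text.

```python
def accumulate_signal(color_signal, offered_colours, overbid):
    """Spread an overbid evenly over the offered gems (even split is assumed)."""
    share = overbid / len(offered_colours)
    for colour in offered_colours:
        color_signal[colour] = color_signal.get(colour, 0.0) + share


def interpolated_chart_value(chart, index):
    """Linear interpolation between neighbouring chart entries, clamped at the ends."""
    clamped = min(max(index, 0.0), len(chart) - 1)
    low = int(clamped)
    high = min(low + 1, len(chart) - 1)
    return chart[low] + (clamped - low) * (chart[high] - chart[low])


def shifted_value(chart, expected_index, signal, color_bias_influence=0.02):
    """Nudge the expected chart index by influence x signal before the lookup."""
    return interpolated_chart_value(chart, expected_index + color_bias_influence * signal)


color_signal = {}
accumulate_signal(color_signal, ["blue", "blue", "red"], overbid=6)

chart = [0, 4, 8, 12]   # hypothetical linear chart
value = shifted_value(chart, expected_index=2, signal=25)
```

With a signal of 25 and an influence of 0.02, the index moves from 2 to 2.5 and the looked-up value from 8 to 10. Without interpolation the same nudge would round to a whole index and small changes in the signal would have no visible effect.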
For every opponent seat, Evo4 runs a small internal
Evo2-style treasure head from that seat's
point of view — their coins become
my_coins, everyone else (including Evo4)
becomes the avg_opp / top_opp
bucket. The predicted bids are aggregated into
opp_max and opp_avg features
that the treasure head can weight. The internal
predictor's own weights are part of the evolvable genome,
so the GA learns how to model opponents rather than
inheriting a frozen Evo2 snapshot.
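The perspective swap can be sketched as a loop over opponent seats. This is an illustration under assumptions: the function and key names are hypothetical, each prediction is clamped to that seat's coins as a stand-in for the maximum legal bid, and the weights in the example are toy values chosen to keep the arithmetic easy to follow.

```python
def predict_opponent_bids(coins_by_seat, me, rounds, ev, std, weights):
    """Run an Evo2-style treasure head from each opponent's seat.

    Returns (opp_max, opp_avg) over the predicted opponent bids.
    """
    predictions = []
    for seat, coins in coins_by_seat.items():
        if seat == me:
            continue
        # From this seat's point of view, everyone else (including me) is "the opponents".
        others = [c for s, c in coins_by_seat.items() if s != seat]
        raw = (weights["bias"]
               + weights["rounds"] * rounds
               + weights["my"] * coins
               + weights["avg"] * (sum(others) / len(others))
               + weights["top"] * max(others)
               + weights["ev"] * ev
               + weights["std"] * std)
        predictions.append(min(max(raw, 0.0), coins))  # clamp to that seat's coins (assumed cap)
    return max(predictions), sum(predictions) / len(predictions)


# Toy weights: the modelled opponent bids half of its own coins.
toy = {"bias": 0.0, "rounds": 0.0, "my": 0.5, "avg": 0.0, "top": 0.0, "ev": 0.0, "std": 0.0}
seats = {"evo4": 10, "a": 8, "b": 4, "c": 6}
opp_max, opp_avg = predict_opponent_bids(seats, me="evo4", rounds=5, ev=20, std=3, weights=toy)
```

The two returned numbers are what the outer treasure head sees as its `opp_max` and `opp_avg` features. Because `weights` is an argument rather than a constant, the same code works with whatever predictor weights the GA evolves.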
Like Evo3, Evo4 uses a baseline bid — the bid it
would have made with the default
(0, 1) delta inputs and a zero color shift —
when computing the opponent delta. Both the opp-delta
history and the per-colour signal share this single
consistent baseline, so the first treasure round of the
game produces a zero signal regardless of what the weights
do.
Same reveal logic as every other evolved bot — inherited from the Heuristic AI.
Evo4 is the first bot to extract private information from opponent bid magnitudes, not just bid timing. By inferring which colours opponents are hoarding and by modelling what they will bid next round, it gets a systematic edge over Evo3 on treasures without touching invest or loan bidding — those heads are structurally identical to Evo3's.
Same direct-bid formula as Evo3 on every head, plus two
opponent-bid-prediction features on the treasure head
(w_opp_max, w_opp_avg), a scalar
color_bias_influence that controls how much the
per-colour signal warps the treasure EV, and a nested 7-weight
internal Evo2-style head used to predict each opponent's bid.
| Weight | Value | Interpretation |
|---|---|---|
| bias | +1.06 | Highest treasure bias of any evo bot — aggressive baseline floor |
| w_rounds | -0.35 | Strongest "patience" weight yet — pulls back hard when many rounds remain, so late-game bids hit harder |
| w_my_coins | +0.10 | Modest coin sensitivity — mostly trusts its value estimate |
| w_avg_opp | +0.17 | Bids harder when opponents are cash-rich — slightly less reactive than Evo3 on this axis |
| w_top_opp | +0.02 | Near-zero — the rich leader is handled through the opp-predictor features instead |
| w_ev | +0.31 | Core pricing logic. Bids ~31% of the (bias-adjusted) estimated treasure value |
| w_std | -0.11 | Slightly pulls back on high-uncertainty treasures — similar to Evo3 |
| w_mean_delta | -0.02 | Small step-back when opponents overbid — avoids treasure bidding wars |
| w_std_delta | -0.09 | Cautious around volatile opponents |
| w_opp_max | +0.02 | New feature. Mildly escalates with the strongest predicted opponent bid — enough to contest but not chase |
| w_opp_avg | -0.01 | New feature. Tiny negative — when the whole field is predicted to bid high, hold back slightly |
w_rounds = -0.35: Evo4 learned to be even more
patient early, banking coins for treasures it can price more accurately later when the
per-colour signal has matured. The new w_opp_max = +0.02 and
w_opp_avg = -0.01 are small in magnitude but differ in sign — escalate
against the strongest predicted opponent, hold back against a uniformly rich
field — a nuance Evo3 could not express.
| Weight | Value | Interpretation |
|---|---|---|
| bias | +1.79 | Very high baseline — still aggressively chases every investment |
| w_rounds | +0.21 | Invests more with more rounds left — maximise time to earn returns |
| w_my_coins | -0.11 | Slightly less when coin-rich — saves coins for treasure fights |
| w_avg_opp | -0.25 | Pulls back when opponents are cash-rich — lets them spend first |
| w_top_opp | +0.30 | But invests more when one opponent leads — concedes auctions, secures safe returns |
| w_amount | +0.25 | Strong preference for larger investments — prioritises the high-return cards |
| w_mean_delta | +0.16 | Flipped from Evo3. Invests more when opponents overbid — doubles down on safe income when treasures are being contested |
| w_std_delta | +0.10 | Modest positive — a bit more eager in volatile games |
The invest head flips w_mean_delta from negative (-0.08 in Evo3) to +0.16. The reasoning:
when opponents consistently overbid on auctions, Evo4's treasure head steps back
(w_mean_delta = -0.02 on treasure) and reallocates those coins into
investments. It's a unified "concede chaos, lock in returns" policy.
| Weight | Value | Interpretation |
|---|---|---|
| bias | -0.43 | Avoids loans by default — net-negative cash flow |
| w_rounds | -0.30 | Even less willing early in the game |
| w_my_coins | +0.05 | Small positive — slightly more willing to leverage when already rich |
| w_avg_opp | +0.07 | Borrows a little more under competitive pressure |
| w_top_opp | +0.06 | Small positive — a shift from Evo3, which avoided loans vs a rich leader |
| w_amount | +0.35 | Strongest feature weight. Strongly prefers bigger loans: "go big or go home" |
| w_mean_delta | -0.05 | Slightly avoids loans when opponents are overbidding |
| w_std_delta | +0.03 | Mild positive — small gambler's instinct in chaotic games |
w_amount = +0.35 is preserved — if Evo4 borrows at all, it
borrows big. The small positive w_top_opp is a subtle shift: Evo4 is
slightly more willing than Evo3 to take loans against a single runaway leader, trading
debt for late-game leverage.
| Weight | Value | Interpretation |
|---|---|---|
| color_bias_influence | +0.02 | Scales the per-colour signal into a chart-index shift — small but non-zero |
| Weight | Value | Interpretation |
|---|---|---|
| bias | +1.02 | Assumes every opponent has a healthy baseline willingness to bid |
| w_rounds | +0.03 | Near-zero — predicted opponents don't get more or less patient over time |
| w_my | +0.22 | Modelled opponent bids more when they're cash-rich |
| w_avg | +0.42 | Assumes opponents react strongly to the average wealth at the table |
| w_top | +0.94 | Dominant weight. Assumes every opponent treats the richest player as the primary threat — a very aggressive model |
| w_ev | +0.01 | Near-zero — the modelled opponent barely uses Evo4's own EV estimate, relying instead on coin signals |
| w_std | +0.15 | Mild positive — modelled opponents are slightly more willing to bid on uncertain prizes |
w_top = +0.94 dwarfs
every other feature weight (only the bias is larger): Evo4 assumes opponents obsess over the richest seat at the table,
which makes the predicted opp_max track the leader's wealth closely. Pair
that with the near-zero w_ev = +0.01 and the predictor is effectively
saying "opponents don't price treasures, they price each other's coin piles". Whether
or not that reflects reality, it gives Evo4's outer treasure head a useful
leader-tracking signal via the opp_max feature.