MegaGem — Bot Descriptions

Random AI

Very easy

The Random AI is the simplest bot in the game. It has no strategy at all — every decision is made by rolling the dice.

Bidding

When an auction card comes up, Random AI picks a bid uniformly at random between 0 and the maximum legal bid. It doesn't consider the value of the card, how many coins it has left, or what other players might do. A treasure card worth 20 points is just as likely to get a bid of 0 as a bid at the cap.

Revealing gems

When forced to reveal a gem from its hand, Random AI picks one at random with equal probability. It makes no attempt to hold back valuable colours or reveal ones that help its position.
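As a rough illustration, both Random AI decisions reduce to a single call on a random number generator. The function names and the list-of-strings hand representation below are assumptions made for this sketch, not the game's actual API.

```python
import random

def random_bid(max_legal_bid: int, rng: random.Random) -> int:
    """Uniform bid over 0..max_legal_bid inclusive; ignores card value and coins."""
    return rng.randint(0, max_legal_bid)

def random_reveal(hand: list[str], rng: random.Random) -> str:
    """Every gem in hand is equally likely to be revealed."""
    return rng.choice(hand)

rng = random.Random(42)  # seeded only so the sketch is repeatable
bids = [random_bid(10, rng) for _ in range(1000)]
revealed = random_reveal(["red", "blue", "blue"], rng)
```

Because a hand is a list of individual gems, a colour held twice is twice as likely to be revealed as a colour held once.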

Why it exists

Random AI serves as the baseline floor for the AI zoo. Any serious bot should comfortably beat it. It's also useful for quick testing — if a new AI can't even outperform random play, something is wrong.

Heuristic AI

Medium

The Heuristic AI is the first bot with an actual strategy. It estimates the value of each auction card and bids a fraction of that value, keeping coins in reserve for future rounds.

Treasure bidding

For treasure cards, the bot estimates how much the offered gems are worth. It does this by predicting what the Value Display will look like at the end of the game — gems it can see (its own hand, the current display) are counted exactly, while hidden gems (opponents' hands, the gem deck) are spread uniformly across unknown slots. Each gem's value is then looked up on the chart. Mission bonuses are also factored in: full credit if winning the gems would complete a mission, and partial credit if it moves the bot closer to one.

The final bid is 75% of that estimated value, capped so the bot always keeps a reserve of coins for future treasure auctions. The reserve is based on how many gems are still in the supply and their expected average value.
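The bid rule above can be sketched as follows. The value estimate and the reserve are taken as inputs because their exact formulas are not given here; the function name and the choice to round down to whole coins are assumptions of this sketch.

```python
def heuristic_treasure_bid(estimated_value: float, coins: int, reserve: int) -> int:
    """Bid 75% of the estimated value, never dipping into the reserve."""
    spendable = max(0, coins - reserve)        # coins available above the reserve
    target = int(0.75 * estimated_value)       # assumption: round down to whole coins
    return max(0, min(target, spendable))

comfortable = heuristic_treasure_bid(estimated_value=20, coins=30, reserve=10)
squeezed = heuristic_treasure_bid(estimated_value=20, coins=12, reserve=10)
broke = heuristic_treasure_bid(estimated_value=20, coins=5, reserve=10)
```

In the first call the 75% target (15) fits within the spendable coins; in the second the reserve cap binds and the bid drops to 2; in the third the bot has nothing above its reserve and bids 0.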

Investment bidding

An investment locks up the winning bid and pays it back at the end of the game together with the card's face value, so any winning bid is free money. The bot bids its surplus coins (everything above its reserve), and will always bid at least 1 coin if it can.

Loan bidding

Loans are net-negative (you pay back the full face value), so the bot only takes them as a last resort — specifically, when it has fewer than 5 coins and there are still at least 3 gems left in the supply worth competing for. Otherwise it bids 0.
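The investment and loan rules described above might look like this in code. The function names are hypothetical, the reserve is passed in rather than computed, and the loan rule is expressed only as a yes/no decision because the size of the bid it places is not stated.

```python
def heuristic_invest_bid(coins: int, reserve: int) -> int:
    """Bid all surplus coins, and at least 1 coin whenever the bot has any."""
    if coins <= 0:
        return 0
    surplus = max(0, coins - reserve)
    return max(1, surplus)

def heuristic_wants_loan(coins: int, gems_left_in_supply: int) -> bool:
    """Last-resort rule: nearly broke, and still enough gems worth competing for."""
    return coins < 5 and gems_left_in_supply >= 3

rich_bid = heuristic_invest_bid(coins=20, reserve=12)
poor_bid = heuristic_invest_bid(coins=5, reserve=12)
```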

Revealing gems

When choosing which gem to reveal from its hand, the Heuristic AI picks the one that benefits it the most. It scores each gem by how much the Value Display would increase for that colour, weighted by how many more of that colour it owns compared to its opponents. A gem it holds lots of and opponents hold few of is worth boosting; a gem opponents hold more of is worth avoiding. Ties are broken by revealing colours the bot holds least of.
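One possible reading of this scoring rule is sketched below. The inputs (a per-colour display gain and per-colour holdings for the bot and its opponents) and the interpretation of the tie-break as "fewest copies in hand" are assumptions, since the exact formula is not spelled out.

```python
def choose_reveal(hand, value_gain, my_holdings, opp_holdings):
    """Return the colour to reveal: highest score, ties go to the colour held least in hand."""
    def key(colour):
        advantage = my_holdings[colour] - opp_holdings[colour]
        score = value_gain[colour] * advantage
        return (score, -hand.count(colour))
    # sorted() keeps the result deterministic when two colours tie completely
    return max(sorted(set(hand)), key=key)

hand = ["red", "blue", "blue"]
gain = {"red": 4, "blue": 4}
clear_pick = choose_reveal(hand, gain, {"red": 3, "blue": 1}, {"red": 1, "blue": 2})
tie_pick = choose_reveal(hand, gain, {"red": 2, "blue": 2}, {"red": 1, "blue": 1})
```

In the first call red scores 8 and blue scores -4, so red is revealed. In the second both colours score 4 and the tie-break picks red, which the bot holds fewer of in hand.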

Why it exists

Heuristic AI is the fitness opponent for the first generation of evolved bots. It provides a meaningful challenge — far above random play — and a stable baseline to measure whether genetic tuning is actually producing improvement.

Evolved AI

Hard

The Evolved AI is the first bot whose strategy was discovered by a genetic algorithm rather than written by hand. It builds on the Heuristic AI's foundation but replaces the fixed 75% discount with three independent linear models — one each for treasure, investment, and loan auctions — whose 18 total weights were tuned by evolution.

How the discount works

Each of the three "heads" computes a discount factor between 0 and 1 from five game-state features: how far through the game you are (progress), the bot's own coin ratio relative to the remaining expected value, the average and top opponent coin ratios, and a variance proxy based on how many hidden gems remain and how sensitive the current value chart is. Each head has its own bias and five learned weights, so treasures, investments, and loans can each react differently to the same game state.

Better value estimation

Instead of the Heuristic AI's simple point estimate of the final Value Display, the Evolved AI uses a hypergeometric distribution. It models the unknown gems (in opponents' hands and the deck) as draws from an urn without replacement, computing the full probability distribution for each colour's final display count. The expected value of each gem is then E[chart_value(X)] rather than chart_value(E[X]) — a significant accuracy improvement on non-linear charts like chart E, where the value peaks at 3 gems then crashes to 0.
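The gap between the two estimates is easy to see on a small example. The sketch below assumes a simplified urn model for a single colour (N unseen gems, K of them that colour, n of them ending up in the display) and uses a made-up chart that peaks at 3 and then drops to 0; it is not the game's real chart data.

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(exactly k of the n drawn gems are the target colour)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def expected_chart_value(chart, visible, N, K, n):
    """E[chart(X)] where X = visible + hypergeometric draw."""
    lo, hi = max(0, n - (N - K)), min(K, n)
    return sum(
        hypergeom_pmf(k, N, K, n) * chart[min(visible + k, len(chart) - 1)]
        for k in range(lo, hi + 1)
    )

def point_estimate_value(chart, visible, N, K, n):
    """chart(E[X]): look up the chart at the rounded mean count."""
    mean_count = visible + n * K / N
    return chart[min(round(mean_count), len(chart) - 1)]

chart = [0, 4, 8, 12, 0, 0]   # illustrative only: peaks at 3 gems, then 0
full = expected_chart_value(chart, visible=2, N=10, K=2, n=5)
point = point_estimate_value(chart, visible=2, N=10, K=2, n=5)
```

With these numbers the point estimate lands exactly on the peak and returns 12, while the full distribution accounts for the chance of overshooting to 4 gems and gives roughly 8.44.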

Bidding

For treasures, the bot multiplies the hypergeometric value estimate (including mission bonuses) by the treasure head's discount, then caps the bid at its spendable coins. For investments, it applies the invest head's discount to its surplus coins and always bids at least 1 (free money). For loans, the loan head's discount is applied to the loan amount — a low discount effectively means "don't take this loan".

Revealing gems

Inherited unchanged from the Heuristic AI — pick the gem whose colour boost benefits the bot the most relative to opponents.

Why it exists

The Evolved AI demonstrates that a genetic algorithm can discover bidding strategies that beat hand-tuned heuristics. It achieves an 81% win rate against three Heuristic AI opponents and serves as the training opponent for the next generation of bots.

Learned weights (18 total)

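Each head computes a discount factor: discount = clamp(bias + w1*f1 + ... + w5*f5, 0, 1). The discount is then multiplied by the head's base quantity to produce the bid: the estimated value for treasures, the surplus coins for investments, and the loan amount for loans. A positive weight means the feature pushes the discount up (bid more aggressively); a negative weight pushes it down.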

Treasure head

Weight	Value	Interpretation
bias	+0.27	Low baseline discount (~27%) — starts conservative and lets the features decide
w_progress	-0.51	Bids less aggressively as the game goes on — tightens up late to preserve coins
w_my_cash	+0.45	Spends more freely when it has a large share of the remaining value in coins
w_avg_cash	+1.06	Strongest weight. Bids much harder when opponents are cash-rich — refuses to be outbid when everyone has money
w_top_cash	-0.17	Slightly pulls back when one specific opponent is very rich — avoids bidding wars it can't win
w_variance	+0.09	Barely affected by chart uncertainty — trusts its value estimates

The treasure head's defining trait is the huge w_avg_cash = +1.06 weight. When opponents are flush with coins, this bot goes all-in on treasures rather than letting them dominate the auctions. The negative progress weight means it front-loads spending on treasures early and becomes a miser in the endgame.
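To make the formula concrete, here is a small sketch that evaluates the treasure head with the weights from the table above. The feature values in the example are invented, and the assumption that every feature is a ratio on a roughly 0 to 1 scale is mine.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))

# Weights copied from the treasure head table
TREASURE_HEAD = {
    "bias": 0.27, "progress": -0.51, "my_cash": 0.45,
    "avg_cash": 1.06, "top_cash": -0.17, "variance": 0.09,
}

def discount(head: dict, features: dict) -> float:
    """discount = clamp(bias + sum(w_i * f_i), 0, 1)"""
    raw = head["bias"] + sum(head[name] * value for name, value in features.items())
    return clamp(raw, 0.0, 1.0)

# Hypothetical mid-game state
mid_game = {"progress": 0.5, "my_cash": 0.4, "avg_cash": 0.3, "top_cash": 0.5, "variance": 0.2}
d = discount(TREASURE_HEAD, mid_game)
bid = d * 20   # discount applied to an estimated treasure value of 20
```

For this state the discount comes out at about 0.446, so a treasure estimated at 20 would attract a bid of roughly 9 coins before the spendable-coins cap is applied.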

Invest head

Weight	Value	Interpretation
bias	-0.38	Negative baseline — rarely invests unless features push it up
w_progress	-0.18	Less willing to invest as the game goes on
w_my_cash	+0.70	Invests heavily when it has lots of coins relative to remaining value
w_avg_cash	-1.08	Strongest weight. Avoids investing when opponents also have lots of cash — prefers to save coins for treasure fights
w_top_cash	+0.65	But if one specific opponent is way richer, it does invest — concedes the next auction to lock in safe returns
w_variance	+0.08	Negligible effect from chart uncertainty

The invest head shows sophisticated resource management: it only invests when it can afford to (high w_my_cash), but avoids locking up coins when opponents also have money to spend on treasures (w_avg_cash = -1.08). The exception is when one opponent is so far ahead that competing for treasures seems futile — then it pivots to safe investment returns.

Loan head

Weight	Value	Interpretation
bias	-0.52	Strongly avoids loans by default — net-negative cash flow
w_progress	+0.92	Second-strongest weight. Increasingly willing to take loans late in the game for last-minute leverage
w_my_cash	+0.12	Small effect: neither desperate nor cautious about current cash
w_avg_cash	-0.09	Barely affected by opponent cash levels
w_top_cash	-0.44	Avoids loans when the top opponent is rich; doesn't want to go deeper in debt against a strong player
w_variance	+0.98	Strongest weight. Takes loans readily when the situation is uncertain, gambling on high-variance charts

The loan head reveals an interesting endgame gambit: its two largest weights, w_variance = +0.98 and w_progress = +0.92, mean this bot will take loans late in uncertain games, a "Hail Mary" play where the extra coins might swing a close outcome. Early in the game, the negative bias and w_top_cash keep it cautious.

Evo2 AI

Harder

Evo2 is a clean-slate redesign that throws out the hand-coded scaffolding of the Evolved AI and lets the genetic algorithm discover everything from scratch. Instead of computing a discount fraction and multiplying it by an estimated value, each head outputs the bid in coins directly — a simple linear formula whose 19 total weights encode the full bidding policy.

Direct bid output

Each head computes bid = bias + w1*f1 + w2*f2 + ... and the result is clamped to [0, max_legal_bid]. There's no intermediate "discount times value" step, which frees the GA from the implicit assumption that bids should scale proportionally with estimated value.

Smarter features

The progress proxy (fraction of the auction deck consumed) is replaced by an exact expected rounds remaining calculated via a closed-form multivariate hypergeometric over the known auction deck composition and remaining gem supply. Cash features switch from ratios to raw coin counts (my coins, average opponent coins, top opponent coins) and let the GA learn the right scaling. The treasure head also gets two new per-card features: the expected value and standard deviation of the prize, both derived from the hypergeometric gem distribution.

Mission probability delta

On top of the existing hard/soft mission bonuses, the treasure head factors in a mission probability delta for each active mission: the difference between "probability I win this mission if I take the gems" and "probability I win it if the richest opponent takes them", scaled by the mission's coin reward. This captures the competitive value of denying missions to opponents, not just pursuing your own.
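A minimal sketch of the delta term follows, assuming the two win probabilities for each mission have already been estimated elsewhere. The MissionOutlook type and its field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class MissionOutlook:
    p_if_i_take: float        # P(I win the mission | I take the gems)
    p_if_rival_takes: float   # P(I win the mission | richest opponent takes them)
    reward: int               # mission reward in coins

def mission_delta_bonus(missions) -> float:
    """Sum of (probability swing) * reward over all active missions."""
    return sum((m.p_if_i_take - m.p_if_rival_takes) * m.reward for m in missions)

active = [
    MissionOutlook(p_if_i_take=0.6, p_if_rival_takes=0.2, reward=10),  # worth contesting
    MissionOutlook(p_if_i_take=0.1, p_if_rival_takes=0.5, reward=5),   # taking the gems hurts
]
bonus = mission_delta_bonus(active)
```

The first mission contributes +4 and the second -2, so the treasure's value is adjusted by about +2 coins.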

Revealing gems

Uses the same reveal logic as the Heuristic AI — boost colours you hold more of than opponents.

Why it exists

Evo2 shows that removing hand-designed constraints and letting the GA learn freely can produce a stronger bot. It was trained against a mix of all prior bots (Random, Heuristic, and Evolved) to avoid overfitting to any single opponent, and achieves a 69% pooled win rate across that field.

Learned weights (19 total)

Each head computes the bid directly in coins: bid = clamp(bias + w1*f1 + ... + wN*fN, 0, cap). Unlike the Evolved AI's discount model, there is no intermediate value estimate being scaled — the weights encode the full bidding policy.

Treasure head (7 weights)

Weight	Value	Interpretation
bias	+0.77	Small positive floor — always willing to bid a bit on treasures
w_rounds	-0.16	Bids slightly less with more rounds remaining — saves coins for later opportunities
w_my_coins	+0.12	Modestly increases bids when it has more coins
w_avg_opp	+0.04	Barely reacts to average opponent wealth
w_top_opp	-0.00	Essentially ignores the richest opponent — a big shift from Evolved AI's sensitivity
w_ev	+0.33	Dominant weight. Bids roughly 33% of the estimated treasure value — the core of its pricing logic
w_std	-0.03	Slightly cautious on high-uncertainty prizes — prefers predictable value

The treasure head is dominated by w_ev = +0.33 — the bot bids about a third of what it thinks the treasure is worth. The GA arrived at a much simpler strategy than Evolved AI's complex opponent-sensitivity: just price the card itself and ignore what opponents have. The small w_my_coins and w_rounds terms add basic budget awareness.
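The sketch below plugs the treasure head weights from the table above into the direct-bid formula. The game state is invented, and the result is left as a float because the rounding rule to whole coins is not stated here.

```python
# Weights copied from the Evo2 treasure head table
EVO2_TREASURE = {
    "bias": 0.77, "rounds": -0.16, "my_coins": 0.12, "avg_opp": 0.04,
    "top_opp": 0.00, "ev": 0.33, "std": -0.03,
}

def direct_bid(head: dict, features: dict, cap: float) -> float:
    """bid = clamp(bias + sum(w_i * f_i), 0, cap), expressed in coins."""
    raw = head["bias"] + sum(head[name] * value for name, value in features.items())
    return max(0.0, min(cap, raw))

# Hypothetical state: 5 expected rounds left, 20 coins, prize worth 24 +/- 4
state = {"rounds": 5, "my_coins": 20, "avg_opp": 18, "top_opp": 25, "ev": 24, "std": 4}
bid = direct_bid(EVO2_TREASURE, state, cap=20)
bid_double_ev = direct_bid(EVO2_TREASURE, {**state, "ev": 48}, cap=20)
```

Doubling the prize's expected value raises the bid from about 10.89 to about 18.81, which is less than double. That is the practical effect of dropping the "discount times value" step: the bias and coin terms contribute a part of the bid that does not scale with value.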

Invest head (6 weights)

Weight	Value	Interpretation
bias	+1.85	Very high baseline — aggressively pursues every investment
w_rounds	+0.42	Invests even more with more rounds remaining — lock in returns early
w_my_coins	+0.01	Own coins barely affect invest bids — always wants them regardless
w_avg_opp	-0.28	Invests less when opponents are richer — saves coins for treasure competition
w_top_opp	+0.16	But invests more when the leader is far ahead — pivots to safe returns when outmatched
w_amount	-0.05	Barely distinguishes between invest sizes — treats all investments as equally desirable

The invest head learned that investments are almost always good: the bias = +1.85 combined with w_rounds = +0.42 means the bot bids aggressively on investments early in the game when there are many rounds to benefit from the returns. The opposing w_avg_opp = -0.28 vs w_top_opp = +0.16 shows a nuanced strategy: hold off when the field is rich, but if one player is pulling away, secure the safe income.

Loan head (6 weights)

Weight	Value	Interpretation
bias	-0.49	Avoids loans by default — net-negative cash flow
w_rounds	-0.29	Even less willing with more rounds left — avoids early debt
w_my_coins	+0.11	More willing to take loans when cash-rich — uses leverage to amplify a lead
w_avg_opp	+0.21	Takes loans under competitive pressure — borrows to keep pace with richer opponents
w_top_opp	-0.05	Small pullback vs a rich leader
w_amount	+0.19	Prefers larger loans when it does borrow — go big or go home

A clear "reluctant borrower" strategy: negative bias and negative w_rounds mean loans are almost never taken early. But when opponents are richer (w_avg_opp = +0.21) and the bot needs to compete, it takes the bigger loan (w_amount = +0.19) as a calculated gamble to stay in the fight.

Evo3 AI

Very hard

Evo3 was the previous champion, and the first bot to watch opponent behaviour and adjust in real time. It extends Evo2 with an opponent-pricing signal that measures how much the highest opponent bid exceeds what Evo3 would have bid. Its 25 weights were tuned by evolution against all prior bots.

Opponent-pricing signal

After every round, the engine tells each player what happened. Evo3 uses this to record how much the highest opponent bid exceeded a baseline — the bid Evo3 itself would have made with no history. Over the course of a game, this builds up a running log of how aggressively opponents are bidding compared to Evo3's default pricing.

Weighted history

Each of the three heads reads two new features: the weighted mean and standard deviation of the opponent-delta history. Observations from the same category (e.g. treasure history when bidding on a treasure) are weighted 4x more heavily than cross-category observations, since how opponents price loans says less about how they'll price treasures.
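A sketch of the weighting scheme, under a few assumptions: the history is a list of (category, delta) pairs, the standard deviation is the weighted population form, and an empty history falls back to the default (0, 1) pair.

```python
from math import sqrt

def weighted_stats(history, category, same_weight=4.0, default=(0.0, 1.0)):
    """Weighted mean and std of opponent deltas; same-category entries count 4x."""
    if not history:
        return default
    weights = [same_weight if cat == category else 1.0 for cat, _ in history]
    total = sum(weights)
    mean = sum(w * d for w, (_, d) in zip(weights, history)) / total
    var = sum(w * (d - mean) ** 2 for w, (_, d) in zip(weights, history)) / total
    return mean, sqrt(var)

history = [("treasure", 2.0), ("loan", -3.0)]
mean_t, std_t = weighted_stats(history, "treasure")
mean_l, std_l = weighted_stats(history, "loan")
```

When bidding on a treasure, the treasure observation dominates and the mean delta is 1.0. When bidding on a loan, the same history yields a mean of -2.0.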

No feedback loop

A subtle but critical detail: the baseline used to measure opponent deltas is what Evo3 would have bid using the default history values (0, 1), not what it actually bid. If the actual bid were used, the signal would depend on Evo3's own learned response to that signal — a feedback loop. By pinning the baseline to the "no history" bid, the measurement stays stable regardless of how aggressively the weights react.
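The idea can be shown with a stripped-down head. Here base_score is a hypothetical stand-in for the bias plus all non-history features, and the two history weights are taken from the Evo3 treasure head table further down.

```python
DEFAULT_HISTORY = (0.0, 1.0)   # (mean, std) used before any history exists

def treasure_bid(base_score, mean_delta, std_delta, w_mean=-0.03, w_std=-0.09):
    """Linear head reduced to its history terms plus everything else (base_score)."""
    return base_score + w_mean * mean_delta + w_std * std_delta

def opponent_delta(base_score, opponent_bids):
    """Measure opponents against the no-history baseline, not the actual bid."""
    baseline = treasure_bid(base_score, *DEFAULT_HISTORY)
    return max(opponent_bids) - baseline

delta = opponent_delta(6.0, [8, 5])
actual_bid = treasure_bid(6.0, mean_delta=4.0, std_delta=2.0)
```

The recorded delta (about 2.09) does not take the current history as an input at all, so it stays the same however far the actual bid (about 5.70 here) has drifted in response to earlier observations.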

Revealing gems

Same reveal logic as all prior evolved bots — inherited from the Heuristic AI.

Why it exists

Evo3 was the first bot to adapt mid-game by reading opponent behaviour, bidding more aggressively against passive opponents and pulling back against aggressive ones. It achieves a 72% pooled win rate against Random, Heuristic, and Evo2, and remains a strong opponent when you want a challenge without the per-colour bid-signal inference that Evo4 layers on top.

Learned weights (25 total)

Same direct-bid formula as Evo2, plus two opponent-awareness features per head. The w_mean_delta and w_std_delta weights control how much the bot reacts to opponent bidding patterns observed during the game.

Treasure head (9 weights)

Weight	Value	Interpretation
bias	+0.95	Moderate positive floor — slightly more aggressive baseline than Evo2
w_rounds	-0.23	Bids less with more rounds remaining — stronger patience than Evo2
w_my_coins	+0.09	Modest coin sensitivity — relies more on value estimation than budget
w_avg_opp	+0.20	Bids harder when opponents are richer — competes for treasures under pressure
w_top_opp	-0.03	Near-zero — top opponent doesn't matter much, uses opponent delta signal instead
w_ev	+0.29	Core pricing logic. Bids ~29% of estimated value — more conservative than Evo2's 33%
w_std	-0.15	Bids less when value is uncertain — avoids overpaying on risky cards
w_mean_delta	-0.03	Slightly bids less when opponents overbid — avoids bidding wars on treasures
w_std_delta	-0.09	Bids less when opponent behaviour is erratic — cautious around unpredictable players

The treasure head bids more conservatively than Evo2 (w_ev = 0.29 vs 0.33) but compensates with w_avg_opp = +0.20 — it ramps up when opponents are flush. w_std = -0.15 actively discounts uncertain cards, avoiding overpaying when the final display value is hard to predict. The opponent-delta weights both pull bids down (w_mean_delta = -0.03, w_std_delta = -0.09) — evolution learned that on treasures, it's better to step back when opponents are aggressive or unpredictable rather than escalate.

Invest head (8 weights)

Weight	Value	Interpretation
bias	+2.04	Highest bias of any head. Extremely eager to invest
w_rounds	+0.26	Invests more with more rounds left — maximize time to earn returns
w_my_coins	-0.02	Near-zero coin sensitivity — invest regardless of current balance
w_avg_opp	-0.37	Pulls back when opponents are richer — saves coins for auction fights
w_top_opp	+0.33	But invests more when one opponent leads — concedes auctions, secures safe returns
w_amount	+0.09	Mild preference for larger investments
w_mean_delta	-0.08	Slightly reduces investment when opponents have been overbidding
w_std_delta	+0.03	Near-zero — opponent volatility barely affects investment decisions

The invest head is remarkably similar to Evo2's — the GA reinforced the same "invest early, invest often" strategy. The opponent-delta features are near-zero for investments, which makes sense: investment bidding doesn't benefit much from knowing opponent aggression patterns since the payoff is deterministic.

Loan head (8 weights)

Weight	Value	Interpretation
bias	-0.46	Avoids loans by default, though slightly less strongly than Evo2 (-0.49)
w_rounds	-0.29	Even less willing early in the game
w_my_coins	+0.01	Near-zero — loan decisions no longer depend on current balance
w_avg_opp	+0.17	Borrows when opponents are cash-rich — uses loans to stay competitive
w_top_opp	-0.04	Slight avoidance vs a rich leader
w_amount	+0.34	Strongest weight. Strongly prefers bigger loans — if borrowing, go big
w_mean_delta	-0.09	Avoids loans when opponents overbid — no longer borrows to escalate
w_std_delta	+0.18	Takes loans when opponent behaviour is volatile — borrows to exploit chaos

The loan head's opponent-awareness shows a nuanced strategy. w_mean_delta = -0.09 avoids loans when opponents consistently overbid, while w_std_delta = +0.18 takes loans when opponents are volatile. The strategy: exploit unpredictable opponents by borrowing to fund opportunistic bids when they under-bid, rather than escalating against consistent aggressors. Combined with w_avg_opp = +0.17, the loan head says "borrow when opponents are rich and erratic" — a subtle read of the table.

Evo4 AI

Hardest

Evo4 is the current champion. It keeps everything Evo3 does (the direct-bid linear heads, the hypergeometric value estimates, the opponent-pricing deltas) and layers two new ideas on top of the treasure head: per-colour bid-signal inference and an internal opponent-bid predictor. Its 35 weights were tuned by evolution against Random, Heuristic, Evolved, Evo2, and Evo3.

Per-colour bid signal

When opponents overbid on a treasure, Evo4 assumes they probably already hold gems of the colours that were on offer — if two players bid 8 on a treasure containing Blue and one bids 5, the overbidders are likely sitting on Blue gems in hand. Evo4 attributes the overbid (vs its own baseline) across the offered gems' colours and accumulates a persistent per-colour signal. When it later evaluates a treasure, each colour's chart-index expectation is nudged by color_bias_influence × color_signal[colour] before the value lookup. The shift is linearly interpolated between chart entries so small signals produce smooth gradients the GA can learn from.
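The two steps described above, accumulating the signal and shifting the chart lookup, might be sketched like this. Splitting the overbid equally across the offered gems is an assumption, as are the function names and the example chart; the 0.02 influence matches the learned value listed further down.

```python
def accumulate_signal(signal: dict, offered_colours: list, overbid: float) -> None:
    """Spread an opponent overbid equally over the gems on offer (assumed rule)."""
    share = overbid / len(offered_colours)
    for colour in offered_colours:
        signal[colour] = signal.get(colour, 0.0) + share

def interpolated_chart_value(chart: list, index: float) -> float:
    """Linear interpolation between neighbouring chart entries."""
    index = max(0.0, min(float(len(chart) - 1), index))
    lo = int(index)
    hi = min(lo + 1, len(chart) - 1)
    frac = index - lo
    return chart[lo] * (1 - frac) + chart[hi] * frac

def shifted_chart_value(chart, expected_index, colour_signal, influence=0.02):
    """Nudge the expected chart index by influence * signal before the lookup."""
    return interpolated_chart_value(chart, expected_index + influence * colour_signal)

signal = {}
accumulate_signal(signal, ["blue", "blue", "red"], overbid=3.0)
chart = [0, 4, 8, 12, 16]   # illustrative chart, not real game data
value = shifted_chart_value(chart, expected_index=2.0, colour_signal=10.0)
```

A signal of 10 moves the index from 2.0 to 2.2, so the looked-up value rises smoothly from 8 to about 8.8 instead of jumping to the next chart entry.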

Internal opponent-bid predictor

For every opponent seat, Evo4 runs a small internal Evo2-style treasure head from that seat's point of view — their coins become my_coins, everyone else (including Evo4) becomes the avg_opp / top_opp bucket. The predicted bids are aggregated into opp_max and opp_avg features that the treasure head can weight. The internal predictor's own weights are part of the evolvable genome, so the GA learns how to model opponents rather than inheriting a frozen Evo2 snapshot.
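The seat-swapping trick can be sketched as below. The weights in the example are placeholders chosen for readable arithmetic, not the learned values, and capping each predicted bid at that seat's coins is an assumption about what "max legal bid" means.

```python
def predict_opponent_bids(weights, coins_by_seat, me, rounds_left, ev, std):
    """Run an Evo2-style treasure head from each opponent's seat; return (opp_max, opp_avg)."""
    predictions = []
    for seat, coins in coins_by_seat.items():
        if seat == me:
            continue
        # Everyone else, including Evo4 itself, is "the opposition" from this seat
        others = [c for s, c in coins_by_seat.items() if s != seat]
        raw = (
            weights["bias"]
            + weights["rounds"] * rounds_left
            + weights["my"] * coins
            + weights["avg"] * (sum(others) / len(others))
            + weights["top"] * max(others)
            + weights["ev"] * ev
            + weights["std"] * std
        )
        predictions.append(max(0.0, min(float(coins), raw)))
    return max(predictions), sum(predictions) / len(predictions)

# Placeholder weights for illustration only
demo_weights = {"bias": 1.0, "rounds": 0.0, "my": 0.1, "avg": 0.0, "top": 0.1, "ev": 0.2, "std": 0.0}
coins = {"evo4": 20, "a": 10, "b": 30}
opp_max, opp_avg = predict_opponent_bids(demo_weights, coins, me="evo4", rounds_left=5, ev=10, std=2)
```

Seat "a" is predicted to bid about 7 and seat "b" about 8, giving opp_max of roughly 8 and opp_avg of roughly 7.5 as inputs to the outer treasure head.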

No feedback loop (inherited)

Like Evo3, Evo4 uses a baseline bid — the bid it would have made with the default (0, 1) delta inputs and a zero color shift — when computing the opponent delta. Both the opp-delta history and the per-colour signal share this single consistent baseline, so the first treasure round of the game produces a zero signal regardless of what the weights do.

Revealing gems

Same reveal logic as every other evolved bot — inherited from the Heuristic AI.

Why it exists

Evo4 is the first bot to extract private information (which colours opponents are probably holding) from opponent bid magnitudes, rather than only measuring how aggressively they bid. By inferring which colours opponents are hoarding and by modelling what they will bid, it gets a systematic edge over Evo3 on treasures without touching invest or loan bidding; those heads are structurally identical to Evo3's.

Learned weights (35 total)

Same direct-bid formula as Evo3 on every head, plus two opponent-bid-prediction features on the treasure head (w_opp_max, w_opp_avg), a scalar color_bias_influence that controls how much the per-colour signal warps the treasure EV, and a nested 7-weight internal Evo2-style head used to predict each opponent's bid.

Treasure head (11 weights)

Weight	Value	Interpretation
bias	+1.06	Highest treasure bias of any evo bot — aggressive baseline floor
w_rounds	-0.35	Strongest "patience" weight yet — pulls back hard when many rounds remain, so late-game bids hit harder
w_my_coins	+0.10	Modest coin sensitivity — mostly trusts its value estimate
w_avg_opp	+0.17	Bids harder when opponents are cash-rich — slightly less reactive than Evo3 on this axis
w_top_opp	+0.02	Near-zero — the rich leader is handled through the opp-predictor features instead
w_ev	+0.31	Core pricing logic. Bids ~31% of the (bias-adjusted) estimated treasure value
w_std	-0.11	Slightly pulls back on high-uncertainty treasures — similar to Evo3
w_mean_delta	-0.02	Small step-back when opponents overbid — avoids treasure bidding wars
w_std_delta	-0.09	Cautious around volatile opponents
w_opp_max	+0.02	New feature. Mildly escalates with the strongest predicted opponent bid — enough to contest but not chase
w_opp_avg	-0.01	New feature. Tiny negative — when the whole field is predicted to bid high, hold back slightly

The treasure head is the centrepiece of Evo4's upgrade. The biggest single change from Evo3 is the much stronger w_rounds = -0.35: Evo4 learned to be even more patient early, banking coins for treasures it can price more accurately later when the per-colour signal has matured. The new w_opp_max = +0.02 and w_opp_avg = -0.01 are small in magnitude but differ in sign — escalate against the strongest predicted opponent, hold back against a uniformly rich field — a nuance Evo3 could not express.

Invest head (8 weights)

Weight	Value	Interpretation
bias	+1.79	Very high baseline — still aggressively chases every investment
w_rounds	+0.21	Invests more with more rounds left — maximise time to earn returns
w_my_coins	-0.11	Slightly less when coin-rich — saves coins for treasure fights
w_avg_opp	-0.25	Pulls back when opponents are cash-rich — lets them spend first
w_top_opp	+0.30	But invests more when one opponent leads — concedes auctions, secures safe returns
w_amount	+0.25	Strong preference for larger investments — prioritises the high-return cards
w_mean_delta	+0.16	Flipped from Evo3. Invests more when opponents overbid — doubles down on safe income when treasures are being contested
w_std_delta	+0.10	Modest positive — a bit more eager in volatile games

The invest head inherits Evo3's "invest early, invest often" philosophy but flips w_mean_delta from negative to +0.16. The reasoning: when opponents consistently overbid on auctions, Evo4's treasure head steps back (w_mean_delta = -0.02 on treasure) and reallocates those coins into investments. It's a unified "concede chaos, lock in returns" policy.

Loan head (8 weights)

Weight	Value	Interpretation
bias	-0.43	Avoids loans by default — net-negative cash flow
w_rounds	-0.30	Even less willing early in the game
w_my_coins	+0.05	Small positive — slightly more willing to leverage when already rich
w_avg_opp	+0.07	Borrows a little more under competitive pressure
w_top_opp	+0.06	Small positive — a shift from Evo3, which avoided loans vs a rich leader
w_amount	+0.35	Strongest weight. Strongly prefers bigger loans — "go big or go home"
w_mean_delta	-0.05	Slightly avoids loans when opponents are overbidding
w_std_delta	+0.03	Mild positive — small gambler's instinct in chaotic games

The loan head is the most conservative of the three, and the least changed from Evo3. The dominant w_amount = +0.35 is preserved — if Evo4 borrows at all, it borrows big. The small positive w_top_opp is a subtle shift: Evo4 is slightly more willing than Evo3 to take loans against a single runaway leader, trading debt for late-game leverage.

Colour bias influence (1 weight)

Weight	Value	Interpretation
color_bias_influence	+0.02	Scales the per-colour signal into a chart-index shift — small but non-zero

The GA kept the colour-bias influence small (~+0.02) rather than zero. That's enough for several rounds of consistent overbidding to nudge one colour's chart index up by a visible fraction, shifting Evo4's treasure EV estimate toward whichever colours opponents appear to be sitting on, without swamping the value estimate on any single round.

Internal opponent predictor (7 weights)

Weight	Value	Interpretation
bias	+1.02	Assumes every opponent has a healthy baseline willingness to bid
w_rounds	+0.03	Near-zero — predicted opponents don't get more or less patient over time
w_my	+0.22	Modelled opponent bids more when they're cash-rich
w_avg	+0.42	Assumes opponents react strongly to the average wealth at the table
w_top	+0.94	Dominant weight. Assumes every opponent treats the richest player as the primary threat — a very aggressive model
w_ev	+0.01	Near-zero — the modelled opponent barely uses Evo4's own EV estimate, relying instead on coin signals
w_std	+0.15	Mild positive — modelled opponents are slightly more willing to bid on uncertain prizes

The internal predictor is fascinating: evolution produced a model of opponents that is very different from Evo2's actual learned weights. The w_top = +0.94 dwarfs every other weight, whereas Evo2's own treasure head has w_top_opp at essentially zero. Evo4 assumes opponents obsess over the richest seat at the table, which makes the predicted opp_max track the leader's wealth closely. Pair that with the near-zero w_ev = +0.01 (against Evo2's +0.33) and the predictor is effectively saying "opponents don't price treasures, they price each other's coin piles". Whether or not that reflects reality, it gives Evo4's outer treasure head a useful leader-tracking signal via the opp_max feature.
