MegaGem


Performance Heatmap

Outright win rate of one challenger (row) vs three copies of the opponent (column) in 4-player games, averaged across all five value charts. Ties count as non-wins.


Bot Descriptions

How each AI opponent makes its decisions.

Random AI

Very easy

The Random AI is the simplest bot in the game. It has no strategy at all — every decision is made by rolling the dice.

When an auction card comes up, Random AI picks a bid uniformly at random between 0 and the maximum legal bid. It doesn't consider the value of the card, how many coins it has left, or what other players might do. A treasure card worth 20 points is just as likely to get a bid of 0 as a bid at the cap.
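The bidding rule above fits in one line. A minimal sketch, assuming a `max_legal_bid` cap supplied by the game engine (the function name and signature here are illustrative, not the game's actual API):

```python
import random

def random_bid(max_legal_bid: int) -> int:
    """Pick a bid uniformly at random between 0 and the cap, inclusive.
    No card value, coin count, or opponent modelling is consulted."""
    return random.randint(0, max_legal_bid)
```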

When forced to reveal a gem from its hand, Random AI picks one at random with equal probability. It makes no attempt to hold back valuable colours or reveal ones that help its position.

Random AI serves as the baseline floor for the AI zoo. Any serious bot should comfortably beat it. It's also useful for quick testing — if a new AI can't even outperform random play, something is wrong.

Heuristic AI

Medium

The Heuristic AI is the first bot with an actual strategy. It estimates the value of each auction card and bids a fraction of that value, keeping coins in reserve for future rounds.

For treasure cards, the bot estimates how much the offered gems are worth. It does this by predicting what the Value Display will look like at the end of the game — gems it can see (its own hand, the current display) are counted exactly, while hidden gems (opponents' hands, the gem deck) are spread uniformly across unknown slots. Each gem's value is then looked up on the chart. Mission bonuses are also factored in: full credit if winning the gems would complete a mission, and partial credit if it moves the bot closer to one.

The final bid is 75% of that estimated value, capped so the bot always keeps a reserve of coins for future treasure auctions. The reserve is based on how many gems are still in the supply and their expected average value.
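The 75% discount plus reserve cap can be sketched as follows; how the reserve itself is computed from the remaining supply is not shown, and the names are illustrative:

```python
def heuristic_treasure_bid(estimated_value: float, my_coins: int, reserve: int) -> int:
    """Bid 75% of the estimated card value, but never dip into the
    reserve held back for future treasure auctions."""
    spendable = max(0, my_coins - reserve)
    return min(int(0.75 * estimated_value), spendable)
```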

Investments return their face value at the end of the game on top of the locked bid — they're free money. The bot bids its surplus coins (everything above its reserve), and will always bid at least 1 coin if it can, since any winning bid is profitable.

Loans are net-negative (you pay back the full face value), so the bot only takes them as a last resort — specifically, when it has fewer than 5 coins and there are still at least 3 gems left in the supply worth competing for. Otherwise it bids 0.

When choosing which gem to reveal from its hand, the Heuristic AI picks the one that benefits it the most. It scores each gem by how much the Value Display would increase for that colour, weighted by how many more of that colour it owns compared to its opponents. A gem it holds lots of and opponents hold few of is worth boosting; a gem opponents hold more of is worth avoiding. Ties are broken by revealing colours the bot holds least of.
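A hedged sketch of that reveal scoring, assuming per-colour counts and a precomputed "display value gain" per colour (all names hypothetical):

```python
def choose_reveal(hand_counts: dict, value_gain: dict, opp_counts: dict) -> str:
    """Score each colour in hand as (value gain if revealed) times the bot's
    count advantage over opponents; ties go to the colour held least."""
    best_key, best_colour = None, None
    for colour, mine in hand_counts.items():
        if mine == 0:
            continue  # can't reveal a colour we don't hold
        score = value_gain[colour] * (mine - opp_counts.get(colour, 0))
        key = (score, -mine)  # higher score wins; on ties, fewer held wins
        if best_key is None or key > best_key:
            best_key, best_colour = key, colour
    return best_colour
```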

Heuristic AI is the fitness opponent for the first generation of evolved bots. It provides a meaningful challenge — far above random play — and a stable baseline to measure whether genetic tuning is actually producing improvement.

Evolved AI

Hard

The Evolved AI is the first bot whose strategy was discovered by a genetic algorithm rather than written by hand. It builds on the Heuristic AI's foundation but replaces the fixed 75% discount with three independent linear models — one each for treasure, investment, and loan auctions — whose 18 total weights were tuned by evolution.

Each of the three "heads" computes a discount factor between 0 and 1 from five game-state features: how far through the game you are (progress), the bot's own coin ratio relative to the remaining expected value, the average and top opponent coin ratios, and a variance proxy based on how many hidden gems remain and how sensitive the current value chart is. Each head has its own bias and five learned weights, so treasures, investments, and loans can each react differently to the same game state.
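One head can be sketched as a plain linear model over the five features; the clamp to [0, 1] is an assumption about how the factor is kept in range:

```python
def head_discount(bias: float, weights: list, features: list) -> float:
    """One bidding head: bias plus a weighted sum of the five game-state
    features, clamped to [0, 1] (clamping is assumed, not confirmed)."""
    raw = bias + sum(w * f for w, f in zip(weights, features))
    return min(1.0, max(0.0, raw))
```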

Instead of the Heuristic AI's simple point estimate of the final Value Display, the Evolved AI uses a hypergeometric distribution. It models the unknown gems (in opponents' hands and the deck) as draws from an urn without replacement, computing the full probability distribution for each colour's final display count. The expected value of each gem is then E[chart_value(X)] rather than chart_value(E[X]) — a significant accuracy improvement on non-linear charts like chart E, where the value peaks at 3 gems then crashes to 0.
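The E[chart_value(X)] computation can be sketched with a hypergeometric PMF. The chart below is a made-up one shaped like chart E (value peaks then crashes); `known` is the count of that colour already visible, and X is how many of the `n` hidden display slots end up that colour when drawn from a pool of N hidden gems containing K of it:

```python
from math import comb

def hypergeom_pmf(N: int, K: int, n: int, x: int) -> float:
    """P(X = x) when drawing n gems without replacement from a pool of
    N gems containing K of the target colour."""
    if x < 0 or x > min(K, n) or n - x > N - K:
        return 0.0
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

def expected_chart_value(chart, known: int, N: int, K: int, n: int) -> float:
    """E[chart(known + X)] over the full distribution of X,
    rather than chart applied to the mean of X."""
    return sum(hypergeom_pmf(N, K, n, x) * chart(known + x) for x in range(n + 1))
```

With a peaked chart, the difference is stark: if the naive point estimate lands exactly on the peak, chart_value(E[X]) reports the peak value, while E[chart_value(X)] correctly averages in the outcomes where the colour overshoots and crashes to 0.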

For treasures, the bot multiplies the hypergeometric value estimate (including mission bonuses) by the treasure head's discount, then caps the bid at its spendable coins. For investments, it applies the invest head's discount to its surplus coins and always bids at least 1 (free money). For loans, the loan head's discount is applied to the loan amount — a low discount effectively means "don't take this loan".

Gem reveal logic is inherited unchanged from the Heuristic AI: pick the gem whose colour boost benefits the bot the most relative to opponents.

The Evolved AI demonstrates that a genetic algorithm can discover bidding strategies that beat hand-tuned heuristics. It achieves an 81% win rate against three Heuristic AI opponents and serves as the training opponent for the next generation of bots.


Evo2 AI

Harder

Evo2 is a clean-slate redesign that throws out the hand-coded scaffolding of the Evolved AI and lets the genetic algorithm discover everything from scratch. Instead of computing a discount fraction and multiplying it by an estimated value, each head outputs the bid in coins directly — a simple linear formula whose 19 total weights encode the full bidding policy.

Each head computes bid = bias + w1*f1 + w2*f2 + ... and the result is clamped to [0, max_legal_bid]. There's no intermediate "discount times value" step, which frees the GA from the implicit assumption that bids should scale proportionally with estimated value.
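A sketch of one Evo2 head, under the assumption that the raw linear output is rounded to whole coins before clamping (rounding behaviour is an assumption; names are illustrative):

```python
def evo2_head_bid(bias: float, weights: list, features: list,
                  max_legal_bid: int) -> int:
    """Evo2-style head: the linear output IS the bid in coins,
    clamped to [0, max_legal_bid]. No discount-times-value step."""
    raw = bias + sum(w * f for w, f in zip(weights, features))
    return max(0, min(max_legal_bid, round(raw)))
```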

The progress proxy (fraction of the auction deck consumed) is replaced by an exact expected rounds remaining calculated via a closed-form multivariate hypergeometric over the known auction deck composition and remaining gem supply. Cash features switch from ratios to raw coin counts (my coins, average opponent coins, top opponent coins) and let the GA learn the right scaling. The treasure head also gets two new per-card features: the expected value and standard deviation of the prize, both derived from the hypergeometric gem distribution.

On top of the existing hard/soft mission bonuses, the treasure head factors in a mission probability delta for each active mission: the difference between "probability I win this mission if I take the gems" and "probability I win it if the richest opponent takes them", scaled by the mission's coin reward. This captures the competitive value of denying missions to opponents, not just pursuing your own.

Evo2 uses the same reveal logic as the Heuristic AI: boost colours you hold more of than opponents.

Evo2 proves that removing hand-designed constraints and letting the GA learn freely produces a stronger bot. It was trained against a mix of all prior bots (Random, Heuristic, and Evolved) to avoid overfitting to any single opponent, and achieves a 69% pooled win rate across that field.


Evo3 AI

Hardest

Evo3 is the current champion. It extends Evo2 with one key innovation: it watches what opponents bid and adjusts its own strategy in real time. Its 25 weights were tuned by evolution against all prior bots.

After every round, the engine tells each player what happened. Evo3 uses this to record how much the highest opponent bid exceeded a baseline — the bid Evo3 itself would have made with no history. Over the course of a game, this builds up a running log of how aggressively opponents are bidding compared to Evo3's default pricing.

Each of the three heads reads two new features: the weighted mean and standard deviation of the opponent-delta history. Observations from the same category (e.g. treasure history when bidding on a treasure) are weighted 4x more heavily than cross-category observations, since how opponents price loans says less about how they'll price treasures.
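The weighted statistics can be sketched as follows, assuming the history is a list of (category, delta) observations and that (0, 1) are the defaults before any observations exist, as described for the baseline below (all names illustrative):

```python
def weighted_delta_stats(history: list, current_category: str,
                         same_cat_weight: float = 4.0):
    """Weighted mean and standard deviation of opponent-bid deltas.
    Observations matching the current auction category weigh 4x more
    than cross-category ones."""
    if not history:
        return 0.0, 1.0  # defaults used before any observations
    ws = [same_cat_weight if cat == current_category else 1.0
          for cat, _ in history]
    total = sum(ws)
    mean = sum(w * d for w, (_, d) in zip(ws, history)) / total
    var = sum(w * (d - mean) ** 2 for w, (_, d) in zip(ws, history)) / total
    return mean, var ** 0.5
```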

A subtle but critical detail: the baseline used to measure opponent deltas is what Evo3 would have bid using the default history values (0, 1), not what it actually bid. If the actual bid were used, the signal would depend on Evo3's own learned response to that signal — a feedback loop. By pinning the baseline to the "no history" bid, the measurement stays stable regardless of how aggressively the weights react.

Evo3 keeps the same reveal logic as all prior evolved bots, inherited from the Heuristic AI.

Evo3 is the strongest bot in the zoo. By reading opponent behaviour it can adapt mid-game — bidding more aggressively against passive opponents and pulling back against aggressive ones. It achieves a 72% pooled win rate against all prior bot types (Random, Heuristic, and Evo2) and is the default opponent for quick play.
