Autonomous Bot | Chris Peng

System Architecture and Multithreading

The bot runs on a multi-threaded pipeline. To react faster than a human player without dropping frames, the heavy workloads, YOLO object detection and optical character recognition (OCR), run asynchronously on background threads. Their results feed a live state matrix that the main decision loop reads, and the loop then dispatches pixel coordinates to the deployment engine.
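The hand-off between background workers and the main loop can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the bot's actual code: the names SharedState, run_worker, and demo are hypothetical, and the lambdas stand in for the real YOLO and OCR calls.

```python
import threading


class SharedState:
    """Latest perception results, guarded by a lock (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def update(self, key, value):
        with self._lock:
            self._data[key] = value

    def snapshot(self):
        # Return a copy so the decision loop never sees a half-written state.
        with self._lock:
            return dict(self._data)


def run_worker(state, key, produce, stop_event, interval=0.01):
    """Publish produce() under key until stop_event is set (at least once)."""
    while True:
        state.update(key, produce())
        # wait() returns True as soon as the stop flag is set.
        if stop_event.wait(interval):
            break


def demo():
    state = SharedState()
    stop = threading.Event()
    # Stand-ins for the slow perception calls; the real ones would run models.
    workers = [
        threading.Thread(target=run_worker,
                         args=(state, "detections", lambda: ["knight"], stop)),
        threading.Thread(target=run_worker,
                         args=(state, "tower_hp", lambda: 2400, stop)),
    ]
    for w in workers:
        w.start()
    stop.set()
    for w in workers:
        w.join()
    return state.snapshot()


if __name__ == "__main__":
    print(demo())
```

The main loop only ever calls snapshot(), so a slow detector delays fresh data but never blocks a decision.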

Perception: Three Vision Systems

Instead of relying on basic color matching, the bot uses tailored computer vision techniques that stay robust to dynamic lighting, shadows, and animations. The Arena Radar uses a trained YOLOv8 ONNX model to detect enemy troops, classify their state, and output bounding boxes.

Card Vision relies on Canny edge detection to extract the structural features of the cards in the player's hand. Because template matching compares black-and-white edge maps rather than RGB pixels, the bot still recognizes cards when they are darkened by zero-elixir shadows.

Finally, the Sub-Pixel Elixir Tracker scans the interface at 80 percent depth to avoid the white text overlays, counting purple pixels across micro-slices to return a fractional, floating-point elixir value. In addition, a specialized EasyOCR pipeline reads tower health after applying bicubic upscaling and thresholding.
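The pixel-counting idea can be sketched in plain Python. Everything here is an assumption made for illustration: the bar is given as rows of (R, G, B) tuples, "80 percent depth" is read as the row 80 percent of the way down the bar, each pixel column counts as one micro-slice, and the purple thresholds in is_purple are placeholder values.

```python
def is_purple(pixel):
    """Rough purple test on an (R, G, B) tuple; thresholds are illustrative."""
    r, g, b = pixel
    return r > 150 and b > 150 and g < 120


def estimate_elixir(bar, max_elixir=10.0, depth=0.8):
    """Estimate elixir from one scan row of a cropped elixir bar.

    bar is a list of rows, each row a list of (R, G, B) tuples.
    The scan row sits at `depth` of the bar height, below any text overlay.
    """
    height = len(bar)
    row = bar[min(int(height * depth), height - 1)]
    filled = sum(1 for px in row if is_purple(px))
    return max_elixir * filled / len(row)


if __name__ == "__main__":
    PURPLE, EMPTY, WHITE = (200, 40, 220), (30, 30, 30), (255, 255, 255)
    bar = [[PURPLE] * 45 + [EMPTY] * 55 for _ in range(10)]
    bar[2] = [WHITE] * 100  # simulated text overlay near the top of the bar
    print(estimate_elixir(bar))
```

Sampling a single low row keeps the overlay out of the count, and dividing by the row width gives a fractional value instead of a whole-number elixir count.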

Figure: Computer vision card detection pipeline using OpenCV

Cognition: The Decision Engine

The bot does not simply react to whatever is closest. It quantifies danger with a Multivariate Kinematic Model: every YOLO detection is fed through a physics engine that calculates the unit's estimated time of arrival (ETA) at the Princess Tower from its velocity. The ETA is then combined with a power score to produce a final threat score.
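As a rough sketch of the ETA and threat calculation: the text does not give the formula that combines ETA with the power score, so the power / (1 + ETA) form below is an assumption, as are the Detection fields, the tile-based units, and the function names.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    """One tracked enemy unit (hypothetical structure)."""
    name: str
    x: float      # tile coordinates
    y: float
    speed: float  # tiles per second
    power: float  # abstract strength score


def eta_seconds(det, tower, min_speed=1e-6):
    """Straight-line time for the unit to reach the tower position."""
    dist = math.hypot(tower[0] - det.x, tower[1] - det.y)
    return dist / max(det.speed, min_speed)


def threat_score(det, tower):
    """Assumed combination: more power and a shorter ETA mean more threat."""
    return det.power / (1.0 + eta_seconds(det, tower))


if __name__ == "__main__":
    tower = (0.0, 0.0)
    units = [
        Detection("slow_tank", 3.0, 4.0, speed=1.0, power=12.0),
        Detection("fast_runner", 3.0, 4.0, speed=5.0, power=6.0),
    ]
    for u in sorted(units, key=lambda d: threat_score(d, tower), reverse=True):
        print(u.name, round(threat_score(u, tower), 2))
```

With this form, a weaker unit that arrives soon can outrank a stronger one that is still far away, which matches the idea of not just reacting to whatever is closest.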

The system uses Short-Term State Memory: each recent play is timestamped and expires after an eight-second time-to-live. Before defending, the bot calculates how much elixir it has already committed to a specific lane. If it has already played enough to secure a positive trade, it deliberately saves its elixir bank, which prevents panicked overcommitment.
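A minimal sketch of such a time-to-live memory follows. The class and function names are hypothetical, timestamps are passed in explicitly to keep the example deterministic, and the rule in should_defend (defend only while committed elixir is below the enemy's spend) is an assumed stand-in for the bot's actual trade check.

```python
class PlayMemory:
    """Remembers recent plays and forgets them after ttl seconds."""

    def __init__(self, ttl=8.0):
        self.ttl = ttl
        self._plays = []  # list of (timestamp, lane, elixir_cost)

    def record(self, lane, cost, now):
        self._plays.append((now, lane, cost))

    def committed(self, lane, now):
        """Elixir spent in `lane` by plays that have not yet expired."""
        self._plays = [p for p in self._plays if now - p[0] <= self.ttl]
        return sum(cost for _, play_lane, cost in self._plays
                   if play_lane == lane)


def should_defend(memory, lane, enemy_cost, now):
    """Assumed rule: hold elixir once the lane commitment covers the push."""
    return memory.committed(lane, now) < enemy_cost


if __name__ == "__main__":
    memory = PlayMemory(ttl=8.0)
    memory.record("left", 3, now=0.0)
    memory.record("left", 4, now=5.0)
    print(memory.committed("left", now=6.0))
    print(should_defend(memory, "left", enemy_cost=5, now=6.0))
    print(memory.committed("left", now=10.0))
```

In a live loop the `now` argument would typically come from time.monotonic(), which is not affected by system clock changes.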

When choosing a defense, the bot sorts the valid counters in its hand from cheapest to most expensive, using a lambda function as the sort key, so that it defends with the cheapest card that works. Furthermore, if the enemy invests heavily in the right lane, the bot detects the imbalance and launches a counter-attack in the left lane to split the enemy's attention.
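The cheapest-counter selection and the lane-imbalance check might look like the sketch below. The Card structure, the counter table, the example card entries, and the 2x imbalance ratio are all illustrative assumptions; the check is written symmetrically, so a heavy left-lane push also suggests a right-lane counter-attack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    """A card in hand (hypothetical structure)."""
    name: str
    cost: int
    counters: frozenset  # enemy unit names this card is assumed to answer


def pick_counter(hand, enemy, elixir):
    """Cheapest affordable card in hand that counters `enemy`, or None."""
    valid = [c for c in hand if enemy in c.counters and c.cost <= elixir]
    valid.sort(key=lambda c: c.cost)  # cheapest first
    return valid[0].name if valid else None


def counter_attack_lane(enemy_spend, ratio=2.0):
    """Return the lane opposite a heavy enemy investment, or None.

    enemy_spend maps "left" and "right" to elixir the enemy has spent there.
    """
    left, right = enemy_spend["left"], enemy_spend["right"]
    if right >= ratio * max(left, 1):
        return "left"
    if left >= ratio * max(right, 1):
        return "right"
    return None


if __name__ == "__main__":
    hand = [
        Card("pekka", 7, frozenset({"giant"})),
        Card("knight", 3, frozenset({"giant", "wizard"})),
        Card("skeletons", 1, frozenset({"prince"})),
    ]
    print(pick_counter(hand, "giant", elixir=10))
    print(counter_attack_lane({"left": 0, "right": 9}))
```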

Execution: Geometry and Placement

YOLO pixel detections are transformed into an isometric tile grid via perspective mapping. When defending with a spell, the bot uses the spell's flight time and the enemy troop's velocity to aim ahead of the moving target. Ground troops are placed using offset vectors that pull incoming enemies toward the center kill zone.
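The lead-aiming step can be sketched as a small fixed-point iteration. This is one plausible way to do it under stated assumptions, not necessarily the bot's method: the target moves at constant velocity in tile coordinates, the spell travels in a straight line at a constant speed from a fixed origin, and the name lead_target and its parameters are hypothetical.

```python
import math


def lead_target(origin, target_pos, target_vel, spell_speed, iterations=3):
    """Predict where to aim so the spell lands on a moving target.

    Each pass estimates the flight time to the current aim point, then moves
    the aim point to where the target will be after that time.
    Returns (aim_point, flight_time_used).
    """
    if spell_speed <= 0:
        raise ValueError("spell_speed must be positive")
    aim = (float(target_pos[0]), float(target_pos[1]))
    flight_time = 0.0
    for _ in range(iterations):
        distance = math.hypot(aim[0] - origin[0], aim[1] - origin[1])
        flight_time = distance / spell_speed
        aim = (target_pos[0] + target_vel[0] * flight_time,
               target_pos[1] + target_vel[1] * flight_time)
    return aim, flight_time


if __name__ == "__main__":
    # Target 10 tiles away, walking toward the origin at 1 tile per second.
    aim, t = lead_target((0.0, 0.0), (0.0, 10.0), (0.0, -1.0), spell_speed=4.0)
    print(aim, t)
```

In this example the exact interception point is 8 tiles from the origin, and three passes already land within a small fraction of a tile of it. More iterations tighten the estimate when the target is fast relative to the spell.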

Technical Stack

Python 3.11 Runtime Environment
OpenCV for capture, edge extraction, and template matching
ONNX Runtime for CPU-efficient YOLOv8 inference
EasyOCR and PyTorch for tower health reading
NumPy for matrix math, pixel arrays, and geometry transforms

View Source Code on GitHub
