AI Arbitrage · 5 min read · HashUtopia Editorial

Latency, Liquidity, and Signals: Building a Cross-Exchange Arbitrage Engine

A practical blueprint for routing, spread validation, and execution under real-world constraints across multiple exchanges.

Arbitrage is an execution problem

Cross-exchange arbitrage looks simple on a whiteboard: buy on Venue A, sell on Venue B, pocket the spread. In production, the spread is the smallest part of the story. The real question is whether you can consistently convert a fleeting price discrepancy into net profit after fees, slippage, latency, and operational failure modes.

The fastest systems do not “predict prices.” They predict whether an observed spread will survive long enough for the legs to complete, and they size the trade so that expected slippage stays below a strict cost ceiling.

Define the spread that matters

Start by defining a net executable spread rather than a raw top-of-book spread. For each venue, model: maker/taker fees, rebates, expected slippage at the intended size, and any conversion costs (e.g., stablecoin or fiat routing). If the strategy requires moving funds between exchanges, include transfer fees and the probability-weighted delay.

A robust engine treats every opportunity as: Expected Net PnL = Spread − (Fees + Slippage + Hedge Cost + Failure Cost). If the expected net result is not comfortably positive, do not trade.
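The gating rule above can be sketched directly. The field names and the safety margin are illustrative assumptions, not a reference implementation:

```python
from dataclasses import dataclass

# Hypothetical cost model; field names are illustrative, not from any real API.
@dataclass
class OpportunityCosts:
    fees: float          # combined maker/taker fees for both legs
    slippage: float      # expected slippage at the intended size
    hedge_cost: float    # cost of any interim hedge
    failure_cost: float  # probability-weighted cost of a failed or partial leg

def expected_net_pnl(spread: float, costs: OpportunityCosts) -> float:
    """Expected Net PnL = Spread - (Fees + Slippage + Hedge Cost + Failure Cost)."""
    return spread - (costs.fees + costs.slippage + costs.hedge_cost + costs.failure_cost)

def should_trade(spread: float, costs: OpportunityCosts, margin: float) -> bool:
    """Trade only when the expected result clears a safety margin, not just zero."""
    return expected_net_pnl(spread, costs) > margin
```

The explicit margin encodes the "comfortably positive" requirement: a spread that barely clears costs is rejected.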

Latency: measure it, then budget it

Latency is not a single number. You have at least four latency domains: market data ingest, decisioning, order placement, and fill/acknowledgement. Your system needs timestamps at each stage and a rolling distribution (p50/p95/p99). Arbitrage opportunities often die in the tail.

Budget latency the same way you budget cost. If an opportunity historically collapses within 250ms and your p95 round trip is 400ms, you are trading on hope. Either reduce latency, switch tactics (maker vs taker), or only trade regimes where spreads persist longer.
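A minimal sketch of the per-stage measurement and budget check described above. The window size and the p95 gate are assumptions chosen for illustration:

```python
class LatencyTracker:
    """Rolling latency distribution for one stage (ingest, decide, place, ack)."""

    def __init__(self, window: int = 1000):
        self.window = window
        self.samples: list[float] = []

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)
        if len(self.samples) > self.window:
            self.samples.pop(0)  # keep only the most recent window

    def percentile(self, p: float) -> float:
        ordered = sorted(self.samples)
        idx = min(int(p / 100 * len(ordered)), len(ordered) - 1)
        return ordered[idx]

def within_budget(tracker: LatencyTracker, opportunity_ttl_ms: float) -> bool:
    """Trade only if the p95 round trip fits inside the opportunity's lifetime."""
    return tracker.percentile(95) < opportunity_ttl_ms
```

Gating on p95 rather than the mean is the point: opportunities die in the tail, so the tail is what must fit the budget.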

Liquidity: depth beats headlines

Liquidity is not “volume.” What matters is depth at your size and how that depth behaves during volatility. Your engine should simulate fills across levels and estimate impact under the current volatility regime. A common production pattern is to cap size by a max slippage rule (e.g., do not exceed 6 bps of expected impact).

When markets move quickly, spreads may appear larger while depth simultaneously thins. That is a trap. Adaptive sizing and opportunity filtering are mandatory.
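The fill simulation and the max-slippage size cap can be combined into one book walk. This is a sketch under simplifying assumptions (a static snapshot of one side of the book, impact measured as volume-weighted price versus best ask):

```python
def max_size_under_slippage(asks: list[tuple[float, float]],
                            max_impact_bps: float) -> float:
    """Walk the ask side of the book and return the largest size whose
    volume-weighted fill price stays within max_impact_bps of the best ask.

    asks: [(price, quantity), ...] sorted best-first. Illustrative only.
    """
    best = asks[0][0]
    limit_price = best * (1 + max_impact_bps / 10_000)
    size = cost = 0.0
    for price, qty in asks:
        new_cost = cost + qty * price
        new_size = size + qty
        if new_cost / new_size > limit_price:
            if size == 0.0:
                return 0.0
            # Partial fill: solve (cost + t*price) / (size + t) = limit_price for t.
            t = (limit_price * size - cost) / (price - limit_price)
            return size + max(t, 0.0)
        cost, size = new_cost, new_size
    return size  # book exhausted before hitting the impact cap
```

In production this snapshot walk would be replaced by a regime-aware impact model, but the invariant is the same: size is an output of the slippage rule, never an input.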

Signals: choose when to be aggressive

“AI” is most valuable in selection and control. Useful signals include: spread persistence probability, venue health scores (reject rates, timeouts, maintenance frequency), volatility regime classification, and short-horizon order-book imbalance. These can drive tactical choices: limit vs marketable limit, single-shot vs sliced execution, or whether to hedge first.

In practice, signal quality is improved by instrumentation, not cleverness. If you cannot measure the system’s true slippage and fill quality, you cannot train reliable models.
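As one concrete example of the signals listed above, short-horizon order-book imbalance is cheap to compute from the same data layer. A deliberately simple sketch; the depth parameter is an assumption:

```python
def book_imbalance(bids: list[tuple[float, float]],
                   asks: list[tuple[float, float]],
                   depth: int = 5) -> float:
    """Order-book imbalance in [-1, 1]; positive values indicate buy pressure.

    bids/asks: [(price, qty), ...] sorted best-first. Illustrative only.
    """
    bid_qty = sum(q for _, q in bids[:depth])
    ask_qty = sum(q for _, q in asks[:depth])
    total = bid_qty + ask_qty
    return 0.0 if total == 0 else (bid_qty - ask_qty) / total
```

A signal like this might tilt the tactical choice, e.g. toward a marketable limit when imbalance favors the passive leg, but only instrumented fill data can confirm it helps.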

A minimal production architecture

A dependable cross-exchange engine typically includes: a unified market data layer, an opportunity evaluator (net of costs), an execution router, and a risk layer. The risk layer owns circuit breakers (e.g., disable a venue on repeated rejects), exposure limits per asset and per venue, and a reconciliation loop that detects balance drift.

Finally, add a “graceful degrade” mode. When a venue degrades, the system should reduce size, widen thresholds, or temporarily pause rather than forcing trades into a failing environment.
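The circuit breaker plus graceful-degrade behavior can be expressed as a small per-venue state machine. Thresholds, state names, and the size multipliers are illustrative assumptions:

```python
import time

class VenueBreaker:
    """Per-venue circuit breaker with a graceful-degrade state. All thresholds
    are illustrative, not tuned values from any specific exchange."""

    def __init__(self, degrade_after: int = 3, trip_after: int = 6,
                 cooldown_s: float = 60.0):
        self.degrade_after = degrade_after
        self.trip_after = trip_after
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.tripped_at: float | None = None

    def record(self, ok: bool) -> None:
        if ok:
            self.failures = 0          # any success resets the venue
            self.tripped_at = None
        else:
            self.failures += 1
            if self.failures >= self.trip_after:
                self.tripped_at = time.monotonic()

    def state(self) -> str:
        if self.tripped_at is not None:
            if time.monotonic() - self.tripped_at < self.cooldown_s:
                return "paused"
            self.failures = self.degrade_after  # re-enter in degraded mode
            self.tripped_at = None
        if self.failures >= self.degrade_after:
            return "degraded"          # reduce size, widen thresholds
        return "healthy"

    def size_multiplier(self) -> float:
        return {"healthy": 1.0, "degraded": 0.25, "paused": 0.0}[self.state()]
```

Note the shape of the degradation path: repeated rejects first shrink size, then pause the venue entirely, and recovery after cooldown re-enters in degraded mode rather than jumping straight back to full size.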

What to validate before going live

Before running at scale, validate: (1) net profitability after all costs, (2) tail latency and its impact on missed or adverse fills, (3) failure-mode behavior (partial fills, downtime), and (4) reconciliation accuracy. The goal is not to maximize trade count; it is to maximize consistent net results while keeping risk bounded.

Recommended next steps