Real-Time AI Decisions

An AI agent that answers questions about last week's data is useful for retrospective analysis. An AI agent that acts on data arriving in the current second is useful for fraud prevention, dynamic pricing, live inventory management, and real-time customer personalization. Real-Time AI Decisions describes the architectural pattern that closes the latency gap between a business event occurring and an autonomous agent taking action in response to it.

Achieving this requires careful coordination across the streaming ingestion layer, the table format layer, and the query execution layer.

Streaming into Apache Iceberg

The data foundation for real-time AI decisions is a streaming write pipeline into Apache Iceberg tables. Apache Flink is the most common choice for this pattern. A Flink job reads from a Kafka topic, processes and enriches the incoming events, and writes them as new rows to an Iceberg table in object storage using a short commit interval (often thirty seconds to two minutes).
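To make the commit-interval idea concrete, here is a minimal sketch in plain Python. It is not the Flink or Iceberg API: `IntervalCommitter` and its methods are hypothetical names, and the in-memory list stands in for an Iceberg table. It only illustrates that rows become visible to readers in batches, once per commit interval.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, object]


@dataclass
class IntervalCommitter:
    """Buffers rows and publishes them as one batch per commit interval.

    Hypothetical stand-in for a streaming writer; not a real Flink/Iceberg class.
    """

    commit_interval_s: float
    clock: Callable[[], float]  # injected so the behavior can be tested without sleeping
    committed_batches: List[List[Event]] = field(default_factory=list)
    _buffer: List[Event] = field(default_factory=list)
    _last_commit: float = 0.0

    def __post_init__(self) -> None:
        self._last_commit = self.clock()

    def write(self, event: Event) -> None:
        """Buffer one row; commit if the interval has elapsed."""
        self._buffer.append(event)
        if self.clock() - self._last_commit >= self.commit_interval_s:
            self.commit()

    def commit(self) -> None:
        """Publish buffered rows as a single batch (one 'snapshot')."""
        if self._buffer:
            self.committed_batches.append(self._buffer)
            self._buffer = []
        self._last_commit = self.clock()

    def visible_rows(self) -> List[Event]:
        """Rows a reader can see: only those in committed batches."""
        return [row for batch in self.committed_batches for row in batch]
```

With a thirty-second interval, a row written just after a commit stays invisible to readers until the next commit, which is why the interval sets the floor on data freshness.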

This approach is sometimes called the "streaming lakehouse." The data is in Iceberg tables (giving the AI agent a clean, governed SQL interface) but is being updated continuously rather than in nightly batches. The AI agent queries the Iceberg table and sees data that is at most a few minutes old.

Reducing Query Latency

Writing data quickly is only half the problem. The AI agent's SQL query must return results quickly enough to make the decision while the business event is still actionable. A fraud decision that takes thirty seconds to compute is useless for blocking a transaction that completes in three seconds.
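The arithmetic behind that claim can be written as a simple latency budget. The sketch below is illustrative only: the function name and the stage names are assumptions, not measurements from any real system.

```python
from typing import Dict


def remaining_budget_s(stage_latencies_s: Dict[str, float], actionable_window_s: float) -> float:
    """Return the time left in the actionable window after all stages run.

    A negative result means the decision arrives after the event has
    stopped being actionable.
    """
    return actionable_window_s - sum(stage_latencies_s.values())


# The example from the text: a 30-second decision against a 3-second transaction.
too_slow = remaining_budget_s({"fraud_decision": 30.0}, actionable_window_s=3.0)

# A hypothetical pipeline whose stages fit inside the same window.
fits = remaining_budget_s(
    {"query": 0.6, "agent_reasoning": 0.9, "action_dispatch": 0.5},
    actionable_window_s=3.0,
)
```

Here `too_slow` is negative and `fits` is positive, so only the second pipeline can block the transaction in time.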

Execution engines like Dremio address this through a combination of three techniques: partition pruning (limiting scans to the partitions that match the query's filter, typically the most recently written Iceberg partitions), reflections (pre-aggregated accelerated views that serve common query patterns without full table scans), and Arrow Flight SQL connections (eliminating JDBC serialization overhead between the agent and the engine).
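Partition pruning is the easiest of these to illustrate. The sketch below assumes a table partitioned by hour and models each data file with a hypothetical `DataFile` record. It is not how Dremio or Iceberg implement pruning internally; it only shows files being skipped from metadata alone, before any data is read.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class DataFile:
    """Hypothetical per-file metadata: path plus its hourly partition value."""

    path: str
    partition_hour: datetime  # start of the hour bucket the file belongs to
    row_count: int


def prune_to_recent(files: List[DataFile], now: datetime, lookback: timedelta) -> List[DataFile]:
    """Keep only files whose hour bucket can contain events newer than now - lookback."""
    # Round the cutoff down to the hour: the bucket containing the cutoff
    # instant may still hold qualifying rows, so it must be kept.
    cutoff_bucket = (now - lookback).replace(minute=0, second=0, microsecond=0)
    return [f for f in files if f.partition_hour >= cutoff_bucket]


files = [
    DataFile("h09.parquet", datetime(2024, 1, 1, 9), 1_000),
    DataFile("h10.parquet", datetime(2024, 1, 1, 10), 1_000),
    DataFile("h11.parquet", datetime(2024, 1, 1, 11), 1_000),
    DataFile("h12.parquet", datetime(2024, 1, 1, 12), 400),
]
to_scan = prune_to_recent(files, now=datetime(2024, 1, 1, 12, 20), lookback=timedelta(minutes=30))
```

A thirty-minute lookback at 12:20 reaches back to 11:50, so only the 11:00 and 12:00 buckets are scanned and the older files are never opened.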

Pre-Computed Decision Signals

For decisions that must happen in under one second, the standard approach is to pre-compute the AI decision signal in the streaming pipeline before the agent query even occurs. The Flink job that writes to Iceberg also scores each event against a deployed ML model endpoint as the event arrives. The model score is written as an additional column alongside the raw event data. When the agent queries the table, it reads the pre-computed score directly rather than triggering a live model inference call.
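The pattern can be sketched with an in-memory SQLite table standing in for the Iceberg table. Everything named here is an assumption for illustration: `toy_model_score` is a placeholder for a call to a deployed model endpoint, and the `transactions` table and `fraud_score` column are hypothetical.

```python
import sqlite3
from typing import Callable, Dict, Iterable, Optional

Event = Dict[str, object]


def toy_model_score(event: Event) -> float:
    """Placeholder for a real model endpoint call; returns a score in [0, 1]."""
    return min(1.0, float(event["amount"]) / 10_000.0)


def write_scored_events(
    conn: sqlite3.Connection,
    events: Iterable[Event],
    score_fn: Callable[[Event], float],
) -> None:
    """Pipeline side: score each event on arrival and store the score as a column."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transactions "
        "(txn_id TEXT PRIMARY KEY, amount REAL, fraud_score REAL)"
    )
    rows = [(e["txn_id"], e["amount"], score_fn(e)) for e in events]
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)
    conn.commit()


def read_score(conn: sqlite3.Connection, txn_id: str) -> Optional[float]:
    """Agent side: a plain lookup, with no model inference on the query path."""
    row = conn.execute(
        "SELECT fraud_score FROM transactions WHERE txn_id = ?", (txn_id,)
    ).fetchone()
    return None if row is None else row[0]
```

The cost of inference is paid once at write time, so the agent's read is a single-row lookup whose latency does not depend on the model.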

Human Override Design

Real-time decisions carry higher stakes than retrospective reporting. An autonomous pricing agent that misinterprets a demand signal could lower prices below cost on thousands of SKUs before a human notices. Every real-time AI decision system must include a human override mechanism: a simple interface that allows an authorized operator to halt the agent's decision loop, roll back recent automated actions, and inspect the data signals that triggered them. Storing the agent's decision log in an Iceberg table makes this inspection programmatic rather than requiring access to opaque application server logs.
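A minimal sketch of the three override capabilities (halt, roll back, inspect) follows. `PricingAgent` and `Decision` are hypothetical names, and the in-memory list stands in for a decision log that would live in an Iceberg table. The log is kept append-only, with rolled-back entries flagged rather than deleted, so the audit trail survives a rollback.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Decision:
    """One automated action plus the data signal that triggered it."""

    sku: str
    old_price: float
    new_price: float
    signal: Dict[str, float]
    rolled_back: bool = False


@dataclass
class PricingAgent:
    prices: Dict[str, float]
    halted: bool = False
    decision_log: List[Decision] = field(default_factory=list)

    def apply(self, sku: str, new_price: float, signal: Dict[str, float]) -> bool:
        """Apply a price change unless an operator has halted the loop."""
        if self.halted:
            return False
        self.decision_log.append(Decision(sku, self.prices[sku], new_price, dict(signal)))
        self.prices[sku] = new_price
        return True

    def halt(self) -> None:
        """Operator override: stop all further automated actions."""
        self.halted = True

    def roll_back(self, last_n: int) -> List[Decision]:
        """Undo the most recent `last_n` active decisions, newest first."""
        undone: List[Decision] = []
        for decision in reversed(self.decision_log):
            if len(undone) == last_n:
                break
            if not decision.rolled_back:
                self.prices[decision.sku] = decision.old_price
                decision.rolled_back = True
                undone.append(decision)
        return undone
```

An operator would call `halt()` first, then `roll_back(n)`, and finally read the `signal` field of each returned `Decision` to see what the agent was reacting to.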
