Point of View ODS Architecture Databricks IBM + Confluent 11 min read

The Next ODS: Databricks Real-Time Mode, IBM's Confluent Acquisition, and What It Means for Financial Services

Disclaimer: The views expressed in this article are entirely my own and do not represent the positions, strategies, or opinions of IBM, Databricks, Microsoft, or any affiliated organization. This is a personal point of view based on my experience designing data platforms for financial services.
⚡ TL;DR

Two things happened in the past six months that, separately, would each be worth paying attention to. Together, they could reshape how we build operational data stores for capital markets.

What actually changed

In August 2025, Databricks shipped Real-Time Mode (RTM) as a public preview inside Structured Streaming. It hit general availability in March 2026. The benchmarks are striking: latency dropping to sub-second, with many workloads hitting single-digit milliseconds. For context, micro-batch mode (which most production streaming jobs use today) runs at 10 to 30 second intervals. RTM closes that gap by an order of magnitude without requiring teams to rewrite their Spark code. You flip a configuration switch.

Two days before that, on March 17, IBM completed its acquisition of Confluent for roughly $11 billion. Confluent is the company behind managed Apache Kafka, and Kafka is the backbone of event streaming at most major banks. The deal had been announced the previous December, but seeing it close brings the implications into focus.

I spend my days designing ODS architectures for capital markets firms. These two events change the picture in ways that I think are worth laying out.

Why Databricks RTM matters for ODS

Until now, building a low-latency ODS on Databricks meant accepting a compromise. Structured Streaming is mature and well-governed, but micro-batch introduces an inherent floor on how fast data moves. For intraday risk, that floor was usually acceptable. For fraud detection or real-time pricing, it wasn't. Teams that needed sub-second latency had to bolt on Apache Flink or Kafka Streams as a separate processing layer, creating a dual-engine architecture that's expensive to run and painful to maintain.

RTM changes that tradeoff. If your Structured Streaming jobs can now process events continuously with sub-second latency, the argument for running a separate streaming engine gets much weaker. You keep one execution framework, one set of monitoring, one governance model through Unity Catalog. The operational simplification is significant.

I want to be specific about what this enables in an ODS context:

Architecture diagram showing data flowing from Core Banking, Order Management, and Market Data through Kafka/Confluent into Databricks Real-Time Mode, then to Delta Lake, and finally to Risk Dashboards, ML Feature Store, and Regulatory Reports, all governed by Unity Catalog
Fig. 1: Reference ODS architecture with Databricks RTM and Confluent/Kafka as the event backbone.

A practical note

RTM is still new. I'd run it in parallel with micro-batch on non-critical workloads first. The latency numbers are impressive, but production financial workloads need weeks of stability data before you bet your risk desk on them.

Why IBM acquiring Confluent is strategic

Here's my read on this, and I want to be clear that this is personal analysis, not an IBM talking point.

IBM has spent the past several years repositioning itself around AI and hybrid cloud. The watsonx platform, the Red Hat integration, the consulting push around generative AI. All of it points in the same direction: IBM wants to be the company that helps enterprises put AI into production, not just build demos.

But here's the gap that existed before this deal. AI models, whether they're generative copilots or agentic workflows, are only as useful as the data feeding them. And for financial services, that data needs to be real-time. A fraud detection agent running on stale data is just a reporting tool with extra steps. A KYC automation system that can't stream new watchlist updates is manually gated from the start.

Confluent fills that gap in a way that's hard to replicate organically. Here's what IBM gets:

Strategy diagram showing three layers: Source Systems at the bottom with Core Banking, Trade Systems, and Market Data; Real-Time Data Streaming in the middle with Confluent Cloud and Apache Kafka; and AI and Agentic Systems at the top with watsonx, Azure OpenAI, and Agentic Workflows, all connected by the IBM plus Confluent Stack
Fig. 2: How IBM's Confluent acquisition connects source systems to AI through real-time streaming.

The $11 billion price tag is steep, but I think the strategic logic holds. IBM was missing the real-time data plumbing layer, and Confluent is the market leader. Building that capability in-house would have taken years and billions anyway, with no guarantee of adoption.

How these two moves intersect

Animated real-time ODS event flow A trade event flows from a source system into Confluent Kafka (T+5ms), then into Databricks Real-Time Mode (T+50ms), then into the operational data store and a risk dashboard (T+200ms). Latency labels appear in turn as the packet moves. Source Trade · OMS · Mkt data Confluent Kafka topic Databricks RTM · Spark ODS + Risk Dashboard Intraday VaR · P&L T+5 ms T+50 ms T+200 ms · sub-second Confluent (IBM) feeds events · Databricks RTM closes the gap to single-digit ms
Fig. 1: The full-pipeline latency story. Source event arrives at Kafka in ~5 ms, hits Databricks RTM in ~50 ms, lands in the ODS / dashboard inside 200 ms. Confluent (now an IBM asset) is the streaming spine; RTM is what makes the lakehouse responsive enough to replace standalone CEP engines.

This is where it gets interesting. Databricks RTM reduces the need for a separate streaming engine between Kafka and the lakehouse. At the same time, IBM (now the owner of Confluent/Kafka) is trying to position Confluent as the universal data streaming layer that feeds everything, including lakehouses.

Are they competing? Partially. But I think the more likely outcome is a layered architecture where both have a clear role:

For an ODS architecture, this is actually a cleaner separation of concerns than what most firms run today. Kafka handles event capture and distribution. Databricks handles processing, storage, governance, and serving. You don't need Flink in the middle anymore for most use cases, which simplifies the stack considerably.

What this means for the ODS I build

I've been designing ODS platforms on Databricks and Fabric for capital markets firms. Here's how I'm thinking about updating those reference architectures:

  1. Adopt RTM for latency-sensitive layers. Position aggregation and risk views are the first candidates. Start with non-critical workloads in parallel, validate stability, then migrate the primary streams.
  2. Keep Kafka (now Confluent/IBM) as the event backbone. The ODS ingestion layer doesn't change. If anything, IBM's ownership of Confluent may accelerate enterprise Kafka adoption at firms that were hesitant about a smaller vendor.
  3. Account for Zerobus Ingest. Databricks made this generally available in February 2026 as a serverless, direct-to-lakehouse streaming path that does not need an intermediate message bus. It sustains sub-five-second latency at better than 10 GB/sec aggregate per table. It is positioned for simpler ingestion patterns (operational systems, telemetry) rather than the complex multi-consumer topologies Kafka excels at, so it does not replace Kafka in a serious capital markets ODS. It does mean the simpler ingestion edges of the architecture have a lighter option now.
  4. Factor in Lakebase. Databricks' serverless Postgres engine reached general availability in February 2026. It can handle the transactional layer of an ODS that traditionally needed a separate operational database. The convergence of real-time processing, governed analytics, and transactional storage on one platform is no longer a roadmap promise. It is a design option available today.

The bigger picture

Step back far enough and the pattern is clear. The streaming data market is consolidating rapidly. Databricks is building native real-time into the lakehouse. IBM just bought the largest managed Kafka provider. Microsoft Fabric has Eventstream. Snowflake has Dynamic Tables. Everyone is racing toward the same goal: eliminate the gap between when data is created and when it's usable.

For those of us building ODS platforms in financial services, this is mostly good news. More competition means better tooling, lower latency, and more options. But it also means the architecture decisions we make today will lock us into vendor relationships that are shifting underneath us.

A few months on from these moves, the early signal is in the product integrations, not the press releases. IBM named watsonx.data and IBM Z as day-one Confluent targets. Databricks shipped Zerobus and Lakebase to general availability. Real-Time Mode is GA. The roadmap is becoming product faster than most modernization cycles can absorb it.

My advice has not changed, and the last few months have only sharpened it. Design for portability where you can, because Delta Lake and open table formats are what keep you from being captured by any one vendor's real-time story. Pick the right tool for each layer instead of consolidating onto a single platform out of convenience. And sequence the migration so the risk desk feels the benefit first and the governance story holds up to an examiner. This is the work IBM Consulting does with capital markets clients now: not selling a product, but turning a fast-moving vendor landscape into an architecture that survives the next shift.

"The best ODS architectures aren't built around a single platform. They're built around clear layer boundaries, with each component doing the one thing it does best."

Reminder: This article reflects my personal analysis and opinions. It does not represent the views, strategy, or endorsement of IBM, Databricks, Microsoft, Confluent, or any other organization. All trademarks belong to their respective owners.

Want to talk ODS architecture?

I design streaming data platforms for capital markets. Happy to discuss how these shifts affect your roadmap.

Book a Session