Stream Processing World Meets Streaming Graph at Current 2024

Paige Roberts

The thatDot team had a great time last week at Confluent’s big conference, Current 2024. Attendees and exhibitors alike loved the thatDot Frisbees. Our apologies to anyone who may have been hit. We spoke with many attendees and learned about the challenges the stream processing community faces when using ksqlDB or Flink in their data pipelines. We helped them understand how thatDot would fit into their architecture and how our approach to stream processing, thatDot Streaming Graph, can scale to meet their demands while solving deeper problems that challenge the key-value/relational models of other stream processors.

Ryan Wright explaining to multiple Current attendees how thatDot Streaming Graph event stream processor integrates with pipeline software like Kafka. At a small booth at Current, with Integration Options label on a diagram on the booth screen.

Some attendees pointed out that, before Flink, Confluent had already introduced two other technologies intended to provide stream processing in Kafka pipelines. The first was Kafka Streams, but people were not satisfied with its capabilities. Then ksqlDB was the way to go, but that didn’t work quite as advertised, either.

Now Flink is the new technology touted to solve the problems of previous stream processing engines. Yet Flink practitioners know that its operational complexity demands a high level of expertise to run. Out-of-memory errors, caused by making time windows too wide or trying to join too many things across data streams, were common problems people reported. Attendees repeatedly asked how long it will be until Confluent starts looking for the next stream processing technology.

As an industry, we keep doing the same thing over and over and expecting a different result, which is the proverbial definition of insanity. By bringing the same relational, key/value mindset to every stream processor, we continue to run into the same problems.

thatDot booth at Current conference showing blue frisbees, a paper stack that says, "thatDot is Categorically different," and a screen that says, "Multiple data streams with lots of joins on huge categorical datasets is super hard in most stream processors. For us, it's Tuesday."

Advantages of stream processors and graph data models

For years, people have understood the power of graph data to connect the dots and see the big contextual picture. But graph databases have the same problem as any other database: once stored, the data is no longer streaming in real time. It’s at rest. That inherently makes data analysis too slow for some of the most important and urgent actions a stream processor is needed for, like catching cybersecurity intrusions or stopping a fraudulent transaction.

Ryan Wright always says, “Answers now are always better than answers later.” Graph analysis that can work with modern data volumes at stream processor speed is a paradigm shift. You get the answers to deep questions fast. Instead of finding out months or even days later that you were breached, or your company was ripped off, you can stop problems before they cost you.

Advantages of thatDot Streaming Graph

Some of the advantages of thatDot Streaming Graph that Current attendees told us they found most compelling included:

  • Simplicity at scale – Flink has to manage current state and run complex fault-tolerance logic such as checkpoints and savepoints. Streaming Graph doesn’t require any of that. Dynamic graph technologies don’t require state management, and high availability in our stream processor is more automatic. thatDot is far simpler to use, even at high scale.
  • Unlimited joins – For a relational key/value data model like the one Flink uses, multiple joins are difficult and memory intensive. For thatDot’s graph data model, unlimited joins in stream processing are the normal way we do things.
  • Categorical analysis – Most analytic tools can only analyze numbers. This means that if you want to analyze people, places, events, locations, etc., you have to convert that data into wide, sparse numeric data, rest it in a database, analyze it, then turn it back into categorical data to get a final, understandable, if somewhat muddy, answer. Having to rest the data before analysis slows response time hugely, and even then, your answer is likely to be unclear and inaccurate. thatDot analyzes categorical data directly, right in the stream processor.
  • Time-unbounded analysis – In our opinion, having an event stream processor chop unbounded data streams into little time-bounded chunks in order to analyze them has always been a workaround. thatDot analyzes the whole data stream as it flows by, with no time window limitations. Even data stored in a file or in a database can be joined with current data. Important points from six months ago can be joined with data from six milliseconds ago to complete a picture and answer an analytical question.
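To make the time window point concrete, here is a toy Python sketch. It is not how thatDot Streaming Graph or Flink is implemented. The event data, the field layout, and the function names `windowed_matches` and `standing_matches` are all hypothetical, made up purely for illustration. It contrasts a match that only succeeds when two events fall inside a time window with one that keeps per-entity state indefinitely and so can still connect events that arrive months apart.

```python
from collections import defaultdict

# Hypothetical events: (timestamp_seconds, entity_id, event_kind)
EVENTS = [
    (0, "acct-1", "password_reset"),
    (5, "acct-2", "password_reset"),
    (8, "acct-2", "large_transfer"),
    (15_552_000, "acct-1", "large_transfer"),  # about six months (180 days) later
]

# The two event kinds that together make up the pattern of interest.
REQUIRED = {"password_reset", "large_transfer"}


def windowed_matches(events, window_seconds):
    """Pair a reset with a later transfer only if they are within the window.

    A real windowed engine would also evict state older than the window;
    this sketch only models which matches such a window would produce.
    """
    last_reset = {}
    matches = []
    for ts, entity, kind in events:
        if kind == "password_reset":
            last_reset[entity] = ts
        elif kind == "large_transfer":
            reset_ts = last_reset.get(entity)
            if reset_ts is not None and ts - reset_ts <= window_seconds:
                matches.append(entity)
    return matches


def standing_matches(events):
    """Keep per-entity state with no time limit.

    Reports an entity the first time all REQUIRED kinds have been seen for it,
    regardless of how far apart the events arrived.
    """
    seen = defaultdict(set)
    matches = []
    for _ts, entity, kind in events:
        was_complete = REQUIRED <= seen[entity]
        seen[entity].add(kind)
        if not was_complete and REQUIRED <= seen[entity]:
            matches.append(entity)
    return matches


if __name__ == "__main__":
    print("one-hour window:", windowed_matches(EVENTS, 3600))
    print("no window:      ", standing_matches(EVENTS))
```

With a one-hour window, only the account whose two events arrived seconds apart is flagged. The unwindowed version also flags the account whose events were six months apart. The trade-off, in this toy model, is that per-entity state has to live somewhere for as long as it might matter.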

Ryan’s talk on “Streaming Entity Resolution for Kafka”

Ryan Wright speaking at Current 2024 stream processing conference

Ryan Wright, our founder and CTO, gave a very cool presentation at Current on entity resolution in stream processing, which caught a lot of attention, especially from data engineers and anyone working toward real-time master data management. The talk is live on the Current site now. Be sure to check it out: https://current.confluent.io/2024-sessions/streaming-entity-resolution-for-kafka-with-quine

To learn more, check out the thatDot Streaming Graph product page.

Or, try the Streaming Graph or Novelty free trial for yourself.

Get the handouts we gave to Current attendees.

And be sure to catch the thatDot team at Current 2025!

Foreground is five people with Ryan Wright, CTO of thatDot at center. Current background with Austin, Texas landmarks.