Modern Data Architecture & Engineering

Data technologies and SaaS platforms are evolving fast to meet the challenge of sophisticated use cases. They are changing the landscape of the modern data stack, from OLTP and data lakehouses to analytical systems, and empowering a new ecosystem in data engineering where batch and real-time systems are converging. But what should you choose? Can we do more with less? Come to this track to learn about fundamental, powerful, yet versatile building blocks and the core engineering principles you can leverage to build a simple yet efficient and scalable data architecture.


From this track

Session Streaming

Laying the Foundations for a Kappa Architecture - The Yellow Brick Road

Tuesday Jun 13 / 10:35AM EDT

In the ever-changing landscape of big data, focus is slowly moving away from batch and towards real-time analytics. Data Science workflows are evolving to adapt to this changing landscape.

Sherin Thomas

Staff Software Engineer @Chime

Session Stream Processing

Streaming from Apache Iceberg - Building Low-Latency and Cost-Effective Data Pipelines

Tuesday Jun 13 / 11:50AM EDT

Apache Flink is a very popular stream processing engine featuring sophisticated state management, event-time semantics, and exactly-once state consistency. For low-latency processing, Flink jobs typically consume data from streaming sources like Apache Kafka.

Steven Wu

Software Engineer @Apple and Apache Iceberg PMC

Session Serverless

The Rise of the Serverless Data Architectures

Tuesday Jun 13 / 01:40PM EDT

For a while, it looked like Serverless was just a convenient way to run stateless functions in the cloud. But in the last year we’ve seen a rapid rise of serverless data stores.

Gwen Shapira

Founder @Nile, PMC Member @Kafka

Session Data Architecture

Building a Large Scale Real-Time Ad Events Processing System

Tuesday Jun 13 / 02:55PM EDT

Two years ago, we embarked on building DoorDash's ad platform from the ground up. Today, our platform handles over 2 trillion events every day and our advertising business has experienced significant growth in recent years, becoming a key area of focus for the company.

Chao Chu

Software Engineer @DoorDash

Session Architecture

Enabling Remote Query Execution Through DuckDB Extensions

Tuesday Jun 13 / 04:10PM EDT

DuckDB is a high-performance, embeddable analytical database system that has gained massive popularity in the last few years.

Stephanie Wang

Founding Engineer @MotherDuck

Session

Unconference: Modern Data Architecture & Engineering

Tuesday Jun 13 / 05:25PM EDT

What is an unconference? An unconference is a participant-driven meeting. Attendees come together, bringing their challenges and relying on the experience and know-how of their peers for solutions.

Ben Linders

Independent Consultant in Agile, Lean, Quality and Continuous Improvement

Date

Tuesday Jun 13 / 10:30AM EDT

Track Host

Allen Wang

Senior Staff Engineer @DoorDash

Allen Wang is a tech lead of the data platform at DoorDash, where he architected and founded the real-time streaming platform.

Prior to joining DoorDash, he was a lead in the real-time data infrastructure team at Netflix, where he created the Kafka infrastructure for Netflix’s Keystone data pipeline and shaped the Kafka ecosystem at Netflix.

He was a contributor to Apache Kafka’s rack-aware partition assignment and to cloud platform components in NetflixOSS. He is a regular speaker at QCon, Kafka Summit, and other data conferences and meet-ups.