Data Engineering Podcast

By Tobias Macey

Category: Technology

Subscribers: 762
Reviews: 0
Episodes: 423

Description

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

Episodes (each title is followed by its release date)
Making Email Better With AI At Shortwave
Apr 21, 2024
Designing A Non-Relational Database Engine
Apr 14, 2024
Establish A Single Source Of Truth For Your Data Consumers With A Semantic Layer
Apr 07, 2024
Adding Anomaly Detection And Observability To Your dbt Projects Is Elementary
Mar 31, 2024
Ship Smarter Not Harder With Declarative And Collaborative Data Orchestration On Dagster+
Mar 24, 2024
Reconciling The Data In Your Databases With Datafold
Mar 17, 2024
Version Your Data Lakehouse Like Your Software With Nessie
Mar 10, 2024
When And How To Conduct An AI Program
Mar 03, 2024
Find Out About The Technology Behind The Latest PFAD In Analytical Database Development
Feb 25, 2024
Using Trino And Iceberg As The Foundation Of Your Data Lakehouse
Feb 18, 2024
Data Sharing Across Business And Platform Boundaries
Feb 11, 2024
Tackling Real Time Streaming Data With SQL Using RisingWave
Feb 04, 2024
Build A Data Lake For Your Security Logs With Scanner
Jan 29, 2024
Modern Customer Data Platform Principles
Jan 22, 2024
Pushing The Limits Of Scalability And User Experience For Data Processing With Jignesh Patel
Jan 07, 2024
Designing Data Platforms For Fintech Companies
Jan 01, 2024
Troubleshooting Kafka In Production
Dec 24, 2023
Adding An Easy Mode For The Modern Data Stack With 5X
Dec 18, 2023
Run Your Own Anomaly Detection For Your Critical Business Metrics With Anomstack
Dec 11, 2023
Designing Data Transfer Systems That Scale
Dec 04, 2023
Addressing The Challenges Of Component Integration In Data Platform Architectures
Nov 27, 2023
Unlocking Your dbt Projects With Practical Advice For Practitioners
Nov 20, 2023
Enhancing The Abilities Of Software Engineers With Generative AI At Tabnine
Nov 13, 2023
Shining Some Light In The Black Box Of PostgreSQL Performance
Nov 06, 2023
Surveying The Market Of Database Products
Oct 30, 2023
Defining A Strategy For Your Data Products
Oct 23, 2023
Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable
Oct 15, 2023
Using Data To Illuminate The Intentionally Opaque Insurance Industry
Oct 09, 2023
Building ETL Pipelines With Generative AI
Oct 01, 2023
Powering Vector Search With Real Time And Incremental Vector Indexes
Sep 25, 2023
Building Linked Data Products With JSON-LD
Sep 17, 2023
An Overview Of The State Of Data Orchestration In An Increasingly Complex Data Ecosystem
Sep 10, 2023
Eliminate The Overhead In Your Data Integration With The Open Source dlt Library
Sep 04, 2023
Building An Internal Database As A Service Platform At Cloudflare
Aug 28, 2023
Harnessing Generative AI For Creating Educational Content With Illumidesk
Aug 20, 2023
Unpacking The Seven Principles Of Modern Data Pipelines
Aug 14, 2023
Quantifying The Return On Investment For Your Data Team
Aug 06, 2023
Strategies For A Successful Data Platform Migration
Jul 31, 2023
Build Real Time Applications With Operational Simplicity Using Dozer
Jul 24, 2023
Datapreneurs - How Today's Business Leaders Are Using Data To Define The Future
Jul 17, 2023
Reduce Friction In Your Business Analytics Through Entity Centric Data Modeling
Jul 09, 2023
How Data Engineering Teams Power Machine Learning With Feature Platforms
Jul 03, 2023
Seamless SQL And Python Transformations For Data Engineers And Analysts With SQLMesh
Jun 25, 2023
How Column-Aware Development Tooling Yields Better Data Models
Jun 18, 2023
Build Better Tests For Your dbt Projects With Datafold And data-diff
Jun 11, 2023
Reduce The Overhead In Your Pipelines With Agile Data Engine's DataOps Service
Jun 04, 2023
A Roadmap To Bootstrapping The Data Team At Your Startup
May 29, 2023
Keep Your Data Lake Fresh With Real Time Streams Using Estuary
May 21, 2023
What Happens When The Abstractions Leak On Your Data
May 15, 2023
Use Consistent And Up To Date Customer Profiles To Power Your Business With Segment Unify
May 07, 2023
Realtime Data Applications Made Easier With Meroxa
Apr 24, 2023
Building Self Serve Business Intelligence With AI And Semantic Modeling At Zenlytic
Apr 16, 2023
An Exploration Of The Composable Customer Data Platform
Apr 10, 2023
Mapping The Data Infrastructure Landscape As A Venture Capitalist
Apr 03, 2023
Unlocking The Potential Of Streaming Data Applications Without The Operational Headache At Grainite
Mar 25, 2023
Aligning Data Security With Business Productivity To Deploy Analytics Safely And At Speed
Mar 19, 2023
Use Your Data Warehouse To Power Your Product Analytics With NetSpring
Mar 10, 2023
Exploring The Nuances Of Building An Intentional Data Culture
Mar 06, 2023
Building A Data Mesh Platform At PayPal
Feb 27, 2023
The View Below The Waterline Of Apache Iceberg And How It Fits In Your Data Lakehouse
Feb 19, 2023
Let The Whole Team Participate In Data With The Quilt Versioned Data Hub
Feb 11, 2023
Reflecting On The Past 6 Years Of Data Engineering
Feb 06, 2023
Let Your Business Intelligence Platform Build The Models Automatically With Omni Analytics
Jan 30, 2023
Safely Test Your Applications And Analytics With Production Quality Data Using Tonic AI
Jan 22, 2023
Building Applications With Data As Code On The DataOS
Jan 16, 2023
Automate Your Pipeline Creation For Streaming Data Transformations With SQLake
Jan 08, 2023
Increase Your Odds Of Success For Analytics And AI Through More Effective Knowledge Management With AlignAI
Dec 29, 2022
Using Product Driven Development To Improve The Productivity And Effectiveness Of Your Data Teams
Dec 29, 2022
Simple And Scalable Encryption Of Data In Use For Analytics And Machine Learning With Opaque Systems
Dec 26, 2022
An Exploration Of Tobias' Experience In Building A Data Lakehouse From Scratch
Dec 26, 2022
Making Sense Of The Technical And Organizational Considerations Of Data Contracts
Dec 19, 2022
Revisit The Fundamental Principles Of Working With Data To Avoid Getting Caught In The Hype Cycle
Dec 19, 2022
Convert Your Unstructured Data To Embedding Vectors For More Efficient Machine Learning With Towhee
Dec 12, 2022
Run Your Applications Worldwide Without Worrying About The Database With Planetscale
Dec 12, 2022
Business Intelligence In The Palm Of Your Hand With Zing Data
Dec 05, 2022
Adopting Real-Time Data At Organizations Of Every Size
Dec 05, 2022
Supporting And Expanding The Arrow Ecosystem For Fast And Efficient Data Processing At Voltron Data
Nov 28, 2022
Analyze Massive Data At Interactive Speeds With The Power Of Bitmaps Using FeatureBase
Nov 28, 2022
A Look At The Data Systems Behind The Gameplay For League Of Legends
Nov 21, 2022
Tame The Entropy In Your Data Stack And Prevent Failures With Sifflet
Nov 21, 2022
Build Data Products Without A Data Team Using AgileData
Nov 14, 2022
Taking A Look Under The Hood At CreditKarma's Data Platform
Nov 14, 2022
Build Better Data Products By Creating Data, Not Consuming It
Nov 07, 2022
Clean Up Your Data Using Scalable Entity Resolution And Data Mastering With Zingg
Nov 07, 2022
Expanding The Reach of Business Intelligence Through Ubiquitous Embedded Analytics With Sisense
Oct 31, 2022
Analytics Engineering Without The Friction Of Complex Pipeline Development With Optimus and dbt
Oct 30, 2022
How To Bring Agile Practices To Your Data Projects
Oct 23, 2022
Going From Transactional To Analytical And Self-managed To Cloud On One Database With MariaDB
Oct 23, 2022
Speeding Up The Time To Insight For Supply Chains And Logistics With The Pathway Database That Thinks
Oct 16, 2022
An Exploration Of The Open Data Lakehouse And Dremio's Contribution To The Ecosystem
Oct 16, 2022
Making The Open Data Lakehouse Affordable Without The Overhead At Iomete
Oct 10, 2022
Investing In Understanding The Customer Journey At American Express
Oct 10, 2022
Gain Visibility And Insight Into Your Supply Chains Through Operational Analytics Powered By Roambee
Oct 03, 2022
Make Data Lineage A Ubiquitous Part Of Your Work By Simplifying Its Implementation With Alvin
Oct 03, 2022
Power Your Real-Time Analytics Without The Headache Using Fivetran's Change Data Capture Integrations
Sep 26, 2022
Build A Common Understanding Of Your Data Reliability Rules With Soda Core and Soda Checks Language
Sep 26, 2022
Building A Shared Understanding Of Data Assets In A Business Through A Single Pane Of Glass With Workstream
Sep 19, 2022
Operational Analytics To Increase Efficiency For Multi-Location Businesses With OpsAnalitica
Sep 19, 2022
Building Data Pipelines That Run From Source To Analysis And Activation With Hevo Data
Sep 12, 2022
Build Confidence In Your Data Platform With Schema Compatibility Reports That Span Systems And Domains Using Schemata
Sep 12, 2022
A Reflection On Data Observability As It Reaches Broader Adoption
Sep 05, 2022
Introduce Climate Analytics Into Your Data Platform Without The Heavy Lifting Using Sust Global
Sep 05, 2022
An Exploration Of What Data Automation Can Provide To Data Engineers And Ascend's Journey To Make It A Reality
Aug 29, 2022
Alumni Of AirBnB's Early Years Reflect On What They Learned About Building Data Driven Organizations
Aug 28, 2022
An Exploration Of The Expectations, Ecosystem, and Realities Of Real-Time Data Applications
Aug 22, 2022
Understanding The Role Of The Chief Data Officer
Aug 22, 2022
Bringing Automation To Data Labeling For Machine Learning With Watchful
Aug 14, 2022
Collecting And Retaining Contextual Metadata For Powerful And Effective Data Discovery
Aug 14, 2022
Useful Lessons And Repeatable Patterns Learned From Data Mesh Implementations At AgileLab
Aug 06, 2022
Optimize Your Machine Learning Development And Serving With The Open Source Vector Database Milvus
Aug 06, 2022
Interactive Exploratory Data Analysis On Petabyte Scale Data Sets With Arkouda
Jul 31, 2022
What "Data Lineage Done Right" Looks Like And How They're Doing It At Manta
Jul 31, 2022
Writing The Book That Offers A Single Reference For The Fundamentals Of Data Engineering
Jul 24, 2022
Re-Bundling The Data Stack With Data Orchestration And Software Defined Assets Using Dagster
Jul 24, 2022
Making The Total Cost Of Ownership For External Data Manageable With Crux
Jul 17, 2022
Joe Reis Flips The Script And Interviews Tobias Macey About The Data Engineering Podcast
Jul 17, 2022
Charting the Path of Riskified's Data Platform Journey
Jul 10, 2022
Maintain Your Data Engineers' Sanity By Embracing Automation
Jul 10, 2022
Be Confident In Your Data Integration By Quickly Validating Matching Records With data-diff
Jul 03, 2022
The View From The Lakehouse Of Architectural Patterns For Your Data Platform
Jul 03, 2022
Bring Geospatial Analytics Across Disparate Datasets Into Your Toolkit With The Unfolded Platform
Jun 27, 2022
Strategies And Tactics For A Successful Master Data Management Implementation
Jun 27, 2022
Combining The Simplicity Of Spreadsheets With The Power Of Modern Data Infrastructure At Canvas
Jun 19, 2022
Level Up Your Data Platform With Active Metadata
Jun 19, 2022
Discover And De-Clutter Your Unstructured Data With Aparavi
Jun 13, 2022
Hire And Scale Your Data Team With Intention
Jun 13, 2022
Simplify Data Security For Sensitive Information With The Skyflow Data Privacy Vault
Jun 06, 2022
Bringing The Modern Data Stack To Everyone With Y42
Jun 06, 2022
A Multipurpose Database For Transactions And Analytics To Simplify Your Data Architecture With Singlestore
May 30, 2022
Data Cloud Cost Optimization With Bluesky Data
May 30, 2022
Unlocking The Value Of Data Across The Organization Through User Friendly Data Tools With Prophecy
May 23, 2022
Cloud Native Data Orchestration For Machine Learning And Data Engineering With Flyte
May 23, 2022
Insights And Advice On Building A Data Lake Platform From Someone Who Learned The Hard Way
May 16, 2022
Designing And Deploying IoT Analytics For Industrial Applications At Vopak
May 16, 2022
Scaling Analysis of Connected Data And Modeling Complex Relationships With The TigerGraph Graph Database
May 09, 2022
Exploring The Insights And Impact Of Dan Delorey's Distinguished Career In Data
May 09, 2022
Leading The Charge For The ELT Data Integration Pattern For Cloud Data Warehouses At Matillion
May 02, 2022
Evolving And Scaling The Data Platform at Yotpo
May 02, 2022
Operational Analytics At Speed With Minimal Busy Work Using Incorta
Apr 24, 2022
Gain Visibility Into Your Entire Machine Learning System Using Data Logging With WhyLogs
Apr 24, 2022
Connecting To The Next Frontier Of Computing With Quantum Networks
Apr 18, 2022
What Does It Really Mean To Do MLOps And What Is The Data Engineer's Role?
Apr 16, 2022
DataOps As A Service For Your Data Integration Workflows With Rivery
Apr 11, 2022
Synthetic Data As A Service For Simplifying Privacy Engineering With Gretel
Apr 10, 2022
Accelerate Development Of Enterprise Analytics With The Coalesce Visual Workflow Builder
Apr 03, 2022
Repeatable Patterns For Designing Data Platforms And When To Customize Them
Apr 03, 2022
Eliminate The Bottlenecks In Your Key/Value Storage With SpeeDB
Mar 27, 2022
Building A Data Governance Bridge Between Cloud And Datacenters For The Enterprise At Privacera
Mar 27, 2022
Exploring Incident Management Strategies For Data Teams
Mar 20, 2022
Accelerate Your Embedded Analytics With Apache Pinot
Mar 20, 2022
Taking A Multidimensional Approach To Data Observability At Acceldata
Mar 14, 2022
Accelerating Adoption Of The Modern Data Stack At 5X Data
Mar 14, 2022
Move Your Database To The Data And Speed Up Your Analytics With DuckDB
Mar 05, 2022
Developer Friendly Application Persistence That Is Fast And Scalable With HarperDB
Mar 05, 2022
Reflections On Designing A Data Platform From Scratch
Feb 28, 2022
Manage Your Unstructured Data Assets Across Cloud And Hybrid Environments With Komprise
Feb 28, 2022
Build Your Python Data Processing Your Way And Run It Anywhere With Fugue
Feb 21, 2022
Understanding The Immune System With Data At ImmunAI
Feb 21, 2022
Bring Your Code To Your Streaming And Static Data Without Effort With The Deephaven Real Time Query Engine
Feb 14, 2022
Build Your Own End To End Customer Data Platform With Rudderstack
Feb 14, 2022
Scale Your Spatial Analysis By Building It In SQL With Syntax Extensions
Feb 07, 2022
Scalable Strategies For Protecting Data Privacy In Your Shared Data Sets
Feb 06, 2022
A Reflection On Learning A Lot More Than 97 Things Every Data Engineer Should Know
Jan 31, 2022
Effective Pandas Patterns For Data Engineering
Jan 31, 2022
Building And Managing Data Teams And Data Platforms In Large Organizations With Ashish Mrig
Jan 23, 2022
The Importance Of Data Contracts As The Interface For Data Integration With Abhi Sivasailam
Jan 23, 2022
Automated Data Quality Management Through Machine Learning With Anomalo
Jan 15, 2022
An Introduction To Data And Analytics Engineering For Non-Programmers
Jan 15, 2022
Open Source Reverse ETL For Everyone With Grouparoo
Jan 08, 2022
Data Observability Out Of The Box With Metaplane
Jan 08, 2022
Creating Shared Context For Your Data Warehouse With A Controlled Vocabulary
Jan 02, 2022
A Reflection On The Data Ecosystem For The Year 2021
Jan 02, 2022
Exploring The Evolving Role Of Data Engineers
Dec 27, 2021
Revisiting The Technical And Social Benefits Of The Data Mesh
Dec 27, 2021
Fast And Flexible Headless Data Analytics With Cube.JS
Dec 21, 2021
Building A System Of Record For Your Organization's Data Ecosystem At Metaphor
Dec 20, 2021
Building Auditable Spark Pipelines At Capital One
Dec 13, 2021
Deliver Personal Experiences In Your Applications With The Unomi Open Source Customer Data Platform
Dec 12, 2021
Data Driven Hiring For Data Professionals With Alooba
Dec 04, 2021
Experimentation and A/B Testing For Modern Data Teams With Eppo
Dec 04, 2021
Creating A Unified Experience For The Modern Data Stack At Mozart Data
Nov 27, 2021
Doing DataOps For External Data Sources As A Service at Demyst
Nov 27, 2021
Exploring Processing Patterns For Streaming Data Integration In Your Data Lake
Nov 20, 2021
Laying The Foundation Of Your Data Platform For The Era Of Big Complexity With Dagster
Nov 20, 2021
Data Quality Starts At The Source
Nov 14, 2021
Eliminate Friction In Your Data Platform Through Unified Metadata Using OpenMetadata
Nov 10, 2021
Business Intelligence Beyond The Dashboard With ClicData
Nov 06, 2021
Exploring The Evolution And Adoption of Customer Data Platforms and Reverse ETL
Nov 05, 2021
Removing The Barrier To Exploratory Analytics with Activity Schema and Narrator
Oct 29, 2021
Streaming Data Pipelines Made SQL With Decodable
Oct 29, 2021
Data Exploration For Business Users Powered By Analytics Engineering With Lightdash
Oct 23, 2021
Completing The Feedback Loop Of Data Through Operational Analytics With Census
Oct 21, 2021
Bringing The Power Of The DataHub Real-Time Metadata Graph To Everyone At Acryl Data
Oct 16, 2021
How And Why To Become Data Driven As A Business
Oct 14, 2021
Make Your Business Metrics Reusable With Open Source Headless BI Using Metriql
Oct 08, 2021
Adding Support For Distributed Transactions To The Redpanda Streaming Engine
Oct 06, 2021
Building Real-Time Data Platforms For Large Volumes Of Information With Aerospike
Oct 02, 2021
Delivering Your Personal Data Cloud With Prifina
Sep 30, 2021
Digging Into Data Reliability Engineering
Sep 26, 2021
Massively Parallel Data Processing In Python Without The Effort Using Bodo
Sep 25, 2021
Declarative Machine Learning Without The Operational Overhead Using Continual
Sep 19, 2021
An Exploration Of The Data Engineering Requirements For Bioinformatics
Sep 19, 2021
Setting The Stage For The Next Chapter Of The Cassandra Database
Sep 12, 2021
A View From The Round Table Of Gartner's Cool Vendors
Sep 09, 2021
Designing And Building Data Platforms As A Product
Sep 04, 2021
Presto Powered Cloud Data Lakes At Speed Made Easy With Ahana
Sep 02, 2021
Do Away With Data Integration Through A Dataware Architecture With Cinchy
Aug 28, 2021
Decoupling Data Operations From Data Infrastructure Using Nexla
Aug 25, 2021
Let Your Analysts Build A Data Lakehouse With Cuelake
Aug 21, 2021
Migrate And Modify Your Data Platform Confidently With Compilerworks
Aug 18, 2021
Prepare Your Unstructured Data For Machine Learning And Computer Vision Without The Toil Using Activeloop
Aug 15, 2021
Build Trust In Your Data By Understanding Where It Comes From And How It Is Used With Stemma
Aug 10, 2021
Data Discovery From Dashboards To Databases With Castor
Aug 07, 2021
Charting A Path For Streaming Data To Fill Your Data Lake With Hudi
Aug 03, 2021
Adding Context And Comprehension To Your Analytics Through Data Discovery With SelectStar
Jul 31, 2021
Building a Multi-Tenant Managed Platform For Streaming Data With Pulsar at Datastax
Jul 28, 2021
Bringing The Metrics Layer To The Masses With Transform
Jul 23, 2021
Strategies For Proactive Data Quality Management
Jul 20, 2021
Low Code And High Quality Data Engineering For The Whole Organization With Prophecy
Jul 16, 2021
Exploring The Design And Benefits Of The Modern Data Stack
Jul 13, 2021
Democratize Data Cleaning Across Your Organization With Trifacta
Jul 09, 2021
Stick All Of Your Systems And Data Together With SaaSGlue As Your Workflow Manager
Jul 05, 2021
Leveling Up Open Source Data Integration With Meltano Hub And The Singer SDK
Jul 03, 2021
A Candid Exploration Of Timeseries Data Analysis With InfluxDB
Jun 29, 2021
Lessons Learned From The Pipeline Data Engineering Academy
Jun 26, 2021
Make Database Performance Optimization A Playful Experience With OtterTune
Jun 23, 2021
Bring Order To The Chaos Of Your Unstructured Data Assets With Unstruk
Jun 18, 2021
Accelerating ML Training And Delivery With In-Database Machine Learning
Jun 15, 2021
Taking A Tour Of The Google Cloud Platform For Data And Analytics
Jun 12, 2021
Make Sure Your Records Are Reliable With The BookKeeper Distributed Storage Layer
Jun 09, 2021
Build Your Analytics With A Collaborative And Expressive SQL IDE Using Querybook
Jun 03, 2021
Making Data Pipelines Self-Serve For Everyone With Shipyard
Jun 02, 2021
Paving The Road For Fast Analytics On Distributed Clouds With The Yellowbrick Data Warehouse
May 28, 2021
Easily Build Advanced Similarity Search With The Pinecone Vector Database
May 25, 2021
A Holistic Approach To Data Governance Through Self Reflection At Collibra
May 21, 2021
Unlocking The Power of Data Lineage In Your Platform with OpenLineage
May 18, 2021
Building Your Data Warehouse On Top Of PostgreSQL
May 14, 2021
Making Analytical APIs Fast With Tinybird
May 11, 2021
Making Spark Cloud Native At Data Mechanics
May 07, 2021
The Grand Vision And Present Reality of DataOps
May 04, 2021
Self Service Data Exploration And Dashboarding With Superset
Apr 27, 2021
Moving Machine Learning Into The Data Pipeline at Cherre
Apr 20, 2021
Exploring The Expanding Landscape Of Data Professions with Josh Benamram of Databand
Apr 13, 2021
Put Your Whole Data Team On The Same Page With Atlan
Apr 06, 2021
Data Quality Management For The Whole Team With Soda Data
Mar 30, 2021
Real World Change Data Capture At Datacoral
Mar 23, 2021
Managing The DoorDash Data Platform
Mar 16, 2021
Leave Your Data Where It Is And Automate Feature Extraction With Molecula
Mar 09, 2021
Bridging The Gap Between Machine Learning And Operations At Iguazio
Mar 02, 2021
Self Service Open Source Data Integration With AirByte
Feb 23, 2021
Building The Foundations For Data Driven Businesses at 5xData
Feb 16, 2021
How Shopify Is Building Their Production Data Warehouse Using DBT
Feb 09, 2021
System Observability For The Cloud Native Era With Chronosphere
Feb 02, 2021
Making It Easier To Stick B2B Data Integration Pipelines Together With Hotglue
Jan 26, 2021
Using Your Data Warehouse As The Source Of Truth For Customer Data With Hightouch
Jan 19, 2021
Enabling Version Controlled Data Collaboration With TerminusDB
Jan 11, 2021
Bringing Feature Stores and MLOps to the Enterprise at Tecton
Jan 05, 2021
Off The Shelf Data Governance With Satori
Dec 28, 2020
Low Friction Data Governance With Immuta
Dec 21, 2020
Building A Self Service Data Platform For Alternative Data Analytics At YipitData
Dec 15, 2020
Proven Patterns For Building Successful Data Teams
Dec 07, 2020
Streaming Data Integration Without The Code at Equalum
Nov 30, 2020
Keeping A Bigeye On The Data Quality Market
Nov 23, 2020
Self Service Data Management From Ingest To Insights With Isima
Nov 17, 2020
Building A Cost Effective Data Catalog With Tree Schema
Nov 10, 2020
Add Version Control To Your Data Lake With LakeFS
Nov 03, 2020
Cloud Native Data Security As Code With Cyral
Oct 26, 2020
Better Data Quality Through Observability With Monte Carlo
Oct 19, 2020
Rapid Delivery Of Business Intelligence Using Power BI
Oct 12, 2020
Self Service Real Time Data Integration Without The Headaches With Meroxa
Oct 05, 2020
Speed Up And Simplify Your Streaming Data Workloads With Red Panda
Sep 29, 2020
Cutting Through The Noise And Focusing On The Fundamentals Of Data Engineering With The Data Janitor
Sep 22, 2020
Distributed In Memory Processing And Streaming With Hazelcast
Sep 15, 2020
Simplify Your Data Architecture With The Presto Distributed SQL Engine
Sep 07, 2020
Building A Better Data Warehouse For The Cloud At Firebolt
Sep 01, 2020
Metadata Management And Integration At LinkedIn With DataHub
Aug 25, 2020
Exploring The TileDB Universal Data Engine
Aug 17, 2020
Closing The Loop On Event Data Collection With Iteratively
Aug 10, 2020
A Practical Introduction To Graph Data Applications
Aug 04, 2020
Build More Reliable Distributed Systems By Breaking Them With Jepsen
Jul 28, 2020
Making Wind Energy More Efficient With Data At Turbit Systems
Jul 21, 2020
Open Source Production Grade Data Integration With Meltano
Jul 13, 2020
DataOps For Streaming Systems With Lenses.io
Jul 06, 2020
Data Collection And Management To Power Sound Recognition At Audio Analytic
Jun 30, 2020
Bringing Business Analytics To End Users With GoodData
Jun 23, 2020
Accelerate Your Machine Learning With The StreamSQL Feature Store
Jun 15, 2020
Data Management Trends From An Investor Perspective
Jun 08, 2020
Building A Data Lake For The Database Administrator At Upsolver
Jun 02, 2020
Mapping The Customer Journey For B2B Companies At Dreamdata
May 25, 2020
Power Up Your PostgreSQL Analytics With Swarm64
May 18, 2020
StreamNative Brings Streaming Data To The Cloud Native Landscape With Pulsar
May 11, 2020
Enterprise Data Operations And Orchestration At Infoworks
May 04, 2020
Taming Complexity In Your Data Driven Organization With DataOps
Apr 28, 2020
Building Real Time Applications On Streaming Data With Eventador
Apr 20, 2020
Making Data Collection In Your Code Easy With Rookout
Apr 14, 2020
Building A Knowledge Graph Of Commercial Real Estate At Cherre
Apr 07, 2020
The Life Of A Non-Profit Data Professional
Mar 30, 2020
Behind The Scenes Of The Linode Object Storage Service
Mar 23, 2020
Building A New Foundation For CouchDB
Mar 17, 2020
Scaling Data Governance For Global Businesses With A Data Hub Architecture
Mar 09, 2020
Easier Stream Processing On Kafka With ksqlDB
Mar 02, 2020
Shining A Light on Shadow IT In Data And Analytics
Feb 25, 2020
Data Infrastructure Automation For Private SaaS At Snowplow
Feb 18, 2020
Data Modeling That Evolves With Your Business Using Data Vault
Feb 09, 2020
The Benefits And Challenges Of Building A Data Trust
Feb 03, 2020
Pay Down Technical Debt In Your Data Pipeline With Great Expectations
Jan 27, 2020
Replatforming Production Dataflows
Jan 20, 2020
Planet Scale SQL For The New Generation Of Applications With YugabyteDB
Jan 13, 2020
Change Data Capture For All Of Your Databases With Debezium
Jan 06, 2020
Building The DataDog Platform For Processing Timeseries Data At Massive Scale
Dec 30, 2019
Building The Materialize Engine For Interactive Streaming Analytics In SQL
Dec 23, 2019
Solving Data Lineage Tracking And Data Discovery At WeWork
Dec 16, 2019
SnowflakeDB: The Data Warehouse Built For The Cloud
Dec 09, 2019
Organizing And Empowering Data Engineers At Citadel
Dec 03, 2019
Building A Real Time Event Data Warehouse For Sentry
Nov 26, 2019
Escaping Analysis Paralysis For Your Data Platform With Data Virtualization
Nov 18, 2019
Designing For Data Protection
Nov 11, 2019
Automating Your Production Dataflows On Spark
Nov 04, 2019
Build Maintainable And Testable Data Applications With Dagster
Oct 28, 2019
Data Orchestration For Hybrid Cloud Analytics
Oct 22, 2019
Keeping Your Data Warehouse In Order With DataForm
Oct 15, 2019
Fast Analytics On Semi-Structured And Structured Data In The Cloud
Oct 08, 2019
Ship Faster With An Opinionated Data Pipeline Framework
Oct 01, 2019
Open Source Object Storage For All Of Your Data
Sep 23, 2019
Navigating Boundless Data Streams With The Swim Kernel
Sep 18, 2019
Building A Reliable And Performant Router For Observability Data
Sep 10, 2019
Building A Community For Data Professionals at Data Council
Sep 02, 2019
Building Tools And Platforms For Data Analytics
Aug 26, 2019
A High Performance Platform For The Full Big Data Lifecycle
Aug 19, 2019
Digging Into Data Replication At Fivetran
Aug 12, 2019
Solving Data Discovery At Lyft
Aug 05, 2019
Simplifying Data Integration Through Eventual Connectivity
Jul 29, 2019
Straining Your Data Lake Through A Data Mesh
Jul 22, 2019
Data Labeling That You Can Feel Good About With CloudFactory
Jul 15, 2019
Scale Your Analytics On The Clickhouse Data Warehouse
Jul 08, 2019
Stress Testing Kafka And Cassandra For Real-Time Anomaly Detection
Jul 02, 2019
The Workflow Engine For Data Engineers And Data Scientists
Jun 25, 2019
Maintaining Your Data Lake At Scale With Spark
Jun 17, 2019
Managing The Machine Learning Lifecycle
Jun 10, 2019
Evolving An ETL Pipeline For Better Productivity
Jun 04, 2019
Data Lineage For Your Pipelines
May 27, 2019
Build Your Data Analytics Like An Engineer With DBT
May 20, 2019
Using FoundationDB As The Bedrock For Your Distributed Systems
May 07, 2019
Running Your Database On Kubernetes With KubeDB
Apr 29, 2019
Unpacking Fauna: A Global Scale Cloud Native Database
Apr 22, 2019
Index Your Big Data With Pilosa For Faster Analytics
Apr 15, 2019
Serverless Data Pipelines On DataCoral
Apr 08, 2019
Why Analytics Projects Fail And What To Do About It
Apr 01, 2019
Building An Enterprise Data Fabric At CluedIn
Mar 25, 2019
A DataOps vs DevOps Cookoff In The Data Kitchen
Mar 18, 2019
Customer Analytics At Scale With Segment
Mar 04, 2019
Deep Learning For Data Engineers
Feb 25, 2019
Speed Up Your Analytics With The Alluxio Distributed Storage System
Feb 19, 2019
Machine Learning In The Enterprise
Feb 11, 2019
Cleaning And Curating Open Data For Archaeology
Feb 04, 2019
Managing Database Access Control For Teams With strongDM
Jan 29, 2019
Building Enterprise Big Data Systems At LEGO
Jan 21, 2019
TimescaleDB: The Timeseries Database Built For SQL And Scale - Episode 65
Jan 14, 2019
Performing Fast Data Analytics Using Apache Kudu - Episode 64
Jan 07, 2019
Simplifying Continuous Data Processing Using Stream Native Storage In Pravega with Tom Kaitchuck - Episode 63
Dec 31, 2018
Continuously Query Your Time-Series Data Using PipelineDB with Derek Nelson and Usman Masood - Episode 62
Dec 24, 2018
Advice On Scaling Your Data Pipeline Alongside Your Business with Christian Heinzmann - Episode 61
Dec 17, 2018
Putting Apache Spark Into Action with Jean Georges Perrin - Episode 60
Dec 10, 2018
Apache Zookeeper As A Building Block For Distributed Systems with Patrick Hunt - Episode 59
Dec 03, 2018
Set Up Your Own Data-as-a-Service Platform On Dremio with Tomer Shiran - Episode 58
Nov 26, 2018
Stateful, Distributed Stream Processing on Flink with Fabian Hueske - Episode 57
Nov 19, 2018
How Upsolver Is Building A Data Lake Platform In The Cloud with Yoni Iny - Episode 56
Nov 11, 2018
Self Service Business Intelligence And Data Sharing Using Looker with Daniel Mintz - Episode 55
Nov 05, 2018
Using Notebooks As The Unifying Layer For Data Roles At Netflix with Matthew Seal - Episode 54
Oct 29, 2018
Of Checklists, Ethics, and Data with Emily Miller and Peter Bull (Cross Post from Podcast.__init__) - Episode 53
Oct 22, 2018
Improving The Performance Of Cloud-Native Big Data At Netflix Using The Iceberg Table Format with Ryan Blue - Episode 52
Oct 15, 2018
Combining Transactional And Analytical Workloads On MemSQL with Nikita Shamgunov - Episode 51
Oct 09, 2018
Building A Knowledge Graph From Public Data At Enigma With Chris Groskopf - Episode 50
Oct 01, 2018
A Primer On Enterprise Data Curation with Todd Walter - Episode 49
Sep 24, 2018
Take Control Of Your Web Analytics Using Snowplow With Alexander Dean - Episode 48
Sep 17, 2018
Keep Your Data And Query It Too Using Chaos Search with Thomas Hazel and Pete Cheslock - Episode 47
Sep 10, 2018
An Agile Approach To Master Data Management with Mark Marinelli - Episode 46
Sep 03, 2018
Protecting Your Data In Use At Enveil with Ellison Anne Williams - Episode 45
Aug 27, 2018
Graph Databases In Production At Scale Using DGraph with Manish Jain - Episode 44
Aug 20, 2018
Putting Airflow Into Production With James Meickle - Episode 43
Aug 13, 2018
Taking A Tour Of PostgreSQL with Jonathan Katz - Episode 42
Aug 06, 2018
Mobile Data Collection And Analysis Using Ona And Canopy With Peter Lubell-Doughtie - Episode 41
Jul 30, 2018
Ceph: A Reliable And Scalable Distributed Filesystem with Sage Weil - Episode 40
Jul 16, 2018
Building Data Flows In Apache NiFi With Kevin Doran and Andy LoPresto - Episode 39
Jul 08, 2018
Leveraging Human Intelligence For Better AI At Alegion With Cheryl Martin - Episode 38
Jul 02, 2018
Package Management And Distribution For Your Data Using Quilt with Kevin Moore - Episode 37
Jun 25, 2018
User Analytics In Depth At Heap with Dan Robinson - Episode 36
Jun 17, 2018
CockroachDB In Depth with Peter Mattis - Episode 35
Jun 11, 2018
ArangoDB: Fast, Scalable, and Multi-Model Data Storage with Jan Steeman and Jan Stücke - Episode 34
Jun 04, 2018
The Alooma Data Pipeline With CTO Yair Weinberger - Episode 33
May 28, 2018
PrestoDB and Starburst Data with Kamil Bajda-Pawlikowski - Episode 32
May 21, 2018
Brief Conversations From The Open Data Science Conference: Part 2 - Episode 31
May 14, 2018
Brief Conversations From The Open Data Science Conference: Part 1 - Episode 30
May 07, 2018
Metabase Self Service Business Intelligence with Sameer Al-Sakran - Episode 29
Apr 30, 2018
Octopai: Metadata Management for Better Business Intelligence with Amnon Drori - Episode 28
Apr 23, 2018
Data Engineering Weekly with Joe Crobak - Episode 27
Apr 15, 2018
Defining DataOps with Chris Bergh - Episode 26
Apr 08, 2018
ThreatStack: Data Driven Cloud Security with Pete Cheslock and Patrick Cable - Episode 25
Apr 01, 2018
MarketStore: Managing Timeseries Financial Data with Hitoshi Harada and Christopher Ryan - Episode 24
Mar 25, 2018
Stretching The Elastic Stack with Philipp Krenn - Episode 23
Mar 19, 2018
Database Refactoring Patterns with Pramod Sadalage - Episode 22
Mar 12, 2018
The Future Data Economy with Roger Chen - Episode 21
Mar 05, 2018
Honeycomb Data Infrastructure with Sam Stokes - Episode 20
Feb 26, 2018
Data Teams with Will McGinnis - Episode 19
Feb 19, 2018
TimescaleDB: Fast And Scalable Timeseries with Ajay Kulkarni and Mike Freedman - Episode 18
Feb 11, 2018
Pulsar: Fast And Scalable Messaging with Rajan Dhabalia and Matteo Merli - Episode 17
Feb 04, 2018
Dat: Distributed Versioned Data Sharing with Danielle Robinson and Joe Hand - Episode 16
Jan 29, 2018
Snorkel: Extracting Value From Dark Data with Alex Ratner - Episode 15
Jan 22, 2018
CRDTs and Distributed Consensus with Christopher Meiklejohn - Episode 14
Jan 15, 2018
Citus Data: Distributed PostgreSQL for Big Data with Ozgun Erdogan and Craig Kerstiens - Episode 13
Jan 08, 2018
Wallaroo with Sean T. Allen - Episode 12
Dec 25, 2017
SiriDB: Scalable Open Source Timeseries Database with Jeroen van der Heijden - Episode 11
Dec 18, 2017
Confluent Schema Registry with Ewen Cheslack-Postava - Episode 10
Dec 10, 2017
data.world with Bryon Jacob - Episode 9
Dec 03, 2017
Data Serialization Formats with Doug Cutting and Julien Le Dem - Episode 8
Nov 22, 2017
Buzzfeed Data Infrastructure with Walter Menendez - Episode 7
Nov 14, 2017
Astronomer with Ry Walker - Episode 6
Aug 06, 2017
Rebuilding Yelp's Data Pipeline with Justin Cunningham - Episode 5
Jun 18, 2017
ScyllaDB with Eyal Gutkind - Episode 4
Mar 18, 2017
Defining Data Engineering with Maxime Beauchemin - Episode 3
Mar 05, 2017
Dask with Matthew Rocklin - Episode 2
Jan 22, 2017
Pachyderm with Daniel Whitenack - Episode 1
Jan 14, 2017
Introducing The Show
Jan 08, 2017