Mahdi Esmailoghli
Postdoctoral Researcher, Data Systems Group · University of Waterloo
Email · Google Scholar · GitHub · LinkedIn
I am a postdoctoral researcher in the Data Systems Group at the University of Waterloo, working with Renée J. Miller. Before Waterloo I was a postdoc at Humboldt-Universität zu Berlin with Matthias Weidlich, and I completed my Ph.D. (summa cum laude) at TU Berlin under Ziawasch Abedjan.
I work on data discovery and integration over large, heterogeneous data lakes — how to find the right tables among millions, join and union them, and reason about how their content changes over time. Recently I have been building systems for temporally-valid data discovery and for helping scientists design data-analysis workflows.
Data discovery · Data lakes · Table & dataset search · Data integration · Data preparation · Temporal data · Scientific workflows
News
- Jun 2026 · Invited talk on FlowPilot at HILDA @ SIGMOD 2026, Bangalore.
- 2026 · FlowPilot accepted at SIGMOD 2026; Every Data Lake Has a Past accepted at DOLAP 2026.
- May 2026 · Started as a postdoctoral researcher in Renée J. Miller's Data Systems Group at the University of Waterloo.
- 2026 · Awarded a DAAD PPP Joint Research Cooperation Grant as Principal Investigator (2026–2027).
Publications
- Every Data Lake Has a Past: Analytical Exploration of Wikipedia History as a Temporal Data Lake · DOLAP 2026
- The Past Still Matters: A Temporally-Valid Data Discovery System · Preprint, 2026 · under submission
- FlowPilot: A Suggestion System for Designing Scientific Workflows · SIGMOD 2026
- Data Discovery in Data Lakes: Operations, Indexes, Systems · ICDE 2026 · tutorial
- Data Discovery in Data Lakes: Operations, Indexes, Systems · PVLDB 2025 · tutorial
- Blend: A Unified Data Discovery System · ICDE 2025
- Demonstrating MATE and COCOA for Data Discovery · SIGMOD 2023 · demonstration
- Duplicate Table Discovery with Xash · BTW 2023
- MATE: Multi-Attribute Table Extraction · PVLDB 2022
- COCOA: Correlation Coefficient-Aware Data Augmentation · EDBT 2021
- Combining Programming-by-Example with Transformation Discovery from Large Databases · BTW 2021
- CAFE: Constraint-Aware Feature Extraction from Large Databases · CIDR 2020
- Particulate Matter Matters — The Data Science Challenge @ BTW 2019 · Datenbank-Spektrum 2019
Invited Talks
- FlowPilot: A Suggestion System for Designing Scientific Workflows · HILDA @ SIGMOD 2026, Bangalore · Jun 2026
- Lost in a Haystack of Data Lakes: Searching for the Needle · University of Waterloo & York University, Toronto · Sep 2025
- A Unified Data Discovery System · Hasso Plattner Institute (HPI), Potsdam · May 2024
- MATE: Multi-Attribute Table Extraction · TaDA @ VLDB 2023, Vancouver · Northeastern University (Khoury), 2023
Teaching & Mentoring
- Programming Lab: Data Systems · TU Berlin · SS 2024
- Big-Data Technologies · Leibniz University Hannover · WS 2022, WS 2023
- Data Science Foundation · Leibniz University Hannover · SS 2022, SS 2023
- Advanced Topics in Database Systems · Leibniz University Hannover · SS & WS 2021
- Data Science 1: Essentials of Data Programming · TU Berlin · WS 2019, SS 2020
- Data Science Application · TU Berlin · SS 2019, SS 2020
- Data Intensive Computing · Amirkabir University of Technology · 2017
Co-supervised 15+ B.Sc. and M.Sc. theses in data integration and data discovery, two of which led to peer-reviewed papers.
Awards & Funding
- DAAD PPP Joint Research Cooperation Grant — Principal Investigator · 2026–2027, €34,499 · “Data Discovery in the Presence of Temporal Drifts”
- Ph.D. awarded summa cum laude (highest honour) · TU Berlin · 2024
- Distinguished Referee · CIKM 2023
- GI Data Science Challenge, 1st Prize · BTW 2023 (Dresden) & BTW 2019 (Rostock)
- Ranked #4, 19th National Computer Olympiad of Iran · 2014
Academic Service
- Program Committee — SIGMOD 2026, VLDB 2026, ICDE 2025, CIKM 2024, CIKM 2023, TaDA 2026
- Journal Reviewer — Information Systems (2025)
- W3 professorship hiring committee (Software Quality), Humboldt-Universität zu Berlin
- Member of three Ph.D. examination committees and a thesis advisory committee, Humboldt-Universität zu Berlin
Contact
mahdi.esmailoghli@uwaterloo.ca
Data Systems Group, University of Waterloo, Waterloo, ON, Canada