Presto Blog - PrestoDB

Understanding Presto UI: A Deep Dive into the Web Interface Architecture

By Saurabh Mahawar December 1, 2025

Presto UI is a modern, React-based web interface that provides real-time monitoring, query management, and cluster administration capabilities for the Presto distributed SQL query engine. Whether you’re a database administrator, data engineer, or developer, Presto UI offers intuitive tools to visualize query execution, monitor cluster health, and interact with the Presto coordinator. Key Benefits of…

Seamless Integration: Connecting PrestoDB to SingleStore for High-Performance Analytics

By Saurabh Mahawar September 11, 2025 (updated November 19, 2025)

In today’s data-driven landscape, organizations are constantly seeking ways to analyze massive datasets quickly and efficiently. PrestoDB, a powerful open-source SQL query engine, and SingleStore, a distributed SQL database, are two technologies that, when combined, offer unparalleled capabilities for high-performance data querying and distributed analytics. This guide provides a hands-on, step-by-step tutorial on how to…

Presto Takes a Leap: Upgrading to Java 17 for Enhanced Performance and Security

By Zachary Blanco August 25, 2025

We’re excited to announce that the core Presto engine is migrating to Java 17. This upgrade reinforces our commitment to providing a robust, high-performance, and secure SQL query engine. This change allows Presto to leverage Java 17’s improvements, bringing enhancements in performance, stability, and security, and laying a strong foundation for future upgrades. Why Java…

Prestissimo Extension for AI Training Data Normalization at Meta: A Deep Dive for Developers (Lightning Talk)

By Saurabh Mahawar August 23, 2025

At PrestoCon Day 2025, Meta’s Presto team recently unveiled the Prestissimo extension, a powerful enhancement designed to optimize AI training data normalization. This article explores the technical underpinnings and developer-centric features of this extension, providing a comprehensive understanding of how it supports large-scale AI workloads at Meta. Understanding AI Training Data Storage at Meta At…

Presto C++ Unleashed: Dynamically Load Unfenced UDFs, End Rebuilds, and Boost Performance

By Saurabh Mahawar August 22, 2025

Dynamic loading in Presto C++ is revolutionizing how developers build and deploy user-defined functions (UDFs). At PrestoCon Day 2025, Soumya Duriseti explained how Presto C++ now supports dynamic loading of unfenced UDFs, eliminating the need for time-consuming static builds and making it easier than ever to add custom logic without rebuilding the entire binary….

Building Connectors in Presto C++: Deep Dive into the TPCDS Connector (Lightning Talk)

By Saurabh Mahawar, Pramod Satya & Pratik Joseph Dabre August 20, 2025

At PrestoCon Day 2025, engineers from IBM presented a deep dive into how connectors in Presto C++ extend the engine’s modular capabilities, focusing on the newly implemented TPCDS benchmark connector. Connectors are central to Presto’s architecture, enabling the query engine to communicate seamlessly with external systems such as databases, file formats, or benchmark data generators….

Presto’s Intelligent Future: Leveraging RAG and LLMs for Smarter Query Execution

By Saurabh Mahawar August 12, 2025

At PrestoCon Day 2025, Satej Sahu (Principal Data Engineer at Zalando SE) introduced the Self-Healing Query Connector for Presto, an AI-powered upgrade designed to make query troubleshooting faster, smarter, and more reliable. By combining Large Language Models with live query data, including logs, explain plans, and schema details, it delivers accurate, context-aware solutions that improve…

Revolutionizing Presto C++: Unleashing Native Power with the Sidecar

By Saurabh Mahawar, Pramod Satya & Pratik Joseph Dabre August 10, 2025 (updated August 11, 2025)

At PrestoCon Day 2025, we unveiled the Presto Sidecar, a powerful enhancement for Presto C++ (Velox) clusters that transforms how coordinators interact with native workers. This innovation removes long-standing blind spots in query planning by giving the coordinator real-time visibility into native worker capabilities – such as supported functions, data types, session properties, and plan…

Unlocking Petabyte-Scale Performance: Uber’s Journey with Presto for Distributed Cache using Alluxio

By Saurabh Mahawar August 8, 2025 (updated August 28, 2025)

At PrestoCon Day 2025, Uber presented their innovative solution for optimizing petabyte-scale data analytics by deploying a distributed cache using Alluxio for Presto. Their journey was driven by significant challenges during a massive cloud migration, including read slowness and overwhelming HDFS clusters on-premises, and later high GCS egress costs and file access charges in the…

Unleashing Interactivity: Inside Meta’s Presto-Powered Data Warehouse Innovation

By Saurabh Mahawar August 7, 2025

At this year’s PrestoCon Day, Meta had an awesome session to share the latest on what they’re doing with Presto. As you probably know, Meta has one of the largest data lakehouses in the world and Presto is a critical piece of that data platform. It plays a critical role in serving vast and diverse…

Setting Up Presto with Apache Superset: Hands-On Guide

By Saurabh Mahawar August 7, 2025

PrestoDB, an open-source distributed SQL query engine, allows you to query data from multiple disparate sources. When combined with Apache Superset, an open-source data visualization and exploration platform, it forms a powerful and flexible analytics solution. This guide provides a step-by-step approach to deploying these components within a Dockerized environment, simplifying setup and management. Pre-Requisites:…

Build Your Open Data Lakehouse: A Step-by-Step ETL Guide with MySQL, OLake, and PrestoDB

By Saurabh Mahawar July 29, 2025 (updated August 19, 2025)

This tutorial provides a comprehensive guide to building an Open Data Lakehouse from scratch, a modern and flexible data architecture solution. Open Data Lakehouses offer a powerful and scalable method for storing, managing, and querying both structured and semi-structured data, leveraging a suite of robust open-source tools for enhanced control and flexibility. Pre-Requisites: Before commencing…

Leading by Contribution: IBM’s Ongoing Investment in Open-Source Presto

By Anant Aneja, Yabin Ma, Ali LeClerc & Ethan Zhang July 15, 2025 (updated July 16, 2025)

Note: This is a cross-post from https://community.ibm.com/community/user/blogs/ali-leclerc/2025/07/15/ibms-ongoing-investment-in-presto At IBM, we believe open source is the engine of innovation. Presto, as a fast and flexible SQL engine for interactive analytics, continues to evolve rapidly thanks to community contributions. Over the past year, IBM engineers have focused on driving Presto forward across security, performance, native execution, and…

Setting Up Presto: A Step by Step Installation Guide to Run SQL Queries 🚀

By Saurabh Mahawar July 15, 2025 (updated August 19, 2025)

In this guide, I’ll walk you through installing Presto and show how to run queries with ease. Pre-Requisites 🎯 Before getting started, ensure that the following are installed: Now, let’s see the Step by Step Process to Install Presto. Step – 1: Installing Presto Server 📥 📌 wget – Fetches the specified URL and saves the file locally…

How Twilio Scales Presto with Odin: A New Query Gateway

By Ali LeClerc May 20, 2025

One of my favorite parts of working with the Presto community is seeing how different companies push the project forward in creative ways. Recently, Aakash Pradeep from Twilio shared a great example of this with their development of Odin, a new modular query gateway they built to help scale Presto usage across their organization. I…

Fueling Presto’s Momentum and IBM’s Growing Role in the Open-Source SQL Engine

By Ethan Zhang May 13, 2025 (updated June 11, 2025)

I’ve now been a part of IBM for 2 years and I’m pretty encouraged with the work this team has put into open-source Presto. So, I wanted to take some time to share in a blog what we’ve been up to for the last 2 years and the growth we’ve seen collectively in the community…

Safeguarding Presto C++ Memory Usage with LinuxMemoryChecker

By Minhan Cao & Christian Zentgraf May 6, 2025

Problem: Running the Presto C++ worker stably in a production environment relies on proper configuration that maximizes stability without sacrificing performance. Presto C++ designed a LinuxMemoryChecker to achieve this goal. Velox, the evaluation engine used in Presto C++, implements a MemoryManager that provides several advanced features…

Improving Schema Management in Presto: Passing Catalog Names to the Metastore

By Anurag Dwivedi April 28, 2025 (updated May 6, 2025)

Managing schemas in Presto just got a lot smarter. Thanks to a new enhancement, Presto can now pass catalog names directly to the metastore, enabling better logical organization, filtering, and schema isolation across multiple catalogs. This improvement significantly enhances the experience for users working with Hive, Hudi, Delta, and Iceberg catalogs. 🔍 The Problem Before …