
Data Management

Apache Iceberg gets cloud-based ETL pipeline


Startup Etleap is introducing a cloud-based Extract-Transform-Load (ETL) pipeline for getting data into Apache Iceberg tables.

Apache Iceberg is an open source table format for large-scale datasets in data lakes. It sits above file formats such as Parquet, ORC, and Avro, and cloud object stores such as AWS S3, Azure Blob Storage, and Google Cloud Storage. It brings database-like features to data lakes, such as ACID transactions, partitioning, time travel, and schema evolution. Iceberg tables are used in big data environments and support SQL querying; query engines such as Spark, Trino, Flink, Presto, Hive, Impala, StarRocks, and others can work on the same tables simultaneously.
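The time-travel feature above comes from Iceberg's snapshot model: every commit produces a new immutable snapshot of the table, and older snapshots remain queryable. A toy sketch of that idea in plain Python (a conceptual illustration only, not the Iceberg API):

```python
from copy import deepcopy

class SnapshotTable:
    """Toy model of Iceberg-style snapshots: each commit creates a
    new immutable snapshot, so older table states stay readable
    ("time travel"). Not the real Iceberg API."""

    def __init__(self):
        self.snapshots = []  # list of (snapshot_id, rows)

    def commit(self, rows):
        """Record a new snapshot containing the full row set."""
        snapshot_id = len(self.snapshots)
        self.snapshots.append((snapshot_id, deepcopy(rows)))
        return snapshot_id

    def read(self, snapshot_id=None):
        """Read the latest snapshot, or an older one by id."""
        if not self.snapshots:
            return []
        if snapshot_id is None:
            snapshot_id = self.snapshots[-1][0]
        return self.snapshots[snapshot_id][1]

table = SnapshotTable()
s0 = table.commit([{"id": 1, "state": "new"}])
s1 = table.commit([{"id": 1, "state": "updated"}])

latest = table.read()     # current table state
historic = table.read(s0) # time travel to the first snapshot
```

Real Iceberg tracks snapshots as metadata files pointing at immutable data files rather than copying rows, which is what lets multiple engines read consistent table states concurrently.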


Christian Romming, Etleap CEO and founder, said: “Iceberg delivers major benefits for enterprises, but to realize them in practice requires a managed pipeline system around it. We believe our Iceberg pipeline platform meets this need, allowing data platform teams to adopt Iceberg without building and operating a custom pipeline stack.”

Etleap was founded in 2013 by Romming and, by data analytics startup standards, is lightly funded, having raised some $3.22 million across startup and seed rounds in 2017 and 2018.

Romming says Iceberg doesn’t ingest or model data, manage table operations, or coordinate changes across systems. Users have to build their own set of pipeline functions to hook up data sources to Iceberg and do this, having “to assemble a patchwork of ingestion tools, dbt Core jobs, orchestrators, and custom Iceberg maintenance.”

Now Etleap will do it for you, courtesy of a SaaS service. It’s unifying ingestion, transformation, orchestration, and Iceberg operations into a single, managed system that runs entirely inside a customer’s Virtual Private Cloud (VPC).

However, supported data sources are limited. Currently, only the following are supported as sources for Iceberg pipelines:

  • CDC-enabled databases (CDC = change data capture)
  • S3 sources when the “Trigger transformations through events” pipeline source option is enabled
  • Event Streams
  • Salesforce CDC entities
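Most of the sources above are CDC-based: the pipeline reads a stream of insert/update/delete change events from the source system and replays them onto the target table. A minimal sketch of that replay step in plain Python (conceptual only; real CDC pipelines read the database's transaction log, e.g. via tools like Debezium, and the event schema here is hypothetical):

```python
def apply_cdc_events(table, events):
    """Replay a stream of change-data-capture events onto a table,
    modeled as a dict keyed by primary key. Inserts and updates
    upsert the row; deletes remove it."""
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("insert", "update"):
            table[key] = event["row"]
        elif op == "delete":
            table.pop(key, None)
    return table

events = [
    {"op": "insert", "key": 1, "row": {"name": "a"}},
    {"op": "update", "key": 1, "row": {"name": "b"}},
    {"op": "insert", "key": 2, "row": {"name": "c"}},
    {"op": "delete", "key": 2},
]
result = apply_cdc_events({}, events)  # {1: {'name': 'b'}}
```

The hard parts a managed pipeline handles beyond this loop are ordering guarantees, schema changes in the source, and compacting the resulting small writes into healthy Iceberg files.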

There is a limited set of data transforms available, documented by Etleap. There are also limitations on CDC, event-triggered, and event stream Iceberg pipelines; many of these should be resolved in the future.

Etleap also has pipelines for Amazon Redshift, S3/Glue, and Snowflake. Its Iceberg pipeline platform is available now and is being used by customers to run Iceberg pipelines at scale.