Cross-cloud data sharing is the ability to query data that resides in one cloud provider's object storage (e.g., AWS S3) from a query engine running in a different cloud provider (e.g., Azure or GCP), without first copying the data between clouds. It is increasingly important as organizations adopt multi-cloud strategies, acquire companies that run on different clouds, or partner with organizations on other cloud platforms.
Apache Iceberg as the Cross-Cloud Bridge
Apache Iceberg's architecture is cloud-agnostic by design. Iceberg tables in S3 use the same table format as Iceberg tables in ADLS Gen2 or Google Cloud Storage, and the Iceberg REST Catalog API standardizes how any engine connects to any catalog regardless of cloud provider. As a result, a Dremio Cloud engine on AWS can query an Iceberg table cataloged in a GCS-backed Polaris deployment, and a Spark cluster on Azure can read from an S3-backed Iceberg catalog, both using standard Iceberg REST Catalog connectivity. The main practical challenges are cross-cloud data egress costs and latency. Because both grow with the volume of data read across the cloud boundary, cross-cloud queries are best suited to infrequent analytical use cases that scan modest amounts of data; high-volume operational pipelines should co-locate data and compute instead.
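As a minimal sketch of the two points above, the following Python builds the Spark properties that register an Iceberg REST catalog and makes a back-of-the-envelope egress estimate for a query. The catalog name `shared`, the URI, the warehouse name, and the 0.09 USD per GB rate are illustrative assumptions, not values from any real deployment or price list; a real setup would also need catalog authentication and storage credentials, which are omitted here.

```python
def rest_catalog_spark_conf(name: str, uri: str, warehouse: str) -> dict[str, str]:
    """Spark properties that register an Iceberg REST catalog under `name`."""
    prefix = f"spark.sql.catalog.{name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
    }


def estimate_egress_cost_usd(bytes_scanned: int, usd_per_gb: float) -> float:
    """Rough egress charge for bytes read across clouds (1 GB taken as 10**9 bytes)."""
    return bytes_scanned / 1e9 * usd_per_gb


# Hypothetical endpoint and names, for illustration only.
conf = rest_catalog_spark_conf(
    name="shared",
    uri="https://catalog.example.com/api/catalog",
    warehouse="analytics",
)

# Assumed rate; check the storage provider's current egress pricing.
cost = estimate_egress_cost_usd(bytes_scanned=500 * 10**9, usd_per_gb=0.09)

for key, value in conf.items():
    print(f"{key}={value}")
print(f"estimated egress: ${cost:.2f}")
```

The same properties apply whichever cloud the Spark cluster runs in, which is the portability the REST Catalog API provides. The estimate scales linearly with bytes scanned, which is why a query that is cheap when run occasionally can become expensive as a recurring high-volume pipeline.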

