To write from Dataflow to Apache Iceberg, use the managed I/O connector.
Dependencies
Add the following dependencies to your project:
Java
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-managed</artifactId>
  <version>${beam.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-io-iceberg</artifactId>
  <version>${beam.version}</version>
</dependency>
Configuration
The Apache Iceberg connector uses the following configuration parameters:
table (string). The name of the Apache Iceberg table. Example: "db.table1".
catalog_config (map). The catalog configuration. Contains the following fields:
  catalog_name (string). The name of the catalog. Example: "local".
  catalog_type (string). The type of catalog. Supported values: "hadoop", "hive", "rest".
  warehouse_location (string). The warehouse location. Example: file://path/to/warehouse.
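The parameters above nest: catalog_config is itself a map inside the top-level configuration. As a minimal sketch of that shape, the following plain-Java helper assembles the map and rejects a catalog type that is not in the supported list. The class name IcebergConfigSketch and the method buildConfig are hypothetical names chosen for illustration. They are not part of the connector.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class IcebergConfigSketch {
  // Catalog types listed as supported in the parameter list.
  static final Set<String> SUPPORTED_CATALOG_TYPES = Set.of("hadoop", "hive", "rest");

  /** Builds the nested configuration: "table" plus a "catalog_config" map. */
  static Map<String, Object> buildConfig(
      String table, String catalogName, String catalogType, String warehouseLocation) {
    if (!SUPPORTED_CATALOG_TYPES.contains(catalogType)) {
      throw new IllegalArgumentException("Unsupported catalog_type: " + catalogType);
    }

    // Inner map: the fields of catalog_config.
    Map<String, Object> catalogConfig = new LinkedHashMap<>();
    catalogConfig.put("catalog_name", catalogName);
    catalogConfig.put("catalog_type", catalogType);
    catalogConfig.put("warehouse_location", warehouseLocation);

    // Outer map: the top-level connector parameters.
    Map<String, Object> config = new LinkedHashMap<>();
    config.put("table", table);
    config.put("catalog_config", catalogConfig);
    return config;
  }

  public static void main(String[] args) {
    // Values taken from the examples in the parameter list.
    Map<String, Object> config =
        buildConfig("db.table1", "local", "hadoop", "file://path/to/warehouse");
    System.out.println(config);
  }
}
```

A map built this way is the kind of value that the connector's configuration is expected to accept. Validating catalog_type up front is a design choice in this sketch, so that a typo fails before the pipeline is submitted.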
Example
The following example writes in-memory JSON data to an Apache Iceberg table.
Java
To authenticate to Dataflow, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
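The following is a minimal sketch of such a pipeline, not a definitive implementation. It assumes the two Maven dependencies listed under Dependencies are on the classpath and that the configuration keys match the parameter list under Configuration. The class name ApacheIcebergWrite, the sample rows, the two-field schema, and the hard-coded table and warehouse values are illustrative assumptions. Wrapping the rows in a PCollectionRowTuple under the tag "input" is also an assumption, made so that the sketch does not depend on whether your SDK version lets the managed transform accept a PCollection directly.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.managed.Managed;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.schemas.Schema;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.JsonToRow;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionRowTuple;
import org.apache.beam.sdk.values.Row;

public class ApacheIcebergWrite {
  // In-memory JSON records to write (illustrative sample data).
  static final List<String> TABLE_ROWS = Arrays.asList(
      "{\"id\":0, \"name\":\"Alice\"}",
      "{\"id\":1, \"name\":\"Bob\"}",
      "{\"id\":2, \"name\":\"Charles\"}");

  // Schema used to parse each JSON record into a Beam Row.
  static final Schema SCHEMA = Schema.builder()
      .addInt64Field("id")
      .addStringField("name")
      .build();

  // Builds the connector configuration from the documented parameters.
  // The values are placeholders; replace them with your own.
  static Map<String, Object> buildConfig() {
    Map<String, Object> catalogConfig = new HashMap<>();
    catalogConfig.put("catalog_name", "local");
    catalogConfig.put("catalog_type", "hadoop");
    catalogConfig.put("warehouse_location", "file://path/to/warehouse");

    Map<String, Object> config = new HashMap<>();
    config.put("table", "db.table1");
    config.put("catalog_config", catalogConfig);
    return config;
  }

  public static void main(String[] args) {
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
    Pipeline pipeline = Pipeline.create(options);

    // Create the JSON strings in memory and convert them to Rows.
    PCollection<Row> rows = pipeline
        .apply("CreateJson", Create.of(TABLE_ROWS))
        .apply("ParseJson", JsonToRow.withSchema(SCHEMA));

    // Assumption: pass the rows under the "input" tag to the managed write transform.
    PCollectionRowTuple.of("input", rows)
        .apply("WriteToIceberg", Managed.write(Managed.ICEBERG).withConfig(buildConfig()));

    pipeline.run().waitUntilFinish();
  }
}
```

To run the sketch on Dataflow, pass the usual runner options on the command line, for example --runner=DataflowRunner together with your project and region. A file:// warehouse is only reachable from a local run, so a Dataflow job would need a warehouse location that the workers can access.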