To read from Apache Iceberg in a Dataflow pipeline, use the managed I/O connector.
Dependencies
Add the following dependencies to your project:
Java
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-managed</artifactId>
  <version>${beam.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-io-iceberg</artifactId>
  <version>2.56.0</version>
</dependency>
Configuration
The Apache Iceberg connector uses the following configuration parameters:
table (string). The name of the Apache Iceberg table. Example: "db.table1".
catalog_config (map). The catalog configuration. Contains the following fields:
  catalog_name (string). The name of the catalog. Example: "local".
  catalog_type (string). The type of catalog. Supported values: "hadoop", "hive", "rest".
  warehouse_location (string). The warehouse location. Example: file://path/to/warehouse.
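As a minimal sketch of how these parameters nest, the following helper builds the configuration as a plain Java map, with catalog_config as an inner map. The class IcebergConfigSketch and the method buildConfig are hypothetical names used for illustration and are not part of Apache Beam; the catalog type check simply mirrors the supported values listed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the Apache Beam API.
final class IcebergConfigSketch {
  // Catalog types listed as supported in the configuration reference above.
  private static final Set<String> SUPPORTED_CATALOG_TYPES = Set.of("hadoop", "hive", "rest");

  // Builds the connector configuration: a top-level map with "table"
  // and a nested "catalog_config" map.
  static Map<String, Object> buildConfig(
      String table, String catalogName, String catalogType, String warehouseLocation) {
    if (catalogType == null || !SUPPORTED_CATALOG_TYPES.contains(catalogType)) {
      throw new IllegalArgumentException("Unsupported catalog_type: " + catalogType);
    }

    Map<String, Object> catalogConfig = new LinkedHashMap<>();
    catalogConfig.put("catalog_name", catalogName);
    catalogConfig.put("catalog_type", catalogType);
    catalogConfig.put("warehouse_location", warehouseLocation);

    Map<String, Object> config = new LinkedHashMap<>();
    config.put("table", table);
    config.put("catalog_config", catalogConfig);
    return config;
  }
}
```

For example, buildConfig("db.table1", "local", "hadoop", "file://path/to/warehouse") yields a map that uses the example values from the parameter list.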
Example
The following example reads from an Apache Iceberg table and writes the data to text files.
Java
To authenticate to Dataflow, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
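The following is a minimal sketch of such a pipeline, not a definitive implementation. It assumes the Beam Managed API (Managed.read(Managed.ICEBERG).withConfig(...)) as provided by the dependencies listed above, and it reuses the example values from the configuration section. The class name ApacheIcebergRead and the output prefix /tmp/iceberg-output are placeholders chosen for illustration; replace them with values for your environment.

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.managed.Managed;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.PCollectionRowTuple;
import org.apache.beam.sdk.values.Row;
import org.apache.beam.sdk.values.TypeDescriptors;

public class ApacheIcebergRead {
  public static void main(String[] args) {
    // Runner and project settings are taken from the command-line arguments.
    Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Catalog settings; the values are the examples from the configuration section.
    Map<String, Object> catalogConfig = new HashMap<>();
    catalogConfig.put("catalog_name", "local");
    catalogConfig.put("catalog_type", "hadoop");
    catalogConfig.put("warehouse_location", "file://path/to/warehouse");

    // Top-level connector configuration.
    Map<String, Object> config = new HashMap<>();
    config.put("table", "db.table1");
    config.put("catalog_config", catalogConfig);

    PCollectionRowTuple.empty(pipeline)
        // Read rows from the Iceberg table through the managed I/O connector.
        .apply(Managed.read(Managed.ICEBERG).withConfig(config))
        .getSinglePCollection()
        // Convert each row to its string form.
        .apply(MapElements.into(TypeDescriptors.strings()).via((Row row) -> row.toString()))
        // Write the strings to text files; the output prefix is a placeholder.
        .apply(TextIO.write().to("/tmp/iceberg-output").withSuffix(".txt"));

    pipeline.run().waitUntilFinish();
  }
}
```

The sketch starts from an empty PCollectionRowTuple because the managed read takes no input collection. Depending on your Beam version, applying the transform directly to the pipeline may also work.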