Apache Beam Java SDK
The Java SDK for Apache Beam provides a simple, powerful API for building both batch and streaming parallel data processing pipelines in Java.
Get Started with the Java SDK
Get started with the Beam Programming Model to learn the basic concepts that apply to all SDKs in Beam.
See the Java API Reference for more information on individual APIs.
Supported Features
The Java SDK supports all features currently supported by the Beam model.
Pipeline I/O
See the Beam-provided I/O Transforms page for a list of the currently available I/O transforms.
Extensions
The Java SDK has the following extensions:
- join-library provides inner join, left outer join, and right outer join functions.
- sorter is an efficient and scalable sorter for large iterables.
- Nexmark is a benchmark suite that runs in batch and streaming modes.
- TPC-DS is a SQL benchmark suite that runs in batch mode.
- euphoria is an easy-to-use Java 8 DSL for Beam.
In addition, several third-party Java libraries exist.
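As a rough illustration of the three join types that join-library is described as providing, the following plain-Java sketch shows how each one treats unmatched keys. It does not use Beam or the join-library API. The class name JoinSketch, its method names, the entry helper, and the "-" placeholder for a missing side are all hypothetical, chosen only for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of join semantics only; this is NOT the join-library API.
final class JoinSketch {

    // Hypothetical helper that builds one key/value pair.
    static Map.Entry<String, String> entry(String key, String value) {
        return new SimpleEntry<>(key, value);
    }

    // Inner join: emit a row only when a key appears on both sides.
    static List<String> innerJoin(List<Map.Entry<String, String>> left,
                                  List<Map.Entry<String, String>> right) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> l : left) {
            for (Map.Entry<String, String> r : right) {
                if (l.getKey().equals(r.getKey())) {
                    rows.add(row(l.getKey(), l.getValue(), r.getValue()));
                }
            }
        }
        return rows;
    }

    // Left outer join: keep every left element; an unmatched one gets the placeholder on the right.
    static List<String> leftOuterJoin(List<Map.Entry<String, String>> left,
                                      List<Map.Entry<String, String>> right,
                                      String missing) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> l : left) {
            boolean matched = false;
            for (Map.Entry<String, String> r : right) {
                if (l.getKey().equals(r.getKey())) {
                    rows.add(row(l.getKey(), l.getValue(), r.getValue()));
                    matched = true;
                }
            }
            if (!matched) {
                rows.add(row(l.getKey(), l.getValue(), missing));
            }
        }
        return rows;
    }

    // Right outer join: keep every right element; an unmatched one gets the placeholder on the left.
    static List<String> rightOuterJoin(List<Map.Entry<String, String>> left,
                                       List<Map.Entry<String, String>> right,
                                       String missing) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> r : right) {
            boolean matched = false;
            for (Map.Entry<String, String> l : left) {
                if (l.getKey().equals(r.getKey())) {
                    rows.add(row(r.getKey(), l.getValue(), r.getValue()));
                    matched = true;
                }
            }
            if (!matched) {
                rows.add(row(r.getKey(), missing, r.getValue()));
            }
        }
        return rows;
    }

    // Formats one joined row as "key: (leftValue, rightValue)".
    private static String row(String key, String left, String right) {
        return key + ": (" + left + ", " + right + ")";
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> left = Arrays.asList(entry("a", "1"), entry("b", "2"));
        List<Map.Entry<String, String>> right = Arrays.asList(entry("b", "x"), entry("c", "y"));
        System.out.println(innerJoin(left, right));
        System.out.println(leftOuterJoin(left, right, "-"));
        System.out.println(rightOuterJoin(left, right, "-"));
    }
}
```

With the sample data in main, only key b appears on both sides, so the inner join yields a single row, while each outer join also keeps the unmatched element from its preserved side. The nested loops are for readability only; a distributed join over large collections would group elements by key instead.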
Java multi-language pipelines quickstart
Apache Beam lets you combine transforms written in any supported SDK language and use them in one multi-language pipeline. To learn how to create a multi-language pipeline using the Java SDK, see the Java multi-language pipelines quickstart.