The MediaPipe Image Segmenter task lets you divide images into regions based on predefined categories for applying visual effects such as background blurring. These instructions show you how to use the Image Segmenter with Android apps. The code example described in these instructions is available on GitHub. For more information about the capabilities, models, and configuration options of this task, see the Overview.
Code example
The MediaPipe Tasks code example contains two simple implementations of an Image Segmenter app for Android.
The examples use the camera on a physical Android device to perform image segmentation on a live camera feed, or you can choose images and videos from the device gallery. You can use the apps as a starting point for your own Android app, or refer to them when modifying an existing app. The Image Segmenter example code is hosted on GitHub.
The following sections refer to the Image Segmenter with a category mask app.
Download the code
The following instructions show you how to create a local copy of the example code using the git command line tool.
To download the example code:
- Clone the git repository using the following command:
git clone https://github.com/google-ai-edge/mediapipe-samples
- Optionally, configure your git instance to use sparse checkout, so you have only the files for the Image Segmenter example app:
cd mediapipe-samples
git sparse-checkout init --cone
git sparse-checkout set examples/image_segmentation/android
After creating a local version of the example code, you can import the project into Android Studio and run the app. For instructions, see the Setup Guide for Android.
Key components
The following files contain the crucial code for this image segmentation example application:
- ImageSegmenterHelper.kt - Initializes the Image Segmenter task and handles the model and delegate selection.
- CameraFragment.kt - Provides the user interface and control code for a camera.
- GalleryFragment.kt - Provides the user interface and control code for selecting image and video files.
- OverlayView.kt - Handles and formats the segmentation results.
Setup
This section describes key steps for setting up your development environment and code projects to use Image Segmenter. For general information on setting up your development environment for using MediaPipe tasks, including platform version requirements, see the Setup guide for Android.
Dependencies
Image Segmenter uses the com.google.mediapipe:tasks-vision library. Add this dependency to the build.gradle file of your Android app development project. Import the required dependencies with the following code:
dependencies {
...
implementation 'com.google.mediapipe:tasks-vision:latest.release'
}
Model
The MediaPipe Image Segmenter task requires a trained model that is compatible with this task. For more information on available trained models for Image Segmenter, see the task overview Models section.
Select and download the model, and then store it within your project directory:
<dev-project-root>/src/main/assets
Use the BaseOptions.Builder.setModelAssetPath() method to specify the path used by the model. This method is referred to in the code example in the next section.

In the Image Segmenter example code, the model is defined in the ImageSegmenterHelper.kt class in the setupImageSegmenter() function.
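For instance, a minimal sketch of the base options, assuming a model file named model.tflite stored in the assets directory:

BaseOptions baseOptions = BaseOptions.builder()
    // The path is resolved relative to src/main/assets.
    .setModelAssetPath("model.tflite")
    .build();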
Create the task
You can use the createFromOptions function to create the task. The createFromOptions function accepts configuration options including mask output types. For more information on task configuration, see Configuration options.
The Image Segmenter task supports the following input data types: still images, video files, and live video streams. You must specify the running mode corresponding to your input data type when creating the task. Choose the tab for your input data type to see how to create that task.
Image
ImageSegmenterOptions options =
    ImageSegmenterOptions.builder()
        .setBaseOptions(
            BaseOptions.builder().setModelAssetPath("model.tflite").build())
        .setRunningMode(RunningMode.IMAGE)
        .setOutputCategoryMask(true)
        .setOutputConfidenceMasks(false)
        .build();
imagesegmenter = ImageSegmenter.createFromOptions(context, options);
Video
ImageSegmenterOptions options =
    ImageSegmenterOptions.builder()
        .setBaseOptions(
            BaseOptions.builder().setModelAssetPath("model.tflite").build())
        .setRunningMode(RunningMode.VIDEO)
        .setOutputCategoryMask(true)
        .setOutputConfidenceMasks(false)
        .build();
imagesegmenter = ImageSegmenter.createFromOptions(context, options);
Live stream
ImageSegmenterOptions options =
    ImageSegmenterOptions.builder()
        .setBaseOptions(
            BaseOptions.builder().setModelAssetPath("model.tflite").build())
        .setRunningMode(RunningMode.LIVE_STREAM)
        .setOutputCategoryMask(true)
        .setOutputConfidenceMasks(false)
        .setResultListener((result, inputImage) -> {
            // Process the segmentation result here.
        })
        .setErrorListener(error -> {
            // Process the segmentation errors here.
        })
        .build();
imagesegmenter = ImageSegmenter.createFromOptions(context, options);
The Image Segmenter example code implementation allows the user to switch between processing modes. This approach makes the task creation code more complicated and may not be appropriate for your use case. You can see this code in the ImageSegmenterHelper class in the setupImageSegmenter() function.
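As an illustrative sketch of that approach (not the example app's exact code), the result listener is attached only when the user selects the live stream mode:

// Hypothetical helper: `runningMode` reflects the user's current selection.
ImageSegmenterOptions.Builder optionsBuilder =
    ImageSegmenterOptions.builder()
        .setBaseOptions(
            BaseOptions.builder().setModelAssetPath("model.tflite").build())
        .setRunningMode(runningMode)
        .setOutputCategoryMask(true)
        .setOutputConfidenceMasks(false);
if (runningMode == RunningMode.LIVE_STREAM) {
    // A result listener is required in LIVE_STREAM mode and invalid otherwise.
    optionsBuilder.setResultListener((result, inputImage) -> {
        // Process the segmentation result here.
    });
}
imagesegmenter = ImageSegmenter.createFromOptions(context, optionsBuilder.build());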
Configuration options
This task has the following configuration options for Android apps:
Option Name | Description | Value Range | Default Value |
---|---|---|---|
runningMode | Sets the running mode for the task. There are three modes: IMAGE: The mode for single image inputs. VIDEO: The mode for decoded frames of a video. LIVE_STREAM: The mode for a livestream of input data, such as from a camera. In this mode, resultListener must be called to set up a listener to receive results asynchronously. | {IMAGE, VIDEO, LIVE_STREAM} | IMAGE |
outputCategoryMask | If set to True, the output includes a segmentation mask as a uint8 image, where each pixel value indicates the winning category value. | {True, False} | False |
outputConfidenceMasks | If set to True, the output includes a segmentation mask as a float value image, where each float value represents the confidence score map of the category. | {True, False} | True |
displayNamesLocale | Sets the language of labels to use for display names provided in the metadata of the task's model, if available. Default is en for English. You can add localized labels to the metadata of a custom model using the TensorFlow Lite Metadata Writer API. | Locale code | en |
resultListener | Sets the result listener to receive the segmentation results asynchronously when the image segmenter is in the LIVE_STREAM mode. Can only be used when running mode is set to LIVE_STREAM. | N/A | N/A |
errorListener | Sets an optional error listener. | N/A | Not set |
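For example, assuming the Java options builder exposes a setDisplayNamesLocale() method mirroring the displayNamesLocale option above, several of these options can be combined in a single builder chain:

ImageSegmenterOptions options =
    ImageSegmenterOptions.builder()
        .setBaseOptions(
            BaseOptions.builder().setModelAssetPath("model.tflite").build())
        .setRunningMode(RunningMode.IMAGE)
        .setOutputConfidenceMasks(true)   // the default
        .setDisplayNamesLocale("en")      // assumed builder method; see table above
        .setErrorListener(error -> {
            // Handle task errors here.
        })
        .build();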
Prepare data
Image Segmenter works with images, video files, and live video streams. The task handles the data input preprocessing, including resizing, rotation, and value normalization.
You need to convert the input image or frame to a com.google.mediapipe.framework.image.MPImage object before passing it to the Image Segmenter.
Image
import com.google.mediapipe.framework.image.BitmapImageBuilder;
import com.google.mediapipe.framework.image.MPImage;

// Load an image on the user's device as a Bitmap object using BitmapFactory.

// Convert the Android Bitmap object to a MediaPipe MPImage object.
MPImage mpImage = new BitmapImageBuilder(bitmap).build();
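The Bitmap loading step is elided in the comments above; a minimal sketch, assuming the image is bundled as a hypothetical asset named test_image.jpg and that context is your Activity or application Context:

import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import java.io.InputStream;

// Decode the bundled asset into a Bitmap (IOException handling omitted).
InputStream inputStream = context.getAssets().open("test_image.jpg");
Bitmap bitmap = BitmapFactory.decodeStream(inputStream);
inputStream.close();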
Video
import com.google.mediapipe.framework.image.BitmapImageBuilder;
import com.google.mediapipe.framework.image.MPImage;

// Load a video file on the user's device using MediaMetadataRetriever.

// From the video's metadata, load the METADATA_KEY_DURATION and
// METADATA_KEY_VIDEO_FRAME_COUNT values. You'll need them
// to calculate the timestamp of each frame later.

// Loop through the video and load each frame as a Bitmap object.

// Convert the Android Bitmap object to a MediaPipe MPImage object.
MPImage mpImage = new BitmapImageBuilder(frame).build();
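A hedged sketch of the metadata and frame extraction described in the comments above, using MediaMetadataRetriever (METADATA_KEY_VIDEO_FRAME_COUNT requires API level 28 or higher; videoUri and frameIndex are assumed to be defined by your app):

import android.graphics.Bitmap;
import android.media.MediaMetadataRetriever;

MediaMetadataRetriever retriever = new MediaMetadataRetriever();
retriever.setDataSource(context, videoUri);
long videoDurationMs = Long.parseLong(
    retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION));
int frameCount = Integer.parseInt(
    retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_FRAME_COUNT));
// Timestamp of the frame at frameIndex; getFrameAtTime() expects microseconds.
long frameTimestampMs = videoDurationMs * frameIndex / frameCount;
Bitmap frame = retriever.getFrameAtTime(frameTimestampMs * 1000);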
Live stream
import com.google.mediapipe.framework.image.MediaImageBuilder;
import com.google.mediapipe.framework.image.MPImage;

// Create a CameraX ImageAnalysis to continuously receive frames
// from the device's camera. Configure it to output frames in RGBA_8888
// format to match what is required by the model.

// For each Android ImageProxy object received from the ImageAnalysis,
// extract the encapsulated Android Image object and convert it to
// a MediaPipe MPImage object.
android.media.Image mediaImage = imageProxy.getImage();
MPImage mpImage = new MediaImageBuilder(mediaImage).build();
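A minimal sketch of the CameraX setup the comments describe, assuming the androidx.camera dependencies are in place; backgroundExecutor is an assumed Executor, and imageProxy.getImage() is gated behind CameraX's @ExperimentalGetImage annotation:

import androidx.camera.core.ImageAnalysis;

ImageAnalysis imageAnalysis =
    new ImageAnalysis.Builder()
        // RGBA_8888 output matches what the segmentation model expects.
        .setOutputImageFormat(ImageAnalysis.OUTPUT_IMAGE_FORMAT_RGBA_8888)
        .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
        .build();
imageAnalysis.setAnalyzer(backgroundExecutor, imageProxy -> {
    android.media.Image mediaImage = imageProxy.getImage();
    MPImage mpImage = new MediaImageBuilder(mediaImage).build();
    // Run segmentation on mpImage here, then release the camera frame.
    imageProxy.close();
});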
In the Image Segmenter example code, the data preparation is handled in the ImageSegmenterHelper class in the segmentLiveStreamFrame() function.
Run the task
You call a different segment function depending on the running mode you are using. The Image Segmenter returns the identified segment regions within the input image or frame.
Image
ImageSegmenterResult segmenterResult = imagesegmenter.segment(image);
Video
// Calculate the timestamp in milliseconds of the current frame.
long frameTimestampMs = 1000 * videoDuration * frameIndex / frameCount;

// Run inference on the frame.
ImageSegmenterResult segmenterResult =
    imagesegmenter.segmentForVideo(image, frameTimestampMs);
Live stream
// Run inference on the frame. The segmentation results will be available via
// the `resultListener` provided in the `ImageSegmenterOptions` when the image
// segmenter was created.
imagesegmenter.segmentAsync(image, frameTimestampMs);
Note the following:
- When running in the video mode or the live stream mode, you must also provide the timestamp of the input frame to the Image Segmenter task.
- When running in the image or the video mode, the Image Segmenter task blocks the current thread until it finishes processing the input image or frame. To avoid blocking the user interface, execute the processing in a background thread (see the sketch after this list).
- When running in the live stream mode, the Image Segmenter task doesn't block the current thread but returns immediately. It invokes its result listener with the segmentation result every time it has finished processing an input frame. If the segmentAsync function is called while the Image Segmenter task is busy processing another frame, the task ignores the new input frame.
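As the second note suggests, blocking segment calls belong off the main thread. A minimal sketch using a single-threaded executor (an assumption for illustration, not necessarily the example app's approach):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

ExecutorService backgroundExecutor = Executors.newSingleThreadExecutor();
backgroundExecutor.execute(() -> {
    ImageSegmenterResult segmenterResult = imagesegmenter.segment(mpImage);
    // Post segmenterResult back to the UI thread before updating any views.
});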
In the Image Segmenter example code, the segment functions are defined in the ImageSegmenterHelper.kt file.
Handle and display results
Upon running inference, the Image Segmenter task returns an ImageSegmenterResult object which contains the results of the segmentation task. The content of the output depends on the mask output options (outputCategoryMask and outputConfidenceMasks) you set when you configured the task.
The following sections show examples of the output data from this task:
Category confidence
The following images show a visualization of the task output for a category
confidence mask. The confidence mask output contains float values between
[0, 1]
.
Original image and category confidence mask output. Source image from the Pascal VOC 2012 dataset.
Category value
The following images show a visualization of the task output for a category
value mask. The category mask range is [0, 255]
and each pixel value
represents the winning category index of the model output. The winning category
index is has the highest score among the categories the model can recognize.
Original image and category mask output. Source image from the Pascal VOC 2012 dataset.
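As a hedged sketch of reading the category mask pixels, assuming outputCategoryMask was enabled and that ByteBufferExtractor from the MediaPipe framework image package is available:

import com.google.mediapipe.framework.image.ByteBufferExtractor;
import com.google.mediapipe.framework.image.MPImage;
import java.nio.ByteBuffer;

MPImage categoryMask = segmenterResult.categoryMask().get();
ByteBuffer maskBuffer = ByteBufferExtractor.extract(categoryMask);
// Each byte holds the winning category index for one pixel, in row-major order.
int firstPixelCategory = maskBuffer.get(0) & 0xFF;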