What is OneFormer?

OneFormer is a state-of-the-art multi-task image segmentation framework that is implemented using transformers.

About the model

Here is an overview of the OneFormer model:

| Attribute | Value |
|---|---|
| Date of Release | Nov 10, 2022 |
| Model Type | Instance Segmentation |
| Architecture | Transformers |
| Framework Used | PyTorch |
| Annotation Format | |
| Stars on GitHub | 380+ |

OneFormer is a segmentation model that earned five state-of-the-art badges on Papers with Code. It surpassed the previous SOTA models, MaskFormer and Mask2Former, and is now ranked first in instance, semantic, and panoptic segmentation. OneFormer is based on transformers and built using Detectron2.

OneFormer is the first multi-task image segmentation framework. This means the model only needs to be trained once, with a universal architecture and a single dataset. Previously, even a model that scored well on all three segmentation tasks had to be trained separately on semantic, instance, and panoptic datasets.

OneFormer introduces a task-conditional joint training strategy. The model uniformly samples training examples from different ground truth domains. As a result, the model architecture is task-guided for training and task-dynamic for inference, all with a single model.
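This task-conditioned design is visible in the Hugging Face `transformers` port of OneFormer: the same checkpoint runs semantic, instance, or panoptic segmentation depending on a task token passed at inference time. The sketch below assumes the `transformers` OneFormer API (`OneFormerProcessor`, `OneFormerForUniversalSegmentation`) and the `shi-labs/oneformer_ade20k_swin_tiny` checkpoint; names may differ across library versions.

```python
# The three task tokens OneFormer accepts; one set of weights serves all of them.
TASKS = ("semantic", "instance", "panoptic")

def segment(image, task="panoptic"):
    """Run a single OneFormer checkpoint on any of the three segmentation tasks.

    `image` is expected to be a PIL image. The task string conditions the
    model: same weights, same forward pass, different output head behavior.
    """
    if task not in TASKS:
        raise ValueError(f"task must be one of {TASKS}, got {task!r}")

    # Imported lazily so the sketch can be read without the heavy dependencies.
    from transformers import OneFormerProcessor, OneFormerForUniversalSegmentation

    ckpt = "shi-labs/oneformer_ade20k_swin_tiny"  # small ADE20K-trained checkpoint
    processor = OneFormerProcessor.from_pretrained(ckpt)
    model = OneFormerForUniversalSegmentation.from_pretrained(ckpt)

    # The task token is passed alongside the image; this is the task-dynamic part.
    inputs = processor(images=image, task_inputs=[task], return_tensors="pt")
    outputs = model(**inputs)

    target_sizes = [image.size[::-1]]  # (height, width) expected by post-processing
    if task == "semantic":
        return processor.post_process_semantic_segmentation(
            outputs, target_sizes=target_sizes)[0]
    if task == "instance":
        return processor.post_process_instance_segmentation(
            outputs, target_sizes=target_sizes)[0]
    return processor.post_process_panoptic_segmentation(
        outputs, target_sizes=target_sizes)[0]
```

Switching tasks is just a matter of changing the task string; no retraining and no separate per-task checkpoints are needed.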

Check out YOLOv8, defining a new state-of-the-art in computer vision

YOLOv8 is here, setting a new standard for performance in object detection and image segmentation tasks. Roboflow has developed a library of resources to help you get started with YOLOv8, covering guides on how to train YOLOv8, how the model stacks up against v5 and v7, and more.

Learn about YOLOv8


Model Performance

ADE20K

| Method | Backbone | Crop Size | PQ | AP | mIoU (s.s.) | mIoU (ms+flip) | #params | Config | Checkpoint |
|---|---|---|---|---|---|---|---|---|---|
| OneFormer | Swin-L | 640×640 | 48.6 | 35.9 | 57.0 | 57.7 | 219M | config | model |
| OneFormer | Swin-L | 896×896 | 50.2 | 37.6 | 57.4 | 58.3 | 219M | config | model |
| OneFormer | ConvNeXt-L | 640×640 | 48.7 | 36.2 | 56.6 | 57.4 | 220M | config | model |
| OneFormer | DiNAT-L | 640×640 | 49.1 | 36.0 | 57.8 | 58.4 | 223M | config | model |
| OneFormer | DiNAT-L | 896×896 | 50.0 | 36.8 | 58.1 | 58.6 | 223M | config | model |
| OneFormer | ConvNeXt-XL | 640×640 | 48.9 | 36.3 | 57.4 | 58.8 | 372M | config | model |

Cityscapes

| Method | Backbone | PQ | AP | mIoU (s.s.) | mIoU (ms+flip) | #params | Config | Checkpoint |
|---|---|---|---|---|---|---|---|---|
| OneFormer | Swin-L | 67.2 | 45.6 | 83.0 | 84.4 | 219M | config | model |
| OneFormer | ConvNeXt-L | 68.5 | 46.5 | 83.0 | 84.0 | 220M | config | model |
| OneFormer | DiNAT-L | 67.6 | 45.6 | 83.1 | 84.0 | 223M | config | model |
| OneFormer | ConvNeXt-XL | 68.4 | 46.7 | 83.6 | 84.6 | 372M | config | model |

COCO

| Method | Backbone | PQ | PQ^Th | PQ^St | AP | mIoU | #params | Config | Checkpoint |
|---|---|---|---|---|---|---|---|---|---|
| OneFormer | Swin-L | 57.9 | 64.4 | 48.0 | 49.0 | 67.4 | 219M | config | model |
| OneFormer | DiNAT-L | 58.0 | 64.3 | 48.4 | 49.2 | 68.1 | 223M | config | model |
† denotes the backbones were pretrained on ImageNet-22k.


OneFormer Annotation Format

OneFormer uses the annotation format. If your annotations are in a different format, you can use Roboflow's annotation conversion tools to get your data into the right format.

Convert data between formats
