IBM Cloud Annotations
IBM Developer

Getting started

Note: Guides are a work in progress and coming soon! Until they’re completed, check out some of the workshops.

Cloud Annotations makes labeling images and training machine learning models easy. Whether you’ve never touched a line of code in your life or you’re a TensorFlow ninja, these docs will help you build what you need. Let’s get started!

Sign up for IBM Cloud

Cloud Annotations is built on top of IBM Cloud Object Storage. Using a cloud object storage offering provides a reliable place to store training data. It also opens up the potential for collaboration, letting a team simultaneously annotate the dataset in real time.

IBM Cloud offers a lite tier of object storage, which includes 25 GB of free storage.

Before you start, sign up for a free IBM Cloud account.

Preparing training data

To train a computer vision model you need a lot of images. Cloud Annotations supports uploading both photos and videos. However, before you start snapping, there are a few limitations to consider.

Training data best practices

  • Object Type The model is optimized for photographs of objects in the real world. It is unlikely to work well for x-rays, hand drawings, scanned documents, receipts, etc.

  • Object Environment The training data should be as close as possible to the data on which predictions are to be made. For example, if your use case involves blurry and low-resolution images (such as from a security camera), your training data should be composed of blurry, low-resolution images. In general, you should also consider providing multiple angles, resolutions, and backgrounds for your training images.

  • Difficulty The model generally can’t predict labels that humans can’t assign. So, if a human can’t be trained to assign labels by looking at the image for 1-2 seconds, the model likely can’t be trained to do it either.

  • Label Count We recommend at least 50 labels per object category for a usable model; hundreds or thousands would provide better results.

  • Image Dimensions The model resizes every image to 300x300 pixels, so keep in mind that images where one dimension is much longer than the other will be distorted by the resize.

  • Object Size The object of interest should cover at least ~5% of the image area to be detected. For example, on the resized 300x300 pixel image the object should cover roughly 60x60 pixels.
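
The size and dimension guidelines above can be sanity-checked with a small Python sketch. The 300x300 resize and the ~5% area figure come from the notes above; the function names are our own and not part of any Cloud Annotations API:

```python
MODEL_SIZE = 300  # the model resizes every image to 300x300 pixels

def resized_box_size(image_w, image_h, box_w, box_h):
    """Size of an object's bounding box on the 300x300 resized image."""
    return (box_w / image_w * MODEL_SIZE, box_h / image_h * MODEL_SIZE)

def is_likely_detectable(image_w, image_h, box_w, box_h, min_area_fraction=0.05):
    """True if the object covers at least ~5% of the image area."""
    return (box_w * box_h) / (image_w * image_h) >= min_area_fraction

# A 300x200 pixel object in a 4000x3000 photo covers only 0.5% of the
# area, so it is likely too small for the model to detect reliably.
print(is_likely_detectable(4000, 3000, 300, 200))  # False
print(resized_box_size(4000, 3000, 300, 200))      # (22.5, 20.0)
```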

Set up Cloud Annotations

To use Cloud Annotations, navigate to the Cloud Annotations tool and click Continue with IBM Cloud.

Once logged in, if you don’t have an object storage instance, you will be prompted to create one. Click Get started to be directed to IBM Cloud, where you can create a free object storage instance.

You might need to log in to IBM Cloud again to create a resource.

Choose a pricing plan and click Create, then Confirm on the following popup.

Once your object storage instance has been provisioned, navigate back to Cloud Annotations and refresh the page.

The files and annotations will be stored in a bucket. You can create one by clicking Start a new project.

Give the bucket a unique name.

Object detection or classification?

A classification model can tell you what an image is and how confident it is about its decision. An object detection model can provide you with much more information:

  • Location The coordinates and area of where the object is in the image.
  • Count The number of objects found in the image.
  • Size How large the object is with respect to the image dimensions.

If an object detection model gives us this extra information, why would we use classification?

  • Labor Cost To train an object detection model, humans must draw a box around every object. A classification model only requires a simple label for each image.
  • Training Cost It can take longer and require more expensive hardware to train an object detection model.
  • Inference Cost An object detection model can be much slower than real-time to process an image on low-end hardware.
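
To make the comparison concrete, here is a sketch with made-up prediction outputs; the field names and values are illustrative, not a real API:

```python
# A classification model returns one label for the whole image:
classification_result = {"label": "dog", "confidence": 0.97}

# An object detection model returns one entry per detected object, each
# with a bounding box (coordinates here are normalized to the image size):
detection_result = [
    {"label": "dog", "confidence": 0.95,
     "bbox": {"x": 0.12, "y": 0.30, "width": 0.25, "height": 0.40}},
    {"label": "dog", "confidence": 0.88,
     "bbox": {"x": 0.55, "y": 0.28, "width": 0.22, "height": 0.38}},
]

# The extra information gives us count, location, and size:
count = len(detection_result)  # number of objects found
largest_area = max(d["bbox"]["width"] * d["bbox"]["height"]
                   for d in detection_result)  # fraction of image covered
print(count, largest_area)
```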

Object detection

After your bucket is created and named, you will be prompted to choose an annotation type. Choose Localization, which enables bounding box drawing.

Labeling the data

  1. Upload a video or many images
  2. Create the desired labels
  3. Start drawing bounding boxes


Documentation coming soon

Labeling with a team

Documentation coming soon

Uploading images/labels via API

Documentation coming soon

Exporting annotations via GUI

Documentation coming soon

Exporting annotations via API

Documentation coming soon

Training overview

Documentation coming soon

Training via GUI

Once you have labeled a sufficient number of photos, click Train Model. A dialog message will appear, prompting you to select your Watson Machine Learning instance. If none are available, it will guide you to create a new one (you may need to refresh your Cloud Annotations window for the new instance to appear, but don’t worry, your labels will be saved).

Click Train. Your training job will now be added to the queue.

You will see it listed as pending until the training starts (this could take several minutes).

Once your training job starts, the status will change and you will see a graph of the training steps running.

Once the job is completed, you’re all set!

Installing the Cloud Annotations CLI (cacli)

To train our model, we need to install the Cloud Annotations CLI (cacli).

Homebrew (macOS)

If you are on macOS and using Homebrew, you can install cacli with the following:

$ brew install cacli

Shell script (Linux / macOS)

If you are on Linux or macOS, you can install cacli with the following:

$ curl -sSL <install-script-url> | sudo sh

Binary (Windows / Linux / macOS)

Download the appropriate version for your platform from the releases page. Once downloaded, the binary can be run from anywhere. You don’t need to install it into a global location. This works well for shared hosts and other systems where you don’t have a privileged account.

Ideally, you should install it somewhere in your PATH for easy use; /usr/local/bin is a common choice.

Training via CLI

Documentation coming soon

Training with Google Colab

Google Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows anybody to write and execute arbitrary python code through the browser, and is especially well suited to machine learning, data analysis and education. More technically, Colab is a hosted Jupyter notebook service that requires no setup to use, while providing free access to computing resources including GPUs.

To use Google Colab all you need is a standard Google Account.

Note: These steps assume you have already labeled a dataset for object detection.

Exporting annotations

To train a model in Google Colab, the notebook expects the annotations to be located in Google Drive. You can export your data from Cloud Annotations via the following steps:

  1. Choose File > Export as Create ML

Uploading to Google Drive

Once exported, you should have a file named <bucket-name>.zip. Unzip it and upload the resulting folder to Google Drive.

Using Google Colab

Open in Colab

Non-interactive training (useful for CI)

Documentation coming soon

Custom training scripts

Documentation coming soon

Downloading a model via GUI

From an existing project, select Training runs > View all

Select a completed training job from the left-hand side and click Download. A zip file will be created containing your trained model files.

Downloading a model via CLI

With the Cloud Annotations CLI installed, you can download your trained model. First you need the model ID. Run cacli list to list all training runs and find the ID of the model you would like to download. Then run:

$ cacli download <model id>

Using a model

Documentation coming soon

