Preparing training data

Cloud Annotations makes labeling images and training machine learning models easy. Whether you’ve never touched a line of code in your life or you’re a TensorFlow ninja, these docs will help you build what you need. Let’s get started!

Sign up for IBM Cloud

Cloud Annotations is built on top of IBM Cloud Object Storage. Using a cloud object storage offering provides a reliable place to store training data. It also opens up the potential for collaboration, letting a team annotate the dataset simultaneously in real time.

IBM Cloud offers a lite tier of object storage, which includes 25 GB of free storage.

Before you start, sign up for a free IBM Cloud account.

Training data best practices

To train a computer vision model you need a lot of images. Cloud Annotations supports uploading both photos and videos. However, before you start snapping, there are a few limitations to consider.

  • Object Type: The model is optimized for photographs of objects in the real world. It is unlikely to work well for x-rays, hand drawings, scanned documents, receipts, etc.

  • Object Environment: The training data should be as close as possible to the data on which predictions are to be made. For example, if your use case involves blurry and low-resolution images (such as from a security camera), your training data should be composed of blurry, low-resolution images. In general, you should also consider providing multiple angles, resolutions, and backgrounds for your training images.

  • Difficulty: The model generally can't predict labels that humans can't assign. So, if a human can't be trained to assign labels by looking at the image for 1-2 seconds, the model likely can't be trained to do it either.

  • Label Count: We recommend at least 50 labels per object category for a usable model, but using hundreds or thousands would provide better results.

  • Image Dimensions: The model resizes the image to 300x300 pixels, so keep that in mind when training the model with images where one dimension is much longer than the other.

  • Object Size: The object of interest should cover at least ~5% of the image area to be detected. For example, on the resized 300x300 pixel image the object should cover ~60x60 pixels.
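
The numeric guidelines above (50 labels per category, a 300x300 resize, ~5% object area) can be checked with a few lines of Python before you upload anything. The following is a minimal sketch, not part of Cloud Annotations itself: the function names are hypothetical, and it assumes the resize is a plain stretch to 300x300 that does not preserve the aspect ratio.

```python
from collections import Counter

TARGET_SIZE = 300          # model input is 300x300, per the guidelines above
MIN_AREA_FRACTION = 0.05   # object should cover at least ~5% of the image
MIN_LABELS = 50            # recommended minimum number of labels per category


def resized_box(img_w, img_h, box_w, box_h, target=TARGET_SIZE):
    """Size of an object box after the image is stretched to target x target.

    Assumption: each axis is scaled independently (aspect ratio not preserved).
    """
    return box_w * target / img_w, box_h * target / img_h


def area_fraction(img_w, img_h, box_w, box_h):
    """Fraction of the image area covered by the box.

    A plain stretch scales image and box alike, so this fraction is the
    same before and after resizing.
    """
    return (box_w * box_h) / (img_w * img_h)


def underrepresented(labels, minimum=MIN_LABELS):
    """Return {label: count} for categories with fewer than `minimum` labels."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n < minimum}


if __name__ == "__main__":
    # A wide 1200x400 photo containing a 200x200 object.
    print(resized_box(1200, 400, 200, 200))
    print(area_fraction(1200, 400, 200, 200) >= MIN_AREA_FRACTION)
    print(underrepresented(["cat"] * 60 + ["dog"] * 10))
```

Under the stretch assumption, the wide photo in the example squeezes a square object into a tall, narrow one, which is the distortion the Image Dimensions guideline warns about. Note also that an exact 5% of a 300x300 image is 4,500 pixels, or a square of about 67 pixels on a side, which is in the same ballpark as the ~60x60 figure quoted above.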

Set up Cloud Annotations

To use Cloud Annotations, navigate to cloud.annotations.ai and click Continue with IBM Cloud.

Once logged in, if you don’t have an object storage instance, you will be prompted to create one. Click Get started to be directed to IBM Cloud, where you can create a free object storage instance.

You might need to log in to IBM Cloud again to create a resource.

Choose a pricing plan and click Create, then Confirm on the following popup.

Once your object storage instance has been provisioned, navigate back to cloud.annotations.ai and refresh the page.

The files and annotations will be stored in a bucket. You can create one by clicking Start a new project.

Give the bucket a unique name.
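
If you want to sanity-check a candidate name before typing it in, a small helper like the one below can catch obvious formatting problems. This is only a sketch under an assumed, deliberately conservative rule (3 to 63 characters, lowercase letters, digits, and hyphens, starting and ending with a letter or digit); the function name is hypothetical, and the check cannot tell you whether the name is already taken.

```python
import re

# Assumed, conservative naming rule: 3-63 characters made of lowercase
# letters, digits, and hyphens, starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Return True if `name` matches the assumed format rule.

    This checks formatting only; uniqueness can only be confirmed by
    actually creating the bucket.
    """
    return _BUCKET_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my-project-images", "My_Project", "ab"]:
        print(candidate, looks_like_valid_bucket_name(candidate))
```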

After your bucket is created and named, you will be prompted to choose an annotation type. Choose Classification.

Labeling the data

  1. Create the desired labels
  2. Upload a video or some images
  3. Select images then choose Label > DESIRED_LABEL

 

📁 Sample Training Data
