Detectron2 0.5 Docker image available

A new Docker image for building models with version 0.5 of Detectron2 is now available.

More information on the Docker image is available from GitHub:

github.com/waikato-datamining/pytorch/tree/master/detectron2/0.5

This CUDA 10.2-based image supersedes the 0.4 version, as that image was never used in production.

Unfortunately, CUDA 11.1 still cannot be used (so the image cannot run on a 3090 Ti) due to an as-yet unresolved bug in PyTorch.

video-frame-selector library released

The video-frame-selector Python 3 library is now available:

github.com/waikato-datamining/video-frame-selector

With this library you can present frames obtained from video files or a webcam to an image analysis framework, such as detectron2, and react to the generated predictions. For instance, you might want to trawl through a video from a trail camera and generate either JPG images or a shortened video that contains only the frames that actually show animals. Or you might look only for specific animals rather than all of them (e.g., only NZ pests such as rats or stoats). Frames that are to be kept can also be cropped to the smallest area that encompasses all the detected bounding boxes (you can enforce a margin around the cropped content and/or a minimum width/height).
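The cropping step can be illustrated with a minimal sketch. It is not taken from the library: the function name crop_region, its parameters and the (x0, y0, x1, y1) box layout are assumptions made for this example only.

```python
from typing import List, Optional, Tuple

# Hypothetical box layout for this sketch: (x0, y0, x1, y1) in pixels
Box = Tuple[int, int, int, int]


def _grow(lo: int, hi: int, minimum: int) -> Tuple[int, int]:
    """Widen the interval [lo, hi) symmetrically until it spans `minimum`."""
    missing = minimum - (hi - lo)
    if missing > 0:
        lo -= missing // 2
        hi += missing - missing // 2
    return lo, hi


def crop_region(boxes: List[Box], frame_w: int, frame_h: int,
                margin: int = 0, min_w: int = 0, min_h: int = 0) -> Optional[Box]:
    """Smallest region covering all boxes, plus margin and minimum size."""
    if not boxes:
        return None  # nothing detected, so there is nothing to crop

    # Union of all bounding boxes, expanded by the margin
    x0 = min(b[0] for b in boxes) - margin
    y0 = min(b[1] for b in boxes) - margin
    x1 = max(b[2] for b in boxes) + margin
    y1 = max(b[3] for b in boxes) + margin

    # Enforce the minimum width/height
    x0, x1 = _grow(x0, x1, min_w)
    y0, y1 = _grow(y0, y1, min_h)

    # Clamp to the frame; near an edge the result can end up
    # smaller than the requested minimum size
    return (max(0, x0), max(0, y0), min(frame_w, x1), min(frame_h, y1))


region = crop_region([(10, 20, 30, 40), (50, 10, 70, 35)], 100, 100, margin=5)
print(region)
```

Clamping after growing keeps the sketch short. An alternative would be to shift the region back into the frame so that the minimum size is preserved whenever the frame is large enough.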

Docker for Data Scientists website launched

Deep learning has opened up many new avenues for data scientists across many domains. However, the pain and suffering of setting up environments with the correct versions of CUDA, cuDNN, NumPy and the actual deep learning framework is, unfortunately, all too real. Replicating an existing test environment on a production server can prove challenging. For quite some time now, we have resorted to using Docker to ease the pain of reproducing the same setup on various servers for running experiments. Docker will not solve the problem of having to figure out the right combination of libraries, but at least you can easily reuse Docker images and build pipelines with frameworks that would otherwise have conflicting library requirements.

Long story short: in order to make it easier for data scientists to start their journey with Docker, we compiled a little MkDocs website:

www.data-mining.co.nz/docker-for-data-scientists/

On this website, you will learn the Docker basics and also find steps for creating (and using) a Docker image that uses PyTorch's image classification facilities.

The website's source code is available from GitHub and released under CC-BY-SA 4.0:

https://github.com/waikato-datamining/docker-for-data-scientists

simple-confusion-matrix library released

The simple-confusion-matrix Python 3 library has been released today:

github.com/waikato-datamining/simple-confusion-matrix

It is a simple library that can generate confusion matrices from CSV files or lists of actual and predicted labels. It can output the generated matrix as either plain text or CSV, and either as a string or to a file.

Rather than just using counts, it can also generate:

  • percentages (all cells sum up to 1)

  • percentages per row (all cells in a row sum up to 1)

The latter is useful when dealing with imbalanced datasets, giving you a good idea of how well each label is being predicted.
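To make the difference between the two kinds of percentages concrete, here is a small plain-Python sketch. It does not use the simple-confusion-matrix API, and the function names are made up for this illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def confusion_counts(actual: Sequence[str],
                     predicted: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """Rows are actual labels, columns are predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    labels = sorted(set(actual) | set(predicted))
    pairs = Counter(zip(actual, predicted))
    return labels, [[pairs[(a, p)] for p in labels] for a in labels]


def as_fraction_of_total(matrix: List[List[int]]) -> List[List[float]]:
    """All cells together sum up to 1."""
    total = sum(sum(row) for row in matrix)
    return [[c / total if total else 0.0 for c in row] for row in matrix]


def as_fraction_per_row(matrix: List[List[int]]) -> List[List[float]]:
    """The cells of each row sum up to 1."""
    result = []
    for row in matrix:
        row_sum = sum(row)
        result.append([c / row_sum if row_sum else 0.0 for c in row])
    return result


# Imbalanced toy data: 8 cats, 2 dogs
actual = ["cat"] * 8 + ["dog"] * 2
predicted = ["cat"] * 7 + ["dog"] + ["cat", "dog"]

labels, counts = confusion_counts(actual, predicted)
print(labels, counts)
print(as_fraction_of_total(counts))
print(as_fraction_per_row(counts))
```

In this toy data, 8 of the 10 predictions are correct, but the per-row view shows that only half of the dogs were recognized. Neither the raw counts nor the overall percentages make that as obvious.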

PyTorch image classification available

Today, a new library for performing image classification has made its debut:

wai.pytorchimageclass

The library is based on the PyTorch example code for ImageNet. For ResNet-based networks, you can fine-tune pretrained models on your own data rather than just using the ImageNet dataset. In addition, you can make predictions (single and batch/continuous), output information on built models and export trained models to TorchScript.

The library is also available via Docker images, one for GPU-based machines and one for CPU-only ones. However, the latter should only be used for inference and not for training, as it is simply too slow.

More information on the library and the Docker images is available from GitHub:

github.com/waikato-datamining/pytorch/tree/master/image-classification

simple-file-poller library released

The simple-file-poller Python 3 library has been released this week:

github.com/waikato-datamining/simple-file-poller

This library is aimed at Python projects that perform continuous processing of files, e.g., deep learning models that locate objects in images. These projects typically pick up files from one directory, make predictions, write the output in some format to another directory and then either move the input files to the output directory or simply delete them.

Instead of having to write this code for polling and moving over and over again, the simple-file-poller library allows you to plug in your file processing code via a function that you supply to a Poller object. Furthermore, you can also supply a function that can check whether files are valid and can be processed (e.g., image files).

The Poller class supports two polling modes: time-based and watchdog-based. The former waits for a specified number of seconds between polls (if there were no files present). This simple approach can be used when the file processing is not time critical. The latter approach watches the input directory for files being created and then reacts to that immediately. This approach allows for very low latency processing, especially useful for processing pipelines.
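The time-based mode with pluggable functions can be sketched in plain Python as follows. This is only an illustration of the idea and not the library's Poller class: poll_once, poll and all parameter names are hypothetical.

```python
import shutil
import time
from pathlib import Path
from typing import Callable, Optional

ProcessFn = Callable[[Path, Path], None]  # (input file, output dir)
CheckFn = Callable[[Path], bool]          # True if the file can be processed


def poll_once(input_dir: Path, output_dir: Path, process_file: ProcessFn,
              check_file: Optional[CheckFn] = None) -> int:
    """Process every valid file currently in input_dir; return the count."""
    handled = 0
    for path in sorted(input_dir.iterdir()):
        if not path.is_file():
            continue
        if check_file is not None and not check_file(path):
            continue  # invalid or incomplete: leave it for a later pass
        process_file(path, output_dir)
        # Move the input next to its output so it is not picked up again
        shutil.move(str(path), str(output_dir / path.name))
        handled += 1
    return handled


def poll(input_dir: Path, output_dir: Path, process_file: ProcessFn,
         check_file: Optional[CheckFn] = None, poll_wait: float = 1.0,
         max_polls: Optional[int] = None) -> None:
    """Time-based polling: sleep only when a pass found no files."""
    polls = 0
    while max_polls is None or polls < max_polls:
        if poll_once(input_dir, output_dir, process_file, check_file) == 0:
            time.sleep(poll_wait)
        polls += 1
```

A caller would supply its own process_file (for example, one that runs a model on an image and writes the predictions to the output directory) and, optionally, a check_file that rejects files that cannot be read yet. The watchdog-based mode replaces the sleep with file system notifications and is not shown here.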

Another feature is the ability to write any output to a temporary directory first, before moving it into the output directory. This avoids race conditions with other processes that further process the generated output files, as the files are guaranteed to have been fully written.
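The write-then-move idea can be sketched with the standard library alone. The helper name write_via_tmp and its parameters are invented for this example, and the sketch assumes that the temporary and output directories live on the same file system, where os.replace is an atomic rename.

```python
import os
import tempfile
from pathlib import Path


def write_via_tmp(data: bytes, output_dir: Path, name: str, tmp_dir: Path) -> Path:
    """Write data to tmp_dir first, then move the finished file to output_dir."""
    tmp_dir.mkdir(parents=True, exist_ok=True)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Write the complete file where downstream processes are not looking
    fd, tmp_name = tempfile.mkstemp(dir=tmp_dir, suffix=".part")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # The file only appears in output_dir once it is fully written.
    # Assumption: both directories are on the same file system.
    target = output_dir / name
    os.replace(tmp_name, target)
    return target
```

A process watching the output directory then either sees no file at all or the complete file, never a partially written one.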

The following frameworks make use of the simple-file-poller now (and more will follow):

Keras image segmentation Docker image available

A new Docker image is available for training Keras image segmentation models using a GPU backend. The image is based on TensorFlow 1.14 and Divam Gupta's code, plus additional tools for converting indexed PNGs into RGB ones and continuously processing images with a model.

More information on the Docker image is available from GitHub:

github.com/waikato-datamining/tensorflow/tree/master/image-segmentation-keras