paper link: https://www.researchsquare.com/article/rs-3832815/v1


General Goal

Develop a mechanism that can help detect and flag images containing species which are not present in the data a model was trained on.

Motivation

Identification-based ML systems are usually limited to the data, i.e. the species, that they were trained on, and will therefore assign a wrong label to any new species fed into the system.
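To make the failure mode concrete, here is a minimal sketch of a closed-set classifier head, assuming a standard softmax output layer (the notes do not say which output layer the paper's models use). The label names and scores are made up for illustration. Whatever image goes in, the prediction is always one of the trained species.

```python
import math

# Hypothetical label set: the only species the classifier was trained on.
KNOWN_SPECIES = ["species_a", "species_b", "species_c"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the most probable known label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return KNOWN_SPECIES[best], probs[best]


# Made-up scores for an image of a species outside the training set.
label, confidence = predict([0.2, 0.1, 0.3])
print(label, round(confidence, 3))
```

There is no "none of the above" output, so an unseen species is silently mapped to the closest-scoring known label.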

Taichi Workflow Summary

Steps as follows:

  1. Create a database of known specimens along with an ML model trained on their images. Photos of every specimen are taken at the same set of angles. Aligned images across different species are those taken at exactly the same angle. For the model, CNNs that could identify the species were fit to the chosen datasets.
  2. An AI barcode is created for a given specimen. This is just the probability of

Data Used

Two beetle datasets were used, each containing a number of known species along with some unknown species. Each species was represented by multiple specimens to capture the variation within that species. Both are image datasets consisting of 224x224 pixel images.

The photos were taken from different views using an Olympus E-M5 Mark II, with each specimen kept on a platform and rotated to capture it from all angles. This is not how the beetles would appear in the wild; they were imaged in a very controlled environment.
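One possible way to organise such a dataset is sketched below. It assumes each image is tagged with its species, specimen, and capture angle; the `View` record, its field names, and the file names are hypothetical and not taken from the paper. Grouping by angle yields the sets of "aligned" images, i.e. images of different specimens taken at the same angle.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    """One photo of one specimen (hypothetical record layout)."""
    species: str
    specimen_id: str
    angle_deg: int  # rotation of the platform when the photo was taken
    path: str


def group_aligned(views):
    """Group views by capture angle; views sharing an angle are aligned."""
    by_angle = defaultdict(list)
    for view in views:
        by_angle[view.angle_deg].append(view)
    return dict(by_angle)


# Made-up example: two specimens, each photographed at 0 and 90 degrees.
views = [
    View("species_a", "a1", 0, "a1_000.jpg"),
    View("species_a", "a1", 90, "a1_090.jpg"),
    View("species_b", "b1", 0, "b1_000.jpg"),
    View("species_b", "b1", 90, "b1_090.jpg"),
]
aligned = group_aligned(views)
print(sorted(aligned))
```

Keying on the angle makes it easy to compare specimens view by view, which is only meaningful because the controlled setup fixes the angles in advance.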

Why this is interesting

Creating Barcodes

This is a fully experimental concept and not an industry standard.