Link: https://www.nature.com/articles/s41598-022-19939-2
What
A publicly available labeled image dataset, built on FAIR data principles, for detecting marine creatures and other underwater objects (geological features, equipment, etc.). Its main aim is to push AI and CV development in this domain.
The data consists of images in which organisms/objects of interest are marked with bounding boxes. Images also carry metadata recorded by the ROVs and imaging devices that collected them.
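To make the shape of the data concrete, here is a minimal sketch of one annotated image as plain Python dataclasses. This is not the FathomNet schema: the class names, field names (`depth_m`, `latitude`, `longitude`), the box convention (top-left corner plus width and height), and the example values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Localization:
    """One bounding box around an organism or object (hypothetical fields)."""
    concept: str   # the label, e.g. a taxon name
    x: int         # left edge of the box, in pixels (assumed convention)
    y: int         # top edge of the box, in pixels (assumed convention)
    width: int
    height: int

    def corners(self) -> Tuple[int, int, int, int]:
        """Return the box as (x_min, y_min, x_max, y_max)."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)


@dataclass
class ImageRecord:
    """One image plus collection metadata and its boxes (hypothetical fields)."""
    url: str
    depth_m: Optional[float] = None     # assumed metadata field
    latitude: Optional[float] = None    # assumed metadata field
    longitude: Optional[float] = None   # assumed metadata field
    localizations: List[Localization] = field(default_factory=list)

    def concepts(self) -> List[str]:
        """Sorted list of the distinct concepts annotated in this image."""
        return sorted({loc.concept for loc in self.localizations})


# Illustrative values only, not taken from the dataset.
record = ImageRecord(
    url="https://example.org/image.png",
    depth_m=850.0,
    localizations=[Localization("Bathochordaeus", 100, 50, 200, 120)],
)
print(record.concepts(), record.localizations[0].corners())
```

Converting to corner form is handy because many detection libraries expect `(x_min, y_min, x_max, y_max)` rather than origin plus size.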
The authors have created and shared APIs for accessing and submitting data, a web interface for browsing and viewing data, and verification/review tooling for vetting externally contributed data.
They have also open-sourced tutorials, code, and ML models trained on the database. The website is a little buggy and takes time to render results.
FathomNet website: https://fathomnet.org/fathomnet/#/
Pre-trained Models: https://huggingface.co/FathomNet
GitHub: https://github.com/fathomnet
Tutorial to get data using Python: https://medium.com/fathomnet/how-to-download-images-and-bounding-boxes-from-fathomnet-using-python-283ff32c975f
Why
What does the data look like?
As of July 2022, FathomNet had 84,454 images and 175,873 localizations covering 2,244 concepts (a concept is equivalent to a class). Benthic images might not be fully annotated and might have missing concepts. Midwater images tend to be more completely annotated.
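A quick back-of-the-envelope calculation from the July 2022 counts above gives a feel for annotation density. These are plain averages derived only from the three totals; they say nothing about how unevenly boxes are spread across images or concepts.

```python
# Totals quoted in these notes (as of July 2022).
n_images = 84_454
n_localizations = 175_873
n_concepts = 2_244

# Simple averages; the real per-image and per-concept distributions may be skewed.
boxes_per_image = n_localizations / n_images
boxes_per_concept = n_localizations / n_concepts

print(f"Average localizations per image:   {boxes_per_image:.2f}")
print(f"Average localizations per concept: {boxes_per_concept:.1f}")
```

So on average there are roughly two boxes per image and under 80 boxes per concept, which is worth keeping in mind when deciding which concepts have enough examples to train on.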
Do I need permissions to use this data?
Where does the data come from?