The Field Museum is home to one of the world's largest natural history collections, including several million insect specimens. This immense collection, built by thousands of researchers over hundreds of years, is an invaluable resource for scientists. However, its sheer size made it difficult for researchers and curators to use the data in large-scale studies.
To solve this challenge, the museum launched an initiative to digitize its massive collection of pinned insects and their associated data. This project was led by Bruno de Medeiros, Assistant Curator of Insects at the Field Museum’s Negaunee Integrative Research Center. As he explained, "To research the collection, we needed to better understand what was in it. For example, one of our team members wanted to analyze how nature creates millions of color patterns across species. To conduct this and other kinds of research, we needed a better solution for digitizing specimens and organizing the data."
The insect specimens are stored in drawers, each with labels containing essential information like collection time and location. Traditionally, imaging the specimens and transcribing the associated data was a slow, manual process.
This prompted the Field Museum team, along with researchers from several other institutions, to develop custom software called DrawerDissect. This solution uses a combination of computer vision and large language models to automate the process.
The software takes a single image of a drawer and transforms it into individual specimen images, each with its own structured metadata. Beyond just pairing images with text, DrawerDissect allows for automated analysis of the specimens themselves, making it possible to record color diversity, size, and shape at a speed and scale that was previously impossible.
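To make the drawer-to-specimen idea concrete, here is a minimal Python sketch of the general approach: given a drawer image and a set of bounding boxes, crop each specimen and attach structured metadata plus a couple of simple measurements. This is not DrawerDissect's actual code. The names `SpecimenRecord`, `dissect_drawer`, and `mean_color`, the metadata fields, and the toy image are all hypothetical, and the image is represented as nested lists of RGB tuples so the example runs without any imaging library.

```python
from dataclasses import dataclass, field


@dataclass
class SpecimenRecord:
    """One cropped specimen plus its metadata (hypothetical structure)."""
    specimen_id: str
    box: tuple      # (left, top, right, bottom) in drawer pixel coordinates
    pixels: list    # cropped rows of (r, g, b) tuples
    metadata: dict = field(default_factory=dict)


def crop(image, box):
    """Cut the rectangle described by box out of a row-major image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def mean_color(pixels):
    """Average each RGB channel over all pixels in a crop."""
    flat = [px for row in pixels for px in row]
    return tuple(sum(px[c] for px in flat) / len(flat) for c in range(3))


def dissect_drawer(image, boxes, drawer_metadata):
    """Turn one drawer image into per-specimen records."""
    records = []
    for i, box in enumerate(boxes, start=1):
        left, top, right, bottom = box
        pixels = crop(image, box)
        meta = dict(drawer_metadata)  # each specimen inherits drawer-level data
        meta.update(
            width_px=right - left,
            height_px=bottom - top,
            mean_rgb=mean_color(pixels),
        )
        records.append(SpecimenRecord(f"specimen_{i:03d}", box, pixels, meta))
    return records


# Toy drawer: 6 pixels wide, 4 pixels tall, black background.
drawer = [[(0, 0, 0)] * 6 for _ in range(4)]
# Paint a 2x2 reddish "specimen" at columns 1-2, rows 1-2.
for y in (1, 2):
    for x in (1, 2):
        drawer[y][x] = (200, 10, 10)

records = dissect_drawer(drawer, [(1, 1, 3, 3)], {"drawer_id": "example-drawer"})
print(records[0].metadata)
```

In a real pipeline the boxes would come from a trained detection model and the measurements would likely use segmentation masks rather than whole rectangles. The sketch only shows how a single drawer image can fan out into many structured records.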
To build the software, the team used Roboflow to train their custom computer vision models. As Elizabeth Postema, the Postdoctoral Researcher leading the software development for DrawerDissect, noted, "Roboflow allowed our team to develop these accurate vision models, based on real-world data from the collection, even without formal training in machine learning."
The team first used Roboflow's annotation tools to prepare thousands of images from the specimen drawers. They then trained their custom models in the cloud, avoiding the need for expensive local hardware. The models were integrated into the DrawerDissect software via an API, allowing for seamless operation while processing drawers and conducting research analyses.
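As an illustration of what consuming a hosted detection model over an API might involve, the Python sketch below filters a JSON-style response by confidence and converts each detection into a crop rectangle. It assumes a response whose `predictions` carry a center point (`x`, `y`), a `width`, a `height`, and a `confidence`. That shape, the example values, and the function names `to_corner_box` and `select_boxes` are assumptions for illustration, not a description of how DrawerDissect is implemented.

```python
def to_corner_box(pred):
    """Convert a center-based detection into (left, top, right, bottom)."""
    half_w = pred["width"] / 2
    half_h = pred["height"] / 2
    return (
        round(pred["x"] - half_w),
        round(pred["y"] - half_h),
        round(pred["x"] + half_w),
        round(pred["y"] + half_h),
    )


def select_boxes(response, min_confidence=0.5):
    """Keep detections at or above the threshold and return crop boxes."""
    return [
        to_corner_box(pred)
        for pred in response.get("predictions", [])
        if pred.get("confidence", 0.0) >= min_confidence
    ]


# Hypothetical response for one drawer image (values are made up).
example_response = {
    "predictions": [
        {"x": 120, "y": 80, "width": 40, "height": 60,
         "class": "specimen", "confidence": 0.93},
        {"x": 300, "y": 90, "width": 30, "height": 30,
         "class": "specimen", "confidence": 0.21},
    ]
}

boxes = select_boxes(example_response)
print(boxes)
```

The confidence threshold is a design choice: a higher value misses more specimens but produces fewer false crops, so a collection team would likely tune it against manually checked drawers.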
The DrawerDissect project has significantly advanced the Field Museum's goal of making its insect collection more accessible for research.
One example of the software's power was the digitization of the museum’s entire collection of over 13,000 tiger beetles in just four weeks. This task would have been prohibitively time-consuming with traditional methods, and the resulting images allowed the team to create a separate vision model to assist researchers with species identification.
As Postema said, "With the computer vision models we developed in Roboflow, it’s possible to process thousands of specimens very quickly, which is unlocking macroevolutionary analyses at a scale that was previously not feasible."
To learn more about the project, see the full research paper, the DrawerDissect software, and the project's computer vision datasets.
Located on Chicago’s iconic Lake Michigan shore, the Field Museum first opened in 1894 and boasts a growing collection of nearly 40 million artifacts and specimens. The museum employs more than 150 scientists and researchers who travel to the far corners of the world in search of new discoveries and clues to what life was like hundreds, thousands, and millions of years ago.