ImageNet Roulette is a provocation designed to help us see how humans are classified by machine learning systems. It uses a neural network trained on the “Person” categories from the ImageNet dataset, which contains over 2,500 labels used to classify images of people.
Starting Friday, September 27th, this application will no longer be available online.
ImageNet Roulette was launched earlier this year as part of a broader project to draw attention to the things that can – and regularly do – go wrong when artificial intelligence models are trained on problematic data.
ImageNet Roulette: An Experiment in Classification
ImageNet is one of the most important and historically significant training sets in artificial intelligence. In the words of its creators, the idea behind ImageNet was to “map out the entire world of objects.” After its initial launch in 2009, ImageNet grew enormous: the development team scraped a collection of many millions of images from the Internet and briefly became the world's largest academic user of Amazon’s Mechanical Turk, using an army of pieceworkers to sort an average of 50 images each minute into thousands of categories. When it was finished, ImageNet consisted of over 14 million labelled images organized into more than twenty thousand categories.
The underlying structure of ImageNet is based on the semantic structure of WordNet, a database of word classifications developed at Princeton University in the 1980s.
The ImageNet dataset is typically used for object recognition. But as part of the research for the forthcoming “Excavating AI” project by Trevor Paglen and Kate Crawford, we were interested in seeing what would happen if we trained an AI model exclusively on its “Person” categories. ImageNet contains 2833 sub-categories under the top-level category “Person.” The sub-category with the most associated pictures is “gal” (with 1664 images), followed by “grandfather” (1662), “dad” (1643), and “chief executive officer” (1614). ImageNet classifies people into a huge range of categories, including race, nationality, profession, economic status, behavior, character, and even morality.
The result of that experiment is ImageNet Roulette.
ImageNet Roulette uses the open-source Caffe deep learning framework, trained on the images and labels from the “Person” categories (which are currently ‘down for maintenance’). Proper nouns and categories with fewer than 100 pictures were removed.
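That filtering step can be sketched in a few lines. This is a hypothetical reconstruction, not the project's actual preprocessing code: the data shapes, the `filter_categories` and `is_proper_noun` names, and the capitalization heuristic are all assumptions made for illustration.

```python
# Hypothetical sketch of the category filtering described above: drop
# "Person" sub-categories that are proper nouns or that have fewer than
# 100 associated images before training.

def filter_categories(categories, min_images=100):
    """categories: dict mapping a label to its list of image paths.

    Returns only the categories that survive both filters.
    """
    return {
        label: images
        for label, images in categories.items()
        if len(images) >= min_images and not is_proper_noun(label)
    }

def is_proper_noun(label):
    # Crude stand-in heuristic: treat capitalized labels as proper nouns.
    # A real pipeline would consult WordNet's synset metadata instead.
    return label[:1].isupper()
```

For example, `filter_categories({"gal": paths_a, "Einstein": paths_b})` would keep “gal” (if it has at least 100 images) and drop “Einstein” as a proper noun.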
When a user uploads a picture, the application first runs a face detector to locate any faces. If it finds any, it sends them to the Caffe model for classification. The application then returns the original image with a bounding box showing the detected face and the label the classifier has assigned to it. If no faces are detected, the application sends the entire scene to the Caffe model and returns an image with a label in the upper left corner.
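The routing logic described above can be sketched as follows. This is a minimal illustration of the detect-then-classify flow, not the application's real code: `classify_image`, `crop`, and the detector/classifier callables are hypothetical stand-ins for the actual face detector and Caffe model.

```python
# Sketch of ImageNet Roulette's routing logic: if faces are found,
# classify each face crop; otherwise classify the whole scene.

def classify_image(image, detect_faces, classify):
    """Return a list of (bounding_box, label) pairs.

    image        -- a 2-D pixel grid (list of rows)
    detect_faces -- callable: image -> list of (x, y, w, h) boxes
    classify     -- callable: image region -> a "Person" category label
    """
    boxes = detect_faces(image)
    if boxes:
        # One label per detected face, attached to that face's bounding box.
        return [(box, classify(crop(image, box))) for box in boxes]
    # No faces detected: label the entire scene (box of None signals that
    # the label belongs in the image's upper left corner).
    return [(None, classify(image))]

def crop(image, box):
    """Extract the rectangular region (x, y, w, h) from a pixel grid."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]
```

A caller would then draw each returned bounding box and label onto the original image before sending it back to the user.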
ImageNet contains a number of problematic, offensive, and bizarre categories, all drawn from WordNet. Some use misogynistic or racist terminology. The results ImageNet Roulette returns will therefore draw on those same categories. That is by design: we want to shed light on what happens when technical systems are trained on problematic data. AI classifications of people are rarely made visible to the people being classified. ImageNet Roulette provides a glimpse into that process, and shows some of the ways things can go wrong.
ImageNet Roulette does not store the photos people upload.
ImageNet Roulette is currently on view at the Fondazione Prada Osservatorio museum in Milan as part of the Training Humans exhibition.
A project by Trevor Paglen using images from ImageNet, “From Apple to Anomaly (Pictures and Words),” opens at the Barbican Centre in London on Sept. 25.