Researchers found and examined 21 sets of training data that the AI had received, which together included more than 100,000 pictures. But they realized a lot was missing from how the AI was trained: few photos were taken in the highly specific way that would be most helpful (one photo of the potentially cancerous lesion, along with a second taken through a special hand-held magnifier), and there was no information about how or why the images had been chosen for inclusion.
More significantly, the data sets lacked sufficient racial information for the program to learn what skin abnormalities might look like across a diverse range of skin tones.
Most of the data sets — 14 of the 21 — indicated the country the photos came from, but few specified the patients' skin colour or ethnicity. The images that did record skin colour were overwhelmingly of white skin: of 2,436 photos with that information, only 10 showed brown skin, and just one showed dark brown or black skin. Even fewer photos — 1,585 — specified ethnicity, and none were of people recorded as African, Afro-Caribbean or South Asian.
When it comes to something as specific as skin cancer, that's a significant problem, Wen explained.