Real archaeological fieldwork is seldom as exciting as it looks in the movies. You tend to get fewer reanimated mummies, deadly booby traps, and dramatic shootouts with Nazis. Instead, you’ll see pieces of broken pottery—a lot of them. Potsherds are ubiquitous at archaeological sites, and that’s true for pretty much every culture since people invented pottery. In the US Southwest in particular, museums have collected sherds by the tens of thousands.
Although all those broken bits may not look like much at first glance, they’re often the key to piecing together the past.
“[Potsherds] provide archaeologists with critical information about the time a site was occupied, the cultural group with which it was associated, and other groups with whom they interacted,” said Northern Arizona University archaeologist Chris Downum, who co-authored a new study with Leszek Pawlowicz.
Members of different cultures have always made their own container types, using their own techniques and decorating in their own ways. And within each culture, those styles and techniques have changed over time. That’s why archaeologists can often look at a site’s potsherds to tell who lived there and how long ago. They’re what archaeologists call diagnostic artifacts.
But getting that information requires sorting and classifying the potsherds, usually based on the small details of how they’re made or decorated. At some sites, archaeologists in the lab find themselves sorting hundreds or even thousands of potsherds. It’s “hundreds of hours of tedious, painstaking, eye-straining work,” as Pawlowicz put it, and it can take years to learn to do it reliably and well. Even then, archaeologists don’t always agree on what’s what, which can impact how they tell the story of the past.
A high-tech matching game
Pawlowicz and Downum recently turned to machine learning for a faster way to sort through all those mountains of potsherds.
Between 825 and 1300 CE, people living in the canyons and mesas of northeast Arizona stored their food and water in hand-shaped containers that were elaborately decorated with dark brown or black geometric patterns on a white background. Today, we know these artisans as the Kayenta Branch of the Ancestral Pueblo culture—a group of indigenous Americans who were the ancestors of the modern Hopi people. Their pottery, now called Tusayan White Ware, varied over time and between places, and archaeologists have sorted it further into a handful of smaller categories.
That’s exactly what Pawlowicz and Downum asked four experienced archaeologists to do with 3,000 potsherd photos taken at museums in northeastern Arizona. The pieces that archaeologists agreed came from a specific subtype (roughly 2,400 of them) became the dataset used to train a computer program called a Convolutional Neural Network, or CNN. Sometimes the images were randomly shrunk, enlarged, or rotated to ensure that the program could deal with those variations.
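The two preparation steps described above, keeping only the sherds the experts agreed on and randomly resizing or rotating the training images, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the researchers' actual pipeline: the function names, the agreement threshold `min_agree`, the scale and angle ranges, and the nearest-neighbour resampling are all hypothetical choices, and the article does not say whether "agreed" meant a unanimous or a majority vote.

```python
import math
import random
from collections import Counter


def consensus_label(votes, min_agree):
    """Return the most common label if at least `min_agree` raters chose it, else None.

    `min_agree` is an assumed threshold; the study's exact rule is not given here.
    """
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agree else None


def augment(image, angle_deg, scale, fill=0):
    """Rotate and scale an image about its centre with nearest-neighbour sampling.

    `image` is a list of equal-length rows of pixel values. The output has the
    same size; pixels that fall outside the source are set to `fill`.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: find the source pixel that lands on output (x, y).
            dx, dy = (x - cx) / scale, (y - cy) / scale
            sx = cos_t * dx + sin_t * dy + cx
            sy = -sin_t * dx + cos_t * dy + cy
            ix, iy = round(sx), round(sy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = image[iy][ix]
    return out


def random_augment(image, rng, max_angle=180.0, scale_range=(0.8, 1.2)):
    """Apply a random rotation and a random shrink or enlargement (assumed ranges)."""
    angle = rng.uniform(-max_angle, max_angle)
    scale = rng.uniform(*scale_range)
    return augment(image, angle, scale)


# Hypothetical votes from four raters on two sherds; labels are placeholders.
votes_by_sherd = {
    "sherd_001": ["type_a", "type_a", "type_b", "type_a"],
    "sherd_002": ["type_a", "type_a", "type_b", "type_b"],
}
training_labels = {
    sherd: label
    for sherd, votes in votes_by_sherd.items()
    if (label := consensus_label(votes, min_agree=3)) is not None
}

# A tiny stand-in "photo" to show the augmentation call.
tiny_image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
augmented = random_augment(tiny_image, random.Random(0))
```

In practice an image library would handle the resampling, and interpolation would be smoother than nearest-neighbour. The point is that each training photo can yield many slightly different views, so the network learns the design rather than the framing of a particular photo.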
CNNs are good at analyzing visual information; they have been used to sort through image search results and to look for signs of pathology in medical X-ray images. Show one enough labeled pictures of dogs, for instance, and it will eventually learn to tell the difference between a beagle and a mastiff.
When pitted against the four expert archaeologists in a final potsherd sorting showdown, the neural network outperformed two of the humans and tied with the other two.
The experiment’s result suggests that neural networks may be useful tools for future archaeologists, especially if there is a lot of potsherd sorting to get done. And it’s not the first result of its kind; a different team of archaeologists trained a CNN to sort medieval French potsherds based on 3D scans, and the program was about 96 percent accurate. That’s not an improvement over human accuracy, but it could offer a more efficient way to deal with the sheer number of potsherds some sites offer up.
“This will free up time and effort for archaeologists to concentrate on the meaning of the results,” wrote Pawlowicz and Downum.
Someday, the researchers suggest, a mobile or web application could connect archaeologists in the field or the lab to a CNN that could classify potsherd photos on the fly, link to similar sherds, and even offer metadata about the site. That, of course, would depend on convincing archaeologists to upload their own photos and data to the central database for everyone’s benefit—which may be harder than programming and training the neural networks.
Proof of concept
For now, Pawlowicz and Downum’s recent study is a proof of concept. They chose a pottery type, Tusayan White Ware, that is especially easy for a computer to sort based on photos, because its patterns contrast so strongly with the background. A neural network would likely do reasonably well at sorting other types of decorated pottery, but so-called plainware—ceramics without any visible decoration or markings—would probably be a bridge too far.
There are some things humans may always do better than any of our electronic creations. On the other hand, neural networks do have some advantages. For example, archaeologists arguing about potsherd classifications often struggle to explain why they’ve put a sherd in a particular category.
“An archaeologist experienced in decorated ceramics is often capable of assigning a type to a sherd in a fraction of a second, without consciously thinking of all the design rules for that type,” wrote Pawlowicz and Downum. Their CNN, on the other hand, color-coded specific features on the photos that explained its choices. By combining that ability with the more intuitive work of human archaeologists, future work could help sort out some artifacts that might otherwise go unclassified.
In other words, the tedious and meticulous work of sorting potsherds may one day be a joint effort between people and our most advanced artifacts.