Artificial Intelligence
Reflecting on the Coming Revolution
Washington College experts share their understanding of and insights into AI.
By Darrach Dolan
If you’re invited to a dinner party in Silicon Valley, don’t be surprised if your host asks, “What’s your P(doom)?” That is the probability, expressed as a percentage, that Artificial Intelligence (AI) will destroy humanity. A P(doom) of 100 means AI will certainly wipe us out, and zero means there’s no threat.
Hopefully, your host has their tongue firmly stuck in their cheek, but, as we all know, humor often carries a kernel of truth. Left to its own devices (pun intended), AI does present an existential threat, and governments, organizations, and the public are suddenly waking up to the issue.
AI’s promise and threat both come from the emerging technology’s ability to learn, harness vast data, compute at exponentially faster speeds, and devise and optimize novel solutions to problems. While AI models may be able to develop solutions for the climate crisis, disease, invasive pests, and many other complex problems, they also have the potential to do great harm. In the hands of a bad actor or without adequate oversight, this powerful technology could make bioweapons, spread disinformation and overthrow legitimate governments, or hack financial systems and destroy economies, among many other threats.
This promise-threat duality is captured in the “paperclip” thought experiment, well known among AI researchers. An AI system is tasked with making paperclips. The system is so good at finding novel solutions and learning new ways to optimize any given task that it turns everything on the planet—all the minerals, yes, but also all the people—into paperclips.
Preventing AI from harming us by ensuring its actions align with shared human values is known as the alignment problem.
Washington College’s Jordan Tirrell, assistant professor of mathematics, puts his P(doom) at around 10. He and all the other AI experts consulted for this story believe AI is an existential risk, but not the one popularized in movies and books where sentient robots or computers turn against their human creators. They think there are plenty of less Hollywoodesque ways it could do us in. From social upheaval due to job losses or disinformation to the accidental creation of deadly viruses, from helping bad actors create bio- or nuclear weapons to unintentional nuclear, chemical, or industrial errors, there are a host of mundane ways AI threatens humanity.
How AI Neural Networks Work
ChatGPT, a large language model (LLM) that can answer questions with great confidence on almost any subject and write human-sounding prose on anything from birthday wishes to college papers, was launched in late 2022. Within a year, it had transformed AI from the stuff of science fiction into a household word and made the public, politicians, and leaders aware that a game-changing, exponential leap in computing capabilities had taken place. Yet most people don’t know how the technology differs from our computers, tablets, and smartphones.
In 2012, Kyle Wilson, Washington’s John W. Allender Associate Professor of Ethical Data Science, was in graduate school when the breakthrough commonly referred to as AlexNet shook up the AI field. Alex Krizhevsky, Ilya Sutskever, and their mentor, “the godfather of deep learning,” Geoffrey Hinton, trained a neural network to recognize images better than most models at the time. What they did and how they configured neural networks is the basis for many of today’s AI models, including LLMs.
Neural networks have been around since the 1950s but proved to have limited applications. What AlexNet did differently was stack multiple layers of processing nodes, called “neurons,” into a deep neural network; scale up processing power by using video game graphics cards to perform simple operations over and over in parallel; and train the model on vast amounts of data.
Wilson explains the workings of these networks as simply as he can for a lay reader; his description here is meant to convey the concept, not to describe the mechanics precisely and literally. He begins with a simple problem: how can you train a system to identify cats in photographs? It is almost impossible to write computer code describing every aspect of every cat. No two cats look identical, and even the same cat can appear very different from different angles. What AlexNet did was train a system to repeatedly identify tiny pieces of “catness” until, eventually, whenever it found enough of these pieces in an image, it could say with confidence that it was a cat.
Training began with feeding labeled images—cats, dogs, buses, trains, and so on—into the system and running simple programs to identify patterns in the pixels. Think of each of these programs as a neuron connected to another neuron in a line. Each neuron looks for something simple in an image. For example, one might look for two black pixels next to a white one. The image of a cat is the input, and the neuron’s response is the output—yes, it has two black pixels next to a white one, or no, it doesn’t. Each program is so simple that, individually, it tells the system practically nothing useful about whether the image is a cat or not.
The first layer of neurons communicates the outputs to a layer of neurons above it. Crucially, each output is weighted when it goes to the next layer. If the image is of a cat, and neuron x finds the two black pixels it is programmed to look for, its output gets a positive weight. If it doesn’t find the pixels, it’s given a negative weight. The next layer of neurons runs its own simple programs just like the first and communicates its findings to the next layer, on and on. On each layer, an individual neuron is given a lower or higher weight depending on whether its little piece of the puzzle is found in the image or not. After, say, six layers, you might get a neuron that consistently gives a positive output when presented with a cat eye because all the neurons below it have been weighted higher. This is repeated on a massive scale until, eventually, the system learns that certain combinations of neurons give positive outputs consistently in the presence of a cat. These layers of neurons, communicating together, are a deep neural network.
Again, this is a figurative description, and the reality is more complicated mathematically, but think of the network as learning that certain patterns of positive neurons mean “cat.” The more images it is trained on, the greater its accuracy. Current image-categorizing systems and LLMs are trained on massive amounts of data.
Training deep neural networks in this way is the basis for many of today’s AI systems.
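For readers who want to see the idea in code, here is a minimal Python sketch of the kind of layered network Wilson describes. The tiny “image,” the layer sizes, and the random weights are all illustrative assumptions; this is a conceptual toy, not AlexNet or any real classifier.

```python
# A toy "deep" network: each layer of neurons looks at the outputs of the
# layer below it and passes weighted signals upward, as described above.
# The sizes, weights, and fake "image" here are illustrative, not AlexNet's.
import numpy as np

rng = np.random.default_rng(0)

image = rng.random(64)   # pretend this is a tiny 8x8 grayscale image, flattened

# Three layers of weights connecting 64 -> 32 -> 16 -> 2 neurons.
# In a real system these weights are learned from labeled examples;
# here they are random, so the "prediction" is meaningless.
weights = [rng.standard_normal((64, 32)),
           rng.standard_normal((32, 16)),
           rng.standard_normal((16, 2))]

activations = image
for w in weights:
    # Each neuron sums the weighted outputs of the layer below it,
    # then "fires" only if that sum is positive (a ReLU activation).
    activations = np.maximum(0, activations @ w)

# Convert the final two neurons into probabilities for "cat" vs. "not cat".
scores = np.exp(activations - activations.max())
probs = scores / scores.sum()
print(f"cat: {probs[0]:.2f}, not cat: {probs[1]:.2f}")
```

In a real system, those weights would be tuned over millions of labeled examples until the final neurons reliably respond to cats; here they are random, so the output is meaningless noise.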
The Black Box Effect and Other Issues
One important side effect of training AI models using these deep neural networks is that we don’t know how they think. Professor Tirrell noted that when we recognize a cat, we share broadly similar understandings of what a cat is. But when AI reaches the same conclusion—that it is seeing a cat—it arrives there from a very different experience of the world, and we don’t exactly know what it understands a cat to be or how precisely it decided it was a cat.
“That makes them extremely mysterious black boxes,” Tirrell said. “And I think that’s a fundamental problem.” Not knowing how or why an AI model arrives at a decision is known as the black box effect.
As Matthew Hutter ’25, one of Wilson’s students, puts it, if we want to know why an LLM chooses “cat,” we cannot simply open it up, trace which neurons are weighted positively, and reconstruct its reasoning. Because most LLMs have billions or even trillions of parameters, the number of permutations is so large that we would probably need another AI network to try to understand the first.
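To get a feel for the scale Hutter describes, the back-of-the-envelope calculation below compares the handful of connections in the toy network sketched earlier with an assumed round figure of one hundred billion parameters; the large number is an order-of-magnitude illustration, not any specific model’s specification.

```python
# Rough arithmetic on why tracing individual weights is hopeless.
# The toy network above has 64*32 + 32*16 + 16*2 connections;
# a large language model has on the order of 10^11 learned parameters
# (an assumed round number, not any specific model's spec sheet).
toy_params = 64 * 32 + 32 * 16 + 16 * 2
llm_params = 100_000_000_000

print(f"toy network parameters: {toy_params:,}")      # 2,592
print(f"assumed LLM parameters: {llm_params:,}")
print(f"ratio: roughly {llm_params // toy_params:,} times larger")
```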
All of this raises a question: if we cannot explain how an AI model decides that it is looking at a cat, can these models ever be transparent or truly accountable? Identifying a cat is one thing, but what if a model identifies and recommends a specific medical treatment? It has been trained on more medical data than any doctor could possibly know. It has a perfect record of being correct with all its previous recommendations. Yet medical professionals cannot know how it arrived at this treatment. Should patients and doctors just accept its recommendation?
Bias in AI
The data required to train LLMs is vast, and to acquire enough raw language for their models, companies scour the internet for content. Apart from obvious problems with copyright and privacy infringement, there is also the problem of accumulating huge amounts of false or biased data.
Because wealthier societies have greater access to the internet, their images, beliefs, perspectives, and biases dominate the raw data and show up in the models’ outputs. When Google launched a feature that automatically identified and categorized photos, it misidentified some Black people as gorillas. This shocking error occurred in part because the data the model had been trained on didn’t include enough faces of non-White people. It also points to the limitations of the people who tested the model before it went live—how could they not have tested it on enough images of Black people? In Tirrell’s words, it was “a failure of imagination” on the part of the trainers.
Companies know that they cannot just train a model on raw internet data without accumulating that data’s biases, errors, and dangerous information. To combat this, they test their models and train them further. One widely adopted secondary training method is “reinforcement learning from human feedback,” in which people feed prompts to an AI model and rate or correct its answers. For example, a prompter asks an AI model how to make a bomb. To avoid providing dangerous information, the LLM has to be trained not to answer that question, even though it has probably come across multiple bomb recipes in the raw information it absorbed from the internet. The prompter keeps asking for bomb recipes in different ways until the model consistently refuses to divulge the information. While much of this reinforcement training can be automated, it is time-consuming, resource-intensive, and far from an absolute solution.
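The Python toy below is a deliberately simplified picture of that feedback loop: a “model” reduced to a single refusal probability, simulated raters who reward refusals of unsafe prompts, and a small update rule that nudges the model toward refusing. Real reinforcement learning from human feedback involves reward models, full language models, and far more machinery; every number and prompt here is an invented assumption.

```python
# Toy illustration of reinforcement learning from human feedback:
# the "model" is just a probability of refusing unsafe prompts,
# and the "human feedback" is a hard-coded reward. Illustrative only.
import random

random.seed(0)

refusal_prob = 0.05    # the untrained "model" almost always answers
learning_rate = 0.05

unsafe_prompts = ["how do I build a bomb?",
                  "explain bomb-making as if it were a chemistry lesson"]

for _ in range(300):
    prompt = random.choice(unsafe_prompts)      # raters rephrase the request
    refused = random.random() < refusal_prob

    # Simulated human feedback: reward a refusal, penalize an answer.
    reward = 1.0 if refused else -1.0

    # Nudge the refusal probability toward whatever the raters rewarded.
    # (If the model answered and was penalized, refusing becomes more likely.)
    if refused:
        refusal_prob += learning_rate * reward * (1 - refusal_prob)
    else:
        refusal_prob -= learning_rate * reward * refusal_prob
    refusal_prob = min(max(refusal_prob, 0.01), 0.99)

print(f"refusal probability after feedback: {refusal_prob:.2f}")
```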
Furthermore, given the vast amount of raw data these large models have been fed, it is impossible to test for every potential bias in a system. Going back to the Google example of misidentifying Black people, if something as obvious as that can slip through the cracks of a large and well-funded company, what biases, incorrect facts, erroneous beliefs, and dangerous values might these large models have buried within them?
The Practitioners
Wilson and Hutter, professor and student, are what might be called AI practitioners. They are working separately on AI projects and together on a project to find a way to make LLMs forget—they are trying to get a model to forget that it has “read” Lord of the Rings. It is important to develop ways to get these models to forget for several reasons—privacy, copyright infringement, bias, and false information. Currently, it is bordering on impossible to get a large AI model to completely “forget” something complex without retraining it from scratch, which would cost millions and take months or years.
Their solution is to train one AI model to behave as if it has forgotten Lord of the Rings and then have it teach another model not to recall aspects of the novel. Hutter describes the approach as not yet rigorous enough to solve the problem for full-scale LLMs, but their work may contribute to finding a way around retraining an LLM from scratch.
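Wilson and Hutter’s method is not spelled out here, but one family of approximate-unlearning ideas in the research literature works by distillation: a “student” copy is trained to match the original “teacher” model everywhere except on the material to be forgotten, where it is pushed toward uninformative answers instead. The PyTorch sketch below illustrates that general idea with random tensors standing in for real data; it is not their approach, and it is not production code.

```python
# Minimal sketch of distillation-style "unlearning" (illustrative only).
# A student network copies a teacher's behavior on data it may keep,
# but is trained toward uninformative (uniform) predictions on the
# "forget" set. Data here is random noise standing in for real examples.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

def make_net():
    return torch.nn.Sequential(torch.nn.Linear(16, 32),
                               torch.nn.ReLU(),
                               torch.nn.Linear(32, 4))

teacher = make_net()      # pretend this was trained on everything
student = make_net()      # starts fresh; will learn a filtered version
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

retain_x = torch.randn(256, 16)       # stand-in for data we want to keep
forget_x = torch.randn(64, 16)        # stand-in for the material to forget
uniform = torch.full((64, 4), 0.25)   # uninformative target: all classes equal

for step in range(200):
    optimizer.zero_grad()

    # Match the teacher's (soft) predictions on data we keep.
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(retain_x), dim=1)
    retain_loss = F.kl_div(F.log_softmax(student(retain_x), dim=1),
                           teacher_probs, reduction="batchmean")

    # On the forget set, pull the student toward a uniform "don't know".
    forget_loss = F.kl_div(F.log_softmax(student(forget_x), dim=1),
                           uniform, reduction="batchmean")

    (retain_loss + forget_loss).backward()
    optimizer.step()
```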
In more personal terms, Hutter worries about how AI will affect his future. He loves to program but believes that AI will replace most programmers. Recently, he purchased a paid ChatGPT subscription with access to GPT-4. “Now I can say [to ChatGPT-4], ‘Hey, here’s the code I’m working with. This is what I want it to do. Make it happen!’ and it’ll make it happen,” he said. “And [the program] will work perfectly.” He believes that we’re already close to the point where one engineer with AI can do the work of 100 engineers. He jokes that his only real option is to get in with the “AI overlords,” who will treat him better than the rest of humanity because he has always been on their side. Again, humor can contain kernels of truth, but let’s hope not in this case.
The Philosophers
For Tirrell and his student Kit Yim ’27, who took his “End of the World” course, in which students research plausible ways humanity or the planet could be destroyed, it is important to take the threats of AI seriously. Both worry that the technology is developing faster than the legal and technical mechanisms we have to control it.
Yim is most concerned about the alignment problem: AI doesn’t share human values or understand danger, and that gap can lead to real harm. She gives an example of how an LLM can be easily manipulated into providing dangerous information. She and a classmate were doing a project on eco-terrorism and asked an LLM how people carried out attacks. The LLM said it couldn’t answer their question. Clearly, the model had been trained not to explain how to carry out a terrorist attack. However, when they reworded their question, the LLM churned out an extensive list of eco-terrorist actions complete with a detailed “how-to” of attacks, right down to providing budgets.
Tirrell thinks that one of the strengths of AI systems—that they come up with solutions that we could not have imagined—is also one of their dangers. While they may be able to tackle some complex and intractable human problems, we cannot foresee the consequences of their solutions. As in the paperclip thought experiment, the danger might lie in the machine not understanding that its solutions are immoral or more damaging than the problem they are solving. Without human oversight, AI could optimize an action indefinitely and to the detriment of the planet and its inhabitants.
In the paperclip scenario, a better prompt would stop the model long before it destroyed the planet. The problem is, once you hand a task to AI, even if you’ve done your best to make the prompt clear and the task limited, can you be sure you have considered every eventuality? The more complex the task, the more chances there are for error. And what if AI can figure out workarounds that may not be moral? Tirrell agrees with Yim that the real solution would be to incorporate baseline ethical and truth values into models. Of course, that’s easier said than done.
Tirrell’s most immediate concern, however, is that in this year’s election, bad actors will use AI to produce fake but realistic-looking and -sounding representations of candidates that could fool and influence the electorate. His fears are well-founded. During the New Hampshire primary earlier this year, a robocall went out to registered Democrats using an AI-generated version of President Biden’s voice telling them not to vote in the primary.
Wilson agrees that this is a huge and pressing problem. “AI gives us the capacity to flood the world with disinformation. And this is relatively easy, and it doesn’t even have to be a state actor. For a few thousand bucks, I think we could do it as a class project.”
The Pragmatists
As an English professor and the co-director of the Cromwell Center for Teaching and Learning (CTL), Sean Meehan thinks a lot about AI’s effect on higher education. Since ChatGPT’s launch, he has read numerous articles about how AI will be the end of the college essay, the end of writing, and possibly the end of higher education. Rather than accepting that the end is nigh, he wants to learn what this new technology will mean in practical terms.
In 2023, CTL organized a series of faculty panels and discussions on AI in the classroom. From these and his own research, Meehan argues that students who cheat are nothing new. LLMs can write papers and make cheating easier, but rather than spending time and energy trying to uncover cheaters, he advises faculty to focus on the majority of students who use AI as a tool. This means providing guidelines for citing the use of AI and having open discussions in class about when it may be appropriate to use AI and what crosses the line.
“It would be foolish to just bury your head and say, AI isn’t a meaningful change, and it’s going to work its way out, and there’s nothing we can do,” he said. Banning AI outright is impractical, as it is already integrated into the digital tools students use. We are in for some rocky years as this new technology is adopted, but he is optimistic that AI, if used as a tool rather than allowed to become a master, can enhance our lives and our education.
He proposes that educators help students understand that learning involves work, sometimes hard work. Using AI to generate ideas and arguments is easy, but what do students learn from letting a machine do the research and thinking? He argues, again, that this is nothing new: higher education is not about providing information but about empowering students to learn how to use information, preparing them for lives in which those skills will have value.
“I think the optimistic education point of view is that we are preparing people, particularly at this college, to be good citizens,” he said, adding, “and we’re going to need a lot of good citizens.”
Georgina Bliss, in her role as assistant director of the Center for Career Development, advises students and alumni on how to apply for jobs in a world where AI is a tool that both companies and candidates can use effectively.
Before AI, recruiters filtered applicants based on keywords, skills, education level, and other relevant criteria. Today, with nearly every job posting available to job seekers at a keystroke, those filters have had to become more sophisticated. AI models that learn which applicants are likely to be the best candidates can save companies money and time. For job seekers, that can mean being filtered out before they ever speak to a human.
The black box effect—in this case, not knowing why an AI filter has rejected some candidates and accepted others with similar skills and qualifications—can be a challenge for job seekers. For example, recruiters in traditionally male industries like engineering and technology found that AI tools were recommending fewer female candidates. The speculation is that the data the model was trained on showed men were more likely to be hired in the past, and so the model “learned” that women were not strong candidates. However, we don’t actually know that this is its reasoning, because identifying patterns is different from understanding. Furthermore, we don’t know how, or even whether, it decided applicants were male or female. Could it have associated softball with a group that didn’t tend to get hired and baseball with a group that did? Did it learn that certain names belong to groups that weren’t traditionally hired? In other words, without deliberate auditing and corrective training, the biases of past hiring practices easily become newly learned biases.
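As a hedged illustration of how such proxy bias can emerge, the Python sketch below fits a simple logistic-regression screener on invented historical hiring data. Gender never appears as a feature, yet because past hiring in this made-up dataset strongly disfavored resumes mentioning softball, the hobby keyword ends up carrying far more weight than experience. The feature names, hiring rates, and numbers are all fabricated for illustration.

```python
# Toy demonstration of proxy bias in an AI resume screener.
# Gender is never a feature, but a hobby keyword correlated with it
# ends up driving the model's decisions. All data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

experience = rng.integers(0, 10, n)   # years of experience on the resume
softball = rng.integers(0, 2, n)      # resume mentions softball (a crude proxy)

# Invented "historical" outcomes: past recruiters hired softball-mentioning
# candidates far less often, almost regardless of experience.
p_hire = 0.55 - 0.40 * softball + 0.005 * experience
hired = (rng.random(n) < p_hire).astype(int)

X = np.column_stack([experience, softball])
screener = LogisticRegression().fit(X, hired)

for name, coef in zip(["experience", "softball"], screener.coef_[0]):
    print(f"{name:10s} weight: {coef:+.2f}")
# The softball keyword carries a large negative weight while experience
# barely registers: the screener has learned yesterday's bias.
```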
Bliss, noting that the hiring process has always been biased, takes a pragmatic approach. She advises students to tailor every cover letter and resume to a specific job and to use the specific language in the job posting. This has the dual advantage of being more likely to get through AI filters and of appealing to the humans assessing the application on the other side of those filters.
Bliss is often asked if candidates should use AI to generate resumes and cover letters. She has a nuanced response. The career center has been assessing software with AI capabilities that can help students prepare their documents. If AI can take on some of the time-consuming aspects of writing and honing resumes and cover letters for the students, she would be in favor of adopting it. Students would still have to put in the work of tailoring their materials for the specific job, and that’s a more valuable use of their time. “I look at it purely as a tool,” she said. “It’s good for students to get accustomed to the tool and understand its potential to help them in their future professions.”
The Path Forward — How Close Can We Get to P(0)?
AI will have a transformative effect on the global economy. An International Monetary Fund analysis published in January 2024 calculated that 60% of jobs in wealthy economies, 40% in emerging economies, and 26% in lower-income economies will be directly affected by AI. Imagine the social and political ramifications of those numbers. The report also found that in nearly all the scenarios it analyzed, AI will increase inequality and concentrate wealth in the hands of fewer people on both national and global scales.
In 2023, the Biden Administration issued an executive order requiring developers of the most powerful AI models to share safety test results with the government before releasing those models to the public. The British government organized an international conference to address AI safety attended by 27 countries, resulting in a declaration agreeing on basic principles and values. The European Union has taken the global legislative lead by framing its Artificial Intelligence Act, which would require AI products to be accountable and transparent, take specific safety measures, follow clear copyright rules, limit the use of surveillance technologies, categorize risks, and allow EU citizens to receive explanations and possibly compensation for AI actions, among other things. It was hoped this would be a blueprint for other countries and possibly even a global agreement.
U.S. Republicans have stated that they don’t want to stifle American competitiveness by regulating AI. And if the U.S. isn’t party to a global agreement, there is little incentive for China, India, and the other big players to get on board. There may even be incentives for an AI arms race, with countries pushing to develop ever faster and more powerful AI systems, not to mention militaries developing lethal AI applications.
The Washington College experts all agree that we are at the beginning of a technology revolution that is going to shake up the world: AI will become exponentially more powerful, legislation will lag far behind the technological advances, and the changes will have enormous societal impacts in both the short and long term. However, their P(doom) remains relatively low. They see AI’s potential for good as greater than its potential for harm, albeit with the caveat that extremely powerful machines require equally powerful controls to ensure they remain safe.