What is ChatGPT, and how did it come about?

Image created by stablecog.com

OpenAI’s breakout hit, ChatGPT, has taken the world by storm. Released at the end of November 2022 as a web app by the San Francisco-based firm, the chatbot exploded into the mainstream almost overnight. By some estimates it is the fastest-growing internet service ever, reaching 100 million users in January 2023, just two months after launch. Through OpenAI’s $10 billion deal with Microsoft, the tech is now being built into Office software and the Bing search engine. Stung into action by its newly awakened onetime rival in the battle for search, Google is fast-tracking the rollout of its own chatbot, Bard, built on its LaMDA language model. Even my family’s WhatsApp is filled with ChatGPT chat.

But OpenAI’s breakout hit did not come out of nowhere. The chatbot is the most polished iteration to date in a line of large language models going back years. This is how we got here.

1980s–’90s: Recurrent Neural Networks
ChatGPT is a version of GPT-3, a large language model also developed by OpenAI. Language models are a type of neural network that has been trained on lots and lots of text. (Neural networks are software inspired by the way neurons in animal brains signal one another.) Because the text is made up of sequences of letters and words of varying lengths, language models require a type of neural network that can make sense of that kind of data. Recurrent neural networks, invented in the 1980s, can handle sequences of words, but they are slow to train and can forget previous words in a sequence.
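
To see why, here is a minimal NumPy sketch of a single vanilla RNN step (the dimensions and weights are made-up toy values, not any real model):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One recurrent step: the new hidden state mixes the current
    input with whatever survived from earlier steps."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# Toy usage with made-up sizes: input dim 8, hidden dim 16, 5 tokens.
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(8, 16)) * 0.1
W_hh = rng.normal(size=(16, 16)) * 0.1
b_h = np.zeros(16)

h = np.zeros(16)
for t in range(5):
    x_t = rng.normal(size=8)  # stand-in for a word embedding
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
# Each step depends on the previous one, so training cannot be
# parallelized across the sequence, and early inputs fade as they
# are repeatedly squashed through tanh.
```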

In 1997, computer scientists Sepp Hochreiter and Jürgen Schmidhuber fixed this by inventing LSTM (Long Short-Term Memory) networks, recurrent neural networks with special components that allowed past data in an input sequence to be retained for longer. LSTMs could handle strings of text several hundred words long, but their language skills were limited.
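
Here is a rough sketch of the gating idea in the same NumPy style (biases omitted and all sizes illustrative; this conveys the general mechanism, not Hochreiter and Schmidhuber’s exact formulation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_f, W_i, W_o, W_c):
    """One LSTM step: gates decide what to forget, what to admit,
    and what to expose, so the cell state c can carry information
    across many steps largely unchanged."""
    z = np.concatenate([x_t, h_prev])
    f = sigmoid(z @ W_f)                   # forget gate: keep or erase old memory
    i = sigmoid(z @ W_i)                   # input gate: admit new information
    o = sigmoid(z @ W_o)                   # output gate: what to reveal as h
    c = f * c_prev + i * np.tanh(z @ W_c)  # updated long-term cell state
    h = o * np.tanh(c)                     # updated hidden state
    return h, c

# Toy usage with made-up sizes: input dim 8, hidden dim 16.
rng = np.random.default_rng(0)
make_w = lambda: rng.normal(size=(8 + 16, 16)) * 0.1
W_f, W_i, W_o, W_c = make_w(), make_w(), make_w(), make_w()
h, c = np.zeros(16), np.zeros(16)
h, c = lstm_step(rng.normal(size=8), h, c, W_f, W_i, W_o, W_c)
```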

2017: Transformers
The breakthrough behind today’s generation of large language models came when a team of Google researchers invented transformers, a kind of neural network that can track where each word or phrase appears in a sequence. The meaning of words often depends on the meaning of other words that come before or after. By tracking this contextual information, transformers can handle longer strings of text and capture the meanings of words more accurately. For example, “hot dog” means very different things in the sentences “Hot dogs should be given plenty of water” and “Hot dogs should be eaten with mustard.”
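
The mechanism at the heart of the transformer is self-attention. Below is a minimal single-head NumPy sketch (learned projections, multiple heads, and positional encodings are omitted, and the embeddings are random stand-ins):

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention: every token's output is a
    context-weighted mix of every token's representation."""
    scores = X @ X.T / np.sqrt(X.shape[-1])        # pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over rows
    return weights @ X

# Toy usage: 4 tokens ("hot", "dogs", "need", "water") with random
# stand-in embeddings of dimension 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
out = self_attention(X)
print(out.shape)  # (4, 8): one context-aware vector per token, so the
                  # vector for "dogs" now reflects "hot" and "water" too
```

Unlike the recurrent loop above, all token pairs are compared in one matrix multiplication, which is what makes transformers fast to train on long texts.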

2018–2019: GPT and GPT-2
OpenAI’s first two large language models came just a few months apart. The company wants to develop multi-skilled, general-purpose AI and believes that large language models are a key step toward that goal. GPT (short for Generative Pre-trained Transformer) planted a flag, beating the state of the art on natural-language-processing benchmarks at the time.

GPT combined transformers with unsupervised learning, a way to train machine-learning models on data (in this case, lots and lots of text) that hasn’t been annotated beforehand. This lets the software figure out patterns in the data by itself, without being told what it’s looking at. Many previous successes in machine learning had relied on supervised learning and annotated data, but labeling data by hand is slow work, which limits the size of the data sets available for training.
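
A toy illustration of why no annotation is needed: in language modeling, the training labels come for free, since the target at each position is simply the next token in the raw text. (This next-token framing is the standard recipe for GPT-style models, sketched here with a made-up eight-word corpus.)

```python
# Self-supervised language modeling needs no hand-made labels: the
# target at each position is simply the next token in the raw text.
text = "hot dogs should be given plenty of water".split()
vocab = {word: i for i, word in enumerate(sorted(set(text)))}
ids = [vocab[word] for word in text]

inputs, targets = ids[:-1], ids[1:]  # shift by one token
for x, y in zip(inputs, targets):
    print(f"given token {x} -> predict token {y}")
```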

But it was GPT-2 that created the bigger buzz. OpenAI claimed to be so concerned people would use GPT-2 “to generate deceptive, biased, or abusive language” that it would not be releasing the full model. How times change.

December 2020: A reckoning
In December 2020, the AI industry was facing a reckoning over its failure to curb AI’s toxic tendencies. Large language models like GPT-3 can generate false and hateful text. Timnit Gebru, co-lead of Google’s ethical AI team, coauthored a paper highlighting the potential harms of large language models, including their high computing costs. The paper was not welcomed by senior managers inside the company, and Gebru was pushed out of her job.

January 2022: InstructGPT
In January 2022, OpenAI tried to address the issue by using reinforcement learning to train a version of GPT-3, known as InstructGPT, on the preferences of human testers. The resulting model produced less offensive language, less misinformation, and fewer mistakes overall, and it was better at following the instructions of the people using it, a property known as “alignment” in AI jargon.
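
OpenAI’s exact pipeline isn’t detailed here, but a common way to use human preferences is to fit a reward model with a pairwise (Bradley–Terry-style) loss that pushes the score of the answer testers preferred above the one they rejected. A minimal sketch with illustrative scores:

```python
import numpy as np

def preference_loss(r_chosen, r_rejected):
    """Pairwise (Bradley-Terry-style) loss for fitting a reward model:
    low when the preferred answer outscores the rejected one."""
    return -np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected))))

# Hypothetical reward-model scores for two candidate replies.
print(preference_loss(r_chosen=2.1, r_rejected=0.3))  # ~0.15: model agrees with testers
print(preference_loss(r_chosen=0.3, r_rejected=2.1))  # ~1.95: model disagrees, big penalty
```

A reward model trained this way can then score the language model’s outputs during reinforcement learning, steering it toward responses people actually prefer.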

May–July 2022: OPT and BLOOM
A common criticism of large language models is that the cost of training them makes it difficult for all but the richest labs to build one. This raises concerns that powerful AI is being developed by small corporate teams behind closed doors, without proper scrutiny and without the input of a wider research community. In response, collaborative projects have developed large language models and released them for free to any researcher who wants to study and improve the technology.

Image created by nightcafe.studio

In May 2022, Meta built and released OPT, a reconstruction of GPT-3; two months later, Hugging Face led a consortium of around 1,000 volunteer researchers to build and release BLOOM, another large language model. Both were made available for free, enabling wider scrutiny and input from the research community.

November 2022: ChatGPT
In late November 2022, OpenAI launched ChatGPT, a version of GPT-3 trained to master the game of conversation. OpenAI used reinforcement learning to train it on feedback from human testers, who scored its performance as a fluid, accurate, and inoffensive interlocutor. ChatGPT quickly became popular, with millions of people striking up conversations with the model. OpenAI itself was blown away by the reception; internally, ChatGPT had been intended as an incremental update to InstructGPT.

The development of large language models like GPT-3 has been accompanied by worries about their potential to generate toxic text. But recent releases, from InstructGPT to OPT, BLOOM, and ChatGPT, show that these risks can be mitigated, both by training models on the preferences of human testers and by opening them up to scrutiny from the wider research community. As the technology continues to develop, addressing such concerns will be essential to ensuring that AI is built responsibly and ethically.

