Zero-shot, One-shot and Few-shot Learning

Hello, everyone. I mentioned the terms "zero-shot," "one-shot," and "few-shot" in the "What is ChatGPT?" video, but because of the video's time constraints I only touched on them briefly. Here, I will explain in more detail what these three terms mean.

GPT learning

"Zero-shot," "one-shot," and "few-shot" in GenAI, referred to here as GPT learning, describe how a machine learning model handles a new task when it is given few or no examples. A "shot" can be understood simply as an "example" or an "attempt" given to the model. With a model like ChatGPT, the examples are usually supplied in the prompt itself rather than through additional training. Therefore, "zero-shot," "one-shot," and "few-shot" refer to the number of examples used: none, one, or a few.

Zero-shot

Zero-Shot Learning. The model handles a new task without being given any task-specific examples or training data. For example, if you give the model a classification label it has never seen before, it can use its pre-trained language understanding to infer the label's meaning from the context you provide and predict how to respond.

One of the most common applications of zero-shot learning is "language translation." For instance, the model may not have been trained with a specialized knowledge base for the electronics manufacturing industry. Yet when you ask it to translate the Chinese sentence "這批電路板的交期可能會受颱風影響" into English, it can accurately respond with "The delivery schedule of this batch of circuit boards may be affected by the typhoon." This demonstrates the zero-shot concept: no translation example was given, and the model responds based solely on its understanding of the two languages.
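
To make this concrete, here is a minimal Python sketch of what a zero-shot prompt for the translation request above might look like. The helper name zero_shot_prompt and the wording of the instruction are my own assumptions for illustration, and the sketch only builds the prompt text; it does not call any model.

```python
def zero_shot_prompt(instruction: str, text: str) -> str:
    """Build a prompt that contains an instruction and the input, but no examples."""
    # Zero-shot: nothing here shows the model a solved case of the task.
    return f"{instruction}\n\n{text}"


prompt = zero_shot_prompt(
    "Translate the following Chinese sentence into English.",
    "這批電路板的交期可能會受颱風影響",
)
print(prompt)
```

The point to notice is what is missing: there is no sample translation anywhere in the prompt, so the model has to rely entirely on what it learned during pretraining.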

One-shot

One-Shot Learning. The model learns how to complete a task from a single example and generalizes that example to similar situations. For instance, I provide ChatGPT with a Q&A sample consisting of two lines: "Q: What in the hell are you doing?" followed by "A: Angry." My intention is for it to detect the emotion in my question and respond in the same format, "A:" followed by a description of the emotion. Next, I say, "Q: Great, I just won 100 dollars." The model follows the pattern from the previous example and answers with "A: Happy/Excited." This is what we call one-shot learning.
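
The emotion example can be sketched in Python as follows. The helper name one_shot_prompt is hypothetical, and ending the prompt with an open "A:" line is my own assumption about how to invite the model to fill in the answer; the sketch only assembles the text and does not call a model.

```python
def one_shot_prompt(example_q: str, example_a: str, new_q: str) -> str:
    """Build a prompt with exactly one solved Q/A pair followed by a new question."""
    return (
        f"Q: {example_q}\n"
        f"A: {example_a}\n"  # the single "shot" that shows the expected format
        f"Q: {new_q}\n"
        "A:"                 # left open for the model to complete
    )


prompt = one_shot_prompt(
    "What in the hell are you doing?",
    "Angry",
    "Great, I just won 100 dollars.",
)
print(prompt)
```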

Few-shot

Few-Shot Learning. The model learns how to complete a task from a few examples, typically two or three. These examples are part of the prompt and guide the model to answer or solve similar problems. It is quite similar to one-shot learning, except that you provide a few more examples, which makes it clearer how you want the model to respond. For example, I provide it with two sets of Q&A. In the first set, the question is "Where is the United States of America?" and the answer is "North America." In the second set, the question is "Where is Japan?" and the answer is "Northeast Asia." After that, I ask, "Where is Taiwan?" and it responds, "A: East Asia." This is called few-shot learning.
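
Here is a small Python sketch of how the geography example could be assembled from a list of examples. The function name few_shot_prompt and the "Q:" / "A:" layout are assumptions I chose for illustration; the code only builds the prompt string and does not send it to a model.

```python
from typing import List, Tuple


def few_shot_prompt(examples: List[Tuple[str, str]], new_question: str) -> str:
    """Build a prompt from several solved Q/A pairs followed by a new question."""
    lines = []
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
    lines.append(f"Q: {new_question}")
    lines.append("A:")  # left open for the model to complete
    return "\n".join(lines)


examples = [
    ("Where is the United States of America?", "North America"),
    ("Where is Japan?", "Northeast Asia"),
]
prompt = few_shot_prompt(examples, "Where is Taiwan?")
print(prompt)
```

Passing an empty list gives a prompt with no examples, and a list with one pair gives a one-shot prompt, which shows that the three terms differ only in how many examples the prompt contains.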

Why use shot learning?

Since GPT stands for Generative Pretrained Transformer and has already undergone pretraining, why is shot learning still necessary? The core purpose of example-based learning is to leverage the model's pretrained capabilities and its general knowledge of language to solve new problems with limited data. For large language models like GPT, zero-shot, one-shot, and few-shot prompting are techniques within what is often called "prompt engineering." They let the model handle a wide range of tasks flexibly, without requiring extensive, specialized training for each new task.