Expert-Level Prompt Engineering Techniques for Better LLM Results
Improve your model’s output with these tips.
Prompt engineering is the practice of carefully crafting input text to guide a large language model’s (LLM’s) behavior. It uses the model’s pre-existing capabilities (from pretraining and instruction fine-tuning) without any additional weight updates. For example, research has shown that simply providing a few input–output examples in the prompt can enable an LLM to perform new tasks without fine-tuning.
Broadly, prompt engineering trades the high cost and data requirements of fine-tuning for quick, flexible control via text instructions. It is often described as “far faster” and “more resource-friendly” than fine-tuning, while preserving the model’s broad knowledge.
In practice, effective prompts often combine clear instructions, examples, role definitions, and structured formatting. This article presents a set of techniques that will sharpen your prompt engineering expertise.
Curious about it? Let’s get into it!
Referral Recommendation
Business English for Programmers that gets you hired, promoted, and heard - trusted by 20k+ readers at Meta, Revolut, and Amazon.
Check it out—and support my work—through this referral link.
1. Master Zero-Shot and Few-Shot Prompting
In zero-shot prompting, the user provides only the task instruction or question (with no examples). For example, the prompt is as follows:
Classify the sentiment of the following review:
"I absolutely love this product!"
Sentiment:
In few-shot prompting, the user provides a small set of example input–output pairs (often 3–10) illustrating the desired behavior. Few-shot examples serve as in-context training that steers the model’s outputs.
Google’s Gemini prompting guide recommends always including few-shot examples to make the task clear, since well-chosen examples can even replace lengthy written instructions. For example, a few-shot prompt looks like this:
Classify the sentiment of the following reviews:
Review: "The product broke after one use."
Sentiment: Negative
Review: "Excellent quality and fast shipping."
Sentiment: Positive
Review: "Not what I expected."
Sentiment: Negative
Review: "Great value for the price."
Sentiment:
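If you are calling a model through an API, the same pattern can be expressed as alternating user/assistant messages. Below is a minimal sketch using the OpenAI Python SDK; the model name, the system instruction, and the assumption that OPENAI_API_KEY is set in your environment are illustrative choices, not requirements of the technique.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Few-shot pairs from the prompt above, packed as prior chat turns.
few_shot_examples = [
    ("The product broke after one use.", "Negative"),
    ("Excellent quality and fast shipping.", "Positive"),
    ("Not what I expected.", "Negative"),
]

messages = [{"role": "system", "content": "Classify the sentiment of each review as Positive or Negative."}]
for review, sentiment in few_shot_examples:
    messages.append({"role": "user", "content": f'Review: "{review}"'})
    messages.append({"role": "assistant", "content": f"Sentiment: {sentiment}"})

# The new review we actually want classified.
messages.append({"role": "user", "content": 'Review: "Great value for the price."'})

response = client.chat.completions.create(model="gpt-4o", messages=messages)
print(response.choices[0].message.content)  # expected to be something like "Sentiment: Positive"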
2. Chain-of-Thought (CoT) Prompting
To elicit multi-step reasoning, we can prompt the model to think aloud by generating intermediate steps. This technique, commonly called chain-of-thought (CoT) prompting, can dramatically boost performance on complex tasks.
As Wei et al. (2022) showed, providing a few exemplar reasoning chains in the prompt significantly improves performance on math and logic problems. Even a simple zero-shot phrase like “Let’s think step by step” can encourage the model to output a reasoning chain before answering.
You can use a prompt like the example below:
Question: If a train travels at 60 miles per hour for 2.5 hours, how far does it travel?
Let's think step by step:
We can further boost reliability by sampling multiple CoT outputs and selecting the majority answer, a technique known as self-consistency.
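Here is a minimal sketch of self-consistency with the OpenAI Python SDK, assuming OPENAI_API_KEY is set. The model name, the number of samples, the temperature, and the “Final answer:” extraction convention are illustrative assumptions rather than fixed parts of the technique.

from collections import Counter
from openai import OpenAI

client = OpenAI()

question = (
    "If a train travels at 60 miles per hour for 2.5 hours, how far does it travel?\n"
    "Let's think step by step, then end with a line starting with 'Final answer:'."
)

def final_answer(text: str) -> str:
    # Keep only what follows the last "Final answer:" marker.
    return text.rsplit("Final answer:", 1)[-1].strip()

samples = []
for _ in range(5):  # more samples give a more reliable vote, at higher cost
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
        temperature=0.8,  # nonzero temperature so the reasoning paths differ
    )
    samples.append(final_answer(response.choices[0].message.content))

majority, count = Counter(samples).most_common(1)[0]
print(f"Majority answer ({count}/5 samples): {majority}")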
3. Assign a Role with System or Role Prompts
Defining a persona or system-level instruction often results in more coherent and task-focused output. In chat-based APIs such as OpenAI’s or Anthropic’s, we can prepend a system message that sets the scene (e.g., “You are a data scientist.”). By assigning a role, we can strongly influence the model’s behavior.
Role prompts can also improve accuracy and set an appropriate style, and they can be applied simply by naming the role within the prompt.
An example prompt is shown below:
You are a cybersecurity expert. List the top 3 OWASP API security risks in bullet points.
Or, using the OpenAI Python SDK, you can set the system prompt like this:
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
        {"role": "user", "content": "Who were the founders of Microsoft?"}
    ]
)
print(response.choices[0].message.content)
By defining the system prompt, we gain much finer control over the model’s output.
4. Enforce Structured Output Formatting
We can instruct the model to format its output in a specific way, which is crucial for downstream parsing. Many professionals explicitly request JSON, XML, bullet lists, tables, etc., as the format for the output.
For example, Amazon’s AWS blog on prompting Claude recommends asking the model to start outputting the information right away in JSON format when needed:
Text: "John Doe, 35, software engineer in NY, joined in 2015."
Format as JSON:
The output will follow the JSON format we requested.
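The same idea carries over to API calls. Below is a minimal sketch using the OpenAI Python SDK’s JSON mode (response_format={"type": "json_object"}), assuming OPENAI_API_KEY is set. JSON mode only guarantees syntactically valid JSON, not a particular schema, so the field names the model returns here are not fixed and should be validated before downstream use.

import json
from openai import OpenAI

client = OpenAI()

text = "John Doe, 35, software engineer in NY, joined in 2015."

response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},  # ask the API for valid JSON
    messages=[
        {
            "role": "user",
            "content": (
                "Extract the name, age, role, location, and join year from the text "
                f'and format the result as JSON.\n\nText: "{text}"'
            ),
        }
    ],
)

record = json.loads(response.choices[0].message.content)
print(record)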
5. Prompt Chaining
For multi-step tasks that might exceed a single prompt’s scope, we can chain prompts together. Prompt chaining breaks a complex task into sequential subtasks, feeding the model’s output into the next prompt.
The benefits are accuracy and clarity: each sub-prompt is simpler, and errors can be isolated and corrected step by step.
For example, in a multi-stage data analysis:
Prompt one might extract key facts,
Prompt two analyzes those facts,
Prompt three formats the final report.
An advanced variant is a self-correction chain: after generating an answer, you feed it back to the model with a prompt like “Review your answer and fix any errors.”
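A minimal sketch of such a chain with the OpenAI Python SDK is shown below, assuming OPENAI_API_KEY is set; the ask() helper, the sample input text, and the individual prompts are all illustrative.

from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    # One chat completion per step of the chain.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

raw_text = "Q3 revenue rose 12% to $4.2M while churn increased from 3% to 5%."

# Step 1: extract key facts.
facts = ask(f"List the key facts in this text as bullet points:\n{raw_text}")

# Step 2: analyze the extracted facts.
analysis = ask(f"Analyze these facts and note any risks or opportunities:\n{facts}")

# Step 3: format the final report.
report = ask(f"Write a short executive summary based on this analysis:\n{analysis}")

# Optional self-correction pass: feed the answer back for review.
final_report = ask(f"Review the following summary and fix any errors or unclear wording:\n{report}")
print(final_report)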
6. Apply Psychological and Linguistic Priming
Psychological and linguistic priming involves using specific language cues or context-setting phrases in our prompts to shape the style, tone, and internal context that the LLM uses to generate a response.
Because LLMs learn from billions of documents, including cultural styles, linguistic patterns, and domain-specific jargon, even small phrasings can activate different knowledge areas or rhetorical styles.
Mentioning a certain role (“As a CISO”), tone (“Write humorously”), or audience (“For high school students”) primes the LLM to retrieve relevant patterns from its training data.
For example, you can use a prompt like this:
You are a Chief Information Security Officer (CISO) at a fintech startup. Provide a risk assessment for storing customer data in an on-premises database, written in an academic tone suitable for a scientific paper yet understandable to a high school student with no background in data science.
This is similar to the techniques we have just learned, except that here we combine several of them in a single prompt.
7. Compare and Iterate Across Multiple LLMs
No single LLM is perfect for every task. Different models like GPT-4, Claude, Gemini, or open-source LLaMA have different strengths, quirks, and failure modes.
Comparing their outputs with the same prompts helps you:
Find the best fit for your specific use case.
Identify gaps or biases in one model’s response.
Spot hallucinations or errors faster.
This is because each model has unique:
Training data (which affects factual accuracy).
Output style (verbosity, tone, adherence to prompts).
Policy filters (some models refuse certain content).
Parsing tendencies (especially important for structured outputs like JSON).
By iterating across models, we can build a composite view of what’s possible and refine our final prompt for the best outcome.
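As a minimal sketch, the loop below sends the same prompt to two OpenAI-hosted models and prints the outputs side by side; the model names are illustrative, OPENAI_API_KEY is assumed to be set, and comparing against Claude or Gemini would require each vendor’s own SDK following the same pattern.

from openai import OpenAI

client = OpenAI()

prompt = 'Classify the sentiment of this review and answer with one word: "Not what I expected."'

for model in ["gpt-4o", "gpt-4o-mini"]:  # swap in whichever models you want to compare
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {model} ---")
    print(response.choices[0].message.content)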
Love this article? Comment and share it with your network!
If you're at a pivotal point in your career or sitting on skills you're unsure how to use, I offer 1:1 mentorship.
It's personal, flexible, and built around you.
For long-term mentorship, visit me here (you can even enjoy a 7-day free trial).