After setting the model, the next field to configure in an API request—for essentially every model provider—is the role. But what's the real difference between a system message and a user message? Where should you place certain pieces of context, and does it even matter, especially in non-chat applications?
Understanding not just the definitions but also how to effectively use system and user roles can lead to better and more aligned outputs.
In this article, we'll clear up the differences between system and user messages. We'll dig into what type of information should live in each and why it matters, even in non-chat experiences.
Different role types when using LLMs
Generally, most LLMs have three roles (see the request sketch after this list):
- System Role: Sets the context and guidelines for the LLM's behavior.
- User Role: Represents the user's input, including questions or commands.
- Assistant Role: Contains the LLM's responses.
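To make the roles concrete, here's a minimal request sketch assuming the OpenAI Python SDK and an illustrative model name; most providers' chat APIs follow the same structure:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name; use whatever model you're testing
    messages=[
        # System role: high-level context and guidelines
        {"role": "system", "content": "You are an experienced travel advisor specializing in eco-friendly tourism."},
        # User role: the specific question or task
        {"role": "user", "content": "What are some eco-friendly travel destinations in South America?"},
    ],
)

# Assistant role: the model's reply comes back tagged with the "assistant" role
print(response.choices[0].message.role)     # "assistant"
print(response.choices[0].message.content)
```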
Let’s dive into each.
System role best practices
The system role sets the high-level context for the model. It is usually the first instruction the model reads, framing everything that comes after it.
Purpose and functionality
Generally speaking, the system role—also referred to as “system instructions” or “system messages”—should focus on high-level instructions. Here are a few concrete examples of the types of instructions and best practices that are worth testing in the system message (a short code sketch follows the list).
- Context setting: Defines the overall scenario or environment; could include setting a persona.
Example: "You are an experienced travel advisor specializing in eco-friendly tourism."
- Behavioral guidelines: Sets certain rules, such as avoiding certain topics, maintaining professionalism, or sticking to a certain code of conduct.
Example: "Avoid discussing political opinions and ensure all responses are unbiased."
- Response style: This could be wrapped up in the persona, but it doesn’t have to be. Response style can exist independently of the persona and can influence the tone and format of the LLM’s responses.
Example: "Use a friendly and conversational tone suitable for a general audience."
- Operational constraints: The system message can also establish certain constraints, specifying how the LLM should act in situations where inputs are unclear or provocative.
Example:"If you are unsure about a question, make sure you ask for clarification before moving forward”
User role best practices
The user role, also generally referred to as “the prompt” or the “user prompt,” is where the specific question or task lives. It usually comes after the system message and lays out the specifics of the task at hand.
Purpose and functionality
Generally speaking, the user role is where you give lower-level information. Here are a few concrete examples of the types of instructions that are worth testing in the user message (a short code sketch follows the list).
- Specific questions: The queries themselves should live in the user message.
Example:"What are some eco-friendly travel destinations in South America?"
- Specific contextual information: Contextual details that give more information about the circumstances should be included in the user message.
Example: "I'm planning a trip in June and prefer destinations with hikes."
- Few-shot examples: Few-shot examples, also known as in-context learning, should live in the user message, as they are generally pretty specific to the task at hand.
Example: "Here are a few example outputs to give you a better understanding of structure and style: {{example_1}} {{example_2}}"
- Response structure: While output structure could be seen as high-level information, I’ve personally found it more effective to put format guidelines in the user message.
Example:"Please provide the information in a list format with brief descriptions."
The assistant role
The assistant role is used to designate the LLM’s response.
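In practice, you rarely write assistant messages yourself; instead, you append the model's previous replies under the assistant role so it keeps the conversation history. A minimal multi-turn sketch, again assuming the OpenAI Python SDK and an illustrative model name:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

messages = [
    {"role": "system", "content": "You are an experienced travel advisor specializing in eco-friendly tourism."},
    {"role": "user", "content": "What are some eco-friendly travel destinations in South America?"},
]

first = client.chat.completions.create(model="gpt-4o", messages=messages)

# Feed the reply back under the assistant role, then add the follow-up question
messages.append({"role": "assistant", "content": first.choices[0].message.content})
messages.append({"role": "user", "content": "Which of those is best for hiking in June?"})

second = client.chat.completions.create(model="gpt-4o", messages=messages)
print(second.choices[0].message.content)
```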
To put it briefly, the system role should be used to set clear guidelines and context, while the user role is where you get more specific.
Why not just use one message type?
Before going any further, it's worth mentioning that, aside from anecdotal evidence—which is powerful—it's hard to know the effect of placing certain context in the system message versus the user message. If you're interested in some tangible evidence, we ran some tests more than a year ago: System Messages: Best Practices, Real-world Experiments & Prompt Injections.
None of the model providers give much information on this, and since their models are more or less a black box, it's hard to know exactly what is going on behind the scenes.
So for non-chat purposes, why not just include all the information in a single user prompt? It’s worth testing based on your use case, but I believe there is still value in separating high-level and low-level information.
Separating messages into distinct roles—system and user messages—can also make it easier for your team to manage prompts. Smaller pieces are easier to understand than one large prompt.
The limitations of a single large prompt
We’re big believers that each prompt should do one thing and do it well. One easy way to boost performance is to break down complex prompts into a series, or chain, of simpler prompts. If you’re interested in learning more about prompt chaining, check out our guide: Prompt Chaining Guide.
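As a rough illustration of that idea, here's a sketch of a two-step chain, where one prompt shortlists destinations and a second prompt formats the result; the helper, prompts, and model name are all illustrative assumptions, not a prescribed pattern:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment


def ask(system: str, user: str) -> str:
    """Illustrative helper: one focused prompt per call."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    )
    return response.choices[0].message.content


# Step 1: a prompt that only shortlists destinations
shortlist = ask(
    "You are an experienced travel advisor specializing in eco-friendly tourism.",
    "List five eco-friendly travel destinations in South America suited to hiking in June.",
)

# Step 2: a prompt that only formats the shortlist for the reader
summary = ask(
    "You format travel recommendations as a concise list with brief descriptions.",
    f"Rewrite the following recommendations in that format:\n\n{shortlist}",
)
print(summary)
```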
Using one big prompt to convey both high-level instructions and specific queries can create several issues:
- Ambiguity and confusion: Prompt performance tends to decrease as more instructions are packed into a single prompt. Combining overarching guidelines with specific tasks often leads to prompts that are hard to follow.
- Reduced clarity: While it isn’t 100% clear how the LLM handles different roles, it may be safe to assume that the roles are there for a reason and provide some benefit via their structure.
- More difficult to iterate: The bigger your prompt, the more challenging it is to find areas for improvement and iterate on it.
System message examples
A little while back, we used prompt injections to unveil the system messages for a few AI tools: What We Can Learn from OpenAI, Perplexity, TLDraw, and Vercel's System Prompts.
More recently, we were able to get the system message behind OpenAI’s automated System Instructions generator—the system message for the system that generates system messages.
Additionally, Anthropic has published the system messages that power the Claude.ai interface, which you can check out in their documentation or in our templates, if you’d like to test them out.
The caveat with these system prompts is that I believe they are all trying to do too much. For example, the Claude 3.5 Sonnet System Prompt is over 5,000 tokens. While they need to cover a wide range of use cases, I think there are a number of ways to shorten this system message and use multiple, more specific prompts.
Conclusion
I wish we had more guidance on this topic from the major model providers, but until then, hopefully this guide can help your testing process. These things always end up being very use-case specific, so make sure you test appropriately to figure out what works best for you.
Here is the TL;DR:
- System Messages: Use these to set the foundational context and other high-level information like a persona, guidelines, and boundaries for the LLM to follow. They establish the role, tone, and constraints that persist throughout the conversation.
- User Messages (‘prompts’): These drive the immediate interaction and should be much more low-level, focusing on the specifics of the task at hand.
- Best Practices:
- Clarity is key: As with any prompt engineering problem, being specific and concise is the most important rule for both user and system messages.
- Structure (probably) matters: Separating high-level instructions from specific queries enhances the AI's ability to provide accurate and relevant responses.
- Avoid common pitfalls: Be careful of extremely lengthy system messages that can overload the model and contain contradictory information.