Prompt design and engineering is one of the most approachable ways to get meaningful output from a Large Language Model (LLM). Still, prompting large language models can feel like navigating a complex maze.
Prompt design is a relatively new discipline, with many techniques still being explored; to get an idea, check the prompt engineering guide. In addition, to obtain the best results from an LLM, you must experiment with various combinations of instructions and examples to achieve the desired output. And even once you find the ideal prompt template, there is no guarantee that the prompt will continue to accomplish the task on a different LLM. As a result, you end up spending additional time migrating or translating prompt templates from one model to another.
To mitigate "prompt fatigue" one might experience while building LLM-based applications, we are announcing Vertex AI Prompt Optimizer in Public Preview. In this blog, you will learn how to get started with Vertex AI Prompt Optimizer using the Vertex AI SDK for Python. By the end of this article, you will have a better understanding of Vertex AI Prompt Optimizer and how it helps save you time and effort in prompt engineering while ensuring high-performing prompts ready for your GenAI applications.
Vertex AI Prompt Optimizer is a prompt optimization service that helps you find the best prompt (instructions and demonstrations) for any preferred model on Vertex AI. Instructions include the system instruction, context, and task of your prompt template; demonstrations are the few-shot examples you provide in your prompt to elicit a specific style or tone from the model response. Vertex AI Prompt Optimizer is based on Google Research's paper on automatic prompt optimization (APO) methods, accepted at NeurIPS 2024.
Imagine that you want to solve a math problem like the one below. You need clear instructions and examples to solve it: the instructions state the rules for solving the problem (for example, how to handle negative numbers), and the examples demonstrate how to apply those rules. That's the idea behind Vertex AI Prompt Optimizer.
To find the best instructions and examples, Vertex AI Prompt Optimizer employs an iterative LLM-based optimization algorithm in which an optimizer model and an evaluator model work together: they generate and evaluate candidate prompts, then select the best instructions and demonstrations based on the evaluation metrics you want to optimize against. Below you can see an illustration of how Vertex AI Prompt Optimizer works.
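To make this loop concrete, here is a toy Python sketch of an APO-style optimization loop. This is purely illustrative and not the service's actual implementation: the stand-in functions below replace the real optimizer and evaluator models.

import random

def evaluate(prompt, examples):
    """Stand-in for the evaluator model: score a candidate prompt on labeled examples."""
    return random.random()  # a real evaluator computes the user-selected metrics

def propose_candidates(best_prompt, n):
    """Stand-in for the optimizer model: propose n rewrites of the current best prompt."""
    return [f"{best_prompt} (variant {i})" for i in range(n)]

examples = [("question", "ground-truth answer")]
best_prompt = "Answer the question."
best_score = evaluate(best_prompt, examples)

for step in range(10):  # optimization steps
    for candidate in propose_candidates(best_prompt, n=3):  # candidates per step
        score = evaluate(candidate, examples)
        if score > best_score:  # keep the best prompt seen so far
            best_prompt, best_score = candidate, score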
With just a few labeled examples (input and ground-truth output pairs) and an optimization setup, Vertex AI Prompt Optimizer finds the best prompt (instructions and demonstrations) for the target model, saving users significant time and effort. Ultimately, the product streamlines prompt design and prompt engineering and enhances the overall quality of LLM-based applications. Users can now craft a new prompt for a particular task, or translate a prompt from one model to another on Vertex AI, with ease.
Now that you have a better understanding of how Vertex AI Prompt Optimizer works, let’s see how to enhance a prompt to use it with a Google model on Vertex AI.
Imagine that you are building a simple AI cooking assistant that provides suggestions on how to cook healthier dishes. For example, you ask: "How do you create healthy desserts that are still delicious and satisfying, while minimizing added sugars and unhealthy fats?" And the AI cooking assistant answers: "Here are some tips on how to achieve this balance in your recipe, minimizing added sugars and unhealthy fats: …". Below is an example of a generated answer.
The initial version of the AI cooking assistant uses an LLM with the following simple prompt template:
Given a question with some context, provide the correct answer to the question. \nQuestion: {{question}}\nContext:{{context}}\nAnswer: {{target}}
Based on the Q&A evaluation dataset you collected and the Q&A evaluation metrics calculated using Vertex AI GenAI Evaluation, the initial version of your AI cooking assistant can generate high-quality and contextually relevant answers. Here's a summary of the evaluation metrics report.
Not bad. But there is room for improvement in the quality of the generated answers with respect to the associated questions. Let's imagine that you want to use Gemini 1.5 Flash, a more efficient LLM, for your assistant, but you have no prior experience with the Gemini model family, so finding a better-performing prompt template for completing the task with Gemini 1.5 Flash is not straightforward. This is where Vertex AI Prompt Optimizer comes into play.
To use Vertex AI Prompt Optimizer for enhancing your prompt template, you follow these steps:
To start, prepare the prompt template you want to optimize. Vertex AI Prompt Optimizer expects a prompt with both an instruction template, which is the fixed part of the prompt template shared across all queries for a given task, and a context and task template, which is the dynamic part that changes with each query. Below you can see the original template you prepare to use with Vertex AI Prompt Optimizer for a Q&A task.
INSTRUCTION_TEMPLATE = """
Given a question with some context, provide the correct answer to the question.
"""
CONTEXT_TASK_TEMPLATE = """
Question: {{question}}
Answer: {{target}}
"""
Next, Vertex AI Prompt Optimizer requires a CSV or JSONL file containing labeled samples (input and ground-truth output pairs) that will be used during the optimization process. For this use case, it is recommended to label examples from the source model that the target model struggles with; this helps identify areas of improvement. Below you can find an example of a labeled sample you upload to a Cloud Storage bucket.
{"target":"Here's how to tackle those delicious red meats and pork while keeping things healthy:\\n\\n**Prioritize Low and Slow:**\\n\\n* **Braising and Stewing:** These techniques involve gently simmering meat in liquid over low heat for an extended period. This breaks down tough collagen, resulting in incredibly tender and flavorful meat. Plus, since the cooking temperature is lower, it minimizes the formation of potentially harmful compounds associated with high-heat cooking. \\n\\n* **Sous Vide:** This method involves sealing meat in a vacuum bag and immersing it in a precisely temperature-controlled water bath...","question":"What are some techniques for cooking red meat and pork that maximize flavor and tenderness while minimizing the formation of unhealthy compounds? \\n\\nContext:\\nRed meat and pork should be cooked to an internal temperature of 145\\u00b0F (63\\u00b0C) to ensure safety. \\nMarinating meat in acidic ingredients like lemon juice or vinegar can help tenderize it by breaking down tough muscle fibers. \\nHigh-heat cooking methods like grilling and pan-searing can create delicious browning and caramelization, but it's important to avoid charring, which can produce harmful compounds. \\n"}
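One way to produce such a file is with pandas. The snippet below is a hypothetical construction of the prepared_prompt_df dataframe used later in this post; in practice you would load your own collected examples.

import pandas as pd

# Hypothetical dataset construction; in practice, load your own labeled examples.
prepared_prompt_df = pd.DataFrame({
    "question": [
        "What are some techniques for cooking red meat and pork that maximize "
        "flavor and tenderness while minimizing the formation of unhealthy compounds? ..."
    ],
    "target": [
        "Here's how to tackle those delicious red meats and pork while keeping "
        "things healthy: ..."
    ],
})

Later in this post, this dataframe is written to Cloud Storage in JSONL format as part of the upload step.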
To run the prompt optimization job, Vertex AI Prompt Optimizer also requires you to configure the optimization settings. A Vertex AI Prompt Optimizer job runs as a Vertex AI Training custom job. It supports any Google model available through the Vertex LLM API and a wide range of evaluation metrics: computation-based, LLM-based, or even metrics defined by the user. This is because Vertex AI Prompt Optimizer is integrated with the Vertex AI Rapid Evaluation service. To pass these configurations, Vertex AI Prompt Optimizer accepts either a list of arguments or the Cloud Storage path of a JSON configuration file. Here is an example of a basic configuration for Vertex AI Prompt Optimizer.
params = {
    'num_steps': OPTIMIZATION_STEPS,
    'system_instruction': SYSTEM_INSTRUCTION,
    'prompt_template': PROMPT_TEMPLATE,
    'target_model': TARGET_MODEL,
    'eval_metrics_types': EVALUATION_METRICS,
    'optimization_mode': OPTIMIZATION_MODE,
    'num_template_eval_per_step': OPTIMIZATION_PROMPT_PER_STEPS,
    'num_demo_set_candidates': DEMO_OPTIMIZATION_STEPS,
    'demo_set_size': DEMO_OPTIMIZATION_PROMPT_PER_STEPS,
    'input_data_path': INPUT_DATA_FILE_URI,
    'output_data_path': OUTPUT_DATA_FILE_URI,
}
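For reference, plausible values for the placeholders above might look like the following. These are illustrative assumptions, not prescribed defaults; check the documentation for the supported values of each parameter.

OPTIMIZATION_STEPS = 10                       # num_steps: optimization iterations
SYSTEM_INSTRUCTION = INSTRUCTION_TEMPLATE     # the instruction to optimize
PROMPT_TEMPLATE = CONTEXT_TASK_TEMPLATE       # the context and task template
TARGET_MODEL = "gemini-1.5-flash-001"         # the model you are optimizing for
EVALUATION_METRICS = ["question_answering_correctness"]  # assumed metric name
OPTIMIZATION_MODE = "instruction_and_demo"    # instructions, demonstrations, or both
OPTIMIZATION_PROMPT_PER_STEPS = 3             # candidate prompts evaluated per step
DEMO_OPTIMIZATION_STEPS = 5                   # demonstration optimization iterations
DEMO_OPTIMIZATION_PROMPT_PER_STEPS = 3        # few-shot examples per candidate set
INPUT_DATA_FILE_URI = "gs://your-bucket/prompt_opt_dataset.jsonl"  # hypothetical paths
OUTPUT_DATA_FILE_URI = "gs://your-bucket/prompt_opt_output"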
Vertex AI Prompt Optimizer allows you to optimize prompts by optimizing instructions only, demonstrations only, or both (optimization_mode). After you set the system instruction and prompt template to be optimized (system_instruction, prompt_template) and the model you want to optimize for (target_model), you can condition the optimization process by setting evaluation metrics, the number of iterations used to improve the prompt, and more. Check out the documentation to learn more about the supported optimization parameters.
Once you have both your samples and your configuration, upload them to a Cloud Storage bucket as shown below.
import json

from etils import epath

# upload the configuration file
with epath.Path(CONFIG_FILE_URI).open('w') as config_file:
    json.dump(params, config_file)

# upload the prompt optimization dataset (JSONL, one record per line)
prepared_prompt_df.to_json(INPUT_DATA_FILE_URI, orient="records", lines=True)
At this point, everything is ready to run your first Vertex AI Prompt optimizer job using the Vertex AI SDK for Python.
from google.cloud import aiplatform

WORKER_POOL_SPECS = [{
    'machine_spec': {
        'machine_type': 'n1-standard-4',
    },
    'replica_count': 1,
    'container_spec': {
        'image_uri': APD_CONTAINER_URI,
        'args': ["--config=" + CONFIG_FILE_URI],
    },
}]

custom_job = aiplatform.CustomJob(
    display_name=PROMPT_OPTIMIZATION_JOB,
    worker_pool_specs=WORKER_POOL_SPECS,
)
custom_job.run()
Notice how Vertex AI Prompt Optimizer runs as a Vertex AI Training custom job using the Vertex AI Prompt Optimizer container. The fact that the service leverages both Vertex AI Training and Vertex AI GenAI Evaluation shows how Vertex AI provides a platform for running GenAI workloads, even ones that come directly from research, as in this case.
After submitting the Vertex AI Prompt Optimizer job, you can monitor it from the Vertex AI Training custom jobs view, as shown here.
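You can also check on the job programmatically; the CustomJob object from the Vertex AI SDK exposes its resource name and state.

# Inspect the submitted job from code (run(sync=False) returns immediately,
# so you can poll the state instead of watching the console).
print(custom_job.resource_name)  # fully qualified job name
print(custom_job.state)          # e.g. JOB_STATE_RUNNING, JOB_STATE_SUCCEEDED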
After the optimization job runs successfully, you will find the optimized instructions, the optimized demonstrations, or both, as JSON files in the output Cloud Storage bucket. With a few helper functions, you can produce output like the following, which indicates the optimization step at which the best instruction was found according to the metrics you defined.
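As a minimal sketch of such a helper, the snippet below reads a JSON results file from the output bucket and picks the highest-scoring instruction. The file name and record fields here are assumptions; inspect your own output bucket for the actual layout.

import json
from etils import epath

# Assumed output layout: a JSON list of candidate records with
# "step", "prompt", and "score" fields. Adjust to the actual files
# the job writes under OUTPUT_DATA_FILE_URI.
results_path = epath.Path(OUTPUT_DATA_FILE_URI) / "instruction" / "eval_results.json"
with results_path.open("r") as f:
    results = json.load(f)

best = max(results, key=lambda r: r["score"])
print(f"Best instruction found at step {best['step']}: {best['prompt']}")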
You get a similar output for the optimized demonstrations.
Finally, you can generate new responses with the optimized output. Below you can see an example of a response generated using the optimized system instruction.
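For instance, with the Vertex AI SDK you can pass the optimized instruction to the target model as its system instruction. In this sketch, OPTIMIZED_INSTRUCTION is a hypothetical variable holding the instruction retrieved from the output bucket.

from vertexai.generative_models import GenerativeModel

# Plug the optimized instruction into the target model.
model = GenerativeModel(
    "gemini-1.5-flash-001",
    system_instruction=OPTIMIZED_INSTRUCTION,  # hypothetical: the retrieved instruction
)

response = model.generate_content(
    "Question: How do you create healthy desserts that are still delicious and "
    "satisfying, while minimizing added sugars and unhealthy fats?\nAnswer:"
)
print(response.text)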
And if you use the optimized prompt to run a new round of evaluation with Vertex AI GenAI Evaluation, you might get an output like the one below, where the optimized prompt outperforms the previous prompt template on the evaluation metrics you selected.
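As a rough sketch of that follow-up evaluation, assuming your dataset already contains the responses generated with the optimized prompt; the column names and the metric identifier below are assumptions to adapt to your setup.

import pandas as pd
from vertexai.evaluation import EvalTask

eval_dataset = pd.DataFrame({
    "prompt": ["How do you create healthy desserts ..."],
    "response": ["Here are some tips ..."],   # generated with the optimized prompt
    "reference": ["Ground-truth answer ..."], # from your labeled dataset
})

# Bring-your-own-response evaluation: no model call is needed here.
eval_result = EvalTask(
    dataset=eval_dataset,
    metrics=["question_answering_quality"],  # assumed metric name
    experiment="qa-prompt-optimization",     # hypothetical experiment name
).evaluate()

print(eval_result.summary_metrics)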
Prompt engineering is one of the most important yet challenging steps in operationalizing LLM-based applications. To help you craft your prompt template, Vertex AI Prompt Optimizer finds the best prompt (instructions and demonstrations) for any preferred model on Vertex AI.
This article showed one example of how you can use Vertex AI Prompt Optimizer to enhance your prompt template for a Gemini model using the Vertex AI SDK for Python. You can also use Vertex AI Prompt Optimizer via the UI notebook here.
In summary, Vertex AI Prompt Optimizer can save you time and effort in prompt engineering while ensuring you have high-performing prompts for your GenAI applications.
Thanks for reading!
Do you want to learn more about Vertex AI Prompt Optimizer and how to use it? Check out the following resources: