strong_reject.generate#
Utilities for calling LLM APIs.
- strong_reject.generate.convert_to_messages(prompt: str | list[str] | list[dict[str, str]], system_prompt: str | None = None) list[dict[str, str]] #
Convert a “prompt” to messages format for generating a response.
- Parameters:
prompt (Union[str, list[str], list[dict[str, str]]]) – Input to convert to messages format. If a string, this is treated as the user content. If a list of strings, these are treated as alternating user and assistant content. The input may also already be in messages format.
system_prompt (str, optional) – System prompt. Defaults to None.
- Returns:
List of messages, where a message is {"role": <role>, "content": <content>}.
- Return type:
list[dict[str, str]]
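Example (illustrative; the outputs shown in comments are inferred from the description above, and prepending the system prompt as a system message is an assumption):

```python
from strong_reject.generate import convert_to_messages

# A plain string is treated as the user content.
convert_to_messages("Hello!")
# [{"role": "user", "content": "Hello!"}]

# A list of strings alternates between user and assistant content.
convert_to_messages(["Hello!", "Hi, how can I help?", "What's 2 + 2?"])
# [{"role": "user", "content": "Hello!"},
#  {"role": "assistant", "content": "Hi, how can I help?"},
#  {"role": "user", "content": "What's 2 + 2?"}]

# Assumed behavior: an optional system prompt is prepended as a system message.
convert_to_messages("Hello!", system_prompt="You are a helpful assistant.")
# [{"role": "system", "content": "You are a helpful assistant."},
#  {"role": "user", "content": "Hello!"}]
```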
- strong_reject.generate.generate(prompt: str | list[str] | list[dict[str, str]], model: str | TextGenerationPipeline, system_prompt: str | None = None, num_retries: int = 5, delay: int = 0, **kwargs) str #
Call an LLM to generate a response.
Currently supported models are:
- Any OpenAI model (requires an OpenAI API key)
- Llama-3.1 70B (requires a Perplexity API key)
- Dolphin-2.6 Mixtral-8x7B (requires a DeepInfra API key)
- Parameters:
prompt (Union[str, list[str], list[dict[str, str]]]) – Prompt to respond to. See convert_to_messages() for valid formats.
model (Union[str, TextGenerationPipeline]) – Model used to generate the response. See the LiteLLM docs for supported models.
system_prompt (str, optional) – System prompt. Defaults to None.
num_retries (int, optional) – Number of retries if an error is encountered (e.g., rate limit). Defaults to 5.
delay (int, optional) – Initial delay before calling the API. Defaults to 0.
- Returns:
Response.
- Return type:
str
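Example (a sketch, not a definitive usage pattern; the model identifier is an illustrative LiteLLM model name, and the API key value is a placeholder):

```python
import os
from strong_reject.generate import generate

# Assumes an OpenAI API key is available, per the supported-models list above.
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder

response = generate(
    "What is the capital of France?",
    model="gpt-4o-mini",  # illustrative model name
    system_prompt="Answer concisely.",
    num_retries=5,  # retry on transient errors such as rate limits
)
print(response)
```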
- strong_reject.generate.generate_to_dataset(dataset: Dataset, models: list[str | TextGenerationPipeline], target_column: str = 'prompt', decode_responses: bool = True, batch_size: int = 8, num_processes: int = 2, **kwargs) Dataset #
Generate responses to a HuggingFace dataset of prompts.
- Parameters:
dataset (Dataset) – Dataset with prompts.
models (list[Union[str, TextGenerationPipeline]]) – Models used to generate responses.
target_column (str, optional) – Column that the models should use as the prompt. Defaults to “prompt”.
decode_responses (bool, optional) – If True, decode the raw responses. Defaults to True.
batch_size (int, optional) – Batch size for text generation pipelines. Defaults to 8.
num_processes (int, optional) – Number of processes to use for generating responses. Defaults to 2.
- Returns:
Dataset with responses.
- Return type:
Dataset
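Example (a minimal sketch assuming a HuggingFace Dataset with a "prompt" column; the model identifier is illustrative, and the naming of the response columns is not specified here, so inspect the returned dataset):

```python
from datasets import Dataset
from strong_reject.generate import generate_to_dataset

# Build a small dataset of prompts.
dataset = Dataset.from_dict({"prompt": [
    "What is the capital of France?",
    "Summarize the plot of Hamlet in one sentence.",
]})

# Generate responses with one or more models.
responses = generate_to_dataset(
    dataset,
    models=["gpt-4o-mini"],  # illustrative model name
    target_column="prompt",
)
print(responses.column_names)  # inspect which columns hold the responses
```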