strong_reject.generate

Utilities for calling LLM APIs.

strong_reject.generate.convert_to_messages(prompt: str | list[str] | list[dict[str, str]], system_prompt: str | None = None) → list[dict[str, str]]

Convert a “prompt” to messages format for generating a response.

Parameters:
  • prompt (Union[str, list[str], list[dict[str, str]]]) – Input to convert to messages format. If a string, this is treated as the user content. If a list of strings, these are treated as alternating user and assistant content. The input may also already be in messages format.

  • system_prompt (str, optional) – System prompt. Defaults to None.

Returns:

List of messages, where a message is {"role": <role>, "content": <content>}.

Return type:

list[dict[str, str]]
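
A short usage sketch of the three accepted input formats; the outputs shown in the comments are inferred from the parameter descriptions above (including where the system prompt is placed), not taken from the library's own examples:

    from strong_reject.generate import convert_to_messages

    # A plain string is treated as the user content.
    convert_to_messages("Hello!", system_prompt="Be concise.")
    # Inferred result:
    # [{"role": "system", "content": "Be concise."},
    #  {"role": "user", "content": "Hello!"}]

    # A list of strings is treated as alternating user and assistant content.
    convert_to_messages(["Hi", "Hello! How can I help?", "Tell me a joke"])
    # Inferred result:
    # [{"role": "user", "content": "Hi"},
    #  {"role": "assistant", "content": "Hello! How can I help?"},
    #  {"role": "user", "content": "Tell me a joke"}]

    # Input already in messages format is passed through unchanged.
    convert_to_messages([{"role": "user", "content": "Hi"}])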

strong_reject.generate.generate(prompt: str | list[str] | list[dict[str, str]], model: str | TextGenerationPipeline, system_prompt: str | None = None, num_retries: int = 5, delay: int = 0, **kwargs) → str

Call an LLM to generate a response.

Currently supported models are:

  • Any OpenAI model (requires an OpenAI API key)

  • Llama-3.1 70B (requires a Perplexity API key)

  • Dolphin-2.6 Mixtral-8x7B (requires a DeepInfra API key)

Parameters:
  • prompt (Union[str, list[str], list[dict[str, str]]]) – Prompt to respond to. See convert_to_messages() for valid formats.

  • model (Union[str, TextGenerationPipeline]) – Model used to generate the response. If a string, see the LiteLLM docs for supported model identifiers; a HuggingFace TextGenerationPipeline may be passed instead.

  • system_prompt (str, optional) – System prompt. Defaults to None.

  • num_retries (int, optional) – Number of retries if an error is encountered (e.g., rate limit). Defaults to 5.

  • delay (int, optional) – Initial delay before calling the API. Defaults to 0.

Returns:

The generated response.

Return type:

str
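
A minimal usage sketch, assuming the relevant API key (here, an OpenAI key) is available in the environment; the model identifier "gpt-4o-mini" is illustrative, not one these docs name:

    from strong_reject.generate import generate

    # Assumes OPENAI_API_KEY is set in the environment.
    response = generate(
        "What is the capital of France?",
        model="gpt-4o-mini",  # illustrative OpenAI model identifier
        system_prompt="Answer in one sentence.",
        num_retries=3,  # retry on transient errors such as rate limits
    )
    print(response)  # a plain string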

strong_reject.generate.generate_to_dataset(dataset: Dataset, models: list[str | TextGenerationPipeline], target_column: str = 'prompt', decode_responses: bool = True, batch_size: int = 8, num_processes: int = 2, **kwargs) → Dataset

Generate responses to a HuggingFace dataset of prompts.

Parameters:
  • dataset (Dataset) – Dataset with prompts.

  • models (list[Union[str, TextGenerationPipeline]]) – Models used to generate responses.

  • target_column (str, optional) – Column that the models should use as the prompt. Defaults to “prompt”.

  • decode_responses (bool, optional) – Decode the raw responses. Defaults to True.

  • batch_size (int, optional) – Batch size for text generation pipelines. Defaults to 8.

  • num_processes (int, optional) – Number of processes to use for generating responses. Defaults to 2.

Returns:

Dataset with responses.

Return type:

Dataset
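
A minimal sketch of batch generation over a small HuggingFace dataset; the model identifier is again illustrative, and since these docs do not say which column the responses are written to, the final line simply inspects the returned columns:

    from datasets import Dataset

    from strong_reject.generate import generate_to_dataset

    dataset = Dataset.from_dict(
        {"prompt": ["What is 2 + 2?", "Name a prime number."]}
    )

    # Generate responses from one (illustrative) OpenAI model.
    result = generate_to_dataset(dataset, models=["gpt-4o-mini"])

    # The returned dataset contains the generated responses.
    print(result.column_names)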