Large Language Models
This page provides an introduction to large language model (LLM) parameters and an overview of all the default, maximum and minimum values of these parameters for each available model on Seaplane.
LLMs are a class of machine learning models trained on vast amounts of text data to understand and generate human-like text. These models, typically based on architectures like GPT (Generative Pre-trained Transformer), are capable of tasks such as text completion, summarization, translation, and even dialogue generation. They learn patterns in language by processing large corpora of text, enabling them to generate coherent and contextually relevant text in response to prompts or queries. Seaplane supports over 40 large language models which are all accessible through our model DAG (directed acyclic graph). You can learn more about the Seaplane Model DAG here.
Parameter Definitions​
Most LLMs support the same range of parameters, although not all parameters are available for all models.
- Prompt - A prompt refers to the initial text or input provided to the model to generate a response. This text outlines the context or question for the model to base its response on and can vary in length and complexity.
- System Prompt - A system prompt refers specifically to the initial text or input provided within a system or platform environment to prompt the model for a response. This differs from a generic prompt in that it is tailored to the specific requirements and functionalities of the system or platform. While a prompt may serve as a general guideline for generating text, a system prompt is more focused and may include additional instructions or constraints. For example, you might add `only return the following JSON {"key" : VALUE}` to constrain the model to return valid JSON output and nothing else.
- Temperature - Temperature is a parameter that controls the randomness of the generated text. It influences the diversity and creativity of responses produced by the model. A lower temperature results in more deterministic outputs, where the model is more likely to choose high-probability words, leading to more conservative and predictable responses. Conversely, a higher temperature increases randomness, allowing the model to explore a wider range of possibilities and generate more varied and creative responses.
- Top-P - top-p refers to a sampling technique used during text generation. Also known as nucleus sampling or probabilistic sampling, it involves selecting from the top probabilities of the model's output distribution until the cumulative probability exceeds a certain threshold denoted as "p". This approach ensures diversity in the generated text by dynamically adjusting the subset of tokens considered for selection based on their probabilities. Higher values of "p" lead to a larger subset of tokens considered, allowing for more diverse and exploratory responses during generation, while lower values result in more conservative outputs. Adjusting the "top-p" parameter enables fine-tuning of the balance between coherence and novelty in the generated text.
- Top-K - Top-k refers to a sampling technique used during text generation. It restricts the model to consider only the top "k" most likely tokens at each step of generation. This method controls diversity in the generated text by limiting the selection to a fixed number of tokens with the highest probabilities. By constraining the model's choices to a smaller set of tokens, top-k sampling encourages more focused and coherent responses, avoiding the generation of highly unlikely or nonsensical sequences. Adjusting the value of "k" allows for fine-tuning the balance between exploration and exploitation during text generation, with higher values promoting more diverse outputs and lower values leading to more conservative and predictable responses.
- Frequency Penalty - The frequency penalty is a parameter used during text generation to penalize the repetition of tokens based on how often they occur in the generated text. It is employed to encourage diversity and reduce monotony in the output by discouraging the model from repeating tokens that occur frequently. By penalizing high-frequency tokens, the frequency_penalty parameter helps promote the generation of more varied and interesting text.
- Presence Penalty - The presence penalty is a parameter used during text generation to penalize tokens that have already appeared in the generated text, regardless of how often. Unlike the frequency penalty, which scales with the number of times a token has been used, the presence penalty applies a one-time penalty to any token that is already present. By penalizing these tokens, the presence_penalty parameter encourages the model to introduce new tokens and topics rather than restate what it has already said.
- Repeat Penalty - The repeat penalty serves as a parameter during text generation to discourage the repetition of tokens within the generated output. By penalizing repeated tokens, this parameter aims to enhance the diversity and coherence of the text by reducing redundancy. Adjusting the value of the repeat_penalty parameter enables fine-tuning of the balance between encouraging novelty in the generated text and maintaining coherence and relevance to the given context or task. This parameter is particularly useful for ensuring that the generated text remains engaging and informative, without unnecessary repetition of content.
- Max Tokens - Max tokens is a parameter that determines the maximum number of tokens allowed in the generated output during text generation. This parameter controls the length of the generated text and helps prevent the model from producing excessively long responses.
- Minimum Tokens - Minimum tokens refers to a parameter used during text generation to specify the minimum number of tokens required in the generated output. This parameter ensures that the model produces responses of a certain length, preventing it from generating overly short or incomplete text.
- Seed - The seed refers to the initial value or input used to initialize the internal state of the model before generating text. This initialization influences the sequence of random numbers generated by the model during the text generation process. By setting a specific seed, the model's behavior becomes deterministic, meaning that given the same seed, the model will produce the same sequence of text every time it's run. This deterministic behavior is crucial for reproducibility in text generation tasks, as it allows researchers and developers to obtain consistent results and debug their models effectively. Additionally, controlling the seed enables users to explore variations in the generated text by changing the seed value while keeping other parameters constant.
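To make temperature, top-k, and top-p concrete, the sketch below implements the standard sampling loop these parameters control. This is a generic illustration of the technique, not Seaplane's internal implementation; the function name and defaults are chosen for the example only.

```python
import math
import random

def sample_token(logits, temperature=0.7, top_k=50, top_p=0.95, rng=None):
    """Sample one token index from raw logits using temperature, top-k, and top-p.

    logits: one float per vocabulary token.
    temperature: > 0; lower values sharpen the distribution (more deterministic).
    top_k: keep only the k most likely tokens (values <= 0 disable the filter).
    top_p: keep the smallest set of tokens whose cumulative probability reaches p.
    """
    rng = rng or random.Random()

    # Temperature: divide logits before softmax. T < 1 sharpens, T > 1 flattens.
    scaled = [l / temperature for l in logits]

    # Softmax (subtract the max for numerical stability).
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank tokens by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: truncate to the k most likely tokens.
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p (nucleus): keep tokens until cumulative probability reaches p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens and draw one.
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With a very low temperature the highest-probability token dominates and the output is effectively deterministic; with `top_k=1` only the single most likely token survives the filter.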
Llama Model Family​
Chat Models​
Seaplane supports three Llama chat models, available under the following model names:
llama-2-7b-chat
llama-2-13b-chat
llama-2-70b-chat
parameter | type | defaults for 7b | defaults for 13b and 70b | required |
---|---|---|---|---|
model | str | no default | no default | yes |
prompt | str | no default | no default | yes |
system_prompt | str | You are a helpful, respectful and honest assistant. | You are a helpful, respectful and honest assistant. | no |
temperature | float | 0.7, min=0.01, max=5.0 | 0.75, min=0.01, max=5.0 | no |
max_new_tokens | int | 128, min=1 | 128, min=1 | no |
min_new_tokens | int | -1, min=-1 | -1, min=-1 | no |
top_p | float | 0.95, max=1 | 0.9, max=1 | no |
top_k | int | -1, min=-1 | 50 | no |
stop_sequences | not supported by this model | |||
length_penalty | not supported by this model | |||
presence_penalty | not supported by this model | |||
frequency_penalty | not supported by this model |||
repeat_penalty | float | 1.15 | 1.15 | no |
seed | int | no default | no default | no |
use_prompt_template | bool | true | true | no |
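As an illustration, the dictionary below collects the llama-2-7b-chat parameters from the table above with their documented defaults and bounds. The field names follow the table; the prompt strings are made up, and the surrounding Seaplane SDK call is deliberately not shown, since this is only a hypothetical sketch of the parameter set.

```python
# Hypothetical parameter set for llama-2-7b-chat, using the documented defaults.
# Only the fields are real; the prompt text is an invented example.
request = {
    "model": "llama-2-7b-chat",          # required
    "prompt": "Explain what a DAG is.",  # required
    "system_prompt": "You are a helpful, respectful and honest assistant.",
    "temperature": 0.7,     # min=0.01, max=5.0
    "max_new_tokens": 128,  # min=1
    "min_new_tokens": -1,   # -1 disables the minimum
    "top_p": 0.95,          # max=1
    "top_k": -1,            # -1 disables top-k for the 7b model
    "repeat_penalty": 1.15,
    "use_prompt_template": True,
}

# Sanity checks against the documented bounds.
assert 0.01 <= request["temperature"] <= 5.0
assert request["max_new_tokens"] >= 1
assert 0 < request["top_p"] <= 1
```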
Code and Instruct Models​
Seaplane supports eight Llama code and instruct models, available under the following model names:
codellama-7b-instruct
codellama-7b-python
codellama-13b-instruct
codellama-34b-instruct
codellama-34b-python
codellama-70b
codellama-70b-instruct
codellama-70b-python
parameters | type | defaults for all code models | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.8 | no |
max_new_tokens | int | 500 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 0.95 | no |
top_k | int | 10 | no |
repetition_penalty | float | 1.1, min=0, max=2 | no |
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | float | 0, min=0, max=2 | no |
frequency_penalty | float | 0, min=0, max=2 | no |
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
Claude Model Family​
Seaplane supports four Claude models, available under the following model names:
predictions-aws-anthropic-claude21
predictions-aws-anthropic-claude-instant12
predictions-aws-anthropic-claude3-haiku-20240307
predictions-aws-anthropic-claude3-sonnet-20240229
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | str | no default | no |
temperature | float | 1, min=0, max=1 | no |
max_new_tokens | int | 512, min=0 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 0.999, min=0, max=1 | no |
top_k | int | disabled, min=0, max=100,000,000 | no |
stop_sequences | str | no default | no |
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model ||
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
OpenAI Model Family​
Seaplane supports two OpenAI models, available under the following model names:
chat-azure-openai-gpt35-turbo16k
chat-azure-openai-gpt4
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | str | no default | no |
temperature | float | 1, min=0, max=2 | no |
max_new_tokens | int | null | no |
min_new_tokens | not supported by this model | ||
top_p | float | 1 | no |
top_k | not supported by this model | ||
stop_sequences | str | no default, supports a maximum of 4 stop sequences in a list | no |
length_penalty | not supported by this model ||
presence_penalty | float | 0, min=-2, max=2 | no |
frequency_penalty | float | 0, min=-2, max=2 | no |
repeat_penalty | not supported by this model | ||
seed | int | no default | no |
use_prompt_template | not supported by this model ||
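The OpenAI family is the only one in this list whose penalty parameters accept negative values (min=-2), and one of the few that accepts a seed. The dictionary below is a hypothetical sketch of such a parameter set; field names follow the table, the prompt text is invented, and the actual SDK call shape is not shown.

```python
# Hypothetical parameter set for chat-azure-openai-gpt4, using the table above.
request = {
    "model": "chat-azure-openai-gpt4",         # required
    "prompt": "Summarize the meeting notes.",  # required (invented example)
    "system_prompt": "Answer in three bullet points.",
    "temperature": 1.0,        # min=0, max=2
    "top_p": 1.0,
    "presence_penalty": 0.0,   # min=-2, max=2; negative values encourage repetition
    "frequency_penalty": 0.0,  # min=-2, max=2
    "seed": 42,                # fixes sampling for reproducible output
}

# Sanity checks against the documented bounds.
assert -2 <= request["presence_penalty"] <= 2
assert -2 <= request["frequency_penalty"] <= 2
assert 0 <= request["temperature"] <= 2
```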
Other Models​
Zephyr 7b beta​
Zephyr 7b beta is available under the following model name: zephyr-7b-beta
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | str | no default | no |
temperature | float | 0.8, min=0.01, max=5.0 | no |
max_new_tokens | int | 128 | no |
min_new_tokens | not supported by this model ||
top_p | float | 0.95, min=0.01, max=1 | no
top_k | int | 50 | no |
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | float | 1, min=0.01, max=5.0 | no |
frequency_penalty | not supported by this model | ||
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | bool | true | no |
Mistral Model Family​
Seaplane supports two Mistral models, available under the following model names:
mistral-7b-instruct-v0.1
mistral-7b-instruct-v0.2
parameter | type | default V0.1 | default V0.2 | required |
---|---|---|---|---|
model | str | no default | no default | yes |
prompt | str | no default | no default | yes |
system_prompt | str | not supported by this model | no default | no |
temperature | float | 0.7, min=0.01, max=5 | 0.7, min=0.0, max=5 | no |
max_new_tokens | int | 128, min=1 | 128, min=1 | no |
min_new_tokens | int | -1, min=-1 | -1, min=-1 | no |
top_p | float | 0.95, min=0, max=1 | 0.95, min=0, max=1 | no |
top_k | int | -1, min=-1 | -1, min=-1 | no |
stop_sequences | str | no default, example:'<end>,<stop>' | no default, example:'<end>,<stop>' | no |
length_penalty | float | not supported by this model | 1, min=0, max=5 | no |
presence_penalty | float | not supported by this model | 0 | no |
frequency_penalty | not supported by this model | |||
repeat_penalty | float | 1.15 | 1.15 | no |
seed | int | no default | no default | no |
use_prompt_template | bool | not supported by this model | true | no |
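Where stop_sequences is supported, the tables above show it as a single comma-separated string, e.g. '<end>,<stop>'. The helpers below sketch how such a value might be built and split; they are illustrative only and not part of the Seaplane SDK.

```python
def split_stop_sequences(value):
    """Split a comma-separated stop_sequences string into individual sequences."""
    return [s for s in value.split(",") if s]

def join_stop_sequences(sequences):
    """Join individual stop sequences into the comma-separated string format."""
    return ",".join(sequences)
```

Note that this simple format cannot represent a stop sequence that itself contains a comma.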
Mixtral 8x7b​
Mixtral 8x7b is available under the following model name: mixtral-8x7b-instruct-v0.1.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | str | no default | no |
temperature | not supported by this model ||
max_new_tokens | int | 128 | no |
min_new_tokens | int | min=1, to disable set to -1 | no |
top_p | float | 0.95, min=0, max=1 | no |
top_k | int | min=-1 | no |
stop_sequences | str | no default, example:'<end>,<stop>' | no |
length_penalty | float | 1, min=0, max=5 | no |
presence_penalty | float | no default | no |
frequency_penalty | not supported by this model ||
repeat_penalty | not supported by this model ||
seed | int | no default | no |
use_prompt_template | not supported by this model ||
Starling LM 7b Alpha​
Starling LM 7B Alpha is available under the following model name: starling-lm-7b-alpha.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.8, min=0.01, max=5 | no
max_new_tokens | int | 128 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 0.95, min=0.01, max=1 | no |
top_k | int | no default | no |
stop_sequences | str | no default, example:'<end>,<stop>' | no |
length_penalty | not supported by this model | ||
presence_penalty | float | no default, min=-5, max=5 | no |
frequency_penalty | float | no default, min=-5, max=5 | no |
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
Yi 34B Chat​
Yi 34B Chat is available under the following model name: yi-34b-chat.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.3 | no |
max_new_tokens | int | 1024 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 0.8 | no |
top_k | int | 50 | no |
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | float | 1.2 | no |
seed | not supported by this model | ||
use_prompt_template | bool | true | no |
Yi 6B​
Yi 6B is available under the following model name: yi-6b.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.8 | no |
max_new_tokens | int | 512 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 0.95 | no |
top_k | int | 50 | no |
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | float | no default | no |
frequency_penalty | float | no default | no |
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
Falcon 40B Instruct​
Falcon 40B Instruct is available under the following model name: falcon-40b-instruct.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.75, min=0.01,max=5 | no |
max_new_tokens | int | 500 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 1, min=0.01, max=1 | no |
top_k | not supported by this model | ||
stop_sequences | str | '<end>,<stop>' | no |
length_penalty | float | 1, min=0.01, max=5 | no
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | float | 1, min=0.01, max=5 | no |
seed | int | -1, min=-1 | no |
use_prompt_template | not supported by this model |
Vicuna 13B​
Vicuna 13B is available under the following model name: vicuna-13b.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.75, min=0.01, max=5 | no |
max_new_tokens | int | 500 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 1, min=0.01, max=1 | no |
top_k | not supported by this model | ||
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | float | 1, min=0.01, max=5 | no |
seed | int | -1 | no |
use_prompt_template | not supported by this model |
Phi 2​
Phi 2 is available under the following model name: phi-2.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | not supported by this model | ||
max_new_tokens | int | 200, min=0, max=2048 | no |
min_new_tokens | not supported by this model | ||
top_p | not supported by this model | ||
top_k | not supported by this model | ||
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
Olmo 7B​
Olmo 7B is available under the following model name: olmo-7b
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | not supported by this model | ||
max_new_tokens | int | 100 | no |
min_new_tokens | not supported by this model ||
top_p | float | 0.95, min=0.01, max=1 | no |
top_k | int | 50 | no |
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
Wizard Coder 34B V1.0​
Wizard Coder 34B V1.0 is available under the following model name: wizardcoder-34b-v1.0.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.75, min=0.01, max=5 | no |
max_new_tokens | int | 256 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 1, min=0.01, max=5 | no
top_k | not supported by this model | ||
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | float | 1.1, min=0.01, max=5 | no |
seed | not supported by this model | ||
use_prompt_template | not supported by this model |