Large Language Models
This page provides an introduction to large language model (LLM) parameters and an overview of all the default, maximum and minimum values of these parameters for each available model on Seaplane.
LLMs are a class of machine learning models trained on vast amounts of text data to understand and generate human-like text. These models, typically based on architectures like GPT (Generative Pre-trained Transformer), are capable of tasks such as text completion, summarization, translation, and even dialogue generation. They learn patterns in language by processing large corpora of text, enabling them to generate coherent and contextually relevant text in response to prompts or queries. Seaplane supports over 40 large language models which are all accessible through our model DAG (directed acyclic graph). You can learn more about the Seaplane Model DAG here.
Parameter Definitions​
Most LLMs support the same range of parameters, although not all parameters are available for all models.
- Prompt - A prompt refers to the initial text or input provided to the model to generate a response. This text outlines the context or question for the model to base its response on and can vary in length and complexity.
- System Prompt - A system prompt refers specifically to the initial text or input provided within a system or platform environment to prompt the model for a response. This differs from a generic prompt in that it is tailored to the specific requirements and functionalities of the system or platform. While a prompt may serve as a general guideline for generating text, a system prompt is more focused and may include additional instructions or constraints. For example, you might add `only return the following JSON {"key" : VALUE}` to constrain the model to return valid JSON output and nothing else.
- Temperature - Temperature is a parameter that controls the randomness of the generated text. It influences the diversity and creativity of responses produced by the model. A lower temperature results in more deterministic outputs, where the model is more likely to choose high-probability words, leading to more conservative and predictable responses. Conversely, a higher temperature increases randomness, allowing the model to explore a wider range of possibilities and generate more varied and creative responses.
- Top-P - top-p refers to a sampling technique used during text generation. Also known as nucleus sampling or probabilistic sampling, it involves selecting from the top probabilities of the model's output distribution until the cumulative probability exceeds a certain threshold denoted as "p". This approach ensures diversity in the generated text by dynamically adjusting the subset of tokens considered for selection based on their probabilities. Higher values of "p" lead to a larger subset of tokens considered, allowing for more diverse and exploratory responses during generation, while lower values result in more conservative outputs. Adjusting the "top-p" parameter enables fine-tuning of the balance between coherence and novelty in the generated text.
- Top-K - Top-k refers to a sampling technique used during text generation. It restricts the model to consider only the top "k" most likely tokens at each step of generation. This method controls diversity in the generated text by limiting the selection to a fixed number of tokens with the highest probabilities. By constraining the model's choices to a smaller set of tokens, top-k sampling encourages more focused and coherent responses, avoiding the generation of highly unlikely or nonsensical sequences. Adjusting the value of "k" allows for fine-tuning the balance between exploration and exploitation during text generation, with higher values promoting more diverse outputs and lower values leading to more conservative and predictable responses.
- Frequency Penalty - The frequency penalty is a parameter used during text generation to penalize the repetition of tokens based on how often they occur in the generated text. It is employed to encourage diversity and reduce monotony in the output by discouraging the model from repeating tokens that occur frequently. By penalizing high-frequency tokens, the frequency_penalty parameter helps promote the generation of more varied and interesting text.
- Presence Penalty - The presence penalty is a parameter used during text generation to penalize tokens that have already appeared in the generated text, regardless of how often. Unlike the frequency penalty, which scales with the number of times a token has been used, the presence penalty applies a one-time penalty to any token that is already present. By penalizing these tokens, the presence_penalty parameter encourages the model to introduce new tokens and topics rather than restate what it has already said.
- Repeat Penalty - The repeat penalty serves as a parameter during text generation to discourage the repetition of tokens within the generated output. By penalizing repeated tokens, this parameter aims to enhance the diversity and coherence of the text by reducing redundancy. Adjusting the value of the repeat_penalty parameter enables fine-tuning of the balance between encouraging novelty in the generated text and maintaining coherence and relevance to the given context or task. This parameter is particularly useful for ensuring that the generated text remains engaging and informative, without unnecessary repetition of content.
- Max Tokens - Max tokens is a parameter that determines the maximum number of tokens allowed in the generated output during text generation. This parameter controls the length of the generated text and helps prevent the model from producing excessively long responses.
- Minimum Tokens - Minimum tokens refers to a parameter used during text generation to specify the minimum number of tokens required in the generated output. This parameter ensures that the model produces responses of a certain length, preventing it from generating overly short or incomplete text.
- Seed - The seed refers to the initial value or input used to initialize the internal state of the model before generating text. This initialization influences the sequence of random numbers generated by the model during the text generation process. By setting a specific seed, the model's behavior becomes deterministic, meaning that given the same seed, the model will produce the same sequence of text every time it's run. This deterministic behavior is crucial for reproducibility in text generation tasks, as it allows researchers and developers to obtain consistent results and debug their models effectively. Additionally, controlling the seed enables users to explore variations in the generated text by changing the seed value while keeping other parameters constant.
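To make temperature, top-k, and top-p concrete, the sketch below implements the standard sampling loop these parameters control. This is a generic illustration of the technique, not Seaplane's internal implementation; the function name and defaults are chosen for the example only.

```python
import math
import random

def sample_token(logits, temperature=0.7, top_k=50, top_p=0.95, rng=None):
    """Sample one token index from raw logits using temperature, top-k, and top-p.

    logits: one float per vocabulary token.
    temperature: > 0; lower values sharpen the distribution (more deterministic).
    top_k: keep only the k most likely tokens (values <= 0 disable the filter).
    top_p: keep the smallest set of tokens whose cumulative probability reaches p.
    """
    rng = rng or random.Random()

    # Temperature: divide logits before softmax. T < 1 sharpens, T > 1 flattens.
    scaled = [l / temperature for l in logits]

    # Softmax (subtract the max for numerical stability).
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank tokens by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: truncate to the k most likely tokens.
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p (nucleus): keep tokens until cumulative probability reaches p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens and draw one.
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With a very low temperature the highest-probability token dominates and the output is effectively deterministic; with `top_k=1` only the single most likely token survives the filter.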
Llama Model Family​
Chat Models​
Seaplane supports three Llama chat models, available under the following model names:
llama-2-7b-chat
llama-2-13b-chat
llama-2-70b-chat
parameter | type | defaults for 7b | defaults for 13b and 70b | required |
---|---|---|---|---|
model | str | no default | no default | yes |
prompt | str | no default | no default | yes |
system_prompt | str | You are a helpful, respectful and honest assistant. | You are a helpful, respectful and honest assistant. | no |
temperature | float | 0.7, min=0.01, max=5.0 | 0.75, min=0.01, max=5.0 | no |
max_new_tokens | int | 128, min=1 | 128, min=1 | no |
min_new_tokens | int | -1, min=-1 | -1, min=-1 | no |
top_p | float | 0.95, max=1 | 0.9, max=1 | no |
top_k | int | -1, min=-1 | 50 | no |
stop_sequences | not supported by this model | |||
length_penalty | not supported by this model | |||
presence_penalty | not supported by this model | |||
frequency_penalty | not supported by this model |||
repeat_penalty | float | 1.15 | 1.15 | no |
seed | int | no default | no default | no |
use_prompt_template | bool | true | true | no |
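As an illustration, the dictionary below collects the llama-2-7b-chat parameters from the table above with their documented defaults and bounds. The field names follow the table; the prompt strings are made up, and the surrounding Seaplane SDK call is deliberately not shown, since this is only a hypothetical sketch of the parameter set.

```python
# Hypothetical parameter set for llama-2-7b-chat, using the documented defaults.
# Only the fields are real; the prompt text is an invented example.
request = {
    "model": "llama-2-7b-chat",          # required
    "prompt": "Explain what a DAG is.",  # required
    "system_prompt": "You are a helpful, respectful and honest assistant.",
    "temperature": 0.7,     # min=0.01, max=5.0
    "max_new_tokens": 128,  # min=1
    "min_new_tokens": -1,   # -1 disables the minimum
    "top_p": 0.95,          # max=1
    "top_k": -1,            # -1 disables top-k for the 7b model
    "repeat_penalty": 1.15,
    "use_prompt_template": True,
}

# Sanity checks against the documented bounds.
assert 0.01 <= request["temperature"] <= 5.0
assert request["max_new_tokens"] >= 1
assert 0 < request["top_p"] <= 1
```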
Code and Instruct Models​
Seaplane supports eight Llama code and instruct models, available under the following model names:
codellama-7b-instruct
codellama-7b-python
codellama-13b-instruct
codellama-34b-instruct
codellama-34b-python
codellama-70b
codellama-70b-instruct
codellama-70b-python
parameters | type | defaults for all code models | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.8 | no |
max_new_tokens | int | 500 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 0.95 | no |
top_k | int | 10 | no |
repetition_penalty | float | 1.1, min=0, max=2 | no |
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | float | 0, min=0, max=2 | no |
frequency_penalty | float | 0, min=0, max=2 | no |
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
Claude Model Family​
Seaplane supports four Claude models, available under the following model names:
predictions-aws-anthropic-claude21
predictions-aws-anthropic-claude-instant12
predictions-aws-anthropic-claude3-haiku-20240307
predictions-aws-anthropic-claude3-sonnet-20240229
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | str | no default | no |
temperature | float | 1, min=0, max=1 | no |
max_new_tokens | int | 512, min=0 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 0.999, min=0, max=1 | no |
top_k | int | disabled, min=0, max=100,000,000 | no |
stop_sequences | str | no default | no |
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model ||
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
OpenAI Model Family​
Seaplane supports two OpenAI models, available under the following model names:
chat-azure-openai-gpt35-turbo16k
chat-azure-openai-gpt4
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | str | no default | no |
temperature | float | 1, min=0, max=2 | no |
max_new_tokens | int | null | no |
min_new_tokens | not supported by this model | ||
top_p | float | 1 | no |
top_k | not supported by this model | ||
stop_sequences | str | no default, supports a maximum of 4 stop sequences in a list | no |
length_penalty | not supported by this model ||
presence_penalty | float | 0, min=-2, max=2 | no |
frequency_penalty | float | 0, min=-2, max=2 | no |
repeat_penalty | not supported by this model | ||
seed | int | no default | no |
use_prompt_template | not supported by this model ||
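The OpenAI family is the only one in this list whose penalty parameters accept negative values (min=-2), and one of the few that accepts a seed. The dictionary below is a hypothetical sketch of such a parameter set; field names follow the table, the prompt text is invented, and the actual SDK call shape is not shown.

```python
# Hypothetical parameter set for chat-azure-openai-gpt4, using the table above.
request = {
    "model": "chat-azure-openai-gpt4",         # required
    "prompt": "Summarize the meeting notes.",  # required (invented example)
    "system_prompt": "Answer in three bullet points.",
    "temperature": 1.0,        # min=0, max=2
    "top_p": 1.0,
    "presence_penalty": 0.0,   # min=-2, max=2; negative values encourage repetition
    "frequency_penalty": 0.0,  # min=-2, max=2
    "seed": 42,                # fixes sampling for reproducible output
}

# Sanity checks against the documented bounds.
assert -2 <= request["presence_penalty"] <= 2
assert -2 <= request["frequency_penalty"] <= 2
assert 0 <= request["temperature"] <= 2
```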
Other Models​
Zephyr 7b beta​
Zephyr 7b beta is available under the following model name: zephyr-7b-beta
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | str | no default | no |
temperature | float | 0.8, min=0.01, max=5.0 | no |
max_new_tokens | int | 128 | no |
min_new_tokens | not supported by this model ||
top_p | float | 0.95, min=0.01, max=1 | no
top_k | int | 50 | no |
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | float | 1, min=0.01, max=5.0 | no |
frequency_penalty | not supported by this model | ||
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | bool | true | no |
Mistral Model Family​
Seaplane supports two Mistral models, available under the following model names:
mistral-7b-instruct-v0.1
mistral-7b-instruct-v0.2
parameter | type | default V0.1 | default V0.2 | required |
---|---|---|---|---|
model | str | no default | no default | yes |
prompt | str | no default | no default | yes |
system_prompt | str | not supported by this model | no default | no |
temperature | float | 0.7, min=0.01, max=5 | 0.7, min=0.0, max=5 | no |
max_new_tokens | int | 128, min=1 | 128, min=1 | no |
min_new_tokens | int | -1, min=-1 | -1, min=-1 | no |
top_p | float | 0.95, min=0, max=1 | 0.95, min=0, max=1 | no |
top_k | int | -1, min=-1 | -1, min=-1 | no |
stop_sequences | str | no default, example:'<end>,<stop>' | no default, example:'<end>,<stop>' | no |
length_penalty | float | not supported by this model | 1, min=0, max=5 | no |
presence_penalty | float | not supported by this model | 0 | no |
frequency_penalty | not supported by this model | |||
repeat_penalty | float | 1.15 | 1.15 | no |
seed | int | no default | no default | no |
use_prompt_template | bool | not supported by this model | true | no |
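Where stop_sequences is supported, the tables above show it as a single comma-separated string, e.g. '<end>,<stop>'. The helpers below sketch how such a value might be built and split; they are illustrative only and not part of the Seaplane SDK.

```python
def split_stop_sequences(value):
    """Split a comma-separated stop_sequences string into individual sequences."""
    return [s for s in value.split(",") if s]

def join_stop_sequences(sequences):
    """Join individual stop sequences into the comma-separated string format."""
    return ",".join(sequences)
```

Note that this simple format cannot represent a stop sequence that itself contains a comma.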
Mixtral 8x7b​
Mixtral 8x7b is available under the following model name: mixtral-8x7b-instruct-v0.1.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | str | no default | no |
temperature | not supported by this model ||
max_new_tokens | int | 128 | no |
min_new_tokens | int | min=1, to disable set to -1 | no |
top_p | float | 0.95, min=0, max=1 | no |
top_k | int | min=-1 | no |
stop_sequences | str | no default, example:'<end>,<stop>' | no |
length_penalty | float | 1, min=0, max=5 | no |
presence_penalty | float | no default | no |
frequency_penalty | not supported by this model ||
repeat_penalty | not supported by this model ||
seed | int | no default | no |
use_prompt_template | not supported by this model ||
Starling LM 7b Alpha​
Starling LM 7B Alpha is available under the following model name: starling-lm-7b-alpha.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.8, min=0.01, max=5 | no
max_new_tokens | int | 128 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 0.95, min=0.01, max=1 | no |
top_k | int | no default | no |
stop_sequences | str | no default, example:'<end>,<stop>' | no |
length_penalty | not supported by this model | ||
presence_penalty | float | no default, min=-5, max=5 | no |
frequency_penalty | float | no default, min=-5, max=5 | no |
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
Yi 34B Chat​
Yi 34B Chat is available under the following model name: yi-34b-chat.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.3 | no |
max_new_tokens | int | 1024 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 0.8 | no |
top_k | int | 50 | no |
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | float | 1.2 | no |
seed | not supported by this model | ||
use_prompt_template | bool | true | no |
Yi 6B​
Yi 6B is available under the following model name: yi-6b.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.8 | no |
max_new_tokens | int | 512 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 0.95 | no |
top_k | int | 50 | no |
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | float | no default | no |
frequency_penalty | float | no default | no |
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
Falcon 40B Instruct​
Falcon 40B Instruct is available under the following model name: falcon-40b-instruct.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.75, min=0.01,max=5 | no |
max_new_tokens | int | 500 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 1, min=0.01, max=1 | no |
top_k | not supported by this model | ||
stop_sequences | str | '<end>,<stop>' | no |
length_penalty | float | 1, min=0.01, max=5 | no
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | float | 1, min=0.01, max=5 | no |
seed | int | -1, min=-1 | no |
use_prompt_template | not supported by this model |
Vicuna 13B​
Vicuna 13B is available under the following model name: vicuna-13b.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.75, min=0.01, max=5 | no |
max_new_tokens | int | 500 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 1, min=0.01, max=1 | no |
top_k | not supported by this model | ||
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | float | 1, min=0.01, max=5 | no |
seed | int | -1 | no |
use_prompt_template | not supported by this model |
Phi 2​
Phi 2 is available under the following model name: phi-2.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | not supported by this model | ||
max_new_tokens | int | 200, min=0, max=2048 | no |
min_new_tokens | not supported by this model | ||
top_p | not supported by this model | ||
top_k | not supported by this model | ||
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
Olmo 7B​
Olmo 7B is available under the following model name: olmo-7b
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | not supported by this model | ||
max_new_tokens | int | 100 | no |
min_new_tokens | not supported by this model ||
top_p | float | 0.95, min=0.01, max=1 | no |
top_k | int | 50 | no |
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | not supported by this model | ||
seed | not supported by this model | ||
use_prompt_template | not supported by this model |
Wizard Coder 34B V1.0​
Wizard Coder 34B V1.0 is available under the following model name: wizardcoder-34b-v1.0.
parameter | type | default | required |
---|---|---|---|
model | str | no default | yes |
prompt | str | no default | yes |
system_prompt | not supported by this model | ||
temperature | float | 0.75, min=0.01, max=5 | no |
max_new_tokens | int | 256 | no |
min_new_tokens | not supported by this model | ||
top_p | float | 1, min=0.01, max=5 | no
top_k | not supported by this model | ||
stop_sequences | not supported by this model | ||
length_penalty | not supported by this model | ||
presence_penalty | not supported by this model | ||
frequency_penalty | not supported by this model | ||
repeat_penalty | float | 1.1, min=0.01, max=5 | no |
seed | not supported by this model | ||
use_prompt_template | not supported by this model |