Version: 0.7.0 (alpha)

Large Language Models

This page introduces large language model (LLM) parameters and lists the default, minimum, and maximum values of these parameters for each model available on Seaplane.

LLMs are a class of machine learning models trained on vast amounts of text data to understand and generate human-like text. These models, typically based on architectures like GPT (Generative Pre-trained Transformer), are capable of tasks such as text completion, summarization, translation, and dialogue generation. They learn patterns in language by processing large corpora of text, enabling them to generate coherent and contextually relevant responses to prompts or queries. Seaplane supports over 40 large language models, all accessible through our model DAG (directed acyclic graph). You can learn more about the Seaplane Model DAG here.

Parameter Definitions

Most LLMs support the same range of parameters, although not all parameters are available for all models.

  • Prompt - A prompt refers to the initial text or input provided to the model to generate a response. This text outlines the context or question for the model to base its response on and can vary in length and complexity.
  • System Prompt - A system prompt refers specifically to the initial text or input provided within a system or platform environment to prompt the model for a response. It differs from a generic prompt in that it is tailored to the specific requirements and functionalities of the system or platform. While a prompt may serve as a general guideline for generating text, a system prompt is more focused and may include additional instructions or constraints. For example, you might add only return the following JSON: {"key" : VALUE} to constrain the model to return valid JSON output and nothing else.
  • Temperature - refers to a parameter that controls the randomness of the generated text. It influences the diversity and creativity of responses produced by the model. A lower temperature results in more deterministic outputs, where the model is more likely to choose high-probability words, leading to more conservative and predictable responses. Conversely, a higher temperature increases randomness, allowing the model to explore a wider range of possibilities and generate more varied and creative responses.
  • Top-P - top-p refers to a sampling technique used during text generation. Also known as nucleus sampling or probabilistic sampling, it involves selecting from the top probabilities of the model's output distribution until the cumulative probability exceeds a certain threshold denoted as "p". This approach ensures diversity in the generated text by dynamically adjusting the subset of tokens considered for selection based on their probabilities. Higher values of "p" lead to a larger subset of tokens considered, allowing for more diverse and exploratory responses during generation, while lower values result in more conservative outputs. Adjusting the "top-p" parameter enables fine-tuning of the balance between coherence and novelty in the generated text.
  • Top-K - top-k refers to a sampling technique used during text generation. It restricts the model to consider only the top "k" most likely tokens at each step of generation, ensuring diversity while limiting selection to a fixed number of tokens with the highest probabilities. By constraining the model's choices to a smaller set of tokens, top-k sampling encourages more focused and coherent responses and avoids generating highly unlikely or nonsensical sequences. Adjusting the value of "k" allows for fine-tuning the balance between exploration and exploitation during text generation, with higher values promoting more diverse outputs and lower values leading to more conservative and predictable responses.
  • Frequency Penalty - The frequency penalty is a parameter used during text generation to penalize the repetition of tokens in proportion to how often they already occur in the generated text. It encourages diversity and reduces monotony in the output by discouraging the model from repeating frequently occurring tokens. By penalizing high-frequency tokens, the frequency_penalty parameter helps promote the generation of more varied and interesting text.
  • Presence Penalty - Presence penalty is a parameter used during text generation to penalize tokens that have already appeared in the generated text, regardless of how often they occur. By applying a one-time penalty to any token that is already present, the presence_penalty parameter encourages the model to introduce new tokens and topics rather than restating what it has already said.
  • Repeat Penalty - The repeat penalty serves as a parameter during text generation to discourage the repetition of tokens within the generated output. By penalizing repeated tokens, this parameter aims to enhance the diversity and coherence of the text by reducing redundancy. Adjusting the value of the repeat_penalty parameter enables fine-tuning of the balance between encouraging novelty in the generated text and maintaining coherence and relevance to the given context or task. This parameter is particularly useful for ensuring that the generated text remains engaging and informative, without unnecessary repetition of content.
  • Max Tokens - Max tokens is a parameter that determines the maximum number of tokens allowed in the generated output during text generation. This parameter controls the length of the generated text and helps prevent the model from producing excessively long responses.
  • Minimum Tokens - Minimum tokens refers to a parameter used during text generation to specify the minimum number of tokens required in the generated output. This parameter ensures that the model produces responses of a certain length, preventing it from generating overly short or incomplete text.
  • Seed - The seed refers to the initial value used to initialize the internal state of the model's random number generator before generating text. This initialization influences the sequence of random numbers produced during the text generation process. By setting a specific seed, the model's behavior becomes deterministic, meaning that given the same seed, the model will produce the same sequence of text every time it's run. This deterministic behavior is crucial for reproducibility, as it allows researchers and developers to obtain consistent results and debug their models effectively. Additionally, controlling the seed enables users to explore variations in the generated text by changing the seed value while keeping other parameters constant.
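
The sampling and penalty parameters above are easiest to see in code. The sketch below is a toy, stdlib-only illustration of how temperature, top-k, top-p (nucleus) filtering, the frequency/presence penalties, and the seed interact during sampling; it is not Seaplane's implementation, just a minimal sketch of the standard technique.

```python
import math
import random
from collections import Counter

def softmax(logits):
    """Turn raw model scores (logits) into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def apply_penalties(logits, generated, frequency_penalty=0.0, presence_penalty=0.0):
    """Penalize tokens that already appeared in the output.

    frequency_penalty scales with how often a token has appeared;
    presence_penalty is a flat cost applied once a token has appeared at all.
    """
    adjusted = list(logits)
    for token, count in Counter(generated).items():
        adjusted[token] -= count * frequency_penalty + presence_penalty
    return adjusted

def sample_token(logits, temperature=1.0, top_k=0, top_p=1.0, seed=None):
    """Sample one token index using temperature, top-k, and top-p filtering."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    probs = softmax([x / temperature for x in logits])
    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:                     # top-k: keep only the k most likely tokens
        ranked = ranked[:top_k]
    kept, cumulative = [], 0.0
    for i in ranked:                  # top-p: keep tokens until cumulative prob >= p
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize over the surviving tokens and draw one.
    total = sum(probs[i] for i in kept)
    return rng.choices(kept, weights=[probs[i] / total for i in kept], k=1)[0]
```

Lowering the temperature concentrates probability on the most likely token, while reusing the same seed reproduces the same draw.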

Llama Model Family​

Chat Models​

Seaplane supports three Llama chat models, available under the following model names:

  • llama-2-7b-chat
  • llama-2-13b-chat
  • llama-2-70b-chat

| parameter | type | default (7b) | default (13b and 70b) | required |
| --- | --- | --- | --- | --- |
| model | str | no default | no default | yes |
| prompt | str | no default | no default | yes |
| system_prompt | str | You are a helpful, respectful and honest assistant. | You are a helpful, respectful and honest assistant. | no |
| temperature | float | 0.7, min=0.01, max=5.0 | 0.75, min=0.01, max=5.0 | no |
| max_new_tokens | int | 128, min=1 | 128, min=1 | no |
| min_new_tokens | int | -1, min=-1 | -1, min=-1 | no |
| top_p | float | 0.95, max=1 | 0.9, max=1 | no |
| top_k | int | -1, min=-1 | 50 | no |
| stop_sequences | | not supported by this model | not supported by this model | |
| length_penalty | | not supported by this model | not supported by this model | |
| presence_penalty | | not supported by this model | not supported by this model | |
| frequency_penalty | | not supported by this model | not supported by this model | |
| repeat_penalty | float | 1.15 | 1.15 | no |
| seed | int | no default | no default | no |
| use_prompt_template | bool | true | true | no |
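
As a quick sanity check, the ranges above can be validated client-side before sending a request. The bounds below copy the documented min/max values for llama-2-7b-chat; the payload shape and the validate_params helper are hypothetical illustrations, not the actual Seaplane request format, and where the table gives no explicit bound a value is assumed.

```python
# Bounds copied from the table above for llama-2-7b-chat. Where the table
# gives no explicit minimum or maximum, a value is assumed here.
LLAMA_2_7B_CHAT_BOUNDS = {
    "temperature": (0.01, 5.0),
    "top_p": (0.0, 1.0),                  # table states only max=1; min assumed 0
    "top_k": (-1, float("inf")),
    "max_new_tokens": (1, float("inf")),
    "min_new_tokens": (-1, float("inf")),
}

def validate_params(params, bounds=LLAMA_2_7B_CHAT_BOUNDS):
    """Raise ValueError if a known numeric parameter is out of range (hypothetical helper)."""
    for name, value in params.items():
        if name in bounds:
            low, high = bounds[name]
            if not (low <= value <= high):
                raise ValueError(f"{name}={value} outside [{low}, {high}]")

# Hypothetical request payload, populated with the documented 7b defaults.
request = {
    "model": "llama-2-7b-chat",
    "prompt": "Summarize this page in one sentence.",
    "temperature": 0.7,       # 7b default
    "top_p": 0.95,            # 7b default
    "max_new_tokens": 128,    # default
}
validate_params(request)      # all values fall inside the documented ranges
```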

Code and Instruct Models​

Seaplane supports eight Llama code and instruct models, available under the following model names:

  • codellama-7b-instruct
  • codellama-7b-python
  • codellama-13b-instruct
  • codellama-34b-instruct
  • codellama-34b-python
  • codellama-70b
  • codellama-70b-instruct
  • codellama-70b-python

| parameter | type | default (all code models) | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | | not supported by this model | |
| temperature | float | 0.8 | no |
| max_new_tokens | int | 500 | no |
| min_new_tokens | | not supported by this model | |
| top_p | float | 0.95 | no |
| top_k | int | 10 | no |
| repetition_penalty | float | 1.1, min=0, max=2 | no |
| stop_sequences | | not supported by this model | |
| length_penalty | | not supported by this model | |
| presence_penalty | float | 0, min=0, max=2 | no |
| frequency_penalty | float | 0, min=0, max=2 | no |
| repeat_penalty | | not supported by this model | |
| seed | | not supported by this model | |
| use_prompt_template | | not supported by this model | |

Claude Model Family​

Seaplane supports four Claude models, available under the following model names:

  • predictions-aws-anthropic-claude21
  • predictions-aws-anthropic-claude-instant12
  • predictions-aws-anthropic-claude3-haiku-20240307
  • predictions-aws-anthropic-claude3-sonnet-20240229

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | str | no default | no |
| temperature | float | 1, min=0, max=1 | no |
| max_new_tokens | int | 512, min=0 | no |
| min_new_tokens | | not supported by this model | |
| top_p | float | 0.999, min=0, max=1 | no |
| top_k | int | disabled, min=0, max=100,000,000 | no |
| stop_sequences | str | no default | no |
| length_penalty | | not supported by this model | |
| presence_penalty | | not supported by this model | |
| frequency_penalty | | not supported by this model | |
| repeat_penalty | | not supported by this model | |
| seed | | not supported by this model | |
| use_prompt_template | | not supported by this model | |

OpenAI Model Family​

Seaplane supports two OpenAI models, available under the following model names:

  • chat-azure-openai-gpt35-turbo16k
  • chat-azure-openai-gpt4

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | str | no default | no |
| temperature | float | 1, min=0, max=2 | no |
| max_new_tokens | int | null | no |
| min_new_tokens | | not supported by this model | |
| top_p | float | 1 | no |
| top_k | | not supported by this model | |
| stop_sequences | str | no default; supports a maximum of 4 stop sequences in a list | no |
| length_penalty | | not supported by this model | |
| presence_penalty | float | 0, min=-2, max=2 | no |
| frequency_penalty | float | 0, min=-2, max=2 | no |
| repeat_penalty | | not supported by this model | |
| seed | int | no default | no |
| use_prompt_template | | not supported by this model | |

Other Models​

Zephyr 7b beta​

Zephyr 7b beta is available under the following model name: zephyr-7b-beta

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | str | no default | no |
| temperature | float | 0.8, min=0.01, max=5.0 | no |
| max_new_tokens | int | 128 | no |
| min_new_tokens | | not supported by this model | |
| top_p | float | 0.95, min=0.01, max=1 | no |
| top_k | int | 50 | no |
| stop_sequences | | not supported by this model | |
| length_penalty | | not supported by this model | |
| presence_penalty | float | 1, min=0.01, max=5.0 | no |
| frequency_penalty | | not supported by this model | |
| repeat_penalty | | not supported by this model | |
| seed | | not supported by this model | |
| use_prompt_template | bool | true | no |

Mistral Model Family​

Seaplane supports two Mistral models, available under the following model names:

  • mistral-7b-instruct-v0.1
  • mistral-7b-instruct-v0.2

| parameter | type | default (v0.1) | default (v0.2) | required |
| --- | --- | --- | --- | --- |
| model | str | no default | no default | yes |
| prompt | str | no default | no default | yes |
| system_prompt | str | not supported by this model | no default | no |
| temperature | float | 0.7, min=0.01, max=5 | 0.7, min=0.0, max=5 | no |
| max_new_tokens | int | 128, min=1 | 128, min=1 | no |
| min_new_tokens | int | -1, min=-1 | -1, min=-1 | no |
| top_p | float | 0.95, min=0, max=1 | 0.95, min=0, max=1 | no |
| top_k | int | -1, min=-1 | -1, min=-1 | no |
| stop_sequences | str | no default; example: '<end>,<stop>' | no default; example: '<end>,<stop>' | no |
| length_penalty | float | not supported by this model | 1, min=0, max=5 | no |
| presence_penalty | float | not supported by this model | 0 | no |
| frequency_penalty | | not supported by this model | not supported by this model | |
| repeat_penalty | float | 1.15 | 1.15 | no |
| seed | int | no default | no default | no |
| use_prompt_template | bool | not supported by this model | true | no |
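
The stop_sequences value above is a single comma-separated string (e.g. '<end>,<stop>'). Purely as an illustration of that format, the sketch below shows how such a string could be split and used to truncate output client-side; the models apply stop sequences server-side, and this helper is not part of the Seaplane API.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut generated text at the earliest stop sequence, if one occurs.

    stop_sequences uses the comma-separated string form from the table,
    e.g. '<end>,<stop>'. (Illustrative helper, not part of the Seaplane API.)
    """
    cut = len(text)
    for stop in stop_sequences.split(","):
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)   # keep the earliest match
    return text[:cut]

print(truncate_at_stop("Answer: 42<stop>ignored", "<end>,<stop>"))  # Answer: 42
```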

Mixtral 8x7b​

Mixtral 8x7b is available under the following model name: mixtral-8x7b-instruct-v0.1.

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | str | no default | no |
| temperature | | not supported by this model | |
| max_new_tokens | int | 128 | no |
| min_new_tokens | int | min=1; set to -1 to disable | no |
| top_p | float | 0.95, min=0, max=1 | no |
| top_k | int | min=-1 | no |
| stop_sequences | str | no default; example: '<end>,<stop>' | no |
| length_penalty | float | 1, min=0, max=5 | no |
| presence_penalty | float | no default | no |
| frequency_penalty | | not supported by this model | |
| repeat_penalty | | not supported by this model | |
| seed | int | no default | no |
| use_prompt_template | | not supported by this model | |

Starling LM 7b Alpha​

Starling LM 7B Alpha is available under the following model name: starling-lm-7b-alpha.

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | | not supported by this model | |
| temperature | float | 0.8, min=0.01, max=5 | no |
| max_new_tokens | int | 128 | no |
| min_new_tokens | | not supported by this model | |
| top_p | float | 0.95, min=0.01, max=1 | no |
| top_k | int | no default | no |
| stop_sequences | str | no default; example: '<end>,<stop>' | no |
| length_penalty | | not supported by this model | |
| presence_penalty | float | no default, min=-5, max=5 | no |
| frequency_penalty | float | no default, min=-5, max=5 | no |
| repeat_penalty | | not supported by this model | |
| seed | | not supported by this model | |
| use_prompt_template | | not supported by this model | |

Yi 34B Chat​

Yi 34B Chat is available under the following model name: yi-34b-chat.

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | | not supported by this model | |
| temperature | float | 0.3 | no |
| max_new_tokens | int | 1024 | no |
| min_new_tokens | | not supported by this model | |
| top_p | float | 0.8 | no |
| top_k | int | 50 | no |
| stop_sequences | | not supported by this model | |
| length_penalty | | not supported by this model | |
| presence_penalty | | not supported by this model | |
| frequency_penalty | | not supported by this model | |
| repeat_penalty | float | 1.2 | no |
| seed | | not supported by this model | |
| use_prompt_template | bool | true | no |

Yi 6B​

Yi 6B is available under the following model name: yi-6b.

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | | not supported by this model | |
| temperature | float | 0.8 | no |
| max_new_tokens | int | 512 | no |
| min_new_tokens | | not supported by this model | |
| top_p | float | 0.95 | no |
| top_k | int | 50 | no |
| stop_sequences | | not supported by this model | |
| length_penalty | | not supported by this model | |
| presence_penalty | float | no default | no |
| frequency_penalty | float | no default | no |
| repeat_penalty | | not supported by this model | |
| seed | | not supported by this model | |
| use_prompt_template | | not supported by this model | |

Falcon 40B Instruct​

Falcon 40B Instruct is available under the following model name: falcon-40b-instruct.

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | | not supported by this model | |
| temperature | float | 0.75, min=0.01, max=5 | no |
| max_new_tokens | int | 500 | no |
| min_new_tokens | | not supported by this model | |
| top_p | float | 1, min=0.01, max=1 | no |
| top_k | | not supported by this model | |
| stop_sequences | str | '<end>,<stop>' | no |
| length_penalty | int | 1, min=0.01, max=5 | no |
| presence_penalty | | not supported by this model | |
| frequency_penalty | | not supported by this model | |
| repeat_penalty | float | 1, min=0.01, max=5 | no |
| seed | int | -1, min=-1 | no |
| use_prompt_template | | not supported by this model | |

Vicuna 13B​

Vicuna 13B is available under the following model name: vicuna-13b.

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | | not supported by this model | |
| temperature | float | 0.75, min=0.01, max=5 | no |
| max_new_tokens | int | 500 | no |
| min_new_tokens | | not supported by this model | |
| top_p | float | 1, min=0.01, max=1 | no |
| top_k | | not supported by this model | |
| stop_sequences | | not supported by this model | |
| length_penalty | | not supported by this model | |
| presence_penalty | | not supported by this model | |
| frequency_penalty | | not supported by this model | |
| repeat_penalty | float | 1, min=0.01, max=5 | no |
| seed | int | -1 | no |
| use_prompt_template | | not supported by this model | |

Phi 2​

Phi 2 is available under the following model name: phi-2.

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | | not supported by this model | |
| temperature | | not supported by this model | |
| max_new_tokens | int | 200, min=0, max=2048 | no |
| min_new_tokens | | not supported by this model | |
| top_p | | not supported by this model | |
| top_k | | not supported by this model | |
| stop_sequences | | not supported by this model | |
| length_penalty | | not supported by this model | |
| presence_penalty | | not supported by this model | |
| frequency_penalty | | not supported by this model | |
| repeat_penalty | | not supported by this model | |
| seed | | not supported by this model | |
| use_prompt_template | | not supported by this model | |

Olmo 7B​

Olmo 7B is available under the following model name: olmo-7b

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | | not supported by this model | |
| temperature | | not supported by this model | |
| max_new_tokens | int | 100 | no |
| min_new_tokens | | not supported by this model | |
| top_p | float | 0.95, min=0.01, max=1 | no |
| top_k | int | 50 | no |
| stop_sequences | | not supported by this model | |
| length_penalty | | not supported by this model | |
| presence_penalty | | not supported by this model | |
| frequency_penalty | | not supported by this model | |
| repeat_penalty | | not supported by this model | |
| seed | | not supported by this model | |
| use_prompt_template | | not supported by this model | |

Wizard Coder 34B V1.0​

Wizard Coder 34B V1.0 is available under the following model name: wizardcoder-34b-v1.0

| parameter | type | default | required |
| --- | --- | --- | --- |
| model | str | no default | yes |
| prompt | str | no default | yes |
| system_prompt | | not supported by this model | |
| temperature | float | 0.75, min=0.01, max=5 | no |
| max_new_tokens | int | 256 | no |
| min_new_tokens | | not supported by this model | |
| top_p | float | 1, min=0.01, max=5 | no |
| top_k | | not supported by this model | |
| stop_sequences | | not supported by this model | |
| length_penalty | | not supported by this model | |
| presence_penalty | | not supported by this model | |
| frequency_penalty | | not supported by this model | |
| repeat_penalty | float | 1.1, min=0.01, max=5 | no |
| seed | | not supported by this model | |
| use_prompt_template | | not supported by this model | |