Vector Store
Seaplane comes with a built-in vector store to build powerful data science and machine learning applications. The Seaplane SDK hides all the complexity of setting up and connecting to the vector store and hands you a single object to interact with. Authentication is handled automatically by your API key. You can read more about authentication here.
You can add the vector store to any task by importing it from the main Seaplane package with from seaplane.vector import vector_store.
from seaplane.vector import vector_store

def my_vector_store_task(msg):
    # do something with your vector_store
    vector_store.create_index('my-index', 768)
Index Operations
Creating An Index
In vector stores, a database is known as an index, sometimes referred to as a collection. You can create a new index by calling create_index() and passing two arguments: the index name and the index dimension. All vectors inside a single index need to have the same dimension.
Input variables:
- The preferred index name (type: string)
- The index dimension (type: int)
from seaplane.vector import vector_store

def my_vector_store_task(msg):
    dimension = 3
    vector_store.create_index("my-index", dimension)
Keep in mind that calling create_index inside a task executes it every time the task is called. If the index already exists, this causes the following error.
b'{"status":{"error":"Wrong input: Collection `<index-name>` already exists!"},"time":0.012740604}'
Resolve this by checking if the index exists before creating it.
from seaplane.vector import vector_store

def my_vector_store_task(msg):
    dimension = 3
    index_name = "my_index"

    # create index if it does not exist
    if index_name not in vector_store.list_indexes():
        vector_store.create_index(index_name, dimension)
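The existence check above can be factored into a small helper so every task reuses the same logic. The following is a minimal sketch, not part of the Seaplane SDK: ensure_index is a hypothetical helper name, and FakeVectorStore is a hypothetical in-memory stand-in that only mimics the list_indexes() and create_index() calls shown above, so the example runs without a Seaplane account. In a real task you would pass vector_store instead.

```python
class FakeVectorStore:
    """Hypothetical in-memory stand-in for vector_store (illustration only)."""

    def __init__(self):
        self._indexes = {}

    def list_indexes(self):
        # Return the names of all known indexes.
        return list(self._indexes)

    def create_index(self, name, dimension):
        # Mimic the documented behavior: creating a duplicate index fails.
        if name in self._indexes:
            raise ValueError(f"Collection `{name}` already exists!")
        self._indexes[name] = dimension
        return True


def ensure_index(store, name, dimension):
    """Create the index only if it is missing. Returns True if it was created."""
    if name in store.list_indexes():
        return False
    store.create_index(name, dimension)
    return True


store = FakeVectorStore()
first_call = ensure_index(store, "my-index", 3)   # index is created
second_call = ensure_index(store, "my-index", 3)  # index exists, no error raised
print(first_call, second_call)
```

Because the helper only relies on list_indexes() and create_index(), calling it repeatedly is safe even when the task runs many times.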
Alternatively, you can use the recreate_index method. Keep in mind that recreating an index deletes all data in any existing index with the same name.
Expected output:
- True if the index is created successfully (type: bool)
- An UnexpectedResponse error if unsuccessful (type: error). The error includes a raw response with more information. For example, creating an index that already exists throws the following error.
b'{"status":{"error":"Wrong input: Collection <collection-name> already exists!"},"time":0.040361056}'
Deleting An Index
To delete an index, call delete_index() and pass the index name.
Input variables:
- The index name (type: string)
from seaplane.vector import vector_store

def my_vector_store_task(msg):
    vector_store.delete_index("my-index")
Expected output:
- True if the index is deleted successfully (type: bool)
- False if unsuccessful (type: bool)
Vector Operations
Creating a new Vector
Most vector operations require the Vector type.
Input variables:
- A vector (required, type: list)
- Metadata (optional, type: dict)
- ID (optional, type: string). If no ID is assigned, the SDK generates one for you based on uuid4.
You can construct a new Vector as follows.
from seaplane.vector import Vector
vector = Vector(vector=[1,2,3], metadata={"foo" : "bar"})
Expected output (type: Vector):
Vector(vector=[1, 2, 3], id='54147196-7dac-4847-890f-010aa3f43c45', metadata={'foo': 'bar'})
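To make the ID behavior concrete, here is a minimal sketch of a class that fills in a uuid4-based ID when none is given. SketchVector is a hypothetical stand-in written for this example. It mirrors the three documented fields but is not the SDK's Vector class, whose actual implementation may differ.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchVector:
    """Hypothetical stand-in that mimics the documented Vector fields."""

    vector: list
    # When no ID is passed, generate one from uuid4 (as the docs describe).
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    metadata: Optional[dict] = None


# No ID given: a fresh uuid4 string is generated.
auto = SketchVector(vector=[1, 2, 3], metadata={"foo": "bar"})

# Explicit ID: the value is kept as-is.
explicit = SketchVector(vector=[1, 2, 3], id="my-vector-id")

print(auto.id)
print(explicit.id)
```

Passing an explicit ID is useful when you later need to update or delete a specific vector, since those operations look vectors up by ID.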
Inserting A Vector
To insert a vector, call the insert() method on the vector_store object. You can only insert vectors with the same dimension as the index.
Input variables:
- The name of the index to insert the vector into (type: string)
- A list of vectors (type: list) with elements of type Vector, each containing:
  - vector (required, type: list)
  - id (optional, type: string)
  - metadata (optional, type: dict)
from seaplane.vector import vector_store, Vector
import uuid

def my_vector_store_task(msg):
    vector = Vector(vector=[1,2,3], metadata={"foo" : "bar"}, id=str(uuid.uuid4()))
    vector_store.insert('my_index_name', [vector])
Expected output (type: dict):
{'id': ['8006bff9-5466-466d-a1fe-937564768124'], 'status': 'COMPLETED'}
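Since inserts only succeed when every vector matches the index dimension, it can help to validate the raw values before building Vector objects. The sketch below is plain Python and does not call the SDK. check_dimensions is a hypothetical helper name, and the dimension of 3 matches the earlier create_index example.

```python
def check_dimensions(vectors, dimension):
    """Raise ValueError if any raw vector does not match the index dimension."""
    for position, values in enumerate(vectors):
        if len(values) != dimension:
            raise ValueError(
                f"vector at position {position} has dimension {len(values)}, "
                f"expected {dimension}"
            )


index_dimension = 3

# All vectors have three components, so this passes silently.
check_dimensions([[1, 2, 3], [4, 5, 6]], index_dimension)

# The second vector has only two components, so this raises.
try:
    check_dimensions([[1, 2, 3], [4, 5]], index_dimension)
except ValueError as error:
    message = str(error)
    print(message)
```

Failing early in your own code gives a clearer message than waiting for the vector store to reject the request.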
Deleting A Vector
To delete a vector, call the delete_vectors() method on the vector_store object.
Input variables:
- The index name (type: string)
- The vector IDs (type: list) with elements of type string
from seaplane.vector import vector_store

def my_vector_store_task(msg):
    vector_store.delete_vectors('my-index', ['my-vector-id'])
Expected output (type: string):
'COMPLETED'
Updating A Vector
To update a vector, call the update() method on the vector_store object. You can only update vectors with the same dimension as the underlying index. Seaplane finds the vectors to update based on the id provided in each Vector.
Input variables:
- The index name (type: string)
- A list of vectors to update (type: list) with elements of type Vector, each containing:
  - vector (required, type: list)
  - id (required, type: string)
  - metadata (optional, type: dict)
from seaplane.vector import vector_store, Vector

def my_vector_store_task(msg):
    vector = Vector(vector=[1,2,3], metadata={"foo" : "bar"}, id='my-vector-id')
    vector_store.update('my-index', [vector])
Expected output (type: dict):
{'id': ['6e296940-a81a-4480-9411-6a30682ecb38'], 'status': 'COMPLETED'}
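Both insert() and update() return a dict with an id list and a status field, so a task may want to confirm the status before continuing. The following is a minimal sketch that works on a dict shaped like the documented output. completed_ids is a hypothetical helper name, and treating every status other than 'COMPLETED' as a failure is an assumption, since this page does not list other status values.

```python
def completed_ids(result):
    """Return the IDs from a result dict, or raise if it did not complete."""
    status = result.get("status")
    # Assumption: anything other than 'COMPLETED' is treated as a failure.
    if status != "COMPLETED":
        raise RuntimeError(f"vector store operation did not complete: {status!r}")
    return result.get("id", [])


# Result dict copied from the documented update() output.
result = {'id': ['6e296940-a81a-4480-9411-6a30682ecb38'], 'status': 'COMPLETED'}
ids = completed_ids(result)
print(ids)
```

In a real task, result would be the return value of vector_store.insert() or vector_store.update().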
Search Methods
K Nearest Neighbor Search (KNN)
KNN search is a popular algorithm for similarity search, where the goal is to find the K nearest neighbors of a given query vector. It is commonly used in domains such as recommendation systems, image recognition, and natural language processing.
Input variables:
- Index name (type: string)
- Query vector (type: Vector)
- K, the number of neighbors to return (type: int)
from seaplane.vector import vector_store, Vector

def my_vector_store_task(msg):
    vector = Vector(vector=[1,2,3])
    vector_store.knn_search('my_index_name', vector, 3)
Expected output (type: list) with elements of type ScoredPoint, each containing:
- id (type: string)
- version (type: int)
- score (type: float)
- payload (type: dict)
- vector (type: list or None)
[ScoredPoint(id='83870a21-a99e-47e9-aba1-fc60d8a3a254', version=3, score=1.0000001, payload={'foo': 'bar'}, vector=None),
ScoredPoint(id='6e296940-a81a-4480-9411-6a30682ecb38', version=3, score=1.0000001, payload={'foo': 'bar'}, vector=None),
ScoredPoint(id='3485b3b2-0b5f-4135-9119-0d6c7895eccf', version=0, score=1.0000001, payload={'foo': 'bar'}, vector=None)]
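To build intuition for what a KNN search returns, the sketch below ranks a handful of stored vectors against a query by brute force. It is plain Python and does not call the SDK. Cosine similarity is an assumption here, since this page does not state which distance metric the vector store uses, and a production vector store typically relies on an index structure rather than comparing against every vector. The function names and sample data are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn(query, stored, k):
    """Return the k (id, score) pairs most similar to query, best first."""
    scored = [(vec_id, cosine_similarity(query, vec)) for vec_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical stored vectors, keyed by ID.
stored = {
    "a": [1, 2, 3],
    "b": [2, 4, 6.5],
    "c": [-1, 0, 1],
    "d": [3, 2, 1],
}

neighbors = knn([1, 2, 3], stored, 2)
for vec_id, score in neighbors:
    print(vec_id, round(score, 4))
```

Vector "a" is identical to the query and ranks first with a score of about 1.0, followed by "b", which points in nearly the same direction.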