This connector materializes Flow collections into namespaces in a Pinecone index.
The connector uses the OpenAI Embedding API to create vector embeddings based on the documents in your collections and inserts these vector embeddings and associated metadata into Pinecone for storage and retrieval.
provides the latest connector image. You can also follow the link in your browser to see past image versions.
To use this connector, you'll need:
- A Pinecone account with an API Key for authentication.
- An OpenAI account with an API Key for authentication.
- A Pinecone Index created to store materialized vector embeddings. When using the embedding model text-embedding-ada-002 (recommended), the index must have Dimensions set to 1536.
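The dimension requirement can be checked before configuring the connector. The following is a minimal sketch, not part of the connector: the `EMBEDDING_DIMENSIONS` table and the `check_index_dimension` helper are hypothetical names, and only the 1536 value for text-embedding-ada-002 comes from the requirement above.

```python
# Hypothetical lookup of embedding model -> vector dimension.
# Only the text-embedding-ada-002 entry is taken from the prerequisites above.
EMBEDDING_DIMENSIONS = {"text-embedding-ada-002": 1536}


def check_index_dimension(model_id: str, index_dimension: int) -> None:
    """Raise ValueError if the index dimension does not match the model."""
    expected = EMBEDDING_DIMENSIONS.get(model_id)
    if expected is None:
        raise ValueError(f"unknown embedding model: {model_id}")
    if index_dimension != expected:
        raise ValueError(
            f"index has dimension {index_dimension}, "
            f"but {model_id} produces vectors of dimension {expected}"
        )


# Passes silently when the index matches the model.
check_index_dimension("text-embedding-ada-002", 1536)
```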
The materialization creates a vector embedding for each collection document. Its structure is based on the collection fields.
By default, fields of a single scalar type are included in the embedding: strings, integers, numbers, and booleans. You can include additional array or object type fields using projected fields.
The text generated for the embedding contains the field names and their values, separated by newlines.
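As a rough illustration of how such text could be assembled, the sketch below builds one line per included field. It is not the connector's actual code: the `embedding_text` function, the `name: value` separator, and the JSON rendering of non-string values are all assumptions made for this example.

```python
import json

# Scalar Python types standing in for strings, integers, numbers, and booleans.
SCALAR_TYPES = (str, int, float, bool)


def embedding_text(document: dict, extra_fields: tuple = ()) -> str:
    """Build newline-separated text from a document's fields.

    Scalar fields are included by default. Array or object fields are
    included only when named in extra_fields (standing in for projected
    fields). The "name: value" layout is an assumption.
    """
    lines = []
    for name, value in document.items():
        if isinstance(value, SCALAR_TYPES) or name in extra_fields:
            # Strings are used as-is; other values are rendered as JSON.
            rendered = value if isinstance(value, str) else json.dumps(value)
            lines.append(f"{name}: {rendered}")
    return "\n".join(lines)


doc = {"title": "Hello", "count": 3, "score": 1.2, "active": False, "tags": ["a"]}
print(embedding_text(doc))
print(embedding_text(doc, extra_fields=("tags",)))
```

The first call skips the `tags` array. The second includes it because it is named explicitly.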
Pinecone Record Metadata
Pinecone supports metadata fields associated with stored vectors that can be used when performing vector queries. This materialization will include the materialized document as a JSON string in the metadata field flow_document to enable retrieval of the document from vectors returned by Pinecone queries.
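The round trip through flow_document can be sketched as follows. This is an illustration under assumptions, not the connector's implementation: `build_record` and `document_from_match` are hypothetical helpers, the record uses the `id` / `values` / `metadata` layout of a Pinecone vector record, and a real materialized record may carry additional metadata fields.

```python
import json


def build_record(vector_id: str, embedding: list, document: dict) -> dict:
    """Assemble a record whose metadata carries the document as a JSON string."""
    return {
        "id": vector_id,
        "values": embedding,
        "metadata": {"flow_document": json.dumps(document)},
    }


def document_from_match(match: dict) -> dict:
    """Recover the original document from a record returned by a query."""
    return json.loads(match["metadata"]["flow_document"])


# A short embedding is used for readability; real vectors are much longer.
record = build_record("doc-1", [0.1, 0.2, 0.3], {"title": "Hello", "count": 3})
print(document_from_match(record))
```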
Pinecone indexes all metadata fields by default. To manage memory usage of the index, use selective metadata indexing to exclude the flow_document metadata field.
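Selective metadata indexing works by listing the fields that should be indexed, so leaving flow_document out of that list excludes it. The sketch below only builds such a configuration as a plain dictionary. The `metadata_config` helper and the field names are hypothetical, and the `{"indexed": [...]}` shape is an assumption about the index configuration that should be checked against the Pinecone documentation for your index type.

```python
def metadata_config(metadata_fields: list) -> dict:
    """Return an index metadata configuration that omits flow_document.

    Every listed field except flow_document is marked for indexing, so the
    JSON document is stored with the vector but not indexed.
    """
    indexed = [name for name in metadata_fields if name != "flow_document"]
    return {"indexed": indexed}


# Hypothetical metadata field names for illustration.
config = metadata_config(["title", "category", "flow_document"])
print(config)
```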
| Title | Description | Type | Required/Default |
|---|---|---|---|
| Pinecone Index | Pinecone index for this materialization. Must already exist and have appropriate dimensions for the embedding model used. | string | Required |
| Pinecone Environment | Cloud region for your Pinecone project. Example: us-central1-gcp | string | Required |
| Pinecone API Key | Pinecone API key used for authentication. | string | Required |
| OpenAI API Key | OpenAI API key used for authentication. | string | Required |
| Embedding Model ID | Embedding model ID for generating OpenAI embeddings. The default text-embedding-ada-002 is recommended. | string | |
| | Options for advanced users. You should not typically need to modify these. | object | |
| OpenAI Organization | Optional organization name for OpenAI requests. Use this if you belong to multiple organizations to specify which organization is used for API requests. | string | |

| Title | Description | Type | Required/Default |
|---|---|---|---|
| Pinecone Namespace | Name of the Pinecone namespace that this collection will materialize vectors into. | string | Required |
This connector operates only in delta updates mode.
Pinecone upserts vectors based on their id. The id for materialized vectors is based on the Flow Collection key.
For collections with a top-level reduction strategy of merge and a strategy of lastWriteWins for all nested values (this is also the default), collections will be materialized "effectively once", with any updated Flow documents replacing vectors in the Pinecone index if they have the same key.
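The replace-on-same-key behavior can be modeled with a small in-memory stand-in for a namespace. This is a sketch for intuition only: `InMemoryNamespace` is not a Pinecone API, and the way `vector_id` joins key values into an id is an assumption, since the text above says only that the id is based on the collection key.

```python
def vector_id(key_values: tuple) -> str:
    """Derive a vector id from collection key values (hypothetical scheme)."""
    return ":".join(str(value) for value in key_values)


class InMemoryNamespace:
    """Toy stand-in for a namespace that upserts records by id."""

    def __init__(self) -> None:
        self.vectors = {}

    def upsert(self, records: list) -> None:
        # A record with an existing id replaces the stored one.
        for record in records:
            self.vectors[record["id"]] = record


namespace = InMemoryNamespace()
key = ("customer", 42)

# The first version of the document is stored under its key-derived id.
namespace.upsert([{"id": vector_id(key), "values": [0.1, 0.2], "version": 1}])
# An updated document with the same key replaces the earlier vector.
namespace.upsert([{"id": vector_id(key), "values": [0.3, 0.4], "version": 2}])

print(len(namespace.vectors))
print(namespace.vectors[vector_id(key)]["version"])
```

Because both writes share one id, the namespace ends up holding a single vector that reflects the latest document.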