
Voyage AI Data Source

The Voyage AI connector enables you to generate vector embeddings for text, rerank documents based on relevance to queries, and build AI-powered search and recommendation systems. This connector is particularly useful for applications that need to convert text to embeddings, implement semantic search, build recommendation engines, or improve search relevance through reranking. Follow the instructions below to create a new data flow that ingests data from a Voyage AI source in Nexla.

Create a New Data Flow

  1. To create a new data flow, navigate to the Integrate section, and click the New Data Flow button. Then, select the desired flow type from the list, and click the Create button.

  2. Select the Voyage AI connector tile from the list of available connectors. Then, select the credential that will be used to connect to the Voyage AI API, and click Next; or, create a new Voyage AI credential for use in this flow.

  3. In Nexla, Voyage AI data sources can be created using pre-built endpoint templates, which expedite source setup for common Voyage AI API endpoints. Each template is designed specifically for the corresponding Voyage AI API endpoint, making source configuration easy and efficient.
    • To configure this source using a template, follow the instructions in Configure Using a Template.

    Voyage AI sources can also be configured manually, allowing you to ingest data from Voyage AI API endpoints not included in the pre-built templates or apply further customizations to exactly suit your needs.
    • To configure this source manually, follow the instructions in Configure Manually.

Configure Using a Template

Nexla provides pre-built templates that can be used to rapidly configure data sources to ingest data from common Voyage AI API endpoints. Each template is designed specifically for the corresponding Voyage AI API endpoint, making data source setup easy and efficient.

Endpoint Settings

  • From the Endpoint pulldown menu, select the endpoint from which this source will fetch data. Available endpoint templates are listed in the expandable boxes below. Click on an endpoint to see more information about it and how to configure your data source for this endpoint.

    Generate Embeddings

    This endpoint generates vector embeddings for input text(s) using Voyage AI. Use this endpoint when you need to convert text to embeddings, create vector representations of text, or prepare text for semantic search.

    • Enter the input text in the Input field. This is the text you want to convert to embeddings. You can provide a single text string or multiple texts.
    • Enter the model name in the Model field, such as voyage-3.5, voyage-large-2, or another available Voyage AI embedding model. The default is typically voyage-3.5.

    The Generate Embeddings endpoint uses POST requests to send text to the Voyage AI API and returns vector embeddings. The endpoint supports batch processing, allowing you to generate embeddings for multiple texts in a single request. For more information about the Generate Embeddings endpoint, refer to the Voyage AI Embeddings Documentation.
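As a rough illustration of the single-text and batch cases described above, the sketch below builds the JSON body for an embedding request. The field names `input` and `model` follow the request body description later on this page; the helper name `build_embedding_body` is hypothetical and is not part of Nexla or the Voyage AI client.

```python
import json


def build_embedding_body(texts, model="voyage-3.5"):
    """Return a JSON request body for one text or a batch of texts.

    Hypothetical helper for illustration only.
    """
    if isinstance(texts, str):
        texts = [texts]  # normalize a single string to a one-item batch
    if not texts:
        raise ValueError("at least one input text is required")
    return json.dumps({"input": list(texts), "model": model})


# One text and a batch of texts produce the same body shape.
single = build_embedding_body("What is semantic search?")
batch = build_embedding_body(["first document", "second document"])
print(single)
print(batch)
```

Normalizing a single string to a one-item list keeps the body shape the same in both cases, which can make downstream handling simpler.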

    Rerank Documents

    This endpoint reranks a list of documents based on relevance to a query using Voyage AI. Use this endpoint when you need to improve search relevance, reorder search results, or find the most relevant documents for a query.

    • Enter the query text in the Query field. This is the search query you want to use for reranking documents.
    • Enter the documents array in JSON format in the Documents field. This should be an array of document strings to be reranked.
    • Enter the model name in the Model field. Common models include rerank-1 or other available Voyage AI reranking models.

    The Rerank Documents endpoint uses POST requests to send a query and documents to the Voyage AI API and returns reranked documents with relevance scores. The endpoint helps improve search relevance by reordering documents based on their relevance to the query. For more information about the Rerank Documents endpoint, refer to the Voyage AI Reranker Documentation.
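Because the Documents field expects a JSON array of strings, it can help to generate that value programmatically rather than typing it by hand. The sketch below is one possible way to do so; the helper name `build_rerank_fields` and the sample query and documents are made up for illustration, and the dictionary keys simply mirror the field labels described above.

```python
import json


def build_rerank_fields(query, documents, model="rerank-1"):
    """Return values suitable for the Query, Documents, and Model fields.

    Hypothetical helper for illustration only.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    if not all(isinstance(doc, str) for doc in documents):
        raise TypeError("every document must be a string")
    return {
        "Query": query,
        # json.dumps escapes quotes and special characters inside documents
        "Documents": json.dumps(documents),
        "Model": model,
    }


documents = [
    "Embeddings map text to vectors.",
    'Rerankers score "query, document" pairs.',
]
fields = build_rerank_fields("How does reranking work?", documents)
print(fields["Documents"])
```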

Endpoint Testing

Once the selected endpoint template has been configured, Nexla can retrieve a sample of the data that will be fetched according to the current settings. This allows users to verify that the source is configured correctly before saving.

  • To test the current endpoint configuration, click the Test button to the right of the endpoint selection menu. Sample data will be fetched & displayed in the Endpoint Test Result panel on the right.

  • If the sample data is not as expected, review the selected endpoint and associated settings, and make any necessary adjustments. Then, click the Test button again, and check the sample data to ensure that the correct information is displayed.

Configure Manually

Voyage AI data sources can be manually configured to ingest data from any valid Voyage AI API endpoint. Manual configuration provides maximum flexibility for accessing endpoints not covered by pre-built templates or when you need custom API configurations.

With manual configuration, you can also create more complex Voyage AI sources, such as sources that use chained API calls to fetch data from multiple endpoints or sources that require custom authentication headers or request parameters.

API Method

  1. To manually configure this source, select the Advanced tab at the top of the configuration screen.

  2. Select the API method that will be used for calls to the Voyage AI API from the Method pulldown menu. Most Voyage AI endpoints use POST:

    • POST: For sending embedding or reranking requests to the API

API Endpoint URL

  1. Enter the URL of the Voyage AI API endpoint from which this source will fetch data in the Set API URL field. This should be the complete URL including the protocol (https://) and any required path parameters. Voyage AI API endpoints typically follow the pattern {base_url}/{api_version}/{endpoint}, where {base_url} is typically https://api.voyageai.com and {api_version} is v1.

Ensure the API endpoint URL is correct and accessible with your current credentials. You can test the endpoint using the Test button after configuring the URL. The endpoint URL should use the base URL and API version configured in your credential. Common Voyage AI endpoints include /{api_version}/embeddings for generating embeddings and /{api_version}/rerank for reranking documents.
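The URL pattern described above can be sketched as a small helper. The default base URL and API version are the typical values named in this section; the function name `build_endpoint_url` is hypothetical, and your credential may use different values.

```python
def build_endpoint_url(endpoint, base_url="https://api.voyageai.com", api_version="v1"):
    """Join base URL, API version, and endpoint into a complete URL.

    Hypothetical helper for illustration only.
    """
    parts = [base_url.rstrip("/"), api_version.strip("/"), endpoint.strip("/")]
    return "/".join(parts)


embeddings_url = build_endpoint_url("embeddings")
rerank_url = build_endpoint_url("/rerank")
print(embeddings_url)
print(rerank_url)
```

Stripping stray slashes from each part avoids doubled separators when a value is pasted with a leading or trailing `/`.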

Path to Data

Optional

If only a subset of the data returned by the API endpoint is needed, you can designate the part(s) of the response that should be included in the Nexset(s) produced from this source by specifying the path to the relevant data within the response. This is particularly useful when API responses contain metadata, pagination information, or other data that you do not need.

Path to Data is essential when API responses have nested structures. Without specifying the correct path, Nexla might not be able to properly parse and organize your data into usable records. For Voyage AI API responses, common paths include $.data[*] for arrays of embeddings or reranked documents.

  • To specify which data should be treated as relevant in responses from this source, enter the path to the relevant data in the Set Path to Data in Response field.

    • For responses in JSON format, enter the JSON path that points to the object or array that should be treated as relevant data. JSON paths use dot notation (e.g., $.data to access the data array).
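To make the effect of a path such as $.data[*] concrete, the sketch below applies a simplified path resolver to a sample response. The response shape and all values are illustrative assumptions, not real API output, and `select_records` is a hypothetical helper that handles only plain dot notation with an optional trailing [*]; it is not Nexla's path engine.

```python
# Illustrative response: the shape is assumed and the values are made up.
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.01, -0.02, 0.03]},
        {"object": "embedding", "index": 1, "embedding": [0.04, 0.05, -0.06]},
    ],
    "model": "voyage-3.5",
    "usage": {"total_tokens": 8},
}


def select_records(response, path):
    """Resolve a simple path like $.data or $.data[*] to a list of records."""
    keys = [key for key in path.lstrip("$").split(".") if key]
    node = response
    for key in keys:
        if key.endswith("[*]"):
            key = key[:-3]  # drop the wildcard; the list is expanded below
        node = node[key]
    # A list becomes one record per element; anything else is one record.
    return node if isinstance(node, list) else [node]


records = select_records(sample_response, "$.data[*]")
print(len(records), "records; first index:", records[0]["index"])
```

With this path, the `model` and `usage` metadata are left out and each element of `data` becomes its own record.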

Request Headers

Optional
  • If Nexla should include any additional request headers in API calls to this source, enter the headers & corresponding values as comma-separated pairs in the Request Headers field (e.g., header1:value1,header2:value2). Additional headers are often required for API versioning, content type specifications, or custom authentication requirements.

    You do not need to include any headers already present in the credentials. Common headers like Authorization, Content-Type, and Accept are typically handled automatically by Nexla based on your credential configuration. For Voyage AI, the Authorization header with Bearer token is automatically included from your credential, and Content-Type is typically set to application/json for API requests.
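The comma-separated format for the Request Headers field can be checked before pasting it in. The sketch below parses a string in the header1:value1,header2:value2 form into a dictionary; `parse_request_headers` is a hypothetical helper and does not reflect how Nexla itself parses the field. It assumes header values contain no commas.

```python
def parse_request_headers(field_value):
    """Parse 'name1:value1,name2:value2' into a dict of headers.

    Hypothetical helper for illustration only.
    """
    headers = {}
    for pair in field_value.split(","):
        if not pair.strip():
            continue  # ignore empty segments such as a trailing comma
        name, separator, value = pair.partition(":")
        if not separator or not name.strip():
            raise ValueError(f"expected name:value, got {pair!r}")
        headers[name.strip()] = value.strip()
    return headers


headers = parse_request_headers("header1:value1, header2:value2")
print(headers)
```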

Request Body

Optional
  • If the API endpoint requires a request body (which is common for POST requests to Voyage AI), enter the request body in the Request Body field. The request body should be formatted as JSON and include the necessary parameters for the embedding or reranking request, such as the input text, model, and other optional parameters.

    For Voyage AI embedding requests, the request body typically includes an input field (text or array of texts) and a model field (e.g., "voyage-3.5"). For reranking requests, the request body typically includes a query field, a documents array, and a model field. Refer to the Voyage AI API documentation for the complete list of supported parameters.
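Putting the method, URL, headers, and body together, the sketch below assembles (but does not send) the kind of POST request described in this section, using only the Python standard library. The URLs use the typical base URL and version mentioned above, `YOUR_API_KEY` is a placeholder, and `build_request` is a hypothetical helper. In Nexla, the Authorization header comes from the credential, so it is shown here only to illustrate the equivalent raw request.

```python
import json
import urllib.request


def build_request(url, body, api_key):
    """Assemble a JSON POST request without sending it (illustration only)."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


embedding_request = build_request(
    "https://api.voyageai.com/v1/embeddings",
    {"input": ["text to embed"], "model": "voyage-3.5"},
    api_key="YOUR_API_KEY",
)
rerank_request = build_request(
    "https://api.voyageai.com/v1/rerank",
    {"query": "search query", "documents": ["doc one", "doc two"], "model": "rerank-1"},
    api_key="YOUR_API_KEY",
)
print(embedding_request.get_method(), embedding_request.full_url)
print(rerank_request.data.decode("utf-8"))
```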

Endpoint Testing

After configuring all settings for the selected endpoint, Nexla can retrieve a sample of the data that will be fetched according to the current configuration. This allows users to verify that the source is configured correctly before saving.

  • To test the current endpoint configuration, click the Test button to the right of the endpoint selection menu. Sample data will be fetched & displayed in the Endpoint Test Result panel on the right.

  • If the sample data is not as expected, review the selected endpoint and associated settings, and make any necessary adjustments. Then, click the Test button again, and check the sample data to ensure that the correct information is displayed.

Save & Activate the Source

  1. Once all of the relevant steps in the above sections have been completed, click the Create button in the upper right corner of the screen to save and create the new Voyage AI data source. Nexla will now begin ingesting data from the configured endpoint and will organize any data that it finds into one or more Nexsets.