
Cohere AI Data Source

The Cohere AI connector enables you to ingest data from Cohere's language models, allowing you to generate text completions, create embeddings, and interact with Cohere's AI models. This connector is particularly useful for applications that need to extract AI-generated content, analyze embeddings, or process language model responses.

Cohere AI

Follow the instructions below to create a new data flow that ingests data from a Cohere AI source in Nexla.

Create a New Data Flow

  1. To create a new data flow, navigate to the Integrate section, and click the New Data Flow button. Then, select the desired flow type from the list, and click the Create button.

  2. Select the Cohere AI connector tile from the list of available connectors. Then, select the credential that will be used to connect to the Cohere AI instance, and click Next; or, create a new Cohere AI credential for use in this flow.

  3. In Nexla, Cohere AI data sources can be created using pre-built endpoint templates, which expedite source setup for common Cohere AI endpoints. Each template is designed specifically for the corresponding Cohere AI endpoint, making source configuration easy and efficient.
    • To configure this source using a template, follow the instructions in Configure Using a Template.

    Cohere AI sources can also be configured manually, allowing you to ingest data from Cohere AI endpoints not included in the pre-built templates or apply further customizations to exactly suit your needs.
    • To configure this source manually, follow the instructions in Configure Manually.

Configure Using a Template

Nexla provides pre-built templates that can be used to rapidly configure data sources to ingest data from common Cohere AI endpoints. Each template is designed specifically for the corresponding Cohere AI endpoint, making data source setup easy and efficient.

Endpoint Settings

  • From the Endpoint pulldown menu, select the endpoint from which this source will fetch data. Available endpoint templates are listed in the expandable boxes below. Click on an endpoint to see more information about it and how to configure your data source for this endpoint.

    Generate Text

    This endpoint generates text completions using Cohere's language models. Use this endpoint when you need to generate text, complete prompts, or create AI-generated content using Cohere's models.

    • Enter the model name in the Model field. Common Cohere models include command, command-light, command-nightly, and command-light-nightly. The default is command.
    • Enter the prompt text in the Prompt field. This is the input text that Cohere will use to generate the completion.
    • Optionally, enter the maximum number of tokens to generate in the Max Tokens field. The default is 300. This controls the length of the generated text.
    • Optionally, enter the temperature value in the Temperature field. The default is 0.75. Temperature controls randomness (0-5), where lower values are more deterministic and higher values are more creative.
    • Optionally, enter the k value in the k field. The default is 0. This parameter limits sampling to the k most likely tokens at each step (top-k sampling); with the default of 0, no top-k limit is applied.
    • Optionally, enter the p (top-p) value in the Top P field. The default is 1. This parameter specifies the cumulative probability for nucleus sampling (0-1).
    • Optionally, enter stop sequences in the Stop Sequences field as an array. The default is an empty array. Stop sequences are text patterns where generation will stop.
    • Enter a schedule in the Schedule field to specify when this data source should run. The schedule uses cron expression format.

    The generate endpoint is useful for creating text completions, summaries, and other text generation tasks. Temperature and top-p parameters control the creativity and randomness of the generated text. For complete information about text generation, see the Cohere API Documentation.
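
To make the sampling parameters more concrete, the following is a minimal sketch of how temperature, top-k, and top-p filtering are commonly applied to a token probability distribution. It illustrates the general technique only and is not Cohere's implementation; the function names and the toy probabilities are assumptions made for this example.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("this sketch requires temperature > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def _renormalize(probs, keep):
    """Zero out tokens not in `keep`, then rescale so the result sums to 1."""
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def top_k_filter(probs, k):
    """Keep only the k most likely tokens; k <= 0 leaves the distribution unchanged."""
    if k <= 0 or k >= len(probs):
        return list(probs)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalize(probs, set(order[:k]))


def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return _renormalize(probs, keep)


# Toy distribution over four tokens (illustrative values only).
toy_probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_filter(toy_probs, 2))
print(top_p_filter(toy_probs, 0.75))
```

In this toy case both filters keep the same two tokens: top-k because k is 2, and top-p because the two most likely tokens together already cover at least 0.75 of the probability mass.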

    Chat Completion

    This endpoint generates conversational responses using Cohere's chat models. Use this endpoint when you need to have conversations with Cohere's models, build chatbots, or create interactive AI applications.

    • Enter the model name in the Model field. Common Cohere chat models include command-r, command-r-plus, and other command variants. The default is command-r.
    • Enter the message text in the Message field. This is the user message that Cohere will respond to in the conversation.
    • Enter a schedule in the Schedule field to specify when this data source should run. The schedule uses cron expression format.

    The chat endpoint is designed for conversational interactions and maintains context better than the generate endpoint. Chat models are optimized for multi-turn conversations. For complete information about chat completions, see the Cohere API Documentation.

    Generate Embeddings

    This endpoint generates embeddings (vector representations) for input texts using Cohere's embedding models. Use this endpoint when you need to create embeddings for semantic search, similarity matching, or machine learning applications.

    • Enter the model name in the Model field. Common Cohere embedding models include embed-english-v3.0, embed-multilingual-v3.0, and other embedding variants. The default is embed-english-v3.0.
    • Enter the input type in the Input Type field. Common values include search_document for documents to be searched, search_query for search queries, classification for classification tasks, and clustering for clustering tasks. The default is search_document.
    • Enter the texts to embed in the Texts field as a JSON array of strings. The default format is a JSON array like ["hello", "goodbye"]. Each string in the array will be embedded separately.
    • Enter a schedule in the Schedule field to specify when this data source should run. The schedule uses cron expression format.

    Embeddings are vector representations of text that capture semantic meaning. They are useful for semantic search, similarity matching, and machine learning applications. The input type parameter helps optimize embeddings for different use cases. For complete information about embeddings, see the Cohere API Documentation.
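
As an illustration of similarity matching, the sketch below ranks documents against a query by cosine similarity. The short vectors are toy values chosen for this example (real embeddings have many more dimensions), and the function names are assumptions rather than part of Nexla or the Cohere API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for one search_query embedding and three search_document embeddings.
query = [1.0, 0.0]
documents = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]]
print(rank_by_similarity(query, documents))
```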

Endpoint Testing

Once the selected endpoint template has been configured, Nexla can retrieve a sample of the data that will be fetched according to the current settings. This allows users to verify that the source is configured correctly before saving.

  • To test the current endpoint configuration, click the Test button to the right of the endpoint selection menu. Sample data will be fetched and displayed in the Endpoint Test Result panel on the right.

  • If the sample data is not as expected, review the selected endpoint and associated settings, and make any necessary adjustments. Then, click the Test button again, and check the sample data to ensure that the correct information is displayed.

Configure Manually

Cohere AI data sources can be manually configured to ingest data from any valid Cohere AI API endpoint. Manual configuration provides maximum flexibility for accessing endpoints not covered by pre-built templates or when you need custom API configurations.

With manual configuration, you can also create more complex Cohere AI sources, such as sources that use chained API calls to fetch data from multiple endpoints or sources that require custom authentication headers or request parameters.

API Method

  1. To manually configure this source, select the Advanced tab at the top of the configuration screen.

  2. Select the API method that will be used for calls to the Cohere AI API from the Method pulldown menu. The most common method is:

    • POST: For generating text, creating embeddings, or chat completions (most common for Cohere AI)

API Endpoint URL

  1. Enter the URL of the Cohere AI API endpoint from which this source will fetch data in the Set API URL field. This should be the complete URL including the protocol (https://) and any required path parameters.

Cohere API URLs typically follow these formats:

    • https://api.cohere.com/v1/generate for text generation
    • https://api.cohere.com/v1/chat for chat completions
    • https://api.cohere.com/v1/embed for embeddings

If your credentials specify a different base URL or API version, use those values instead. Ensure that the API endpoint URL is correct and accessible with your current credentials; you can test the endpoint using the Test button after configuring the URL. For complete information about Cohere API endpoints, see the Cohere API Documentation.

Request Headers

  1. If Nexla should include any additional request headers in API calls to this source, enter the headers and corresponding values as comma-separated pairs in the Request Headers field (e.g., header1:value1,header2:value2).

You do not need to include authentication headers (Authorization: Bearer {key}) as these are automatically included from your credentials. However, you may need to include additional headers for specific Cohere API features. The Content-Type header should be set to application/json for most Cohere API requests.
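
To check a header string before pasting it into the Request Headers field, a small script along these lines can help. It is only a sketch of one plausible reading of the header1:value1,header2:value2 format; how Nexla actually parses the field is not described here, and parse_request_headers is a hypothetical helper written for this example.

```python
def parse_request_headers(raw):
    """Split 'name:value,name:value' into a dict, splitting each pair on its first colon."""
    headers = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma or empty input
        name, separator, value = pair.partition(":")
        if not separator or not name.strip():
            raise ValueError(f"Malformed header pair: {pair!r}")
        headers[name.strip()] = value.strip()
    return headers


# 'header2' and 'value2' are the placeholder names from the field's example format.
print(parse_request_headers("Content-Type:application/json,header2:value2"))
```

Because pairs are separated by commas, this sketch cannot represent a header value that itself contains a comma.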

Request Body

  1. For POST requests, enter the request body in the Request Body field. The request body should be in JSON format and include the parameters required by the Cohere API endpoint you're calling.

Cohere API request bodies vary by endpoint. For generate endpoints, include parameters like model, prompt, max_tokens, and temperature. For embed endpoints, include model, texts, and input_type. For complete information about request formats, see the Cohere API Documentation.
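
As a rough illustration, the sketch below builds JSON request bodies for the generate and embed endpoints using only the parameters named above. The default values mirror the template defaults described earlier in this guide, the helper names are hypothetical, and the prompt and texts are placeholder inputs.

```python
import json


def build_generate_body(prompt, model="command", max_tokens=300, temperature=0.75):
    """Request body for a generate call, limited to the parameters named in this guide."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def build_embed_body(texts, model="embed-english-v3.0", input_type="search_document"):
    """Request body for an embed call; texts must be a list of strings."""
    if not all(isinstance(t, str) for t in texts):
        raise TypeError("texts must contain only strings")
    return {"model": model, "texts": list(texts), "input_type": input_type}


# Serialize to the JSON text that would be pasted into the Request Body field.
print(json.dumps(build_generate_body("Write a haiku about data pipelines."), indent=2))
print(json.dumps(build_embed_body(["hello", "goodbye"]), indent=2))
```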

Response Data Path

  1. In the Response Data Path field, enter the JSON path expression that identifies the location of the data array in the API response. This path tells Nexla where to find the array of records in the JSON response.

For Cohere API responses, the data path varies by endpoint. For generate responses, use $.generations[*].text to extract individual generated text completions. For embed responses, use $.embeddings to extract the embeddings array. For chat responses, use $.text to extract the chat response. JSON path expressions use dot notation and array indexing to navigate the response structure. For complete information about Cohere API response formats, see the Cohere API Documentation.
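
To show what each path selects, the sketch below applies the equivalent lookups in plain Python. The sample responses are hypothetical and trimmed to just the fields that the paths reference; real responses contain additional fields.

```python
# Hypothetical, trimmed-down responses shaped to match the paths described above.
generate_response = {"generations": [{"text": "First completion"}, {"text": "Second completion"}]}
embed_response = {"embeddings": [[0.1, 0.2], [0.3, 0.4]]}
chat_response = {"text": "Hello! How can I help?"}


def extract_generation_texts(response):
    """Equivalent of $.generations[*].text: one record per generated completion."""
    return [generation["text"] for generation in response["generations"]]


def extract_embeddings(response):
    """Equivalent of $.embeddings: the list of embedding vectors."""
    return response["embeddings"]


def extract_chat_text(response):
    """Equivalent of $.text: the single chat reply."""
    return response["text"]


print(extract_generation_texts(generate_response))
print(extract_embeddings(embed_response))
print(extract_chat_text(chat_response))
```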

Schedule

  1. Enter a schedule in the Schedule field to specify when this data source should run. The schedule uses cron expression format to define the frequency and timing of data ingestion.

Common cron expressions include: 0 6 * * * for daily at 6 AM, 0 */6 * * * for every 6 hours, and 0 0 * * 0 for weekly on Sunday at midnight. For more information about cron expressions, see the Nexla documentation on scheduling.
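
To sanity-check when an expression would fire, the sketch below tests a timestamp against a five-field cron expression (minute, hour, day of month, month, day of week). It assumes the standard five-field layout used in the examples above and supports only *, */n, and single numbers; Nexla's scheduler may accept a different or richer syntax, and the function names are hypothetical.

```python
from datetime import datetime


def field_matches(field, value):
    """Match one cron field against a value; supports '*', '*/n', and a single number."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return int(field) == value


def cron_matches(expression, moment):
    """Return True if the datetime falls on a minute selected by the expression."""
    minute, hour, day, month, weekday = expression.split()
    cron_weekday = moment.isoweekday() % 7  # cron counts Sunday as 0
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, cron_weekday)
    )


# 7 January 2024 was a Sunday.
print(cron_matches("0 6 * * *", datetime(2024, 1, 1, 6, 0)))
print(cron_matches("0 */6 * * *", datetime(2024, 1, 1, 12, 0)))
print(cron_matches("0 0 * * 0", datetime(2024, 1, 7, 0, 0)))
```

This simplified matcher requires every field to match, which is enough for the three expressions above but does not reproduce how some cron implementations combine day-of-month and day-of-week when both are restricted.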