
Google Gemini Data Source

The Google Gemini connector enables you to interact with Google's Gemini language models through the Gemini API, allowing you to generate content, analyze text, and leverage AI-powered capabilities in your data workflows. This connector is particularly useful for applications that need to generate text content, perform language analysis, or integrate AI capabilities into data processing pipelines. Follow the instructions below to create a new data flow that ingests data from a Google Gemini source in Nexla.

Google Gemini

Create a New Data Flow

  1. To create a new data flow, navigate to the Integrate section, and click the New Data Flow button. Then, select the desired flow type from the list, and click the Create button.

  2. Select the Google Gemini connector tile from the list of available connectors. Then, select the credential that will be used to connect to the Google Gemini instance, and click Next; or, create a new Google Gemini credential for use in this flow.

  3. In Nexla, Google Gemini data sources can be created using pre-built endpoint templates, which expedite source setup for common Google Gemini endpoints. Each template is designed specifically for the corresponding Google Gemini endpoint, making source configuration easy and efficient.
    • To configure this source using a template, follow the instructions in Configure Using a Template.

    Google Gemini sources can also be configured manually, allowing you to ingest data from Google Gemini endpoints not included in the pre-built templates or apply further customizations to exactly suit your needs.
    • To configure this source manually, follow the instructions in Configure Manually.

Configure Using a Template

Nexla provides pre-built templates that can be used to rapidly configure data sources to ingest data from common Google Gemini endpoints. Each template is designed specifically for the corresponding Google Gemini endpoint, making data source setup easy and efficient.

Endpoint Settings

  • Select the endpoint from which this source will fetch data from the Endpoint pulldown menu. Available endpoint templates are listed in the expandable boxes below. Click on an endpoint to see more information about it and how to configure your data source for this endpoint.

    Generate Content

    This endpoint generates content using Google's Gemini language models. Use this endpoint when you need to generate text, analyze content, or leverage AI capabilities for content creation and analysis.

    • Enter the model name in the Model field. The default value is gemini-pro, but you can specify other available Gemini models such as gemini-pro-vision for multimodal capabilities.
    • Enter the prompt or query you want to send to the model in the Message field. This is the text input that the model will process and respond to.
    • Enter the temperature value in the Temperature field. Temperature controls the randomness and creativity of the model's output by adjusting the probability distribution of token selection. Lower values (e.g., 0.1-0.3) produce more focused, deterministic, and factual responses - ideal for tasks requiring accuracy like data extraction or summarization. Higher values (e.g., 0.7-1.0) produce more creative, varied, and exploratory responses - ideal for creative writing or brainstorming. The default is 0.3, which favors consistent output while still allowing some variation.
    • Enter the Top-P value in the Top-P field. Top-P (nucleus sampling) controls the diversity of token selection by restricting sampling to the smallest set of most likely tokens whose cumulative probability reaches the specified threshold. Higher values (close to 1) increase diversity by considering more token options; lower values (around 0.5) make the model more conservative by focusing on the most likely tokens. The default is 1, which applies no Top-P filtering and allows maximum diversity. Use lower Top-P values when you need more predictable outputs.
    • Enter the Top-K value in the Top-K field. Top-K limits the number of top tokens the model considers for selection at each step. Higher values (~100) increase diversity by allowing more token options; lower values (~1) make the model more focused by considering only the most likely tokens. The default is 32, which provides a good balance. Note that very high Top-K values may impact response quality, so adjust carefully based on your use case.
    • Enter the maximum number of tokens in the Max tokens model param field. This limits the length of the generated response and helps control API costs. The default is 2048 tokens, which is roughly 1,200-1,600 English words depending on the content (Google estimates 100 tokens at about 60-80 English words). For longer responses, you can increase this value, but be aware that longer responses take more time to generate and consume more API quota. For shorter, more concise responses, decrease this value.
    • Enter the desired MIME type for the response in the Response MIME Type field. This determines the format of the generated content. Use text/plain for plain text responses (default), application/json for JSON-formatted responses, or other MIME types as needed for your use case. The MIME type affects how the response is structured and can be useful for integrating with systems that expect specific data formats.
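
The effect of the Temperature, Top-K, and Top-P fields described above can be illustrated with a toy token distribution. The sketch below is not Gemini's actual sampler; the function name `filter_distribution`, the example logits, and the order in which the filters are applied are assumptions made for illustration only.

```python
import math

def filter_distribution(logits, temperature=0.3, top_k=32, top_p=1.0):
    """Toy illustration: token probabilities left after temperature, Top-K, and Top-P."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0 in this sketch")

    # Temperature rescales the logits before softmax; lower values sharpen the distribution.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    probs = {tok: weight / total for tok, weight in weights.items()}

    # Top-K: keep only the k most likely tokens.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]

    # Top-P: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}

# Hypothetical logits for three candidate tokens.
example_logits = {"a": 2.0, "b": 1.0, "c": 0.0}
focused = filter_distribution(example_logits, temperature=0.1)
varied = filter_distribution(example_logits, temperature=1.0)
```

In this toy setup, lowering the temperature concentrates probability on the most likely token, while reducing Top-K or Top-P removes less likely tokens from consideration entirely.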

    The Generate Content endpoint uses POST requests to send prompts to the Gemini model. The model processes the input and generates a response based on the provided parameters. Adjust temperature, Top-P, and Top-K values based on your use case: use lower values for factual content, data extraction, summarization, and technical documentation, and use higher values for creative writing, brainstorming, and exploratory content generation. The combination of these parameters allows you to fine-tune the model's output to match your specific requirements. Experiment with different values to find the optimal settings for your use case.
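
As a rough sketch of what such a POST request carries, the function below assembles a request body from the template fields and their documented defaults. The field names under `generationConfig` follow the public Gemini REST API; how Nexla maps its template fields onto the request internally is an assumption, and `build_generate_content_body` is a hypothetical helper name.

```python
import json

def build_generate_content_body(message, temperature=0.3, top_p=1.0, top_k=32,
                                max_output_tokens=2048,
                                response_mime_type="text/plain"):
    """Build a generateContent request body from the template fields (illustrative)."""
    return {
        # The Message field becomes a single text part of the prompt.
        "contents": [{"parts": [{"text": message}]}],
        # Sampling and output settings, using the defaults listed in the template.
        "generationConfig": {
            "temperature": temperature,
            "topP": top_p,
            "topK": top_k,
            "maxOutputTokens": max_output_tokens,
            "responseMimeType": response_mime_type,
        },
    }

# Example: a low-temperature request suited to summarization.
body = build_generate_content_body(
    "Summarize the following text in two sentences: ...",
    temperature=0.2,
)
payload = json.dumps(body)  # JSON string that would be sent as the POST body
```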

Endpoint Testing

Once the selected endpoint template has been configured, Nexla can retrieve a sample of the data that will be fetched according to the current settings. This allows users to verify that the source is configured correctly before saving.

  • To test the current endpoint configuration, click the Test button to the right of the endpoint selection menu. Sample data will be fetched and displayed in the Endpoint Test Result panel on the right.

  • If the sample data is not as expected, review the selected endpoint and associated settings, and make any necessary adjustments. Then, click the Test button again, and check the sample data to ensure that the correct information is displayed.

Configure Manually

Google Gemini data sources can be manually configured to ingest data from any valid Google Gemini API endpoint. Manual configuration provides maximum flexibility for accessing endpoints not covered by pre-built templates or when you need custom API configurations.

With manual configuration, you can also create more complex Google Gemini sources, such as sources that use chained API calls to fetch data from multiple endpoints or sources that require custom authentication headers or request parameters.

API Method

  1. To manually configure this source, select the Advanced tab at the top of the configuration screen.

  2. Select the API method that will be used for calls to the Google Gemini API from the Method pulldown menu. The most common methods are:

    • GET: For retrieving data from the API
    • POST: For sending data to the API or triggering actions (most Gemini endpoints use POST)
    • PUT: For updating existing data
    • PATCH: For partial updates to existing data
    • DELETE: For removing data

API Endpoint URL

  1. Enter the URL of the Google Gemini API endpoint from which this source will fetch data in the Set API URL field. This should be the complete URL including the protocol (https://) and any required path parameters. Google Gemini API endpoints typically follow the pattern https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent.

Ensure the API endpoint URL is correct and accessible with your current credentials. You can test the endpoint using the Test button after configuring the URL. The URL should include the model name and action (e.g., generateContent).
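
The URL pattern above can be filled in programmatically. This minimal sketch assumes the `v1beta` base URL shown in the pattern; the helper name `gemini_endpoint_url` is hypothetical.

```python
from urllib.parse import quote

# Base URL taken from the pattern in the text above.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"

def gemini_endpoint_url(model="gemini-pro", action="generateContent"):
    """Return the full endpoint URL for a model name and action."""
    # Percent-encode the model name so unexpected characters cannot alter the path.
    return f"{BASE_URL}/models/{quote(model, safe='')}:{action}"

url = gemini_endpoint_url("gemini-pro")
```

The resulting string is what would be entered in the Set API URL field for the chosen model.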

Path to Data

Optional

If only a subset of the data returned by the API endpoint is needed, you can designate the part(s) of the response that should be included in the Nexset(s) produced from this source by specifying the path to the relevant data within the response. This is particularly useful when API responses contain metadata, pagination information, or other data that you don't need for your analysis.

For example, when an API call is used to fetch a list of items, the API will typically return an array of records, along with metadata, in the response. By entering the path to the relevant data, you can configure Nexla to treat each element of the returned array as a record.

Path to Data is essential when API responses have nested structures. Without specifying the correct path, Nexla might not be able to properly parse and organize your data into usable records. For Gemini API responses, the generated content is typically located in $.candidates[*].content.parts[*].text for text responses.

  • To specify which data should be treated as relevant in responses from this source, enter the path to the relevant data in the Set Path to Data in Response field.

    • For responses in JSON format, enter the JSON path that points to the object or array that should be treated as relevant data. JSON paths use dot notation, with [*] matching every element of an array (e.g., $.candidates[*].content.parts[*] to access content parts from Gemini responses).

    • For responses in XML format, enter the XPath that points to the object/array containing relevant data. XPath uses slash notation (e.g., /response/candidates/candidate/content/parts/part to access part elements).
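
To show what the path $.candidates[*].content.parts[*].text selects, the sketch below walks a sample response by hand. The sample response is made-up illustrative data shaped like a Gemini generateContent response, and `extract_texts` is a hypothetical helper; Nexla's own path evaluation is not shown here.

```python
def extract_texts(response):
    """Collect every text part, mirroring $.candidates[*].content.parts[*].text."""
    texts = []
    for candidate in response.get("candidates", []):            # candidates[*]
        parts = candidate.get("content", {}).get("parts", [])   # content.parts
        for part in parts:                                       # parts[*]
            if "text" in part:                                   # .text
                texts.append(part["text"])
    return texts

# Illustrative response: one candidate with two text parts, plus metadata
# that the path deliberately leaves out.
sample_response = {
    "candidates": [
        {
            "content": {"role": "model", "parts": [{"text": "Hello"}, {"text": " world"}]},
            "finishReason": "STOP",
        }
    ],
    "usageMetadata": {"promptTokenCount": 3, "candidatesTokenCount": 2},
}

texts = extract_texts(sample_response)
```

Only the text parts are selected; fields such as finishReason and usageMetadata fall outside the path and would not appear in the resulting records.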

Request Headers

Optional
  • If Nexla should include any additional request headers in API calls to this source, enter the headers and corresponding values as comma-separated pairs in the Request Headers field (e.g., header1:value1,header2:value2). Additional headers are often required for API versioning, content type specifications, or custom authentication requirements.

    You do not need to include any headers already present in the credentials. Common headers like Authorization, Content-Type, and Accept are typically handled automatically by Nexla based on your credential configuration. Google Gemini API uses Content-Type: application/json for request bodies.
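
The comma-separated format described above can be read as a simple name-to-value mapping. The sketch below shows one plausible interpretation; how Nexla actually parses the Request Headers field is an assumption, and `parse_request_headers` is a hypothetical helper.

```python
def parse_request_headers(raw):
    """Parse 'header1:value1,header2:value2' into a dictionary (illustrative)."""
    headers = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # ignore empty segments such as a trailing comma
        # Split on the first colon only, so values may themselves contain colons.
        name, separator, value = pair.partition(":")
        if not separator or not name.strip():
            raise ValueError(f"Expected 'name:value', got {pair!r}")
        headers[name.strip()] = value.strip()
    return headers

headers = parse_request_headers("header1:value1,header2:value2")
```

Under this reading, a header value that itself contains a comma would be split incorrectly, so such values are best avoided in this field.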

Endpoint Testing

After configuring all settings for the selected endpoint, Nexla can retrieve a sample of the data that will be fetched according to the current configuration. This allows users to verify that the source is configured correctly before saving.

  • To test the current endpoint configuration, click the Test button to the right of the endpoint selection menu. Sample data will be fetched and displayed in the Endpoint Test Result panel on the right.

  • If the sample data is not as expected, review the selected endpoint and associated settings, and make any necessary adjustments. Then, click the Test button again, and check the sample data to ensure that the correct information is displayed.

Save & Activate the Source

  1. Once all of the relevant steps in the above sections have been completed, click the Create button in the upper right corner of the screen to save and create the new Google Gemini data source. Nexla will now begin ingesting data from the configured endpoint and will organize any data that it finds into one or more Nexsets.