Querying Data Using GenAI
Expedite data analysis by directly querying your data with the RAG-based Nexla GenAI chatbot, leveraging the power of GenAI and LLMs to gain insights based on both the data itself and its contextual relationships.
The Nexla chatbot uses retrieval-augmented generation (RAG), combining the power of search algorithms with the capabilities of large language models (LLMs), to provide an interface wherein users can query their data in documents, databases, and API-based systems. The chatbot works with a wide variety of LLMs and can be used to rapidly obtain accurate insights and supporting information from any data.
Tutorial Goal
In this tutorial, we will use the Nexla chatbot to query a Nexset containing open-source data about legal filings concerning FTX Trading. The steps covered include:
- Initializing a chatbot session using a Nexla service key
- Selecting the Nexset and LLMs that will be used for the query
- Using the chatbot to determine when and where the initial legal proceedings were filed
Although specific example data, LLMs, and prompts are used in this tutorial, the process outlined provides a guide that can be used to query any available data using any supported LLM(s).
Prerequisites & Information
Nexla Chatbot: How It Works
The Nexla chatbot enables users to obtain rapid, accurate insights from data via direct queries by combining the power of RAG and LLMs. These queries can include data from one or more Nexsets and can include data from various sources and in multiple formats.
Within the chatbot UI, users first select the Nexset(s) containing relevant data and the LLM(s) that will respond to submitted prompts. Once a query prompt is submitted to the chatbot, the chosen LLM(s) analyze the selected data, including any available metadata and other contextual information, and display a response within the chat window.
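The retrieve-then-generate flow described above can be illustrated with a toy sketch. This is not Nexla's implementation: the record layout, the keyword-overlap scoring, the sample records, and the `retrieve` and `build_prompt` helpers are all hypothetical, and the final call to an LLM is omitted.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, records, top_k=2):
    """Rank records by word overlap with the question and keep the best top_k."""
    q_words = tokenize(question)
    ranked = sorted(
        records,
        key=lambda rec: len(q_words & tokenize(rec["text"])),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, context_records):
    """Combine the retrieved records and the question into one prompt string."""
    context = "\n".join(f"[{rec['id']}] {rec['text']}" for rec in context_records)
    return f"Context:\n{context}\n\nQuestion: {question}"

# Placeholder records standing in for Nexset data (not real filing data).
records = [
    {"id": 1, "text": "The complaint lists a filing date and a court district."},
    {"id": 2, "text": "Quarterly revenue summary for the sales team."},
    {"id": 3, "text": "The docket entry records the filing date of the motion."},
]

question = "What is the earliest filing date?"
selected = retrieve(question, records)
prompt = build_prompt(question, selected)
print(prompt)
```

In a real RAG system the retrieval step would typically use embeddings or a database query rather than word overlap, and the assembled prompt would be sent to the selected LLM.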
The Nexla chatbot maintains the same strict data isolation & access controls valued in the rest of the Nexla platform, ensuring data integrity and privacy.
Prerequisite: Nexset Preparation
Nexla's GenAI chatbot can be used to directly query Nexsets from REST API sources and data in SQL databases—no prior processing is required to prepare data from these sources for querying via the chatbot interface.
When using the Nexla chatbot to query REST API sources, the queries are constructed using macros.
To query data from SQL databases with the Nexla chatbot, the SQL data source must be created using Query Mode, and the query must follow the format {query = <SQL query>} (for example, {query = SELECT * FROM account_metrics_daily}).
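As a small illustration of the query format above, the following sketch wraps a SQL statement in the {query = <SQL query>} form. The helper name `format_sql_source_query` is hypothetical and is not part of any Nexla SDK; it only reproduces the string format shown in this section.

```python
def format_sql_source_query(sql):
    """Wrap a SQL statement in the {query = <SQL query>} form described above."""
    statement = sql.strip()
    if not statement:
        raise ValueError("SQL query must not be empty")
    return "{query = " + statement + "}"

# Example using the table name from this section.
formatted = format_sql_source_query("SELECT * FROM account_metrics_daily")
print(formatted)
```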
The Nexla chatbot can also be used to query data located in vector databases. Nexla users can easily create data flows that vectorize ingested data and then send that data to any vector database. Then, the prepared data can be directly queried via the chatbot interface, with answers provided based not only on the relevant data but also its contextual relationships with other data present in the database.
- To learn how to prepare & send data to a vector database, see the Sending Text Data to Vector Databases tutorial.
Prerequisite: Nexla Service Key
A Nexla service key is required to initialize a session with the Nexla chatbot.
Service keys are non-expiring ("forever") keys associated with your Nexla account. They are used to programmatically access Nexla and obtain a session token.
Account service keys are equivalent to your account password. These keys should be securely stored and treated as highly sensitive information.
To create a service key for your account, log into Nexla, and navigate to the Settings section. Then, open the Authentication screen, and click the Create Service Key button.
Copy & paste the generated service key into a secure location for use in the subsequent steps of this tutorial.
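If the service key is also used from scripts, one common way to keep it out of source code is to read it from an environment variable. The sketch below assumes a variable named `NEXLA_SERVICE_KEY`; that name and both helper functions are hypothetical conventions for this example, not something defined by Nexla.

```python
import os

def load_service_key(env_var="NEXLA_SERVICE_KEY"):
    """Read the service key from the environment instead of hardcoding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key

def mask(key, visible=4):
    """Return a redacted form of the key that is safe to print in logs."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

A script would call `load_service_key()` once at startup and log only `mask(key)`, never the key itself.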
Step 1: Initialize the Nexla Chatbot Session
To use the Nexla GenAI chatbot, each session must be initialized using the service key generated for your account.
Navigate to https://genai.nexla.com/.
Under Authorize in the toolbar on the left side of the screen, paste the account service key into the Service Key field. Then, click Initialize to authorize your account and begin the chatbot session.
Step 2: Select the Nexset(s) to Query
The Nexla chatbot can be used to query Nexsets containing data from vector databases, SQL databases, and/or REST APIs. Nexsets to be queried are selected within the chatbot UI.
Multiple Nexsets from the same or different source types can be queried in a single chatbot session, providing the ability to simultaneously query data from multiple sources.
The vector database, SQL database, or REST API to be queried must be added as a data source in your Nexla account.
Once the source is added & activated, Nexla will read and organize data from the source location according to the configured source settings. Ingested data is then organized into one or more detected Nexsets.
Both detected Nexsets and transformed Nexsets (created by transforming detected Nexsets in the Nexset Designer) can be queried using the Nexla chatbot.
Under the Select Nexsets heading in the toolbar on the left side of the screen, use the Search Nexsets field to locate the Nexset(s) to be queried. Nexsets can be located by searching with the Nexset ID or the full or partial Nexset name.
Click on a Nexset in the search results to select it for querying. The Nexset(s) selected for querying in the current session are shown under the Selected Nexsets heading.
- To query more than one Nexset, repeat the search & selection process described above to select additional Nexsets.
Select the Nexset from the Search Results
List of Selected Nexsets
Step 3: Select the LLM for Querying
The Nexla chatbot is integrated with a variety of LLM providers and models that can be selected for use when querying your Nexset data. Supported providers include Google, OpenAI, Databricks, and more. Users can select one or more LLM providers and LLM versions to be accessed during the chatbot session.
You can also integrate your own custom LLM using the REST API connector.
When selecting an LLM provider and version for use in your Nexla chatbot session, be sure to consider any applicable rate limits.
Rate limits vary among models. Often, larger models such as GPT-4o and Claude 3.5 limit users to a lower number of requests per day (e.g., 50), while smaller models such as GPT-4o mini allow a higher number of requests per day (e.g., 200). Typically, request counts are reset daily based on the user's service key.
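To make the daily-limit behavior concrete, here is a minimal client-side sketch of a per-day request budget. The `DailyRequestBudget` class is hypothetical, and the limit passed to it is only an example figure; actual limits are enforced by the service and depend on the model selected.

```python
from datetime import date

class DailyRequestBudget:
    """Track how many requests have been made today against a daily limit."""

    def __init__(self, limit):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_consume(self, today=None):
        """Record one request; return False if today's limit is already used up."""
        today = today or date.today()
        if today != self.day:
            # A new day has started, so the counter resets.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True

    def remaining(self):
        """Return the number of requests left for the current day."""
        return self.limit - self.used

# Example: a model limited to 50 requests per day (illustrative figure).
budget = DailyRequestBudget(limit=50)
if budget.try_consume():
    print(f"Request allowed, {budget.remaining()} left today")
```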
Check the Activate box next to each LLM provider that will be used to query your selected Nexsets. In this case, we'll select Anthropic and OpenAI.
For each LLM provider, select the specific LLM that you would like to use to query your data from the pulldown menu.
Step 4: Query the Nexset Data
When queries are submitted to the Nexla chatbot, the chosen LLM(s) will analyze the data in the selected Nexsets and provide a response within the chat window. For Nexsets from REST API sources and SQL databases, responses are provided based on the Nexset data and any included contextual information. For Nexsets from vector database sources, responses are provided based on a search of the vector DB location that includes contextual information from nearest neighbors/vectors.
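The nearest-neighbor idea behind vector database queries can be sketched with cosine similarity over a tiny in-memory store. This is a simplified illustration, not Nexla's or any vector database's actual search: the three-dimensional vectors, the chunk labels, and the `nearest_neighbors` helper are made up for the example, and real systems use learned embeddings and approximate indexes.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest_neighbors(query_vec, store, k=2):
    """Return the labels of the k stored vectors most similar to query_vec."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [label for label, _ in ranked[:k]]

# Toy store of (label, vector) pairs standing in for embedded text chunks.
store = [
    ("chunk_a", [1.0, 0.0, 0.0]),
    ("chunk_b", [0.9, 0.1, 0.0]),
    ("chunk_c", [0.0, 0.0, 1.0]),
]

neighbors = nearest_neighbors([1.0, 0.0, 0.0], store, k=2)
print(neighbors)
```

The chunks returned by such a search are what supply the LLM with context beyond an exact keyword match.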
Type your query into the chat text field at the bottom of the window, and hit Enter or click Send to submit the prompt to the chatbot. In this tutorial, we want to determine the earliest date on which proceedings were filed and will enter the query:
What is the earliest filing date?
Chatbot Queries: Query phrases submitted to the Nexla chatbot should be as specific as possible and include any needed contextual information, such as relevant dates, data to be excluded, etc.
Each LLM response is provided within the chat window. In this case, the Anthropic and OpenAI LLMs will use all of the information available in the selected Nexset to determine the earliest filing date mentioned in the source PDF.
Additional prompts can be entered to refine the query and/or obtain more information by repeating steps 1-2. For this tutorial, we'll enter the query:
In what district was the lawsuit filed?
In this tutorial, we used the Nexla chatbot to query a Nexset containing data about legal filings associated with FTX Trading, with the goal of determining when and where the first legal action was filed. The selected LLMs returned the answers to these example questions, providing the requested date and location along with additional contextual information.
The steps outlined in this tutorial can be used as a general guide for querying any available Nexset(s) with the Nexla GenAI chatbot.