Google Docs API Data Source

The Google Docs API connector enables you to ingest document content, structure, and metadata from Google Docs, including body text, formatted paragraphs, tables, named ranges, and revision identifiers. This connector is particularly useful for applications that need to extract document content for analysis, archive document data to a data warehouse, monitor changes to critical documents, or build automated document processing pipelines. Follow the instructions below to create a new data flow that ingests data from a Google Docs API source in Nexla.

Create a New Data Flow

  1. To create a new data flow, navigate to the Integrate section, and click the New Data Flow button. Then, select the desired flow type from the list, and click the Create button.

  2. Select the Google Docs API connector tile from the list of available connectors. Then, select the credential that will be used to connect to the Google Docs API instance, and click Next; or, create a new Google Docs API credential for use in this flow.

  3. In Nexla, Google Docs API data sources can be created using pre-built endpoint templates, which expedite source setup for common Google Docs API endpoints. Each template is designed specifically for the corresponding Google Docs API endpoint, making source configuration easy and efficient.
    • To configure this source using a template, follow the instructions in Configure Using a Template.

    Google Docs API sources can also be configured manually, allowing you to ingest data from Google Docs API endpoints not included in the pre-built templates or apply further customizations to exactly suit your needs.
    • To configure this source manually, follow the instructions in Configure Manually.

Configure Using a Template

Nexla provides pre-built templates that can be used to rapidly configure data sources to ingest data from common Google Docs API endpoints. Each template is designed specifically for the corresponding Google Docs API endpoint, making data source setup easy and efficient.

Endpoint Settings

  • From the Endpoint pulldown menu, select the endpoint from which this source will fetch data. The available endpoint templates are described below, along with the settings each one requires.

    Get Document

    Retrieves the full contents and metadata of a specific Google Docs document by its document ID. This is the primary endpoint for extracting the complete structure and content of a document, including body text, inline images, tables, lists, headers, footers, and named ranges. Use this endpoint when you need a comprehensive snapshot of a document for analysis, archiving, or downstream processing.

    • In the Document ID field, enter the unique identifier of the Google Docs document to retrieve. The document ID can be found in the document's URL:

      • For a URL such as https://docs.google.com/document/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms/edit, the document ID is the string between /d/ and /edit — in this example, 1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms.
    • The document must be accessible to the Google account used in the credential. Ensure the authenticated account has at least Viewer access to the document, or that the document is shared with the account's organization.
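
The rule above for locating the document ID can be sketched in Python. This is only an illustration: the function name `extract_document_id` is hypothetical, and the pattern assumes that document IDs consist of letters, digits, hyphens, and underscores.

```python
import re

# Captures the path segment that follows "/document/d/" in a Google Docs URL.
# Assumption: IDs contain only letters, digits, hyphens, and underscores.
DOC_ID_PATTERN = re.compile(r"/document/d/([A-Za-z0-9_-]+)")


def extract_document_id(url: str) -> str:
    """Return the document ID embedded in a Google Docs URL."""
    match = DOC_ID_PATTERN.search(url)
    if match is None:
        raise ValueError(f"No document ID found in URL: {url}")
    return match.group(1)


url = "https://docs.google.com/document/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms/edit"
print(extract_document_id(url))
# prints 1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms
```

The returned string is the value to enter in the Document ID field.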

    The Get Document endpoint returns the entire document structure as a JSON object, with the document body located at body.content. To extract only the structural elements of the document body, configure the source manually and enter $.body.content[*] in the Path to Data field. For complete response schema details, refer to the Google Docs API documents.get reference.

Endpoint Testing

Once the selected endpoint template has been configured, Nexla can retrieve a sample of the data that will be fetched according to the current settings. This allows users to verify that the source is configured correctly before saving.

  • To test the current endpoint configuration, click the Test button to the right of the endpoint selection menu. Sample data will be fetched and displayed in the Endpoint Test Result panel on the right.

  • If the sample data is not as expected, review the selected endpoint and associated settings, and make any necessary adjustments. Then, click the Test button again, and check the sample data to ensure that the correct information is displayed.

Configure Manually

Google Docs API data sources can be manually configured to ingest data from any valid Google Docs API endpoint. Manual configuration provides maximum flexibility for accessing endpoints not covered by pre-built templates or when you need custom API configurations.

With manual configuration, you can also create more complex Google Docs API sources, such as sources that use chained API calls to fetch data from multiple documents or sources that require custom request parameters.

API Method

  1. To manually configure this source, select the Advanced tab at the top of the configuration screen.

  2. From the Method pulldown menu, select the API method that will be used for calls to the Google Docs API. The most common methods are:

    • GET: For retrieving document content and metadata from the API
    • POST: For creating new documents or submitting batch update requests

API Endpoint URL

  1. Enter the URL of the Google Docs API endpoint from which this source will fetch data in the Set API URL field. This should be the complete URL including the protocol (https://) and any required path parameters.

    The base URL for the Google Docs API v1 is https://docs.googleapis.com/v1. Common endpoint paths include:

    • Get a specific document: https://docs.googleapis.com/v1/documents/{documentId}
    • Create a new document: https://docs.googleapis.com/v1/documents

Replace {documentId} with the actual document ID found in the document's Google Docs URL. Ensure that the API endpoint URL is correct and that the authenticated account has access to the specified document. After configuring the URL, you can test the endpoint using the Test button.
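
As a minimal sketch, the complete endpoint URL can be assembled from the base URL and a document ID as shown below. The helper name `document_url` is hypothetical; the base URL and path are the ones listed above.

```python
from urllib.parse import quote

# Base URL for the Google Docs API v1, as given in this section.
BASE_URL = "https://docs.googleapis.com/v1"


def document_url(document_id: str) -> str:
    """Build the URL for retrieving a specific document."""
    if not document_id:
        raise ValueError("document_id must not be empty")
    # Percent-encode the ID defensively so it cannot alter the URL path.
    return f"{BASE_URL}/documents/{quote(document_id, safe='')}"


print(document_url("1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms"))
# prints https://docs.googleapis.com/v1/documents/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms
```

The resulting string is the value to enter in the Set API URL field.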

Date/Time Macros (API URL)

Optional

Optionally, the API URL can be customized using macros—all macros added to the API URL will be converted into values when Nexla executes the API call. Macros are dynamic placeholders that allow you to create flexible API endpoints that can adapt to different time periods or data requirements.

Macros are particularly useful for APIs that require date ranges, pagination parameters, or other dynamic values that change between data ingestion runs.

  1. To add a macro, type { at the appropriate position in the API URL (within the Set API URL field), and select the desired macro from the dropdown list.

    • {now} – The current datetime
    • {now-1} – The datetime one time unit before the current datetime
    • {now+1} – The datetime one time unit after the current datetime
    • custom – Datetime macros can reference any number of time units before or after the current datetime. For example, enter {now-4} to indicate the datetime four time units before the current datetime.
  2. Select the format that will be applied to datetime macros from the Date Format for Date/Time Macro pulldown menu. This format will be applied to the base datetime value of the macro—i.e., the value of {now} in {now-1}.

  3. Select the datetime unit that will be used to perform mathematical operations in the included macro(s) from the Time Unit for Operations pulldown menu—for example, for the macro {now-1}, when Day is selected, {now-1} will be converted to the datetime one day before the current datetime.
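
The interaction between the macro, the date format, and the time unit can be illustrated with a short Python sketch. This is not Nexla's implementation. The function `expand_macros`, the `strftime`-style format string, the unit names other than Day, and the `api.example.com` URL are all assumptions made for the example.

```python
import re
from datetime import datetime, timedelta

# Matches {now}, {now-N}, and {now+N}; other placeholders are left untouched.
MACRO_PATTERN = re.compile(r"\{now(?:([+-])(\d+))?\}")

# Hypothetical mapping from a selected time unit to a duration.
UNIT_TO_DELTA = {
    "Hour": timedelta(hours=1),
    "Day": timedelta(days=1),
    "Week": timedelta(weeks=1),
}


def expand_macros(url: str, now: datetime, date_format: str, time_unit: str) -> str:
    """Replace datetime macros in a URL with formatted datetime values."""
    step = UNIT_TO_DELTA[time_unit]

    def replace(match: re.Match) -> str:
        sign, amount = match.group(1), match.group(2)
        offset = timedelta(0)
        if sign is not None:
            offset = step * int(amount)
            if sign == "-":
                offset = -offset
        # The offset is applied first, then the chosen format.
        return (now + offset).strftime(date_format)

    return MACRO_PATTERN.sub(replace, url)


template = "https://api.example.com/v1/items?from={now-1}&to={now}"
fixed_now = datetime(2024, 3, 15, 12, 0)
print(expand_macros(template, fixed_now, "%Y-%m-%d", "Day"))
# prints https://api.example.com/v1/items?from=2024-03-14&to=2024-03-15
```

With Day selected, {now-1} resolves to the previous day, while {now} resolves to the current datetime in the same format.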

Lookup-Based Macros (API URL)

Optional

Column values from existing lookups can also be included as macros in the API URL. Lookup-based macros allow you to reference data from previously configured data sources or lookups, enabling dynamic API endpoints that can adapt based on existing data.

Lookup-based macros are useful when you need to dynamically inject document IDs or other identifiers retrieved from a previous Nexla source into the Google Docs API endpoint URL. For example, you can reference a list of document IDs from a Google Drive source to iterate over multiple documents.

  1. To include a lookup column value macro, select the relevant lookup from the Add Lookups to Supported Macros pulldown menu.

  2. Type { at the appropriate position in the API URL, and select the lookup column-based macro from the dropdown list. Lookup-based macros are automatically populated into the macro list when a lookup is selected in the Add Lookups to Supported Macros pulldown menu.
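
Conceptually, a lookup-based macro produces one API URL per lookup row. The sketch below illustrates that idea in plain Python and does not show how Nexla performs the substitution internally. The function `expand_lookup_macro`, the column name `document_id`, and the sample IDs are hypothetical.

```python
def expand_lookup_macro(url_template: str, column: str, rows: list[dict]) -> list[str]:
    """Return one URL per lookup row, with {column} replaced by the row's value."""
    placeholder = "{" + column + "}"
    if placeholder not in url_template:
        raise ValueError(f"Template does not contain {placeholder}")
    return [url_template.replace(placeholder, str(row[column])) for row in rows]


# Hypothetical lookup rows, e.g. document IDs collected by another source.
lookup_rows = [
    {"document_id": "doc-id-A"},
    {"document_id": "doc-id-B"},
]

urls = expand_lookup_macro(
    "https://docs.googleapis.com/v1/documents/{document_id}",
    "document_id",
    lookup_rows,
)
for url in urls:
    print(url)
# prints https://docs.googleapis.com/v1/documents/doc-id-A
# prints https://docs.googleapis.com/v1/documents/doc-id-B
```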

Path to Data

Optional

If only a subset of the data that will be returned by the API endpoint is needed, you can designate the part(s) of the response that should be included in the Nexset(s) produced from this source by specifying the path to the relevant data within the response.

For example, the Google Docs API documents.get endpoint returns a JSON response where the document body content elements are contained in the body.content array. By entering the path $.body.content[*], you can configure Nexla to treat each structural element of the document body as a separate record.

Path to Data is important when working with the Google Docs API, as document responses have deeply nested structures. Without specifying the correct path, Nexla may not be able to properly parse and organize the document content into usable records.

  • To specify which data should be treated as relevant in responses from this source, enter the path to the relevant data in the Set Path to Data in Response field.

    • For responses in JSON format, enter the JSON path that points to the object or array that should be treated as relevant data. JSON paths use dot notation (e.g., $.body.content[*] to access the body content array in a Google Docs document response).
    Path to Data Example:

    For the Google Docs API documents.get response, which includes a body.content array containing the document's structural elements, enter the path as $.body.content[*].
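
To make the effect of this path concrete, the following Python sketch applies the equivalent of $.body.content[*] to a small, hand-written sample. The sample is heavily trimmed and its values are invented; a real documents.get response contains many more fields. The function name `select_body_content` is hypothetical.

```python
# Trimmed, hand-written sample in the general shape of a documents.get response.
sample_response = {
    "documentId": "hypothetical-document-id",
    "title": "Quarterly Report",
    "body": {
        "content": [
            {"endIndex": 1, "sectionBreak": {}},
            {
                "startIndex": 1,
                "endIndex": 13,
                "paragraph": {
                    "elements": [{"textRun": {"content": "Hello world\n"}}]
                },
            },
        ]
    },
}


def select_body_content(response: dict) -> list:
    """Equivalent of $.body.content[*]: one record per structural element."""
    body = response.get("body") or {}
    return list(body.get("content") or [])


records = select_body_content(sample_response)
print(len(records))  # prints 2
print(sorted(records[1].keys()))  # prints ['endIndex', 'paragraph', 'startIndex']
```

Each item in `records` corresponds to one record produced from the response when this path is used.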

Autogenerate Path Suggestions

Nexla can also autogenerate data path suggestions based on the response from the API endpoint. These suggested paths can be used as-is or modified to exactly suit your needs.

  • To use this feature, click the Test button next to the Set API URL field to fetch a sample response from the API endpoint. Suggested data paths generated based on the content and format of the response will be displayed in the Suggestions box below the Set Path to Data in Response field.

  • Click on a suggestion to automatically populate the Set Path to Data in Response field with the corresponding path. The populated path can be modified directly within the field if further customization is needed.

Metadata

If metadata is included in the response but is located outside of the defined path to relevant data, you can configure Nexla to include this data as common metadata in each record.

For example, the Google Docs API documents.get response includes top-level properties such as title, documentId, revisionId, and documentStyle alongside the body.content array. If you have specified $.body.content[*] as the path to relevant data, you can specify a path to additional document-level metadata to include it with each content element record.

Metadata paths are particularly useful for preserving document-level context (such as the document title, document ID, or revision ID) that applies to all content elements in the response but is not part of the individual structural element records.

  • To specify the location of metadata that should be included with each record, enter the path to the relevant metadata in the Path to Metadata in Response field.

    • For responses in JSON format, enter the JSON path to the object or array that contains the metadata.
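
The idea of attaching common metadata to each record can be sketched in Python as follows. This is an illustration only: how Nexla lays out metadata within a record is not specified here, so the `metadata` key, the function `attach_metadata`, and the sample values are assumptions.

```python
# Trimmed, hand-written sample in the general shape of a documents.get response.
sample_response = {
    "documentId": "hypothetical-document-id",
    "title": "Quarterly Report",
    "revisionId": "hypothetical-revision-id",
    "body": {
        "content": [
            {"endIndex": 1, "sectionBreak": {}},
            {"startIndex": 1, "endIndex": 13, "paragraph": {"elements": []}},
        ]
    },
}


def attach_metadata(response: dict, metadata_keys: list[str]) -> list[dict]:
    """Copy selected top-level fields onto every body content element."""
    metadata = {key: response[key] for key in metadata_keys if key in response}
    # Assumption: metadata is nested under a "metadata" key in each record.
    return [
        {**element, "metadata": dict(metadata)}
        for element in response["body"]["content"]
    ]


records = attach_metadata(sample_response, ["documentId", "title", "revisionId"])
print(records[1]["metadata"]["title"])  # prints Quarterly Report
```

Every record carries the same document-level values, so the context is preserved even after the body is split into separate elements.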

Request Headers

Optional
  • If Nexla should include any additional request headers in API calls to this source, enter the headers and corresponding values as comma-separated pairs in the Request Headers field (e.g., header1:value1,header2:value2). Additional headers may be required for specific Google Docs API features such as requesting gzip-compressed responses.

    You do not need to include the Authorization header — it is managed automatically by the Google Docs API credential. Common headers like Content-Type and Accept are also handled automatically by Nexla.
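
The comma-separated format can be pictured as a simple mapping from header names to values. The Python sketch below shows one way such a string could be parsed; it is not Nexla's parser. The function `parse_headers` and the `X-Example` header are hypothetical.

```python
def parse_headers(raw: str) -> dict[str, str]:
    """Parse 'name1:value1,name2:value2' into a dictionary of headers."""
    headers: dict[str, str] = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate stray or trailing commas
        # Split on the first colon only, so values may contain colons.
        name, separator, value = pair.partition(":")
        if not separator or not name.strip():
            raise ValueError(f"Malformed header pair: {pair!r}")
        headers[name.strip()] = value.strip()
    return headers


print(parse_headers("Accept-Encoding:gzip,X-Example:demo"))
# prints {'Accept-Encoding': 'gzip', 'X-Example': 'demo'}
```

Note that this simple split cannot represent header values that themselves contain commas.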

Endpoint Testing

After configuring all settings for the selected endpoint, Nexla can retrieve a sample of the data that will be fetched according to the current configuration. This allows users to verify that the source is configured correctly before saving.

  • To test the current endpoint configuration, click the Test button to the right of the endpoint selection menu. Sample data will be fetched and displayed in the Endpoint Test Result panel on the right.

  • If the sample data is not as expected, review the selected endpoint and associated settings, and make any necessary adjustments. Then, click the Test button again, and check the sample data to ensure that the correct information is displayed.

Save & Activate the Source

  1. Once all of the relevant steps in the above sections have been completed, click the Create button in the upper right corner of the screen to save and create the new Google Docs API data source. Nexla will now begin ingesting data from the configured endpoint and will organize any data that it finds into one or more Nexsets.