AWS Kinesis Firehose Data Source

The AWS Kinesis Data Firehose connector enables you to retrieve information about delivery streams, list delivery streams, and access delivery stream metadata from AWS Kinesis Data Firehose. This connector is particularly useful for applications that need to monitor delivery stream status, analyze streaming data infrastructure, or manage Firehose delivery streams. Follow the instructions below to create a new data flow that ingests data from an AWS Kinesis Firehose source in Nexla.

Create a New Data Flow

  1. To create a new data flow, navigate to the Integrate section, and click the New Data Flow button. Then, select the desired flow type from the list, and click the Create button.

  2. Select the AWS Kinesis Firehose connector tile from the list of available connectors. Then, select the credential that will be used to connect to the AWS Kinesis Firehose instance, and click Next; or, create a new AWS Kinesis Firehose credential for use in this flow.

  3. In Nexla, AWS Kinesis Firehose data sources can be created using pre-built endpoint templates, which expedite source setup for common AWS Kinesis Firehose endpoints. Each template is designed specifically for the corresponding AWS Kinesis Firehose endpoint, making source configuration easy and efficient.
    • To configure this source using a template, follow the instructions in Configure Using a Template.

    AWS Kinesis Firehose sources can also be configured manually, allowing you to ingest data from AWS Kinesis Firehose endpoints not included in the pre-built templates or apply further customizations to exactly suit your needs.
    • To configure this source manually, follow the instructions in Configure Manually.

Configure Using a Template

Nexla provides pre-built templates that can be used to rapidly configure data sources to ingest data from common AWS Kinesis Firehose endpoints. Each template is designed specifically for the corresponding AWS Kinesis Firehose endpoint, making data source setup easy and efficient.

Endpoint Settings

  • From the Endpoint pulldown menu, select the endpoint from which this source will fetch data. The available endpoint templates are described below, along with how to configure your data source for each one.

    List Delivery Streams

    This endpoint retrieves a list of all delivery streams in your AWS Kinesis Data Firehose account. Use this endpoint when you need to discover available delivery streams, monitor stream metadata, or build dynamic workflows that operate on multiple streams.

    • This endpoint automatically retrieves all delivery streams in your AWS account within the specified region. No additional configuration is required beyond selecting this endpoint template. The endpoint uses the AWS region specified in your credential configuration to determine which delivery streams to retrieve.
    • The endpoint returns delivery stream names, ARNs (Amazon Resource Names), creation timestamps, and status information for each stream in your account.

    This endpoint is useful for discovering available delivery streams and their configurations. Use this endpoint to build dynamic workflows that can operate on multiple delivery streams.
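
As a rough illustration of what a list-style response looks like once it is split into records, the sketch below parses a hand-written sample body. The sample uses the `DeliveryStreamNames` and `HasMoreDeliveryStreams` fields from the AWS `ListDeliveryStreams` action; the stream names and the `stream_records` helper are hypothetical, and the records Nexla actually produces for this template may contain additional fields.

```python
import json

# Hypothetical sample body in the shape AWS documents for ListDeliveryStreams.
SAMPLE_RESPONSE = """
{
  "DeliveryStreamNames": ["orders-stream", "clickstream-prod"],
  "HasMoreDeliveryStreams": false
}
"""


def stream_records(raw: str) -> list[dict]:
    """Turn a ListDeliveryStreams-style JSON body into one record per stream name."""
    body = json.loads(raw)
    return [{"DeliveryStreamName": name} for name in body.get("DeliveryStreamNames", [])]


records = stream_records(SAMPLE_RESPONSE)
for record in records:
    print(record)
```

Each resulting record carries a single stream name, which is the value the "List Tags For Delivery Stream" endpoint needs as input.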

    List Tags For Delivery Stream

    This endpoint retrieves tags associated with a specific delivery stream. Use this endpoint when you need to access tag metadata for delivery stream management, organization, or filtering purposes.

    • This endpoint requires the delivery stream name to be specified in the API request. The delivery stream name is the unique identifier for the stream you want to retrieve tags for. You can find delivery stream names using the "List Delivery Streams" endpoint or in the AWS Console under Kinesis Data Firehose.
    • The endpoint will return all tags associated with the specified delivery stream. Tags are key-value pairs that can be used to organize and manage your delivery streams. Common uses include tagging streams by project, environment (production, staging, development), data source type, or compliance requirements. Tags help you filter, search, and manage delivery streams at scale.

    Tags are useful for organizing delivery streams by project, environment, or other organizational criteria. This endpoint allows you to retrieve tag information for stream management purposes.
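
The sketch below shows, under stated assumptions, how a tag lookup might be shaped and how the returned key-value pairs could be used for filtering. The `DeliveryStreamName` request field and the `Tags`/`HasMoreTags` response fields follow the AWS `ListTagsForDeliveryStream` action; the sample values and both helper functions are hypothetical and are not part of Nexla.

```python
import json


def tags_request_body(stream_name: str) -> str:
    """Build the JSON body that names the delivery stream whose tags are wanted."""
    return json.dumps({"DeliveryStreamName": stream_name})


# Hypothetical sample response in the shape AWS documents for ListTagsForDeliveryStream.
SAMPLE_RESPONSE = {
    "Tags": [
        {"Key": "environment", "Value": "production"},
        {"Key": "project", "Value": "billing"},
    ],
    "HasMoreTags": False,
}


def tags_to_dict(response: dict) -> dict[str, str]:
    """Flatten the list of Key/Value pairs into a plain dictionary."""
    return {tag["Key"]: tag.get("Value", "") for tag in response.get("Tags", [])}


tags = tags_to_dict(SAMPLE_RESPONSE)
is_production = tags.get("environment") == "production"
print(tags_request_body("orders-stream"), tags, is_production)
```

Flattening the pairs into a dictionary makes checks such as "is this a production stream?" a single lookup.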

Endpoint Testing

Once the selected endpoint template has been configured, Nexla can retrieve a sample of the data that will be fetched according to the current settings. This allows users to verify that the source is configured correctly before saving.

  • To test the current endpoint configuration, click the Test button to the right of the endpoint selection menu. Sample data will be fetched & displayed in the Endpoint Test Result panel on the right.

  • If the sample data is not as expected, review the selected endpoint and associated settings, and make any necessary adjustments. Then, click the Test button again, and check the sample data to ensure that the correct information is displayed.

Configure Manually

AWS Kinesis Firehose data sources can be manually configured to ingest data from any valid AWS Kinesis Firehose API endpoint. Manual configuration provides maximum flexibility for accessing endpoints not covered by pre-built templates or when you need custom API configurations.

With manual configuration, you can also create more complex AWS Kinesis Firehose sources, such as sources that use chained API calls to fetch data from multiple endpoints or sources that require custom authentication headers or request parameters.

API Method

  1. To manually configure this source, select the Advanced tab at the top of the configuration screen.

  2. Select the API method that will be used for calls to the AWS Kinesis Firehose API from the Method pulldown menu. The most common methods are:

    • GET: For retrieving data from the API
    • POST: For sending data to the API or triggering actions
    • PUT: For updating existing data
    • PATCH: For partial updates to existing data
    • DELETE: For removing data

API Endpoint URL

  1. Enter the URL of the AWS Kinesis Firehose API endpoint from which this source will fetch data in the Set API URL field. This should be the complete URL, including the protocol (https://). AWS Kinesis Firehose API endpoints are regional and typically follow the pattern https://firehose.{region}.amazonaws.com/.

Ensure the API endpoint URL is correct and accessible with your current credentials. You can test the endpoint using the Test button after configuring the URL. The AWS Kinesis Firehose API identifies the requested action through the X-Amz-Target request header (for example, Firehose_20150804.ListDeliveryStreams) rather than through the URL path.
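
To make the regional part of the URL concrete, here is a minimal sketch that builds the base endpoint for a given region and checks that it uses HTTPS. The `firehose_endpoint` helper and the region pattern are illustrative assumptions; the pattern only covers the standard `amazonaws.com` hostnames, and how the action is conveyed in the request is deliberately left out.

```python
import re
from urllib.parse import urlparse

# Loose, assumed pattern for region codes such as us-east-1 or us-gov-west-1.
REGION_PATTERN = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")


def firehose_endpoint(region: str) -> str:
    """Return the regional Firehose base URL, rejecting obviously malformed regions."""
    if not REGION_PATTERN.match(region):
        raise ValueError(f"unexpected region format: {region!r}")
    return f"https://firehose.{region}.amazonaws.com/"


url = firehose_endpoint("us-east-1")
parsed = urlparse(url)
print(url, parsed.scheme, parsed.netloc)
```

The region in the URL should match the region of the credential used for the source, since delivery streams are scoped to a region.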

Path to Data

Optional

If only a subset of the data returned by the API endpoint is needed, you can designate the part(s) of the response that should be included in the Nexset(s) produced from this source by specifying the path to the relevant data within the response. This is particularly useful when API responses contain metadata, pagination information, or other data that you don't need for your analysis.

For example, when an API call is used to fetch a list of items, the API will typically return an array of records, along with metadata, in the response. By entering the path to the relevant data, you can configure Nexla to treat each element of the returned array as a record.

Path to Data is essential when API responses have nested structures. Without specifying the correct path, Nexla might not be able to properly parse and organize your data into usable records.

  • To specify which data should be treated as relevant in responses from this source, enter the path to the relevant data in the Set Path to Data in Response field.

    • For responses in JSON format, enter the JSON path that points to the object or array that should be treated as relevant data. JSON paths use dot notation (e.g., $.DeliveryStreamNames[*] to access each element of an array of delivery stream names).

    • For responses in XML format, enter the XPath that points to the object/array containing relevant data. XPath uses slash notation (e.g., /ListDeliveryStreamsResponse/DeliveryStreamNames/string to access delivery stream name elements).
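
To show what a path such as `$.DeliveryStreamNames[*]` selects, the sketch below implements a deliberately tiny path resolver. It only understands `$`, `.key`, and `[*]`, is not Nexla's implementation, and the `extract` function and sample response are hypothetical.

```python
import re


def extract(document, path: str) -> list:
    """Resolve a very small JSONPath subset: `$`, `.key`, and `[*]`."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    # Each match is a (key, wildcard) tuple; exactly one of the two is non-empty.
    tokens = re.findall(r"\.([A-Za-z_][A-Za-z0-9_]*)|(\[\*\])", path[1:])
    current = [document]
    for key, wildcard in tokens:
        if key:
            # Descend into objects that contain the key.
            current = [node[key] for node in current if isinstance(node, dict) and key in node]
        else:
            # Expand every array into its individual elements.
            current = [item for node in current if isinstance(node, list) for item in node]
    return current


response = {
    "DeliveryStreamNames": ["orders-stream", "clickstream-prod"],
    "HasMoreDeliveryStreams": False,
}
names = extract(response, "$.DeliveryStreamNames[*]")
print(names)
```

With the path applied, each array element becomes its own record and the surrounding metadata (here `HasMoreDeliveryStreams`) is left out.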

Request Headers

Optional
  • If Nexla should include any additional request headers in API calls to this source, enter the headers & corresponding values as comma-separated pairs in the Request Headers field (e.g., header1:value1,header2:value2). Additional headers are often required for API versioning, content type specifications, or custom authentication requirements.

    You do not need to include any headers already present in the credentials. Common headers like Authorization, Content-Type, and Accept are typically handled automatically by Nexla based on your credential configuration. The AWS Kinesis Firehose API expects specific headers, such as Content-Type:application/x-amz-json-1.1 and X-Amz-Target, which identifies the API version and the action being called.
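
The comma-separated `header:value` format can be pictured as a simple mapping. The sketch below parses such a string into a dictionary; `parse_headers` is a hypothetical helper, not Nexla's parser, and it assumes that header values contain no commas.

```python
def parse_headers(spec: str) -> dict[str, str]:
    """Parse 'name1:value1,name2:value2' into a dictionary of header names to values."""
    headers = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate trailing or doubled commas
        # Split on the first colon only, so values may themselves contain colons.
        name, sep, value = pair.partition(":")
        if not sep or not name.strip():
            raise ValueError(f"malformed header pair: {pair!r}")
        headers[name.strip()] = value.strip()
    return headers


headers = parse_headers(
    "Content-Type:application/x-amz-json-1.1,"
    "X-Amz-Target:Firehose_20150804.ListDeliveryStreams"
)
print(headers)
```

Splitting on the first colon keeps values that contain colons intact.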

Endpoint Testing

After configuring all settings for the selected endpoint, Nexla can retrieve a sample of the data that will be fetched according to the current configuration. This allows users to verify that the source is configured correctly before saving.

  • To test the current endpoint configuration, click the Test button to the right of the endpoint selection menu. Sample data will be fetched & displayed in the Endpoint Test Result panel on the right.

  • If the sample data is not as expected, review the selected endpoint and associated settings, and make any necessary adjustments. Then, click the Test button again, and check the sample data to ensure that the correct information is displayed.

Save & Activate the Source

  1. Once all of the relevant steps in the above sections have been completed, click the Create button in the upper right corner of the screen to save and create the new AWS Kinesis Firehose data source. Nexla will now begin ingesting data from the configured endpoint and will organize any data that it finds into one or more Nexsets.