List/View Data Flows

Listing and viewing data flows in Nexla helps you understand your data processing architecture, monitor flow status, and manage data pipelines. The platform provides APIs for viewing flow information, status, and configuration details.

List All Data Flows

The primary endpoint for listing data flows is the modern /flows endpoint, which provides comprehensive information about all flows accessible to your account.

API Endpoint

To retrieve all accessible data flows:

GET /flows
List All Data Flows: Request
GET /flows

Response Structure

The response contains detailed flow information, including configuration, status, and metadata.

List All Data Flows: Response
[
  {
    "id": 2001,
    "flow_node_id": "flow_2001",
    "origin_node_id": "source_1001",
    "flow_type": "streaming",
    "status": "ACTIVE",
    "owner": {
      "id": 42,
      "full_name": "John Smith"
    },
    "org": {
      "id": 101,
      "name": "Acme Corporation"
    },
    "access_roles": ["owner"],
    "data_source": {
      "id": 1001,
      "name": "Customer Data Source",
      "source_type": "s3"
    },
    "data_set": {
      "id": 3001,
      "name": "Customer Analytics Dataset"
    },
    "data_sink": {
      "id": 4001,
      "name": "Data Warehouse Sink",
      "sink_type": "snowflake"
    },
    "created_at": "2023-01-15T10:30:00.000Z",
    "updated_at": "2023-01-15T15:45:00.000Z"
  },
  {
    "id": 2002,
    "flow_node_id": "flow_2002",
    "origin_node_id": "source_1002",
    "flow_type": "in_memory",
    "status": "PAUSED",
    "owner": {
      "id": 42,
      "full_name": "John Smith"
    },
    "org": {
      "id": 101,
      "name": "Acme Corporation"
    },
    "access_roles": ["owner"],
    "data_source": {
      "id": 1002,
      "name": "Sales Data Source",
      "source_type": "postgres"
    },
    "data_set": {
      "id": 3002,
      "name": "Sales Analytics Dataset"
    },
    "data_sink": {
      "id": 4002,
      "name": "Reporting Database Sink",
      "sink_type": "mysql"
    },
    "created_at": "2023-01-14T14:20:00.000Z",
    "updated_at": "2023-01-15T12:15:00.000Z"
  }
]
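
The sketch below shows one way to call this endpoint from a script. It is a minimal example rather than an official client: the NEXLA_API_URL and NEXLA_ACCESS_TOKEN environment variables are placeholders for your instance URL and API access token, and bearer-token authentication is assumed.

List All Data Flows: Example (Python)
import os
import requests

# Placeholders: set these to your Nexla instance URL and a valid access token.
BASE_URL = os.environ["NEXLA_API_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['NEXLA_ACCESS_TOKEN']}",
           "Accept": "application/json"}

# Retrieve all flows accessible to the authenticated account.
response = requests.get(f"{BASE_URL}/flows", headers=HEADERS)
response.raise_for_status()

# The response body is a list of flow objects, as shown above.
for flow in response.json():
    print(flow["id"], flow["flow_type"], flow["status"])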

Show Flow by ID

To retrieve a specific data flow by its identifier, use the flow ID endpoint.

Flow by ID Endpoint

GET /flows/{flow_id}
Show Flow by ID: Request
GET /flows/2001

Flow by ID Response

The response provides detailed information about the specific flow, including all configuration and status details.

Show Flow by ID: Response
{
  "id": 2001,
  "flow_node_id": "flow_2001",
  "origin_node_id": "source_1001",
  "flow_type": "streaming",
  "status": "ACTIVE",
  "owner": {
    "id": 42,
    "full_name": "John Smith"
  },
  "org": {
    "id": 101,
    "name": "Acme Corporation"
  },
  "access_roles": ["owner"],
  "data_source": {
    "id": 1001,
    "name": "Customer Data Source",
    "source_type": "s3",
    "status": "ACTIVE"
  },
  "data_set": {
    "id": 3001,
    "name": "Customer Analytics Dataset",
    "description": "Processed customer data for analytics"
  },
  "data_sink": {
    "id": 4001,
    "name": "Data Warehouse Sink",
    "sink_type": "snowflake",
    "status": "ACTIVE"
  },
  "flow_config": {
    "batch_size": 1000,
    "processing_interval": "5m",
    "retry_policy": "exponential_backoff"
  },
  "created_at": "2023-01-15T10:30:00.000Z",
  "updated_at": "2023-01-15T15:45:00.000Z"
}
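
As a quick illustration, the hedged sketch below fetches a single flow and reads a few fields from the response shown above; it assumes the same placeholder environment variables and bearer-token authentication as the listing example.

Show Flow by ID: Example (Python)
import os
import requests

# Placeholders: set these to your Nexla instance URL and a valid access token.
BASE_URL = os.environ["NEXLA_API_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['NEXLA_ACCESS_TOKEN']}"}

flow = requests.get(f"{BASE_URL}/flows/2001", headers=HEADERS)
flow.raise_for_status()
details = flow.json()

# Inspect status and flow configuration fields from the response.
print(details["status"])
print(details["flow_config"]["batch_size"])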

Show Flow by Data Source, Set or Sink

You can also retrieve flow information by querying related resources like data sources, data sets, or data sinks.

Flow by Data Source

To find flows associated with a specific data source:

GET /data_sources/{source_id}/flow
Show Flow by Data Source: Request
GET /data_sources/1001/flow

Flow by Data Set

To find flows associated with a specific data set:

GET /data_sets/{dataset_id}/flow
Show Flow by Data Set: Request
GET /data_sets/3001/flow

Flow by Data Sink

To find flows associated with a specific data sink:

GET /data_sinks/{sink_id}/flow
Show Flow by Data Sink: Request
GET /data_sinks/4001/flow
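
The sketch below queries all three resource-scoped endpoints for the example IDs used above. It is a minimal, hedged example: the placeholder environment variables and bearer-token authentication are assumptions, and the response is printed as-is rather than assuming a particular shape.

Show Flow by Related Resource: Example (Python)
import os
import requests

# Placeholders: set these to your Nexla instance URL and a valid access token.
BASE_URL = os.environ["NEXLA_API_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['NEXLA_ACCESS_TOKEN']}"}

# The same flow can be reached through its data source, data set, or data sink.
for path in ("/data_sources/1001/flow", "/data_sets/3001/flow", "/data_sinks/4001/flow"):
    response = requests.get(f"{BASE_URL}{path}", headers=HEADERS)
    response.raise_for_status()
    print(path, "->", response.json())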

Pause Flow

To pause a running data flow, use the pause endpoint.

Pause Endpoint

PUT /flows/{flow_id}/pause
Pause Flow: Request
PUT /flows/2001/pause

Pause Response

A successful pause operation returns the updated flow status.

Pause Flow: Response
{
  "id": 2001,
  "status": "PAUSED",
  "paused_at": "2023-01-15T16:00:00.000Z",
  "updated_at": "2023-01-15T16:00:00.000Z"
}
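
A hedged sketch of the pause call follows; it assumes the same placeholder environment variables and bearer-token authentication as the earlier examples.

Pause Flow: Example (Python)
import os
import requests

# Placeholders: set these to your Nexla instance URL and a valid access token.
BASE_URL = os.environ["NEXLA_API_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['NEXLA_ACCESS_TOKEN']}"}

# Pause flow 2001; a successful call returns the updated flow status.
paused = requests.put(f"{BASE_URL}/flows/2001/pause", headers=HEADERS)
paused.raise_for_status()
print(paused.json()["status"])  # expected to report "PAUSED"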

Flow Types

Nexla supports multiple data flow types, each optimized for specific data processing scenarios and use cases. Understanding these flow types helps you choose the right approach for your data integration needs.

FlexFlow – Flexibility & functionality (recommended for most workflows)

FlexFlow is Nexla's recommended flow type for most workflows, offering maximum flexibility and functionality:

  • High Throughput: Uses a Kafka-based engine for seamless, high-throughput data movement
  • Full Transformation Support: Allows any level of complexity using built-in or custom transforms
  • Multiple Destinations: Supports branched flows that send data to multiple destinations
  • Real-time Editing: Can be edited at any time with immediate effect
  • Intermediate Sharing: Intermediate data products can be shared with other users

DB-CDC – Data migration & maintenance

DB-CDC flows are streamlined for data migration and maintenance workflows:

  • Change Detection: Monitors transaction logs to capture data changes
  • High-Speed Replication: Quickly replicates tables across databases and warehouses
  • Selective Replication: Allows inclusion/exclusion of specific tables
  • Schema Preservation: Maintains original data structure during replication
  • No Transformations: Does not support Nexset transformations

Spark ETL – Large-scale data processing with minimal latency

Spark ETL flows are designed for large-scale data processing with minimal latency:

  • Big Data Processing: Uses the Apache Spark engine for large-volume data processing
  • Cloud Database Support: Optimized for cloud databases and Databricks
  • Transform Support: Supports Nexset transformations and Spark SQL queries
  • Single Destination: Designed for data movement to a single destination
  • Consistent Processing: Ideal for data with consistent transformation requirements

Replication – Rapid file transfer between storage systems

Replication flows provide high-speed movement of unmodified files between storage systems:

  • File Structure Retention: Preserves original file structure during transfer
  • High-Speed Transfer: Minimizes latency by executing nodes in memory
  • Multiple Destinations: Supports replication to one or multiple locations
  • No Transformations: Does not support data transformations
  • Quick Setup: Streamlined for rapid data replication workflows

ELT – Unmodified data movement from APIs to databases

ELT flows are streamlined for seamless movement of unmodified data from APIs to databases:

  • Simplified Setup: Requires minimal configuration for rapid deployment
  • API Integration: Optimized for extracting data from API sources
  • Direct Loading: Transfers data directly to databases and warehouses
  • Table Mapping: Supports mapping data into multiple tables
  • No Source Transformations: Does not modify data before it reaches the destination

DirectFlow – High-throughput point-to-point processing

DirectFlow provides high-throughput point-to-point data processing:

  • Memory Optimization: Executes the entire flow within a single container's memory
  • Transform Support: Supports data transformations using Nexset Rules
  • Single Destination: Limited to one destination (no branching)
  • Non-streaming Sources: Designed for batch processing scenarios
  • Programmatic Triggers: Supports dynamic flows triggered programmatically

BYO – Custom code-based data processing

BYO flows allow incorporation of custom runtimes for specialized data processing:

  • Custom Code: Users define every aspect of data flow processing
  • Flexible Runtimes: Supports existing or newly created codebases
  • Operational Enhancements: Can include custom notifications and insights
  • File System Focus: Currently limited to file system sources and destinations
  • Advanced Manipulation: Enables highly specialized data processing scenarios

RAG – AI-powered data querying

RAG flows harness GenAI and LLMs for intelligent data querying:

  • AI-Powered Queries: Uses LLMs to generate natural language responses
  • Multi-Source Retrieval: Integrates data from APIs, vector databases, and Nexsets
  • Intelligent Ranking: Reranker refines retrieved data for relevance
  • Application Integration: Can power chatbots and user-facing applications
  • Query Processing: Automatically determines needed information for queries

Flow Accessors

Flow accessors determine who can view and manage specific flows.

Owner Access

Flow owners have full control over their flows:

  • Full Management: Create, update, delete, and control flows
  • Access Control: Grant access to team members and other users
  • Configuration: Modify flow settings and parameters

Team Access

Team members can access flows based on team permissions:

  • Shared Access: Access to team-shared flows
  • Limited Control: View and monitor flows (depending on permissions)
  • Collaboration: Work with team members on flow management

Organization Access

Organization-wide access controls apply to all flows:

  • Boundary Enforcement: Flows are isolated by organization
  • Admin Oversight: Organization administrators can monitor all flows
  • Resource Management: Organization-level resource allocation

Projects

Flows can be grouped into projects for easier organization and management.

Project Organization

Projects group related flows together:

  • Logical Grouping: Organize flows by purpose or function
  • Access Control: Manage permissions at the project level
  • Resource Management: Allocate resources across project flows

Project Management

Manage projects through dedicated endpoints:

GET /projects
POST /projects
PUT /projects/{project_id}
DELETE /projects/{project_id}
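
The sketch below lists projects and creates a new one. It is an assumption-laden example: the placeholder environment variables, bearer-token authentication, and the minimal {"name": ...} creation payload are illustrative only and may differ from your project configuration.

Project Management: Example (Python)
import os
import requests

# Placeholders: set these to your Nexla instance URL and a valid access token.
BASE_URL = os.environ["NEXLA_API_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['NEXLA_ACCESS_TOKEN']}"}

# List existing projects.
projects = requests.get(f"{BASE_URL}/projects", headers=HEADERS)
projects.raise_for_status()
print(projects.json())

# Create a project; the "name" field is an assumed minimal payload.
created = requests.post(f"{BASE_URL}/projects", headers=HEADERS,
                        json={"name": "Customer Analytics Flows"})
created.raise_for_status()
print(created.json())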

Pagination and Filtering

The flows endpoint supports pagination and filtering for large numbers of flows.

Pagination

Use query parameters to paginate results:

GET /flows?page=1&per_page=100

Filtering

Filter flows by various criteria:

GET /flows?status=ACTIVE
GET /flows?flow_type=streaming
GET /flows?owner_id=42

Search flows by name or description:

GET /flows?search=customer
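
Combining these parameters, the sketch below pages through active streaming flows until an empty page is returned. It assumes the placeholder environment variables and bearer-token authentication used in the earlier examples, and it treats an empty page as the end of the collection, which is an assumption about the pagination behavior.

Pagination and Filtering: Example (Python)
import os
import requests

# Placeholders: set these to your Nexla instance URL and a valid access token.
BASE_URL = os.environ["NEXLA_API_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['NEXLA_ACCESS_TOKEN']}"}

page = 1
while True:
    params = {"page": page, "per_page": 100, "status": "ACTIVE", "flow_type": "streaming"}
    batch = requests.get(f"{BASE_URL}/flows", headers=HEADERS, params=params)
    batch.raise_for_status()
    flows = batch.json()
    if not flows:  # assume an empty page means no more results
        break
    for flow in flows:
        print(flow["id"], flow["status"])
    page += 1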

Best Practices

To effectively list and view data flows:

  1. Use Pagination: Implement pagination for large flow collections
  2. Filter by Status: Focus on active or paused flows as needed
  3. Monitor Changes: Track flow status changes over time
  4. Organize by Projects: Use projects to group related flows
  5. Regular Review: Periodically review flow configurations and access

Error Handling

Common flow listing issues and solutions:

  • Permission Denied: Ensure you have appropriate access rights
  • Invalid Flow ID: Verify the flow ID exists and is accessible
  • Organization Issues: Check organization membership and access
  • Resource Not Found: Confirm the requested flow exists
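
A hedged sketch of handling the most common failures follows; the status-code mapping (403 for permission issues, 404 for missing flows) is an assumption based on conventional HTTP semantics, and the placeholder environment variables and bearer-token authentication match the earlier examples.

Error Handling: Example (Python)
import os
import requests

# Placeholders: set these to your Nexla instance URL and a valid access token.
BASE_URL = os.environ["NEXLA_API_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['NEXLA_ACCESS_TOKEN']}"}

response = requests.get(f"{BASE_URL}/flows/999999", headers=HEADERS)
if response.status_code == 403:
    print("Permission denied: check your access rights and organization membership.")
elif response.status_code == 404:
    print("Flow not found: verify the flow ID exists and is accessible.")
else:
    response.raise_for_status()
    print(response.json())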

After viewing flows, you may need to:

Control Flow Status

PUT /flows/{flow_id}/activate
PUT /flows/{flow_id}/pause

Update Flow Configuration

PUT /flows/{flow_id}

Monitor Flow Performance

GET /flows/{flow_id}/metrics
GET /flows/{flow_id}/logs

Manage Flow Access

GET /flows/{flow_id}/access
PUT /flows/{flow_id}/access
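
As a final hedged sketch, the example below reactivates a paused flow and then retrieves its metrics, assuming the same placeholder environment variables and bearer-token authentication as the earlier examples.

Follow-Up Operations: Example (Python)
import os
import requests

# Placeholders: set these to your Nexla instance URL and a valid access token.
BASE_URL = os.environ["NEXLA_API_URL"]
HEADERS = {"Authorization": f"Bearer {os.environ['NEXLA_ACCESS_TOKEN']}"}

# Reactivate a paused flow, then check its metrics.
activated = requests.put(f"{BASE_URL}/flows/2001/activate", headers=HEADERS)
activated.raise_for_status()

metrics = requests.get(f"{BASE_URL}/flows/2001/metrics", headers=HEADERS)
metrics.raise_for_status()
print(metrics.json())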