Control Ingestion
Data source ingestion is controlled through API endpoints that let you start, stop, and manage data collection on demand. These controls take effect immediately without altering a source's configured schedule or settings.
Activate and Pause Source
Data sources can be activated to begin ingesting data immediately or paused to stop ongoing collection. These controls work alongside scheduled ingestion, so you can intervene manually without redefining the schedule.
Activate Source
Trigger immediate data ingestion by activating a data source:
PUT /data_sources/{source_id}/activate
Example with curl:
curl -X PUT https://api.nexla.io/data_sources/5001/activate \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Accept: application/vnd.nexla.api.v1+json"
Pause Source
Stop ongoing data ingestion by pausing a data source:
PUT /data_sources/{source_id}/pause
Example with curl:
curl -X PUT https://api.nexla.io/data_sources/5001/pause \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Accept: application/vnd.nexla.api.v1+json"
Activation Behavior
When you activate or pause a source:
- Immediate Effect: The change takes effect immediately
- Scheduled Operations: The configured schedule is preserved; scheduled ingestion runs while the source is active and is suspended while it is paused
- Flow Integration: The change affects all downstream data flows built on the source
- Status Updates: The source's status is updated to reflect its current state
Re-ingest Files
For file-based data sources, you can trigger re-ingestion of specific files. This is useful for reprocessing files that failed ingestion or for handling updated files.
Re-ingest Endpoint
POST /data_sources/{source_id}/file/ingest
Example Request Body:
{
  "file": "daily/customer_data_2023-01-15.csv"
}
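For example, combining the endpoint and request body above, a re-ingestion call for source 5001 might look like this (the file path is illustrative):
curl -X POST https://api.nexla.io/data_sources/5001/file/ingest \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Content-Type: application/json" \
  -H "Accept: application/vnd.nexla.api.v1+json" \
  -d '{"file": "daily/customer_data_2023-01-15.csv"}'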
File Path Requirements
The file path must:
- Start from Source Root: Begin with the location specified in source configuration
- Match File Patterns: Conform to any file pattern filters in source_config
- Be Accessible: Exist and be readable by the source credentials
Response Structure
{
  "status": "ok",
  "message": "File ingestion initiated",
  "file": "daily/customer_data_2023-01-15.csv",
  "source_id": 5001
}
Validate Source Configuration
Source configuration validation checks that your data source settings are complete and correct before ingestion is attempted, catching configuration errors before they cause ingestion failures.
Validation Endpoint
POST /data_sources/{source_id}/config/validate
Example with curl:
curl -X POST https://api.nexla.io/data_sources/5001/config/validate \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Content-Type: application/json"
Optional Configuration Override
You can validate a different configuration without updating the source:
{
  "source_config": {
    "bucket": "test-bucket",
    "prefix": "test/",
    "file_pattern": "*.csv"
  }
}
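For example, the override above can be submitted as the request body of the validation call (bucket and path values are illustrative):
curl -X POST https://api.nexla.io/data_sources/5001/config/validate \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Content-Type: application/json" \
  -H "Accept: application/vnd.nexla.api.v1+json" \
  -d '{"source_config": {"bucket": "test-bucket", "prefix": "test/", "file_pattern": "*.csv"}}'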
Validation Response
The validation endpoint provides detailed feedback about configuration issues:
{
  "status": "ok",
  "output": [
    {
      "name": "bucket",
      "value": "my-data-bucket",
      "errors": [],
      "visible": true,
      "recommendedValues": []
    },
    {
      "name": "prefix",
      "value": "daily/",
      "errors": [],
      "visible": true,
      "recommendedValues": []
    },
    {
      "name": "file_pattern",
      "value": "*.csv",
      "errors": [],
      "visible": true,
      "recommendedValues": []
    }
  ]
}
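To surface only the fields that have problems, you can filter the response for entries whose errors array is non-empty. A minimal sketch using jq, assuming the response shape shown above:
curl -s -X POST https://api.nexla.io/data_sources/5001/config/validate \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Content-Type: application/json" \
  | jq '.output[] | select(.errors | length > 0) | {name, errors}'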
Validation Errors
Common validation issues include:
- Missing Required Fields: Essential configuration parameters not provided
- Invalid Values: Values outside acceptable ranges or formats
- Credential Issues: Authentication or access problems
- Path Problems: Invalid file paths or bucket references
Test Source Connection
Before activating a source, you can test the connection to ensure credentials and configuration are working correctly.
Test Endpoint
PUT /data_sources/{source_id}/test
Example with curl:
curl -X PUT https://api.nexla.io/data_sources/5001/test \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Accept: application/vnd.nexla.api.v1+json"
Test Response
{
  "status": "ok",
  "message": "Connection test successful",
  "details": {
    "connection": "established",
    "authentication": "verified",
    "access": "confirmed"
  }
}
Ingestion Scheduling
Data sources support various scheduling options for automated data collection:
Cron-based Scheduling
Configure automatic ingestion using cron expressions; the schedule below runs every six hours:
{
  "source_config": {
    "poll_schedule": "0 */6 * * *",
    "timezone": "UTC"
  }
}
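A schedule like this is set by updating the source configuration. The sketch below assumes configuration updates go through PUT /data_sources/{source_id} with a source_config payload; that endpoint is assumed here rather than documented in this section:
# Assumption: PUT /data_sources/{source_id} accepts source_config updates
curl -X PUT https://api.nexla.io/data_sources/5001 \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Content-Type: application/json" \
  -H "Accept: application/vnd.nexla.api.v1+json" \
  -d '{"source_config": {"poll_schedule": "0 */6 * * *", "timezone": "UTC"}}'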
Event-driven Scheduling
Trigger ingestion based on external events:
{
  "source_config": {
    "trigger_type": "webhook",
    "webhook_url": "https://api.example.com/trigger"
  }
}
Manual Control
Override scheduled operations with manual controls (a typical sequence is sketched after this list):
- Immediate Ingestion: Activate source for immediate data collection
- Scheduled Override: Pause scheduled operations temporarily
- Resume Operations: Restart scheduled ingestion after manual control
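A typical manual-override sequence pauses the source, performs the needed intervention, and then reactivates it so scheduled ingestion resumes:
# Take the source offline (suspends scheduled ingestion)
curl -X PUT https://api.nexla.io/data_sources/5001/pause \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Accept: application/vnd.nexla.api.v1+json"

# ... perform manual work, e.g. re-ingest a corrected file ...

# Resume normal operation
curl -X PUT https://api.nexla.io/data_sources/5001/activate \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Accept: application/vnd.nexla.api.v1+json"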
Monitoring Ingestion
Track the performance and health of your data sources:
Status Monitoring
Monitor source status through:
- API Endpoints: Check current status and health (see the status-check sketch after this list)
- Flow Integration: Monitor impact on data flows
- Performance Metrics: Track ingestion rates and volumes
- Error Logs: Review and resolve ingestion issues
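For a quick status check from the command line, a sketch like the following can be used. It assumes the source-detail endpoint GET /data_sources/{source_id} returns the source record with a status field:
curl -s https://api.nexla.io/data_sources/5001 \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Accept: application/vnd.nexla.api.v1+json" \
  | jq '{id, status}'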
Health Checks
Regular health checks include the following (a combined check is sketched after the list):
- Connection Status: Verify connectivity to source systems
- Credential Validity: Ensure authentication still works
- Configuration Integrity: Validate source settings
- Performance Metrics: Monitor ingestion efficiency
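The connection test and configuration validation endpoints shown earlier can be combined into a simple scripted health check; a minimal sketch, assuming the response shapes documented above:
# Run the connection test and configuration validation, then flag any non-ok status
test_status=$(curl -s -X PUT https://api.nexla.io/data_sources/5001/test \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Accept: application/vnd.nexla.api.v1+json" | jq -r '.status')
validate_status=$(curl -s -X POST https://api.nexla.io/data_sources/5001/config/validate \
  -H "Authorization: Bearer <Access-Token>" \
  -H "Content-Type: application/json" | jq -r '.status')
if [ "$test_status" != "ok" ] || [ "$validate_status" != "ok" ]; then
  echo "Source 5001 health check failed: test=$test_status validate=$validate_status"
fi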
Best Practices
To ensure effective ingestion control:
- Test Before Production: Validate configurations in test environments
- Monitor Performance: Track ingestion rates and error patterns
- Use Validation: Validate configurations before activation
- Plan Scheduling: Design efficient ingestion schedules
- Handle Errors: Implement proper error handling and retry logic
- Document Changes: Keep track of configuration modifications
Error Handling
Common ingestion control errors and solutions:
- Connection Failures: Check network connectivity and credentials
- Configuration Errors: Validate source_config parameters
- Permission Issues: Verify access rights to source systems
- File Access Problems: Ensure file paths and permissions are correct
- Rate Limiting: Handle API rate limits and throttling, for example with a retry-and-backoff loop as sketched below
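For transient failures such as rate limiting, wrapping control calls in a simple retry loop with backoff is usually enough. A minimal sketch (retry counts, backoff intervals, and status handling are illustrative):
# Retry activation up to three times, backing off between attempts
for attempt in 1 2 3; do
  code=$(curl -s -o /dev/null -w "%{http_code}" -X PUT \
    https://api.nexla.io/data_sources/5001/activate \
    -H "Authorization: Bearer <Access-Token>" \
    -H "Accept: application/vnd.nexla.api.v1+json")
  if [ "$code" = "200" ]; then
    echo "Source activated"
    break
  fi
  echo "Attempt $attempt returned HTTP $code; retrying..."
  sleep $((attempt * 10))
done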