Control Ingestion

You control data source ingestion through API endpoints that start, pause, re-ingest, validate, and test a source. These calls take effect immediately and leave the source's schedule and configuration intact.

Activate and Pause Source

Data sources can be activated to start immediate data ingestion or paused to stop ongoing collection. These controls work alongside scheduled ingestion to provide flexible data collection management.

Activate Source

Trigger immediate data ingestion by activating a data source:

Activate Source: Request
PUT /data_sources/{source_id}/activate

Example with curl:

curl -X PUT https://api.nexla.io/data_sources/5001/activate \
-H "Authorization: Bearer <Access-Token>" \
-H "Accept: application/vnd.nexla.api.v1+json"

Pause Source

Stop ongoing data ingestion by pausing a data source:

Pause Source: Request
PUT /data_sources/{source_id}/pause

Example with curl:

curl -X PUT https://api.nexla.io/data_sources/5001/pause \
-H "Authorization: Bearer <Access-Token>" \
-H "Accept: application/vnd.nexla.api.v1+json"
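
The activate and pause calls differ only in the final path segment, so one helper can build both. The following is a minimal Python sketch, not an official client: the function name build_control_request is hypothetical, and the base URL and Accept header are copied from the curl examples above. It builds the request without sending it.

```python
import urllib.request

# Base URL and Accept header are taken from the curl examples above.
BASE_URL = "https://api.nexla.io"
ACCEPT = "application/vnd.nexla.api.v1+json"


def build_control_request(source_id: int, action: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that activates or pauses a source.

    The function name is hypothetical; only the two endpoints are from the docs.
    """
    if action not in ("activate", "pause"):
        raise ValueError(f"unsupported action: {action!r}")
    return urllib.request.Request(
        url=f"{BASE_URL}/data_sources/{source_id}/{action}",
        method="PUT",
        headers={"Authorization": f"Bearer {token}", "Accept": ACCEPT},
    )
```

To send the request, pass the returned object to urllib.request.urlopen. Building and sending are kept separate so the URL and headers can be inspected first.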

Activation Behavior

When you activate or pause a source:

  • Immediate Effect: Changes take effect immediately
  • Scheduled Operations: Scheduled ingestion continues based on configuration
  • Flow Integration: Affects all downstream data flows
  • Status Updates: Source status changes to reflect current state

Re-ingest Files

For file-based data sources, you can trigger re-ingestion of specific files. This is useful for reprocessing files that failed ingestion or for handling updated files.

Re-ingest Endpoint

Re-ingest File: Request
POST /data_sources/{source_id}/file/ingest

Example Request Body:

{
  "file": "daily/customer_data_2023-01-15.csv"
}

File Path Requirements

The file path must:

  • Start from Source Root: Begin with the location specified in source configuration
  • Match File Patterns: Conform to any file pattern filters in source_config
  • Be Accessible: Exist and be readable by the source credentials
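
The first two requirements can be checked locally before calling the endpoint. The sketch below rests on assumptions the page does not confirm: that the source root corresponds to a prefix value such as "daily/", that file patterns are shell-style globs such as "*.csv", and that the pattern applies to the file name rather than the full path. The helper name path_problems is hypothetical. Accessibility can only be verified by the service itself.

```python
import fnmatch
import posixpath
from typing import List


def path_problems(file_path: str, prefix: str, file_pattern: str) -> List[str]:
    """Return reasons the path would likely be rejected (empty list if none found)."""
    problems = []
    # Assumption: the "source root" is the configured prefix.
    if not file_path.startswith(prefix):
        problems.append(f"path does not start with source root {prefix!r}")
    # Assumption: patterns are shell-style globs applied to the file name only.
    if not fnmatch.fnmatchcase(posixpath.basename(file_path), file_pattern):
        problems.append(f"file name does not match pattern {file_pattern!r}")
    return problems
```

For instance, calling path_problems("daily/customer_data_2023-01-15.csv", "daily/", "*.csv") returns an empty list, while a path outside the prefix or with a different extension returns one message per failed check.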

Response Structure

Re-ingest File: Response
{
  "status": "ok",
  "message": "File ingestion initiated",
  "file": "daily/customer_data_2023-01-15.csv",
  "source_id": 5001
}

Validate Source Configuration

Source configuration validation ensures that your data source settings are complete and correct before attempting data ingestion. This helps prevent ingestion failures and configuration errors.

Validation Endpoint

Validate Source Configuration: Request
POST /data_sources/{source_id}/config/validate

Example with curl:

curl -X POST https://api.nexla.io/data_sources/5001/config/validate \
-H "Authorization: Bearer <Access-Token>" \
-H "Content-Type: application/json"

Optional Configuration Override

To validate a different configuration without updating the source, send it as source_config in the request body:

Validate with Custom Config: Request
{
  "source_config": {
    "bucket": "test-bucket",
    "prefix": "test/",
    "file_pattern": "*.csv"
  }
}
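
As a rough illustration, the validation request with and without an override could be built as follows. The function name build_validate_request is hypothetical. The URL and headers mirror the curl example above, and the body shape mirrors the override example. The request is built but not sent.

```python
import json
import urllib.request
from typing import Optional


def build_validate_request(
    source_id: int, token: str, source_config: Optional[dict] = None
) -> urllib.request.Request:
    """Build a POST to the validation endpoint, optionally with an override config."""
    # With no override, send no body so the stored configuration is validated.
    body = None
    if source_config is not None:
        body = json.dumps({"source_config": source_config}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://api.nexla.io/data_sources/{source_id}/config/validate",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing source_config leaves the stored source untouched, which is what makes the override useful for trying out a bucket or prefix change before committing to it.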

Validation Response

The validation endpoint returns one entry per configuration field, each with its value, any errors, a visibility flag, and recommended values:

Validate Source Configuration: Response
{
  "status": "ok",
  "output": [
    {
      "name": "bucket",
      "value": "my-data-bucket",
      "errors": [],
      "visible": true,
      "recommendedValues": []
    },
    {
      "name": "prefix",
      "value": "daily/",
      "errors": [],
      "visible": true,
      "recommendedValues": []
    },
    {
      "name": "file_pattern",
      "value": "*.csv",
      "errors": [],
      "visible": true,
      "recommendedValues": []
    }
  ]
}
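
A response in this shape can be reduced to just the fields that need attention. The sketch below assumes the parsed JSON is already available as a Python dict. The helper name collect_field_errors is hypothetical, and the format of individual error entries is not shown on this page, so the code passes them through unchanged.

```python
from typing import Any, Dict, List


def collect_field_errors(response: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Map each field name to its errors, skipping fields whose errors list is empty."""
    return {
        field["name"]: field["errors"]
        for field in response.get("output", [])
        if field.get("errors")
    }
```

An empty result means no field reported an error. Any other result names exactly which configuration keys to fix.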

Validation Errors

Common validation issues include:

  • Missing Required Fields: Essential configuration parameters not provided
  • Invalid Values: Values outside acceptable ranges or formats
  • Credential Issues: Authentication or access problems
  • Path Problems: Invalid file paths or bucket references

Test Source Connection

Before activating a source, you can test the connection to ensure credentials and configuration are working correctly.

Test Endpoint

Test Source Connection: Request
PUT /data_sources/{source_id}/test

Example with curl:

curl -X PUT https://api.nexla.io/data_sources/5001/test \
-H "Authorization: Bearer <Access-Token>" \
-H "Accept: application/vnd.nexla.api.v1+json"

Test Response

Test Source Connection: Response
{
  "status": "ok",
  "message": "Connection test successful",
  "details": {
    "connection": "established",
    "authentication": "verified",
    "access": "confirmed"
  }
}
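
One way to act on this response is to compare each entry in details against the success values shown above. This is a sketch under an assumption: the page documents only the successful response, so the code treats any other value, or a missing key, as a failed check. The helper name failed_checks is hypothetical.

```python
from typing import Any, Dict, List

# Success values copied from the example response above.
EXPECTED_DETAILS = {
    "connection": "established",
    "authentication": "verified",
    "access": "confirmed",
}


def failed_checks(response: Dict[str, Any]) -> List[str]:
    """Return the names of checks that did not report the documented success value."""
    details = response.get("details", {})
    failed = [
        name for name, expected in EXPECTED_DETAILS.items()
        if details.get(name) != expected
    ]
    # If the overall status is not "ok" but no detail explains why, flag the status.
    if response.get("status") != "ok" and not failed:
        failed.append("status")
    return failed
```

An empty list suggests the source is ready to activate. A non-empty list indicates which stage of the connection to investigate first.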

Ingestion Scheduling

Data sources support various scheduling options for automated data collection:

Cron-based Scheduling

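Configure automatic ingestion using cron expressions. Read as a standard five-field cron expression, the example below runs at minute 0 of every sixth hour (00:00, 06:00, 12:00, and 18:00) in the configured timezone: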

{
  "source_config": {
    "poll_schedule": "0 */6 * * *",
    "timezone": "UTC"
  }
}
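
To check what a schedule means before saving it, the minute and hour fields can be expanded by hand. The sketch below assumes poll_schedule uses standard five-field cron syntax (minute, hour, day of month, month, day of week), which this page does not state explicitly. It handles only "*", numbers, ranges, lists, and "/" steps, and it ignores the three date fields. The function names are hypothetical.

```python
from typing import List


def expand_field(field: str, lo: int, hi: int) -> List[int]:
    """Expand one cron field (e.g. "*/6", "1-5", "0,30") into sorted values."""
    values = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        step_n = int(step) if step else 1
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            first, last = base.split("-")
            start, end = int(first), int(last)
        else:
            start = int(base)
            # "n/step" runs from n to the field maximum; a bare "n" is just n.
            end = hi if step else start
        values.update(range(start, end + 1, step_n))
    return sorted(values)


def daily_run_times(expression: str) -> List[str]:
    """List HH:MM run times for one day, ignoring the day and month fields."""
    minute, hour, _dom, _month, _dow = expression.split()
    return [
        f"{h:02d}:{m:02d}"
        for h in expand_field(hour, 0, 23)
        for m in expand_field(minute, 0, 59)
    ]
```

For the schedule above, daily_run_times("0 */6 * * *") yields 00:00, 06:00, 12:00, and 18:00.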

Event-driven Scheduling

Trigger ingestion based on external events:

{
  "source_config": {
    "trigger_type": "webhook",
    "webhook_url": "https://api.example.com/trigger"
  }
}

Manual Control

Override scheduled operations with manual activation:

  • Immediate Ingestion: Activate source for immediate data collection
  • Scheduled Override: Pause scheduled operations temporarily
  • Resume Operations: Restart scheduled ingestion after manual control

Monitoring Ingestion

Track the performance and health of your data sources:

Status Monitoring

Monitor source status through:

  • API Endpoints: Check current status and health
  • Flow Integration: Monitor impact on data flows
  • Performance Metrics: Track ingestion rates and volumes
  • Error Logs: Review and resolve ingestion issues

Health Checks

Regular health checks include:

  • Connection Status: Verify connectivity to source systems
  • Credential Validity: Ensure authentication still works
  • Configuration Integrity: Validate source settings
  • Performance Metrics: Monitor ingestion efficiency

Best Practices

To ensure effective ingestion control:

  1. Test Before Production: Validate configurations in test environments
  2. Monitor Performance: Track ingestion rates and error patterns
  3. Use Validation: Validate configurations before activation
  4. Plan Scheduling: Design efficient ingestion schedules
  5. Handle Errors: Implement proper error handling and retry logic
  6. Document Changes: Keep track of configuration modifications
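
Practices 1 and 3 can be combined into a single gate: validate, then test, then activate, stopping at the first failure. The sketch below is one possible arrangement, not a prescribed workflow. The name validate_test_activate is hypothetical, and call stands in for whatever function sends an authenticated request and returns the parsed JSON response.

```python
from typing import Any, Callable, Dict

# Hypothetical transport signature: call(method, path) -> parsed JSON response.
Call = Callable[[str, str], Dict[str, Any]]


def validate_test_activate(source_id: int, call: Call) -> str:
    """Activate a source only if validation and the connection test both pass."""
    validation = call("POST", f"/data_sources/{source_id}/config/validate")
    if any(field.get("errors") for field in validation.get("output", [])):
        return "validation_failed"

    test = call("PUT", f"/data_sources/{source_id}/test")
    if test.get("status") != "ok":
        return "test_failed"

    call("PUT", f"/data_sources/{source_id}/activate")
    return "activated"
```

Injecting the transport keeps the sequence easy to exercise against a stub before pointing it at a live source.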

Error Handling

Common ingestion control errors and solutions:

  • Connection Failures: Check network connectivity and credentials
  • Configuration Errors: Validate source_config parameters
  • Permission Issues: Verify access rights to source systems
  • File Access Problems: Ensure file paths and permissions are correct
  • Rate Limiting: Handle API rate limits and throttling
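
For rate limiting in particular, a common approach is to retry with exponential backoff. The sketch below assumes the API signals throttling with HTTP status 429, which this page does not confirm. The name call_with_backoff and the send callable (returning a status code and a parsed body) are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple

Response = Tuple[int, Dict[str, Any]]


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry send() while it returns 429, doubling the wait after each attempt."""
    status, body = send()
    for attempt in range(1, max_attempts):
        # Assumption: 429 is the throttling status; anything else is final.
        if status != 429:
            break
        sleep(base_delay * 2 ** (attempt - 1))
        status, body = send()
    return status, body
```

With the defaults, the waits between attempts are 1, 2, and 4 seconds. The last response is returned even if it is still throttled, so the caller decides whether to give up or escalate.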