Errors

Nexla's error-handling capabilities help you identify, analyze, and resolve data processing issues, preserving data quality and system reliability across your data pipelines.

Error Handling Overview

When Nexla encounters issues processing data records, the system generates error notifications and quarantines problematic records, enabling you to review, analyze, and resolve issues while maintaining data pipeline integrity.

Error Quarantine System

The error quarantine system automatically isolates problematic records for analysis and resolution.

Quarantine Overview

When data processing fails, Nexla automatically quarantines error records to prevent data loss and enable systematic issue resolution. This system provides visibility into processing failures and supports data quality improvement workflows.

Error Record Structure

Understanding the structure of quarantined error records helps you effectively analyze and resolve issues.

Error Record Components

Each quarantined error record contains comprehensive information:

  • nexlaMetaData: Complete metadata about the failed record, including source information, ingestion details, and processing context
  • error: Detailed error information explaining why the record failed to process
  • rawMessage: The original data record that failed processing, enabling you to examine the problematic data

Metadata Information

The nexlaMetaData field provides essential context for error analysis:

  • Source Information: Details about where the data originated
  • Processing Context: Information about the processing pipeline and stage
  • Timing Data: When the record was ingested and processed
  • Resource References: Links to related resources and configurations

Error Details

The error field contains comprehensive error information:

  • Error Message: Human-readable description of the failure
  • Error Type: Classification of the error for systematic resolution
  • Technical Details: Specific technical information for debugging
  • Resolution Guidance: Suggestions for resolving the issue

Fetch Error Samples

Retrieve quarantined error records to analyze and resolve processing issues.

Error Samples Endpoint

To fetch error samples for a specific resource:

POST /{resource_type}/{resource_id}/probe/quarantine/sample

Fetch Error Samples: Request

POST /data_sources/1002/probe/quarantine/sample

{
  "page": 1,
  "per_page": 10,
  "start_time": 1640995200000,
  "end_time": 1641081600000
}
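The request above can be sketched as a small Python client. This is a minimal illustration, not an official SDK: the base URL, token handling, and helper names are assumptions; the path and body fields follow the example above.

```python
import requests


def quarantine_sample_request(resource_type, resource_id, page=1, per_page=10,
                              start_time=None, end_time=None):
    """Build the path and JSON body for a quarantine-sample call."""
    path = f"/{resource_type}/{resource_id}/probe/quarantine/sample"
    body = {"page": page, "per_page": per_page}
    if start_time is not None:
        body["start_time"] = start_time
    if end_time is not None:
        body["end_time"] = end_time
    return path, body


def fetch_error_samples(base_url, token, resource_type, resource_id, **kwargs):
    """POST the request and return the quarantined records from the response."""
    path, body = quarantine_sample_request(resource_type, resource_id, **kwargs)
    resp = requests.post(base_url + path, json=body,
                         headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()
    # Per the response example below, records live under output.data.
    return resp.json()["output"]["data"]
```

Here `base_url` and the bearer-token header are placeholders; substitute whatever host and authentication scheme your Nexla environment uses.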

Error Samples Response

The response provides detailed information about quarantined error records:

Fetch Error Samples: Response

{
  "status": 200,
  "message": "Ok",
  "output": {
    "data": [
      {
        "nexlaMetaData": {
          "sourceType": "s3",
          "ingestTime": 1640995200000,
          "sourceOffset": 43808,
          "sourceKey": "customer-data-bucket/transactions_2022.csv",
          "bucket": "customer-data-bucket",
          "topic": "dataset-3001-source-1002",
          "resourceType": "SOURCE",
          "resourceId": 1002,
          "nexlaUUID": null,
          "trackerId": {
            "source": {
              "id": 1002,
              "source": "transactions_2022.csv",
              "offset": 43808,
              "recordNumber": "43808",
              "version": 1,
              "initialIngestTimestamp": 1640995200000
            },
            "sets": [],
            "sink": null
          },
          "eof": false,
          "lastModified": null,
          "runId": 1640995200000,
          "tags": null
        },
        "error": {
          "message": "Schema validation failed: expected 6 columns, found 7",
          "errorType": "SCHEMA_VALIDATION",
          "details": "Column count mismatch in CSV processing",
          "lineNumber": 43808,
          "columnNumber": 128
        },
        "rawMessage": {
          "data": "2022-01-01,1001,John Smith,Product A,25.99,2,51.98,5.20"
        }
      }
    ],
    "meta": {
      "currentPage": 1,
      "totalCount": 1,
      "pageCount": 1
    }
  }
}

Error Sample Parameters

Configure error sample retrieval to focus on specific issues and time periods.

Pagination Parameters

Control the volume of error samples returned:

  • page: Page number for paginated results (default: 1)
  • per_page: Number of samples per page (default: 10, max: 100)

Time Range Parameters

Focus on errors from specific time periods:

  • start_time: Beginning of the time range (Unix epoch timestamp in milliseconds, as in the request example above)
  • end_time: End of the time range (Unix epoch timestamp in milliseconds)
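Since the timestamps in the examples are epoch milliseconds, a small conversion helper avoids off-by-1000 mistakes. A sketch (the helper name is ours, not part of the API):

```python
from datetime import datetime, timezone


def to_epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to Unix epoch milliseconds."""
    return int(dt.timestamp() * 1000)


# The window used in the request example above: 2022-01-01 to 2022-01-02 UTC.
start_time = to_epoch_ms(datetime(2022, 1, 1, tzinfo=timezone.utc))  # 1640995200000
end_time = to_epoch_ms(datetime(2022, 1, 2, tzinfo=timezone.utc))    # 1641081600000
```

Always pass timezone-aware datetimes; naive datetimes are interpreted in the local timezone, which silently shifts the window.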

Resource-Specific Error Retrieval

Fetch error samples for different resource types to address specific processing issues.

Data Source Errors

Retrieve errors from data ingestion processes:

POST /data_sources/{data_source_id}/probe/quarantine/sample

Data Set Errors

Retrieve errors from data transformation processes:

POST /data_sets/{data_set_id}/probe/quarantine/sample

Data Sink Errors

Retrieve errors from data output processes:

POST /data_sinks/{data_sink_id}/probe/quarantine/sample
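Because the three endpoints differ only in their resource segment, retrieval logic can be shared across pipeline stages. A hypothetical mapping (the short keys "source", "dataset", and "sink" are our naming, chosen to match the CLI subcommands below):

```python
# Quarantine-sample paths for each resource type, per the endpoints above.
QUARANTINE_PATHS = {
    "source": "/data_sources/{id}/probe/quarantine/sample",
    "dataset": "/data_sets/{id}/probe/quarantine/sample",
    "sink": "/data_sinks/{id}/probe/quarantine/sample",
}


def quarantine_path(kind: str, resource_id: int) -> str:
    """Return the quarantine-sample path for a source, dataset, or sink."""
    return QUARANTINE_PATHS[kind].format(id=resource_id)
```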

CLI Error Retrieval

Use the Nexla CLI to retrieve error samples for efficient command-line analysis.

CLI Command Structure

Basic CLI command for retrieving error samples:

nexla {resource_type} sample-quarantine {resource_id} [options]

CLI Options

Available CLI options for error sample retrieval:

  • -c, --count: Number of samples to display
  • --start-time: Beginning of time range
  • --end-time: End of time range
  • --format: Output format (json, table, csv)

CLI Examples

Common CLI usage patterns for error analysis:

CLI Error Sample Retrieval: Examples
# Get single error sample from data source
nexla source sample-quarantine 1002 --count 1

# Get multiple error samples from data set
nexla dataset sample-quarantine 3001 --count 5

# Get error samples from data sink
nexla sink sample-quarantine 4001 --count 10

Error Analysis and Resolution

Effectively analyze error samples to identify root causes and implement solutions.

Error Pattern Analysis

Identify common patterns in error records:

  • Schema Issues: Consistent schema validation failures
  • Data Quality: Recurring data format or content problems
  • Processing Errors: Systematic processing failures
  • Resource Issues: Infrastructure or configuration problems
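One simple way to surface these patterns is to tally quarantined samples by their errorType field (shown in the response example earlier). A sketch; the PARSE_ERROR value in the sample data is hypothetical:

```python
from collections import Counter


def summarize_error_patterns(samples):
    """Count quarantined records by errorType, most frequent first."""
    return Counter(s["error"]["errorType"] for s in samples)


# Toy sample list shaped like output.data in the fetch response.
samples = [
    {"error": {"errorType": "SCHEMA_VALIDATION"}},
    {"error": {"errorType": "SCHEMA_VALIDATION"}},
    {"error": {"errorType": "PARSE_ERROR"}},  # hypothetical error type
]

patterns = summarize_error_patterns(samples)
```

A dominant errorType usually points at a single root cause (for instance, one malformed source file) rather than many independent failures.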

Root Cause Identification

Determine the underlying causes of processing failures:

  • Data Source Issues: Problems with incoming data quality or format
  • Configuration Problems: Incorrect resource configuration settings
  • Schema Mismatches: Incompatible data structure definitions
  • Resource Constraints: Insufficient capacity or permissions

Resolution Strategies

Implement appropriate solutions based on error analysis:

  • Data Correction: Fix data quality issues at the source
  • Configuration Updates: Adjust resource settings and parameters
  • Schema Updates: Modify data structure definitions
  • Resource Provisioning: Increase capacity or fix permissions

Error Prevention

Implement proactive measures to prevent future processing failures.

Data Quality Monitoring

Monitor data quality to prevent processing issues:

  • Validation Rules: Implement comprehensive data validation
  • Quality Checks: Regular assessment of data quality metrics
  • Source Monitoring: Track data source health and reliability
  • Proactive Alerts: Early warning of potential quality issues

Configuration Management

Maintain robust resource configuration:

  • Validation Testing: Test configurations before deployment
  • Change Management: Controlled configuration updates
  • Documentation: Clear configuration documentation
  • Version Control: Track configuration changes over time

Error Handling Best Practices

To effectively handle errors in your Nexla platform:

  1. Regular Monitoring: Monitor error rates and patterns regularly
  2. Systematic Analysis: Analyze errors systematically to identify root causes
  3. Proactive Resolution: Address issues before they impact data quality
  4. Documentation: Maintain clear documentation of error patterns and resolutions
  5. Continuous Improvement: Use error analysis to improve data processing workflows

Error Workflow Management

Implement structured workflows for error handling and resolution.

Error Detection Workflow

Standard workflow for detecting and responding to errors:

  1. Error Generation: System detects processing failure and quarantines record
  2. Notification: Error notification is generated and sent to users
  3. Sample Retrieval: Users retrieve error samples for analysis
  4. Root Cause Analysis: Analyze errors to identify underlying causes
  5. Resolution Implementation: Implement fixes and configuration updates
  6. Verification: Verify that errors are resolved and data processing succeeds

Team Coordination

Coordinate error resolution across team members:

  • Error Assignment: Assign errors to appropriate team members
  • Progress Tracking: Track resolution progress and status
  • Knowledge Sharing: Share error patterns and solutions across team
  • Escalation Procedures: Escalate complex issues to senior team members

Error Handling Integration

Integrate error handling with your broader monitoring and operations systems.

Monitoring Integration

Connect error handling to monitoring systems:

  • Error Metrics: Track error rates and patterns over time
  • Alert Integration: Integrate error notifications with alerting systems
  • Dashboard Display: Show error metrics in operational dashboards
  • Reporting: Generate error reports for operational reviews

Workflow Integration

Integrate error handling with business processes:

  • Issue Tracking: Connect errors to issue tracking systems
  • Change Management: Integrate with change management processes
  • Compliance Reporting: Include error handling in compliance reports
  • Performance Metrics: Track error resolution performance

Error Handling Metrics

Track key metrics to measure error handling effectiveness.

Error Volume Metrics

Monitor the volume and frequency of errors:

  • Error Rate: Percentage of records that fail processing
  • Error Frequency: Number of errors per time period
  • Error Distribution: Distribution of errors across resources and time
  • Trend Analysis: Changes in error patterns over time
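The headline metric, error rate, is straightforward to compute from processed and quarantined counts. A minimal sketch (the counts themselves would come from your metrics endpoints or dashboards):

```python
def error_rate(records_processed: int, records_quarantined: int) -> float:
    """Percentage of records that failed processing and were quarantined."""
    if records_processed == 0:
        return 0.0  # avoid division by zero on idle resources
    return 100.0 * records_quarantined / records_processed
```

Tracking this per resource and per time window makes trend analysis concrete: a rising rate on one source flags a regression there rather than a platform-wide issue.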

Resolution Metrics

Track error resolution performance:

  • Resolution Time: Time from error detection to resolution
  • Resolution Rate: Percentage of errors successfully resolved
  • Recurrence Rate: Frequency of similar errors recurring
  • Prevention Effectiveness: Reduction in error rates over time

Error Handling Challenges

Common challenges in error handling and their solutions.

Data Volume Challenges

Managing errors in high-volume data processing:

  • Sample Management: Use sampling to manage large error volumes
  • Prioritization: Prioritize errors based on impact and frequency
  • Automation: Automate common error resolution tasks
  • Resource Allocation: Allocate appropriate resources for error handling

Complexity Challenges

Addressing complex error scenarios:

  • Root Cause Analysis: Systematic approach to identifying root causes
  • Expert Consultation: Consult with subject matter experts for complex issues
  • Documentation: Maintain comprehensive error documentation
  • Knowledge Management: Build and maintain error resolution knowledge base

Error Handling Tools

Use appropriate tools and utilities for effective error management.

Analysis Tools

Tools for analyzing error patterns and causes:

  • Error Aggregation: Group similar errors for pattern analysis
  • Statistical Analysis: Statistical analysis of error distributions
  • Visualization: Visual representation of error patterns and trends
  • Reporting: Automated error reporting and analysis

Resolution Tools

Tools for implementing error resolutions:

  • Configuration Management: Tools for updating resource configurations
  • Data Correction: Utilities for fixing data quality issues
  • Testing Tools: Tools for testing resolution effectiveness
  • Deployment Tools: Tools for deploying configuration changes

After handling errors, you may need to:

Monitor Error Rates

GET /metrics/errors
GET /metrics/{resource_type}/{resource_id}/errors

Update Configurations

PUT /{resource_type}/{resource_id}
PUT /{resource_type}/{resource_id}/config

Test Resolutions

POST /{resource_type}/{resource_id}/test
POST /{resource_type}/{resource_id}/validate

View Error History

GET /{resource_type}/{resource_id}/errors/history
GET /{resource_type}/{resource_id}/errors/trends