Errors
Nexla's error handling provides comprehensive capabilities for identifying, analyzing, and resolving data processing issues, helping you maintain data quality and system reliability across your data pipelines.
Error Handling Overview
When Nexla encounters issues processing data records, the system generates error notifications and quarantines problematic records, enabling you to review, analyze, and resolve issues while maintaining data pipeline integrity.
Error Quarantine System
The error quarantine system automatically isolates problematic records for analysis and resolution.
Quarantine Overview
When data processing fails, Nexla automatically quarantines error records to prevent data loss and enable systematic issue resolution. This system provides visibility into processing failures and supports data quality improvement workflows.
Error Record Structure
Understanding the structure of quarantined error records helps you effectively analyze and resolve issues.
Error Record Components
Each quarantined error record contains comprehensive information:
- nexlaMetaData: Complete metadata about the failed record, including source information, ingestion details, and processing context
- error: Detailed error information explaining why the record failed to process
- rawMessage: The original data record that failed processing, enabling you to examine the problematic data
Metadata Information
The nexlaMetaData field provides essential context for error analysis:
- Source Information: Details about where the data originated
- Processing Context: Information about the processing pipeline and stage
- Timing Data: When the record was ingested and processed
- Resource References: Links to related resources and configurations
Error Details
The error field contains comprehensive error information:
- Error Message: Human-readable description of the failure
- Error Type: Classification of the error for systematic resolution
- Technical Details: Specific technical information for debugging
- Resolution Guidance: Suggestions for resolving the issue
Fetch Error Samples
Retrieve quarantined error records to analyze and resolve processing issues.
Error Samples Endpoint
To fetch error samples for a specific resource:
POST /{resource_type}/{resource_id}/probe/quarantine/sample
POST /data_sources/1002/probe/quarantine/sample
{
  "page": 1,
  "per_page": 10,
  "start_time": 1640995200000,
  "end_time": 1641081600000
}
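The request body above can be assembled programmatically. The sketch below is a hypothetical helper (not part of any official Nexla SDK) that converts aware datetimes into the Unix epoch-millisecond timestamps the API expects, matching the values in the example request:

```python
from datetime import datetime, timezone

def epoch_ms(dt: datetime) -> int:
    """Convert an aware datetime to a Unix epoch timestamp in milliseconds."""
    return int(dt.timestamp() * 1000)

def build_sample_request(page=1, per_page=10, start=None, end=None):
    """Assemble the JSON body for POST .../probe/quarantine/sample.

    Field names follow the example request shown above; this helper
    is illustrative, not an official client.
    """
    body = {"page": page, "per_page": per_page}
    if start is not None:
        body["start_time"] = epoch_ms(start)
    if end is not None:
        body["end_time"] = epoch_ms(end)
    return body

body = build_sample_request(
    start=datetime(2022, 1, 1, tzinfo=timezone.utc),
    end=datetime(2022, 1, 2, tzinfo=timezone.utc),
)
print(body)
```

Passing aware datetimes (with an explicit timezone) avoids off-by-hours mistakes when the machine's local timezone differs from UTC.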
Error Samples Response
The response provides detailed information about quarantined error records:
{
  "status": 200,
  "message": "Ok",
  "output": {
    "data": [
      {
        "nexlaMetaData": {
          "sourceType": "s3",
          "ingestTime": 1640995200000,
          "sourceOffset": 43808,
          "sourceKey": "customer-data-bucket/transactions_2022.csv",
          "bucket": "customer-data-bucket",
          "topic": "dataset-3001-source-1002",
          "resourceType": "SOURCE",
          "resourceId": 1002,
          "nexlaUUID": null,
          "trackerId": {
            "source": {
              "id": 1002,
              "source": "transactions_2022.csv",
              "offset": 43808,
              "recordNumber": "43808",
              "version": 1,
              "initialIngestTimestamp": 1640995200000
            },
            "sets": [],
            "sink": null
          },
          "eof": false,
          "lastModified": null,
          "runId": 1640995200000,
          "tags": null
        },
        "error": {
          "message": "Schema validation failed: expected 6 columns, found 7",
          "errorType": "SCHEMA_VALIDATION",
          "details": "Column count mismatch in CSV processing",
          "lineNumber": 43808,
          "columnNumber": 128
        },
        "rawMessage": {
          "data": "2022-01-01,1001,John Smith,Product A,25.99,2,51.98,5.20"
        }
      }
    ],
    "meta": {
      "currentPage": 1,
      "totalCount": 1,
      "pageCount": 1
    }
  }
}
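Once the response is parsed, the `output.data` array can be walked to summarize each quarantined record. This is a minimal sketch over the response structure shown above; in practice the `response` dict would come from deserializing the API response body rather than being defined inline:

```python
# Abbreviated sample payload mirroring the example response structure.
response = {
    "status": 200,
    "output": {
        "data": [
            {
                "nexlaMetaData": {
                    "sourceKey": "customer-data-bucket/transactions_2022.csv",
                    "resourceId": 1002,
                },
                "error": {
                    "message": "Schema validation failed: expected 6 columns, found 7",
                    "errorType": "SCHEMA_VALIDATION",
                    "lineNumber": 43808,
                },
                "rawMessage": {
                    "data": "2022-01-01,1001,John Smith,Product A,25.99,2,51.98,5.20"
                },
            }
        ],
        "meta": {"totalCount": 1},
    },
}

def summarize_errors(payload):
    """Yield (source, line, error type, message) for each quarantined record."""
    for record in payload["output"]["data"]:
        meta, err = record["nexlaMetaData"], record["error"]
        yield (meta["sourceKey"], err.get("lineNumber"), err["errorType"], err["message"])

for source, line, etype, msg in summarize_errors(response):
    print(f"{source}:{line} [{etype}] {msg}")
```

Using `err.get("lineNumber")` keeps the summary robust for error types that carry no line information.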
Error Sample Parameters
Configure error sample retrieval to focus on specific issues and time periods.
Pagination Parameters
Control the volume of error samples returned:
- page: Page number for paginated results (default: 1)
- per_page: Number of samples per page (default: 10, max: 100)
Time Range Parameters
Focus on errors from specific time periods:
- start_time: Beginning of the time range (Unix timestamp in milliseconds)
- end_time: End of the time range (Unix timestamp in milliseconds)
Resource-Specific Error Retrieval
Fetch error samples for different resource types to address specific processing issues.
Data Source Errors
Retrieve errors from data ingestion processes:
POST /data_sources/{data_source_id}/probe/quarantine/sample
Data Set Errors
Retrieve errors from data transformation processes:
POST /data_sets/{data_set_id}/probe/quarantine/sample
Data Sink Errors
Retrieve errors from data output processes:
POST /data_sinks/{data_sink_id}/probe/quarantine/sample
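The three endpoints above share one path pattern that varies only by resource type and ID. A small illustrative helper (an assumption for this sketch, not an official client function) can build the path and reject unsupported types early:

```python
# Path pattern shared by the three quarantine-sample endpoints above.
QUARANTINE_PATH = "/{resource_type}/{resource_id}/probe/quarantine/sample"

def quarantine_path(resource_type: str, resource_id: int) -> str:
    """Build the quarantine-sample path for a supported resource type."""
    allowed = {"data_sources", "data_sets", "data_sinks"}
    if resource_type not in allowed:
        raise ValueError(f"unsupported resource type: {resource_type}")
    return QUARANTINE_PATH.format(
        resource_type=resource_type, resource_id=resource_id
    )

print(quarantine_path("data_sources", 1002))
```

Validating the resource type before issuing the request turns a confusing 404 into an immediate, descriptive error.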
CLI Error Retrieval
Use the Nexla CLI to retrieve error samples for efficient command-line analysis.
CLI Command Structure
Basic CLI command for retrieving error samples:
nexla {resource_type} sample-quarantine {resource_id} [options]
CLI Options
Available CLI options for error sample retrieval:
- -c, --count: Number of samples to display
- --start-time: Beginning of time range
- --end-time: End of time range
- --format: Output format (json, table, csv)
CLI Examples
Common CLI usage patterns for error analysis:
# Get single error sample from data source
nexla source sample-quarantine 1002 --count 1
# Get multiple error samples from data set
nexla dataset sample-quarantine 3001 --count 5
# Get error samples from data sink
nexla sink sample-quarantine 4001 --count 10
Error Analysis and Resolution
Effectively analyze error samples to identify root causes and implement solutions.
Error Pattern Analysis
Identify common patterns in error records:
- Schema Issues: Consistent schema validation failures
- Data Quality: Recurring data format or content problems
- Processing Errors: Systematic processing failures
- Resource Issues: Infrastructure or configuration problems
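A simple way to surface these patterns is to group fetched error samples by their errorType classification. The sketch below uses made-up sample records shaped like the quarantined error structure described earlier:

```python
from collections import Counter

# Made-up error samples shaped like the quarantined "error" field.
errors = [
    {"errorType": "SCHEMA_VALIDATION", "message": "expected 6 columns, found 7"},
    {"errorType": "SCHEMA_VALIDATION", "message": "expected 6 columns, found 8"},
    {"errorType": "TYPE_MISMATCH", "message": "non-numeric value in amount column"},
]

# Count occurrences of each error type, most frequent first.
pattern_counts = Counter(e["errorType"] for e in errors)
for error_type, count in pattern_counts.most_common():
    print(f"{error_type}: {count}")
```

Recurring types with high counts (here, SCHEMA_VALIDATION) are usually the best candidates for a single upstream fix.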
Root Cause Identification
Determine the underlying causes of processing failures:
- Data Source Issues: Problems with incoming data quality or format
- Configuration Problems: Incorrect resource configuration settings
- Schema Mismatches: Incompatible data structure definitions
- Resource Constraints: Insufficient capacity or permissions
Resolution Strategies
Implement appropriate solutions based on error analysis:
- Data Correction: Fix data quality issues at the source
- Configuration Updates: Adjust resource settings and parameters
- Schema Updates: Modify data structure definitions
- Resource Provisioning: Increase capacity or fix permissions
Error Prevention
Implement proactive measures to prevent future processing failures.
Data Quality Monitoring
Monitor data quality to prevent processing issues:
- Validation Rules: Implement comprehensive data validation
- Quality Checks: Regular assessment of data quality metrics
- Source Monitoring: Track data source health and reliability
- Proactive Alerts: Early warning of potential quality issues
Configuration Management
Maintain robust resource configuration:
- Validation Testing: Test configurations before deployment
- Change Management: Controlled configuration updates
- Documentation: Clear configuration documentation
- Version Control: Track configuration changes over time
Error Handling Best Practices
To handle errors effectively in Nexla:
- Regular Monitoring: Monitor error rates and patterns regularly
- Systematic Analysis: Analyze errors systematically to identify root causes
- Proactive Resolution: Address issues before they impact data quality
- Documentation: Maintain clear documentation of error patterns and resolutions
- Continuous Improvement: Use error analysis to improve data processing workflows
Error Workflow Management
Implement structured workflows for error handling and resolution.
Error Detection Workflow
Standard workflow for detecting and responding to errors:
- Error Generation: System detects processing failure and quarantines record
- Notification: Error notification is generated and sent to users
- Sample Retrieval: Users retrieve error samples for analysis
- Root Cause Analysis: Analyze errors to identify underlying causes
- Resolution Implementation: Implement fixes and configuration updates
- Verification: Verify that errors are resolved and data processing succeeds
Team Coordination
Coordinate error resolution across team members:
- Error Assignment: Assign errors to appropriate team members
- Progress Tracking: Track resolution progress and status
- Knowledge Sharing: Share error patterns and solutions across team
- Escalation Procedures: Escalate complex issues to senior team members
Error Handling Integration
Integrate error handling with your broader monitoring and operations systems.
Monitoring Integration
Connect error handling to monitoring systems:
- Error Metrics: Track error rates and patterns over time
- Alert Integration: Integrate error notifications with alerting systems
- Dashboard Display: Show error metrics in operational dashboards
- Reporting: Generate error reports for operational reviews
Workflow Integration
Integrate error handling with business processes:
- Issue Tracking: Connect errors to issue tracking systems
- Change Management: Integrate with change management processes
- Compliance Reporting: Include error handling in compliance reports
- Performance Metrics: Track error resolution performance
Error Handling Metrics
Track key metrics to measure error handling effectiveness.
Error Volume Metrics
Monitor the volume and frequency of errors:
- Error Rate: Percentage of records that fail processing
- Error Frequency: Number of errors per time period
- Error Distribution: Distribution of errors across resources and time
- Trend Analysis: Changes in error patterns over time
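The core volume metric, error rate, is straightforward arithmetic over processed and failed record counts. The counts below are made-up sample numbers for illustration:

```python
# Made-up counts for a reporting window.
records_processed = 250_000
records_failed = 1_250

# Error rate: percentage of records that failed processing.
error_rate = records_failed / records_processed * 100
print(f"Error rate: {error_rate:.2f}%")
```

Tracking this percentage per resource and per time window makes distribution and trend analysis a matter of comparing the same number across slices.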
Resolution Metrics
Track error resolution performance:
- Resolution Time: Time from error detection to resolution
- Resolution Rate: Percentage of errors successfully resolved
- Recurrence Rate: Frequency of similar errors recurring
- Prevention Effectiveness: Reduction in error rates over time
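Resolution time and resolution rate can be derived from detection and resolution timestamps. This sketch assumes hypothetical incident records with epoch-millisecond timestamps (matching the API's timestamp convention); open incidents carry a resolved value of None:

```python
# Hypothetical incidents: Unix epoch milliseconds, None = still open.
incidents = [
    {"detected": 1640995200000, "resolved": 1640998800000},  # resolved in 1 h
    {"detected": 1641000000000, "resolved": 1641007200000},  # resolved in 2 h
    {"detected": 1641081600000, "resolved": None},           # still open
]

resolved = [i for i in incidents if i["resolved"] is not None]

# Resolution rate: share of incidents that were closed.
resolution_rate = len(resolved) / len(incidents) * 100

# Mean resolution time in hours (3,600,000 ms per hour).
mean_resolution_hours = sum(
    i["resolved"] - i["detected"] for i in resolved
) / len(resolved) / 3_600_000

print(f"Resolution rate: {resolution_rate:.0f}%")
print(f"Mean resolution time: {mean_resolution_hours:.1f} h")
```

Computing both numbers from the same incident list keeps the rate and the time metric consistent with each other.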
Error Handling Challenges
Common challenges in error handling and their solutions.
Data Volume Challenges
Managing errors in high-volume data processing:
- Sample Management: Use sampling to manage large error volumes
- Prioritization: Prioritize errors based on impact and frequency
- Automation: Automate common error resolution tasks
- Resource Allocation: Allocate appropriate resources for error handling
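One simple way to prioritize in high-volume scenarios is to rank error groups by frequency times impact. The scoring scheme, weights, and sample data below are assumptions for illustration, not a Nexla feature:

```python
# Made-up error groups with an assumed 1-5 impact weight per type.
error_groups = [
    {"errorType": "SCHEMA_VALIDATION", "count": 1250, "impact": 3},
    {"errorType": "TYPE_MISMATCH", "count": 40, "impact": 5},
    {"errorType": "ENCODING", "count": 3, "impact": 1},
]

# Rank by frequency x impact so the highest-value fixes surface first.
ranked = sorted(
    error_groups, key=lambda g: g["count"] * g["impact"], reverse=True
)
for g in ranked:
    print(g["errorType"], g["count"] * g["impact"])
```

Even a crude score like this keeps a team working on the fixes that eliminate the most quarantined records per unit of effort.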
Complexity Challenges
Addressing complex error scenarios:
- Root Cause Analysis: Systematic approach to identifying root causes
- Expert Consultation: Consult with subject matter experts for complex issues
- Documentation: Maintain comprehensive error documentation
- Knowledge Management: Build and maintain error resolution knowledge base
Error Handling Tools
Use appropriate tools and utilities for effective error management.
Analysis Tools
Tools for analyzing error patterns and causes:
- Error Aggregation: Group similar errors for pattern analysis
- Statistical Analysis: Statistical analysis of error distributions
- Visualization: Visual representation of error patterns and trends
- Reporting: Automated error reporting and analysis
Resolution Tools
Tools for implementing error resolutions:
- Configuration Management: Tools for updating resource configurations
- Data Correction: Utilities for fixing data quality issues
- Testing Tools: Tools for testing resolution effectiveness
- Deployment Tools: Tools for deploying configuration changes
Related Operations
After handling errors, you may need to:
Monitor Error Rates
GET /metrics/errors
GET /metrics/{resource_type}/{resource_id}/errors
Update Configurations
PUT /{resource_type}/{resource_id}
PUT /{resource_type}/{resource_id}/config
Test Resolutions
POST /{resource_type}/{resource_id}/test
POST /{resource_type}/{resource_id}/validate
View Error History
GET /{resource_type}/{resource_id}/errors/history
GET /{resource_type}/{resource_id}/errors/trends