Sample Data
Nexla lets you retrieve representative data samples from any Nexset so that you can validate transformations, verify data quality, and see how your data processing workflows behave.
Sample Data Overview
Sample data provides immediate visibility into your data processing results, allowing you to compare input and output data, verify transformation logic, and ensure data quality without processing entire datasets.
Core Sample Data Capabilities
The sample data system provides several key capabilities for data validation and analysis.
Data Validation
Validate your data processing workflows:
- Transformation Verification: Compare input and output data to verify transforms
- Data Quality Assessment: Examine sample data for quality issues
- Schema Validation: Verify data structure and format compliance
- Business Logic Testing: Test business rules and calculations
Operational Monitoring
Monitor data processing operations:
- Real-Time Visibility: View current data processing results
- Performance Monitoring: Assess processing performance and efficiency
- Error Detection: Identify data quality and processing issues
- Trend Analysis: Monitor data patterns over time
Fetch Input and Output Samples
Retrieve both input and output samples to understand how data is transformed through your Nexsets.
Sample Retrieval Endpoint
To fetch input and output samples from a Nexset:
GET /nexsets/{nexset_id}/samples
GET /nexsets/3001/samples
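The request above can be issued from a script. The following is a minimal sketch using only the Python standard library. The base URL and the bearer-token header are assumptions made for illustration and are not confirmed by this page; substitute the values for your Nexla environment. The function names are hypothetical.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the API base URL of your Nexla environment.
API_BASE = "https://api.example.com"


def build_samples_request(nexset_id, token, base_url=API_BASE):
    """Build a GET request for /nexsets/{nexset_id}/samples.

    Assumption: the API accepts a bearer token in the Authorization header.
    """
    url = f"{base_url}/nexsets/{nexset_id}/samples"
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_samples(nexset_id, token, base_url=API_BASE):
    """Send the request and return the decoded JSON body."""
    request = build_samples_request(nexset_id, token, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Under these assumptions, fetch_samples(3001, token) should return the list of input/output pairs described in the next section.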
Sample Data Response
The response provides both input and output data for comparison:
[
{
"input": {
"transaction_date": "2023-01-15",
"customer_id": "1001",
"product_name": "Premium Widget",
"quantity": 2,
"unit_price": 25.99,
"total_amount": 51.98
},
"output": {
"transaction_date": "2023-01-15",
"customer_id": "1001",
"product_name": "Premium Widget",
"quantity": 2,
"unit_price": 25.99,
"total_amount": 51.98,
"discount_applied": 5.20,
"final_amount": 46.78
}
},
{
"input": {
"transaction_date": "2023-01-15",
"customer_id": "1002",
"product_name": "Standard Widget",
"quantity": 1,
"unit_price": 19.99,
"total_amount": 19.99
},
"output": {
"transaction_date": "2023-01-15",
"customer_id": "1002",
"product_name": "Standard Widget",
"quantity": 1,
"unit_price": 19.99,
"total_amount": 19.99,
"discount_applied": 2.00,
"final_amount": 17.99
}
}
]
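To see at a glance what a transform changed, each input/output pair can be reduced to the fields that were added, removed, or modified. The sketch below works on the response structure shown above; diff_sample is a hypothetical helper name, not part of any Nexla SDK.

```python
def diff_sample(pair):
    """Summarize field-level differences between a sample's input and output."""
    source, result = pair["input"], pair["output"]
    shared = source.keys() & result.keys()
    return {
        # Fields the transform introduced.
        "added": sorted(result.keys() - source.keys()),
        # Fields the transform dropped.
        "removed": sorted(source.keys() - result.keys()),
        # Fields present on both sides whose values differ.
        "changed": sorted(key for key in shared if source[key] != result[key]),
    }


# First record from the example response above.
pair = {
    "input": {
        "transaction_date": "2023-01-15",
        "customer_id": "1001",
        "product_name": "Premium Widget",
        "quantity": 2,
        "unit_price": 25.99,
        "total_amount": 51.98,
    },
    "output": {
        "transaction_date": "2023-01-15",
        "customer_id": "1001",
        "product_name": "Premium Widget",
        "quantity": 2,
        "unit_price": 25.99,
        "total_amount": 51.98,
        "discount_applied": 5.20,
        "final_amount": 46.78,
    },
}

summary = diff_sample(pair)
```

For this record the summary reports discount_applied and final_amount as added, with no removed or changed fields.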
CLI Sample Retrieval
Use the Nexla CLI to retrieve samples for efficient command-line analysis.
CLI Command Structure
Basic CLI command for retrieving samples:
nexla nexset sample <nexset_id> [options]
CLI Options
Available CLI options for sample retrieval:
- -c, --count: Number of samples to display (default: 10)
- -t, --transform: Display transformed output samples
- -i, --show_inputs: Display input samples for comparison
CLI Examples
Common CLI usage patterns for sample retrieval:
# Get default samples (transformed output only)
nexla nexset sample 3001
# Get specific number of samples with inputs
nexla nexset sample 3001 --count 5 --show_inputs true
# Get transformed output only
nexla nexset sample 3001 --transform true --show_inputs false
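When samples are pulled from a script rather than an interactive shell, the same commands can be assembled programmatically. This sketch uses only the options listed above. It assumes the nexla executable is on the PATH and writes the samples to standard output; both helper names are hypothetical.

```python
import subprocess


def build_sample_command(nexset_id, count=None, show_inputs=None, transform=None):
    """Assemble the argument list for `nexla nexset sample`."""
    command = ["nexla", "nexset", "sample", str(nexset_id)]
    if count is not None:
        command += ["--count", str(count)]
    if show_inputs is not None:
        # The examples above pass booleans as lowercase true/false.
        command += ["--show_inputs", "true" if show_inputs else "false"]
    if transform is not None:
        command += ["--transform", "true" if transform else "false"]
    return command


def run_sample_command(command):
    """Run the CLI and return its standard output (assumes `nexla` is installed)."""
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout
```

Passing the arguments as a list avoids shell quoting issues, and check=True raises an exception if the CLI exits with a non-zero status.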
Output-Only Samples
Retrieve only the transformed output samples to focus on final results.
Output-Only Endpoint
To fetch only output samples:
GET /nexsets/{nexset_id}/samples?output_only=true
GET /nexsets/3001/samples?output_only=true
Output-Only Response
The response contains only the transformed output data:
[
{
"transaction_date": "2023-01-15",
"customer_id": "1001",
"product_name": "Premium Widget",
"quantity": 2,
"unit_price": 25.99,
"total_amount": 51.98,
"discount_applied": 5.20,
"final_amount": 46.78
},
{
"transaction_date": "2023-01-15",
"customer_id": "1002",
"product_name": "Standard Widget",
"quantity": 1,
"unit_price": 19.99,
"total_amount": 19.99,
"discount_applied": 2.00,
"final_amount": 17.99
}
]
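Output-only samples are convenient for checking a business rule directly against the final records. The sketch below assumes the rule final_amount = total_amount - discount_applied, which is inferred from the example values above rather than stated by the transform definition. A small tolerance absorbs floating-point rounding.

```python
import math


def check_final_amount(record, tolerance=0.005):
    """Return True if final_amount matches total_amount minus discount_applied.

    Assumption: this is the rule the transform implements (inferred from the
    sample values, not confirmed elsewhere).
    """
    expected = record["total_amount"] - record["discount_applied"]
    return math.isclose(record["final_amount"], expected, abs_tol=tolerance)


# Amount fields from the example response above.
records = [
    {"total_amount": 51.98, "discount_applied": 5.20, "final_amount": 46.78},
    {"total_amount": 19.99, "discount_applied": 2.00, "final_amount": 17.99},
]

results = [check_final_amount(record) for record in records]
```

Comparing with a tolerance rather than == matters here because values such as 51.98 - 5.20 are not exactly representable in binary floating point.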
CLI Output-Only Retrieval
Use CLI options to retrieve output-only samples:
nexla nexset sample 3001 --transform true --show_inputs false
Samples with Metadata
Retrieve samples with comprehensive metadata for detailed analysis and debugging.
Metadata-Enabled Endpoint
To fetch samples with metadata:
GET /nexsets/{nexset_id}/samples?include_metadata=1
GET /nexsets/3001/samples?include_metadata=1
Metadata Response Structure
The response includes both data and comprehensive metadata:
[
{
"input": {
"rawMessage": {
"transaction_date": "2023-01-15",
"customer_id": "1001",
"product_name": "Premium Widget",
"quantity": 2,
"unit_price": 25.99,
"total_amount": 51.98
},
"nexlaMetaData": {
"trackerId": "u1002::customer-data-bucket/transactions_2023.csv:1:1:1:1640995200000;NA",
"sourceType": "s3",
"ingestTime": 1640995200000,
"sourceOffset": 43808,
"sourceKey": "customer-data-bucket/transactions_2023.csv",
"bucket": "customer-data-bucket",
"topic": "nexset-3001-source-1002",
"resourceType": "SOURCE",
"resourceId": 1002,
"nexlaUUID": null,
"eof": false,
"runId": 1640995200000,
"tags": [{}],
"transformTime": 1640995200000,
"transformTimeISO8601": "2023-01-15T10:00:00.000Z"
},
"error": null
},
"output": {
"rawMessage": {
"transaction_date": "2023-01-15",
"customer_id": "1001",
"product_name": "Premium Widget",
"quantity": 2,
"unit_price": 25.99,
"total_amount": 51.98,
"discount_applied": 5.20,
"final_amount": 46.78
},
"nexlaMetaData": {
"trackerId": "u1002::customer-data-bucket/transactions_2023.csv:1:1:1:1640995200000;NA",
"sourceType": "s3",
"ingestTime": 1640995200000,
"sourceOffset": 43808,
"sourceKey": "customer-data-bucket/transactions_2023.csv",
"bucket": "customer-data-bucket",
"topic": "nexset-3001-source-1002",
"resourceType": "SOURCE",
"resourceId": 1002,
"nexlaUUID": null,
"eof": false,
"runId": 1640995200000,
"tags": [{}],
"transformTime": 1640995200000,
"transformTimeISO8601": "2023-01-15T10:00:00.000Z"
},
"error": null
}
}
]
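With metadata enabled, the record itself moves under rawMessage and the tracing details sit under nexlaMetaData. The following sketch pulls out the fields most useful for debugging. It assumes ingestTime is expressed in milliseconds since the Unix epoch, which is consistent with the magnitude of the example value but is not stated explicitly; summarize_side is a hypothetical helper.

```python
from datetime import datetime, timezone


def summarize_side(side):
    """Extract the record and key tracing fields from one side of a sample."""
    meta = side["nexlaMetaData"]
    # Assumption: ingestTime is epoch milliseconds.
    ingested_at = datetime.fromtimestamp(meta["ingestTime"] / 1000, tz=timezone.utc)
    return {
        "record": side["rawMessage"],
        "source_key": meta["sourceKey"],
        "source_offset": meta["sourceOffset"],
        "ingested_at": ingested_at.isoformat(),
        "has_error": side["error"] is not None,
    }


# Trimmed version of the "input" object from the example response above.
side = {
    "rawMessage": {"customer_id": "1001", "total_amount": 51.98},
    "nexlaMetaData": {
        "sourceKey": "customer-data-bucket/transactions_2023.csv",
        "sourceOffset": 43808,
        "ingestTime": 1640995200000,
    },
    "error": None,
}

summary = summarize_side(side)
```

The source key and offset together identify where in the source file a record originated, which helps when tracing an unexpected value back to its input.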
CLI Metadata Retrieval
Use CLI options to retrieve samples with metadata:
nexla nexset sample 3001 --metadata --transform true --show_inputs true
Sample Data Parameters
Configure sample retrieval to meet your specific analysis needs.
Count Parameter
Control the number of samples returned:
- count: Total number of samples to fetch (default: 10, max: 100)
- API Usage: ?count=25
- CLI Usage: --count 25
Output Control Parameters
Control which data to include in samples:
- output_only: Return only transformed output (API: ?output_only=1)
- show_inputs: Include input samples (CLI: --show_inputs true)
- transform: Include transformed output (CLI: --transform true)
Metadata Parameters
Control metadata inclusion:
- include_metadata: Include comprehensive metadata (API: ?include_metadata=1)
- --metadata: Include metadata in CLI output
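The API parameters described in this section can be combined in a single query string. This sketch builds the request path and rejects a count above the documented maximum of 100 before any request is sent. The lower bound of 1 is an assumption, and build_samples_path is a hypothetical helper.

```python
from urllib.parse import urlencode

MAX_COUNT = 100  # documented maximum for the count parameter


def build_samples_path(nexset_id, count=None, output_only=False, include_metadata=False):
    """Return the samples path with the requested query parameters."""
    # Assumption: count must be at least 1; only the maximum is documented.
    if count is not None and not 1 <= count <= MAX_COUNT:
        raise ValueError(f"count must be between 1 and {MAX_COUNT}, got {count}")

    params = {}
    if count is not None:
        params["count"] = count
    if output_only:
        params["output_only"] = 1
    if include_metadata:
        params["include_metadata"] = 1

    path = f"/nexsets/{nexset_id}/samples"
    return f"{path}?{urlencode(params)}" if params else path


path = build_samples_path(3001, count=25, include_metadata=True)
```

Validating on the client side gives a clearer error than waiting for the API to reject the request.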
Sample Data Use Cases
Sample data serves various operational and analytical purposes.
Development and Testing
Use samples during development and testing:
- Transform Validation: Verify transformation logic and results
- Data Quality Testing: Test data quality rules and constraints
- Performance Testing: Assess processing performance with sample data
- Integration Testing: Test data flow integration points
Production Monitoring
Monitor production data processing:
- Quality Assurance: Verify data quality in production environments
- Performance Monitoring: Monitor processing performance and efficiency
- Error Detection: Identify and diagnose processing issues
- Compliance Verification: Verify data compliance and governance
Business Intelligence
Support business intelligence activities:
- Data Exploration: Explore data patterns and characteristics
- Report Validation: Verify report accuracy and completeness
- Trend Analysis: Analyze data trends and patterns
- Decision Support: Support data-driven decision making
Sample Data Best Practices
To use sample data effectively in Nexla:
- Regular Sampling: Periodically sample data to monitor quality and performance
- Representative Samples: Ensure samples represent your data population
- Metadata Analysis: Use metadata for comprehensive data analysis
- Comparison Analysis: Compare input and output to verify transformations
- Documentation: Document sample analysis procedures and findings
Sample Data Workflows
Implement structured workflows for effective sample data analysis.
Quality Assurance Workflow
Standard workflow for data quality assurance:
- Sample Retrieval: Retrieve representative data samples
- Quality Assessment: Assess data quality and identify issues
- Issue Analysis: Analyze quality issues and determine root causes
- Resolution Implementation: Implement fixes and improvements
- Verification: Verify that quality issues are resolved
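The quality assessment step of this workflow can start with something as simple as counting missing values. The sketch below takes output-only samples and a list of fields you expect to be populated. Both the helper name and the choice of required fields are illustrative assumptions.

```python
from collections import Counter


def assess_quality(records, required_fields):
    """Count, per field, how many records are missing it or carry a null value."""
    missing = Counter()
    for record in records:
        for field in required_fields:
            if record.get(field) is None:
                missing[field] += 1
    return dict(missing)


# Illustrative records: the second has a null customer_id and no final_amount.
records = [
    {"customer_id": "1001", "total_amount": 51.98, "final_amount": 46.78},
    {"customer_id": None, "total_amount": 19.99},
]

issues = assess_quality(records, ["customer_id", "total_amount", "final_amount"])
```

An empty result means every required field was populated in every sample. Running the same check again after a fix covers the verification step.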
Transformation Validation Workflow
Workflow for validating data transformations:
- Input Sampling: Retrieve input data samples
- Output Sampling: Retrieve transformed output samples
- Comparison Analysis: Compare input and output for validation
- Logic Verification: Verify transformation logic and business rules
- Performance Assessment: Assess transformation performance and efficiency
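The comparison and logic verification steps can be sketched as a small runner that applies named checks to every input/output pair and reports which ones failed. The check shown here, that every input field passes through unchanged, is only an example and would be wrong for transforms that rename or rewrite fields. All names are hypothetical.

```python
def inputs_preserved(source, result):
    """Check that every input field appears unchanged in the output."""
    return all(key in result and result[key] == value for key, value in source.items())


def validate_pairs(pairs, checks):
    """Run each named check on each pair; return (pair index, check name) failures."""
    failures = []
    for index, pair in enumerate(pairs):
        for name, check in checks.items():
            if not check(pair["input"], pair["output"]):
                failures.append((index, name))
    return failures


# Illustrative pairs: the second output alters quantity.
pairs = [
    {"input": {"customer_id": "1001", "quantity": 2},
     "output": {"customer_id": "1001", "quantity": 2, "final_amount": 46.78}},
    {"input": {"customer_id": "1002", "quantity": 1},
     "output": {"customer_id": "1002", "quantity": 3, "final_amount": 17.99}},
]

failures = validate_pairs(pairs, {"inputs_preserved": inputs_preserved})
```

Keeping checks as plain functions in a dictionary makes it easy to add business-rule checks alongside structural ones.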
Error Handling
Common sample data issues and solutions:
- Permission Denied: Ensure you have appropriate access rights
- Nexset Not Found: Verify the Nexset ID exists and is accessible
- Invalid Parameters: Check that query parameters are correctly formatted
- Large Sample Sets: Use the count parameter to limit sample size
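In a script, these issues usually surface as HTTP error statuses. The mapping below is a sketch that assumes the API follows the conventional status codes (400 for invalid parameters, 403 for permission denied, 404 for not found). This page does not confirm which codes the samples endpoint actually returns, so treat the mapping as a starting point.

```python
# Assumption: conventional HTTP status codes; not confirmed for this endpoint.
STATUS_HINTS = {
    400: "Invalid parameters: check that query parameters are correctly formatted.",
    403: "Permission denied: ensure you have appropriate access rights.",
    404: "Nexset not found: verify the Nexset ID exists and is accessible.",
}


def explain_status(status):
    """Translate an HTTP status code into one of the hints listed above."""
    return STATUS_HINTS.get(status, f"Unexpected HTTP status {status}.")


hint = explain_status(404)
```

With urllib, the status code is available as the code attribute of the urllib.error.HTTPError raised by urlopen.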
Related Operations
After retrieving sample data, you may need to:
Validate Data Quality
GET /nexsets/{nexset_id}/validate
POST /nexsets/{nexset_id}/validate
Monitor Processing
GET /nexsets/{nexset_id}/metrics
GET /nexsets/{nexset_id}/status
Update Configurations
PUT /nexsets/{nexset_id}
PUT /nexsets/{nexset_id}/config
View Processing History
GET /nexsets/{nexset_id}/history
GET /nexsets/{nexset_id}/audit