Sample Data

Nexla's sample data feature lets you retrieve representative data samples from any Nexset, so you can validate transformations, verify data quality, and see how your data processing workflows behave.

Sample Data Overview

Sample data provides immediate visibility into your data processing results, allowing you to compare input and output data, verify transformation logic, and ensure data quality without processing entire datasets.

Core Sample Data Capabilities

The sample data system provides several key capabilities for data validation and analysis.

Data Validation

Validate your data processing workflows:

  • Transformation Verification: Compare input and output data to verify transforms
  • Data Quality Assessment: Examine sample data for quality issues
  • Schema Validation: Verify data structure and format compliance
  • Business Logic Testing: Test business rules and calculations

Operational Monitoring

Monitor data processing operations:

  • Real-Time Visibility: View current data processing results
  • Performance Monitoring: Assess processing performance and efficiency
  • Error Detection: Identify data quality and processing issues
  • Trend Analysis: Monitor data patterns over time

Fetch Input and Output Samples

Retrieve both input and output samples to understand how data is transformed through your Nexsets.

Sample Retrieval Endpoint

To fetch input and output samples from a Nexset:

GET /nexsets/{nexset_id}/samples
Fetch Input & Output Samples: Request
GET /nexsets/3001/samples
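
The request above can be issued from Python with only the standard library. This is a minimal sketch: the base URL `https://api.example.com` and the bearer-token `Authorization` header are assumptions for illustration, since this page does not state the API host or authentication scheme. Substitute the values that apply to your environment.

```python
import json
import urllib.request


def build_samples_request(base_url: str, nexset_id: int, token: str) -> urllib.request.Request:
    """Build a GET request for /nexsets/{nexset_id}/samples."""
    url = f"{base_url.rstrip('/')}/nexsets/{nexset_id}/samples"
    # Assumption: bearer-token authentication; adjust to your deployment.
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_samples(base_url: str, nexset_id: int, token: str) -> list:
    """Send the request and decode the JSON array of samples."""
    request = build_samples_request(base_url, nexset_id, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical host and token, shown without sending anything over the network.
request = build_samples_request("https://api.example.com", 3001, "YOUR_TOKEN")
print(request.get_method(), request.full_url)
```

Keeping request construction separate from the network call makes the URL and headers easy to inspect before anything is sent.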

Sample Data Response

The response provides both input and output data for comparison:

Fetch Input & Output Samples: Response
[
  {
    "input": {
      "transaction_date": "2023-01-15",
      "customer_id": "1001",
      "product_name": "Premium Widget",
      "quantity": 2,
      "unit_price": 25.99,
      "total_amount": 51.98
    },
    "output": {
      "transaction_date": "2023-01-15",
      "customer_id": "1001",
      "product_name": "Premium Widget",
      "quantity": 2,
      "unit_price": 25.99,
      "total_amount": 51.98,
      "discount_applied": 5.20,
      "final_amount": 46.78
    }
  },
  {
    "input": {
      "transaction_date": "2023-01-15",
      "customer_id": "1002",
      "product_name": "Standard Widget",
      "quantity": 1,
      "unit_price": 19.99,
      "total_amount": 19.99
    },
    "output": {
      "transaction_date": "2023-01-15",
      "customer_id": "1002",
      "product_name": "Standard Widget",
      "quantity": 1,
      "unit_price": 19.99,
      "total_amount": 19.99,
      "discount_applied": 2.00,
      "final_amount": 17.99
    }
  }
]
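
Because each element pairs an `input` record with its `output` record, a transform can be checked by diffing the two. The sketch below is one way to do that; the helper names `added_fields` and `changed_fields` are hypothetical and not part of any Nexla library.

```python
def added_fields(sample: dict) -> dict:
    """Return output fields that are absent from the input record."""
    record_in, record_out = sample["input"], sample["output"]
    return {key: value for key, value in record_out.items() if key not in record_in}


def changed_fields(sample: dict) -> dict:
    """Return fields present on both sides whose values differ, as (input, output) pairs."""
    record_in, record_out = sample["input"], sample["output"]
    return {
        key: (record_in[key], record_out[key])
        for key in record_in
        if key in record_out and record_in[key] != record_out[key]
    }


# A trimmed copy of the first sample from the response above.
sample = {
    "input": {"customer_id": "1001", "total_amount": 51.98},
    "output": {
        "customer_id": "1001",
        "total_amount": 51.98,
        "discount_applied": 5.20,
        "final_amount": 46.78,
    },
}
print(added_fields(sample))    # {'discount_applied': 5.2, 'final_amount': 46.78}
print(changed_fields(sample))  # {}
```

For this sample the transform only adds fields, so `changed_fields` is empty; a non-empty result would point to a pass-through field being modified.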

CLI Sample Retrieval

Use the Nexla CLI to retrieve samples for efficient command-line analysis.

CLI Command Structure

Basic CLI command for retrieving samples:

nexla nexset sample <nexset_id> [options]

CLI Options

Available CLI options for sample retrieval:

  • -c, --count: Number of samples to display (default: 10)
  • -t, --transform: Display transformed output samples
  • -i, --show_inputs: Display input samples for comparison
  • --metadata: Include metadata in the output

CLI Examples

Common CLI usage patterns for sample retrieval:

CLI Sample Retrieval: Examples
# Get default samples (transformed output only)
nexla nexset sample 3001

# Get specific number of samples with inputs
nexla nexset sample 3001 --count 5 --show_inputs true

# Get transformed output only
nexla nexset sample 3001 --transform true --show_inputs false
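
When the CLI is driven from a script, it can help to assemble the argument list programmatically. The following sketch builds the same commands as the examples above, using only the flags documented on this page; the function name `sample_command` is hypothetical, and actually running the command assumes the `nexla` executable is on your PATH.

```python
import shlex
from typing import List, Optional


def sample_command(
    nexset_id: int,
    count: Optional[int] = None,
    transform: Optional[bool] = None,
    show_inputs: Optional[bool] = None,
) -> List[str]:
    """Build the argv list for `nexla nexset sample`, omitting unset options."""
    argv = ["nexla", "nexset", "sample", str(nexset_id)]
    if count is not None:
        argv += ["--count", str(count)]
    if transform is not None:
        argv += ["--transform", "true" if transform else "false"]
    if show_inputs is not None:
        argv += ["--show_inputs", "true" if show_inputs else "false"]
    return argv


argv = sample_command(3001, count=5, show_inputs=True)
print(shlex.join(argv))  # nexla nexset sample 3001 --count 5 --show_inputs true

# To execute the command, pass the list to subprocess:
#   import subprocess
#   subprocess.run(argv, check=True)
```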

Output-Only Samples

Retrieve only the transformed output samples to focus on final results.

Output-Only Endpoint

To fetch only output samples:

GET /nexsets/{nexset_id}/samples?output_only=true
Fetch Output Samples: Request
GET /nexsets/3001/samples?output_only=true

Output-Only Response

The response contains only the transformed output data:

Fetch Output Samples: Response
[
  {
    "transaction_date": "2023-01-15",
    "customer_id": "1001",
    "product_name": "Premium Widget",
    "quantity": 2,
    "unit_price": 25.99,
    "total_amount": 51.98,
    "discount_applied": 5.20,
    "final_amount": 46.78
  },
  {
    "transaction_date": "2023-01-15",
    "customer_id": "1002",
    "product_name": "Standard Widget",
    "quantity": 1,
    "unit_price": 19.99,
    "total_amount": 19.99,
    "discount_applied": 2.00,
    "final_amount": 17.99
  }
]
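
Output-only samples are convenient for testing business rules. As an illustration, the sketch below checks that `final_amount` equals `total_amount` minus `discount_applied`. That rule is inferred from the example values above and is an assumption, not a documented Nexla behavior; the helper names are hypothetical.

```python
from decimal import Decimal


def to_decimal(value) -> Decimal:
    """Convert a JSON number to Decimal via str() to avoid binary float noise."""
    return Decimal(str(value))


def final_amount_is_consistent(record: dict) -> bool:
    """Check final_amount == total_amount - discount_applied exactly."""
    expected = to_decimal(record["total_amount"]) - to_decimal(record["discount_applied"])
    return expected == to_decimal(record["final_amount"])


# Amount fields taken from the two output records above.
outputs = [
    {"total_amount": 51.98, "discount_applied": 5.20, "final_amount": 46.78},
    {"total_amount": 19.99, "discount_applied": 2.00, "final_amount": 17.99},
]
failures = [record for record in outputs if not final_amount_is_consistent(record)]
print(failures)  # []
```

Comparing as `Decimal` rather than `float` avoids false failures from floating-point rounding on currency values.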

CLI Output-Only Retrieval

Use CLI options to retrieve output-only samples:

nexla nexset sample 3001 --transform true --show_inputs false

Samples with Metadata

Retrieve samples with comprehensive metadata for detailed analysis and debugging.

Metadata-Enabled Endpoint

To fetch samples with metadata:

GET /nexsets/{nexset_id}/samples?include_metadata=1
Fetch Samples With Metadata: Request
GET /nexsets/3001/samples?include_metadata=1

Metadata Response Structure

The response includes both data and comprehensive metadata:

Fetch Samples With Metadata: Response
[
  {
    "input": {
      "rawMessage": {
        "transaction_date": "2023-01-15",
        "customer_id": "1001",
        "product_name": "Premium Widget",
        "quantity": 2,
        "unit_price": 25.99,
        "total_amount": 51.98
      },
      "nexlaMetaData": {
        "trackerId": "u1002::customer-data-bucket/transactions_2023.csv:1:1:1:1640995200000;NA",
        "sourceType": "s3",
        "ingestTime": 1640995200000,
        "sourceOffset": 43808,
        "sourceKey": "customer-data-bucket/transactions_2023.csv",
        "bucket": "customer-data-bucket",
        "topic": "nexset-3001-source-1002",
        "resourceType": "SOURCE",
        "resourceId": 1002,
        "nexlaUUID": null,
        "eof": false,
        "runId": 1640995200000,
        "tags": [{}],
        "transformTime": 1640995200000,
        "transformTimeISO8601": "2023-01-15T10:00:00.000Z"
      },
      "error": null
    },
    "output": {
      "rawMessage": {
        "transaction_date": "2023-01-15",
        "customer_id": "1001",
        "product_name": "Premium Widget",
        "quantity": 2,
        "unit_price": 25.99,
        "total_amount": 51.98,
        "discount_applied": 5.20,
        "final_amount": 46.78
      },
      "nexlaMetaData": {
        "trackerId": "u1002::customer-data-bucket/transactions_2023.csv:1:1:1:1640995200000;NA",
        "sourceType": "s3",
        "ingestTime": 1640995200000,
        "sourceOffset": 43808,
        "sourceKey": "customer-data-bucket/transactions_2023.csv",
        "bucket": "customer-data-bucket",
        "topic": "nexset-3001-source-1002",
        "resourceType": "SOURCE",
        "resourceId": 1002,
        "nexlaUUID": null,
        "eof": false,
        "runId": 1640995200000,
        "tags": [{}],
        "transformTime": 1640995200000,
        "transformTimeISO8601": "2023-01-15T10:00:00.000Z"
      },
      "error": null
    }
  }
]
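
With metadata enabled, the record itself moves under `rawMessage` and provenance details appear under `nexlaMetaData`. The sketch below pulls a few of those fields into a flat summary. It assumes `ingestTime` is milliseconds since the Unix epoch, which is inferred from the magnitude of the example value rather than stated on this page; `summarize_metadata` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def summarize_metadata(side: dict) -> dict:
    """Summarize provenance for one side ("input" or "output") of a sample."""
    meta = side["nexlaMetaData"]
    # Assumption: ingestTime is milliseconds since the Unix epoch.
    ingested = datetime.fromtimestamp(meta["ingestTime"] / 1000, tz=timezone.utc)
    return {
        "source_type": meta["sourceType"],
        "source_key": meta["sourceKey"],
        "source_offset": meta["sourceOffset"],
        "ingested_at": ingested.isoformat(),
        "has_error": side.get("error") is not None,
    }


# A trimmed copy of the "input" side from the response above.
side = {
    "rawMessage": {"customer_id": "1001"},
    "nexlaMetaData": {
        "sourceType": "s3",
        "sourceKey": "customer-data-bucket/transactions_2023.csv",
        "sourceOffset": 43808,
        "ingestTime": 1640995200000,
    },
    "error": None,
}
print(summarize_metadata(side)["ingested_at"])  # 2022-01-01T00:00:00+00:00
```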

CLI Metadata Retrieval

Use CLI options to retrieve samples with metadata:

nexla nexset sample 3001 --metadata --transform true --show_inputs true

Sample Data Parameters

Configure sample retrieval to meet your specific analysis needs.

Count Parameter

Control the number of samples returned:

  • count: Total number of samples to fetch (default: 10, max: 100)
  • API Usage: ?count=25
  • CLI Usage: --count 25

Output Control Parameters

Control which data to include in samples:

  • output_only: Return only transformed output (API: ?output_only=true)
  • show_inputs: Include input samples (CLI: --show_inputs true)
  • transform: Include transformed output (CLI: --transform true)

Metadata Parameters

Control metadata inclusion:

  • include_metadata: Include comprehensive metadata (API: ?include_metadata=1)
  • --metadata: Include metadata in CLI output
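
The API parameters above can be combined into a query string with a small helper. This is a sketch under stated assumptions: the lower bound of 1 for `count` and the ability to combine several parameters in one request are not confirmed by this page, and `samples_path` is a hypothetical name. The `output_only=true` form follows the output-only request example shown earlier.

```python
from typing import Optional
from urllib.parse import urlencode


def samples_path(
    nexset_id: int,
    count: Optional[int] = None,
    output_only: bool = False,
    include_metadata: bool = False,
) -> str:
    """Build the request path and query string for the samples endpoint."""
    params = {}
    if count is not None:
        # Documented maximum is 100; the minimum of 1 is an assumption.
        if not 1 <= count <= 100:
            raise ValueError("count must be between 1 and 100")
        params["count"] = count
    if output_only:
        params["output_only"] = "true"
    if include_metadata:
        params["include_metadata"] = 1
    path = f"/nexsets/{nexset_id}/samples"
    return f"{path}?{urlencode(params)}" if params else path


print(samples_path(3001))                                     # /nexsets/3001/samples
print(samples_path(3001, count=25, include_metadata=True))    # /nexsets/3001/samples?count=25&include_metadata=1
```

Validating `count` on the client side surfaces an out-of-range value before a request is made.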

Sample Data Use Cases

Sample data serves various operational and analytical purposes.

Development and Testing

Use samples during development and testing:

  • Transform Validation: Verify transformation logic and results
  • Data Quality Testing: Test data quality rules and constraints
  • Performance Testing: Assess processing performance with sample data
  • Integration Testing: Test data flow integration points

Production Monitoring

Monitor production data processing:

  • Quality Assurance: Verify data quality in production environments
  • Performance Monitoring: Monitor processing performance and efficiency
  • Error Detection: Identify and diagnose processing issues
  • Compliance Verification: Verify data compliance and governance

Business Intelligence

Support business intelligence activities:

  • Data Exploration: Explore data patterns and characteristics
  • Report Validation: Verify report accuracy and completeness
  • Trend Analysis: Analyze data trends and patterns
  • Decision Support: Support data-driven decision making

Sample Data Best Practices

To use sample data effectively in Nexla:

  1. Regular Sampling: Periodically sample data to monitor quality and performance
  2. Representative Samples: Ensure samples represent your data population
  3. Metadata Analysis: Use metadata for comprehensive data analysis
  4. Comparison Analysis: Compare input and output to verify transformations
  5. Documentation: Document sample analysis procedures and findings

Sample Data Workflows

Implement structured workflows for effective sample data analysis.

Quality Assurance Workflow

Standard workflow for data quality assurance:

  1. Sample Retrieval: Retrieve representative data samples
  2. Quality Assessment: Assess data quality and identify issues
  3. Issue Analysis: Analyze quality issues and determine root causes
  4. Resolution Implementation: Implement fixes and improvements
  5. Verification: Verify that quality issues are resolved
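
The quality assessment step of this workflow can be sketched as a scan over retrieved output records for missing or empty fields. The list of required fields is something you would define for your own Nexset; `find_quality_issues` and the field names used here are illustrative assumptions.

```python
from typing import Iterable, List, Tuple


def find_quality_issues(records: List[dict], required_fields: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (record_index, field, problem) tuples for missing or empty fields."""
    required = list(required_fields)
    issues = []
    for index, record in enumerate(records):
        for field in required:
            if field not in record:
                issues.append((index, field, "missing"))
            elif record[field] is None or record[field] == "":
                issues.append((index, field, "empty"))
    return issues


records = [
    {"customer_id": "1001", "final_amount": 46.78},
    {"customer_id": "", "final_amount": 17.99},
    {"customer_id": "1003"},
]
print(find_quality_issues(records, ["customer_id", "final_amount"]))
# [(1, 'customer_id', 'empty'), (2, 'final_amount', 'missing')]
```

Running the same check again after a fix covers the verification step: an empty result means the sampled records no longer show the issue.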

Transformation Validation Workflow

Workflow for validating data transformations:

  1. Input Sampling: Retrieve input data samples
  2. Output Sampling: Retrieve transformed output samples
  3. Comparison Analysis: Compare input and output for validation
  4. Logic Verification: Verify transformation logic and business rules
  5. Performance Assessment: Assess transformation performance and efficiency

Error Handling

Common sample data issues and solutions:

  • Permission Denied: Ensure you have appropriate access rights
  • Nexset Not Found: Verify the Nexset ID exists and is accessible
  • Invalid Parameters: Check that query parameters are correctly formatted
  • Large Sample Sets: Use count parameter to limit sample size
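
In a client script, these issues typically surface as HTTP error statuses. The mapping below is a sketch that pairs conventional HTTP status codes with the guidance above. Which codes the Nexla API actually returns for each case is an assumption, since this page does not list them.

```python
# Assumption: conventional HTTP status semantics; confirm against the API's actual responses.
STATUS_HINTS = {
    400: "Invalid parameters: check that query parameters are correctly formatted.",
    403: "Permission denied: ensure you have appropriate access rights.",
    404: "Nexset not found: verify the Nexset ID exists and is accessible.",
}


def explain_status(status: int) -> str:
    """Translate an HTTP status code into a troubleshooting hint."""
    return STATUS_HINTS.get(status, f"Unexpected HTTP status {status}.")


print(explain_status(404))  # Nexset not found: verify the Nexset ID exists and is accessible.
print(explain_status(500))  # Unexpected HTTP status 500.
```

With `urllib.request`, the status code is available as the `code` attribute of the `urllib.error.HTTPError` raised by `urlopen`.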

After retrieving sample data, you may need one of the following related endpoints:

Validate Data Quality

GET /nexsets/{nexset_id}/validate
POST /nexsets/{nexset_id}/validate

Monitor Processing

GET /nexsets/{nexset_id}/metrics
GET /nexsets/{nexset_id}/status

Update Configurations

PUT /nexsets/{nexset_id}
PUT /nexsets/{nexset_id}/config

View Processing History

GET /nexsets/{nexset_id}/history
GET /nexsets/{nexset_id}/audit