List/View Data Sources
Listing and viewing data sources in Nexla lets you see what your account is ingesting, check source status, and manage data pipelines. The API returns each source's metadata, configuration, and operational details.
List All Data Sources
The primary endpoint for listing data sources is /data_sources, which returns every source accessible to your account.
API Endpoint
To retrieve all accessible data sources:
GET /data_sources
Response Structure
The response is a JSON array with one object per source, covering its configuration, status, and metadata.
[
  {
    "id": 1001,
    "name": "Customer Data Source",
    "description": "Customer data ingestion from S3",
    "connector_type": "s3",
    "status": "ACTIVE",
    "owner": {
      "id": 42,
      "full_name": "John Smith"
    },
    "org": {
      "id": 101,
      "name": "Acme Corporation"
    },
    "access_roles": ["owner"],
    "data_credentials": {
      "id": 5001,
      "name": "S3 Credentials",
      "credentials_type": "s3"
    },
    "source_config": {
      "path": "customer-data-bucket/raw",
      "file_pattern": "*.csv",
      "recursive": true
    },
    "flow_type": "streaming",
    "created_at": "2023-01-15T10:30:00.000Z",
    "updated_at": "2023-01-15T15:45:00.000Z"
  },
  {
    "id": 1002,
    "name": "Sales Data Source",
    "description": "Sales data from PostgreSQL database",
    "connector_type": "postgres",
    "status": "PAUSED",
    "owner": {
      "id": 42,
      "full_name": "John Smith"
    },
    "org": {
      "id": 101,
      "name": "Acme Corporation"
    },
    "access_roles": ["owner"],
    "data_credentials": {
      "id": 5002,
      "name": "PostgreSQL Credentials",
      "credentials_type": "postgres"
    },
    "source_config": {
      "host": "db.example.com",
      "port": 5432,
      "database": "sales_db",
      "table": "transactions"
    },
    "flow_type": "in_memory",
    "created_at": "2023-01-14T14:20:00.000Z",
    "updated_at": "2023-01-15T12:15:00.000Z"
  }
]
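As a rough illustration of calling this endpoint, the sketch below builds the GET /data_sources request with Python's standard library and tallies the returned sources by status. The base URL is a placeholder, and bearer-token authentication with a JSON Accept header is an assumption; substitute whatever host and headers your Nexla environment requires.

```python
import json
import urllib.request
from collections import Counter

# Assumption: placeholder host; replace with your Nexla API base URL.
BASE_URL = "https://nexla.example.com/nexla-api"


def build_list_request(token: str) -> urllib.request.Request:
    """Build (but do not send) a GET /data_sources request."""
    return urllib.request.Request(
        f"{BASE_URL}/data_sources",
        headers={
            # Assumption: the API accepts a bearer token.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


def list_sources(token: str) -> list:
    """Send the request and decode the JSON array of sources."""
    with urllib.request.urlopen(build_list_request(token)) as resp:
        return json.load(resp)


def count_by_status(sources: list) -> Counter:
    """Tally sources by their "status" field."""
    return Counter(source["status"] for source in sources)
```

Applied to the two sources in the example response, count_by_status would report one ACTIVE and one PAUSED source.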
Show Source by ID
To retrieve a specific data source by its identifier, use the source ID endpoint.
Source by ID Endpoint
GET /data_sources/{source_id}
GET /data_sources/1001
Source by ID Response
The response is a single source object. In this example it also carries fields that the list example omits: batch_size and polling_interval in source_config, and the linked data_set.
{
  "id": 1001,
  "name": "Customer Data Source",
  "description": "Customer data ingestion from S3",
  "connector_type": "s3",
  "status": "ACTIVE",
  "owner": {
    "id": 42,
    "full_name": "John Smith"
  },
  "org": {
    "id": 101,
    "name": "Acme Corporation"
  },
  "access_roles": ["owner"],
  "data_credentials": {
    "id": 5001,
    "name": "S3 Credentials",
    "credentials_type": "s3"
  },
  "source_config": {
    "path": "customer-data-bucket/raw",
    "file_pattern": "*.csv",
    "recursive": true,
    "batch_size": 1000,
    "polling_interval": "5m"
  },
  "flow_type": "streaming",
  "data_set": {
    "id": 3001,
    "name": "Customer Analytics Dataset"
  },
  "created_at": "2023-01-15T10:30:00.000Z",
  "updated_at": "2023-01-15T15:45:00.000Z"
}
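A minimal sketch of working with a single source: one helper builds the URL for GET /data_sources/{source_id}, and another pulls a few fields out of the decoded response. The base URL is a placeholder, and treating data_set and polling_interval as optional is an assumption drawn from the differences between the two example responses.

```python
# Assumption: placeholder host; replace with your Nexla API base URL.
BASE_URL = "https://nexla.example.com/nexla-api"


def source_url(source_id: int) -> str:
    """Return the URL for GET /data_sources/{source_id}."""
    if isinstance(source_id, bool) or not isinstance(source_id, int) or source_id <= 0:
        raise ValueError(f"source_id must be a positive integer, got {source_id!r}")
    return f"{BASE_URL}/data_sources/{source_id}"


def summarize_source(source: dict) -> dict:
    """Extract a compact summary from a decoded source object."""
    # Assumption: data_set and polling_interval may be absent.
    data_set = source.get("data_set") or {}
    config = source.get("source_config") or {}
    return {
        "id": source["id"],
        "name": source["name"],
        "connector_type": source["connector_type"],
        "status": source["status"],
        "data_set_id": data_set.get("id"),
        "polling_interval": config.get("polling_interval"),
    }
```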
Expand Parameter
Use the expand parameter to include additional related information in the response.
Expand Options
You can expand various related resources:
GET /data_sources?expand=data_credentials
GET /data_sources?expand=data_set
GET /data_sources?expand=flow
GET /data_sources?expand=all
Expanded Response
The expanded response includes detailed information about related resources.
[
  {
    "id": 1001,
    "name": "Customer Data Source",
    "connector_type": "s3",
    "status": "ACTIVE",
    "data_credentials": {
      "id": 5001,
      "name": "S3 Credentials",
      "credentials_type": "s3",
      "verified_status": "200 Ok",
      "created_at": "2023-01-10T09:00:00.000Z"
    },
    "data_set": {
      "id": 3001,
      "name": "Customer Analytics Dataset",
      "status": "ACTIVE",
      "record_count": 1500000
    },
    "flow": {
      "id": 2001,
      "status": "ACTIVE",
      "flow_type": "streaming"
    }
  }
]
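The expand query string can be built programmatically rather than by hand. The sketch below accepts only the four values listed above and URL-encodes the parameter; the helper name list_path is hypothetical, and rejecting other values on the client side is a choice made for this example, not documented API behavior.

```python
from typing import Optional
from urllib.parse import urlencode

# The expand values listed in this section.
EXPAND_OPTIONS = {"data_credentials", "data_set", "flow", "all"}


def list_path(expand: Optional[str] = None) -> str:
    """Return the /data_sources path, optionally with an expand parameter."""
    if expand is None:
        return "/data_sources"
    if expand not in EXPAND_OPTIONS:
        raise ValueError(f"unsupported expand value: {expand!r}")
    return "/data_sources?" + urlencode({"expand": expand})
```

For example, list_path("data_credentials") yields the same path as the first option listed above.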
Source Status and Monitoring
Data sources have various statuses that indicate their operational state.
Status Types
- ACTIVE: Source is actively ingesting data
- PAUSED: Source is temporarily stopped
- ERROR: Source encountered an error and stopped
- INACTIVE: Source is not currently processing
Status Monitoring
Monitor source status through the API:
GET /data_sources/{source_id}/status
GET /data_sources/{source_id}/metrics
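One way to handle these statuses in client code is to model them as an enumeration, so an unexpected value fails loudly instead of being silently ignored. This is a sketch only: the helper names are hypothetical, and the list of four statuses is taken from this section and may not be exhaustive.

```python
from enum import Enum


class SourceStatus(Enum):
    """Statuses described in this section."""
    ACTIVE = "ACTIVE"
    PAUSED = "PAUSED"
    ERROR = "ERROR"
    INACTIVE = "INACTIVE"


def sources_in_error(sources: list) -> list:
    """Return the ids of sources whose status is ERROR."""
    # SourceStatus(...) raises ValueError for a status not listed above.
    return [s["id"] for s in sources if SourceStatus(s["status"]) is SourceStatus.ERROR]


def monitoring_paths(source_id: int) -> list:
    """Return the status and metrics paths for one source."""
    return [
        f"/data_sources/{source_id}/status",
        f"/data_sources/{source_id}/metrics",
    ]
```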
Source Configuration
Each data source has specific configuration based on its connector type.
File System Configuration
{
  "path": "bucket-name/folder",
  "file_pattern": "*.csv",
  "recursive": true,
  "batch_size": 1000
}
Database Configuration
{
  "host": "db.example.com",
  "port": 5432,
  "database": "analytics_db",
  "table": "customer_data"
}
API Configuration
{
  "base_url": "https://api.example.com",
  "endpoint": "/customers",
  "auth_type": "bearer",
  "polling_interval": "1h"
}
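When processing many sources, it can help to tell these three configuration shapes apart. The sketch below guesses the family from the keys present in source_config. The key choices are an assumption based solely on the three examples above; real connectors may use other fields, so treat the result as a hint rather than a definitive classification.

```python
def config_family(source_config: dict) -> str:
    """Guess the configuration family from the keys that are present."""
    # Assumption: these keys mirror the three example shapes only.
    if "base_url" in source_config:
        return "api"
    if "host" in source_config and "database" in source_config:
        return "database"
    if "path" in source_config:
        return "file_system"
    return "unknown"
```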
Integration with Data Flows
Data sources are integral components of Nexla's data flow architecture.
Flow Relationships
- Origin Node: Sources serve as the starting point for data flows
- Data Processing: Sources feed data into Nexsets for transformation
- Flow Control: Source status affects entire flow operation
Flow Management
Manage flows through source endpoints:
GET /data_sources/{source_id}/flow
PUT /data_sources/{source_id}/flow/activate
PUT /data_sources/{source_id}/flow/pause
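The three flow endpoints above can be captured in a small lookup so that callers pick an action by name. This is an illustrative sketch: the Call tuple, the action names, and the flow_call helper are hypothetical, and only the HTTP methods and paths come from the list above.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """An HTTP method paired with a request path."""
    method: str
    path: str


# Action name -> (HTTP method, path suffix), taken from the endpoints above.
FLOW_ACTIONS = {
    "show": ("GET", "/flow"),
    "activate": ("PUT", "/flow/activate"),
    "pause": ("PUT", "/flow/pause"),
}


def flow_call(source_id: int, action: str) -> Call:
    """Return the method and path for a flow action on one source."""
    if action not in FLOW_ACTIONS:
        raise ValueError(f"unknown flow action: {action!r}")
    method, suffix = FLOW_ACTIONS[action]
    return Call(method, f"/data_sources/{source_id}{suffix}")
```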
Best Practices
To effectively list and view data sources:
- Use Pagination: Implement pagination for large source collections
- Filter by Status: Focus on active or problematic sources
- Monitor Performance: Track source metrics and performance
- Organize by Type: Group sources by connector type for management
- Regular Review: Periodically review source configurations and access
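Two of these practices, filtering by status and grouping by connector type, can be applied on the client to an already-fetched list. The sketch below assumes the list has the shape of the GET /data_sources response; the helper names are hypothetical. Pagination is not shown because the relevant query parameters are not specified here.

```python
from collections import defaultdict


def filter_by_status(sources: list, *statuses: str) -> list:
    """Keep only sources whose status is one of the given values."""
    wanted = set(statuses)
    return [s for s in sources if s["status"] in wanted]


def group_by_connector(sources: list) -> dict:
    """Map each connector_type to the names of its sources."""
    groups = defaultdict(list)
    for source in sources:
        groups[source["connector_type"]].append(source["name"])
    return dict(groups)
```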
Error Handling
Common source listing issues and solutions:
- Permission Denied: Ensure you have appropriate access rights
- Invalid Source ID: Verify the source ID exists and is accessible
- Organization Issues: Check organization membership and access
- Resource Not Found: Confirm the requested source exists
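A client can translate a failed response into one of the hints above. The mapping below is an assumption: it relies on the conventional meanings of HTTP status codes 401, 403, and 404, and the codes the Nexla API actually returns for each situation are not stated in this section.

```python
# Assumption: conventional HTTP status semantics, not confirmed API behavior.
STATUS_HINTS = {
    401: "Permission denied: check that your credentials are valid.",
    403: "Permission denied: ensure you have access rights and organization membership.",
    404: "Resource not found: verify the source ID exists and is accessible.",
}


def explain_status(code: int) -> str:
    """Return a troubleshooting hint for an HTTP status code."""
    return STATUS_HINTS.get(code, f"Unexpected HTTP status {code}.")
```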
Related Operations
After viewing sources, you may need to:
Control Source Status
PUT /data_sources/{source_id}/activate
PUT /data_sources/{source_id}/pause
Update Source Configuration
PUT /data_sources/{source_id}
Monitor Source Performance
GET /data_sources/{source_id}/metrics
GET /data_sources/{source_id}/logs
Manage Source Access
GET /data_sources/{source_id}/access
PUT /data_sources/{source_id}/access