Create a Data Source

Creating a data source is the first step in setting up data ingestion in Nexla. Data sources define the connection to external systems and configure how data should be extracted and processed.

Required Fields

When creating a data source, you must provide:

  • name: A descriptive name for the data source
  • source_type: The connector type (e.g., s3, mysql, rest)
  • data_credentials_id: ID of an existing credentials resource (or pass a data_credentials object inline instead)
  • source_config: Connector-specific configuration settings

API Endpoint

Create Source: Request
POST /data_sources

Example Request Body:

{
  "name": "Example S3 Data Source",
  "source_type": "s3",
  "data_credentials_id": 5001,
  "source_config": {
    "bucket": "my-data-bucket",
    "prefix": "daily/",
    "file_pattern": "*.csv"
  }
}
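As a sketch, the request above could be sent from Python as follows. The API host and bearer-token header are assumptions; substitute your own Nexla endpoint and access token.

```python
import json
import urllib.request

# Hypothetical API host and token -- replace with your own values.
API_BASE = "https://api.example.com"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"

# Request body from the example above.
payload = {
    "name": "Example S3 Data Source",
    "source_type": "s3",
    "data_credentials_id": 5001,
    "source_config": {
        "bucket": "my-data-bucket",
        "prefix": "daily/",
        "file_pattern": "*.csv",
    },
}

def create_data_source(body: dict) -> urllib.request.Request:
    """Build the POST /data_sources request (not sent here)."""
    return urllib.request.Request(
        url=f"{API_BASE}/data_sources",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {ACCESS_TOKEN}",
        },
        method="POST",
    )

req = create_data_source(payload)
```

Sending the request with `urllib.request.urlopen(req)` (or any HTTP client) should return the data source object shown in the response section below.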

Response Structure

A successful creation returns the complete data source object:

Create Source: Response
{
  "id": 5001,
  "owner_id": 2,
  "org_id": 1,
  "name": "Example S3 Data Source",
  "description": null,
  "status": "INIT",
  "source_type": "s3",
  "source_config": {
    "bucket": "my-data-bucket",
    "prefix": "daily/",
    "file_pattern": "*.csv"
  },
  "data_credentials_id": 5001,
  "flow_type": "streaming",
  "data_sets": [],
  "created_at": "2023-01-15T10:30:00.000Z",
  "updated_at": "2023-01-15T10:30:00.000Z"
}

Creating with Existing Credentials

When you have already created data credentials, you can reference them by ID when creating a data source. This approach is recommended for production environments where credentials are managed separately.

Using Credential ID

Create with Existing Credentials: Request
{
  "name": "Production MySQL Source",
  "source_type": "mysql",
  "data_credentials_id": 5001,
  "source_config": {
    "host": "db.example.com",
    "port": 3306,
    "database": "customers",
    "incremental_column": "updated_at"
  }
}

Benefits of Existing Credentials

  • Security: Credentials are managed centrally and encrypted
  • Reusability: Same credentials can be used for multiple sources
  • Audit Trail: Clear tracking of credential usage and access
  • Rotation: Easier to update credentials across multiple sources
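Reusability in practice: a minimal sketch building two source payloads that reference the same credential record. The source names and configs are illustrative, not real resources.

```python
# One credential record, referenced by two hypothetical sources.
SHARED_CREDENTIALS_ID = 5001

def make_mysql_source(name: str, database: str) -> dict:
    """Build a data-source payload that reuses the shared credentials."""
    return {
        "name": name,
        "source_type": "mysql",
        "data_credentials_id": SHARED_CREDENTIALS_ID,
        "source_config": {"host": "db.example.com", "database": database},
    }

orders = make_mysql_source("Orders Source", "orders")
customers = make_mysql_source("Customers Source", "customers")
```

Because both payloads point at the same credential ID, rotating that one credential record updates access for both sources at once.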

Creating with Inline Credentials

For development and testing, you can create credentials inline with the data source. This approach is convenient but should be used carefully in production environments.

Inline Credentials Example

Create with Inline Credentials: Request
{
  "name": "Test FTP Source",
  "source_type": "ftp",
  "data_credentials": {
    "name": "FTP Test Credentials",
    "credentials_type": "ftp",
    "credentials": {
      "host": "ftp.example.com",
      "username": "testuser",
      "password": "testpass",
      "port": 21
    }
  },
  "source_config": {
    "path": "/data/",
    "file_pattern": "*.txt"
  }
}

When to Use Inline Credentials

  • Development: Quick setup for testing and development
  • Temporary Sources: Short-lived or experimental data sources
  • Demo Purposes: Demonstrations and proof-of-concept work
  • CI/CD: Automated testing and deployment scenarios
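The two credential styles differ only in which key the request body carries. A sketch contrasting them, with field names taken from the examples above:

```python
def with_existing_credentials(payload: dict, credentials_id: int) -> dict:
    """Reference a previously created credential resource by ID."""
    return {**payload, "data_credentials_id": credentials_id}

def with_inline_credentials(payload: dict, credentials: dict) -> dict:
    """Embed a full credentials object (development/testing only)."""
    return {**payload, "data_credentials": credentials}

base = {
    "name": "Test FTP Source",
    "source_type": "ftp",
    "source_config": {"path": "/data/", "file_pattern": "*.txt"},
}

prod = with_existing_credentials(base, 5001)
dev = with_inline_credentials(base, {
    "name": "FTP Test Credentials",
    "credentials_type": "ftp",
    "credentials": {"host": "ftp.example.com", "username": "testuser",
                    "password": "testpass", "port": 21},
})
```

Everything else in the payload stays the same, which makes it easy to start with inline credentials in development and switch to a credential ID for production.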

Source Configuration

Each connector type requires specific configuration parameters. The source_config object contains these connector-specific settings.

Common Configuration Elements

  • Connection Details: Host, port, database, bucket, or endpoint information
  • Authentication: Additional authentication parameters beyond credentials
  • Data Selection: Paths, file patterns, table names, or query filters
  • Scheduling: Poll intervals, cron expressions, or event triggers
  • Schema Detection: Automatic or manual schema identification settings

Connector-Specific Examples

S3 Configuration:

{
  "bucket": "my-data-bucket",
  "prefix": "daily/",
  "file_pattern": "*.csv",
  "region": "us-east-1",
  "compression": "gzip"
}

MySQL Configuration:

{
  "host": "db.example.com",
  "port": 3306,
  "database": "customers",
  "incremental_column": "updated_at",
  "query_timeout": 300
}

REST API Configuration:

{
  "base_url": "https://api.example.com",
  "endpoint": "/v1/data",
  "method": "GET",
  "headers": {
    "Accept": "application/json"
  },
  "rate_limit": 100
}
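Because each connector expects different source_config keys, a client-side pre-check can catch omissions before the API call. The required-key sets below are inferred from the examples in this section, not an official schema:

```python
# Minimal required keys per connector, inferred from the examples above.
REQUIRED_CONFIG_KEYS = {
    "s3": {"bucket"},
    "mysql": {"host", "database"},
    "rest": {"base_url"},
}

def missing_config_keys(source_type: str, source_config: dict) -> set:
    """Return required keys absent from source_config (empty set if OK)."""
    return REQUIRED_CONFIG_KEYS.get(source_type, set()) - source_config.keys()
```

For example, an S3 config that only sets a prefix would report `{"bucket"}` as missing, while a complete config reports an empty set.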

Flow Type Configuration

Data sources can be configured with different flow types that optimize performance for specific use cases:

Available Flow Types

  • streaming (default): Standard streaming data processing
  • in_memory: High-performance in-memory processing
  • replication: Data replication and synchronization

Setting Flow Type

Create with Flow Type: Request
{
  "name": "High-Performance Source",
  "source_type": "mysql",
  "data_credentials_id": 5001,
  "flow_type": "in_memory",
  "source_config": {
    "host": "db.example.com",
    "database": "analytics"
  }
}

Code Container Integration

Data sources can be enhanced with custom code containers for advanced data processing:

Code Container Configuration

Create with Code Container: Request
{
  "name": "Custom Processing Source",
  "source_type": "rest",
  "data_credentials_id": 5001,
  "code_container": {
    "name": "Custom REST Processor",
    "code_type": "python",
    "code": "def process_data(data): return data.upper()",
    "resource_type": "source_custom"
  },
  "source_config": {
    "base_url": "https://api.example.com"
  }
}

Code Container Benefits

  • Custom Logic: Implement source-specific data processing
  • Data Transformation: Clean, filter, or enrich data at the source
  • Format Conversion: Convert data to standard formats
  • Validation: Ensure data quality before processing
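The one-line process_data in the example above handles a single string. For record-shaped payloads, a container function might look more like this sketch; the list-of-dicts record shape is an assumption for illustration:

```python
# Hypothetical container function: normalize string fields in each record.
def process_data(records):
    """Strip whitespace and upper-case string values; pass other types through."""
    cleaned = []
    for record in records:
        cleaned.append({
            key: value.strip().upper() if isinstance(value, str) else value
            for key, value in record.items()
        })
    return cleaned

result = process_data([{"name": " alice ", "age": 30}])
```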

Post-Creation Steps

After creating a data source, you typically need to:

1. Verify Configuration

Check that the source configuration is correct:

GET /data_sources/{source_id}

2. Test Connection

Verify connectivity and credentials:

PUT /data_sources/{source_id}/test

3. Activate Source

Start data ingestion:

PUT /data_sources/{source_id}/activate

4. Monitor Performance

Track ingestion rates and data quality:

GET /data_sources/{source_id}/metrics
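The four steps above can be sketched as an ordered list of request templates expanded for a concrete source ID. The HTTP calls themselves are omitted; the endpoint paths follow this section:

```python
# Ordered (method, path-template) pairs for the post-creation workflow.
POST_CREATION_STEPS = [
    ("GET", "/data_sources/{source_id}"),           # 1. verify configuration
    ("PUT", "/data_sources/{source_id}/test"),      # 2. test connection
    ("PUT", "/data_sources/{source_id}/activate"),  # 3. activate source
    ("GET", "/data_sources/{source_id}/metrics"),   # 4. monitor performance
]

def post_creation_requests(source_id: int) -> list:
    """Expand the step templates for a concrete source ID."""
    return [(method, path.format(source_id=source_id))
            for method, path in POST_CREATION_STEPS]

steps = post_creation_requests(5001)
```

Running the steps in this order matters: activating a source whose connection test fails typically just surfaces the same error during ingestion.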

Best Practices

To ensure successful data source creation:

  1. Use Descriptive Names: Choose names that clearly identify the source purpose
  2. Secure Credentials: Store sensitive information in dedicated credential resources
  3. Validate Configuration: Test source settings before production use
  4. Plan for Scale: Consider data volume and processing requirements
  5. Document Settings: Maintain clear documentation of configuration choices
  6. Monitor Health: Set up alerts for source performance and errors

Error Handling

Common creation errors and solutions:

  • Invalid Source Type: Ensure the connector type is supported
  • Missing Credentials: Provide valid data credentials or inline credentials
  • Configuration Errors: Verify source_config parameters for the connector type
  • Permission Issues: Ensure you have access to create data sources
  • Duplicate Names: Use unique names within your organization
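Several of these errors can be caught client-side before the request is sent. A sketch that checks the required fields listed at the top of this page; the problem messages are illustrative, not the API's actual error responses:

```python
def validate_source_payload(payload: dict) -> list:
    """Return a list of problems found in a create-source payload."""
    problems = []
    if not payload.get("name"):
        problems.append("name is required")
    if not payload.get("source_type"):
        problems.append("source_type is required")
    # Either a credential reference or an inline credentials object must be present.
    if "data_credentials_id" not in payload and "data_credentials" not in payload:
        problems.append("provide data_credentials_id or an inline data_credentials object")
    if not isinstance(payload.get("source_config"), dict):
        problems.append("source_config must be an object")
    return problems

ok = validate_source_payload({
    "name": "Example S3 Data Source",
    "source_type": "s3",
    "data_credentials_id": 5001,
    "source_config": {"bucket": "my-data-bucket"},
})
bad = validate_source_payload({"name": "Broken Source"})
```

An empty list means the payload passes the local checks; server-side errors such as permission issues or duplicate names can still occur and must be handled from the API response.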