CipherStream Enterprise Data Extraction Platform


About CipherStream

CipherStream is Centaur Software's proprietary enterprise bulk data extraction platform, designed for high-performance data extraction from cloud-hosted databases with intelligent processing optimisation.

Key Features

  • 38 Specialised Data Endpoints: Complete coverage of Dental4Web data
  • Intelligent Processing: Auto-switching between streaming (≤100K rows) and job processing (>100K rows)
  • Enterprise Security: AES-256-GCM encryption, API key authentication, IP whitelisting
  • Multiple Formats: JSON, NDJSON, CSV with optional GZIP/ZIP compression
  • Real-time & Batch: Sub-second streaming for small datasets, S3-backed jobs for large datasets
  • Webhook Integration: Real-time notifications for job completion

Authentication

API Key Types

Key Type        Format      Access Level            Use Case
Customer Live   cs_live_*   Production API access   Live data extraction
Customer Demo   cs_demo_*   Demo/testing access     Testing and evaluation

Headers Required

Plain text
Authorization: Bearer <your-api-key>
Content-Type: application/json
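
A minimal authenticated request using these headers might look like the sketch below; the base URL and endpoint path are placeholders for illustration, not documented values.

Python
import requests

# Placeholder values -- substitute a real API key; the base URL and
# endpoint path below are hypothetical, not documented values.
API_KEY = "cs_demo_xxxxxxxxxxxx"
BASE_URL = "https://api.example.com"

response = requests.post(
    f"{BASE_URL}/extract/appointments",          # hypothetical endpoint
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={"from_date": "2024-01-01", "to_date": "2024-12-31"},
)
response.raise_for_status()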

Execution Modes

CipherStream automatically determines the optimal processing method based on data volume:

Auto Mode (Recommended)

  • ≤100K rows: Streaming response (immediate)
  • >100K rows: Background job with S3 download
  • >50MB estimated: Automatic job mode regardless of row count
  • System decides: Based on current load and data characteristics

Stream Mode

  • Force streaming: Real-time response for all requests
  • Recommended limit: ≤100K rows for optimal performance
  • Timeout: 30 seconds maximum
  • Use case: Small datasets requiring immediate response

Job Mode

  • Force background processing: All requests create jobs
  • S3 storage: Secure file storage with presigned URLs
  • Extended timeout: 180 seconds for large extractions
  • Use case: Large datasets, scheduled extractions
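
A request body that pins the execution mode might look like the following sketch; the execution_mode field name is an assumption based on the modes described above, not a documented parameter.

Python
# Sketch of a request body that forces a specific execution mode; the
# "execution_mode" field name is assumed, not a documented parameter.
payload = {
    "from_date": "2024-01-01",
    "to_date": "2024-12-31",
    "execution_mode": "job",   # assumed values: "auto", "stream", or "job"
}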

Date Modifiers

Control which records are returned based on their creation and modification dates:

Available Options

  • All (default): Returns all records within the date range regardless of creation/update dates
  • Created: Returns only records created within the specified date range
  • Updated: Returns only records updated within the specified date range

Usage Examples

Plain text
{
  "date_modifier": "Created",
  "from_date": "2024-01-01",
  "to_date": "2024-12-31"
}

Behavior by Procedure Type

  • Reference Data: Date modifiers not applicable (always returns current data)
  • Data Procedures: Uses ts_4_insert (Created) and ts_4_update (Updated) fields
  • Special Procedures: Applies to underlying table timestamp fields

Output Formats

NDJSON (Default)

  • Format: Newline-delimited JSON
  • Structure: One JSON object per line
  • Benefits: Streaming-friendly, memory efficient
  • Use case: Large datasets, real-time processing
Plain text
{"id": 1, "name": "John Doe", "date": "2024-01-01"}
{"id": 2, "name": "Jane Smith", "date": "2024-01-02"}
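
Because NDJSON carries one object per line, it can be parsed incrementally; a minimal sketch follows, with the URL and API key as placeholders.

Python
import json
import requests

# Stream an NDJSON response and process it record by record without
# loading the whole payload into memory. URL and key are placeholders.
with requests.post(
    "https://api.example.com/extract/appointments",      # hypothetical URL
    headers={"Authorization": "Bearer cs_demo_xxxxxxxxxxxx"},
    json={"from_date": "2024-01-01", "to_date": "2024-12-31"},
    stream=True,
) as response:
    response.raise_for_status()
    for line in response.iter_lines():
        if line:                                          # skip blank keep-alive lines
            record = json.loads(line)
            print(record["id"], record["name"])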

JSON

  • Format: Standard JSON array
  • Structure: Array of objects with metadata
  • Benefits: Standard format, easy parsing
  • Use case: Small to medium datasets, standard integrations
Plain text
{
  "data": [
    {"id": 1, "name": "John Doe", "date": "2024-01-01"},
    {"id": 2, "name": "Jane Smith", "date": "2024-01-02"}
  ],
  "metadata": {"execution_time": 2.5, "record_count": 2}
}

CSV

  • Format: Comma-separated values
  • Structure: Header row followed by data rows
  • Benefits: Universal compatibility, Excel-friendly
  • Use case: Reporting, data analysis, spreadsheet import
Plain text
id,name,date
1,John Doe,2024-01-01
2,Jane Smith,2024-01-02

Compression Options

None (Default)

  • No compression: Raw output format
  • Use case: Small datasets, immediate processing
  • Available in: Streaming and Job modes

GZIP

  • Compression ratio: 70-96% size reduction (tested)
  • Format: .gz compressed files
  • Smart threshold: Only compresses data >8KB for optimal performance
  • Available in: Streaming mode and Job mode
  • Use case: Large datasets, bandwidth optimisation
  • HTTP Header: Content-Encoding: gzip
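
Clients that honour Content-Encoding: gzip (including Python's requests) decompress streaming responses transparently. For a compressed job file downloaded from S3, a sketch like the following applies; the file name is a placeholder.

Python
import gzip
import json

# Decompress a downloaded .gz job result and parse its NDJSON contents.
# The file name is a placeholder for the file fetched via the presigned URL.
with gzip.open("appointments_export.ndjson.gz", "rt", encoding="utf-8") as fh:
    for line in fh:
        record = json.loads(line)
        print(record)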

ZIP

  • Compression ratio: 60-97% size reduction (tested)
  • Format: .zip archive files with proper structure
  • Smart threshold: Only compresses data >8KB for optimal performance
  • Available in: Streaming mode and Job mode
  • Use case: Multiple files, Windows compatibility
  • HTTP Header: Content-Encoding: deflate


Security & Encryption

Data in Transit

  • TLS 1.3: All API communications encrypted
  • Certificate pinning: Enhanced security for production
  • HSTS: HTTP Strict Transport Security enabled

Data at Rest

  • AES-256-GCM: S3 bucket encryption for job results
  • Presigned URLs: Time-limited access (12 hours)
  • Automatic cleanup: Files removed after expiry

End-to-End Encryption

  • AES-256-GCM: Customer-specific encryption keys
  • Available in: Streaming mode and Job mode
  • Chunk-based: Each data chunk encrypted individually
  • Key rotation: Supported for customer-specific encryption keys
  • Usage: Set "encrypted": true in request body
  • Format: {"encrypted_data": "key_id:iv:ciphertext"}
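
A decryption sketch for the chunk format above is shown below. It assumes the iv and ciphertext segments are base64-encoded and that the GCM authentication tag is appended to the ciphertext; confirm the exact encoding with Centaur Software before relying on it.

Python
import base64
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def decrypt_chunk(encrypted_data: str, keys: dict) -> bytes:
    """Decrypt one "key_id:iv:ciphertext" chunk with a customer key.

    Assumes base64-encoded iv and ciphertext, with the GCM tag appended
    to the ciphertext (the library default) -- these are assumptions.
    """
    key_id, iv_b64, ciphertext_b64 = encrypted_data.split(":")
    aesgcm = AESGCM(keys[key_id])                 # keys maps key_id -> 32-byte key
    return aesgcm.decrypt(
        base64.b64decode(iv_b64),                 # nonce / IV
        base64.b64decode(ciphertext_b64),         # ciphertext + tag
        None,                                     # no associated data
    )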

Authentication

  • Bearer tokens: API key-based authentication
  • IP whitelisting: Optional IP restriction
  • Rate limiting: Prevents abuse and ensures fair usage

Webhook System

Supported Events

  • job.completed: Job finished successfully with download URL
  • job.failed: Job encountered an error with details
  • job.cancelled: Job was manually cancelled

Security Features

  • HMAC-SHA256: Payload signature verification
  • Automatic retries: Up to 3 attempts with exponential backoff
  • Delivery tracking: Monitor success/failure rates
  • Timeout handling: 30-second response timeout

Webhook Payload Example

Plain text
{
  "event": "job.completed",
  "timestamp": "2024-09-26T10:02:15Z",
  "signature": "sha256=abc123...",
  "data": {
    "job_id": "appointments_1758630144658_c7511999",
    "customer_id": "your-customer-id",
    "status": "completed",
    "rows_processed": 45678,
    "execution_time_seconds": 64.75,
    "s3_url": "https://secure-download-url",
    "expires_at": "2024-09-26T22:02:15Z",
    "file_size_bytes": 2048576,
    "output_format": "ndjson",
    "compression": "gzip"
  }
}
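
A verification sketch for the HMAC-SHA256 signature is shown below. It assumes the signature is computed over the raw request body with a shared webhook secret and delivered as "sha256=<hexdigest>", mirroring the payload example; confirm the exact scheme for your account.

Python
import hashlib
import hmac

def verify_webhook(raw_body: bytes, signature_header: str, secret: bytes) -> bool:
    """Verify the HMAC-SHA256 signature on a webhook delivery.

    Assumes the signature covers the raw request body and is sent as
    "sha256=<hexdigest>" -- an assumption based on the example above.
    """
    expected = "sha256=" + hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)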

Job Management Lifecycle

Job States

  1. Queued: Job created and waiting for processing
  2. Running: Data extraction in progress with real-time progress updates
  3. Completed: Data available for download via S3 presigned URL
  4. Failed: Error occurred during processing with detailed error information
  5. Cancelled: Job manually cancelled by user or system timeout

Job Features

  • Progress tracking: Real-time percentage completion (0-100%)
  • Row counting: Live count of processed records
  • Time estimation: Estimated completion time based on current progress
  • Resource monitoring: Memory and CPU usage tracking
  • Error handling: Detailed error messages and recovery suggestions

Download Management

  • S3 presigned URLs: Secure, time-limited download links
  • 12-hour expiry: URLs automatically expire for security
  • Resume support: Partial download recovery for large files
  • Bandwidth optimisation: CDN-accelerated downloads
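
If webhooks are not used, a job can be polled until it reaches a terminal state. The sketch below assumes a /jobs/{id} path and status/s3_url field names mirroring the webhook payload shown earlier; neither is a documented endpoint.

Python
import time
import requests

BASE_URL = "https://api.example.com"                  # hypothetical base URL
HEADERS = {"Authorization": "Bearer cs_live_xxxxxxxxxxxx"}

def wait_for_job(job_id: str, poll_seconds: int = 10) -> str:
    """Poll a job until it finishes and return its presigned download URL.

    The /jobs/{id} path and the status/s3_url field names are assumptions
    based on the webhook payload, not documented endpoints.
    """
    while True:
        job = requests.get(f"{BASE_URL}/jobs/{job_id}", headers=HEADERS).json()
        if job["status"] == "completed":
            return job["s3_url"]
        if job["status"] in ("failed", "cancelled"):
            raise RuntimeError(f"Job {job_id} ended with status {job['status']}")
        time.sleep(poll_seconds)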

Data Procedures Overview

Reference Data

  • Always streamed: Immediate response (no job creation)
  • 5-minute cache: Optimal performance for frequently accessed data
  • No parameters required: Simple, consistent access pattern
  • Small datasets: Typically <1000 records per procedure
  • Use cases: Dropdown population, validation, system configuration

Data Procedures

  • Intelligent mode selection: Auto streaming/job based on volume
  • Date range filtering: Flexible from_date/to_date parameters
  • Date modifier support: Created, Updated, or All records
  • High-volume capable: Handles millions of records efficiently
  • Use cases: Bulk data extraction, reporting, analytics

Special Procedures

  • CALL syntax: Advanced stored procedure execution
  • Direct table access: Extract from any accessible database table
  • Extended timeout: 180-second timeout for large extractions
  • Flexible parameters: Custom table names and date filtering
  • Use cases: Custom extractions, ad-hoc queries, data migration

Performance Characteristics

Streaming Response

  • Latency: <2 seconds for reference data
  • Throughput: Up to 10,000 records/second
  • Memory usage: Constant memory footprint
  • Concurrent requests: Up to 10 simultaneous streams

Job Processing

  • Throughput: Up to 100,000 records/second
  • Scalability: Auto-scaling based on queue depth
  • Resource allocation: Dedicated processing resources
  • Monitoring: Real-time progress and performance metrics

Error Handling

HTTP Status Codes

  • 200: Success - Data returned or job created
  • 400: Bad Request - Invalid parameters or format
  • 401: Unauthorized - Invalid or missing API key
  • 403: Forbidden - Access denied or rate limited
  • 404: Not Found - Endpoint or resource not found
  • 429: Too Many Requests - Rate limit exceeded
  • 500: Internal Server Error - System error occurred

Error Response Format

Plain text
{
  "error": {
    "code": "INVALID_DATE_RANGE",
    "message": "The specified date range is invalid",
    "details": "from_date must be earlier than to_date",
    "timestamp": "2024-09-26T10:00:00Z",
    "request_id": "req_abc123"
  }
}
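
A client-side sketch for surfacing this error envelope and backing off on HTTP 429 follows; treating Retry-After as present on rate-limited responses is an assumption.

Python
import time
import requests

def post_with_retry(url: str, headers: dict, body: dict, max_attempts: int = 3):
    """POST with simple backoff on 429 and structured error reporting.

    Assumes a Retry-After header may accompany 429 responses; falls back
    to exponential backoff when the header is absent.
    """
    for attempt in range(1, max_attempts + 1):
        response = requests.post(url, headers=headers, json=body)
        if response.status_code == 429 and attempt < max_attempts:
            time.sleep(int(response.headers.get("Retry-After", 2 ** attempt)))
            continue
        if not response.ok:
            error = response.json().get("error", {})
            raise RuntimeError(
                f"{error.get('code')}: {error.get('message')} "
                f"(request_id={error.get('request_id')})"
            )
        return response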

Environment URLs

