Best practices for deploying Data Hub in production.
Disable debug mode in production (`debug: false`). Use environment variables for all sensitive configuration:
```bash
# Database connections
ERP_DB_HOST=db.production.internal
ERP_DB_USER=vendure_reader
ERP_DB_PASSWORD=secure-password

# API keys
SUPPLIER_API_KEY=sk_live_...
GOOGLE_MERCHANT_API_KEY=...

# AWS credentials (for S3)
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
```
Reference them in the plugin configuration:

```ts
DataHubPlugin.init({
  enabled: true,
  debug: false,
  retentionDaysRuns: 30,
  retentionDaysErrors: 90,
  secrets: [
    { code: 'supplier-api', provider: 'ENV', value: 'SUPPLIER_API_KEY' },
    { code: 'erp-db-password', provider: 'ENV', value: 'ERP_DB_PASSWORD' },
  ],
  connections: [
    {
      code: 'erp-db',
      type: 'postgres',
      name: 'ERP Database',
      settings: {
        host: '${ERP_DB_HOST}',
        port: 5432,
        database: 'erp',
        username: '${ERP_DB_USER}',
        password: '${ERP_DB_PASSWORD}',
        ssl: true,
        poolSize: 5,
      },
    },
  ],
})
```
For smaller deployments, the default configuration works:
```ts
jobQueueOptions: {
  activeQueues: ['default', 'data-hub.run', 'data-hub.schedule'],
}
```
For high-volume processing, run dedicated workers:
```ts
// Main server - handles API requests
jobQueueOptions: {
  activeQueues: ['default'],
}

// Worker process - handles data hub jobs
jobQueueOptions: {
  activeQueues: ['data-hub.run', 'data-hub.schedule'],
}
```
```ts
// worker.ts
import { bootstrapWorker } from '@vendure/core';
import config from './vendure-config';

bootstrapWorker({
  ...config,
  jobQueueOptions: {
    activeQueues: ['data-hub.run', 'data-hub.schedule'],
    pollInterval: 1000,
  },
})
  .then(worker => worker.startJobQueue())
  .catch(err => {
    console.error('Worker failed to start:', err);
    process.exit(1);
  });
```
Limit connection pool size to prevent exhausting database connections:
```ts
connections: [
  {
    code: 'external-db',
    type: 'postgres',
    settings: {
      poolSize: 5, // Limit concurrent connections
    },
  },
]
```
For read-heavy operations, configure read replicas:
```ts
connections: [
  {
    code: 'erp-db-read',
    type: 'postgres',
    settings: {
      host: '${ERP_DB_READ_HOST}', // Read replica
    },
  },
]
```
Set the minimum log level to persist:
```graphql
mutation {
  updateDataHubSettings(input: {
    logPersistenceLevel: "info" # debug, info, warn, error
  }) {
    logPersistenceLevel
  }
}
```
- `debug` - All logs (high storage)
- `info` - Info and above (recommended)
- `warn` - Warnings and errors
- `error` - Errors only

Send logs to external systems:
```ts
// Custom log handler (example) - forwards each entry to an external sink
// (CloudWatch, Datadog, etc.). `externalLogger` stands in for your logging client.
import { VendureLogger } from '@vendure/core';

class CustomLogger implements VendureLogger {
  error(message: string, context?: string) { externalLogger.log({ level: 'error', message, context }); }
  warn(message: string, context?: string) { externalLogger.log({ level: 'warn', message, context }); }
  info(message: string, context?: string) { externalLogger.log({ level: 'info', message, context }); }
  verbose(message: string, context?: string) { externalLogger.log({ level: 'verbose', message, context }); }
  debug(message: string, context?: string) { externalLogger.log({ level: 'debug', message, context }); }
}
```
Monitor these metrics:
| Metric | Description | Alert Threshold |
|---|---|---|
| Pipeline success rate | % of successful runs | < 95% |
| Average run duration | Execution time | > baseline + 50% |
| Record error rate | % of failed records | > 5% |
| Queue depth | Pending jobs | > 100 |
| Worker health | Active workers | < expected |
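The thresholds in the table can be encoded directly in an alert check. A sketch with hypothetical metric field names — this is not a Data Hub API, just the table's logic in code:

```typescript
// Hypothetical metrics snapshot; field names are illustrative.
interface DataHubMetrics {
  successRate: number;       // % of successful runs
  avgRunDurationMs: number;  // current average execution time
  baselineDurationMs: number;
  recordErrorRate: number;   // % of failed records
  queueDepth: number;        // pending jobs
  activeWorkers: number;
  expectedWorkers: number;
}

// Returns the list of alerts that should fire, per the thresholds above.
function evaluateAlerts(m: DataHubMetrics): string[] {
  const alerts: string[] = [];
  if (m.successRate < 95) alerts.push('pipeline-success-rate');
  if (m.avgRunDurationMs > m.baselineDurationMs * 1.5) alerts.push('run-duration');
  if (m.recordErrorRate > 5) alerts.push('record-error-rate');
  if (m.queueDepth > 100) alerts.push('queue-depth');
  if (m.activeWorkers < m.expectedWorkers) alerts.push('worker-health');
  return alerts;
}
```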
Add health check endpoints:
```ts
// Check Data Hub status
app.use('/health/data-hub', async (req, res) => {
  const isHealthy = await checkDataHubHealth();
  res.status(isHealthy ? 200 : 503).json({ healthy: isHealthy });
});
```
Set up alerts for:

- Pipeline success rate dropping below 95%
- Run duration exceeding the baseline by more than 50%
- Record error rate above 5%
- Queue depth above 100 pending jobs
- Fewer active workers than expected
Data Hub supports running multiple instances with automatic coordination:
Distributed Locking:
```bash
# Option 1: Redis (recommended for production)
DATAHUB_REDIS_URL=redis://redis.production.internal:6379

# Option 2: Force PostgreSQL (no additional infrastructure)
DATAHUB_LOCK_BACKEND=postgres
```
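The backend choice implied by the two options above can be pictured like this — an illustrative sketch of the selection order, not the plugin's actual code:

```typescript
type LockBackend = 'redis' | 'postgres';

// Hypothetical resolution order: an explicit override wins,
// then Redis if a URL is configured, otherwise fall back to PostgreSQL.
function resolveLockBackend(env: Record<string, string | undefined>): LockBackend {
  if (env.DATAHUB_LOCK_BACKEND === 'postgres') return 'postgres';
  if (env.DATAHUB_REDIS_URL) return 'redis';
  return 'postgres'; // no additional infrastructure required
}
```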
What’s Protected:
Deployment Architecture:
```
┌─────────────────┐
│  Load Balancer  │
└────────┬────────┘
         │
┌────────────────────┼────────────────────┐
▼                    ▼                    ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│   Vendure 1   │ │   Vendure 2   │ │   Vendure 3   │
│  + Data Hub   │ │  + Data Hub   │ │  + Data Hub   │
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
        │                 │                 │
        └─────────────────┼─────────────────┘
                          │
┌────────────────────┼────────────────────┐
▼                    ▼                    ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│  PostgreSQL   │ │     Redis     │ │ Message Queue │
│  (required)   │ │  (optional)   │ │  (optional)   │
└───────────────┘ └───────────────┘ └───────────────┘
```
Without Redis:

- Locking falls back to PostgreSQL, so no additional infrastructure is required

With Redis:

- Locks are coordinated through Redis, which is recommended for production multi-instance deployments
Protect external APIs:
```ts
.extract('api-call', {
  throughput: {
    rateLimitRps: 10, // Max 10 requests per second
  },
})
```