Common issues and solutions for Data Hub.
Symptoms: Run button does nothing or returns an error
Possible causes:
- `RunDataHubPipeline` permission
- `enabled` status

Symptoms: Run completes but 0 records processed
Check:
- `itemsField` points to correct path in response

Symptoms: High error rate, records quarantined
Steps:
Symptoms: Records fail during transform step
Common causes:
Symptoms: Records reach load step but fail
Check:
Database connections:
API connections:
Symptoms: “Connection timeout” or “Request timeout”
Solutions:
Symptoms: API calls fail with auth errors
Check:
Symptoms: “Environment variable X not found”
Solutions:
- `printenv` to verify

Analyze:
Solutions:
Causes:
Solutions:
Symptoms: Jobs pile up, runs delayed
Solutions:
Adapter code doesn’t exist.
Solutions:
- `registerBuiltinAdapters` is true

Connection code doesn’t exist.
Solutions:
Secret code doesn’t exist.
Solutions:
Pipeline JSON is malformed.
Solutions:
Missing permission.
Solutions:
```ts
DataHubPlugin.init({
  debug: true,
})
```
Symptoms: Migrations fail to run or partially complete
Solutions:
```bash
npm run migration:show
npm run migration:run
npm run migration:revert
rm -rf dist/migrations
npm run build
```
Symptoms: “Too many connections” or “Connection pool timeout”
Solutions:
```ts
dbConnectionOptions: {
  extra: {
    max: 20, // Increase from default 10
  }
}
```
Symptoms: “Deadlock detected” or “Lock wait timeout”
Solutions:
```ts
throughput: {
  concurrency: 1, // Sequential processing
}
```

```ts
throughput: {
  batchSize: 20, // Reduce lock contention
}
```
Symptoms: Pipeline doesn’t run when webhook called
Check:
```http
POST https://your-domain.com/data-hub/webhook/your-path
```
```graphql
query {
  dataHubLogs(options: { take: 10 }) {
    items {
      id
      level
      message
      stepKey
      createdAt
    }
    totalItems
  }
}
```
Symptoms: “Invalid signature” or “Unauthorized”
Solutions:
```js
const crypto = require('crypto');
const secret = 'your-secret';
const payload = JSON.stringify(requestBody);
const signature = crypto
  .createHmac('sha256', secret)
  .update(payload)
  .digest('hex');
```
Symptoms: Same webhook processed multiple times
Solutions:
```ts
trigger: {
  type: 'WEBHOOK',
  idempotencyKey: 'X-Request-ID',
}
```
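What the idempotency key buys you can be sketched in a few lines: the first request with a given `X-Request-ID` is processed, repeats are skipped. The `handleWebhook` helper is hypothetical; a real implementation would persist seen keys with a TTL rather than hold them in memory:

```javascript
// Remember which idempotency keys have already been handled and
// skip duplicate deliveries of the same webhook.
const processed = new Set();

function handleWebhook(idempotencyKey, handler) {
  if (processed.has(idempotencyKey)) {
    return { skipped: true }; // duplicate delivery
  }
  processed.add(idempotencyKey);
  return { skipped: false, result: handler() };
}
```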
Symptoms: Pipeline doesn’t execute at scheduled time
Check:
```bash
# Test cron expression
# Use online cron validator
0 2 * * * # Valid: 2 AM daily
```

```ts
trigger: {
  type: 'SCHEDULE',
  cron: '0 2 * * *',
  timezone: 'America/New_York', // Explicit timezone
}
```

```bash
# Check logs for scheduler
pm2 logs vendure | grep "SchedulerService"
```
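To sanity-check what a cron expression should match, a toy matcher can help. This sketch handles only plain numbers and `*` in the five fields; real schedulers also support ranges, lists, steps, and timezones:

```javascript
// Toy matcher for 5-field cron expressions: 'min hour dom mon dow'.
// Illustration only: no ranges, lists, steps, or timezone handling.
function cronMatches(expr, date) {
  const [min, hour, dom, mon, dow] = expr.trim().split(/\s+/);
  const matches = (field, value) => field === '*' || Number(field) === value;
  return (
    matches(min, date.getMinutes()) &&
    matches(hour, date.getHours()) &&
    matches(dom, date.getDate()) &&
    matches(mon, date.getMonth() + 1) &&
    matches(dow, date.getDay())
  );
}
```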
Symptoms: Pipeline runs at unexpected times
Solutions:
```bash
date
timedatectl # Linux
```

```ts
timezone: 'UTC' // Always use explicit timezone
```
Symptoms: Vendure event occurs but pipeline doesn’t run
Check:
```ts
event: 'ProductEvent' // Must match Vendure event class
filter: {
  type: 'updated' // Must match event property
}
```
```graphql
query {
  dataHubLogs(options: { take: 10 }) {
    items {
      id
      level
      message
      stepKey
      createdAt
    }
    totalItems
  }
}
```
Symptoms: “ENOENT: no such file or directory”
Solutions:
```ts
path: '/var/data/imports/file.csv' // Absolute
// Not: './imports/file.csv' // Relative
```

```bash
ls -la /var/data/imports/
# Ensure vendure process can read
stat /var/data/imports/file.csv
```
Symptoms: “Invalid CSV” or “Parse error”
Common causes:
```ts
extract: {
  encoding: 'utf-8', // or 'iso-8859-1', 'windows-1252'
}
```

```ts
delimiter: ';' // European CSV
delimiter: '\t' // TSV
```
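When the delimiter is unknown, a quick heuristic is to count candidate delimiters in the header line and pick the most frequent. This `guessDelimiter` sketch is illustrative only; quoted fields that contain delimiters will fool it:

```javascript
// Guess the CSV delimiter by counting candidates in the header line.
// Heuristic sketch: does not handle quoted fields.
function guessDelimiter(headerLine) {
  const candidates = [',', ';', '\t', '|'];
  let best = ',';
  let bestCount = 0;
  for (const d of candidates) {
    const count = headerLine.split(d).length - 1;
    if (count > bestCount) {
      best = d;
      bestCount = count;
    }
  }
  return best;
}
```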
Symptoms: Out of memory errors with large files
Solutions:
```ts
throughput: {
  batchSize: 100, // Smaller batches
}
```

```ts
checkpointing: {
  enabled: true,
  intervalRecords: 5000,
}
```

```bash
split -l 10000 large-file.csv chunk-
```
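The idea behind `batchSize` can be sketched as follows: slice the input into fixed-size batches so only one batch needs to be held and processed at a time. This illustrates the concept, not the plugin's actual implementation:

```javascript
// Split records into fixed-size batches; processing one batch at a
// time keeps peak memory proportional to batchSize, not file size.
function toBatches(records, batchSize) {
  const batches = [];
  for (let i = 0; i < records.length; i += batchSize) {
    batches.push(records.slice(i, i + batchSize));
  }
  return batches;
}
```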
Symptoms: 429 Too Many Requests errors
Solutions:
```ts
throughput: {
  rateLimitRps: 5, // 5 requests per second
}
```

```ts
throughput: {
  concurrency: 1, // Sequential requests
}
```

```ts
errorHandling: {
  maxRetries: 5,
  retryDelayMs: 2000,
  backoffMultiplier: 2, // Exponential backoff
}
```
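With these settings, the delay before retry attempt n grows geometrically: base delay times multiplier^n. A sketch of the calculation (the `maxDelayMs` cap is an added assumption for illustration, not a documented option):

```javascript
// Delay before retry attempt `attempt` (0-based) under exponential
// backoff: retryDelayMs * multiplier^attempt, capped at maxDelayMs.
function backoffDelayMs(attempt, retryDelayMs = 2000, multiplier = 2, maxDelayMs = 60000) {
  return Math.min(retryDelayMs * Math.pow(multiplier, attempt), maxDelayMs);
}
```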
Symptoms: “Cannot read property of undefined”
Solutions:
```ts
dataPath: 'data.items' // Must match response structure
```

```bash
curl -X GET https://api.example.com/products \
  -H "Authorization: Bearer token"
```

```ts
operators: [
  { op: 'default', args: { path: 'items', value: [] } }
]
```
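Both the `dataPath` lookup and the `default` operator come down to resolving a dotted path against an object and falling back when any segment is missing. A hypothetical sketch of that resolution:

```javascript
// Resolve a dotted path like 'data.items' against a response object,
// returning `fallback` when any segment along the way is missing.
function getPath(obj, dottedPath, fallback) {
  let current = obj;
  for (const key of dottedPath.split('.')) {
    if (current == null || !(key in Object(current))) return fallback;
    current = current[key];
  }
  return current;
}
```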
Symptoms: “UNABLE_TO_VERIFY_LEAF_SIGNATURE” or “CERT_HAS_EXPIRED”
Solutions:
```bash
# Ubuntu/Debian
sudo apt-get update && sudo apt-get install --only-upgrade ca-certificates
# macOS
brew upgrade openssl
```

```js
// NOT recommended for production: disables all TLS verification
process.env.NODE_TLS_REJECT_UNAUTHORIZED = '0';
```

```ts
connectionConfig: {
  ca: fs.readFileSync('/path/to/ca.pem'),
}
```
Symptoms: Search results don’t match database
Solutions:
```graphql
mutation {
  rebuildDataHubSearchIndex(indexName: "products") {
    success
    itemsIndexed
  }
}
```
Symptoms: “Index error” or documents not appearing
Check:
```ts
bulkSize: 500 // Reduce if failing
```
Symptoms: Pipeline runs through gate without pausing
Check:
```ts
approvalType: 'MANUAL' // Requires manual approval
```
```graphql
query {
  dataHubLogs(options: { take: 10 }) {
    items {
      id
      level
      message
      stepKey
      createdAt
    }
    totalItems
  }
}
```
Symptoms: Gate doesn’t auto-approve after timeout
Solutions:
```ts
timeoutSeconds: 3600 // 1 hour
```
Symptoms: “Adapter not found” for custom adapter
Check:
```ts
DataHubPlugin.init({
  adapters: [myCustomAdapter],
})
```

```ts
code: 'my-custom-adapter' // Exact match required
```

```bash
npm run build
```

```ts
import { myAdapter } from './adapters/my-adapter';
```
Symptoms: Transform step fails with custom operator
Debug:
```ts
applyOne(record, config, helpers) {
  console.log('Input:', record);
  // ... operator logic ...
  console.log('Output:', result);
  return result;
}
```

```ts
const result = myOperator.applyOne(
  { test: 'data' },
  { /* config */ },
  helpers
);
expect(result).toEqual({ /* expected */ });
```
Symptoms: Memory usage grows over time
Tools:
```bash
node --inspect server.js
# Open chrome://inspect
# Take heap snapshots
```

```js
setInterval(() => {
  const used = process.memoryUsage();
  console.log('Memory:', Math.round(used.heapUsed / 1024 / 1024), 'MB');
}, 60000);
```
```bash
npm install -g clinic
clinic doctor -- node server.js
```
```js
// Bad: a new listener is added on every call and never removed
eventEmitter.on('event', handler);

// Good: the listener is removed after it fires
const handler = () => { /* ... */ };
eventEmitter.once('event', handler);
// Or: eventEmitter.removeListener('event', handler);

// Bad: unbounded cache
const cache = new Map(); // Never cleared

// Good: bounded cache
const cache = new LRU({ max: 1000 }); // Bounded

// Bad: timer handle discarded, interval runs forever
setInterval(fn, 1000);

// Good: keep the handle so it can be cleared
const timer = setInterval(fn, 1000);
// Later: clearInterval(timer);
```
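A bounded cache like the one suggested above can also be built on `Map`'s insertion order without a dependency. A minimal sketch (real packages such as `lru-cache` add TTLs, size accounting, and better ergonomics):

```javascript
// Minimal bounded LRU cache. Map preserves insertion order, so the
// first key is always the least recently used; evict it when full.
class BoundedLru {
  constructor(max) {
    this.max = max;
    this.map = new Map();
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }
  set(key, value) {
    if (this.map.has(key)) {
      this.map.delete(key);
    } else if (this.map.size >= this.max) {
      this.map.delete(this.map.keys().next().value); // evict oldest
    }
    this.map.set(key, value);
  }
}
```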
```ts
DataHubPlugin.init({
  logging: {
    level: 'DEBUG', // DEBUG, INFO, WARN, ERROR
    logQueries: true,
    logSteps: true,
  },
})
```

```ts
.hooks({
  AFTER_EXTRACT: [{
    type: 'INTERCEPTOR',
    name: 'Debug log',
    code: `
      console.log('Extracted records:', records.length);
      console.log('Sample:', records[0]);
      return records;
    `,
  }],
})
```
```graphql
mutation {
  startDataHubPipelineRun(pipelineId: "pipeline-id") {
    id
    status
  }
}
```
```sql
-- Check recent runs
SELECT id, status, started_at, records_processed
FROM data_hub_pipeline_run
ORDER BY started_at DESC
LIMIT 10;

-- Check errors
SELECT * FROM data_hub_record_error
WHERE run_id = 'run-id'
LIMIT 10;

-- Check checkpoints
SELECT * FROM data_hub_checkpoint
WHERE pipeline_id = 'pipeline-id';
```
```ts
.hooks({
  BEFORE_TRANSFORM: [{
    type: 'INTERCEPTOR',
    code: `
      context.startTime = Date.now();
      return records;
    `,
  }],
  AFTER_TRANSFORM: [{
    type: 'INTERCEPTOR',
    code: `
      const duration = Date.now() - context.startTime;
      const rps = Math.round(records.length / (duration / 1000));
      console.log(\`Transform: \${duration}ms, \${rps} rec/sec\`);
      return records;
    `,
  }],
})
```
```graphql
mutation {
  cancelDataHubPipelineRun(id: "run-id") {
    success
  }
}
```
```bash
# Find process
ps aux | grep vendure
# Kill process
kill -9 PID
```
```bash
psql vendure_db < backup.sql
npm install @oronts/vendure-data-hub-plugin@1.5.0
pm2 restart vendure
```
```sql
-- View queue
SELECT * FROM job_queue
WHERE queue_name = 'data-hub.run'
AND state = 'PENDING';

-- Clear stuck jobs (use with caution)
DELETE FROM job_queue
WHERE queue_name = 'data-hub.run'
AND state = 'PENDING'
AND created_at < NOW() - INTERVAL '1 hour';
```
If you can’t resolve an issue:
When reporting issues, include:
```bash
# System info
node --version
npm --version
npx vendure version

# Plugin version
npm list @oronts/vendure-data-hub-plugin

# Database version
psql --version

# Recent logs
pm2 logs vendure --lines 100

# Pipeline configuration (sanitized)
# Export pipeline as JSON, remove sensitive data
```