Implementing Bulk Product Import
Bulk import is a one-time load of thousands or hundreds of thousands of items. Unlike incremental updates, the task is to process the entire volume in a reasonable time without overwhelming the database and without tying up all queue workers for a day.
Size Matters: Strategies by Volume
| Volume | Method | Processing Time |
|---|---|---|
| Up to 1,000 | Sync in request | Seconds |
| 1,000–50,000 | One queued task with chunks | Minutes |
| 50,000–500,000 | Fan-out: N parallel Jobs | 10–60 minutes |
| Over 500,000 | Batch insert + separate pipeline | Hours |
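The thresholds above can be encoded as a simple guard in front of the dispatcher. A minimal sketch in plain PHP; the strategy labels are illustrative placeholders, not part of any framework:

```php
/**
 * Map an import size to a processing strategy, following the
 * volume thresholds from the table above.
 */
function chooseImportStrategy(int $rowCount): string
{
    return match (true) {
        $rowCount <= 1_000   => 'sync',          // process in the request
        $rowCount <= 50_000  => 'chunked-job',   // one queued task with chunks
        $rowCount <= 500_000 => 'fan-out',       // N parallel jobs
        default              => 'batch-pipeline' // bulk insert + separate pipeline
    };
}
```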
Chunk + Queue Principle
A file with 100,000 rows should not be processed in a single Job. Split it into chunks, where each chunk becomes its own Job:
```php
class BulkImportDispatcher
{
    private const CHUNK_SIZE = 500;

    public function dispatch(ImportFile $file): void
    {
        // Track the run so progress can be reported later
        $import = ImportRun::create([
            'file_id' => $file->id,
            'status' => 'dispatching',
            'total' => 0,
        ]);

        $chunkIndex = 0;
        foreach ($file->parser()->chunks(self::CHUNK_SIZE) as $chunk) {
            ProcessImportChunkJob::dispatch($import->id, $chunkIndex, $chunk)
                ->onQueue('bulk-import'); // dedicated queue, see "Load Limiting"
            $chunkIndex++;
        }

        $import->update([
            'status' => 'processing',
            'total_chunks' => $chunkIndex,
        ]);
    }
}
```
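The `chunks()` generator used above can be approximated without any framework. This sketch shows the underlying idea: group any iterable of rows into fixed-size chunks without loading the whole file into memory (the helper name mirrors what `$file->parser()->chunks()` is assumed to do):

```php
/**
 * Yield fixed-size chunks from any iterable of rows.
 * Memory usage stays bounded by $size regardless of file length.
 */
function chunks(iterable $rows, int $size): \Generator
{
    $buffer = [];
    foreach ($rows as $row) {
        $buffer[] = $row;
        if (count($buffer) === $size) {
            yield $buffer;
            $buffer = [];
        }
    }
    if ($buffer !== []) {
        yield $buffer; // final partial chunk
    }
}
```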
Bulk Upsert Instead of Per-Row Queries
The main performance tool is INSERT ... ON CONFLICT DO UPDATE (upsert). Laravel supports it via Model::upsert():
```php
class ProcessImportChunkJob implements ShouldQueue
{
    public int $timeout = 120;

    public function __construct(
        private int $importId,
        private int $chunkIndex,
        private array $chunk,
    ) {}

    public function handle(): void
    {
        $rows = [];
        foreach ($this->chunk as $item) {
            $rows[] = [
                'sku' => $item['sku'],
                'name' => $item['name'],
                'price' => $item['price'],
                'updated_at' => now(),
            ];
        }

        // One SQL query instead of 500 separate ones
        Product::upsert(
            $rows,
            uniqueBy: ['sku'],
            update: ['name', 'price', 'updated_at']
        );

        DB::table('import_runs')
            ->where('id', $this->importId)
            ->increment('processed_chunks');
    }
}
```
One upsert of 500 rows in PostgreSQL takes ~50–200 ms, versus 500 × 5 ms = 2,500 ms for individual queries.
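Under the hood, `Model::upsert()` compiles to a single multi-row statement. A simplified sketch of roughly what PostgreSQL receives; note that it inlines values for readability, whereas real code must use parameter binding:

```php
/**
 * Build a multi-row PostgreSQL upsert, roughly what
 * Product::upsert() compiles to. Simplified: assumes trusted,
 * pre-escaped values; never do this with real user input.
 */
function buildUpsertSql(array $rows, array $uniqueBy, array $update): string
{
    $columns = array_keys($rows[0]);
    $values = implode(', ', array_map(
        fn(array $row) => "('" . implode("', '", array_values($row)) . "')",
        $rows
    ));
    $sets = implode(', ', array_map(
        fn(string $col) => "$col = excluded.$col",
        $update
    ));

    return sprintf(
        'insert into products (%s) values %s on conflict (%s) do update set %s',
        implode(', ', $columns),
        $values,
        implode(', ', $uniqueBy),
        $sets
    );
}
```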
Pre-loading References into Memory
The most expensive operation during import is the database queries that resolve dependencies (a category by name, a supplier by ID, a tag by slug). The solution is to load all reference data into memory before processing:
```php
class ImportContext
{
    private array $categoryMap;   // ['name' => id]
    private array $supplierMap;   // ['code' => id]
    private array $existingSkus;  // ['sku' => product_id]

    public function preload(int $sourceId): void
    {
        $this->categoryMap = Category::pluck('id', 'name')->all();
        $this->supplierMap = Supplier::pluck('id', 'code')->all();
        $this->existingSkus = Product::where('source_id', $sourceId)
            ->pluck('id', 'sku')->all();
    }

    public function resolveCategoryId(string $name): ?int
    {
        return $this->categoryMap[$name] ?? null;
    }
}
```
For 200,000 SKUs these dictionaries take roughly 10–20 MB of memory, which is acceptable for a worker.
Load Limiting
During a bulk import it is important not to degrade the live site:
- Allocate a separate `bulk-import` queue with a limited number of workers (2–4)
- Don't touch the main `default` queue
- Run heavy imports at night via the Laravel Scheduler
- Use transactions per chunk, not for the entire file
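The night-run rule can be expressed directly in the scheduler. A sketch assuming an `import:bulk` artisan command exists (the command name is an assumption, not defined elsewhere in this article):

```php
// routes/console.php
use Illuminate\Support\Facades\Schedule;

Schedule::command('import:bulk')       // hypothetical import command
    ->dailyAt('02:00')                 // run at night, off peak
    ->onOneServer()                    // avoid duplicate runs across servers
    ->withoutOverlapping();            // skip if the previous run is still going
```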
Implementation Timeline
- Chunk dispatcher, bulk upsert, references preload — 2 days
- Bus::batch with final callback, post-import pipeline — +1 day
- Progress in admin UI, load limiting, night scheduler — +1 day







