Implementation of Data Validation for Product Import
Supplier data is uncontrolled input. Price lists may contain negative prices, empty SKUs, invalid units, XSS in descriptions, and strings instead of numbers. Without validation, this goes directly into the catalog. A good validation system blocks garbage at the entrance and provides clear error reports.
Two Levels of Validation
Structural validation — checks format and data types:
- Required fields are present
- Numbers are numbers
- Dates in correct format
- String lengths within allowed range
Business validation — checks semantic rules:
- Price doesn't exceed reasonable limit
- SKU is unique within source
- Category exists in system
- Price change doesn't exceed allowed percentage
Building a Validator Based on Laravel Validator
class ProductImportValidator
{
private array $rules = [
'sku' => ['required', 'string', 'max:100'],
'name' => ['required', 'string', 'max:500'],
'price' => ['required', 'numeric', 'min:0.01', 'max:100000000'],
'qty' => ['nullable', 'integer', 'min:0', 'max:9999999'],
'description' => ['nullable', 'string', 'max:100000'],
'category' => ['nullable', 'string', 'max:300'],
'images' => ['nullable', 'array', 'max:20'],
'images.*' => ['url', 'max:2000'],
'weight' => ['nullable', 'numeric', 'min:0', 'max:10000'],
];
public function validate(array $row): ValidationResult
{
$validator = \Illuminate\Support\Facades\Validator::make(
$row,
$this->rules,
$this->customMessages()
);
$errors = [];
if ($validator->fails()) {
$errors = $validator->errors()->toArray();
}
// Business rules
$errors = array_merge($errors, $this->applyBusinessRules($row));
return new ValidationResult(
valid: empty($errors),
errors: $errors,
data: $validator->validated(),
);
}
private function applyBusinessRules(array $row): array
{
$errors = [];
// Check for anomalous price change
if (!empty($row['sku']) && !empty($row['price'])) {
$existing = Product::where('sku', $row['sku'])->value('price');
if ($existing && $existing > 0) {
$change = abs($row['price'] - $existing) / $existing;
if ($change > 0.5) {
$errors['price'][] = "Price change {$change}% exceeds 50% threshold";
}
}
}
// Check for XSS in description
if (!empty($row['description'])) {
$clean = strip_tags($row['description']);
if ($clean !== $row['description']) {
$errors['description'][] = 'HTML tags detected in description';
}
}
return $errors;
}
}
Data Sanitization Before Validation
First clean obvious garbage, then validate:
class ProductDataSanitizer
{
public function sanitize(array $raw): array
{
return [
'sku' => $this->cleanString($raw['sku'] ?? ''),
'name' => $this->cleanString($raw['name'] ?? ''),
'price' => $this->parseDecimal($raw['price'] ?? null),
'qty' => $this->parseInt($raw['qty'] ?? null),
'description' => $this->sanitizeHtml($raw['description'] ?? ''),
'weight' => $this->parseDecimal($raw['weight'] ?? null),
'images' => $this->parseImageUrls($raw['images'] ?? []),
];
}
private function cleanString(?string $value): string
{
if ($value === null) return '';
$value = trim($value);
$value = preg_replace('/\p{C}/u', '', $value); // invisible characters
return mb_substr($value, 0, 1000);
}
private function parseDecimal(mixed $value): ?float
{
if ($value === null || $value === '') return null;
$value = str_replace([' ', ',', "\xc2\xa0"], ['', '.', ''], (string) $value);
return is_numeric($value) ? (float) $value : null;
}
private function parseInt(mixed $value): ?int
{
$decimal = $this->parseDecimal($value);
return $decimal !== null ? (int) $decimal : null;
}
private function sanitizeHtml(?string $html): string
{
if (!$html) return '';
// Allow only safe tags
return strip_tags($html, '<p><br><b><strong><i><em><ul><ol><li>');
}
private function parseImageUrls(mixed $raw): array
{
if (is_string($raw)) {
$raw = array_map('trim', explode(',', $raw));
}
return array_values(array_filter(
(array) $raw,
fn($url) => filter_var($url, FILTER_VALIDATE_URL)
));
}
}
Batch SKU Deduplication Within Import
One file may contain duplicate SKUs (supplier merged multiple price lists):
class ImportDeduplicator
{
private array $seenSkus = [];
public function check(string $sku, int $lineNumber): ?string
{
if (isset($this->seenSkus[$sku])) {
return "Duplicate SKU '{$sku}' at line {$lineNumber}, first seen at line {$this->seenSkus[$sku]}";
}
$this->seenSkus[$sku] = $lineNumber;
return null;
}
}
Error Handling Strategies
Not all errors are equal — need to categorize by severity:
enum ValidationSeverity: string
{
case CRITICAL = 'critical'; // row is skipped
case WARNING = 'warning'; // row imported with flag
case INFO = 'info'; // log only
}
Rules with severity:
| Rule | Severity |
|---|---|
| Empty SKU | CRITICAL |
| Missing name | CRITICAL |
| Price = 0 or negative | CRITICAL |
| Price changed >50% | WARNING |
| HTML in description | WARNING |
| Category not found | INFO |
| Broken image URL | INFO |
Error Collection and Storage
class ImportErrorCollector
{
private array $errors = [];
private int $criticalCount = 0;
public function add(int $line, string $sku, string $field, string $message, ValidationSeverity $severity): void
{
$this->errors[] = compact('line', 'sku', 'field', 'message', 'severity');
if ($severity === ValidationSeverity::CRITICAL) {
$this->criticalCount++;
}
}
public function persist(int $importId): void
{
if (empty($this->errors)) return;
// Write in batches of 1000 rows
foreach (array_chunk($this->errors, 1000) as $chunk) {
ImportError::insert(array_map(
fn($e) => array_merge($e, ['import_id' => $importId, 'severity' => $e['severity']->value]),
$chunk
));
}
ImportRun::find($importId)->update([
'errors_count' => count($this->errors),
'critical_errors_count' => $this->criticalCount,
]);
}
}
Import Abort Threshold
If too many critical errors — stop import entirely:
private const ABORT_THRESHOLD = 0.20; // 20% critical — stop
if ($collector->getCriticalRate() > self::ABORT_THRESHOLD) {
throw new ImportAbortedException(
"Too many critical errors: {$collector->getCriticalRate()}%"
);
}
Implementation Timeline
- Structural validation (Laravel Validator), sanitization, error collection — 1 day
- Business rules, SKU deduplication, severity levels — +1 day
- Error storage in DB, abort threshold, admin UI display — +1 day







