Parsing Product Images to Populate 1C-Bitrix
A catalog without images does not convert. Uploading photos manually for 3,000+ SKUs takes weeks. Parsing images from manufacturer or supplier websites closes the gap in 1–3 days — provided the download process, quality checks, and attachment to infoblock elements are handled correctly.
How 1C-Bitrix stores product images
Image metadata is stored in the b_file table; the files themselves live in /upload/iblock/. An infoblock element is linked to an image via the following fields:
- PREVIEW_PICTURE: thumbnail for listings (ID of the record in b_file)
- DETAIL_PICTURE: main photo for the product card
- A property of type F (file): for additional images
For a gallery, use a property of type F with the MULTIPLE = Y flag. The standard bitrix:catalog.element component reads images from this property. The examples in this article use the property code MORE_PHOTO for it.
Downloading and saving images
Step 1: download the file
$imageData = file_get_contents($imageUrl); // returns false on failure, so check before saving
// or via Guzzle with timeout and retry
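The "timeout and retry" part can also be sketched without Guzzle. The following is a minimal illustration, not a definitive implementation: the function names downloadWithRetry and fetchWithTimeout, the 10-second timeout, and the linear backoff are all assumptions made for the example.

```php
<?php
// Hypothetical helper: tries a download several times before giving up.
// $fetch is any callable that returns the body as a string, or false on failure.
function downloadWithRetry(string $url, callable $fetch, int $maxAttempts = 3, int $delayMs = 500): ?string
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $body = $fetch($url);
        if (is_string($body) && $body !== '') {
            return $body;
        }
        if ($attempt < $maxAttempts) {
            // Linear backoff: wait longer after each failed attempt.
            usleep($delayMs * 1000 * $attempt);
        }
    }
    return null; // every attempt failed
}

// Hypothetical default fetcher: file_get_contents with a 10-second timeout.
function fetchWithTimeout(string $url)
{
    $context = stream_context_create(['http' => ['timeout' => 10]]);
    return @file_get_contents($url, false, $context);
}
```

Usage would look like $imageData = downloadWithRetry($imageUrl, 'fetchWithTimeout'); a null result means the image should be skipped and logged.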
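Step 2: save via CFile::MakeFileArray() and CFile::SaveFile()
$tmpFile = tempnam(sys_get_temp_dir(), 'img_');
file_put_contents($tmpFile, $imageData);
$fileArray = CFile::MakeFileArray($tmpFile);
$fileArray['name'] = $filename; // file name with its extension, e.g. photo.jpg
$fileId = CFile::SaveFile($fileArray, 'iblock'); // ID of the record in b_file
unlink($tmpFile); // remove the temporary copy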
Step 3: attach to the element
CIBlockElement::SetPropertyValuesEx($elementId, $iblockId, [
'MORE_PHOTO' => ['n0' => ['VALUE' => $fileId]]
]);
For multiple images, use indexes n0, n1, n2, etc.
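As an illustration of the n0, n1, n2 indexing, the value array can be built from a list of saved file IDs. This is a sketch only; buildGalleryValues is a hypothetical helper name, and it assumes that a failed save yields false or 0 in place of a file ID.

```php
<?php
// Hypothetical helper: turns a list of saved b_file IDs into the
// n0, n1, n2... structure shown in step 3.
function buildGalleryValues(array $fileIds): array
{
    $values = [];
    $index = 0;
    foreach ($fileIds as $fileId) {
        $fileId = (int) $fileId;
        if ($fileId <= 0) {
            continue; // skip entries where the save failed
        }
        $values['n' . $index] = ['VALUE' => $fileId];
        $index++;
    }
    return $values;
}
```

The result can then be passed as ['MORE_PHOTO' => buildGalleryValues($fileIds)] in the call from step 3.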
Issues when downloading images
Image rights — legally, you must confirm the right to use the photos. Manufacturer images can generally be used to sell their products — but verify the terms.
Hotlink protection — source sites may check the Referer header. Pass the correct header:
$client->get($url, ['headers' => ['Referer' => 'https://source-site.com']]);
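When the download goes through file_get_contents rather than Guzzle, the same header can be passed through a stream context. The sketch below assumes the Referer should be the product page on which the image was found; buildHttpContextOptions is a hypothetical name and the 10-second timeout is an arbitrary choice.

```php
<?php
// Hypothetical helper: builds stream-context options that send the
// product page URL as the Referer header.
function buildHttpContextOptions(string $pageUrl, int $timeoutSeconds = 10): array
{
    // Reject malformed URLs and anything that could inject extra header lines.
    if (filter_var($pageUrl, FILTER_VALIDATE_URL) === false || preg_match('/[\r\n]/', $pageUrl)) {
        throw new InvalidArgumentException('Invalid page URL: ' . $pageUrl);
    }
    return [
        'http' => [
            'method'  => 'GET',
            'header'  => 'Referer: ' . $pageUrl . "\r\n",
            'timeout' => $timeoutSeconds,
        ],
    ];
}
```

Usage: $context = stream_context_create(buildHttpContextOptions($pageUrl)); followed by file_get_contents($imageUrl, false, $context).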
Image quality — not all found photos are usable. Check minimum dimensions before saving:
$imageInfo = getimagesizefromstring($imageData); // false if the data is not a valid image
if ($imageInfo === false || $imageInfo[0] < 300 || $imageInfo[1] < 300) continue; // skip invalid or small images
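The same check can be wrapped in a reusable function that also rejects data that is not an image at all. This is a sketch under assumptions: isUsableImage is a hypothetical name, and the list of accepted MIME types is an example rather than a requirement of 1C-Bitrix.

```php
<?php
// Hypothetical helper: returns true only for valid images of an accepted
// type that meet the minimum dimensions.
function isUsableImage(string $imageData, int $minWidth = 300, int $minHeight = 300): bool
{
    $info = @getimagesizefromstring($imageData);
    if ($info === false) {
        return false; // not an image, or the data is truncated
    }
    // Example whitelist; adjust to the formats the shop actually serves.
    $allowedTypes = ['image/jpeg', 'image/png', 'image/gif', 'image/webp'];
    if (!in_array($info['mime'], $allowedTypes, true)) {
        return false;
    }
    [$width, $height] = $info;
    return $width >= $minWidth && $height >= $minHeight;
}
```

Inside the download loop this becomes if (!isUsableImage($imageData)) continue;.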
Duplicates — the same URL appearing on different pages. Cache already-downloaded URL → file_id pairs in memory or a table.
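An in-memory version of the URL → file_id cache could look like the sketch below. DownloadedImageCache is a hypothetical class, and the normalization (trimming whitespace and dropping the #fragment) is an assumption about which URL variants should count as the same image.

```php
<?php
// Hypothetical in-memory cache of URL → file_id pairs.
final class DownloadedImageCache
{
    /** @var array<string, int> */
    private $fileIdsByUrl = [];

    // Normalizes a URL so that trivially different spellings share one entry.
    private function key(string $url): string
    {
        $url = trim($url);
        $hashPos = strpos($url, '#');
        if ($hashPos !== false) {
            $url = substr($url, 0, $hashPos); // the fragment never reaches the server
        }
        return $url;
    }

    // Returns the saved file ID for this URL, or null if it was not downloaded yet.
    public function get(string $url): ?int
    {
        return $this->fileIdsByUrl[$this->key($url)] ?? null;
    }

    public function put(string $url, int $fileId): void
    {
        $this->fileIdsByUrl[$this->key($url)] = $fileId;
    }
}
```

Before each download, call get(); on a hit, reuse the returned ID instead of downloading and saving the file again. For runs that span several processes, the same interface could be backed by a table.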
Extracting image URLs from the source
For a single main image:
$src = $crawler->filter('.product-image img')->attr('src');
For a gallery — images are often in data attributes:
$urls = [];
$crawler->filter('[data-image]')->each(function ($node) use (&$urls) {
    $urls[] = $node->attr('data-image');
});
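Sometimes an image array is embedded in JS, for example productImages: ["url1", "url2"]. Extract it with a regex and decode it as JSON, or read the image field from the page's JSON-LD block if the site provides one.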
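A minimal sketch of the regex approach follows. It assumes the array is flat, uses double-quoted strings (so it is valid JSON), and contains no ] character inside a URL; extractProductImages is a hypothetical name, and the key productImages is taken from the example above.

```php
<?php
// Hypothetical helper: pulls a flat JSON array of image URLs out of inline JS.
function extractProductImages(string $html, string $key = 'productImages'): array
{
    // Capture the first [...] that follows "key:".
    $pattern = '/' . preg_quote($key, '/') . '\s*:\s*(\[.*?\])/s';
    if (!preg_match($pattern, $html, $matches)) {
        return [];
    }
    $urls = json_decode($matches[1], true);
    if (!is_array($urls)) {
        return []; // e.g. single-quoted JS strings are not valid JSON
    }
    // Keep only string entries and reindex from 0.
    return array_values(array_filter($urls, 'is_string'));
}
```

An empty result signals that the page needs a different extraction strategy, such as the data attributes shown earlier.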
Handling existing images
Do not overwrite photos uploaded manually or imported from 1C. Logic:
- Check PREVIEW_PICTURE: if it is 0 or empty, add the parsed image
- For the gallery: add images only if the MORE_PHOTO property is empty
- Mark parsed photos with a prefix in the filename (parsed_) for later identification
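The rules above can be sketched as two small pure functions. Both names (planImageUpdate, parsedFileName) are hypothetical, and the input shapes are assumptions: $fields is the element's field array, and $galleryValues is the list of file IDs currently stored in MORE_PHOTO.

```php
<?php
// Hypothetical helper: decides which image slots the parser may fill.
function planImageUpdate(array $fields, array $galleryValues): array
{
    // 0, '0', null and '' all count as "no thumbnail".
    $hasPreview = !empty($fields['PREVIEW_PICTURE']);
    // Ignore empty placeholders when checking the gallery.
    $hasGallery = count(array_filter($galleryValues)) > 0;

    return [
        'setPreview'  => !$hasPreview,
        'fillGallery' => !$hasGallery,
    ];
}

// Hypothetical helper: derives a file name with the parsed_ prefix from a URL.
function parsedFileName(string $url): string
{
    $path = (string) parse_url($url, PHP_URL_PATH);
    $name = basename($path);
    if ($name === '') {
        $name = 'image'; // fallback when the URL has no file name
    }
    return 'parsed_' . $name;
}
```

Keeping the decision separate from the Bitrix calls makes it easy to log what would change during a dry run before anything is written.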
Work timeline
| Phase | Duration |
|---|---|
| Analyzing image structure on the source | 2–4 hours |
| Downloading, validating, saving via CFile | 1–2 days |
| Attaching to infoblock elements (thumbnail + gallery) | 4–8 hours |
| Error handling, retry logic, logging | 4 hours |
| Test run on 500 items | 4 hours |
Total: 3–5 working days. For large catalogs (10,000+ images), add 1–2 days for download speed optimization (parallel workers).

