Parsing product images for 1C-Bitrix content

A catalog without images does not convert. Uploading photos manually for 3,000+ SKUs takes weeks. Parsing images from manufacturer or supplier websites closes the gap in 1–3 days — provided the download process, quality checks, and attachment to infoblock elements are handled correctly.

How 1C-Bitrix stores product images

File metadata is stored in the b_file table, and the files themselves live under /upload/iblock/. An infoblock element references an image through the following fields:

  • PREVIEW_PICTURE — thumbnail for listings (ID of the record in b_file)
  • DETAIL_PICTURE — main photo for the product card
  • A property of type F (file), used for additional images

For a gallery, use a property of type F with the MULTIPLE = Y flag, conventionally with the code MORE_PHOTO. The standard bitrix:catalog.element component reads additional images from this property.

Downloading and saving images

Step 1: download the file

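$imageData = file_get_contents($imageUrl); // or via Guzzle with timeout and retry
if ($imageData === false) {
    continue; // download failed: skip this image (assumes a per-image loop)
}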
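
The timeout-and-retry idea mentioned in the comment can be sketched without Guzzle. The helper names downloadWithRetry and fetchViaStream, the 10-second timeout and the linear backoff are illustrative assumptions, not part of the original workflow.

```php
<?php
/**
 * Calls $fetch($url) up to $maxAttempts times and returns the first
 * non-empty body. $fetch must return a string, or false on failure.
 */
function downloadWithRetry(callable $fetch, string $url, int $maxAttempts = 3, int $delayMs = 0): string
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $body = $fetch($url);
        if (is_string($body) && $body !== '') {
            return $body;
        }
        if ($attempt < $maxAttempts && $delayMs > 0) {
            usleep($delayMs * 1000 * $attempt); // linear backoff between attempts
        }
    }
    throw new RuntimeException("Failed to download {$url} after {$maxAttempts} attempts");
}

/** Fetches a URL with file_get_contents() and an explicit timeout in seconds. */
function fetchViaStream(string $url, float $timeout = 10.0)
{
    $context = stream_context_create(['http' => ['timeout' => $timeout]]);
    return @file_get_contents($url, false, $context);
}
```

A call such as downloadWithRetry('fetchViaStream', $imageUrl, 3, 500) would then replace the bare file_get_contents() call. Passing the fetcher as a callable keeps the retry loop independent of the HTTP client.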

Step 2: build a file array with CFile::MakeFileArray() and save it with CFile::SaveFile()

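$tmpFile = tempnam(sys_get_temp_dir(), 'img_');
file_put_contents($tmpFile, $imageData);
$fileArray = CFile::MakeFileArray($tmpFile);
$fileArray['name'] = $filename; // name with the correct extension
$fileId = CFile::SaveFile($fileArray, 'iblock'); // ID of the new b_file record
if (is_file($tmpFile)) {
    unlink($tmpFile); // remove the temporary copy
}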
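
The $filename variable used above is not defined in the source. One possible way to derive it from the image URL is sketched below. The helper name parsedFilename and the list of accepted MIME types are assumptions. The MIME type can be taken from getimagesizefromstring($imageData)['mime'].

```php
<?php
/**
 * Builds a filesystem-safe filename from the source URL, prefixed with
 * "parsed_". The extension comes from the detected MIME type, not the URL.
 */
function parsedFilename(string $imageUrl, string $mime): string
{
    $extensions = [
        'image/jpeg' => 'jpg',
        'image/png'  => 'png',
        'image/gif'  => 'gif',
        'image/webp' => 'webp',
    ];
    if (!isset($extensions[$mime])) {
        throw new InvalidArgumentException("Unsupported image type: {$mime}");
    }
    $path = (string) parse_url($imageUrl, PHP_URL_PATH);
    $base = pathinfo(rawurldecode($path), PATHINFO_FILENAME);
    // Keep only characters that are safe on any filesystem.
    $base = trim((string) preg_replace('/[^A-Za-z0-9_-]+/', '_', $base), '_');
    if ($base === '') {
        $base = substr(md5($imageUrl), 0, 12); // fallback when the URL has no usable name
    }
    return 'parsed_' . $base . '.' . $extensions[$mime];
}
```

Taking the extension from the MIME type avoids saving a PNG under a .jpg name when the source URL is misleading.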

Step 3: attach to the element

CIBlockElement::SetPropertyValuesEx($elementId, $iblockId, [
    'MORE_PHOTO' => ['n0' => ['VALUE' => $fileId]]
]);

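For multiple images, use the keys n0, n1, n2, and so on, one per new value. File properties are commonly given a file array rather than a bare ID, so if the ID is not picked up in your Bitrix version, pass CFile::MakeFileArray($fileId) as VALUE.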
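
As a small sketch, the n0, n1, n2 structure can be generated from a list of values instead of being written by hand. The helper name buildGalleryValues is hypothetical.

```php
<?php
/**
 * Turns a list of file values into the
 * ['n0' => ['VALUE' => ...], 'n1' => ['VALUE' => ...]] shape
 * used for new values of a multiple property.
 */
function buildGalleryValues(array $fileValues): array
{
    $result = [];
    foreach (array_values($fileValues) as $index => $value) {
        $result['n' . $index] = ['VALUE' => $value];
    }
    return $result;
}
```

The result can be passed as the MORE_PHOTO entry of the array given to CIBlockElement::SetPropertyValuesEx().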

Issues when downloading images

Image rights — legally, you must confirm the right to use the photos. Manufacturer images can generally be used to sell their products — but verify the terms.

Hotlink protection — source sites may check the Referer header. Pass the correct header:

$client->get($url, ['headers' => ['Referer' => 'https://source-site.com']]);
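
When images come from many sources, the Referer value can be derived from each URL rather than hard-coded. The sketch below assumes the source site accepts its own origin as Referer. If images are served from a separate CDN host, the product page URL is the safer value. The function names and the User-Agent string are illustrative.

```php
<?php
/** Returns "scheme://host[:port]/" for an absolute URL, for use as a Referer. */
function refererFor(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['scheme']) || empty($parts['host'])) {
        throw new InvalidArgumentException("Not an absolute URL: {$url}");
    }
    $port = isset($parts['port']) ? ':' . $parts['port'] : '';
    return $parts['scheme'] . '://' . $parts['host'] . $port . '/';
}

/** Builds Guzzle-style request options carrying Referer and User-Agent headers. */
function hotlinkOptions(string $url, string $userAgent = 'Mozilla/5.0 (compatible; ImageImporter/1.0)'): array
{
    return [
        'headers' => [
            'Referer'    => refererFor($url),
            'User-Agent' => $userAgent,
        ],
        'timeout' => 10, // seconds
    ];
}
```

The options array can be passed as the second argument of $client->get($url, ...).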

Image quality — not all found photos are usable. Check minimum dimensions before saving:

$imageInfo = getimagesizefromstring($imageData);
if ($imageInfo === false) continue; // not a valid image
if ($imageInfo[0] < 300 || $imageInfo[1] < 300) continue; // skip images under 300x300 px
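
The same check can be wrapped in a reusable function that also rejects non-image responses such as HTML error pages. This is a minimal sketch. The function name isUsableImage and the list of allowed MIME types are assumptions.

```php
<?php
/**
 * Returns true when $imageData decodes as an image of an allowed type
 * whose width and height both meet the given minimums (in pixels).
 */
function isUsableImage(string $imageData, int $minWidth = 300, int $minHeight = 300): bool
{
    if ($imageData === '') {
        return false;
    }
    $allowed = ['image/jpeg', 'image/png', 'image/gif', 'image/webp'];
    $info = @getimagesizefromstring($imageData);
    if ($info === false) {
        return false; // not an image, for example an HTML error page
    }
    [$width, $height] = $info;
    return in_array($info['mime'], $allowed, true)
        && $width >= $minWidth
        && $height >= $minHeight;
}
```

Inside the download loop, the call becomes: if (!isUsableImage($imageData)) continue;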

Duplicates: the same image URL often appears on several pages. Cache URL → file_id pairs for already-downloaded images in memory or in a table, so that each file is downloaded and saved only once.
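
An in-memory version of such a cache might look like the sketch below. The class name ImageUrlCache and the normalization rules (drop the fragment, lowercase scheme and host) are assumptions. A table-backed variant would expose the same two methods.

```php
<?php
/** In-memory map from a normalized image URL to the saved file ID. */
final class ImageUrlCache
{
    /** @var array<string, int> */
    private $fileIds = [];

    /** Strips the fragment and lowercases scheme and host so trivial variants share a key. */
    public static function normalize(string $url): string
    {
        $url = (string) preg_replace('/#.*$/', '', trim($url));
        return (string) preg_replace_callback(
            '~^([a-z][a-z0-9+.-]*://[^/?]+)~i',
            function (array $m): string {
                return strtolower($m[1]);
            },
            $url
        );
    }

    /** Returns the cached file ID, or null when the URL has not been saved yet. */
    public function get(string $url): ?int
    {
        return $this->fileIds[self::normalize($url)] ?? null;
    }

    public function put(string $url, int $fileId): void
    {
        $this->fileIds[self::normalize($url)] = $fileId;
    }
}
```

Before downloading, call get(); after a successful CFile::SaveFile(), call put() with the returned ID.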

Extracting image URLs from the source

For a single main image:

$src = $crawler->filter('.product-image img')->attr('src');
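
The src attribute is often relative, so it usually has to be resolved against the page URL before downloading. The helper below is a simplified sketch: the name absoluteImageUrl is hypothetical, and "../" segments are not collapsed.

```php
<?php
/**
 * Resolves an image src against the page URL. Handles absolute,
 * protocol-relative (//host/x), root-relative (/x) and simple relative (x) forms.
 */
function absoluteImageUrl(string $src, string $pageUrl): string
{
    $src = trim($src);
    $page = parse_url($pageUrl);
    if ($page === false || empty($page['scheme']) || empty($page['host'])) {
        throw new InvalidArgumentException("Page URL must be absolute: {$pageUrl}");
    }
    if (preg_match('~^https?://~i', $src) === 1) {
        return $src; // already absolute
    }
    if (strncmp($src, '//', 2) === 0) {
        return $page['scheme'] . ':' . $src; // reuse the page's scheme
    }
    $origin = $page['scheme'] . '://' . $page['host']
        . (isset($page['port']) ? ':' . $page['port'] : '');
    if (strncmp($src, '/', 1) === 0) {
        return $origin . $src;
    }
    // Relative to the directory of the current page.
    $path = $page['path'] ?? '/';
    $dir = substr($path, 0, (int) strrpos($path, '/') + 1);
    return $origin . $dir . $src;
}
```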

For a gallery — images are often in data attributes:

$crawler->filter('[data-image]')->each(function($node) use (&$urls) {
    $urls[] = $node->attr('data-image');
});

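Sometimes an image array is embedded in JS, for example productImages: ["url1", "url2"]. Extract it with a regex, or read the image field from the page's JSON-LD block (script type="application/ld+json") when one is present.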
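
A minimal sketch of the regex route follows. It assumes the array literal is valid JSON (double-quoted strings, no nested arrays). The function name extractProductImages is hypothetical, and the key name is taken from the example above.

```php
<?php
/**
 * Finds `productImages: [...]` in page HTML and returns the URLs.
 * Returns an empty array when the key is missing or the literal is not valid JSON.
 */
function extractProductImages(string $html, string $key = 'productImages'): array
{
    $pattern = '/["\']?' . preg_quote($key, '/') . '["\']?\s*:\s*(\[[^\]]*\])/';
    if (preg_match($pattern, $html, $m) !== 1) {
        return [];
    }
    $urls = json_decode($m[1], true);
    if (!is_array($urls)) {
        return []; // for example, single-quoted strings
    }
    // Keep non-empty strings only and drop repeats.
    $urls = array_filter($urls, function ($u) {
        return is_string($u) && $u !== '';
    });
    return array_values(array_unique($urls));
}
```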

Handling existing images

Do not overwrite photos uploaded manually or imported from 1C. Logic:

  1. Check PREVIEW_PICTURE — if 0 or empty, add it
  2. For the gallery — add only if the MORE_PHOTO property is empty
  3. Mark parsed photos with a prefix in the filename (parsed_) for later identification
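
The first two rules can be expressed as a small decision function. This is a sketch under assumptions: the name planImageUpdate and the shape of the $element array are hypothetical, and the DETAIL_PICTURE check is an addition that mirrors the PREVIEW_PICTURE rule.

```php
<?php
/**
 * Decides which image slots the parser may fill without overwriting
 * existing photos. $element holds PREVIEW_PICTURE, DETAIL_PICTURE and
 * MORE_PHOTO (a list of file IDs).
 */
function planImageUpdate(array $element): array
{
    $isEmpty = function ($value): bool {
        return $value === null || $value === '' || $value === false || (int) $value === 0;
    };
    // Drop empty entries so [0] or [''] counts as an empty gallery.
    $gallery = array_filter((array) ($element['MORE_PHOTO'] ?? []));
    return [
        'set_preview' => $isEmpty($element['PREVIEW_PICTURE'] ?? null),
        'set_detail'  => $isEmpty($element['DETAIL_PICTURE'] ?? null),
        'set_gallery' => count($gallery) === 0,
    ];
}
```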

Work timeline

Phase                                                 | Duration
Analyzing image structure on the source               | 2–4 hours
Downloading, validating, saving via CFile             | 1–2 days
Attaching to infoblock elements (thumbnail + gallery) | 4–8 hours
Error handling, retry logic, logging                  | 4 hours
Test run on 500 items                                 | 4 hours

Total: 3–5 working days. For large catalogs (10,000+ images), add 1–2 days for download speed optimization (parallel workers).
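
The total can be checked by summing the phases. The sketch below assumes an 8-hour working day, which the estimate does not state explicitly. The function name totalWorkingDays is illustrative.

```php
<?php
/** Sums [min, max] hour ranges and converts both totals to working days. */
function totalWorkingDays(array $phasesInHours, int $hoursPerDay = 8): array
{
    $min = array_sum(array_column($phasesInHours, 0));
    $max = array_sum(array_column($phasesInHours, 1));
    return [$min / $hoursPerDay, $max / $hoursPerDay];
}

$phases = [
    [2, 4],   // analyzing image structure
    [8, 16],  // downloading, validating, saving (1–2 days)
    [4, 8],   // attaching to infoblock elements
    [4, 4],   // error handling, retry logic, logging
    [4, 4],   // test run on 500 items
];
[$minDays, $maxDays] = totalWorkingDays($phases);
```

Under that assumption the phases add up to 2.75 to 4.5 days, which rounds to the stated 3–5 working days.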