
Technical Design: Images metadata processing

Sergii Ivashchenko edited this page Jun 10, 2020 · 9 revisions

Module

The Magento\MediaGalleryMetadata module should be responsible for processing image metadata.

Functionality

The MediaGalleryMetadata module should provide the ability to extract the metadata from a file and populate the Media Asset entity fields when an image is uploaded to Magento.

The MediaGalleryMetadata module should also provide the ability to update the metadata stored in an image file.

API


Two service interfaces and one data interface should be introduced in the Magento\MediaGalleryMetadataApi module and implemented in the Magento\MediaGalleryMetadata module.

Data:

  • MetadataInterface extending Magento\Framework\Api\ExtensibleDataInterface
    • getKeywords(): array
    • getTitle(): string
    • getDescription(): string

Services:

  • ExtractMetadataInterface::execute(string $content): MetadataInterface (retrieves the metadata from the file content)
  • AddMetadataInterface::execute(string $content, MetadataInterface $metadata): string (updates the metadata in the content and returns the updated content)
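
The interfaces listed above could be declared roughly as follows. This is a minimal sketch, not the module's actual code: namespaces are omitted, and in Magento the data interface would additionally extend Magento\Framework\Api\ExtensibleDataInterface, which is left out here so that the snippet runs on its own.

```php
<?php
declare(strict_types=1);

// Sketch only: in Magento these would live in Magento\MediaGalleryMetadataApi,
// and MetadataInterface would extend Magento\Framework\Api\ExtensibleDataInterface.

interface MetadataInterface
{
    /** @return string[] */
    public function getKeywords(): array;

    public function getTitle(): string;

    public function getDescription(): string;
}

interface ExtractMetadataInterface
{
    /** Retrieve the metadata from the file content. */
    public function execute(string $content): MetadataInterface;
}

interface AddMetadataInterface
{
    /** Return the file content with the metadata updated. */
    public function execute(string $content, MetadataInterface $metadata): string;
}
```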

Exceptions

A LocalizedException should be thrown if the metadata cannot be retrieved or saved.

Supported file formats

The Magento media gallery supports the JPEG, GIF and PNG formats.

JPEG images can have the metadata saved in EXIF, XMP and IPTC formats.

PNG images can have the metadata saved in EXIF, XMP and IPTC formats.

GIF images can have the metadata saved only in XMP format.

Implementation

The metadata reading/writing should work with the IIM/IPTC, XMP and EXIF file segments.

Retrieving Metadata

The metadata should be retrieved from the file segments in the following fallback order (the next segment should be used only if some of the required values cannot be found in the previous segments):

  • IIM/IPTC
  • XMP
  • EXIF

To achieve that, a ReaderPool class should be introduced, providing access to the readers for each format.

The readers for each segment should be added to the ReaderPool class using DI configuration:

  • Reader\Iptc class
  • Reader\Xmp class
  • Reader\Exif class


Each reader should implement the internal ReadMetadataInterface::execute(string $content): array and return all the metadata retrieved from its segment as an array.

The GetMetadataInterface implementation should extract the values required for the MetadataInterface from the reader's response and call the next reader only if required values were missing from the previous reader's response.
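
The fallback described above can be sketched as a loop over the readers. This is one possible reading of the rule, under stated assumptions: the helper name extractWithFallback, the field names, and the use of closures in place of reader objects are all hypothetical, and values found by an earlier reader are assumed to take precedence over later ones.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: merges reader results in order and stops as soon as
 * every required field has a non-empty value.
 *
 * @param callable[] $readers  each takes string $content and returns an array
 * @param string[]   $required field names needed for MetadataInterface
 */
function extractWithFallback(array $readers, string $content, array $required): array
{
    $result = [];
    foreach ($readers as $reader) {
        // Earlier readers win: only fill fields that are still missing.
        foreach ($reader($content) as $field => $value) {
            $isEmpty = $value === '' || $value === [] || $value === null;
            if (in_array($field, $required, true) && !isset($result[$field]) && !$isEmpty) {
                $result[$field] = $value;
            }
        }
        // Stop once nothing is missing; later segments are not read at all.
        if (count($result) === count($required)) {
            break;
        }
    }
    return $result;
}

// Stand-in readers returning fixed arrays instead of parsing real segments.
$calls = [];
$iptc = function (string $content) use (&$calls): array {
    $calls[] = 'iptc';
    return ['title' => 'From IPTC'];
};
$xmp = function (string $content) use (&$calls): array {
    $calls[] = 'xmp';
    return ['title' => 'From XMP', 'description' => 'From XMP', 'keywords' => ['a', 'b']];
};
$exif = function (string $content) use (&$calls): array {
    $calls[] = 'exif';
    return [];
};

$metadata = extractWithFallback(
    [$iptc, $xmp, $exif],
    'raw image bytes',
    ['title', 'description', 'keywords']
);
```

In this run the title comes from the IPTC stand-in, the description and keywords come from the XMP stand-in, and the EXIF stand-in is never called.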

The sequence of readers should be determined by the item keys in the DI configuration:

<type name="Magento\MediaGalleryMetadata\Model\ReaderPool">
    <arguments>
        <argument name="readers" xsi:type="array">
            <item name="10" xsi:type="object">Magento\MediaGalleryMetadata\Model\Reader\Iptc</item>
            <item name="20" xsi:type="object">Magento\MediaGalleryMetadata\Model\Reader\Xmp</item>
            <item name="30" xsi:type="object">Magento\MediaGalleryMetadata\Model\Reader\Exif</item>
        </argument>
    </arguments>
</type>
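
A minimal sketch of how a pool could honour those keys, assuming the pool sorts the injected array itself rather than relying on the order in which DI configuration was merged. The class body and the get() method are hypothetical, and plain strings stand in for reader objects.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch of ReaderPool; reader objects are replaced by plain strings here.
final class ReaderPool
{
    private array $readers;

    public function __construct(array $readers)
    {
        // Sort by the numeric item name so that 10 runs before 20 and 30.
        ksort($readers, SORT_NUMERIC);
        $this->readers = $readers;
    }

    /** Readers in fallback order. */
    public function get(): array
    {
        return array_values($this->readers);
    }
}

// Items deliberately passed out of order to show the effect of the sort.
$pool = new ReaderPool(['30' => 'Exif', '10' => 'Iptc', '20' => 'Xmp']);
$order = $pool->get();
```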

Saving Metadata

The metadata should be saved to all three segments/formats in the image file:

  • IIM/IPTC
  • XMP
  • EXIF

To achieve that, a WriterComposite class should be introduced that executes all the available writers and implements the internal AppendMetadataInterface::execute(string $content): string.

The writers implementing AppendMetadataInterface for each segment should be added to the WriterComposite class using DI configuration:

  • Writer\Iptc class
  • Writer\Xmp class
  • Writer\Exif class
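
The composite could look roughly like the sketch below. Note one assumption: the design gives the internal signature as execute(string $content): string, while this sketch adds a second array $metadata parameter so that the example can carry the values to write. TaggingWriter is a hypothetical stand-in that appends a marker instead of editing a real segment.

```php
<?php
declare(strict_types=1);

// Assumption: the $metadata parameter is not part of the signature given in the design.
interface AppendMetadataInterface
{
    public function execute(string $content, array $metadata): string;
}

final class WriterComposite implements AppendMetadataInterface
{
    /** @var AppendMetadataInterface[] */
    private array $writers;

    public function __construct(array $writers)
    {
        $this->writers = $writers;
    }

    public function execute(string $content, array $metadata): string
    {
        // Each writer receives the output of the previous one,
        // so all segments end up in the same file content.
        foreach ($this->writers as $writer) {
            $content = $writer->execute($content, $metadata);
        }
        return $content;
    }
}

// Hypothetical stand-in writer: tags the content instead of writing a segment.
final class TaggingWriter implements AppendMetadataInterface
{
    private string $segment;

    public function __construct(string $segment)
    {
        $this->segment = $segment;
    }

    public function execute(string $content, array $metadata): string
    {
        return $content . '[' . $this->segment . ':' . $metadata['title'] . ']';
    }
}

$composite = new WriterComposite([
    new TaggingWriter('iptc'),
    new TaggingWriter('xmp'),
    new TaggingWriter('exif'),
]);
$result = $composite->execute('bytes', ['title' => 'T']);
```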


Integration

MediaGalleryMetadata should be used in:

  • MediaGallerySynchronization module to populate the fields of synchronized images
  • MediaGalleryUi module
    • to populate the fields of images uploaded from the media gallery
    • to update metadata in the image file when the metadata is edited on the frontend
  • MediaGalleryIntegration module to populate the fields of images uploaded outside of media gallery

Processing formats

IPTC

IPTC standard: https://www.iptc.org/std/photometadata/specification/IPTC-PhotoMetadata

We can use the APP13 IPTC data of the image for reading and saving metadata.

Reading the metadata:

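// IPTC record 2 datasets used for the Media Asset fields.
$iptcHeaders = [
    'title' => '2#005',
    'headline' => '2#105',
    'keywords' => '2#025'
];

// getimagesize() fills $info with the APPn markers found in the file.
getimagesize('image.jpg', $info);
$keywords = [];
$title = null;
if (isset($info['APP13'])) {
    // iptcparse() returns false on failure, otherwise a list of values per tag.
    $decodedMetadata = iptcparse($info['APP13']) ?: [];
    $keywords = $decodedMetadata[$iptcHeaders['keywords']] ?? [];
    $title = $decodedMetadata[$iptcHeaders['headline']][0] ?? null;
}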

Saving the metadata:

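// Uses $iptcHeaders from the reading example. iptcembed() works with JPEG files only.
$filename = 'image.jpg';
$newTitle = 'New Title';
$length = strlen($newTitle);
// Dataset header: tag marker 0x1C, record number 2, dataset number (105 = headline).
$retval = chr(0x1C) . chr(2) . chr((int) substr($iptcHeaders['headline'], 2));
if ($length < 0x8000) {
    // Standard dataset: 2-byte big-endian length.
    $retval .= chr($length >> 8) . chr($length & 0xFF);
} else {
    // Extended dataset: 0x8004 flag followed by a 4-byte big-endian length.
    $retval .= chr(0x80) .
        chr(0x04) .
        chr(($length >> 24) & 0xFF) .
        chr(($length >> 16) & 0xFF) .
        chr(($length >> 8) & 0xFF) .
        chr($length & 0xFF);
}
$data = $retval . $newTitle;

// iptcembed() returns the new file content, or false on failure.
$content = iptcembed($data, $filename);
if ($content === false) {
    throw new RuntimeException('Unable to embed IPTC data into ' . $filename);
}
file_put_contents($filename, $content);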
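
When more than one field is written, the dataset encoding above can be pulled into a small function. The helper below is a hypothetical sketch (the name buildIptcDataset is not part of the design) that uses pack() for the big-endian length bytes. The resulting string could then be passed to iptcembed() in place of $data.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: encodes one IIM dataset
 * (tag marker, record number, dataset number, length, value).
 */
function buildIptcDataset(int $record, int $dataset, string $value): string
{
    $length = strlen($value);
    $header = chr(0x1C) . chr($record) . chr($dataset);
    if ($length < 0x8000) {
        // Standard dataset: 2-byte big-endian length.
        $header .= pack('n', $length);
    } else {
        // Extended dataset: 0x8004 flag, then a 4-byte big-endian length.
        $header .= chr(0x80) . chr(0x04) . pack('N', $length);
    }
    return $header . $value;
}

// Headline (2#105) once, then one keywords dataset (2#025) per keyword.
$data = buildIptcDataset(2, 105, 'New Title');
foreach (['summer', 'beach'] as $keyword) {
    $data .= buildIptcDataset(2, 25, $keyword);
}
```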

XMP

EXIF
