
Expose ExifKey from libexiv2 for easy interoperability #147

Closed
cmahnke opened this issue Aug 17, 2024 · 4 comments

Comments

@cmahnke
Contributor

cmahnke commented Aug 17, 2024

As far as I can see it's currently not possible to use Exif tag numbers directly. A use case would be better interoperability with other modules like pillow.
A solution could be the following:

  • Get a tag from the external library - those usually support using the raw numbers of the tags.
  • Use ExifKey to translate them to the strings needed by this API
  • Construct a list / dict to feed them into pyexiv2

This way it would be possible to have better interoperability with fine-grained control (no need to copy everything through a byte buffer).

@LeoHsiao1
Owner

Well. Most users don't care about tag numbers, so pyexiv2 never reads or writes them.

  1. Reading the tag number is quite simple, I just need to call the exiv2 API tag().
    I can read numbers like this:

    [tag name  ] Exif.Image.Artist
    [tag number] 315
    [tag name  ] Exif.Image.Rating
    [tag number] 18246
    [tag name  ] Exif.Image.RatingPercent
    [tag number] 18249
    [tag name  ] Exif.Image.Copyright
    [tag number] 33432
    [tag name  ] Exif.Image.ExifTag
    [tag number] 34665

    I'll add this feature in the next release of pyexiv2.

  2. Writing metadata based on the tag number can be tricky.
    Most users enter only the tag name, not the tag number. So pyexiv2 has to automatically determine the tag number corresponding to each tag name. This requires storing a mapping table in the pyexiv2 source code.
    But I don't understand why Pillow needs to write metadata based on tag number.
    To make it easier to code, I don't even respect the tag type. The exif tag has multiple data types, but I usually write it as str.

    pyexiv2/pyexiv2/convert.py

    Lines 103 to 104 in cbc6765

    typeName = 'string'
    value = str(value)

  3. As for the byte buffer: when exiv2 opens an image, it must call img->readMetadata() to load all the metadata and discover the byte offset of each tag. So it can't read or write only one tag.
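
A minimal sketch of the mapping table mentioned in point 2, assuming it is stored as a plain Python dict. The names TAG_NUMBERS, tag_number and keys_for_number are hypothetical and not part of pyexiv2; the entries are only the five tags listed in point 1.

```python
from typing import Dict, List

# Hypothetical lookup table, limited to the five tags quoted in point 1.
TAG_NUMBERS: Dict[str, int] = {
    "Exif.Image.Artist": 315,
    "Exif.Image.Rating": 18246,
    "Exif.Image.RatingPercent": 18249,
    "Exif.Image.Copyright": 33432,
    "Exif.Image.ExifTag": 34665,
}


def tag_number(key: str) -> int:
    """Return the tag number for a full key; raise KeyError if it is unknown."""
    return TAG_NUMBERS[key]


def keys_for_number(number: int) -> List[str]:
    """Return every known key that uses the given tag number."""
    return [key for key, n in TAG_NUMBERS.items() if n == number]
```

A table like this would have to be kept in sync with exiv2 by hand, which is the maintenance cost of this approach.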

@cmahnke
Contributor Author

cmahnke commented Aug 17, 2024

Well, thanks for looking into it, my proposal was a bit simpler: just provide a translation table...

A made-up example would be:

img_exif = image.getexif()
exiv2_dict = {}
for k, v in img_exif.items():
    tag = ExifKey.from_tag(k)
    exiv2_dict[tag.key] = v

Note: This example doesn't do any type check for the value - this would be a responsibility of the user.

Where ExifKey would have the following methods:

  • constructor __init__(str): creates a key object either from the full path (Exif.Image.XResolution) or from only the tag name (XResolution)
  • static from_tag(int): returns the Exif key for a tag number
  • name: str - Tag name (XResolution)
  • group: str - Group name (Image)
  • family: str - Family name, always Exif
  • tag: int (?) - Number of the tag
  • key: str - Fully qualified name (Exif.Image.XResolution)

This way there would be two ways to achieve interoperability: by number and by name, since it's possible to use names with Pillow, but these don't follow the hierarchy.
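
The proposed interface could look roughly like the following pure-Python sketch. It is not an existing pyexiv2 class: the _KNOWN_TAGS table is a hypothetical stand-in for a lookup that a real binding would delegate to exiv2, from_name replaces the string constructor for brevity, and from_tag assumes the Image group unless told otherwise. A plain dict stands in for the mapping returned by Pillow's image.getexif().

```python
from dataclasses import dataclass

# Hypothetical table: (group name, tag number) -> tag name.
# Only tags quoted in this thread are listed; a real binding would ask exiv2.
_KNOWN_TAGS = {
    ("Image", 270): "ImageDescription",
    ("Image", 271): "Make",
    ("Image", 315): "Artist",
    ("Image", 34665): "ExifTag",
}


@dataclass(frozen=True)
class ExifKey:
    group: str
    name: str
    tag: int
    family: str = "Exif"

    @property
    def key(self) -> str:
        """Fully qualified name, e.g. Exif.Image.Make."""
        return f"{self.family}.{self.group}.{self.name}"

    @classmethod
    def from_tag(cls, tag: int, group: str = "Image") -> "ExifKey":
        """Build a key from a tag number; the group defaults to Image."""
        return cls(group=group, name=_KNOWN_TAGS[(group, tag)], tag=tag)

    @classmethod
    def from_name(cls, text: str) -> "ExifKey":
        """Accept a full key (Exif.Image.Make) or a bare tag name (Make)."""
        parts = text.split(".")
        if len(parts) not in (1, 3):
            raise ValueError(f"expected 'Name' or 'Exif.Group.Name': {text!r}")
        for (group, tag), name in _KNOWN_TAGS.items():
            if name == parts[-1] and (len(parts) == 1 or parts[1] == group):
                return cls(group=group, name=name, tag=tag)
        raise KeyError(text)


# Stand-in for the tag-number mapping returned by Pillow's image.getexif().
pillow_like_exif = {270: "holiday", 271: "ExampleCam"}
exiv2_dict = {ExifKey.from_tag(k).key: v for k, v in pillow_like_exif.items()}
```

As in the made-up example above, no type checking is done on the values; that stays the caller's responsibility.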

References:

Using this approach would avoid the need to be able to use the number for writing; it's up to the programmer to do the mapping for writing...

@LeoHsiao1
Owner

LeoHsiao1 commented Aug 17, 2024

I tried calling ExifKey() of exiv2:

Exiv2::ExifKey key = Exiv2::ExifKey(34665, "Image");
std::cout << key.key()        << std::endl;
std::cout << key.familyName() << std::endl;
std::cout << key.groupName()  << std::endl;
std::cout << key.tagName()    << std::endl;
std::cout << key.tag()        << std::endl;

It outputs:

Exif.Image.ExifTag
Exif
Image
ExifTag
34665

I noticed that exiv2 uses decimal for the tag number, while Pillow uses hexadecimal, but that's easy to convert.
This code can be wrapped into a Python function and then used together with Pillow.
It works on one tag at a time, without opening the image.
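
The decimal and hexadecimal forms are the same integer, so converting only affects how the number is displayed. A small sketch using Python built-ins; the helper names format_tag and parse_tag are hypothetical:

```python
def format_tag(number: int) -> str:
    """Render a tag number as four-digit hexadecimal, e.g. 0x8769."""
    return f"0x{number:04X}"


def parse_tag(text: str) -> int:
    """Parse '34665' or '0x8769' into an int; base 0 honours the 0x prefix."""
    return int(text, 0)


# A hexadecimal literal and its decimal form compare equal.
assert 0x8769 == 34665
print(format_tag(34665))    # 0x8769
print(parse_tag("0x8769"))  # 34665
```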

However, here's the bad news.
When calling ExifKey(), you need to enter not only the tag number, but also the groupName.
Because there are multiple standards for exif, tag numbers can be duplicated. For example:

0x0001  Exif.Canon.CameraSettings
0x0001  Exif.Nikon1.Version
0x0001  Exif.Samsung2.Version

This is a paradox: now that you know the groupName, you should already know the full tag name as well. So you don't need to convert the tag number into the tag name.

Alternatively, we can manually save all possible exif tags, and their corresponding tag numbers, as a Python dict. But this Python dict needs to be updated frequently so that it is synchronized with exiv2.
https://github.com/Exiv2/exiv2/blob/main/src/tags_int.cpp
https://exiv2.org/metadata.html
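
To illustrate why a number alone is ambiguous, here is a sketch of such a hand-maintained table, assuming it maps each tag number to a list of candidate keys. KNOWN_KEYS, build_index and resolve are hypothetical names, and the entries are only the ones mentioned in this thread.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical excerpt; a complete table would have to track exiv2's tag lists.
KNOWN_KEYS: List[Tuple[int, str]] = [
    (0x0001, "Exif.Canon.CameraSettings"),
    (0x0001, "Exif.Nikon1.Version"),
    (0x0001, "Exif.Samsung2.Version"),
    (34665, "Exif.Image.ExifTag"),
]


def build_index(pairs: Iterable[Tuple[int, str]]) -> Dict[int, List[str]]:
    """Group full keys by tag number, keeping every duplicate."""
    index: Dict[int, List[str]] = defaultdict(list)
    for number, key in pairs:
        index[number].append(key)
    return dict(index)


def resolve(index: Dict[int, List[str]], number: int,
            group: Optional[str] = None) -> str:
    """Return the single matching key, or raise LookupError if none or several."""
    candidates = index.get(number, [])
    if group is not None:
        candidates = [k for k in candidates if k.split(".")[1] == group]
    if len(candidates) != 1:
        raise LookupError(f"tag {number}: {len(candidates)} candidates")
    return candidates[0]


INDEX = build_index(KNOWN_KEYS)
```

With this table, resolve(INDEX, 34665) succeeds on its own, while resolve(INDEX, 1) raises until a group such as "Nikon1" is supplied.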

@LeoHsiao1
Owner

I just released v2.15.0, which adds tagNumber in img.read_exif_detail() and img.read_iptc_detail().
For example:

>>> img.read_exif_detail()
{
    'Exif.Image.ImageDescription': {
        'idx': 1,
        'ifdName': 'IFD0',
        'tagDesc': 'A character string giving the title of the image. It may be a comment such as "1988 company picnic" or the like. Two-bytes character codes cannot be used. When a 2-bytes code is necessary, the Exif Private tag <UserComment> is to be used.',
        'tagLabel': 'Image Description',
        'tagNumber': 270,
        'typeName': 'Ascii',
        'value': 'test-中文-'
    },
    'Exif.Image.Make': {
        'idx': 2,
        'ifdName': 'IFD0',
        'tagDesc': 'The manufacturer of the recording equipment. This is the manufacturer of the DSC, scanner, video digitizer or other equipment that generated the image. When the field is left blank, it is treated as unknown.',
        'tagLabel': 'Manufacturer',
        'tagNumber': 271,
        'typeName': 'Ascii',
        'value': 'test-中文-'
    },
...
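
With tagNumber available, a number-to-name table for a given image can be derived from that output. The sketch below works on a literal dict shaped like the result shown above (trimmed to the fields it needs), so it does not call pyexiv2 itself; numbers_to_keys is a hypothetical helper.

```python
from typing import Any, Dict

# Trimmed copy of the structure shown above; only the fields used here are kept.
detail: Dict[str, Dict[str, Any]] = {
    "Exif.Image.ImageDescription": {"tagNumber": 270, "typeName": "Ascii"},
    "Exif.Image.Make": {"tagNumber": 271, "typeName": "Ascii"},
}


def numbers_to_keys(detail: Dict[str, Dict[str, Any]]) -> Dict[int, str]:
    """Map each tag number to its full key for the tags present in one image."""
    return {info["tagNumber"]: key for key, info in detail.items()}


lookup = numbers_to_keys(detail)
```

If two groups in the same image share a tag number, the later key overwrites the earlier one in this sketch, so a stricter version would key on the group as well.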
