PS3.5 H3.2 compliant #445

9enki · 2023-11-27T09:54:00Z

H3 of PS3.5 indicates that multiple encoding formats are applied within the Patient Name.
https://dicom.nema.org/medical/dicom/current/output/chtml/part05/sect_H.3.html
https://dicom.nema.org/medical/dicom/current/output/chtml/part05/sect_H.3.2.html

Currently, although it is possible to obtain multiple Specification Character Sets as Strs, decoding in the above case seems to decode at the first element of the Specification Character Set.

I would like to be able to decode the Patient Name in this PR in the correct way when it is encoded in multiple encoding formats.

It is likely that PS3.5 will be able to promptly support PS3.5 Chapter I Korean and Chapter J Chinese by utilizing this support.

9enki · 2023-11-27T11:19:39Z

@Enet4 Hello. I would like to do what we could not do in PR #444 here.

Let me ask you two questions.

I have a function called read_value_pns. I want to execute this function when VR::PN. If Patient Name does not contain delimiter =, read_value_pns executes read_value_strs as before. Is this an acceptable method of implementation for you? If this is a way to deviate from the design concept, I would be glad to get better advice.
- ~~If Patient Name does not contain delimiter =, read_value_pns executes read_value_str as before.~~ I found that read_value_pns reads the value from the buffer, so it would be a bug if read_value_strs was executed afterwards. I will come up with a fix for this.
(If the answer to the first question is yes) Is there any way to get all the values of a Specification Character Set in a read_value_pns? The current implementation of StatefulDecoder keeps only the first one, even if there are multiple Specific Character Sets, as shown here. Since changing self.text from SpecificCharacterSet to Vec<SpecificCharacterSet> is a big change, if there is another way to get the value of the Specification Character Set I would prefer to use that method. I would prefer to use that method if I can get the Specification Character Set value another way.

9enki · 2023-11-27T14:44:03Z

~~If Patient Name does not contain delimiter =, read_value_pns executes read_value_str as before.~~ I found that read_value_pns reads the value from the buffer, so it would be a bug if read_value_strs was executed afterwards. I will come up with a fix for this.

The process after reading information from the buffer of read_value_strs is split into separate processes named read_value_strs_impl, which can be called from both read_value_strs and read_value_pns.
https://github.com/Enet4/dicom-rs/pull/445/files#diff-65231185718d602fd3a63db21c6076af076fd9afb6c3d4303c6baeaffa2e1e13R385

9enki · 2023-11-27T17:38:06Z

(If the answer to the first question is yes) Is there any way to get all the values of a Specification Character Set in a read_value_pns? The current implementation of StatefulDecoder keeps only the first one, even if there are multiple Specific Character Sets, as shown here. Since changing self.text from SpecificCharacterSet to Vec is a big change, if there is another way to get the value of the Specification Character Set I would prefer to use that method. I would prefer to use that method if I can get the Specification Character Set value another way.

I added a member of type Vec<TC> called cs to StatefulDecoder, where Specification Character Set is added each time set_character_set is executed. And modified read_value_pns to decode the character using the value of cs.
I understand that set_character_set is executed by read_value_cs. Fixed set_character_set to run on all character encoding information present in the Specification Character Set, since read_value_cs was implemented in such a way that only the first element was added to the text. Fixed.

9enki · 2023-11-27T17:45:09Z

@Enet4 Implementation of this PR is complete. I would appreciate a review when you have time.

Enet4 · 2023-12-07T15:45:23Z

Hello, @9enki! I'll be looking into DICOM-rs matters the upcoming weeks after a small hiatus.

I have a function called read_value_pns. I want to execute this function when VR::PN. If Patient Name does not contain delimiter =, read_value_pns executes read_value_strs as before. Is this an acceptable method of implementation for you? If this is a way to deviate from the design concept, I would be glad to get better advice.

The way I see it, the concept of Person Name should include all component groups, rather than just one (alphabetic, ideographic, and phonetic), so they would all sit in a single PrimitiveValue. So, once we find out that we have component groups other than the alphabetical one, we decode each group and merge them back together. From the little I saw of the code so far, that seems to be the intended way to go. In any case, I am likely to give it another look later.

Is there any way to get all the values of a Specification Character Set in a read_value_pns? The current implementation of StatefulDecoder keeps only the first one, even if there are multiple Specific Character Sets, as shown here. Since changing self.text from SpecificCharacterSet to Vec<SpecificCharacterSet> is a big change, if there is another way to get the value of the Specification Character Set I would prefer to use that method. I would prefer to use that method if I can get the Specification Character Set value another way.

The main problem is that SpecificCharacterSet is too simple right now. The way forward is to adjust this data type to support more than one character set value. The steps would be something like this:

The existing idea of a SpecificCharacterSet would become a CharacterSetCode (a single encoding).
SpecificCharacterSet is recreated as an opaque structure that can hold up to n codes (depending on which they are; in ISO_IR 192 and some others, only 1 value would be acceptable).

Aside from the fact that one would no longer be able to introspect into the variants available, I think that most of the API would stay intact.

#[derive(Debug)]
pub struct SpecificCharacterSet(SmallVec<[CharacterSetCode; 2]>);

As we are now gathering enough material for a major version (0.7), we can proceed with breaking changes on SpecificCharacterSet and PersonName to fulfill these needs.

If any further guidance is needed, feel free to let me know.

Enet4 · 2024-03-13T17:05:13Z

Hello again. This contribution is a candidate for inclusion in 0.7.0, but then we need to make the proposed change of changing SpecificCharacterSet so that the type becomes opaque and a single value encompasses the possibility of multiple character set codings. Please let me know if you would be able to fulfill this or you would prefer someone else to take over.

Enet4 · 2024-04-13T09:22:57Z

Just letting you know that #489 was merged. It made SpecificCharacterSet opaque, so further contributions on text encoding can be brought in without breaking changes to the API.

Enet4 mentioned this pull request Nov 27, 2023

Tracker issue for text encoding #40

Open

6 tasks

9enki force-pushed the japanese-character-improved branch 2 times, most recently from d31a5e5 to 1be93b0 Compare November 27, 2023 14:41

9enki force-pushed the japanese-character-improved branch 2 times, most recently from c86f433 to fe26c98 Compare November 27, 2023 17:31

9enki changed the title ~~[WIP] PS3.5 H3.2 compliant~~ PS3.5 H3.2 compliant Nov 27, 2023

9enki force-pushed the japanese-character-improved branch from fe26c98 to 1af117b Compare November 27, 2023 17:46

Enet4 self-requested a review November 28, 2023 11:32

Enet4 added A-lib Area: library C-encoding Crate: dicom-encoding labels Nov 28, 2023

Enet4 added this to the DICOM-rs 0.7 milestone Mar 13, 2024

9enki added 3 commits March 28, 2024 11:54

add IsoIr13 and IsoIr87

affe2ca

add test

09912f3

PS3.5 H3.2 compliant

c742803

Enet4 added breaking change Hint that this may require a major version bump on release and removed breaking change Hint that this may require a major version bump on release labels Mar 28, 2024

Enet4 force-pushed the japanese-character-improved branch from c8de225 to 757e126 Compare March 28, 2024 12:09

Special character set starting with \

645cf11

Enet4 force-pushed the japanese-character-improved branch from 757e126 to 645cf11 Compare March 28, 2024 12:27

[parser] fix tests

9b09659

Enet4 removed this from the DICOM-rs 0.7 milestone Apr 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PS3.5 H3.2 compliant #445

PS3.5 H3.2 compliant #445

9enki commented Nov 27, 2023

9enki commented Nov 27, 2023 •

edited

Loading

9enki commented Nov 27, 2023 •

edited

Loading

9enki commented Nov 27, 2023

9enki commented Nov 27, 2023

Enet4 commented Dec 7, 2023

Enet4 commented Mar 13, 2024

Enet4 commented Apr 13, 2024

PS3.5 H3.2 compliant #445

Are you sure you want to change the base?

PS3.5 H3.2 compliant #445

Conversation

9enki commented Nov 27, 2023

9enki commented Nov 27, 2023 • edited Loading

9enki commented Nov 27, 2023 • edited Loading

9enki commented Nov 27, 2023

9enki commented Nov 27, 2023

Enet4 commented Dec 7, 2023

Enet4 commented Mar 13, 2024

Enet4 commented Apr 13, 2024

9enki commented Nov 27, 2023 •

edited

Loading

9enki commented Nov 27, 2023 •

edited

Loading