Duplicate Input DICOM Spatial Coordinates #535
Merged
Will flesh out comment more tomorrow, just wanted to get some preliminary notes down prior to tomorrow's meeting.
Solution stores each SOP instance in a new dictionary (not ideal) and checks if the combination of `StudyInstanceUID`, `SeriesInstanceUID`, and `ImagePositionPatient` has been seen before; if so, it removes the duplicate, with preference for the higher `AcquisitionNumber`. In cases where the `ImagePositionPatient` tag is not present (X-Ray images, and some non-image DICOM files such as SR, KO, and PR modalities), no duplication check takes place. This approach is effective for handling the case discussed.

I tested on ~20 clinical studies each for CT, CR, and MR modalities:
- […] `ImagePositionPatient` is absent), therefore no changes from existing loading functionality
- […] `ImagePositionPatient` values seem to be quite common for T1 mDIXON, mDIXON-Quant, and SE-EPI MRE series; will provide an anonymized version to illustrate

I wonder if it is worth having an input parameter to the operator to control whether or not duplicates are checked for. For example, if a user is targeting MR mDIXON series for inference (not sure how common this is), the duplicate check should probably be turned off to prevent a large number of SOP instances from not being loaded.
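To make the discussion concrete, here is a minimal sketch of the check described above, with the proposed on/off switch. It is not the code in this PR: the function name `drop_duplicate_positions`, the `check_duplicates` parameter, and the use of plain dictionaries in place of the operator's SOP instance objects are all assumptions for illustration. Treating a missing `AcquisitionNumber` as lower than any real value is also an assumption.

```python
from typing import Any, Dict, List, Tuple


def _acquisition_number(instance: Dict[str, Any]) -> int:
    """Return AcquisitionNumber as an int; a missing value ranks lowest (assumption)."""
    value = instance.get("AcquisitionNumber")
    return -1 if value is None else int(value)


def drop_duplicate_positions(
    instances: List[Dict[str, Any]],
    check_duplicates: bool = True,  # hypothetical switch, not an existing operator parameter
) -> List[Dict[str, Any]]:
    """Keep one instance per (study, series, position), preferring the higher AcquisitionNumber."""
    if not check_duplicates:
        return list(instances)

    kept: List[Dict[str, Any]] = []
    slot_by_key: Dict[Tuple[Any, ...], int] = {}

    for instance in instances:
        position = instance.get("ImagePositionPatient")
        if position is None:
            # No spatial coordinates (e.g. CR, SR, KO, PR): skip the check entirely.
            kept.append(instance)
            continue

        key = (
            instance["StudyInstanceUID"],
            instance["SeriesInstanceUID"],
            tuple(float(v) for v in position),
        )
        if key not in slot_by_key:
            slot_by_key[key] = len(kept)
            kept.append(instance)
        elif _acquisition_number(instance) > _acquisition_number(kept[slot_by_key[key]]):
            # Same coordinates seen before: replace with the later acquisition.
            kept[slot_by_key[key]] = instance

    return kept
```

With `check_duplicates=False` the input passes through unchanged, which is the behaviour a user targeting mDIXON series would presumably want.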
Finally, I am wondering how arbitrary referencing the higher `AcquisitionNumber` value is; besides the single test case, do we have any additional justification for using this approach?