Group DOIs with case insensitive comparison to resolve works duplication / not grouping
I think that when you compare the DOIs of works to decide what to group, you are not comparing them case-insensitively (that is, using ASCII case folding before comparing the text).
The doi:10.1002/14651858.CD001730.pub3 is equivalent to doi:10.1002/14651858.cd001730.pub3
but Scopus and CrossRef are using these two different presentations, which then list as separate works in ORCID.
Although in this case I can create a manual entry with the Scopus ID and the DOI as listed by CrossRef, I shouldn't have to, and the Scopus presentation of the DOI is the one preferred by the publisher.
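To illustrate the kind of comparison I mean, here is a minimal sketch in Python. It is not ORCID's actual grouping code: the function names `normalize_doi` and `group_by_doi`, the source labels, and the stripping of a leading `doi:` prefix are all my own assumptions for illustration.

```python
from collections import defaultdict


def normalize_doi(raw):
    """Return a comparison key for a DOI (hypothetical helper, not ORCID code)."""
    doi = raw.strip()
    # Assumption: an optional "doi:" prefix, in any case, is not part of the identifier.
    if doi[:4].lower() == "doi:":
        doi = doi[4:]
    # ASCII case folding only: map A-Z to a-z and leave every other character alone.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in doi)


def group_by_doi(works):
    """Group (source, doi) pairs by their normalized DOI."""
    groups = defaultdict(list)
    for source, doi in works:
        groups[normalize_doi(doi)].append(source)
    return dict(groups)


# The two presentations from this report; the source labels are placeholders.
works = [
    ("source A", "doi:10.1002/14651858.CD001730.pub3"),
    ("source B", "doi:10.1002/14651858.cd001730.pub3"),
]
groups = group_by_doi(works)
```

With a comparison like this, both entries above share one key and would fall into a single group. The original presentation of each DOI is left untouched for display, so the form preferred by the publisher can still be shown.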
Thanks for your suggestion to improve the ORCID Registry.
We are pleased to share that we have added a normalization option in the Registry as a part of our API 3.0 work. This allows identifiers which are case insensitive, such as DOIs, to be processed as the same regardless of case on the ORCID record. Therefore, they are grouped as expected. You can find the card with the update on our current development Trello, or view your ORCID record for an example if this issue has affected you previously: https://trello.com/c/5p6QK7bS/4254
Let us know if you find case sensitive identifiers which are not grouping on your record as expected.
ORCID Community Team
I vote for ignoring case in DOIs. We just discovered that DataCite has changed the case of its DOIs from all uppercase to lowercase when reporting to ORCID. So some of the works I gathered from DataCite in June match duplicate records, but new DataCite records (reported in October) do not; the only difference is the change to lowercase in the DOI.
Thanks for your response. Could you please provide some examples of this occurring, either here or to email@example.com?
ORCID Community Team
Martin Rittner commented
I see this happening even WHEN the DOI is the same, because of differences in the "URL" field (seems to be mainly Scopus' fault)...
When an ORCID record is updated from various literature sources such as Scopus, CrossRef, Pure, or one's own entries, duplicates are often not detected automatically and hidden in the public view. Instead, you have to go through the list manually and hide the duplicates. Deduplication works only if the records contain the same DOI.
Could this please be improved? Please try to establish a robust deduplication algorithm.
Baptiste Cecconi commented
Any news on case-insensitive DOI matching?
Reposting Israel Hanukoglu's comment in a related thread (https://support.orcid.org/forums/175591/suggestions/3342355):
> I confirm the problem reported by Stuart Ray on March 17, 2017.
The duplicate records in ORCID are a very serious problem that should be fixed.
Reposting Stuart Ray's comment from a related thread (https://support.orcid.org/forums/175591/suggestions/3342355), on March 17, 2017 4:32 PM:
> It appears (from my profile) that the DOI merge function fails to match (and merge) duplicates when there are differences in DOI capitalization from different sources (e.g. Scopus vs. ResearcherID vs. CrossRef).
Many thanks for your notice -- it is the same issue as above, and we have started gathering the information we'll use to address this and related issues.