Properly encode UTF-8 characters

Completed

Comments

6 comments

  • David W. Lawrence
    This is essential. Not only for "decorated Roman characters" like Klaus requests but also for persons with names that are represented in other alphabets.
  • David W. Lawrence
    +1 This request has been made in different ways by at least two other people. I know from direct experience that this is a simple thing to correct. It took only 90 minutes of billed time to make everything on my website (SafetyLit.org) solidly UTF-8 and also to detect and automatically repair mislabeled character-encoded data from multiple outside sources as it is imported. No ? character substitutions, no boxes, and no other odd character substitutions.
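A minimal sketch of what "detect and automatically repair mislabeled data" could look like. This is not SafetyLit's or ORCID's actual code; the function names are hypothetical, and the choice of Windows-1252 as the fallback encoding is an assumption about what legacy sources typically send.

```python
def decode_best_effort(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to Windows-1252 for legacy input."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: input that is not valid UTF-8 is really Windows-1252.
        return raw.decode("cp1252", errors="replace")


def repair_mojibake(text: str) -> str:
    """Undo UTF-8 text that was wrongly decoded as Windows-1252."""
    try:
        # Reverse the wrong decode, then decode the bytes correctly.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not mojibake of this kind; leave the text unchanged.
        return text


if __name__ == "__main__":
    print(decode_best_effort("Strüver".encode("cp1252")))
    print(repair_mojibake("StrÃ¼ver"))
```

The repair step is a heuristic: a string that legitimately contains a sequence such as "Ã¼" would also be rewritten, so a real importer would want to log or review what it changes.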
  • Ana Cardoso

    Hi,

    We've now addressed all the issues with UTF-8 encoding on emails, search, and display on our website so I'm marking this idea as closed. If you encounter any further issues please let us know so we can fix them too.

    Best,
    -Catalina
    ORCID Support

    --

    Hi -- thanks for bringing up the challenges that you have seen with UTF-8 encoding. While the database and website handle UTF-8 encoding, we discovered an incorrect setting that caused emails to be sent without this encoding.

    We have addressed this problem; emails should now send correctly.
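For readers wondering what such an email setting involves, here is a small illustration in Python. It is only a sketch under assumptions: the ORCID mail system is not shown in this thread, and `build_message` and the example addresses are hypothetical. The point is that the message body must be labeled with the UTF-8 charset so that mail clients decode it correctly.

```python
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Build a plain-text email whose body is explicitly labeled as UTF-8."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Declare the charset explicitly instead of relying on a default.
    msg.set_content(body, charset="utf-8")
    return msg


if __name__ == "__main__":
    message = build_message(
        "support@example.org",   # hypothetical address
        "user@example.org",      # hypothetical address
        "Grüße",
        "Hello Kay Strüver",
    )
    print(message.get_content_charset())
```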

    We will keep this idea open, as there are a couple of other things that we continue to work on including:
    * improving how we handle UTF-8 characters in site searches
    * improving searches within external databases to incorporate UTF-8 characters.

    Thanks for your input as we continue to evolve the ORCID Registry. Best, Laura

  • Ulli
    UTF-8 handling is not yet flawless. In some cases, the BibTeX entries are not properly encoded.
  • Jens Locher

    The issue still persists with imported records, e.g.

    https://orcid.org/0000-0003-1913-348X

    It shows works like this: "?I don't really have any issue with masculinity?: Older Canadian men's perceptions and experiences of embodied masculinity"

    While the source shows the proper character: https://www.cambridge.org/core/journals/ageing-and-society/article/i-am-busy-independent-woman-who-has-sense-of-humor-caring-about-others-older-adults-selfrepresentations-in-online-dating-profiles/CFD1080D6F0029BBB6B8333A59E8BBD6
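One plausible way a "?" like this can appear, shown as a sketch. This is an assumption, not a confirmed diagnosis of the import path: the guess is that the original title used typographic quotation marks (U+2018 and U+2019) and that some step converted the text to a single-byte encoding with lossy replacement. `round_trip` is a hypothetical helper.

```python
def round_trip(text: str, encoding: str) -> str:
    """Encode and decode text, replacing unencodable characters with '?'."""
    return text.encode(encoding, errors="replace").decode(encoding)


if __name__ == "__main__":
    # Assumed original: curly quotes around the phrase, straight apostrophe inside.
    title = "\u2018I don't really have any issue with masculinity\u2019"
    print(round_trip(title, "latin-1"))  # curly quotes are lost
    print(round_trip(title, "utf-8"))    # curly quotes survive
```

Latin-1 has no code points for the curly quotes, so they are replaced, while the same round trip through UTF-8 leaves the title intact.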

  • Jens Locher

    Encoding actually seems to get worse, e.g.

    @article{Str_ver_2017, doi = {10.1159/000476071}, url = {https://doi.org/10.1159%2F000476071}, year = 2017, publisher = {S. Karger {AG}}, volume = {30}, number = {4}, pages = {180--189}, author = {Kay Strüver and Wolfgang Friess and Sarah Hedtrich}, title = {Development of a Perfusion Platform for Dynamic Cultivation of in vitro Skin Models}, journal = {Skin Pharmacology and Physiology} }

    I am now seeing all kinds of issues in our author lists with characters like "è","ö","ñ","á","ü"
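The citation key in the entry above ("Str_ver_2017") has the "ü" replaced by an underscore. As a sketch of a gentler alternative, and assuming an ASCII-only key is what is wanted, a key generator could transliterate accented letters instead of discarding them. `ascii_key` is a hypothetical helper, not part of any ORCID or Crossref API.

```python
import unicodedata


def ascii_key(name: str, year: int) -> str:
    """Build an ASCII-only citation key, keeping base letters of accented characters."""
    # NFKD splits 'ü' into 'u' plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII (e.g. 'ß') is dropped in this sketch.
    ascii_only = stripped.encode("ascii", errors="ignore").decode("ascii")
    return f"{ascii_only}_{year}"


if __name__ == "__main__":
    print(ascii_key("Strüver", 2017))
```

This only concerns the key. The author and title fields should keep the original characters, which requires the entry itself to be stored and served as UTF-8.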

