Comment by strmpnk

7 years ago

An interesting part of this project was dealing with artist names. John has done a very good job normalizing all of this data into a graph that lets others discover related works and even merge otherwise distinct aliases.

A subset of this problem comes from transliterating the kanji used in names. Romaji is not always handled consistently, especially in historical contexts, and the characters used in names have their own rough history in how Han code points were digitized.
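
To make the romaji inconsistency concrete, here is a minimal sketch of collapsing a few common long-vowel spellings into one comparison key. This is not how the project does it. The function name `romaji_key` and the replacement rules are my own assumptions, and the folding is deliberately lossy, so it only suggests candidate matches for a human to review. It also does nothing about the kanji side of the problem.

```python
import unicodedata


def romaji_key(name: str) -> str:
    """Return a lossy comparison key for a romanized name (hypothetical helper)."""
    # Decompose accented letters (e.g. o + combining macron) and drop the marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Ignore case, hyphens, and apostrophes, which vary between sources.
    key = stripped.lower().replace("-", "").replace("'", "")
    # Assumed rule set: fold doubled long-vowel spellings to the short vowel.
    for long_form, short in (("ou", "o"), ("oo", "o"), ("uu", "u")):
        key = key.replace(long_form, short)
    # Collapse runs of whitespace.
    return " ".join(key.split())


variants = ["Andō Hiroshige", "Ando Hiroshige", "Andou Hiroshige"]
keys = {romaji_key(v) for v in variants}
print(keys)
```

All three spellings end up with the same key, which is the kind of signal that could flag two records as possible aliases. Names that differ only in vowel length would collide too, so a key like this is a starting point for matching, not a decision.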

One of the first steps was to adopt name indexes to help with normalization. Beyond those databases, it's been very interesting to see the graph analysis approach work alongside computer vision and carefully crafted apps that help archivists and researchers in these communities combine their own data. This is a great example of what technology can do for a community when the intersection between people and technology is handled well.