During my graduate studies in Japanese history, interpreting manuscript primary sources from the 19th century and earlier was one of my greatest challenges. Typeset historical records exist, especially for my period of concentration, the Bakumatsu-Meiji transition. But the further back in time the research goes, the rarer such documents become. There is a plethora of handwritten material, written in historical cursive, but learning to read it is a significant investment of time and resources, beyond the means of most people who might otherwise want to learn.
Additionally, due to a variety of linguistic and educational reforms implemented since the turn of the 20th century, even many native Japanese speakers cannot read historical cursive. Even for someone in my position, with a college affiliation but only sporadic classroom training in the subject, it was a challenge. I also could not afford to travel to pursue targeted studies in Japanese cursive at other universities. That being the case, I had to combine print resources and online glossaries to learn to read cursive writing through self-study. To this day, I still struggle with it.
To be fair, learning to read handwriting in any language, even contemporary handwriting, is a challenge for a non-native speaker. Paleography, the study and interpretation of historical writing systems, is even more of a challenge.
But even though I had to teach myself to read Edo period script from mostly ink and paper sources, a new generation of apps promises to make parsing Japanese cursive script, called kuzushiji (“squashed letters”) in Japanese, a much easier proposition for current and future students of Japanese history.
On September 13, Tokyo-based printing company Toppan Inc. announced the next stage in the development of its new kuzushiji software. Called Fuminoha Zemi, it is an AI-driven OCR (optical character recognition) service. Toppan has been developing its kuzushiji OCR technology with the help of a number of research organizations since 2015. It was based on the Bunsho Gazō System, a character recognition database developed by Professor Terasawa Kengo of Hakodate Mirai University, which collected examples of a given character (say, a hiragana に or a kanji 阿) across different scanned documents.
When a user takes a snapshot of a kuzushiji document, the Fuminoha system analyzes the characters and overlays a typographic equivalent for easier reading and further analysis. The transcription can be exported in plain text, PDF and HTML formats.
Of course, even with the app, reading these documents still requires training and experience with historical Japanese (yes, including historical kanji), which is not the same as contemporary standard Japanese. But once the text is turned into print form rather than kuzushiji, reading it is a much easier task. Fuminoha is currently deployed in its pre-beta proof-of-concept form in select museums and archives, and is not available for public use. In the new year, however, that should change. Its beta version for iOS is scheduled for January 2023, with a full public release for private use to follow in March. It will be available in both app and web browser versions.
A challenger rises
Fuminoha is not the only attempt at a kuzushiji transcription app based on OCR and AI models. For the past few months I’ve been using a competing app called Miwo, which has similar functionality and has been available for download since August 2021. Miwo was created by the Center for Open Data in the Humanities, part of Japan’s Research Organization of Information and Systems. It is built on two recognition models, both derived from the Kuzushiji dataset of the National Institute of Japanese Literature. As a historian keen to expand the range of sources at my disposal, I’m curious to see how Miwo compares to Fuminoha. I plan to write a sequel to this piece after Fuminoha is released, comparing the two apps’ features and performance.
To learn more about Fuminoha, including examples of how it works and more in-depth information, check out Toppan’s Fuminoha page at https://www.toppan.co.jp/biz/fuminoha/
For my part, I look forward to having one more tool at my disposal to access primary sources better and more easily. I also look forward to this improved readability of historical documents opening up large swathes of digitized material for the enjoyment, study and appreciation of the general public in Japan and abroad.