Comment by ____tom____
10 hours ago
>Have you ever daydreamed about talking to someone from the past?
It's going to be more like corresponding with someone from the past. We don't have much in the way of recorded speech from that era, so this will be built from written records. Much more than now, the written records are going to be formal and edited, reflecting a different pattern than casual speech or writing.
Having said that, this is cool. I recently had to OCR a two-hundred-year-old book with the usual garish fonts from that era. It was remarkably easy to do, and accurate.
You just reminded me of reading a free ebook of Burton’s translation of The Arabian Nights. I kept getting tripped up by “cloth” being used as a verb, couldn't figure out what it meant, and eventually gave up in frustration. Only later did I realize it was an OCR error (or post-OCR correction error) and the intended word was “doth,” as in “this transcription doth sucketh.”
> We don't have much in the way of recorded speech from that era
We may not have a ton, but we do have a lot of newsreels and radio broadcasts from the time surrounding WWI. Certainly enough to style-transfer a voice model to plug into the text model.