Comment by throwup238

1 day ago

I use the Portal de Archivos Españoles [1] for Spanish colonial documents. Each country has its own archive, but the Spanish one has the most content (35 million digitized pages).

The hard part is knowing where to look, since most of the images haven’t gone through HTR/OCR or indexing, so you have to understand Spanish colonial administration and go through the collections to find stuff.

[1] https://pares.cultura.gob.es/pares/en/inicio.html

Want to collab on a database and some clustering and analysis? I’m a data scientist at FAIR with an interest in antiquarian docs and books.

  • Sadly I'm just an amateur armchair historian (at best) so I doubt I'd be of much help. I'm mostly only doing the translation for my own edification

    • You may be surprised (or not?) at how many important scientific and historical works are done by armchair practitioners.

  • You should maybe reach out to the author of this blog post, professor Mark Humphries. Or to the genealogy communities: we regularly struggle with handwritten historical texts that no public AI model can make a dent in.

  • Spaniard here. Let me know if I can somehow help navigate all of that. I’m very interested in history and everything related to the 1400-1500 period (although I’m not an expert by any definition), and I’d love to see what modern technology could do here, especially OCR and VLMs.