Comment by gthompson512
12 hours ago
Sorry if I missed it, but what about keeping the suffixes and trying some fine-tuning on the source, then clustering sentences, or at least pages, which given the medium should be consistent-ish?
Great question, and something I've been thinking about. I stripped suffixes mostly to normalize some of the repeated endings (aiin, dy, etc.) that felt like filler, but you’re totally right that keeping them might preserve structure I lost.
Clustering by sentence or page would be interesting too. I haven't gone that far yet, but it’d be fascinating to see if there’s consistency across visual/media sections. Appreciate the insight!
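For anyone who wants to see what the suffix stripping does to the vocabulary, here is a rough sketch, not the actual pipeline. The suffix list holds only the two endings mentioned above, and `strip_suffix`, `token_counts`, and the sample tokens are all made up for illustration.

```python
from collections import Counter

# Assumption: only the two endings named in this thread; a real list would be longer.
SUFFIXES = ("aiin", "dy")


def strip_suffix(word, suffixes=SUFFIXES):
    """Remove the first matching suffix, but never reduce a word to nothing."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)]
    return word


def token_counts(text, strip=True):
    """Count whitespace-separated tokens, optionally with suffixes stripped."""
    words = text.split()
    if strip:
        words = [strip_suffix(w) for w in words]
    return Counter(words)


# Made-up sample tokens, not real transcription data.
sample = "qokaiin qokdy daiin chedy qokaiin"
kept = token_counts(sample, strip=False)
stripped = token_counts(sample, strip=True)

print("distinct tokens with suffixes kept:", len(kept))
print("distinct tokens with suffixes stripped:", len(stripped))
```

Stripping merges tokens that differ only in their ending, which shrinks the vocabulary but also throws away whatever the endings encode. Running the counts both ways on the same text is a cheap way to see how much is being collapsed.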