
Comment by IIAOPSW

3 years ago

I know your examples are intentionally extreme to prove a point, I'm biting anyway.

Parenthetical type grammar with an explicit start character and end character is pivotal for encoding information unambiguously. You can't replicate that with any system that uses the same character for the start and end, because it would be ambiguous whether you are starting a nested context or ending the present one. Double, single, and even the rare triple quote allow for nested quotation. In principle a clean open and close quotation mark would also solve this (no subtle pixel hunting). You're right that we don't truly need four redundant variations on bracketing, but reducing it to just one is probably too few, as it would be representing too many possible things at once. How about one pair for a narrative context (aka a quote), one pair for linguistic recursion (like I'm doing right now), and one pair for collections of objects such as a list or a set? Colons probably could be skipped, everything beyond that is strawmanning me. A certain small number of delimiters / particles / whatever are needed to have expressive completeness. You need to be able to build sequential lists, unordered lists, one of several possibility sets, and / or / not type relations. In other words, a natural language at the very least needs some sort of regex subsystem, but it need not be much more sophisticated than regex. I'm not a grammar denialist; in fact, quite the opposite. I want the information coded in simple grammar rules, not in ad hoc arbitrary tables that keep expanding.
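The ambiguity claim above can be illustrated with a small sketch. It is not from the comment itself: the function names `nesting_depths` and `count_readings` are hypothetical, and the model is deliberately simplified (marks only, no surrounding words). With distinct open and close characters, each delimiter has exactly one nesting depth. With a single shared mark, the same run of marks can be read as several different balanced structures.

```python
from functools import lru_cache


def nesting_depths(text, open_ch="(", close_ch=")"):
    """Return the nesting depth at each delimiter; raise if unbalanced."""
    depth = 0
    depths = []
    for ch in text:
        if ch == open_ch:
            depth += 1
            depths.append(depth)
        elif ch == close_ch:
            if depth == 0:
                raise ValueError("unmatched closing delimiter")
            depths.append(depth)
            depth -= 1
    if depth != 0:
        raise ValueError("unmatched opening delimiter")
    return depths


def count_readings(n_marks):
    """Count the balanced open/close readings of n identical marks."""

    @lru_cache(maxsize=None)
    def go(remaining, depth):
        if remaining == 0:
            return 1 if depth == 0 else 0
        total = go(remaining - 1, depth + 1)  # read this mark as an opener
        if depth > 0:
            total += go(remaining - 1, depth - 1)  # read it as a closer
        return total

    return go(n_marks, 0)


print(nesting_depths("(a(b)c)"))  # one unambiguous structure
print(count_readings(4))          # four identical marks: more than one reading
```

Four identical marks already admit two readings (two siblings, or one context nested in another), and the count grows quickly from there. In practice spacing and word boundaries rule out many readings, which this sketch ignores.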

I say this as someone who had a 12th grade vocabulary in 5th grade, and it's only gone up since: vocabulary is a waste of time.

Actually, I'm almost with you on 'c', but I'd rather throw out 'k' because it's one of the few that don't fit on a 7 segment display. Capital letters also don't add much information. Yes actually, I'm fine with all of those going away. I couldn't tell you why the people who design wayfinding signage avoid serifs like a pox, yet other design fields refuse to read without them. With or without seems to read just fine. I really don't care too much either way. Letters would be better if they all worked more like EFHLT. Right now, there are too many clashing elements. Some are boxy, some are round, some have sharp diagonals. I'm not saying it has to be a 7 segment design, but it would certainly be pleasing if learning the alphabet, its ordering, and how to write it could all happen much faster by just noticing a few easy repeating patterns. Yes actually, let's do language reform.

>It seems to me that many people seem to draw a line between what is acceptable and what is not based on whatever they are comfortable and familiar with by the time they reach the end of their schooling.

Well, I'll agree with you there. All too often pointless pedantry comes down to "my school must be right, otherwise I am wrong". Love or hate my reasoning, at least you can't accuse me of doing that.

> Parenthetical type grammar with an explicit start character and end character is pivotal for encoding information unambiguously.

You argue against multiple types of dashes because context is sufficient, despite there being typographical ambiguity. But you insist that we must have typographically unambiguous bracket characters. I must admit that I am struggling in this conversation to determine when we can depend on context and when we need unambiguous markers. Perhaps I am just incapable of picking up on the subtle context that backs up this position of yours. (:

> everything beyond that is strawmanning me

In fact, you will find examples of real human languages that exhibit more extreme versions of the things I have suggested.

FOREXAMPLELATINWASORIGINALLYWRITTENINASINGLECASEWITHNOSPACESBETWEENWORDS SENTENCESWERESEPARATEDBYASINGLESPACE OBVIOUSLYALLOFTHEPUNCTUATIONISUNNECESSARY SOALLARGUMENTSABOUTTYPOGRAPHYOTHERTHANTHATOFFONTSAREBASEDINREALITY

There are languages with simpler tense systems than what English has. Slavic languages, for example, tend not to have a pluperfect. So, the example of removing tenses is based in reality.

Hawaiian has an alphabet of just 13 letters. So, removing letters from the 26 in the English alphabet is based in reality.

The Dictionnaire de l'Académie française is being updated to its 9th edition and is expected to have ~60K words[0], whereas English dictionaries report an order of magnitude more[1] (even with the issues in the linked source, this is a large gap). Basic English[2] has a vocabulary of less than 1,000 words (if you desire a vast overhaul of the existing norms of typography, I hope that you are at least willing to entertain prior art in the area of overhauling the use of natural language as a valid example, even if you disagree with the intention or outcome). If you wanted me to go to extremes (which again, I did not in the post you replied to), I could have just suggested we use Toki Pona. Of course, if I did suggest such a conlang, you may have been correct that I was strawmanning you and going to extremes just for a point. Nevertheless, we can definitely conclude that there are, in fact, natural human languages with substantially fewer words than modern English, and there are definitely constructed and artificially restricted natural languages with enormously smaller vocabularies.

You need not agree that these examples constitute best practice, or that they represent desirable goals in the continued evolution of language and written communication. I hope, though, that you can recognize that none of these are strawmen, but based in reality, many in natural languages, and some in artificially constrained natural languages for specific purposes. If anything, I presented examples that do not represent the extremes of any position (I could easily have brought up languages with no written representation, for example). I merely selected additional examples that conform to a broad categorization of removing stuff from modern English.

I welcome further discussion on the topic, but I worry you might dismiss things I say you disagree with, as you have done once above by ascribing an intention of strawmanning you, and as you seem wont to do with typographical conventions you dislike. And if you want to eliminate the punctuation you dislike, what might you do to a person whose arguments you dismiss? (;

It seems though, that you just don’t like the various dashes, which is totally fine. Many other people and I find value in them. Still more probably just go along because, as I said, a big part of language norms comes from inertia. The point of language (other than perhaps some, but not all, artistic expression) is communication. Why abandon the norms that facilitate this communication? Is it better to stand on preference (or perhaps principle) and harm your attempts at communication or to yield to norms and be better understood (though perhaps annoyed)? I do not know that there is a correct answer to this question.

I do hope, though, that I have disabused you of the fanciful notions that I was cherry-picking ideas that are extreme just to prove a point and that I was strawmanning your argument. I have shown above numerous examples that back up each of my suggestions, grounded in the reality of natural human languages. Further, I have shown several examples that are truly extreme to show that my original suggestions were not “intentionally extreme to prove a point.”

[0] https://www.thoughtco.com/academie-francaise-1364522

[1] https://www.merriam-webster.com/help/faq-how-many-english-wo...

[2] https://simple.wikipedia.org/wiki/Basic_English

  • I don't care about multiple types of parenthesis per se, I do care about there being a spanning set of grammatical constructs. I don't think period and comma alone would be enough. You need to have constructs for compressing and abstracting. "John/Paul/Ringo/George were in the Beatles." Notice how I just made 4 sentences for the price of one. I could have written "John was in the Beatles", "Paul was in the Beatles" ... all four statements fully unrolled. You need constructs which let you FOIL sentence structure just like in math class, presenting (option A, B and C) to (you, and everyone else). You also need a handful of "client server type" interaction structures. Header information. A thing to indicate if the content is a question, request, demand, greeting etc. Grammar is not about encoding literal speech pausing, it's about encoding how to deserialize the linear sequence of words.
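The "FOIL" idea above, where one compressed sentence unrolls into every combination of its alternatives, can be sketched in a few lines. This is only an illustration under stated assumptions: `expand` is a hypothetical helper, alternatives are marked with `/` inside a single whitespace-delimited token, and nothing is done about verb agreement (the unrolled sentences keep "were").

```python
from itertools import product


def expand(sentence):
    """Unroll every 'A/B/C' token into one sentence per combination."""
    # Each word becomes a slot holding one or more alternatives.
    slots = [word.split("/") for word in sentence.split()]
    # The Cartesian product picks one alternative from every slot.
    return [" ".join(choice) for choice in product(*slots)]


for line in expand("John/Paul/Ringo/George were in the Beatles."):
    print(line)

# Two slots with alternatives multiply, as in FOIL: 2 x 2 = 4 sentences.
print(expand("A/B goes to you/everyone"))
```

The number of unrolled sentences is the product of the slot sizes, which is why a single delimiter buys so much compression.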

    In theory you could just make "(" and ")" the universal sub-context denoting symbol. You would just need an extra symbol to clarify what a given parenthesis means. A three-system scheme makes sense: one for data agnostic compression like a JSON object / foiling a math expression, one for relaying text itself as an object in the domain of discussion rather than as the thing being said (aka a "quotation"), and one for scopes that are part of the discussion per se (not quotation).

    Context suffices when the parts of speech have no chance of being in the same slot. Compound words and numbers... your machine screw example was pretty rare. I think the dashes are too specialized in meaning and too hard to tell apart to justify code points in the docs and buttons on my keyboard. If need be, distinguish the various flavors of hyphen with some rule about touching the letter or having two in a row. Our symbol set is reasonable. Not as succinct as Hawaiian, not so bloated as Chinese. 13 chars fit in 4 bits. 26 chars fit in 5. With great strain you can maybe find a workable set of grammatical symbols without blowing past 32 chars, but you will probably end up using a 6th bit. I'm against bloating the raw number of symbols and rules everyone has to rote learn, not dashes in particular. If it's already in frequent use like all the paren styles then fine, but let's not make anything worse than it has to be.
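The bit counts quoted above are easy to check. A minimal sketch, assuming a fixed-width code in which every symbol gets the same number of bits; `bits_needed` is a hypothetical helper name:

```python
def bits_needed(n_symbols):
    """Smallest bit width whose 2**width codes cover n_symbols symbols."""
    if n_symbols < 1:
        raise ValueError("need at least one symbol")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, with no float error.
    return (n_symbols - 1).bit_length()


for n in (13, 26, 32, 33):
    print(n, "symbols ->", bits_needed(n), "bits")
```

A 26-letter alphabet leaves 6 spare codes in 5 bits, and the 33rd symbol is what forces a 6th bit.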

    • > Grammar is not about encoding literal speech pausing,

      This is absolutely correct.

      > it's about encoding how to deserialize the linear sequence of words

      This is absolutely incorrect. Grammar is the collection of rules that governs how words combine into valid sentences in a language. Specifically, grammar is distinct from semantics, which is concerned with meaning. A nonsense statement may be grammatically correct.

      Punctuation is the collection of non-character glyphs that are used to capture the nuances of spoken language into a written form.

      Punctuation is orthogonal to grammar.

      Put more briefly: spoken language has grammar and no punctuation; written language has the same grammar as the same spoken language and also punctuation.

      Parenthetical asides are represented in spoken language with some combination of marker words, pauses, tone of voice, word choice, and perhaps other indicators I may have forgotten. The purpose of punctuation is to lend some of the nuance of spoken communication to the otherwise sparse written word.

      The argument of the number of bits to encode glyphs is also orthogonal to the purpose or usefulness of language, writing, and communication. Computers are tools. A keyboard should justify the paucity of its glyphs, rather than the other way around. Once we get here, we are in the realm of pure opinion and preference, which I don't have much interest in pursuing.
