Comment by mathisfun123

5 days ago

> it is nuts that in an object method, there is a performance enhancement through caching a member value

i don't understand what you think is nuts about this. it's an interpreted language and the word `self` is not special in any way (it's just convention - you can call the first param to a method anything you want). so there's no way for the interpreter/compiler/runtime to know you're accessing a field of the class itself (let alone that that field isn't a computed property or something like that).
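
A minimal sketch of both points, using a made-up `Counter` class (all names here are hypothetical): the first parameter is called `this` instead of `self`, and the second method copies attributes into locals before the loop. Hoisting like this is only safe if nothing else changes `step` while the loop runs, which is exactly what the interpreter can't assume on your behalf.

```python
class Counter:
    def __init__(this, step):          # `this` works just as well as `self`
        this.step = step
        this.total = 0

    def run_uncached(this, n):
        for _ in range(n):
            # Each iteration looks up `total` and `step` on the object again.
            this.total += this.step
        return this.total

    def run_cached(this, n):
        step = this.step                # one attribute lookup, then a plain local
        total = this.total
        for _ in range(n):
            total += step
        this.total = total              # write the result back once
        return total


uncached_result = Counter(2).run_uncached(5)
cached_result = Counter(2).run_cached(5)
```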

lots of hottakes that people have (like this one) are rooted in just a fundamental misunderstanding of the language and programming languages in general <shrugs>.

If you dig into JS engine implementations they deal with a lot of the same sorts of things. Simple objects with straightforward properties are tagged such that they skip the dynamic machinery with fallback paths to deal with dynamism when it is necessary.

A common approach is hidden classes that work much like classes in other languages. Reading a simple int property just reads bytes at an offset from the object pointer directly. Upon entry to the method, bits of the object are tested, and if the object is not known to be simple, execution escapes into the full dynamic machinery.
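
As a toy illustration only (this is not how any real engine is written, and `Shape`, `Obj` and `InlineCache` are invented names), the fast-path/fallback idea might look roughly like this: a read site remembers the last shape it saw and the slot offset for it, and only does the full lookup when the shape check fails.

```python
class Shape:
    """Stand-in for a hidden class: maps property names to slot offsets."""
    def __init__(self, names):
        self.offsets = {name: i for i, name in enumerate(names)}


class Obj:
    """An object is just a shape pointer plus a flat list of slots."""
    def __init__(self, shape, values):
        self.shape = shape
        self.slots = list(values)


class InlineCache:
    """Cache for a single property-read site."""
    def __init__(self, name):
        self.name = name
        self.cached_shape = None
        self.cached_offset = None
        self.hits = 0
        self.misses = 0

    def load(self, obj):
        if obj.shape is self.cached_shape:
            # Fast path: one identity check, then a direct slot read.
            self.hits += 1
            return obj.slots[self.cached_offset]
        # Slow path: full name lookup, then remember the result.
        self.misses += 1
        offset = obj.shape.offsets[self.name]
        self.cached_shape, self.cached_offset = obj.shape, offset
        return obj.slots[offset]


point_shape = Shape(["x", "y"])
p1 = Obj(point_shape, [1, 2])
p2 = Obj(point_shape, [3, 4])
site = InlineCache("y")
results = [site.load(p1), site.load(p2)]   # first load misses, second hits
```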

I don't know if those exact techniques would work for Python but this is not an either-or situation.

See also: modern Objective-C `objc_msgSend`, whose fast path is so fast on modern hardware that it is rarely a performance bottleneck, despite the runtime being able to add dynamic subclasses or forward messages.

What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable. You can look it up once, go off and do something else, and look it up again and it's changed. It's dynamism taken to an unnecessary extreme. Nobody in the real world expects this behaviour. Making it just a bit less dynamic wouldn't change the fundamentals of the language but it would make it a lot more tractable.

  • > What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable. You can look it up once, go off and do something else, and look it up again and it's changed.

    There is no such thing as 'successive references to the same member value' here. It's not that you look up the same object and it can change, it's that you are not referring to the same object at all.

    `self.x` is actually `type(self).__getattribute__(self, 'x')` (with `__getattr__` as the fallback when normal lookup fails), which can in fact return a different thing each time. `self.x` IS a string lookup, and that is not an implementation detail but a major design goal. This dynamism is one of the selling points of Python: it allows you to change and modify interfaces to reflect state. It's nice for some things and it is what makes Python Python. If you don't want that, use another language.

    • ok, then it is nuts that __getattr__ (itself a specially blessed function) is not required to be pure at least from the caller point of view.

  • In Python, attribute accesses aren't stable! `self.x` where `x` is a property is not guaranteed to refer to the same thing.

    And getting rid of descriptors would be a _fundamental change to the language_. An immense one. Loads of features are built off of descriptors or descriptor-like things.

    And the guarantee you're asking for doesn't hold in the JavaScript world either... I believe you can build descriptor-like things in JS now as well.

    _But_ if you want that you can use stuff like mypyc + annotations to get that for you. There are tools that let you get to where you want. Just not out of the box because Python isn't that language.

    Remember, this is a scripting language, not a compiled language. Every optimization for the things you talk about would be paid for at program load (you have pyc stuff but still..)

    Gotta show up with proof that what you're saying is verifiable and works well. Up until ~6 or 7 years ago CPython had a concept of being easy to onboard onto. Dataflow analyses make the codebase harder to deal with.

    Having said all of that.... would be nice to just inline RPython-y code and have it all work nicely. I don't need it on everything and proving safety is probably non-trivial but I feel like we've got to be closer to doing this than in the past.

    I ... think in theory the JIT can solve for that too. In theory

    • >Remember, this is a scripting language, not a compiled language

      This is the fundamental issue and "elephant in the room" that everyone seems to be overlooking and sweeping under the carpet.

      The hardcore compiled, typed language folks are going gung-ho on Rust, which is slow to compile and complicated (more so than C++), while the rest of the world gladly hacks its shiny ML/AI code in a scripting language, i.e. Python, the "glue/duct-tape language", with most if not all of the fast engine libraries (e.g. PyTorch) written in unsafe C/C++.

      The problem is that Python was meant for scripting, not for properly designed software systems engineering. After all, it's based on ABC, a language for beginners that came with an asterisk attached: "intended for teaching or prototyping, but not as a systems-programming language" [1].

      In ten years' time people will most probably look in horror at the tech debt in the Python software stacks they have to maintain for business continuity. Or, for their own sanity, they will rewrite the entire thing in a much more stable, compiled, modern language ecosystem with fast development, like D, with native engine libraries and seamless integration with C and (to some extent) C++ if necessary.

      [1] ABC (programming language)

      https://en.wikipedia.org/wiki/ABC_(programming_language)

  • > What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable.

    The language supports multiple threads and doesn’t have private fields (https://docs.python.org/3/tutorial/classes.html#private-vari...), so the runtime cannot rule out that the value gets changed in-between.

    And yes, it often is obvious to humans that’s not intended to happen, and almost never what happens, but proving that is often hard or even impossible.

    • wouldn't a concurrent change without synchronization be UB anyway? Also parent wants to cache the address, not the value (but you have to cache the value if you want to optimize manually)

  • > Nobody in the real world expects this behaviour.

    For example, numbers and strings are immutable objects in Python. If self.x is a number and its numeric value is changed by a method call, self.x will be a different object after that. I'd dare say people expect this to work.

  • basically all object oriented languages work like that. You access a member; you call a method which changes that member; you expect that change is visible lower in the code, and there're no statically computable guarantees that particular member is not touched in the called method (which is potentially shadowed in a subclass). It's not dynamism, even c++ works the same, it's an inherent tax on OOP. All you can do is try to minimize cost of that additional dereference. I'm not even touching threads here.

    now, functional languages don't have this problem at all.

    • OOP has nothing to do with it. In your C++ example, foo(bar const&); is basically the same as bar.foo();. At the end of the day, whether passing it in as an argument or accessing this via the method call syntax it's just a pointer to a struct. Not to mention, a C++ compiler can, and often does, choose to put even references to member variables in registers and access them that way within the method call.

      This is a Python specific problem caused by everything being boxed by default and the interpreter does not even know what's in the box until it dereferences it, which is a problem that extends to the "self" object. In contrast in C++ the compiler knows everything there's to know about the type of this which avoids the issue.

  • > same member value within the same function body are stable

    Did you miss the part where I explained to you there's no way to identify that it's a member variable?

    > Nobody in the real world expects this behaviour

    As has already been explained to you by a sibling comment you are in fact wrong and there are in fact plenty of people in the real world who do actually expect this behavior.

    So I'll repeat myself: lots of hottakes from just pure, unadulterated, possibly willful ignorance.

    • The above is a very thick response that doesn't address the parent's points, just sweeps them under the rug with "that's just how it was designed/it works".

      "Did you miss the part where I explained to you there's no way to identify that it's a member variable?"

      No, you did miss the case where that in itself can be considered nuts - or at least an unfortunate early decision.

      "this just how things are dunn around diz here parts" is not an argument.

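
For anyone who wants to poke at the behaviour being argued about in the replies above, here is a rough sketch. The classes `Sensor`, `Fallback` and `Account` are made up for illustration. It shows three cases: a property that runs code on every access, a `__getattr__` fallback that returns something new each time, and a plain int attribute that is rebound to a different object by a method call.

```python
class Sensor:
    """`reading` is a property (a descriptor), so every access runs code."""
    def __init__(self):
        self._count = 0

    @property
    def reading(self):
        self._count += 1
        return self._count


class Fallback:
    """`__getattr__` is only called when normal attribute lookup fails."""
    def __init__(self):
        self.calls = 0

    def __getattr__(self, name):
        self.calls += 1
        return f"{name}#{self.calls}"


class Account:
    def __init__(self):
        self.balance = 1000

    def deposit(self, amount):
        # ints are immutable, so this rebinds `balance` to a new object.
        self.balance += amount


sensor = Sensor()
readings = (sensor.reading, sensor.reading)   # same expression, two values

fb = Fallback()
lookups = (fb.missing, fb.missing)            # attribute never existed

acct = Account()
before = acct.balance
acct.deposit(500)
after = acct.balance                          # a different int object now
```

None of these classes does anything exotic, which is roughly why the interpreter can't assume that two reads of `obj.attr` in the same function body give the same object.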

> the word `self` is not special in any way (it's just convention - you can call the first param to a method anything you want).

The name `self` is a convention, yes, but interestingly in Python methods the first parameter is special beyond the standard "bound method" stuff. See for example PEP 3135 (New Super, originally numbered PEP 367) for how `super()` resolution works (TL;DR: when a method body references `super`, the compiler adds a hidden `__class__` cell for the lexically defining class, and zero-argument `super()` combines that cell with the function's first parameter).
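
A small sketch of that behaviour in CPython, with made-up class names: zero-argument `super()` works whatever the first parameter is called, because it relies on the parameter's position plus the hidden `__class__` cell. A function defined outside any class body has no such cell, so the same call fails once it is attached to the class afterwards.

```python
class Base:
    def greet(self):
        return "base"


class Child(Base):
    def greet(this):                     # first parameter is not named `self`
        return "child+" + super().greet()


def outside(self):
    # Defined outside a class body, so the compiler adds no `__class__` cell.
    return super().greet()


Child.outside = outside                  # attach it after the fact

inside_result = Child().greet()
has_class_cell = "__class__" in Child.greet.__code__.co_freevars

try:
    Child().outside()
    error = None
except RuntimeError as exc:              # zero-arg super() can't find its class
    error = str(exc)
```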

I don't think it's a hot take to say much of Python's design is nuts. It's a very strange language.