Comment by elashri
3 days ago
This reminds me of how physicists define a tensor: a second-rank tensor is an object that transforms as a second-rank tensor when the basis (or coordinates) changes. You might find it circular reasoning but it is not. This transformation property is what distinguishes tensors (of any rank) from mere arrays of numbers.
Looking at things from an abstract point of view means we don't have to worry about how to visualize the geometry, which is actually hard and sometimes counterintuitive.
This is a tendency among physicists that I find a bit painful when reading their explanations: focusing on how things transform between coordinate systems rather than on the coordinate-independent things that are described by those coordinates. I get that these transformation properties are important for doing actual calculations, but I think they tend to obfuscate explanations.
In special relativity, for example, a huge amount of attention is typically given to the Lorentz transformations required when coordinates change. However, the (Minkowski) space that is the setting for special relativity is well defined without reference to any particular coordinate system, as an affine space with a particular (pseudo-)metric. It's not conceptually very complicated, and I never properly understood special relativity until I saw it explained in those terms in the amazing book Special Relativity in General Frames by Eric Gourgoulhon.
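As a rough illustration of the coordinate-independent quantity behind this, the sketch below computes the squared Minkowski interval between two events in one frame and again after a Lorentz boost, and gets the same number both times. The 1+1-dimensional setup, the (-, +) sign convention, the units with c = 1, and the function names are all assumptions made for the example, not anything taken from Gourgoulhon's book.

```python
import math

def interval_squared(event_a, event_b, c=1.0):
    """Squared Minkowski interval between two events (t, x), signature (-, +)."""
    dt = event_b[0] - event_a[0]
    dx = event_b[1] - event_a[1]
    return -(c * dt) ** 2 + dx ** 2

def boost(event, v, c=1.0):
    """Coordinates of the same event in a frame moving at velocity v along x."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (gamma * (t - v * x / c ** 2), gamma * (x - v * t))

# Two events, described in the original frame (illustrative values).
a = (0.0, 0.0)
b = (5.0, 3.0)

s2_original = interval_squared(a, b)
# Same two events, described in a frame moving at 0.6c.
s2_boosted = interval_squared(boost(a, 0.6), boost(b, 0.6))
```

The coordinates of the events change under the boost, but the interval between them does not, which is the sense in which the geometry exists independently of any particular frame.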
For tensors, the basis-independent notion is a multilinear map from a selection of vectors in a vector space and forms (covectors) in its dual space to a real number. The transformation properties drop out of that, and I find it much more comfortable mentally to have that basis-independent idea there, rather than just coordinate representations and transformations between them.
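A minimal sketch of how the transformation rule drops out of the multilinear-map picture, assuming a two-dimensional real vector space and a tensor that takes two vectors (so its components are covariant). The matrices and helper names are made up for the example. The only thing imposed is that the number T(u, v) must not depend on the basis; the rule T' = Pᵀ T P for the components is what that requirement forces.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def bilinear(T, u, v):
    """Evaluate the bilinear map: sum over i, j of T[i][j] * u[i] * v[j]."""
    return sum(T[i][j] * u[i] * v[j]
               for i in range(len(u)) for j in range(len(v)))

# Components in the old basis (illustrative values).
T = [[2.0, 1.0], [0.0, 3.0]]
u = [1.0, 2.0]
v = [3.0, -1.0]

# Columns of P are the new basis vectors written in the old basis.
P = [[1.0, 1.0], [0.0, 2.0]]
P_inv = inverse_2x2(P)

# Vector components transform with the inverse of P (contravariantly).
u_new = mat_vec(P_inv, u)
v_new = mat_vec(P_inv, v)
# Tensor components must transform as P^T T P to keep T(u, v) unchanged.
T_new = mat_mul(mat_mul(transpose(P), T), P)

value_old = bilinear(T, u, v)
value_new = bilinear(T_new, u_new, v_new)
# A "mere array" whose entries are left untransformed gives a different answer.
value_naive = bilinear(T, u_new, v_new)
```

Here `value_old` and `value_new` agree, while `value_naive` does not, which is the practical difference between a tensor and an array of numbers that happens to have the same shape.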
I agree that focusing on Lorentz transformations is the wrong way to approach thinking about special relativity. But it might be the right way to teach it to physics students.
The issue is the level of mathematical sophistication one has when a certain concept is introduced. That often defines or at least heavily influences how one thinks about it forever.
The basics of special relativity came up in my first year of university, and the rest didn't really get focused on until my second year.
The first time around I was still encountering linear algebra and vector spaces, while for the second I was a lot more comfortable deriving things myself just given something like the Minkowski "inner product".
(As an aside: I really love abstract index notation for dealing with tensors)
> The issue is the level of mathematical sophistication one has when a certain concept is introduced. That often defines or at least heavily influences how one thinks about it forever.
That was one of the most interesting things about my EE/CS dual degree and the exact concept you're describing has stuck with me for a very long time... and very much influences how I teach things when I'm in that role.
EE taught basic linear algebra in 1st year as a necessity. We didn't understand how or why anything worked, we were just taught how to turn the crank and get answers out. Eigenvectors, determinants, Gauss-Jordan elimination, Cramer's rule, etc. weren't taught with any kind of theoretical underpinnings. My CS degree required me to take an upper-year linear algebra course from the math department; after taking that, my EE skills improved dramatically.
CS taught algorithms early and often. EE didn't really touch on them at all, except when a specific one was needed to solve a specific problem. I remember sitting in a 4th year Digital Communications course where we were learning about Viterbi decoders. The professor was having a hard time explaining it by drawing a lattice and showing how you do the computations, and the students were completely lost. My friend and I were looking at what was going on and both had this lightbulb moment at the same time. "Oh, this is just a dynamic programming problem."
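To make the dynamic-programming reading concrete, here is a small hard-decision Viterbi decoder sketch. The specific code (rate 1/2, constraint length 3, generators 7 and 5 in octal) and all names are assumptions chosen for the example, not the one from that course. The DP table is `cost`: for each encoder state it holds the fewest bit errors of any path ending in that state, and each received symbol extends the table by one column.

```python
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (previous bit, bit before that)

def encoder_output(u, s1, s2):
    """Two output bits for input u in state (s1, s2): generators 111 and 101."""
    return (u ^ s1 ^ s2, u ^ s2)

def encode(bits):
    s1 = s2 = 0
    out = []
    for u in bits:
        out.append(encoder_output(u, s1, s2))
        s1, s2 = u, s1
    return out

def viterbi_decode(received):
    inf = float("inf")
    # The encoder starts in the all-zero state, so only that state has cost 0.
    cost = {s: (0 if s == (0, 0) else inf) for s in STATES}
    paths = {s: [] for s in STATES}
    for r in received:
        new_cost = {s: inf for s in STATES}
        new_paths = {s: [] for s in STATES}
        for (s1, s2) in STATES:
            if cost[(s1, s2)] == inf:
                continue
            for u in (0, 1):
                expected = encoder_output(u, s1, s2)
                # Branch metric: Hamming distance to the received symbol.
                branch = (expected[0] != r[0]) + (expected[1] != r[1])
                nxt = (u, s1)
                candidate = cost[(s1, s2)] + branch
                # DP recurrence: keep only the cheapest way into each state.
                if candidate < new_cost[nxt]:
                    new_cost[nxt] = candidate
                    new_paths[nxt] = paths[(s1, s2)] + [u]
        cost, paths = new_cost, new_paths
    best = min(STATES, key=lambda s: cost[s])
    return paths[best]

message = [1, 0, 1, 1, 0, 0]          # last two zeros flush the encoder
clean = encode(message)
noisy = list(clean)
noisy[1] = (noisy[1][0], noisy[1][1] ^ 1)  # flip one bit in transit
decoded = viterbi_decode(noisy)
```

Storing whole paths per state keeps the sketch short; a real decoder would normally store back-pointers and trace back at the end.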
EE taught us way more calculus than CS did. In a CS systems modelling course we were learning about continuous-time and discrete-time state-space models. Most of the students were having a super hard time with dx/dt = A*x (x as a real vector, A as a matrix)... which makes sense since they'd only ever done single-variable calculus. The prof taught some specific technique that applied to a specific form of the problem and that was enough for students to be able to turn the crank, but no one understood why it worked.
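For what it's worth, the "why" behind dx/dt = A*x is short: the solution is x(t) = exp(A t) x(0), where exp is the matrix exponential, exactly mirroring the scalar case. The sketch below checks this numerically for a matrix whose answer is known in closed form. The Taylor-series `mat_exp` and the example matrix are assumptions for illustration only; in practice a library routine such as `scipy.linalg.expm` would be the more robust choice.

```python
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_exp(m, terms=30):
    """Matrix exponential via the truncated series I + M + M^2/2! + ..."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds M^k / k!
        term = [[entry / k for entry in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def solve_linear_ode(A, x0, t):
    """x(t) = exp(A t) x0 for the continuous-time system dx/dt = A x."""
    At = [[a * t for a in row] for row in A]
    Phi = mat_exp(At)
    return [sum(Phi[i][j] * x0[j] for j in range(len(x0)))
            for i in range(len(x0))]

# For this A, exp(A t) = [[cos t, sin t], [-sin t, cos t]], a rotation.
A = [[0.0, 1.0], [-1.0, 0.0]]
x0 = [1.0, 0.0]
x_quarter = solve_linear_ode(A, x0, math.pi / 2)  # expected close to [0, -1]
```

The same matrix exp(A Δt) is the state-transition matrix of the matching discrete-time model, x[k+1] = exp(A Δt) x[k], which is the link between the two kinds of state-space model.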
> But it might be the right way to teach it to physics students.
Having studied physics, I would disagree rather strongly. I only really started understanding Special Relativity once I had a clear understanding of the math. (And then it becomes almost trivial.) Those of my fellow classmates, however, who didn't take the time to take those additional (completely optional) math classes, ended up not really understanding it at all. They still got confused by what it all meant, by the different paradoxes, etc.
I saw the same effect when, later, I was a teaching assistant for a General Relativity class.
Yeah, I had a slightly odd introduction to these things as I studied joint honours maths and physics. That meant both that I had a bit more mathematical maturity than most of the physics students and that I was being taught the more rigorous underpinnings of the maths while it was being (ab)used in all sorts of cavalier ways in physics. I liked the subject matter of physics more, but I greatly preferred the intellectual rigour of the maths.
Eric Gourgoulhon is a product of the French education system, and I often think I would have done better studying there than in the UK.
Taylor & Wheeler's Spacetime Physics is similar. They emphasize the importance of frame invariant representations. (I highly recommend the first edition over the second edition; the second edition was a massive downgrade.)
Kip Thorne was also heavily influenced by this geometric approach. Modern Classical Physics by Thorne & Blandford uses a frame invariant, geometric approach throughout, which (imo) makes for much simpler and more intuitive representations. It allows you to separate out the internal physics from the effect of choosing a particular coordinate system.
One of the worst examples is Weinberg’s book on GR, which I found nearly unreadable due to the morass of coordinates/indices. So much more painful to learn from than Wald or other mathematically modern treatments of GR.
That's good to know about Wald. I bought a copy to finally get my head round General Relativity, but its brief explanation of Special Relativity right at the start made it clear that I hadn't properly understood that, which led to me getting Gourgoulhon's book. I should be better placed to tackle it now.
I think _Spacetime Physics_ takes roughly the same approach (they call it “the invariant interval”), but with much less mathematical sophistication required.
Thanks for the book recommendation.
I found the physicists' definition of a tensor actually more confusing, because you are faced with these rules for how to transform the objects, but nobody ever really explains where it all comes from. The mathematical definition through differential forms and covectors, while longer, actually explains these objects better.
I don't get why people act like this definition is so circular. If you were to explain in detail what "transforms as a second rank tensor" means then it wouldn't be circular anymore. This just isn't the full definition.
> You might find it circular reasoning but it is not
Um, yes it is. "A foo is an object that transforms as a foo" is a circular definition because it refers to the thing being defined in the definition. That is what "circular definition" means.
To be fair to physicists, the standard physicists' definition isn't "a tensor is a thing that transforms like a tensor", it's "a tensor is a mathematical object that transforms in the following way <....explanation of the specific characteristics that mean that a tensor transforms in a way that's independent of the chosen coordinate system...>".
When people say "a tensor is a thing that transforms like a tensor" they're using a convenient shorthand for the bit that I put in angle brackets above.
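For concreteness, the bit in angle brackets is usually spelled out along the following lines for a second-rank tensor. This is the standard textbook form, written here with the summation convention and with primes marking the new coordinates; the index placement is a choice of notation, not something taken from this thread.

```latex
% Contravariant second-rank tensor under a change of coordinates x -> x':
T'^{ij} = \frac{\partial x'^{i}}{\partial x^{k}}
          \frac{\partial x'^{j}}{\partial x^{l}} \, T^{kl}

% Covariant second-rank tensor:
T'_{ij} = \frac{\partial x^{k}}{\partial x'^{i}}
          \frac{\partial x^{l}}{\partial x'^{j}} \, T_{kl}
```

Written this way, the right-hand side mentions only the old components and the Jacobian of the coordinate change, so nothing in the definition refers back to the word "tensor".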
My favourite explanation is that "Tensors are the facts of the universe" which comes from Lillian Lieber, and is a reference to the idea that the reality of the tensor (e.g. the stress in a steel beam or something) is independent of the coordinate system chosen by the observer. The transformation characteristic means that no matter how you choose your coordinates, the bases of the tensor will transform such that it "means" the same thing in your new coordinates as it did in the old ones, which is pretty nifty.
https://www.youtube.com/watch?v=f5liqUk0ZTw&pp=ygURdGVuc29yc...
> a convenient shorthand for the bit that I put in angle brackets above.
Yes, but the "convenient shorthand" only makes sense if you already know what a tensor is. That renders the "definition" useless as an explanation or as pedagogy. It's only useful as a social signal to let others know that you understand what a tensor is (or at least you think you do).
> My favourite explanation is that "Tensors are the facts of the universe"
That's not much better. "The earth revolves around the sun" is a fact of the universe, but that doesn't help me understand what a tensor is.
What matters about tensors are the properties that distinguish them from other mathematical objects, and in particular, what distinguishes them from closely related mathematical objects like vectors and arrays. Finding a cogent description of that on the internet is nearly impossible.
> the reality of the tensor ... is independent of the coordinate system chosen by the observer
Now you're getting closer, but this still misses the mark. What is "the reality of a tensor"? Tensors are mathematical objects. They don't have "reality" any more than numbers do.
> no matter how you choose your coordinates, the bases of the tensor will transform such that it "means" the same thing in your new coordinates as it did in the old ones
That is closer still. But I would go with something more like: tensors are a way to represent vectors so that the representation of a given vector is the same no matter what basis (or coordinate system) you choose for your vector space.
Right, but if you fill in the shorthand there’s no reason to think it’s circular; it’s just a normal definition at that point, albeit one without much motivation.