Comment by throwawaymath
6 years ago
Upvoting this because I think it's really important that we start getting more open source textbooks which specialize in exposition[1]. That being said, I'd make a different arrangement of the material.
Here are a few thoughts:
1. This book embraces the modern take on a rigorous course in linear algebra, which I disagree with. That is to say, it covers vector spaces front and center in the first chapter. Meanwhile, systems of linear equations (Gaussian elimination) and corresponding notions like row reduction of coefficient/augmented matrices are pushed back several chapters. You see this in popular "flagship" textbooks like Axler's Linear Algebra Done Right because they're prioritizing the underlying theory. I think Gaussian elimination should be covered first because it motivates the material - you're never going to cover linear algebra in a way that makes all the pieces fall together in a purely straightforward dependency.
But if you first cover systems of linear equations, you actually motivate the purpose of vector spaces. Moreover, many (most?) proof-based problems in linear algebra reduce to solving a system of linear equations (and therefore reducing a matrix) which is modeled after the thing you're trying to prove. What's the basis of a null space of this linear transformation? You can solve it with row reduction. Is this spanning set a basis of this vector space? Solvable through row reduction. You get a lot of equipment for solving vector space problems if you build up the row reduction scaffolding to a minimal level before vector space coverage. This does not have to temper the theoretical rigor of later, abstract material; if anything it grounds and augments it.
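To make that concrete, here's a sketch of the kind of reduction I mean, using SymPy (the particular matrix is an arbitrary illustration, nothing from the book):

```python
from sympy import Matrix

# A 3x4 matrix standing in for some linear map T: R^4 -> R^3
# (chosen arbitrarily; its third row is the sum of the first two).
A = Matrix([
    [1, 2, 0, 1],
    [2, 4, 1, 1],
    [3, 6, 1, 2],
])

# Row reduction exposes the pivot columns...
R, pivots = A.rref()

# ...and the free variables hand you a basis of the null space,
# which SymPy packages directly:
basis = A.nullspace()

# rank = number of pivots, nullity = number of free variables
print(pivots, len(basis))  # (0, 2) and 2 for this A
```

Whether the question is "find a basis of the null space" or "is this spanning set a basis," the proof-relevant data all falls out of the same row reduction.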
This is why I'm a huge fan of the way that Hoffman & Kunze cover this material. That is a heavily theoretical book, which easily surpasses undergrad-level linear algebra in later chapters. A couple of decades ago it had the spotlight as the really good linear algebra book (it displaced Halmos' Finite Dimensional Vector Spaces, and was itself displaced by Friedberg or Axler). But it motivates a very rigorous coverage of vector spaces by starting with Gaussian elimination, matrix representation of systems of linear equations and (reduced) row echelon forms. If you don't cover this first, you have to either assume the student knows it already or you resign yourself to wading through several chapters of potentially unmotivated abstraction.
2. In my opinion, linear maps should be covered directly after vector spaces. They're a logical succession of the same concept - after you've taught linear combinations, linear (in)dependence, spanning sets, bases, subspaces, and related theorems, it's very easy to jump right into functions between vector spaces. It's just an expansion on all of these things. I would therefore rather see chapter 9 immediately following chapter 2. And much like my prior point, linear maps are an abstraction that empowers a lot of the other ideas; this is mainly because so many things are representable as a linear transformation (much like they're representable as a matrix).
3. The rank is a very important concept, but I would argue it should be subsumed into the chapter on matrices. You can cover anything that doesn't fit cleanly in the matrices chapter in the chapter on linear maps, since you'll need to do that anyway before you cover the dimension theorem. That might be a good argument for simply covering matrix multiplication and addition in prerequisite material and subsuming all of matrices under Gaussian elimination and/or linear maps too...
4. I'm a big fan of the prerequisite material covering basic trigonometry. Trigonometric functions are a good source of exercises for vector spaces. I would like to see an expansion which does minimal coverage of naive set theory (sets, subsets, unions/intersections, etc), functions (injection, surjection, bijection) and polynomials. If you cover the basic definition and arithmetic of polynomials here, you save yourself some space when you cover Lagrange interpolation and factorization later on. It's probably a lot to ask for, but some coverage of differentiation of polynomials would also be useful because the theory of vector spaces gets a lot richer when you incorporate exercises which use a bit of calculus/analysis. In particular, using vector spaces of functions and differentiation as a linear map gives you a lot of leeway to show how abstract vector spaces can be without going nuts on the moon math.
5. I've received pushback on this before, but I think it's really important to cover fields as their own subject before covering vector spaces. If you're going to cover vector spaces rigorously, you really need to cover the field axioms. Scalar multiplication doesn't make sense without first defining a field (you can handwave it, but it's easy to make subtle mistakes - like missing that the set of all integer $n$-tuples cannot form a vector space, since it isn't closed under scalar multiplication). Likewise a linear map is ordinarily only well-defined when its domain and codomain are vector spaces over the same field. This coverage would ideally also establish the basic facts of the real and complex fields, which are almost entirely the fields you work with in undergrad linear algebra.
6. I don't see any exercises. A good math textbook challenges its reader to internalize each chapter's material by substantially using it to prove further material. By not including exercises, this text becomes more of an illustrated monograph. That's totally fine if it's what they're going for, but I think there's a huge opportunity to tailor it even further with targeted exercises that expand on the theorems in each chapter.
Note that I'm mostly nitpicking here, even if I think my critiques have merit. The exposition of material is much more important than the arrangement of material, as long as things are building on each other rigorously. I'm not trying to say the textbook is bad; this is just a stream of consciousness about my opinions on arranging linear algebra for maximal pedagogy. You generally have to supplement textbooks in order to get a rich understanding of the subject matter anyway.
On the plus side, I'm a huge fan of the way they present vectors geometrically. That's a huge win for illustration/animation purposes. I do think there's a missed opportunity to make a sexy illustration of transforming a system of linear equations into coefficient, unknown and constant matrices; then from there illustrating the transformation of the augmented matrix into its reduced row echelon form.
__________________________________
1. For pretty good open source, undergrad-level textbooks in Abstract and Linear Algebra, see http://abstract.ups.edu/aata/ and http://linear.pugetsound.edu/html/fcla.html respectively.
I like this idea. It's also how the subject developed historically. That said, don't you get a very computational book in which students spend a lot of time working with matrices filled with numbers if you teach Gaussian elimination first? How do you make sure that students understand the big picture? After all, modern linear algebra is a lot more than the Gaussian elimination algorithm. I'd even go so far as to say that if a student doesn't understand Gaussian elimination but understands everything else, and knows that there exists some algorithm to solve linear equations, they'd still get 95% of the value of a linear algebra course.
Maybe you can do that if you make it very visual like the book linked in this post, so that you bring in the geometric concept of a vector and link it to linear equations, rather than only working with tuples of numbers. For example, linking the visual concept of linear independence of a set of vectors to a system of equations having a unique solution. If you do that you may be able to teach linear algebra from a linear equation solving point of view, while not getting bogged down in manipulating matrices of numbers.
> After all, modern linear algebra is a lot more than the Gaussian elimination algorithm
This is kind of my point - there actually isn't a whole lot more than Gaussian elimination at the core of linear algebra until you get to rings and modules. Of course I don't mean that in the sense that there aren't important definitions or topics. The theory is very rich. What I mean is that most theoretical questions you can ask about vector spaces and linear maps (or more abstractly, the linearity of a thing), are reducible to a matrix representation and Gaussian elimination.
You don't have to use Gaussian elimination, and I'm not arguing the course should become rote computation of matrices. There is a middle ground. For example if I hand you a linear map and ask you about the dimension of its range, you can identify a spanning set, refine that to a basis and you have your answer. You could also just take the rank of the matrix representing that linear map reduced to row echelon form. I don't think you lose much theory one way or the other.
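For what it's worth, the two routes really do land in the same place. A quick SymPy sketch (the matrix is my own arbitrary example):

```python
from sympy import Matrix

# Matrix of some linear map T: R^3 -> R^3 in the standard basis.
# The third column is the sum of the first two, so the range is a plane.
M = Matrix([
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 2],
])

# Route 1: refine the spanning set {T(e1), T(e2), T(e3)} to a basis of the range.
range_basis = M.columnspace()

# Route 2: just take the rank of the matrix.
r = M.rank()

print(len(range_basis), r)  # both give the dimension of the range: 2
```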
But I could be wrong - if you have a specific example I'd be interested in hearing it. I've tried to declare front and center that I'm biased from learning linear algebra through Hoffman & Kunze. Halmos also grounded his (very theoretical) book in a generous use of matrices. Not covering matrices until later is a fairly recent development, and I think it's because (for reasons I've stated elsewhere) many professors don't distinguish between the rote computation of matrices and the rich theory of matrices.
Yeah, I agree that most questions in introductory linear algebra are reducible to Gaussian elimination. However, if I had to play devil's advocate, I'd say:
1) Gaussian elimination is only one of the algorithms for solving linear equations. Why that one? Perhaps there is a better one for pedagogical reasons.
2) Why focus on the algorithm in particular? Maybe we could formulate the thing Gaussian elimination is doing more abstractly, with less reference to a particular algorithm. For instance: any matrix A can be written as LDU where L is unit lower triangular, D is diagonal, U is unit upper triangular. Or maybe: any linear map can be written as a projection on the first k coordinates relative to some basis, where k is the rank. Or maybe there is a geometric way to understand it.
3) Why focus on this concept in particular? It's just one of many concepts in introductory linear algebra. Although you can reduce everything in introductory linear algebra to it, you could also pick some other concept around which to center the course, such as linear independence, the determinant, or something else. Or some other decomposition, such as A = XDY where X and Y are products of elementary matrices (so they are square and invertible), and D is a rectangular diagonal matrix. This decomposition is arguably more important than the LDU decomposition associated with Gaussian elimination. Why even focus on one concept in particular?
4) This is only for the very basics. You also need a plan for Gram-Schmidt, eigenvalues, the spectral theorem, Jordan decomposition, and so on, which form the meat of linear algebra courses.
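(On point 2: the LDU factorization mentioned there can be sketched with SymPy's LU routine. This assumes a matrix that needs no row exchanges and has nonzero pivots; the example matrix is an arbitrary choice.)

```python
from sympy import Matrix, diag

# A = LDU: peel the pivots out of the U that Gaussian elimination produces.
A = Matrix([
    [2, 1, 1],
    [4, 3, 3],
    [8, 7, 9],
])

# L is unit lower triangular, U is upper triangular; perm records row swaps.
L, U, perm = A.LUdecomposition()
assert perm == []  # no row exchanges were needed for this particular A

# Split U into the diagonal matrix of pivots D and a unit upper triangular factor.
D = diag(*[U[i, i] for i in range(U.rows)])
U_unit = D.inv() * U

assert A == L * D * U_unit  # the promised A = LDU
```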
Apart from Gaussian elimination, I think eigenvalues, eigenvectors, the scalar product, orthogonal subspaces, and the relation to statistics are important. Statistics associated with the SVD, principal components, and other topics are a main source of insight. The Fourier basis is also an important concept far beyond Gaussian elimination.
Of course, I'd agree. But my comment is pretty long as it is, and is more concerned with the arrangement of material than how many extended topics you cover. I'm treating perfection as nothing left to remove, not nothing left to add :)
This is interesting, but I think there's an implicit assumption here that a particular topic must be covered in depth, at which point its study is completed, not to be returned to.
And indeed, this is the way many textbooks are structured. But I see no reason for this to be the case. If topics are covered lightly via an exposition and then subsequently drilled into in depth, the order, and the resulting motivation, become almost a side note.
I say this as I'm currently studying linear algebra via Coursera, and the resource that has become most valuable to me is a Scapple board with a bunch of screenshots, text and arrows. It is this (non-linear) diagram that I return to most often to help internalize and solidify my learning.
In this regard, the order of exposition becomes almost incidental.
I think the problem is a tension between what’s valuable to us in learning and what's valuable to us after learning.
So when I was tutoring, I noticed that my “standard approach” failed: a kid would have a homework problem, I would say, “Here's how you want to think about that problem, it’s (say) a conservation of energy problem, calculate the energy before and after...” and it fostered no real learning! I thought the problem was that I was subsidizing laziness, so I stepped back a bit and gave lessons only after some struggling; that didn’t hurt much, but it didn’t help much either. They were motivated, the didactic structure just wasn't working for them. I started to ask questions, and that was a bit more helpful, but I found that I myself was somewhat unguided about how to accomplish “Socratic teaching” so that they could benefit from it. If you can believe it, my default approach was to try to introduce the same theory ideas as questions from the start, “would conservation of energy apply here?”, so no wonder I only got mixed results: it was basically the same thing.
But questions gave me a way to get off that map, because I started to ask tutees to give me expectations and examples, and somehow those worked well. “What can we expect that this system does, within a few orders of magnitude? Can you give me some examples of similar systems and how they work?”
I think that in order to learn abstract things, we need to first have a bunch of examples in our heads, things that all kind of fit together but do not make sense. The abstraction unifies the examples, helps us remember them and fit them together. We can use each example to test the abstraction for fit, to massage our cognitive ability from one context into another.
When we build a building, we need frameworks and scaffolds to build it. When the building is freestanding, all of that effort of building the scaffolds can be undone, the scaffolding comes down, it's no longer necessary. We go back to a textbook and we want a picture-book: show me the building, and just the building, and nothing but the building. Let me feel lushly indulged by its architecture, by photographs that go into detail in the really rich areas—I want indulgent artistic experience, because that's what the building is!
We have a constant tension in textbooks between the how-to manual (spartan, practical, containing all of these scaffolding steps and nitty-gritty) and the coffee-table book (rich, whimsical, pop-simplified).
Griffiths’ Introduction to Quantum Mechanics begins with a prologue to this effect; I thought it was just a quantum quirk at the time, but I now think it's actually much more general. He says [these will not be exact quotes] that, “This book will teach you to do quantum mechanics because I strongly believe that this has to come before any sort of strange discussions about what quantum mechanics is.” It’s one step between Feynman’s “don’t even try to understand quantum mechanics, just shut up and calculate” and Searle’s response “I’m sorry, I am a philosopher and literally the only thing anyone pays me for is to try and understand things like quantum mechanics.”
And I think that is where I want to focus my teaching efforts, that every textbook should kind of have the appendix coffee-table section, I’d even like to carve out a new name for it and call it an “abridgement” or so, at the end. The book Mazes for Programmers has an appendix like this of “let me summarize all of these maze generation algorithms and their essential properties and their essential approaches” that I really loved, it makes it an absolute joy to keep this thing on the bookshelf and come back to it time and time again. The main text has the “Here’s how to do it” information, the appendix has the polite overview of what was just built that is great for returning to.
Yes, I think you've captured the problem pretty nicely. This is the problem I have with doing something like defining vector spaces on arbitrary fields F (not even R or C!) before you even talk about linear equations. The theory of linear algebra is rich and useful, but it's hard to motivate it that way. I don't mean "motivate" in the sense of, "Why should I care about this, and when will I use it?", because you don't always need that kind of motivation to do math. I mean motivation in the sense of, "Why did people come up with this theory? What led them to these definitions, and why are these definitions the right ones?"
I have seen the hierarchy of math presented like this:
1. In high school, you're taught how to compute arc length. You need to do nothing but calculation, and it doesn't matter why it works.
2. In undergrad, you're taught to prove the formula for computing arc length. Given the theorem statement and the requisite axioms, you can show that the formula works.
3. In grad school, you're taught to derive the formula for arc length from first principles. Given a set of axioms, you come up with the theorem and prove it from scratch.
4. In research, you're not taught anything. Instead you solve the questions, "How should I define arc length? Why does my definition of arc length matter, and where is it useful? What can I prove with it?"
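A "level 1 versus level 2" sanity check of that hierarchy can even be done in a few lines. A sketch in Python (the curve f(x) = x^2 on [0, 1] is my own arbitrary choice):

```python
import math

# Level 1: just compute the arc length numerically.
# For f(x) = x^2 the arc-length integrand is sqrt(1 + f'(x)^2) = sqrt(1 + (2x)^2).
def integrand(x):
    return math.sqrt(1.0 + (2.0 * x) ** 2)

# Trapezoid rule on [0, 1]
n = 100_000
h = 1.0 / n
numeric = h * (0.5 * integrand(0.0) + 0.5 * integrand(1.0)
               + sum(integrand(i * h) for i in range(1, n)))

# Level 2: the antiderivative gives the exact value for this curve,
# sqrt(5)/2 + asinh(2)/4.
exact = math.sqrt(5) / 2 + math.asinh(2) / 4

print(numeric, exact)  # the two agree to many decimal places
```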
Contrary to (quite a bit of) popular opinion, I would hold that a truly rich understanding of the theory can only happen in the context of the trigger for the theory. You don't need to understand the context of systems of linear equations, or why linearity is a useful concept, to understand vector spaces. But it's a lot easier if you have that context. Likewise you don't need to use Gaussian elimination to solve a lot of questions which ask you to prove something about a vector space or a linear map. But if you have that context, you can use it in your proofs instead of getting mired in towering heights of complexity.
Incoming tech seems about to change the texts-and-tutoring constraint and opportunity space. I wonder if it's time to start thinking and exploring ahead?
Consider an art project, where you sit in a booth, and watch an interesting video drama. But why is it interesting? The video is a graph of segments, like an old "choose your own adventure" story. And there's an eye tracker watching you. So if you're interested in characters A and B, the story mutates to emphasize them.
Consider a one-on-one tutor on a good day. Noticing a student engaged and enthused, they might reorder content on the fly to leverage that. Might emphasize different aspects, and alter presentation, based on observed interests and weaknesses. Or consider working with a young child, watching them read word by word, noticing where they hesitate and frown, probing for their thoughts, adjusting difficulty, providing a path of development.
What if that was math content rather than a drama or children's book? What if we could do this at scale? Eye tracking is just one tech coming in on the coattails of VR/AR. A setting for personalization AI is another.
What if saying "the best way to organize and present this topic in a textbook" becomes like saying "the Capital mandates that every teacher of this grade will today teach the following lesson, by saying the following words, regardless of local context"? While not on a national level, that is a real thing.
What might it take to start encoding an adventure graph for linear algebra? The "oh, if you like this perspective on this topic, you might like this similar perspective on this other topic"?
Or if we don't have the tooling for that yet, can we start thinking about the tooling? Or fruitfully do something else now, in preparation for opportunity? Perhaps Kahn academy problems in more flavors, in a richer graph? ML-based textbook aggregation, synthesis and retheming? Perhaps it's all not ripe yet. But something else, maybe?
I feel a bit bad for saying this, but I don't think the interactive visualizations here really contribute very much. Yes, you can move the vectors, but the point is already made by the static picture.
Similarly, you can already traverse, not only a single math book in a non-linear order, but any number of different books and other sources concurrently, and this is how everyone I know of already learns. Many textbooks already have a dependency graph in the beginning showing how you can read the chapters! So every person is already traversing their own personalized "adventure graph" for linear algebra and will be throughout their entire education. It is rather the idea of a totalizing tech solution that will be perfect for everyone that smacks of central planning.
I think your idea can be developed today by selecting online forums centered on topics. The main problem is moderation and how to plan the journey.
If you read the older editions of physics textbooks (my comment applies to physics, not math) like Lanczos and Kittel (let’s leave Landau out of this), where the examples and problems are interspersed in the text, often with solutions, the clear implicit invitation is for students to come up with their own problems (ideally paradoxes!). This is related to point 4 in throwawaymath’s comment: https://news.ycombinator.com/item?id=19265709
If you don’t expect them to be potential future faculty, then by all means let other people hand you the problems and paradoxes unless you’re in contact with experiment.
This. It’s a daily trade-off, as a working professional, between how deeply I should learn the fundamentals and how quickly I can get to solving the problem at hand.
As an example, I want to learn how to deploy my models on the cloud, but the mechanics of how computing on the cloud works is a depth trade-off. I wish I could learn all about distributed computing concepts, but I find that I have neither the time nor the energy for it, at times.
> When we build a building, we need frameworks and scaffolds to build it.
This SIAM published Linear Algebra textbook makes a point about the scaffolding, too: http://matrixanalysis.com/Scaffolding.html
nit: The Feynman bit is a myth: https://physicstoday.scitation.org/doi/full/10.1063/1.176865...
And Searle is an interesting argument for Griffiths' perspective. Searle rose to fame talking about stuff he knew how to do ("speech acts") and in his later years embarrassed himself with bad criticisms of AI, as in the Chinese Room, his garbled continuation of the fallacy of the Philosophical Zombie.
It makes perfect sense, since Griffiths’ book is simply not enough to give a working knowledge of quantum, but might be barely enough to pass the GREs (I learned the Heisenberg and interaction pictures from Sakurai, not the pathetic footnote to a problem that Griffiths relegates them to).
I like this pedagogical structure. Why do you think so many people recommend Linear Algebra Done Right as a first exposure to the subject?
Because in fairness, it does do so many things right. It's kind of controversial[1] because of Axler's notation and de-emphasis on determinants. But it's an ideal book to give a student who has had a first, computational course in linear algebra (i.e. a course entirely devoted to solving systems of linear equations and drilling matrices).
I think Hoffman & Kunze is really the best textbook for the arrangement of material I'm talking about, but it's outdated at this point and its exposition is strictly less pedagogical than Axler's. My vision for the "perfect" linear algebra textbook would be loosely based on Hoffman & Kunze, but replacing a lot of the theoretical exposition with Axler's exposition from Linear Algebra Done Right, then keeping (and even expanding) the material on Gaussian elimination and matrices. You could get a solid three-semester sequence of computational and proof-based linear algebra out of such a textbook, and you'd be all set to go to something grad-level like Roman.
Unfortunately it's also hard to judge a lot of linear algebra textbooks because the bifurcation between proof-based and computation-based linear algebra isn't as clean as it is in calculus. Calculus has its own separate course sequence in real/complex analysis, whereas linear algebra doesn't have a distinct name for the more rigorous coverage of the subject. So you have a lot of textbooks which choose one or the other thing, which then results in an over-emphasis on computational stuff in the first course and a complete de-emphasis on the motivation in the second course. When you are learning these back to back in distinct courses that's usually fine, but many first courses actually jump right to Axler despite the words of caution he writes in his own preface.
__________________________________________
1. Noam Elkies uses it when he teaches Harvard's Math 55a. See his list of comments and errata for the book: http://www.math.harvard.edu/~elkies/M55a.17/index.html
Have you looked at "Matrix Analysis and Applied Linear Algebra" by Carl Meyer? It seems very in line with what you're saying, and the exercises are absolutely excellent. I've slowly made my way halfway through the book and am very happy overall.
If rank is put where you want, the presentation of the Nullity-Rank theorem will be less than ideal.
Why is that? Note that when I mention the dimension theorem, I'm actually referring to the fundamental rank-nullity theorem.
To be more clear, what I was suggesting is to take the section on rank, toss that into the section on matrices, and then toss that into the section on Gaussian elimination (preceding vector spaces). That's useful because the rank of a coefficient matrix tells you the nature of the solution space of the corresponding system.
Then when you cover the rank-nullity theorem in the chapter on linear maps, you'll have the context of 1) what rank means in a matrix, and 2) what rank means in a linear map. That sets up a deeper understanding of how to verify the rank-nullity theorem for any linear map by finding the rank and the nullity using the basic and free variables of the reduced row echelon matrix.
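The verification I have in mind looks something like this SymPy sketch (the matrix is an arbitrary illustration; its entries aren't special):

```python
from sympy import Matrix

# Matrix of some linear map T: R^5 -> R^3 (the third row is the sum of
# the first two, so the map is not full rank).
A = Matrix([
    [1, 2, 1, 0, 3],
    [0, 1, 1, 1, 1],
    [1, 3, 2, 1, 4],
])

R, pivot_cols = A.rref()

rank = len(pivot_cols)               # basic (pivot) variables
nullity = A.cols - len(pivot_cols)   # free variables

print(rank, nullity)                 # 2 and 3 for this A

# Rank-nullity: dim(range) + dim(kernel) = dim(domain)
assert rank + nullity == A.cols
```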
I learned from Apostol and Lax, so maybe those are my biases talking, but I don’t remember rank being covered in more than one place, at least in the former. Apostol proves it directly from the indices of linear maps. I could see introducing rank in a real-world context: instead of basic Gaussian elimination, start with the condition number, take its reciprocal, talk about rank deficiency, and from there touch on rank (punting to linear maps for the whole treatment, which you did acknowledge as an option). I think a straightforward, more real-world-oriented example in a basic REPL like that will pique their interest.
Also I hated hated hated the finding reduced row echelon problems in Apostol so maybe it’s just me and I’m lazy.