As evidenced by the confusion of at least one commenter, I do not think introducing vectors by how they can be written in a particular basis is a good didactic approach.
It is just unhelpful in many ways. It fixates on one particular basis, it results in a vector space with few applications, and it cannot explain many of the most important function vector spaces, which are of course the L^p spaces.
In most function vector spaces you encounter in mathematics, you cannot say what the value of a function at a point is. They are not defined that way.
The right didactic way, in my experience, is introducing vector spaces first. Vectors are elements of vector spaces, not because they can be written in any particular basis, but because they fulfill the formal definition. And because they fulfill the formal definition, they can be written in a basis.
Haha, this works if you already know what a vector space is. But I think pedagogy needs to provide motivating examples. I'll quote one section of a text by Poincaré (translated by an LLM since most here do not speak French).
> We are in a geometry class. The teacher dictates: “A circle is the locus of points in the plane that are at the same distance from an interior point called the center.” The good student writes this sentence in his notebook; the bad student draws little stick figures in it; but neither one has understood. So the teacher takes the chalk and draws a circle on the board. “Ah!” think the students, “why didn’t he say right away: a circle is a round shape — we would have understood.”
> No doubt, it is the teacher who is right. The students’ definition would have been worthless, since it could not have served for any demonstration, and above all because it would not have given them the salutary habit of analyzing their conceptions. But they should be shown that they do not understand what they think they understand, and led to recognize the crudeness of their primitive notion, to desire on their own that it be refined and improved.
The learning comes from making the mistake and being corrected, not from being taught the definition, I think.
Anyway, it's from Science and Method, Book 2 https://fr.wikisource.org/wiki/Science_et_m%C3%A9thode/Livre...
There's more to the section that talks about the subject. I just find this particular paragraph amusingly germane.
It's trivial to provide motivating examples for vector spaces, and there's no reason you can't do so while explaining what they actually are, which is also very simple for anyone who understands the basic concepts of set, function, associativity and commutativity. The notion of a basis falls out very quickly and allows you to talk about lists of numbers as much as you like without ever implying any particular basis is special.
I hesitate to call anything pedagogically "wrong" as people think and learn in different ways, but I think the coyness some teachers display about the vector space concept hampers and delays a lot of students' understanding.
Edit: Actually, I think the "start with 'concrete' lists of numbers and move to 'abstract' vector spaces" approach is misguided as it is based on the idea that the vector space is an abstraction of the lists of numbers, which I think is wrong.
The vector space and the lists of numbers are two equivalent, related abstractions of some underlying thing, e.g. movements in Euclidean space, investment portfolios, pixel colours, etc. The difference is that one of the abstractions is more useful for performing numerical calculations and one better expresses the mathematical structure and properties of the entities under consideration. They're not different levels of abstraction but different abstractions with different uses.
I'd be inclined to introduce the one best suited to understanding first, or at least alongside the one used for computations. Otherwise students are just memorising algorithms without understanding, which isn't what maths education should be about, IMO. (The properties of those algorithms can of course be proved without the vector space concept, but such proofs are opaque and magical, often using determinants which are introduced with no better justification than that they allow these things to be proved.)
I have nothing against starting out with motivating examples, obviously they are needed for understanding. But they should motivate the definition of a vector space. Not the definition of vectors as mappings of indices.
Functions are actually a great motivating example for the definition of a vector space, precisely because they at first look nothing like what students think of as a vector.
> It fixates on one particular basis, it results in a vector space with few applications, and it cannot explain many of the most important function vector spaces, which are of course the L^p spaces.
Except for just about all the relevant applications in computer science and physics, where fixating on a representation is the standard.
In physics it is common to work explicitly with the components in a basis (see tensors in relativity or representation theory), but it's also very important to understand how your quantities transform between different bases. It's a trade-off.
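For concreteness, here is the standard transformation rule being alluded to (my own sketch, not something from the thread): if a new basis is expressed in the old one as e'_j = Σ_i A^i_j e_i, the components of a fixed vector transform with the inverse matrix.
```latex
% The vector v itself is basis-independent; only its components change.
v = \sum_i v^i e_i = \sum_j v'^j e'_j,
\qquad
v'^j = \sum_i (A^{-1})^j{}_i \, v^i .
```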
Most relevant applications use L^2 spaces, which cannot be defined pointwise.
If you want to talk about applications, then this representation is especially bad, since the intuition it gives is just straight up false.
Reminded me of "tensor is a bunch of numbers that transform in a certain way"; this should be illegal to teach, especially in physics.
Completely agree. In uni, I (re)-learned about vectors in linear algebra, and for a good chunk of the course, we didn't write anything in "standard vector notation". We learned about vector axioms first, and then vectors were treated as "anything that satisfies the vector axioms". (When doing more practical examples, we just used the reals instead of something like R^3, but the entire time it was clear that for any proof, anything that can be added and multiplied in the way that the vector axioms describe would fit.) I think adopting this structuralist view really helps with a lot of mathematical studies.
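For readers who have not seen them spelled out, these are the axioms the commenter means (standard textbook material, paraphrased here): a vector space over a field F is a set V with addition and scalar multiplication such that, for all u, v, w in V and a, b in F,
```latex
\begin{aligned}
&(u + v) + w = u + (v + w), \qquad u + v = v + u,\\
&\exists\, 0 \in V :\ v + 0 = v, \qquad \forall v\ \exists\, (-v) :\ v + (-v) = 0,\\
&a(u + v) = a u + a v, \qquad (a + b) v = a v + b v,\\
&a(b v) = (a b) v, \qquad 1\, v = v .
\end{aligned}
```
Lists of numbers, polynomials, and functions all satisfy these, which is the structuralist point being made.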
> In most function vector spaces you encounter in mathematics, you can not say what the value of a function at a point is.
Could you spell out what you mean by that? Functions are all defined on their domains (by definition).
Are you referring to the L^p spaces being really equivalence classes of functions agreeing almost everywhere?
Yes, the L^p spaces are not vector spaces of functions, but essentially equivalence classes of functions that give the same result in a Lebesgue integral. For this reason, common operations on functions, like evaluating at a point or taking a derivative, are undefined.
If you care about these you need something more restrictive; for example, to study differential equations you can work in Sobolev spaces, where the continuity requirement allows you to identify an equivalence class with a well-defined function.
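To make the equivalence-class point concrete (standard definitions, not a quote from the article): functions are identified when they agree almost everywhere, and the norm is computed on the class.
```latex
% f ~ g  iff  f = g almost everywhere (they differ only on a set of measure zero)
L^p(\mathbb{R}) = \Bigl\{ f : \mathbb{R} \to \mathbb{R} \ \text{measurable} \ \Bigm|\ \int_{\mathbb{R}} |f(x)|^p \, dx < \infty \Bigr\} \Big/ \sim,
\qquad
\|f\|_p = \Bigl( \int_{\mathbb{R}} |f(x)|^p \, dx \Bigr)^{1/p} .
```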
Now I’m thinking that I have missed the point of the article. I didn’t read it as an introduction to vector spaces, but rather that the introduction served to give an intuition for how functions may be viewed as vectors (going back to the article, it’s even in the section heading).
I found the next parts well written and to the point, leading along the steps to show that the requirements for a Hilbert space are indeed met by L^2 (even though those requirements are only spelled out at the end).
I’m not actively working with mathematics any more, but I didn’t notice any major corner cutting. It’s not textbook rigorous, but it lays out the idea in an easy-to-follow way. I took something away from it - or not, depending on whether I missed some inconsistency.
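For reference, the Hilbert-space structure on L^2 is the usual inner product and the norm it induces (I'm stating the standard formulas here, not quoting the article); completeness with respect to this norm is the remaining requirement.
```latex
\langle f, g \rangle = \int_{\mathbb{R}} f(x)\, \overline{g(x)} \, dx,
\qquad
\|f\|_2 = \sqrt{\langle f, f \rangle} = \Bigl( \int_{\mathbb{R}} |f(x)|^2 \, dx \Bigr)^{1/2} .
```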
Hey thanks for writing this article. I found it to be really great.
> But we can take it even further; what if we allow any real number as an index?
How can an uncountably infinite set be used as an index? I was fine with natural numbers (countably infinite) being an index obv, but a real seems a stretch. I get the mathematical definition of a function, but again, this feels like we suddenly lose the plot…
We do it all the time. An index just indicates that there is a mapping (a function), usually from the integers. However, we don't use the subscript notation when indexing by a continuum, due to the discomfort you describe.
The point is that we need some way to deal with objects that are inherently infinite-dimensional.
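A minimal sketch of the "real number as index" idea in Python (my own illustration with made-up names, not code from the article): the "index" is just the function's argument, and the vector-space operations are pointwise.
```python
from typing import Callable

Vec = Callable[[float], float]  # a "vector" indexed by the reals: index x -> entry f(x)

def add(f: Vec, g: Vec) -> Vec:
    """Pointwise addition: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(c: float, f: Vec) -> Vec:
    """Scalar multiplication: (c * f)(x) = c * f(x)."""
    return lambda x: c * f(x)

# "Indexing" the vector f at index x is just evaluating f at x.
f: Vec = lambda x: x ** 2
g: Vec = lambda x: 3.0 * x
h = add(scale(2.0, f), g)   # h(x) = 2x^2 + 3x
print(h(2.4))               # ~18.72, i.e. 2 * 2.4**2 + 3 * 2.4
```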
Well there's no law against it.
Okay I suppose the axiom of choice is somewhat necessary to make it make sense. But only because otherwise such an indexed object may fail to exist.
Anyway, arbitrary indexes are useful: you often end up doing stuff like covering a space by finding a covering set for each individual point, and then using compactness to show you only need finitely many to cover the whole space. It is doable without uncountable indices, but it makes it very difficult to write down.
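In symbols, the covering pattern described above looks like this (a standard compactness argument, sketched from memory):
```latex
% For each point x of a compact set K, choose an open set U_x containing x;
% the index set is K itself, which may be uncountable.
K \subseteq \bigcup_{x \in K} U_x
\qquad \text{and, by compactness,} \qquad
K \subseteq U_{x_1} \cup \cdots \cup U_{x_n}
\quad \text{for finitely many } x_1, \dots, x_n \in K .
```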
I think getting hung up on words (in this case index) in mathematics is a trap. They are often stretched to their breaking point and you just kind of go with the flow.
> ‘When I use a word,’ Humpty Dumpty said in rather a scornful tone, ‘it means just what I choose it to mean — neither more nor less.’
> ‘The question is,’ said Alice, ‘whether you can make words mean so many different things.’
> ‘The question is,’ said Humpty Dumpty, ‘which is to be master — that’s all.’
In mathematics it is the author's privilege.
I think that’s why the author put “vector” in quotes. I kind of imagine it as an ephemeral, infinite list where for some real, when we use that real value as an index into our “vector”/function, we get the output value as the item in this infinite, ephemeral list.
I think the only thing that matters is that the indices have an ordering (which the reals obviously do) and they aren’t irrational (i.e. they have a finite precision).
Imagine you have a real number, say, e.g. 2.4. What stops us from using that as an index into an infinite, infinitely resizable list? 2.4^2 = 5.76. Depending on how fine-grained your application requires you could say 2.41 (=5.8081) is the next index OR 2.5 (=6.25) is the next index we look at or care about.
I could be misunderstanding it, though.
A vector is always a vector -- an element of something that satisfies the axioms of a vector space. The author starts with the example of R^n, which is a very particular vector space that is finite-dimensional and comes with a "canonical" basis (0,...,1,...,0). In general, a basis will always exist for any vector space (using the axiom of choice), but there is no need to fix it, unless you do some calculations. The analogy with R^n is the only reason the "indices" are mentioned, and I think this only creates more confusion.
> and they aren’t irrational (i.e. they have a finite precision)
No, if you want only rational "indices", then your vector space has a countable basis. Interesting vector spaces in analysis are uncountably infinite dimensional. (And for this reason the usual notion of a basis is not very useful in this context.)
> and they aren’t irrational (i.e. they have a finite precision).
I'm not sure if I'm misunderstanding what you mean by 'finite precision' but the ordinary meaning of those words would seem to limit it to rational numbers?
You can look at use cases for an index, and see how well they hold up.
Asking what the smallest greater number (the next number) is no longer makes sense.
Taking two numbers and asking whether one is greater than the other still makes sense (and hence also whether they are equal).
Taking two numbers and asking how far they are separated from each other still makes sense.
You may already observe some uses for indexes in programming that don't use all of these properties of an index. For example, the index of a hash set "only cares about equality", and "the next index" may be an unfilled address in a hash set.
I'm probably ignorant of how indexes work at a nuts-and-bolts level, but intuitively this seems like a good idea for certain situations. E.g if we want to keep entries in a specific order but don't know ahead of time how many entries will be added between two existing ones. House numbers in areas with a lot of development are an example of the kind of problem this seems ideal to solve, when there's a clear 'order' based on geography but no clear limit on the number of addresses that could be added 'between' existing addresses.
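That "leave room between existing keys" idea is a real technique, sometimes called fractional indexing. A rough Python sketch using exact rationals so there is always room to insert (my own illustration; the function name is made up):
```python
from fractions import Fraction

def key_between(lo: Fraction | None, hi: Fraction | None) -> Fraction:
    """Return an order key strictly between lo and hi (None means unbounded).

    Because the rationals are dense, a midpoint always exists, so new entries
    can be inserted between existing ones without renumbering anything.
    """
    if lo is None and hi is None:
        return Fraction(0)
    if lo is None:
        return hi - 1
    if hi is None:
        return lo + 1
    return (lo + hi) / 2

# Insert a new "house" between keys 12 and 13, then another between it and 13.
a, b = Fraction(12), Fraction(13)
c = key_between(a, b)   # Fraction(25, 2)  i.e. 12.5
d = key_between(c, b)   # Fraction(51, 4)  i.e. 12.75
print(sorted([a, c, d, b]))
```
Note that, as the reply below points out, the set of keys ever generated this way is still countable.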
I think you're still describing a countably infinite set: there's a bijection between the natural numbers and the set of houses.
One way to think about it is that, even though you're defining an index that permits infinite amounts of subdivision, from any given house there's always a "next house up" in the vector: you can move up one space.
In a real-indexed vector, that notion doesn't apply. It's "infinity plus one" all the way down: whatever real value x you pick to start with, there is no d > 0 small enough that nothing lies between x and x + d.
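In symbols, this is just the density of the reals: for any starting index x and any step d, there is always another index strictly in between,
```latex
\forall\, x \in \mathbb{R},\ \forall\, d > 0 :
\qquad x \;<\; x + \tfrac{d}{2} \;<\; x + d ,
```
so there is no "next" index after x.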
That’s kind of how I understand it as well.
This has its uses. The continuous Fourier transform is based on that. You are asking what frequencies this continuous signal is made of. Time is normally defined as a real number in that context, but if you have continuous time you need continuous frequencies to map time space to frequency space. You can think of an index as a Lego block that you need to construct something.
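For reference, one common convention for the continuous Fourier transform (the placement of the 2π varies between fields):
```latex
\hat{f}(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-i \omega t}\, dt ,
\qquad
f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{f}(\omega)\, e^{i \omega t}\, d\omega .
```
Here both the time "index" t and the frequency "index" ω range over all of the reals.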
The only difference of note, I think, is that you can't enumerate the elements. Instead of being able to say "for each element, ..." you'd have to say "for all elements, ...", like the example of vector length defined as an integral over the full number range.
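Side by side, the discrete and continuous versions of the length being referred to (standard formulas, not quotes from the article):
```latex
\|v\| = \sqrt{\sum_{i=1}^{n} v_i^{\,2}}
\qquad \longleftrightarrow \qquad
\|f\| = \sqrt{\int_{-\infty}^{\infty} f(x)^2 \, dx} ,
```
where the sum over finitely many indices i becomes an integral over the real "indices" x.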
To a mathematician “each” and “all” are synonyms.
The author is stretching an analogy; it's a price to pay for starting with R^3 as a motivational example. There is nothing in the general definition of a vector space that requires its elements to be "indexed".
Consider a function on R as an |R|-dimensional vector...
Which defeats the purpose of thinking about functions as a vector space. It's all smokes and mirrors
What do you understand “index” to mean here? To me, a family indexed by some set is mostly just a different notation for, and attitude towards, a function with domain the indexing set.