Comment by freehorse

2 months ago

Why is the `[1,2,3] + [4;5;6]` syntax a footgun? In many cases it is a very concise, comprehensible and easy way to create matrices. E.g., if you have a time series S, then `S - S'` gives the differences between all pairs of its elements. Or you have two string arrays and you want all combinations of the two.

The `diag` function is admittedly unfortunate, and it has confused me too. It should really be two different functions (which are roughly inverses of each other, weirdly making `diag` a sort of involution).

fph 2 months ago

What happens most of the time with inexperienced or distracted users is that they write things like `norm(S - T)` to compute how close two vectors are, but one of them is a row vector and the other is a column vector, so the result is silently and completely wrong.

Matlab's functions like to create row vectors (e.g., linspace) in a world where column vectors are more common, so this is a common occurrence.

So `[1,2,3] + [4;5;6]` is a concise syntax for an uncommon operation, but unfortunately it is very similar to a frequent mistake for a much more common operation.

Julia tells the two operations (vector sum and outer sum) apart very elegantly: one is `S - T` and the other is `S .- T`: the dot here is very idiomatic and consistent with the rest of the syntax.
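The `norm(S - T)` mistake described above can be reproduced in NumPy, which broadcasts the same way. This is only a sketch: the variable names and the `distance` guard are hypothetical and not taken from any library, and NumPy stands in for Matlab here.

```python
import numpy as np

# A row vector and a column vector holding the same five values,
# mimicking a linspace result (row) next to column-oriented data.
s_row = np.linspace(0.0, 1.0, 5).reshape(1, -1)   # shape (1, 5)
t_col = s_row.T.copy()                            # shape (5, 1)

# Broadcasting silently turns the difference into a 5 x 5 matrix,
# so the "distance" is the Frobenius norm of all pairwise differences.
diff = s_row - t_col
silent_result = np.linalg.norm(diff)

# What was intended: flatten both operands first, then subtract.
intended_result = np.linalg.norm(s_row.ravel() - t_col.ravel())


def distance(a, b):
    """Hypothetical guard: refuse to broadcast mismatched shapes."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    return float(np.linalg.norm(a - b))


print(diff.shape)          # (5, 5)
print(intended_result)     # 0.0
print(silent_result > 0)   # True
```

Calling `distance(s_row, t_col)` raises a `ValueError` instead of returning a plausible-looking number, which is roughly the behaviour you get for free when broadcasting has to be requested explicitly.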

BobbyTables2 2 months ago

What does it even mean to add a 1x3 matrix to a 3x1 matrix?

freehorse 2 months ago
This is about how array operations in matlab work. In matlab, you can write things such as
>> [1 2 3] + 1
ans = [2 3 4]
In this case, the operation `+ 1` is applied to every element of the array. In the same way, when you add a (1 x m) row vector and an (n x 1) column vector, the column is added to each element of the row (or you can view it the other way around). The result is as if you repeated the (n x 1) column m times horizontally to get an (n x m) matrix, repeated the row n times vertically to get another (n x m) matrix, and then added the two. So adding a row and a column is essentially a shortcut for adding these two (n x m) matrices (and it runs faster than actually creating them). This gives a matrix where each column is the original column plus the row element at that column index. For example
>> [1 2 3] + [1; 2; 3]
ans = [2 3 4
       3 4 5
       4 5 6]
A very practical example is, as I mentioned, getting all differences between the elements of a time series by writing `S - S'`. Another example, `(1:6)+(1:6)'` gives you the sums for all possible combinations when rolling 2 6-sided dice.
This works not only with addition and subtraction, but also with the element-wise (dot) product `.*` and other functions. You can do this across arbitrary dimensions, as long as, in every dimension, the two inputs either have the same size or one of them has size 1.
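The "repeat both operands, then add" description above can be checked with a short NumPy sketch (NumPy is used here in place of Matlab, with `np.tile` playing the role of an explicit repeat). The variable names are made up for illustration.

```python
import numpy as np

row = np.array([[1, 2, 3]])        # shape (1, 3)
col = np.array([[1], [2], [3]])    # shape (3, 1)

# Explicit version: expand each operand to the full (3, 3) shape, then add.
row_tiled = np.tile(row, (3, 1))   # the row repeated vertically
col_tiled = np.tile(col, (1, 3))   # the column repeated horizontally
explicit = row_tiled + col_tiled

# Broadcast version: same result without building the two big matrices.
broadcast = row + col
print(np.array_equal(explicit, broadcast))   # True
print(broadcast.tolist())                    # [[2, 3, 4], [3, 4, 5], [4, 5, 6]]

# The dice example: every sum of two 6-sided dice, then how often each occurs.
faces = np.arange(1, 7)
sums = faces[np.newaxis, :] + faces[:, np.newaxis]   # shape (6, 6)
values, counts = np.unique(sums, return_counts=True)
ways = dict(zip(values.tolist(), counts.tolist()))
print(ways[7])                                       # 6
```

The same pattern with `-` instead of `+` gives the pairwise differences of a time series, as in `S - S'`.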
quietbritishjim 2 months ago
It means the same thing in MATLAB and numpy:
import numpy as np
Z = np.array([[1,2,3]])
W = Z + Z.T
print(W)
Gives:
[[2 3 4]
 [3 4 5]
 [4 5 6]]
It's called broadcasting [1]. I'm not a fan of MATLAB, but this is an odd criticism.
[1] https://numpy.org/devdocs/user/basics.broadcasting.html#gene...
- adgjlsfhk1 2 months ago
  
  One of the really nice things Julia does is make broadcasting explicit. The way you would write this in Julia is
  Z = [1,2,3]
  W = Z .+ Z' # note the . before the +, which makes this a broadcasted operation
  This has 2 big advantages. Firstly, it means that users get errors when the shapes of things aren't what they expected. A DimensionMismatch error is a lot easier to debug than a silently wrong result. Secondly, it means that Julia can use `exp(M)` etc. to mean a matrix exponential, while the element-wise exponential is `exp.(M)`. This allows a lot of code to naturally work generically over both arrays and scalars (e.g. exp of a complex number will work correctly if written as a 2x2 matrix).
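The complex-number remark can be illustrated numerically. The sketch below is in Python rather than Julia, so the two operations need different function names: `np.exp` is element-wise (like `exp.(M)`), and `expm_taylor` is a hypothetical hand-rolled helper standing in for a true matrix exponential (like `exp(M)`). It uses a truncated Taylor series only to stay self-contained, and `as_matrix` is likewise an illustrative name.

```python
import cmath
import numpy as np


def expm_taylor(m, terms=30):
    """Hypothetical helper: matrix exponential via a truncated Taylor series."""
    m = np.asarray(m, dtype=float)
    result = np.eye(m.shape[0])
    term = np.eye(m.shape[0])
    for k in range(1, terms):
        term = term @ m / k      # now holds m**k / k!
        result = result + term
    return result


def as_matrix(z):
    """Represent the complex number a + bi as the real matrix [[a, -b], [b, a]]."""
    return np.array([[z.real, -z.imag], [z.imag, z.real]])


z = 0.5 + 1.2j
M = as_matrix(z)

matrix_exp = expm_taylor(M)      # analogous to exp(M) in Julia
elementwise_exp = np.exp(M)      # analogous to exp.(M) in Julia

# Only the matrix exponential agrees with the scalar complex exponential.
print(np.allclose(matrix_exp, as_matrix(cmath.exp(z))))        # True
print(np.allclose(elementwise_exp, as_matrix(cmath.exp(z))))   # False
```

A truncated series is fine for a small matrix like this one, but it is not a robust general-purpose algorithm.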