dice = 1:6
using Statistics
μ = Statistics.mean(dice)
sum(dice) / length(dice)
Statistics.var(dice, corrected=false)
sum((dice .- μ).^2 / length(dice))
Statistics.var(dice)
Statistics.std(dice)^2
sum((dice .- μ).^2 / (length(dice) - 1))
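The lines above compute the variance twice, once dividing by the number of samples and once dividing by one less. As a small sketch (the names `var_population` and `var_sample` are introduced here only for illustration), the two values can be checked to differ by exactly the factor n / (n - 1):

```julia
using Statistics

dice = 1:6
n = length(dice)

# Uncorrected (population) variance divides by n,
# corrected (sample) variance divides by n - 1.
var_population = var(dice, corrected=false)
var_sample = var(dice)

# The two differ only by the factor n / (n - 1).
println(var_sample ≈ var_population * n / (n - 1))
```

The corrected version is the default of `var` and `std`, which is why `Statistics.std(dice)^2` matches the formula with `length(dice) - 1`.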
[1:6; 1:6]
Statistics.cor([1:6; 1:6])
[1:6; 6:-1:1]
Statistics.cor([1:6; 6:-1:1])
[1:6 6:-1:1]
Statistics.cor([1:6 6:-1:1])
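The difference between `;` and a space inside the brackets is easy to miss. The following sketch (the variable names are hypothetical) shows that `;` stacks the two ranges into a single vector of 12 samples, whereas a space places them side by side as two columns, and that the off-diagonal entry of the resulting correlation matrix agrees with the two-argument form `cor(up, down)`:

```julia
using Statistics

up = collect(1:6)
down = collect(6:-1:1)

# Vertical concatenation: one vector holding 12 samples of a single variable.
stacked = [up; down]
# Horizontal concatenation: a 6×2 matrix, i.e. 6 samples of 2 variables.
side_by_side = [up down]

# cor of a matrix returns the pairwise correlations between its columns,
# so the off-diagonal entry matches the two-argument form.
C = cor(side_by_side)
println(size(stacked), " ", size(side_by_side))
println(C[1, 2] ≈ cor(up, down))
```

Since `down` is a decreasing affine function of `up`, the two columns are perfectly anticorrelated.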
rand produces independent samples, so we should see a correlation close to zero. For a small number of samples, we still see a bit of correlation
rand(6, 2)
Statistics.cor(rand(6, 2))
But as we increase the number of samples, the correlation goes to zero
Statistics.cor(rand(100, 2))
Statistics.cor(rand(100000, 2))
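To make the trend easier to see, here is a small sketch that prints the off-diagonal correlation for a few sample sizes next to 1/√n, which is roughly the size of fluctuation one expects for independent columns. The helper `sample_cor` and the seed are assumptions made for this illustration only:

```julia
using Statistics
using Random

Random.seed!(0)  # arbitrary fixed seed so that reruns give the same numbers

# Correlation between two independent uniform columns of length n.
sample_cor(n) = cor(rand(n, 2))[1, 2]

for n in (10, 1_000, 100_000)
    println("n = ", n, ": correlation = ", sample_cor(n), ", 1/√n = ", 1 / √n)
end
```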
using Plots
μ = 0
σ = 1
f(x) = exp(-1/2*((x - μ) / σ)^2) / (σ * √(2π))
x = range(-4.0, stop=4.0, length=1000)
plot(x, f.(x))
z = randn(100)
scatter!(z, f.(z), markersize=2)
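As a sanity check on the density plotted above, the following sketch verifies numerically that it integrates to one and that samples from `randn` have mean close to 0 and standard deviation close to 1. The function `normal_pdf` takes μ and σ as arguments instead of reading globals; this is a choice made for this example, not part of the code above:

```julia
using Statistics
using Random

Random.seed!(0)  # arbitrary fixed seed

# Same formula as f above, with μ and σ passed explicitly.
normal_pdf(x, μ, σ) = exp(-1/2 * ((x - μ) / σ)^2) / (σ * √(2π))

# Riemann sum of the standard normal density over a wide interval.
grid = range(-8.0, stop=8.0, length=10_001)
area = sum(normal_pdf.(grid, 0, 1)) * step(grid)
println("area ≈ ", area)

# Empirical mean and standard deviation of standard normal samples.
samples = randn(100_000)
println("mean ≈ ", mean(samples), ", std ≈ ", std(samples))
```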
See Example 7.1 of the book "Convex Optimization" by Boyd & Vandenberghe.
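Suppose we observe $y_i = a_i^\top x + v_i$, where the noise $v_i$ has a normal distribution $\mathcal{N}(\mu_i, \sigma_i^2)$ with density $f_i$.
We want to recover $x$ from the observations $y_i$ such that the $v_i$ are likely to occur. That is, we want to maximize $f_i(v_i)$ for every $i$. This is a multi-objective problem. Instead, we can maximize the likelihood that they all occur. Since the $v_i$ are independent, this is the product $\prod_i f_i(v_i)$.
Since $\log$ is an increasing function, that is equivalent to maximizing the logarithm:
$$\sum_i \log f_i(v_i)$$
which is equal to
$$\sum_i \left[ -\log(\sigma_i) - \log(\sqrt{2\pi}) - \frac{1}{2} \left( \frac{v_i - \mu_i}{\sigma_i} \right)^2 \right]$$
The first two terms do not depend on $x$ so we can drop them.
$-1/2$ is a negative constant so maximizing this expression is equivalent to minimizing
$$\sum_i \left( \frac{v_i - \mu_i}{\sigma_i} \right)^2$$
In terms of $x$, this is
$$\sum_i \left( \frac{y_i - a_i^\top x - \mu_i}{\sigma_i} \right)^2$$
If $\mu_i = 0$ and $\sigma_i$ does not depend on $i$ (it is the same for all samples; we say the $v_i$ are independent and identically distributed (i.i.d.)), this gives, up to the constant factor $1/\sigma^2$,
$$\|y - Ax\|_2^2$$
where the $i$th row of $A$ is $a_i^\top$. So the classical linear regression we saw during the first week assumes i.i.d. normal noise of zero mean.
How should we interpret the scaling $1/\sigma_i$ in terms of the influence on $x$ of very noisy samples $i$?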
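The equivalence between maximizing the log-likelihood and minimizing the scaled sum of squares can be checked numerically. The sketch below uses small random data; all names (`A_small`, `log_density`, `loglik`, `weighted_sq`, ...) are hypothetical and chosen so that they do not clash with the variables of the example that follows. It checks that, for two arbitrary candidates, the difference in log-likelihood is exactly -1/2 times the difference in scaled squared residuals, so both criteria rank candidates the same way:

```julia
using Random

Random.seed!(1)  # arbitrary fixed seed

n_obs, n_var = 20, 3
A_small = rand(n_obs, n_var)
σ_small = 1 .+ rand(n_obs)   # per-sample standard deviations
μ_small = rand(n_obs)        # per-sample noise means
y_small = A_small * rand(n_var) .+ μ_small .+ σ_small .* randn(n_obs)

# Logarithm of the normal density with mean μ and standard deviation σ.
log_density(v, μ, σ) = -log(σ) - log(√(2π)) - ((v - μ) / σ)^2 / 2

# Log-likelihood of the residuals v = y - A x.
loglik(x) = sum(log_density.(y_small .- A_small * x, μ_small, σ_small))

# Scaled sum of squares obtained after dropping the constant terms.
weighted_sq(x) = sum(((y_small .- A_small * x .- μ_small) ./ σ_small) .^ 2)

x1, x2 = rand(n_var), rand(n_var)
println(loglik(x1) - loglik(x2) ≈ -(weighted_sq(x1) - weighted_sq(x2)) / 2)
```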
n = 101
σ = [1000; ones(n - 1)]
μ = 100rand(n)
v = randn(n) .* σ .+ μ
m = 10
x = rand(m)
A = rand(n, m)
y = A * x + v
Without taking the noise into account, we get large errors:
x - A \ y
Taking the noise mean $\mu$ into account can be done as follows:
x - A \ (y - μ)
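How do we take $\sigma$ into account?
We know that A \ b solves the least squares problem
$$\min_x \sum_i (a_i^\top x - b_i)^2$$
but we want it to solve
$$\min_x \sum_i \left( \frac{a_i^\top x - b_i}{\sigma_i} \right)^2$$
instead, with $b = y - \mu$. How do we do this? The expression $(a_i^\top x - b_i) / \sigma_i$ is equal to
$$(a_i / \sigma_i)^\top x - b_i / \sigma_i$$
so we can just scale the rows of $A$ and the entries of $b$ by $1/\sigma_i$: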
x - (A ./ σ) \ ((y - μ) ./ σ)
Now that's much better.
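The row-scaling trick can be cross-checked against the weighted normal equations, whose solution is (AᵀWA)⁻¹AᵀWb with W the diagonal matrix of weights 1/σᵢ². This is only a sketch on small random data, with hypothetical names (`A_w`, `b_w`, `σ_w`) that are not part of the example above:

```julia
using LinearAlgebra
using Random

Random.seed!(2)  # arbitrary fixed seed

n_obs, n_var = 50, 4
A_w = rand(n_obs, n_var)
b_w = rand(n_obs)
σ_w = 0.5 .+ rand(n_obs)

# Scale row i of A and entry i of b by 1/σ_i, then solve ordinary least squares.
x_scaled = (A_w ./ σ_w) \ (b_w ./ σ_w)

# Same problem written as weighted normal equations with W = diag(1/σ_i^2).
W = Diagonal(1 ./ σ_w .^ 2)
x_normal = (A_w' * W * A_w) \ (A_w' * W * b_w)

println(x_scaled ≈ x_normal)
```

In practice the scaled backslash form is usually preferable, since forming AᵀWA explicitly squares the condition number of the problem.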
This page was generated using Literate.jl.