In the previous post, I investigated the convex hull of random point clouds. I found that as we add more dimensions while keeping the number of points the same, the expected volume decreases sharply. One way to think of this is that in high dimensions, almost all points in a random point cloud will lie on the edge of the convex hull; very few will be "surrounded" by the other points.
In this post, I want to see what happens if the coordinates of the points are not independent, but correlated with each other. Why might this make a difference? Well, strongly correlated systems have in a sense fewer degrees of freedom, so we expect them to behave like lower dimensional problems.
How to sample correlated points in R^N
Generate a square N×N matrix A with uniform random elements, each with expectation 0. For each point in the point cloud, generate N independent standard Gaussian random numbers, and apply A to these. The covariance matrix of the resulting points will be AA^T.
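The recipe above can be sketched in a few lines of Python. This is a minimal sketch and not necessarily the code behind the numbers in this post: the range [-1, 1] for the elements of A, the function names, and the choice of 20,000 points in 3 dimensions are all assumptions made for illustration. It uses only the standard library and checks that the sample covariance lands close to AA^T.

```python
import random

def sample_correlated_points(n_points, dim, rng):
    """Return (points, A), where each point is A applied to a standard Gaussian vector."""
    # Mixing matrix A: uniform on [-1, 1] (assumed range), so each element has expectation 0.
    A = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]
    points = []
    for _ in range(n_points):
        # dim independent standard Gaussian numbers for this point.
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Apply A to z: x_i = sum_k A[i][k] * z[k].
        points.append([sum(A[i][k] * z[k] for k in range(dim)) for i in range(dim)])
    return points, A

def covariance(points):
    """Sample covariance matrix of a list of equal-length points."""
    n, dim = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(dim)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / (n - 1)
             for j in range(dim)] for i in range(dim)]

rng = random.Random(0)  # fixed seed so the run is repeatable
dim = 3
points, A = sample_correlated_points(20_000, dim, rng)

empirical = covariance(points)
# Theoretical covariance: (A A^T)[i][j] = sum_k A[i][k] * A[j][k].
expected = [[sum(A[i][k] * A[j][k] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]
max_abs_error = max(abs(empirical[i][j] - expected[i][j])
                    for i in range(dim) for j in range(dim))
print("largest deviation from A A^T:", max_abs_error)
```

The printed deviation should be small compared to the entries of AA^T, and it shrinks as the number of points grows.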
2 dimensions
Making 100 random trials gives an average k-fold inclusion of 91%, compared to 88% for the independent point clouds. So the difference is not big, and not even statistically significant. The variance of the number of successes in a binomial distribution is n*p*(1-p), where n is the number of trials and p is the success probability. With n=100 and p=0.9 in our case, this comes out to 9. So the standard deviation is 3 trials, or about 3 percentage points, pretty much the same as the difference between the sets. We might want to apply a real test here, something that respects the binomial distribution. But that's silly; if we wanted something more accurate we should just run more trials.
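The back-of-the-envelope calculation can be written out as below. This is a rough sketch under the same simplification as the text, treating each of the 100 trials as a single success or failure, which is only an approximation of how the inclusion averages arise. For n=100 and p=0.9 the formula n*p*(1-p) evaluates to 9, so the standard deviation is 3 trials. The helper name and the comparison of the two estimates (assuming 100 trials for each set) are additions for illustration.

```python
import math

def binomial_count_std(n, p):
    """Standard deviation of the number of successes in n trials with success probability p."""
    return math.sqrt(n * p * (1.0 - p))

n = 100  # number of trials per set (assumed equal for both sets)

# Spread of a single estimate near p = 0.9.
std_count = binomial_count_std(n, 0.9)   # in units of trials
std_percent = 100.0 * std_count / n      # in percentage points

# Spread of the difference between two independent estimates.
p_corr, p_indep = 0.91, 0.88
std_diff_percent = 100.0 * math.sqrt(
    p_corr * (1.0 - p_corr) / n + p_indep * (1.0 - p_indep) / n
)
z_score = 100.0 * (p_corr - p_indep) / std_diff_percent

print("std of one estimate (percentage points):", std_percent)
print("std of the difference (percentage points):", std_diff_percent)
print("observed difference in standard deviations:", z_score)
```

Under these assumptions the observed 3 point gap is well under one standard deviation of the difference, which agrees with the reading that it is not significant.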
10 dimensions
Doing this in 10 dimensions gives an estimate of 5.6% for the independent point clouds, and 6.0% for the correlated point clouds. Once again, the difference is not significant.
Conclusion
When we observe high-dimensional systems, almost all observations will be "extreme" in some direction, even if the system is structured.