+MikeofKorea Posted September 1, 2013
I've read and seen a lot of folks talking about centroid geocaches, but I can't find anything sensible on how to determine your centroid geocache. Is there a program, or an equation using your farthest East, West, North, and South caches, or what?
+terrkan78 Posted September 3, 2013
I was curious, so I looked into it. According to this old thread, there's a program for it: Link to forum thread
+The Cheeseheads Posted September 3, 2013
I'm moving this to a more appropriate forum for discussion.
+fizzymagic Posted September 3, 2013
As I stated in that thread, I think the best definition of your caching centroid is the point on the surface of the Earth corresponding to the 3-dimensional average of all your cache positions. That is, calculate each cache position as X, Y, and Z in 3 dimensions. Average them all, and that point will be somewhere inside the Earth. Find the point on the surface directly above it, and that is your centroid.
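The 3-dimensional averaging described in this post can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind any existing program: it treats the Earth as a unit sphere (no ellipsoid correction), the function name `extrinsic_centroid` is made up for the example, and "the point on the surface directly above" the average is taken to be the radial projection.

```python
import math


def extrinsic_centroid(points):
    """Average (lat, lon) positions in 3-D and project back to the sphere.

    points: iterable of (lat_deg, lon_deg) pairs.
    Assumption: spherical Earth of unit radius.
    """
    sx = sy = sz = 0.0
    n = 0
    for lat_deg, lon_deg in points:
        lat = math.radians(lat_deg)
        lon = math.radians(lon_deg)
        # Unit vector for this cache position.
        sx += math.cos(lat) * math.cos(lon)
        sy += math.cos(lat) * math.sin(lon)
        sz += math.sin(lat)
        n += 1
    if n == 0:
        raise ValueError("need at least one cache position")

    # The 3-D average; it generally lies inside the sphere.
    x, y, z = sx / n, sy / n, sz / n
    if math.sqrt(x * x + y * y + z * z) < 1e-12:
        # The average sits (numerically) at the center, so there is no
        # well-defined point "directly above" it.
        raise ValueError("average is at the Earth's center; centroid undefined")

    # Radial projection back to the surface, expressed as lat/lon.
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    lon = math.degrees(math.atan2(y, x))
    return lat, lon
```

Two caches on the equator at longitudes -10 and +10 should come out at roughly (0, 0), while two exactly antipodal caches raise the error, because their 3-D average is the center of the sphere.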
+unabowler Posted September 4, 2013
Part of my doctoral dissertation in statistics dealt with methods of computing a sample mean on a manifold (i.e., on a surface such as a sphere or a higher-dimensional analog), and this is one of the methods. We'd call it an extrinsic mean, since you leave the surface and then project back to it. An "intrinsic" method of finding the mean would be to find the point y which minimizes sum(d(x_i,y)^2) over the points x_i where you've found caches, where d(x_i,y) is the geodesic distance between x_i and y, i.e., the great-circle distance between x_i and y. If a person has found two caches at points on opposite sides of the earth from each other, both methods fail (assuming the earth is a sphere, which it is not).
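For readers who want to experiment with the intrinsic definition, here is a minimal Python sketch that minimizes sum(d(x_i,y)^2) on a unit sphere. It uses a common textbook fixed-point iteration (average the points in the tangent plane at the current estimate, then step along the sphere); this is an assumption chosen for illustration and not necessarily the method from the dissertation mentioned above. The names `intrinsic_mean`, `to_unit_vector`, and `to_lat_lon` are hypothetical, and the iteration is only expected to behave well when the caches are not spread over more than about a hemisphere.

```python
import math


def to_unit_vector(lat_deg, lon_deg):
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def to_lat_lon(v):
    x, y, z = v
    return (math.degrees(math.atan2(z, math.hypot(x, y))),
            math.degrees(math.atan2(y, x)))


def intrinsic_mean(points, tol=1e-12, max_iter=100):
    """Approximate the point minimizing the sum of squared great-circle
    distances to the given (lat_deg, lon_deg) points (unit sphere assumed)."""
    vecs = [to_unit_vector(lat, lon) for lat, lon in points]
    n = len(vecs)
    if n == 0:
        raise ValueError("need at least one cache position")

    # Start from the projected 3-D average (the extrinsic mean).
    y = [sum(v[i] for v in vecs) / n for i in range(3)]
    norm = math.sqrt(sum(c * c for c in y))
    if norm < 1e-12:
        raise ValueError("points balance at the center; no starting point")
    y = [c / norm for c in y]

    for _ in range(max_iter):
        # Average of the tangent vectors at y pointing toward each cache,
        # each scaled to the great-circle distance (in radians).
        g = [0.0, 0.0, 0.0]
        for p in vecs:
            dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(y, p))))
            theta = math.acos(dot)
            if theta < 1e-15:
                continue  # cache coincides with the current estimate
            scale = theta / math.sin(theta)
            for i in range(3):
                g[i] += scale * (p[i] - dot * y[i]) / n
        step = math.sqrt(sum(c * c for c in g))
        if step < tol:
            break
        # Move along the great circle in direction g by angle `step`.
        y = [math.cos(step) * y[i] + math.sin(step) * g[i] / step
             for i in range(3)]
        norm = math.sqrt(sum(c * c for c in y))
        y = [c / norm for c in y]
    return to_lat_lon(y)
```

For caches that all lie on the equator, the great-circle distances are just longitude differences, so the result should match the ordinary average of the longitudes. The projected 3-D average for the same points can differ slightly, which is one way to see that the two definitions are not identical.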
+fizzymagic Posted September 4, 2013
Yes, actually I have played around with this intrinsic mean as well. Its main disadvantage is that it is "hard" to calculate (it requires calculating N distances per iteration, where N is the number of caches). But it has the advantage of always giving an answer on the surface of the Earth. One comment on your proposed mean: since distance on the surface is a metric, I am not convinced that minimizing the distance squared is correct. That gives you more of an RMS average than a mean, IMO. Just minimizing the sum of the distances is probably better.
+unabowler Posted September 4, 2013
There is a reason for using the squared distance. For a standard pdf f(x), the point a which minimizes integral[(a-x)^2 f(x)] dx is the mean mu that you get from integral(x f(x)) dx. All integrals here are from -inf to +inf. The discrete analog for a sample mean is the summation, and the (a-x)^2 discretizes to the squared distances. We had a fast iterative method to give a computational approximation to this mean, and the method worked for the infinite-dimensional manifolds we dealt with.
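The discrete version of this argument is easy to check numerically on the real line. The sketch below is an illustration with made-up sample values and hypothetical helper names: it scans a grid of candidate points and compares the minimizer of the sum of squared deviations with the minimizer of the sum of absolute deviations, which is the one-dimensional analog of minimizing plain distances.

```python
import statistics


def sum_sq(a, xs):
    """Sum of squared deviations of the sample xs from the candidate a."""
    return sum((a - x) ** 2 for x in xs)


def sum_abs(a, xs):
    """Sum of absolute deviations of the sample xs from the candidate a."""
    return sum(abs(a - x) for x in xs)


def argmin_on_grid(cost, xs, lo, hi, steps=10000):
    """Brute-force search for the grid point in [lo, hi] minimizing cost."""
    best_a, best_cost = lo, cost(lo, xs)
    for k in range(1, steps + 1):
        a = lo + (hi - lo) * k / steps
        c = cost(a, xs)
        if c < best_cost:
            best_a, best_cost = a, c
    return best_a


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0, 20.0]  # made-up data with one outlier
    print("argmin of squared deviations:", argmin_on_grid(sum_sq, sample, 0.0, 20.0))
    print("sample mean:                 ", statistics.mean(sample))
    print("argmin of absolute deviations:", argmin_on_grid(sum_abs, sample, 0.0, 20.0))
    print("sample median:               ", statistics.median(sample))
```

Up to the grid resolution, the squared-deviation minimizer should agree with the sample mean and the absolute-deviation minimizer with the sample median. In other words, minimizing plain distances generalizes the median rather than the mean.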
+Walts Hunting Posted September 5, 2013 (edited September 5, 2013)
MikeofKorea wrote: "Is there a program or an equation with your farthest East West North South caches, or what?"
That info won't do it, since the centroid is weighted by every cache. As said before: get GSAK, install the Centroid macro, and run it against your current database, and presto, all is well. Mine is here: Centroid of Database: Found N 38° 31.883 W 116° 33.216
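Coordinates in the degrees and decimal minutes form shown in this post have to be converted to signed decimal degrees before they can be fed into any averaging formula. A small sketch of that conversion follows; the function name `parse_ddm` and the regular expression are assumptions made for this example and are not part of GSAK or its macros.

```python
import re

# Matches strings such as "N 38° 31.883 W 116° 33.216".
_DDM = re.compile(
    r"([NS])\s*(\d+)\s*°\s*(\d+(?:\.\d+)?)\s*"
    r"([EW])\s*(\d+)\s*°\s*(\d+(?:\.\d+)?)"
)


def parse_ddm(text):
    """Convert 'N dd° mm.mmm W ddd° mm.mmm' to (lat, lon) in decimal degrees.

    South latitudes and west longitudes are returned as negative numbers.
    """
    match = _DDM.search(text)
    if match is None:
        raise ValueError("no coordinate found in: %r" % (text,))
    ns, lat_deg, lat_min, ew, lon_deg, lon_min = match.groups()
    lat = int(lat_deg) + float(lat_min) / 60.0
    lon = int(lon_deg) + float(lon_min) / 60.0
    if ns == "S":
        lat = -lat
    if ew == "W":
        lon = -lon
    return lat, lon


if __name__ == "__main__":
    print(parse_ddm("Centroid of Database: Found N 38° 31.883 W 116° 33.216"))
```

The minutes are divided by 60 and added to the whole degrees, so the centroid quoted above works out to about 38.5314 degrees north and 116.5536 degrees west.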
+The Jester Posted September 5, 2013
Or run the FindStatGen macro; under the "Some Numbers" section you'll find the centroid, with a link to a map showing it.
+fizzymagic Posted September 5, 2013
unabowler wrote: "There is a reason for using the squared distance."
Yeah, you are right. I am naturally skeptical of squared summations because they frequently involve an implicit assumption that f(x) is normal. I think it does here, also, but I can't prove it. Nonetheless, it is certainly better than minimizing the sum of the distances. I typically use a more-or-less brute-force method to minimize functions using a simplex algorithm. It is stable and works for a variety of geometric problems, though it is not as fast as other solutions tuned to their problem space. If I get time I will compare the centroids obtained in 3-space vs. those from the ellipsoid.
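To give a feel for the simplex approach mentioned here, the following Python sketch minimizes the sum of squared great-circle distances with a small hand-written Nelder-Mead style search over (latitude, longitude). It is a simplified variant written for this illustration, not fizzymagic's code: it assumes a spherical Earth rather than the ellipsoid, searches directly in degrees (so it is only sensible away from the poles and the date line), and the names `great_circle_rad`, `nelder_mead`, and `squared_distance_cost` are hypothetical.

```python
import math


def great_circle_rad(p, q):
    """Haversine great-circle angle in radians between two (lat, lon) pairs
    given in degrees (unit sphere assumed)."""
    lat1, lon1 = math.radians(p[0]), math.radians(p[1])
    lat2, lon2 = math.radians(q[0]), math.radians(q[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * math.asin(min(1.0, math.sqrt(h)))


def squared_distance_cost(caches):
    """Build the objective: sum of squared great-circle distances to caches."""
    return lambda y: sum(great_circle_rad(y, c) ** 2 for c in caches)


def nelder_mead(f, start, step=1.0, xtol=1e-8, max_iter=1000):
    """Simplified downhill-simplex search (reflect, expand, contract, shrink)."""
    n = len(start)
    simplex = [list(start)]
    for i in range(n):
        vertex = list(start)
        vertex[i] += step
        simplex.append(vertex)

    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Stop once every vertex is within xtol of the best one.
        size = max(abs(a - b) for v in simplex[1:] for a, b in zip(v, best))
        if size < xtol:
            break
        # Centroid of all vertices except the worst.
        center = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            """Point center + t * (center - worst)."""
            return [c + t * (c - w) for c, w in zip(center, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
    return min(simplex, key=f)


if __name__ == "__main__":
    caches = [(10.0, 0.0), (-10.0, 0.0), (0.0, 10.0), (0.0, -10.0)]  # made up
    print(nelder_mead(squared_distance_cost(caches), [3.0, -2.0]))
```

The appeal of this kind of search is that it needs only function values, no derivatives, so the same routine can be pointed at a different distance function (for example an ellipsoidal one) without other changes. The price is many more evaluations than a gradient method tuned to the sphere.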
+unabowler Posted September 5, 2013
If you take integral[(a-x)^2 f(x)] dx and differentiate with respect to a, then to minimize you set 2*integral[(a-x) f(x)] dx = 0. Since integral(f(x)) dx = 1, solving for a gives a = integral(x f(x)) dx. This is true for any pdf f(x) with a finite mean, so no assumption of normality is necessary. It seems like a simplex algorithm would work, but the method we had was a gradient method. Is the comparison you're talking about a comparison between the intrinsic and extrinsic methods? We never did that comparison because we were concerned with other stuff.
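Written out step by step, the argument in this post looks as follows. The symbol g(a) for the objective is introduced here only for convenience, and the derivation assumes that f has a finite mean and variance so that the integrals exist.

```latex
\begin{aligned}
g(a) &= \int_{-\infty}^{\infty} (a - x)^2 f(x)\,dx \\
g'(a) &= 2\int_{-\infty}^{\infty} (a - x)\, f(x)\,dx
       = 2a\int_{-\infty}^{\infty} f(x)\,dx - 2\int_{-\infty}^{\infty} x f(x)\,dx
       = 2a - 2\mu \\
g'(a) = 0 \;&\Longleftrightarrow\; a = \mu = \int_{-\infty}^{\infty} x f(x)\,dx,
\qquad g''(a) = 2 > 0
\end{aligned}
```

The only property of f used is that it integrates to 1, and the positive second derivative confirms that the stationary point is a minimum, which is why no normality assumption is needed.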