Project-GC vs MyGeocachingProfile Distance Difference


I recently had a challenge cache published (GC8EZ3C) which requires one to have logged 225,291 miles (362,570.72 km) or more in total distance between all found caches, measured in straight lines and in the order they were logged. That figure is the distance from the Earth to the Moon.

 

When I run the Project-GC challenge checker, it says I qualify with 421182.966 km (or 1.16 times the required distance).  But when I look at the "same" distance calculated by MyGeocachingProfile.com, it says I have 498,473.22 km (or 1.3 times the distance to the moon).
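For anyone who wants to check the arithmetic behind these ratios, here is a minimal sketch. The variable names are made up for illustration; only the three distances come from the post above.

```python
MILES_TO_KM = 1.609344  # exact, by definition of the international mile

required_km = 225_291 * MILES_TO_KM  # challenge threshold from the cache page
checker_km = 421_182.966             # figure reported by the Project-GC checker
mgp_km = 498_473.22                  # figure reported by MyGeocachingProfile.com

print(f"required: {required_km:,.2f} km")
print(f"Project-GC ratio: {checker_km / required_km:.2f}")
print(f"MGP ratio against the same threshold: {mgp_km / required_km:.2f}")
```

The last ratio comes out near 1.37 rather than the 1.3 shown on the profile, which suggests (but does not prove) that MGP divides by a different Earth-Moon figure than the challenge does.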

 

Does anyone know why the two are different?  (And by quite a bit!)  I believe that the Project-GC calculation excludes locationless caches.  Maybe the calculation performed by MyGeocachingProfile.com does not?  Could that be the difference?  Any other thoughts?

 

This question has come up because some friends thought they qualified, based upon the number shown in their profile created by MyGeocachingProfile.com.  So they were surprised when the checker said that they did not qualify.  I thought it odd, too, because they have over 11,000 more finds than me, including some finds out of the U.S., which I do not have.

 

 

Link to comment
59 minutes ago, Old River Runner said:

Does anyone know why the two are different?

 

The data is different. For example, PGC calculates distance between posted coordinates but MGP calculates between final coordinates.

 

You can not compare these results as they are not based on the same data.

Edited by arisoft
Link to comment
48 minutes ago, arisoft said:

 

The data is different. For example, PGC calculates distance between posted coordinates but MGP calculates between final coordinates.

 

You can not compare these results as they are not based on the same data.

 

I don't think that explains it.  As I understand it, Project-GC gets its data directly from GC.com, while one has to upload the data into MyGeocachingProfile.  Therefore, it would seem that PGC is more likely to have information on the final coordinates.  Since mystery cache finals are required to be within 2 miles of the posted coordinates, I am not sure that would explain the significant difference.  Of course, if the multi-caches found were significantly far enough away from the posted, since there isn't a 2-mile restriction on them, that could add a little more difference.  But again, we are talking about a 71,000 km (44,000 mile) difference in my numbers!  That's a lot of multis and mystery caches!

 

But I agree 100% with you that the results must not be based on the same data.  I just want to find out and understand the differences!  Is one bogus and the other not?

 

Link to comment
1 hour ago, Old River Runner said:

But I agree 100% with you that the results must not be based on the same data.  I just want to find out and understand the differences!  Is one bogus and the other not?

 

I tested the MGP with two traditional caches separated by 388.80 kilometers via the geodesic. (From GC72 to GC4EEWT)

 

The MGP total distance shows me a total of 385.61 kilometers with these two finds.

 

From this simple test I can say that the result you get from the MGP has only entertainment value.

 

PS. Direct distance thru the earth between these caches is 388.74 kilometers.
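The relationship between the surface distance and the distance thru the earth can be sketched as below. This assumes a spherical Earth with a mean radius of 6371 km, which is a simplification of the geodesic figure quoted above; `chord_from_arc` is a name invented for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is an assumption


def chord_from_arc(arc_km: float, radius_km: float = EARTH_RADIUS_KM) -> float:
    """Length of the straight tunnel joining two surface points arc_km apart."""
    return 2.0 * radius_km * math.sin(arc_km / (2.0 * radius_km))


arc = 388.80  # surface distance between the two test caches, in km
chord = chord_from_arc(arc)
print(f"surface: {arc:.2f} km, through the earth: {chord:.2f} km")
```

Under that assumption the chord is only about 6 cm per kilometer shorter than the arc at this range, so any correct method should land between roughly 388.74 and 388.80 km. A result of 385.61 km is below even the chord.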

Edited by arisoft
Link to comment
7 minutes ago, arisoft said:

I tested the MGP with two traditional caches separated by 388.80 kilometers via the geodesic. (From GC72 to GC4EEWT)

 

The MGP total distance shows me a total of 385.61 kilometers with these two finds.

 

From this simple test I can say that the result you get from the MGP has only entertainment value.

 

PS. Direct distance thru the earth between these caches is 388.74 kilometers.

 

Yes, I just tested with my own finds and that appears to be what's happening. It should also be noted that MGP specifically states the following on this stat (bolding mine):

Quote

Total Distance Between Caches (in the Order Found and in Straight Lines)

 

I assume they've done this simply to reduce the processing power required to crunch through the many great circle calculations that would need to be performed. It should be noted that the other distance stats, like farthest from home, longest distance between two caches, etc., all use great circle distance. It's only the total distance that, as arisoft correctly put it, is purely for entertainment value.
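As a rough illustration of what a "total distance in the order found" stat involves, here is a minimal sketch that sums great-circle legs with the haversine formula. It is not the code of either site; the function names, the 6371 km radius, and the sample coordinates are all assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is an assumption


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def total_distance_km(finds):
    """Sum the legs between consecutive finds, in the order given."""
    return sum(haversine_km(*a, *b) for a, b in zip(finds, finds[1:]))


# Hypothetical finds as (lat, lon) pairs, already sorted in find order.
finds = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(f"{total_distance_km(finds):.2f} km")
```

Each leg costs a handful of trig calls, so even tens of thousands of finds should be a small amount of work for a modern server.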

Link to comment

I rarely visit PGC so I tested it out of curiosity. I was surprised at how quickly the calculation was done, but I'm a little ignorant of how this sort of thing works. Result was 1.3 times qualification. Then I came across a heap of Challenge caches listed for Oz, which was new to me, presumably because they had the word "challenge" in the title (I know of a few around here that don't have "challenge" in their title). A new place for me to explore, perhaps.

Link to comment
45 minutes ago, colleda said:

Then I came across a heap of Challenge caches listed for Oz, which was new to me, presumably because they had the word "challenge" in the title

 

PGC knows challenges pretty well. The title is not the only criterion.

 

Calculating distances is not extremely time consuming. There is no reason to tweak distance calculations, at least not as much as MGP does.

 

Link to comment

@arisoft -- Are you saying that MGP calculates distance straight line only, point-to-point, without considering the curvature of the earth, and that PGC calculates considering the curvature of the earth?

 

If that is the case, shouldn't the MGP results be less than the PGC results?  I am seeing the opposite (i.e., MGP > PGC).

 

Also, could one be including locationless cache finds (MGP) and the other excluding them (PGC)?  That would definitely make a difference in the direction observed, but I am curious if anyone knows the working of MGP vs. PGC for something like this.

 

Edited by Old River Runner
Link to comment
8 hours ago, Old River Runner said:

@arisoft -- Are you saying that MGP calculates distance straight line only, point-to-point, without considering the curvature of the earth, and that PGC calculates considering the curvature of the earth?

 

If that is the case, shouldn't the MGP results be less than the PGC results?  I am seeing the opposite (i.e., MGP > PGC).

 

Also, could one be including locationless cache finds (MGP) and the other excluding them (PGC)?  That would definitely make a difference in the direction observed, but I am curious if anyone knows the working of MGP vs. PGC for something like this.

 

 

I am not saying anything like this. If you read my postscript you will find that the absolutely straight distance is also longer than the distance estimated by the MGP. There is no way to get a distance shorter than the distance thru the earth, but MGP did.

 

Because the distance is wrong in the very basic case, there is no reason to look for differences in special cases. One may guess that MGP calculates the distance to locationless caches based on the coordinates in the entered data, or that the algorithm calculates all distances so randomly that this one test is not sufficient to estimate the error at all. It only tells that the calculation is based on moon logic.

Edited by arisoft
Link to comment
22 hours ago, Old River Runner said:

If that is the case, shouldn't the MGP results be less than the PGC results?  I am seeing the opposite (i.e., MGP > PGC).

 

That's what I'm seeing too. Regardless of the reason, MGP seems to be the outlier. When I compare the distance generated by the GSAK macro FindStatsGen (which is using puzzle final coordinates) with PGC (which uses posted coordinates), the difference is small enough to consider these in agreement. MGP is way out to lunch, so they must either have an error in their calculations, or they're performing completely different calculations than we're expecting. Either way, their number isn't meaningful.

GSAK: 150174 km

PGC: 152952 km

MGP: 194306 km

 

As far as usefulness, MGP might as well tell you you've traveled "purple" miles. If anyone is so inclined, this should be brought to the attention of the MGP admins.

Link to comment

I have posted on PGC about the difference between what GSAK and what PGC come up with. I believe it was more about caching centroid than cache-to-cache distance, though I think both rely on great circle calculations.  I believe the basic answer was that they use different data and calculations, and the results are a little different as a result.

 

(I can't directly compare my results between GSAK and PGC, as I've manually added in lab caches to GSAK, and that would skew the results regardless of other steps.)

 

MGP is out in left field, though.  I'm glad that stat doesn't appear in the official Geocaching stat tab, since that was built in partnership with MGP.

Link to comment

here is what shows for me: 

 

"I didn't make it with a distance of only 18362.100 km and I needed 362570.72 km 
This is only 0.05 times the required distance
Caches excluded from groundspeek statistics are excluded
If a non Premium member uses the checker all premium caches will be excluded from the calculation because of licence issues"

 

guess lift off is still  in the future!

Link to comment
2 hours ago, Wet Pancake Touring Club said:

If the straight line calculation does not handle crossing the anti-meridian properly, that could explain why the straight line distance is greater than the great circle distance. 

 

Passing the anti-meridian is quite a rare event, but you are right that in this case it may add a huge error.

 

Please note that "straight line distance" needs to be defined better, as the great circle distance is itself one definition of a straight line. The best definition on the surface is called the geodesic, and it is not the same as the great circle, which is just one way of doing the calculation and not an exact one. The straightest line of all goes thru the earth, and it is immune to the anti-meridian problem.
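To make the anti-meridian point concrete, here is a sketch of how a flat (equirectangular) estimate can blow up when the longitude difference is not wrapped. Nothing here is known to be what MGP did; the functions and the two sample points near 180° are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is an assumption


def flat_km(lat1, lon1, lat2, lon2, wrap=True):
    """Equirectangular estimate; wrap=False reproduces the anti-meridian bug."""
    dlon = lon2 - lon1
    if wrap:
        # Fold the difference into [-180, 180) so the short way round is used.
        dlon = (dlon + 180.0) % 360.0 - 180.0
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(dlon) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(dx, dy)


# Two hypothetical caches one degree of longitude apart, either side of 180°.
west_side = (-17.0, 179.5)
east_side = (-17.0, -179.5)

naive = flat_km(*west_side, *east_side, wrap=False)
wrapped = flat_km(*west_side, *east_side, wrap=True)
print(f"naive: {naive:,.0f} km, wrapped: {wrapped:,.0f} km")
```

The unwrapped version measures the long way around the globe, so a single such leg could add tens of thousands of kilometers. It only matters for cachers whose consecutive finds actually straddle the 180° line.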

Edited by arisoft
Link to comment
4 hours ago, Wet Pancake Touring Club said:

If the straight line calculation does not handle crossing the anti-meridian properly, that could explain why the straight line distance is greater than the great circle distance. 

 

That's an interesting thought, but I'm not sure that's what's happening here. Based on where I've found caches, the shortest route to or from any of my finds should never have needed to cross the anti-meridian, yet my MGP distance is still wildly inflated.

Link to comment

I contacted both MGP and PGC to ask them about this.  I never received a response from PGC, but MGP did respond quickly and I provided information to them so they could work through the difference between the two websites.  They did identify and fix a bug that was resulting in MGP counting locationless caches (it should not have been), plus they "upgraded" the distance formula.  They asked me to upload my stats again to see how the distance between two caches found in the same day and the total distance between all caches compared with those values calculated by PGC.  After doing this, I found that the distance-related results from MGP are now in very good agreement with those provided by PGC.

 
Specifically,
  • PGC has shown all along that the longest distance between two caches found by me on the same day was on 08/25/14, with a distance of 1,726 miles.  With the exclusion of the locationless caches, the corrected MGP now shows the same date as PGC, with a distance of 1,723.45 miles, or a difference of less than 3 miles (0.2 %).
     
  • For the total distance between caches in the order found, PGC currently shows 261,889 miles (1.096 times the distance to the moon), while the corrected MGP shows a distance of 262,036.39 miles (1.1 times the distance to the moon).  This represents a difference in total distance of 147.9 miles (or 0.06 %), with the corrected MGP showing the higher value.  But, while the MGP calculation is based upon data downloaded from Geocaching.com today (10/30/19), the PGC calculation is based upon data from 10/23/19 and therefore does not include four finds I made and logged since then.  So this accounts for some of the difference, if not all.
So, in conclusion, it looks like this difference in distance-related stats between the two websites has been fixed.  Curiously, the total distance shown by the PGC challenge checker is slightly greater than that shown in the table on the My Profile Stats page on the PGC website (a difference of about 573 km), but I am not worried about that small difference.  My main concern was the big difference between the two websites, and I will ask my friends who questioned the challenge checker results to run their stats on MGP again, so they can see that the two are in reasonable agreement.
 
Edited by Old River Runner
Link to comment

I just generated a fresh MyFinds PQ and updated MGP. PGC is updated enough to reflect my most recent finds, so both services should be working with mostly the same information (with the exception of one or two retracted caches I found, which I believe PGC won't know about). My numbers now compare as follows:

MGP: 153010 km

PGC: 153106 km

 

Ideally, these numbers should match, but retracted caches and slight differences in the way the calculations are being performed could explain the slight difference. I'd say they're now close enough to be considered in agreement for most purposes. Thanks for getting in touch with MGP and triggering them to fix the issue. I don't have any Locationless finds, so that likely wasn't the issue. It must have been the "upgraded" formula that fixed it.
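To put "close enough" into numbers, here is a small sketch comparing the MGP and PGC totals posted in this thread before and after the fix. The helper name is invented, and note that the two pairs of figures were taken on different dates, so the PGC baseline itself moved slightly.

```python
def percent_difference(value: float, reference: float) -> float:
    """Signed difference of value from reference, as a percentage of reference."""
    return 100.0 * (value - reference) / reference


# (MGP km, PGC km) as posted earlier in the thread and in this post.
before_fix = percent_difference(194_306, 152_952)
after_fix = percent_difference(153_010, 153_106)

print(f"before the fix: {before_fix:+.2f} %")
print(f"after the fix:  {after_fix:+.2f} %")
```

That is a change from MGP reading about 27 % high to reading well under a tenth of a percent low, relative to PGC.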

Link to comment

The numbers for both are going to be incorrect; they both average about 10% high for most people. This has nothing to do with great circle vs. ellipsoid or any details of the calculations; it's just that before the last couple of years, the data had no way of specifying the exact order of your finds.  Add to that some old, ridiculously-placed coordinates for caches that were far from the actual location, and the algorithm always gives a longer distance than would be correct.

 

I keep my own database with the correct order and location of my finds, so I can get the error of the PGC estimate.  In my case, the actual distance is 781,025 miles, while PGC says it is 804,652 miles, an overestimate of almost the circumference of the Earth!

 

In short:  Both estimates are wildly inaccurate and the differences are not important.

Edited by fizzymagic
Link to comment
8 hours ago, arisoft said:

 

Generally, differences are irrelevant. But when you try to qualify for a challenge based on the distance, you must use the same algorithm as PGC.

 

I agree.  This is akin to the old conundrum, "If you have two clocks, which one has the correct time?"  In this case, it doesn't matter, as long as the same "clock" (i.e., method for calculation of distance) is used consistently.  Geocaching HQ has mandated that challenge caches have challenge checkers from PGC, so that is our "clock" for this case.  I was just curious as to why there was such a large difference between the two website results.

 

Edited by Old River Runner
Link to comment
20 hours ago, fizzymagic said:

...it's just that before the last couple of years, the data had no way of specifying the exact order of your finds.

 

I'm curious about this. I assumed these systems would use the log ID to determine the order of finds (I believe this is what GSAK does), and the log IDs aren't new. Why wouldn't they be using these IDs, and what changed a couple of years ago?

Link to comment
25 minutes ago, The A-Team said:

 

I'm curious about this. I assumed these systems would use the log ID to determine the order of finds (I believe this is what GSAK does), and the log IDs aren't new. Why wouldn't they be using these IDs, and what changed a couple of years ago?

 

It is nearly impossible for log IDs to reliably correspond to find order. You have to 100% log caches in exactly the order you find them. If you have a log deleted and you reinstate it: log ID out of order.  If you log some caches by phone and some online: log IDs out of order.  If you mark a challenge cache as a note and later log it as a find: log IDs out of order. If you happen to remember a find you forgot to log during a long caching day and do it later: log IDs out of order.  None of these situations is correctable since log IDs are not changeable. Using log IDs to indicate find order is inherently unreliable.

 

I have been complaining about this issue since about 2003.  The field used in the database has always been a date/time field; it has always (in principle) been possible to assign times to finds so that the actual order is correct.  Apparently HQ thought that would be too difficult for users; however, starting a couple of years ago, when you log a find on the phone apps it includes the time.  There are still ambiguities, however.  If you forget to log a cache on the phone and do it later, the times will be out of order.  AFAICT, it is impossible to edit the log times of finds.  That's one reason I keep my own independent database of my finds.
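Here is a toy sketch of why the ordering matters for the total. Everything in it is hypothetical: three made-up finds on the same day, a `true_seq` field standing in for a personal record of the real order, and log IDs where the second find was logged last.

```python
import math
from typing import NamedTuple

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is an assumption


class Find(NamedTuple):
    name: str
    found_on: str  # log date only, with no time of day
    log_id: int
    true_seq: int  # hypothetical: real order kept in a personal database
    lat: float
    lon: float


def haversine_km(a: Find, b: Find) -> float:
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin((phi2 - phi1) / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def total_km(finds) -> float:
    return sum(haversine_km(a, b) for a, b in zip(finds, finds[1:]))


finds = [
    Find("A", "2019-10-30", 100, 1, 0.0, 0.0),
    Find("B", "2019-10-30", 300, 2, 0.0, 1.0),  # found second, logged last
    Find("C", "2019-10-30", 200, 3, 0.0, 2.0),
]

by_log_id = sorted(finds, key=lambda f: (f.found_on, f.log_id))  # A, C, B
by_true_order = sorted(finds, key=lambda f: f.true_seq)          # A, B, C

print(f"log-ID order:  {total_km(by_log_id):.1f} km")
print(f"true order:    {total_km(by_true_order):.1f} km")
```

In this contrived case a single late log inflates the total by half, because the route doubles back. Real-world effects will depend on how often logs are out of order and how far apart the affected caches are.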

Edited by fizzymagic
Link to comment
3 minutes ago, fizzymagic said:

It is nearly impossible for log IDs to reliably correspond to find order. You have to 100% log caches in exactly the order you find them. If you have a log deleted and you reinstate it: log ID out of order.  If you log some caches by phone and some online: log IDs out of order.  If you mark a challenge cache as a note and later log it as a find: log IDs out of order.

 

I have been complaining about this issue since about 2003.  The field used in the database has always been a date/time field; it has always (in principle) been possible to assign times to finds so that the actual order is correct.  Apparently HQ thought that would be too difficult for users; however, starting a couple of years ago, when you log a find on the phone apps it includes the time.  There are still ambiguities, however.  If you forget to log a cache on the phone and do it later, the times will be out of order.  AFAICT, it is impossible to edit the log times of finds.  That's one reason I keep my own independent database of my finds.

 

Gotcha.

 

I try really hard to keep my finds in the correct order (deletions are very rare and I always log via the website), so I hadn't considered this aspect. I can only think of a few cases where I know one of my logs is out of order, and the difference in the distance from the "correct" order in those cases is pretty minimal and effectively negligible for me. However, I can understand how others may have more cases of out-of-order logs and the distances can certainly add up over time.

Link to comment
