
Archived Caches


Bisanabi


I just got back from a depressing time in Ft Lauderdale. I ran a recent PQ before leaving, loaded it into GSAK, filtered out all found (none) and disabled caches, and loaded my GPSr. I then spent lots of time searching for four caches that had been archived long ago. Why? I had loaded the area last year, and the caches had been archived since. The recent PQ did not include ALL caches, so the archived caches were still listed as active.

 

Temporarily disabled caches show up, but once they are actually archived, I cannot find a way to access them. This leaves "hanging caches" in my GSAK listing that I cannot easily get rid of. I can always sort by GPX date and delete the ones that haven't been updated recently, but I'd rather keep them and update their status, as well as store the log explaining why each was disabled and archived. Is there a way to download archived caches? If not, can this easily be added as an option?

 

Bisanabi


In order to keep my GSAK up to date, after loading an area, I check that area for records that have not been recently updated.

Then I just click on the "Online waypoint URL" for each of these OLD caches.

If the cache has been archived, I can download the individual GPX for that cache or delete the record (if I don't want to maintain the entire record.)

If you have lots of OLD caches, this can take quite a while, but it can be done.

 

GC.com has decided not to provide easy access to archived caches.

Use fresh data. I load new pocket queries before going on trips, and never have a problem.

Our problem came with getting the caches in the first place. Figuring out the center point of the circle and the radii to cover the area we wanted, then dividing the couple thousand caches up by date, etc., can be time consuming. This is one of the reasons many of us keep an offline database. Then, to pare down the number of PQs, we use the "changed in last 7 days" option, but this complicates the weeding out of archived caches.

 

What happens is that caches that don't get visited very often get mixed in with the archived ones, so you have to check them manually on the site.

 

Instead of fighting the scrapers, Jeremy could establish a page something like ...geocaching.com/archived/?wp=gcxxxx whose result is "archived" or "not archived." I'm sure Clyde of GSAK would jump all over that for weeding out archived caches in the program. It's simple, low impact, gives away only two pieces of information, and gets the job done.
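A quick sketch of the client side of that hypothetical page, in Python. The URL and the two response bodies ("archived" / "not archived") are only the suggestion made above, not a real geocaching.com API; `fetch` stands in for whatever HTTP call a program like GSAK would make.

```python
def parse_status(response_text):
    """Map the proposed two-word response body to a boolean archived flag."""
    text = response_text.strip().lower()
    if text == "archived":
        return True
    if text == "not archived":
        return False
    raise ValueError("unexpected response: %r" % response_text)

def prune_archived(waypoints, fetch):
    """Drop archived caches from a list of GC codes.

    fetch is any callable returning the endpoint's response body for a
    waypoint, e.g. an HTTP GET against .../archived/?wp=GCXXXX.
    """
    return [wp for wp in waypoints if not parse_status(fetch(wp))]
```

Because the page would give away only two pieces of information, a client could walk through a whole database of stale records without ever pulling full cache pages.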

 

...or...

 

Two delimited files containing nothing but the waypoint IDs of archived caches: one for all of them, one for those archived in the past month. These could be updated daily.
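Consuming such a file would take only a few lines in any language. A Python sketch, assuming one waypoint ID per line (the post only says "delimited", so the exact format is a guess):

```python
# Sketch of consuming the proposed daily "archived waypoints" file.
# One GC code per line is an assumption; the idea above only specifies
# delimited files containing nothing but the waypoint IDs.

def load_archived_ids(text):
    """Parse the file into a set of upper-cased waypoint codes."""
    return {line.strip().upper() for line in text.splitlines() if line.strip()}

def mark_archived(db, archived_ids):
    """Flag matching records in a local cache database; return the count flagged."""
    flagged = 0
    for code, record in db.items():
        if code.upper() in archived_ids and not record.get("archived"):
            record["archived"] = True
            flagged += 1
    return flagged
```

The monthly file would cover routine daily flagging; the full file could drive an occasional complete sweep.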

 

There are many solutions for having your cake and eating it, too.


So once again, people want THIS SITE to change things to fix a problem in SOMEONE ELSE'S program.

 

Why not ask Clyde to fix the problem with GSAK?

 

Don't get me wrong, I find GSAK a useful tool, but as it stands it encourages people to use stale data.

 

Perhaps by default GSAK should not archive stale data.

Perhaps the GSAK database could include the date a cache was last updated.

Perhaps it should have a "stale data" flag on caches that haven't updated in x days.

Updated doesn't even have to mean a new cache log. The mere presence of the cache in a new PQ could be enough to zero the timer.
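For what it's worth, the "stale data" flag suggested here is a one-line test against the date a cache last arrived in a PQ. A Python sketch (the 14-day threshold and the plain-date argument are arbitrary assumptions; GSAK's actual column is its Last GPX Date):

```python
from datetime import date, timedelta

# Sketch of the suggested "stale data" flag: a cache counts as stale when it
# has not appeared in any PQ for more than max_age_days. Mere presence in a
# new PQ would refresh last_gpx_date, matching the "zero the timer" idea.

def is_stale(last_gpx_date, today, max_age_days=14):
    """True when no PQ has refreshed this cache within max_age_days."""
    return today - last_gpx_date > timedelta(days=max_age_days)
```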

 

Just a few ideas for ways the 3rd party software vendor might address a problem with his software instead of expecting other people to change their software so his will work properly.

Don't get me wrong, I find GSAK a useful tool, but as it stands it encourages people to use stale data.

I agree with Mopar. A sure sign of the Apocalypse. Run, hide the children!

 

While building your own offline database is tempting and fun, and may seem like it would help reduce the load on this site, it is not compatible with the business model that TPTB have decided on.

 

It's not gonna happen. And I've decided that I am OK with that.

 

I've completely quit running regularly-scheduled PQs; I only run one-time queries now. As long as the site keeps the response time for those reasonably low, I am going to remain happy, because I can get current cache information for wherever I am going caching on a given day that same day.

 

There are two exceptions, though -- I think we ought to be able to get a complete record of the caches we've found and of the caches we've hidden, archived or not. The former should include all logs by us, and the latter should include all logs by anyone. It seems to me that we have some ownership of that data, and I think it would be considerate of the site to allow improved access to it.

...instead of expecting other people to change their software so his will work properly.

ROFL!

 

So, it's GSAK's fault because it's not getting complete information? That's a laugh.

 

If Jeremy didn't want GSAK-type database programs out there, why did he offer the option to include only caches changed in the last 7 days? It wouldn't make sense otherwise.

 

None of your suggestions are viable, because if you're doing it like me and getting only the caches recently visited or changed, then those that aren't visited often would get flagged wrongly.

 

No, the solution is to fix what is broken. GSAK and programs like it are here to stay. Instead of blindly following the wrong path, don't you think it's about time to recognize that something needs to be fixed? All you need to do is give GSAK the information it needs to work properly.

So once again, people want THIS SITE to change things to fix a problem in SOMEONE ELSE'S program.

 

Why not ask Clyde to fix the problem with GSAK?

 

This is not just a problem for GSAK; it's an intrinsic problem of updating any data set when you aren't told why a given record stops arriving. Such an archived waypoint will never come in a PQ, and so may never be removed from whatever hardware or software system you use, perhaps even your GPSr, however it manages its waypoint database.

 

GSAK is not encouraging people to cache with stale data, but they do need to know how to use such a powerful database to avoid coming to the wrong conclusions.

 

Perhaps by default GSAK should not archive stale data.

 

Yep, and that's the problem: how will GSAK know a cache is stale, since gc.com won't send data indicating which caches should be marked stale?

 

Perhaps the GSAK database could include the date a cache was last updated.

Perhaps it should have a "stale data" flag on caches that haven't updated in x days.

Updated doesn't even have to mean a new cache log. The mere presence of the cache in a new PQ could be enough to zero the timer.

 

GSAK has the date of last update (GPX and user), and I believe the Last GPX Date is on the default view, too (it's been on mine for a while). People need to be sure to include that in their filters, and then those caches will not display. However, due to the uncertainties of the PQ system, it is possible that a cache is not updated (does not come in a PQ) because it no longer falls within the 500-cache (or whatever) limit of the PQ, new caches in the radius having pushed it out of the search results. So using such a filter can eliminate caches which are still searchable but simply haven't been updated.

 

Although PQ radius elimination issues are a sidebar here, they can also result in disabled caches not being received (although they were previously received as enabled). Unless you are able to run enough PQs to guarantee the latest information on your trip, and don't cache near the boundaries of your PQs, there are other effects like this you could run across. With the 5-PQ daily limit in a very cache-dense area, you could find yourself caching at the boundaries, as I did on my trip to the UK last summer: I had PQs centered on Waddesdon, London and Cambridge, but still had some dead zones where the 500-cache limit did some funny things.

 

Just a few ideas for ways the 3rd party software vendor might address a problem with his software instead of expecting other people to change their software so his will work properly.

 

I don't think the majority of GSAK users would like to simply delete all the caches in their databases which haven't come from gc.com lately. People are using GSAK more and more for other kinds of waypoint management (not just geocaching and related games). In addition, it is possible that you want to keep a cache in the database because of its history. So a more accommodating approach would be to get archived cache information in an automated way so that caches could be updated more easily.

 

If people want to keep their cache data updated or clean out stale caches, it is still a manual process to archive or delete the caches, or to get the GPX file for each archived cache by hand. GSAK does have macros and filters to make it easier to keep your data clean and updated, but the point is still valid: gc.com should provide a status update service for archived caches (and perhaps, to some extent, for newly disabled caches in the border area of a PQ, although these can be extracted with yet another PQ, which should not hit the count limit before the radius). We'd be talking about a status-data subset of the GPX schema.


A third-party program such as GSAK can identify archived caches from the existing PQs without any ambiguity. So I don't understand all the fuss. Just because GSAK doesn't identify those caches for you doesn't make it geocaching.com's problem.

 

Seriously. The algorithm is not that complicated. There are some things that geocaching.com could do to make it a little easier (like putting the PQ name in the PQ, for example), but those do not involve adding archived caches to regular PQs.

 

In fact, I don't see any significant issues that have been brought up of late that cannot be solved by third-party software. That includes keeping more than 5 logs, corrected coordinates, your own notes, intermediate waypoints, etc. The only one I cannot see how to solve with third-party software is the issue of getting all logs for a set of caches. Everything else is quite solvable -- in fact, I have done much of it myself.

A third-party program such as GSAK can identify archived caches from the existing PQs without any ambiguity.  So I don't understand all the fuss.  Just because GSAK doesn't identify those caches for you doesn't make it geocaching.com's problem.

 

Seriously.  The algorithm is not that complicated.  There are some things that geocaching.com could do to make it a little easier (like putting the PQ name in the PQ, for example), but those do not involve adding archived caches to regular PQs.

 

In fact, I don't see any significant issues that have been brought up of late that cannot be solved by third-party software.  That includes keeping more than 5 logs, corrected coordinates, your own notes, intermediate waypoints, etc.  The only one I cannot see how to solve with third-party software is the issue of getting all logs for a set of caches.  Everything else is quite solvable -- in fact, I have done much of it myself.

Please share your magic algorithm which knows how a cache which isn't in a given PQ would have been there if and only if it were not archived the day before.

Please share your magic algorithm which knows how a cache which isn't in a given PQ would have been there if and only if it were not archived the day before.

Easy.

 

R = maximum distance from the origin of the PQ to the caches included in the query results. (NOTE - this is generally not the "maximum radius" you specified.)

r = distance from PQ origin to the cache in question.

 

If r < R and the cache is not in the query results, it is archived.

If r > R then you need to run another query with different coordinates to determine if the cache is still active.

 

Simple, see? A third-party app can quite easily go through the "missing" caches from a previous PQ and determine whether they are archived or they have moved outside the PQ circle.
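The check described here is indeed short to write down once the origin is known. A rough Python illustration (the haversine great-circle distance and the 3958.8-mile Earth radius are standard; the dict-based data structures are made up for the example):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in statute miles."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(a))

def flag_archived(old_db, pq_results, origin):
    """Old caches closer to the origin than the farthest PQ result, yet absent
    from the PQ, are presumed archived (the r < R case above). Caches beyond R
    cannot be judged from this PQ alone. Both old_db and pq_results map
    GC code -> (lat, lon); origin is (lat, lon).
    """
    R = max(haversine_miles(*origin, *pos) for pos in pq_results.values())
    return [code for code, pos in old_db.items()
            if code not in pq_results and haversine_miles(*origin, *pos) < R]
```

Everything hinges on knowing `origin`, which is exactly the point debated in the rest of the thread.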

 

The "problem," if there is one, with GSAK is that it considers the PQ an addition to its existing database, instead of considering the PQ the base data and adding additional information to (or inferring additional information from) it.

Edited by fizzymagic
Please share your magic algorithm which knows how a cache which isn't in a given PQ would have been there if and only if it were not archived the day before.

Easy.

 

R = maximum distance from the origin of the PQ to the caches included in the query results. (NOTE - this is generally not the "maximum radius" you specified.)

r = distance from PQ origin to the cache in question.

 

If r < R and the cache is not in the query results, it is archived.

If r > R then you need to run another query with different coordinates to determine if the cache is still active.

 

Simple, see? A third-party app can quite easily go through the "missing" caches from a previous PQ and determine whether they are archived or they have moved outside the PQ circle.

 

The PQ definition is not in the PQ result file, thus neither the origin nor radius of the original request are known.

 

Would that be something you would consider a valid request of gc.com (PQ definition information in the schema of the PQ-based GPX files)?

 

The "problem," if there is one, with GSAK is that it considers the PQ an addition to its existing database, instead of considering the PQ the base data and adding additional information to (or inferring additional information from) it.

 

Actually, it does both add and update. What it cannot do is know data outside of its database and outside of the PQ file.

 

I currently have a fully functional macro and filter to find caches like this and archive or delete them myself (based on date of GPX update and an arbitrary area which I know is safely covered by PQs).

 

And in the end, your proposed solution would only work for origin/radius PQ-based GPX files, not for PQ-based GPX files in general, nor for non-PQ-based GPX files (of which, currently, there is only the singleton file from the cache page, and that one does indicate complete cache status and logs, whether archived or not). And I would hesitate to enable that algorithm for someone who is combining PQ circles to find caches along a route without better tools to stop it from archiving the caches it just loaded due to the overlapping circles.

 

There is, of course, the general solution, which is fully and only within gc.com's abilities (a new Is Archived filter on the PQ form, sending only the GC waypoint identifier and archive status for those caches). And this would be especially useful for people other than GSAK owners, who have to cull their paper binders and don't have such a powerful database tool at their disposal.

 

In the global scheme of things, having better tools on the site to filter caches and make hitlists (you still can't download bookmark lists as GPX files, and there are no map-based tools to select multiple caches and add them to a bookmark list, etc.) would tend to eliminate the need for GSAK or Watcher or any of these tools for a lot of people who are just trying to organize their lists of caches to hunt. This request is one of many which would be completely mitigated and made unnecessary by the evolution of better tools on the site to organize people's caching needs.

Please share your magic algorithm which knows how a cache which isn't in a given PQ would have been there if and only if it were not archived the day before.

Easy.

 

R = maximum distance from the origin of the PQ to the caches included in the query results. (NOTE - this is generally not the "maximum radius" you specified.)

r = distance from PQ origin to the cache in question.

 

If r < R and the cache is not in the query results, it is archived.

If r > R then you need to run another query with different coordinates to determine if the cache is still active.

 

Simple, see? A third-party app can quite easily go through the "missing" caches from a previous PQ and determine whether they are archived or they have moved outside the PQ circle.

 

The "problem," if there is one, with GSAK is that it considers the PQ an addition to its existing database, instead of considering the PQ the base data and adding additional information to (or inferring additional information from) it.

Fizzymagic, this is really exciting. Since you know this to be simple and I know you to be a reasonable person, it looks like a solution is near for me. (As I have indicated in another thread, it is not a big issue for me, but if it is indeed simple, then I will be able to implement this)

 

You speak of R, knowing that it is not the number specified in the query. This number is also not in the PQ result. So where will you get it? Please, I honestly want to know how you will calculate it from the only bit of data in the GPX file that could be used for this, a line like:

<bounds minlat="38.24365" minlon="-122.448833" maxlat="39.457283" maxlon="-121.42645" />

(I'm sure you are aware that this is a trapezoidal area on the surface of a sphere which does NOT circumscribe the circle in question.)

 

Oh and one bit of information you ignored also comes back to bite you: you need to know the origin of the circle. It is also not in the PQ results and would also need to be determined.

You speak of R, knowing that it is not the number specified in the query. This number is also not in the PQ result. So where will you get this number?  Please - I honestly want to know how you will calculate it

 

R is the distance from the origin of the PQ to the point in the PQ furthest from the origin. The bounding box does not need to be used. And the maximum radius you gave the PQ is also irrelevant.

 

Oh and one bit of information you ignored also comes back to bite you: you need to know the origin of the circle. It is also not in the PQ results and would also need to be determined.

 

While it would be convenient for this information to be included in the PQ (and there is really no excuse for why it is not), it can be estimated quite accurately from the PQ data.

 

The key here is that the PQ results are ordered by distance from the origin. Thus, you know for each cache returned in the PQ that it is further from the origin than the previous one and closer than the next one. Using this information, you can do a maximum-likelihood estimate of the position of the origin. This estimate does not require that the caches be distributed evenly around the origin; in fact, all it requires is that the results are not all in a straight line and on one side of the origin. My guess is that using this technique one can get the position of the origin to an accuracy better than 0.1 mile.

 

Once you have the origin, R is trivial to calculate and the archived caches can be identified as I described above. Since I personally don't care about this information, I haven't bothered to implement it, but if somebody would like to pay me, I would be more than happy to code it up.
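To make the estimation step concrete: the sketch below is NOT fizzymagic's implementation (he says he never wrote one); it is a planar, brute-force illustration of the same idea. The true origin must make the returned points come out in non-decreasing distance order, so we grid-search for the candidate that violates that ordering least. A real implementation would work on the sphere and use a proper optimizer rather than a grid.

```python
from math import hypot

def ordering_violation(candidate, ordered_points):
    """Total amount by which the PQ's distance ordering is violated at candidate."""
    d = [hypot(px - candidate[0], py - candidate[1]) for px, py in ordered_points]
    return sum(max(0.0, d[i] - d[i + 1]) for i in range(len(d) - 1))

def estimate_origin(ordered_points, lo=-2.0, hi=2.0, steps=81):
    """Coarse grid search over the square [lo, hi]^2 for the best candidate."""
    step = (hi - lo) / (steps - 1)
    grid = [(lo + i * step, lo + j * step) for i in range(steps) for j in range(steps)]
    return min(grid, key=lambda c: ordering_violation(c, ordered_points))
```

Note that any point satisfying all the ordering constraints scores zero, so the data pin down a region of feasible origins rather than a single point.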


Fizzymagic, that's disappointing. I thought maybe it WAS simple and you could share with us how. I do a lot of data fitting, and I would not characterize that step as simple for anyone. Also, your caveat excludes many of the PQs I run. You say the data should not be a linear group. How about the following; is this too linear?

 

Create a PQ with its center off in the Pacific Ocean and a radius just large enough to grab the San Francisco Peninsula, in such a way that the circle goes through the Bay, grabs as little of Marin County and the city of Santa Cruz as possible, yet gets some of the city of San Francisco and down past Palo Alto and surrounds, while specifically excluding San Jose and Fremont. For the others reading this: note that this area is fairly dense with caches to the north and center, a bit sparse to the southwest; the general shape of these data will be like a fat sausage. I see no way you would be able to extract the correct origin and radius from the data you receive.

 

Ditto for this one: center on Soda Springs, CA (in the Sierra Nevada on I-80 toward Nevada); radius fairly large, to include a lot of mountain area (extending well into Nevada past Reno and Carson City), BUT restrict the PQ to return only California caches. I doubt you can calculate the radius or center from the PQ results with any degree of reliability. My point is that the estimate of R you would make with data-fitting techniques is likely to be a poorer estimate than the number you already know to be too large (the one you typed into the PQ specification). The problem here is sparseness of data and the fact that it is not a nice circular group of points.

 

There are many other PQs that I run routinely as part of geocaching where plotting the data does not produce a circular "cloud" of points on the map, thus defeating your approach. I think you are overestimating the simplicity of something I consider virtually impossible to do with any useful degree of precision (I work with data fitting and modeling in my profession). I am not interested in paying you anything, since that is apparently a waste of our time; I was hoping you would be able to share a reliable method that had the attributes of being "Easy" and "Simple."
Fizzymagic, that's disappointing. I thought maybe that it WAS simple and you could share with us how.

 

There are many other PQs that I run routinely as part of geocaching where plotting out the data does not result in a circular "cloud" of points on the map, thus defeating your approach.

The "linear" examples you gave me were not linear; they were half-planes. The method I outlined would work fine for those. The only situation for which it will not work is one in which the waypoints are all in a line pointing directly away from the PQ center and all on one side.

 

On further reflection, I realized that the problem is a linear-constraints problem. Find the centroid of the polytope given by the ordering constraints and you'll have the center of your PQ.

 

But I won't be involved in any such solution.

 

Let's review, shall we?

 

The problem I described as "simple" is the problem of determining archived caches from a PQ given a known origin for that PQ. That problem remains simple and easy.

 

However, I (apparently erroneously) assumed that since the person who made the PQ knows the origin of the PQ, it could be used by a third-party app to determine which caches are archived.

 

It appears that the task of entering those coordinates is too taxing for most users.

 

So I presented a method for estimating those coordinates only from the results of the PQ. This is no longer trivial, but it is not horrifically difficult.

 

In thanks, I get attacked for presenting it. I can certainly see why Jeremy is so anxious to accommodate your requests. Given the tone of this and other posts, I am no longer willing to discuss this issue.


Here's the filter criteria I use in GSAK for "Likely archived":

Distance: Less than or equal to 25 miles

Found Status: Not Found / Exclude caches placed by me

Available Status: Available

Last Update GPX date: Not during the last 7 days

 

What I end up with, assuming there are any caches that fit the criteria, is a list of caches within 25 miles that haven't been downloaded in a PQ in the last 7 days. I could narrow the Last Update period, but I don't have a full not-found PQ sent every day. The mileage needs adjusting from time to time as the range of the closest 500 not-found caches fluctuates.
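Those four criteria translate almost line-for-line into code. A Python sketch over plain dicts (the field names are hypothetical stand-ins for GSAK's columns):

```python
from datetime import date, timedelta

def matches_filter(cache, today, my_name, max_miles=25, max_age_days=7):
    """Mirror the "Likely archived" filter above: nearby, unfound, not placed
    by me, still marked available, but absent from every PQ for over a week."""
    return (cache["distance_miles"] <= max_miles
            and not cache["found"]
            and cache["placed_by"] != my_name
            and cache["available"]
            and today - cache["last_gpx_date"] > timedelta(days=max_age_days))
```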

 

It's not perfect, but it's close enough for government work. I have a similar filter for caches I've found. The reason I maintain found caches as well is that there are times when I want to know if a cache goes missing, so I can look in on it if I'm in the area. Also, if an archived cache's owner is no longer active, I could pick it up to remove the geo-litter from the landscape.

 

If I received a PQ of archived caches within a certain mile radius, it would eliminate any guess work or manual flagging of the caches I do now.

So I presented a method for estimating those coordinates only from the results of the PQ.  This is no longer trivial, but it is not horrifically difficult. 

 

In thanks, I get attacked for presenting it.  I can certainly see why Jeremy is so anxious to accommodate your requests.  Given the tone of this and other posts, I am no longer willing to discuss this issue.

Your input is very welcome, but there are plenty of PQs - such as caches in a state, caches from a watchlist, locationless caches, caches spanning multiple states, or caches broken up by the placed-date partition workaround - for which the technique just doesn't work. Because it doesn't work in general, and because you can't tell from a PQ-generated GPX file what the query parameters were, I don't see how this is useful outside a special case.

 

People can already use existing GSAK macros and filters to avoid seeing caches which are most likely to be archived and avoid hunting them, and they need to learn about those features and use them first before asking for features from gc.com.

 

However, when people request some kinds of things here, they are told that external tools already do this, so use the external tools. So people use GSAK and other tools because they offer features not available on gc.com - in the discussion below, I have indicated such potential features with (n).

 

There is a limit to the number of PQs per day (1) and caches per PQ (2), so there is a very real possibility that any tool combining PQs for a route will have archived caches in them. Because you cannot request a PQ just for a bookmark list (3) or a user-defined set of caches (4), nor does the site offer a caches along a route function (5), you have no way of re-checking all the caches along a route in a timely manner. And so, thinking around the problem, a file of archived caches in the last week for the whole country would be small and could easily be used to automatically remove all these caches from the list.

 

This request stems from working around five potential features on the site which do not exist. My personal opinion is that implementing some or all of those features would be preferable and please more people and be more useful, but I would support getting a list of archived caches if none of those are deemed higher priority. (1) and (2) are not likely to change. (3) has been promised for a long time, (4) or a related web service to pick up cache status may or may not be on the table (and scraping is against the TOU) and for (5) we have been told to use external tools (which all, including Watcher, have this problem).

In thanks, I get attacked for presenting it.  I can certainly see why Jeremy is so anxious to accommodate your requests.  Given the tone of this and other posts, I am no longer willing to discuss this issue.

Apparently I need to apologize, because in my excitement and disappointment I have come across as an attacker. I really hate it when I make suggestions in the forums and folks jump all over me for something-or-other. I certainly had no such intent, and if it seemed that way then I apologize to you, Fizzymagic, and will try to craft my messages with less emotion. The feedback has been valuable for me, even if it did not generate the exact information I was hoping for. I honestly hope you will stick around to discuss the issue. I recall some of your other posts, and I have always been impressed with your analytical skills.

 

Unfortunately, Jeremy is NOT anxious to accommodate our requests and has made that pretty clear. So we are indeed in a position where we look elsewhere for a solution to the annoyance of occasionally searching for an archived cache.


*** LOOK HERE, THE SOLUTION IS IN THIS MEANDERING POST ***

 

Call me crazy, but Fizzy's method actually made sense to me. It reminded me of a trip to LA last year.

 

You see, I was going to be back and forth to the west coast regularly for about a month. The hotel that my company had me in had no internet connection, so I had to bring all the caches with me, loaded to my laptop. What I did was divide the area into four PQs. All were loaded into my pda's Plucker app, but my GPSr could only hold 500 waypoints.

 

My solution was to keep the files separate and load them all into MS S&T, with different colored tacks for each of the four files. Obviously, at the fringes, you would have the same waypoint in more than one file. When loaded into S&T, the last file's color showed, while the earlier ones were masked.

 

Now for my totally non-mathematical solution to this dilemma, incorporating what Fizzy taught us.

 

I can simply dump all of my 'historical' waypoints from GSAK to S&T. Next, I can drop in new PQs, with a different color tack. If there are any of the 'historical' waypoints showing within the spread of the new PQ's tacks, they are archived.

Edited by sbell111

Something folks keep missing: if you're trying to pare down the number of PQs with the "changed in past 7 days" option because of the 500-cache limit, then many of the in-GSAK solutions aren't solutions at all. Simply put, you don't know why any particular cache is not included in a PQ; it could be because it's archived, or simply because it wasn't visited in the previous 7 days.

 

The reason I use the "last 7 days" option is to cut the number of PQs I run a week by nearly half.


Personally, I wish the "updated within 7 days" PQ option would not include caches that were merely found; it seems like every week half the caches in the area are found, and this really cuts down on the PQ's radius. In my area, a 500-cache PQ set to show everything updated within a week will have only a 15-mile radius on a good day. But I would like it to include the caches that were archived in the last 7 days.

So once again, people want THIS SITE to change things to fix a problem in SOMEONE ELSE'S program.

 

Why not ask Clyde to fix the problem with GSAK?

 

Don't get me wrong, I find GSAK a useful tool, but as it stands it encourages people to use stale data.

Sorry to cause a stir. I knew I shouldn't have mentioned GSAK by name.

 

My dad has a Garmin and loads a PQ into MapSource. Nice icons based on cache type and availability. Before going out for a weekend cache run, he loads another PQ of the area over top of the old one. This adds new caches, leaves alone the ones on the fringes (which have been dropped from the PQ due to the new caches), and updates the status of each "temporarily disabled" cache. "Archived" caches are still shown as valid caches!

 

My friend has a Magellan and does similar stuff with MapSend. Same result.

 

Darn all these third party software packages for having the same problem!! [:rolleyes:]

 

In one breath, it sounds like we shouldn't be using third party software because they are screwing up GC's PQ files.

 

In the next breath, it sounds like GC is delinquent in this capability, and we need to acquire third party software to make up for it.

 

I'm not casting blame. I'd just like to see a checkbox when choosing cache type: active, not active, REALLY not active, etc. I don't see any harm this could possibly cause, and it would simplify things for non-computer-savvy people.

 

Myself, I would like to have the logs explaining why the caches in my query were archived. I would not send others to my favorite caches if they didn't exist anymore. Since we aren't allowed to scrape the site for these non-updated caches, we have to search for each by name, select the GPX download icon, and grab them individually. Jeremy isn't preventing us from accessing the data, so why tie up resources by having us do it online, one by one, over a long period of time? How often have you gotten server load errors? Give me a simple checkbox, send me my PQ during off-peak hours, and I'll read the logs offline the next day over lunch (when I don't even have a network connection available!)

 

Anyways, my question was answered. It can't be done on this site, and the likelihood of it being implemented is nil. Oh, well...

 

Bisanabi

Link to comment

Fizzymagic has the answer (I think; I'm not completely sure I followed all he said, as some of the terms used were new to me). Here is the macro I use to check for archived caches after each PQ load. It checks all the caches within the range of the PQ and filters out all disabled, archived and updated caches.

FILTER Name="User Flag Set"
SORT By="Distance"
GOTO Position=bottom
SET $DistOut=$d_Distance #This finds the farthest cache from center
CANCELFILTER
MFILTER IF=($d_Distance<=$DistOut) .and. .not. ($d_TempDisabled .or. $d_Archived) .and. .not. ($d_UserFlag)

To use it, make sure the load sets the user flag on caches in the PQ. The center point of the DB must match the centerpoint of the PQ (I generally use my home co-ords). The Filter "User Flag Set" just filters to those with the user flag set (hence the name). Not all the caches listed will be archived, as there may be caches equal to the farthest distance but outside the number limit of the PQ (this is one point where Fizzymagic's theory breaks down - not everything is < or > R). I also only grab active caches, so some are merely disabled. After running this I have a short list to check on GC.com to see the status.
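The same logic can be sketched outside GSAK. Below is a minimal Python version of the idea, with hypothetical field names (`distance`, `in_latest_pq`, and so on) standing in for whatever your offline database actually stores:

```python
# Sketch of The Jester's filter: any cache closer to the PQ center than the
# farthest cache the PQ returned, yet absent from the latest PQ, is suspect
# and worth checking on GC.com for archived status.

def suspect_caches(db):
    """db: list of dicts with hypothetical keys
    'name', 'distance', 'in_latest_pq', 'disabled', 'archived'."""
    # The farthest cache that DID come back in the latest PQ defines the radius.
    radius = max(c["distance"] for c in db if c["in_latest_pq"])
    return [
        c for c in db
        if c["distance"] <= radius                 # inside the PQ circle
        and not c["in_latest_pq"]                  # but missing from the PQ
        and not (c["disabled"] or c["archived"])   # and not already flagged
    ]

caches = [
    {"name": "GCAAAA", "distance": 1.0, "in_latest_pq": True,  "disabled": False, "archived": False},
    {"name": "GCBBBB", "distance": 2.5, "in_latest_pq": False, "disabled": False, "archived": False},
    {"name": "GCCCCC", "distance": 4.0, "in_latest_pq": True,  "disabled": False, "archived": False},
    {"name": "GCDDDD", "distance": 9.0, "in_latest_pq": False, "disabled": False, "archived": False},
]
print([c["name"] for c in suspect_caches(caches)])  # -> ['GCBBBB']
```

As in the macro, caches at exactly the radius may be false alarms (the PQ's cache-count limit can cut off ties), so the output is a short list to verify, not a definitive answer.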

 

The main reason I maintain a database of caches is to get all the logs for each cache (at least since I first loaded it). Sometimes updated co-ords are more than five logs down.

Link to comment
So once again, people want THIS SITE to change things to fix a problem in SOMEONE ELSE'S program.

Sorry. The problem is with GC.com in this instance.

 

If I have PQs set up, with the "last 7 days" option active, how is GSAK supposed to know why a particular cache didn't download (because it was archived, rather than simply not accessed in the last 7 days?) GSAK isn't magic, it can't jump to the correct conclusion without the required data.

 

The "last 7 days" option has to include caches that were archived in the last 7 days as well.

Link to comment
...If I have PQs set up, with the "last 7 days" option active, how is GSAK supposed to know why a particular cache didn't download (because it was archived, rather than simply not accessed in the last 7 days?) GSAK isn't magic, it can't jump to the correct conclusion without the required data. ...

The solution described by Fizzy and operationalized by me and The Jester will not work if you take the 'last 7 days' option. Cast your net wider.

Link to comment

I really like SBell's and The Jester's solution to graphically finding archived caches. Maybe somebody will whip together a little app that will do this more automatically.

 

Meanwhile, I have been thinking about the problem, and I propose another algorithm that will detect archived caches, but does not require any knowledge about the PQ center or radius. It will not detect quite as many archived caches as the previous algorithm, but it is just about as simple.

 

The insight for this algorithm is the observation that if a point is outside a circle, all the points inside the circle subtend an angle of less than 180 degrees from its perspective. So if the angle subtended by the other points is greater than 180 degrees, the point is in the circle.

 

Here's how the algorithm works:

  • Pick a point that doesn't appear in a "radial-type" PQ.
  • Divide the 360 degree circle around it into 10-degree wedges (exact size is up to you; 10 degrees is used as an example).
  • Calculate the bearings from the point in question to the waypoints in the PQ.
  • For each of the waypoints in the PQ, mark the wedge that contains the bearing to it as "filled."
  • When you are done, if there are 180 degrees worth of contiguous wedges that are empty, consider the point "outside" the PQ and its archived status is not known. In this example, you would look for 18 empty wedges in a row. Remember to wrap at zero degrees!
  • If there are not 18 empty wedges in a row, then the point is "inside" the PQ circle and it should be marked as archived.

I understand that it is mathematically possible for a point to be marked as "outside" the PQ circle when it is actually inside, but this is going to be a relatively rare occurrence, and can usually be addressed in another PQ. But the key point of this algorithm is that it will never mark a point as "inside" the PQ circle unless it actually is. So those caches that this algorithm reports as archived are guaranteed to be archived.
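For anyone who wants to experiment, here is one possible Python sketch of the wedge test. The function and its data layout are my own invention, and the bearing uses a flat-earth approximation, which should be adequate at PQ scales:

```python
import math

def is_inside_pq(point, pq_points, wedge_deg=10):
    """Fizzymagic's wedge test: if the bearings from `point` to the PQ
    waypoints leave no contiguous empty gap of 180 degrees or more, the
    point is inside the PQ circle.  point / pq_points are (lat, lon)
    pairs; flat-earth bearings are close enough at PQ scales."""
    n = 360 // wedge_deg
    filled = [False] * n
    lat0, lon0 = point
    for lat, lon in pq_points:
        # Bearing from point to waypoint, 0..360 (flat-earth approximation).
        bearing = math.degrees(math.atan2(lon - lon0, lat - lat0)) % 360
        filled[int(bearing // wedge_deg) % n] = True
    # Look for 180 degrees' worth of contiguous empty wedges, wrapping at zero.
    needed = n // 2
    run = 0
    for i in range(2 * n):          # doubled pass handles the wrap-around
        run = run + 1 if not filled[i % n] else 0
        if run >= needed:
            return False            # big empty gap -> point may be outside
    return True                     # surrounded -> guaranteed inside the PQ

# Surrounded on all sides -> inside; all PQ points off to one side -> unknown.
print(is_inside_pq((0, 0), [(1, 0), (0, 1), (-1, 0), (0, -1)]))  # True
print(is_inside_pq((0, 0), [(1, -1), (1, 0), (1, 1)]))           # False
```

A database cache that this reports as inside the PQ circle, yet which is missing from the PQ, can be marked archived.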

 

I hope this is helpful.

Link to comment

The above algorithm can be refined to speed it up for most caches. If you can find a waypoint in the PQ in each of the four quadrants surrounding the waypoint in question, then it is "inside" the PQ and, if it didn't appear in the PQ, it is archived.

 

The four quadrants are defined as:

 

Lat1 > Lat, Long1 > Long

Lat2 < Lat, Long2 > Long

Lat3 < Lat, Long3 < Long

Lat4 > Lat, Long4 < Long

 

Where Lat and Long are the latitude and longitude of the point in question, and Lat1-4 and Long1-4 are the latitudes and longitudes of 4 waypoints in the query.

 

Does that make sense? Once again, there will be caches "inside" the PQ for which this is not true, but every point for which this is true is guaranteed to be inside the PQ. So it is a very quick and simple way to identify archived caches.
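A possible Python sketch of this quadrant shortcut (again, the names and data layout are illustrative only):

```python
def inside_by_quadrants(point, pq_points):
    """Quick refinement of the wedge test: if the PQ contains a waypoint in
    each of the four quadrants around `point`, the point is guaranteed to
    be inside the PQ circle.  A False result is merely inconclusive."""
    lat, lon = point
    quads = set()
    for lat_i, lon_i in pq_points:
        if lat_i == lat or lon_i == lon:
            continue  # on an axis: matches no strict inequality, skip it
        quads.add((lat_i > lat, lon_i > lon))  # one of the four quadrants
        if len(quads) == 4:
            return True
    return False

# Four diagonal neighbors cover all four quadrants -> inside.
print(inside_by_quadrants((0, 0), [(1, 1), (1, -1), (-1, 1), (-1, -1)]))  # True
# Only three quadrants covered -> inconclusive; fall back to the wedge test.
print(inside_by_quadrants((0, 0), [(1, 1), (1, -1), (-1, 1)]))            # False
```

In practice you would run this cheap test first and fall back to the wedge test only for the points it can't decide.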

Link to comment

Oh -- one last thing. The algorithm described above does not require that the PQ you are using to test the points has the same origin as the PQ you originally used to get the points you are testing. The only constraints are:

  • It can only work on one radial PQ at a time; you can't group PQs together and then apply this test using the grouped PQs.
  • The point you are testing would not be excluded from the PQ for any other reason than being too far from the origin or being archived.

This test is now just about as simple as the first one I described. It should be quite easy, given a database of waypoints and a radial PQ, to write a little script that will spit out archived caches from the database.

Edited by fizzymagic
Link to comment

Doesn't the refined algorithm assume that you're either (1) obtaining a pocket query of ALL caches in the area, or (2) that none of the cache's parameters have changed? For example, assume a cache that started its life as a "regular" sized cache. A geocacher who hates anything smaller than an ammo box obtains this cache in a PQ that looks for only caches within 50 miles that are "regular" or "large." A month later, the cache owner switches the size to "small" after receiving complaints that finders can't fit trade goods and travel bugs into the sandwich-sized container. Our ammo-box loving PQ recipient will now obtain an updated PQ that does NOT include this modified cache. Would this not produce a false positive and flag a cache as archived, when in fact it's just been modified?

Link to comment
Doesn't the refined algorithm assume that you're either (1) obtaining a pocket query of ALL caches in the area, or (2) that none of the cache's parameters have changed?

Yeah, that is what I meant by my second constraint: that the only reasons that the cache would not appear in the PQ would be that it was outside the radius or archived.

 

If the parameters of the PQ change to exclude cache types, or only look at caches modified in the last 7 days, or things like that, the algorithm will not work.

Link to comment

Got to agree with Mopar here - If you want fresh up-to-date data - get it from the site. By definition all offline databases will be stale.

 

I think not including archived caches in PQs actually encourages people to return for the "fresh" data.

 

Don't get me wrong - I use GSAK and I like it but I understand that it is simply off-line data and if I want new up-to-date stuff - I have to check with the site. Seems simple to me. I have heard all the arguments for why some think they need large chunks of data that isn't theirs, but the bottom line is: there is only one up-to-date source of the data.

Link to comment
Got to agree with Mopar here - If you want fresh up-to-date data - get it from the site. By definition all offline databases will be stale.

Yeah, I agree too.

 

Since I travel a lot, I dump the GPS and the PDA often. I use GPX Spinner so I can change the icons used for caches. I change all caches to the Garmin provided geocache container icon except micros, which have their own icon. My personal waypoints that always stay on the GPS are the simple "dot" waypoint icon. Spinner will also change the first letter of a waypoint to whatever you want, so traditionals are GCXXXX, virtuals are VCXXXX, multicaches are MCXXXX, puzzles are PCXXXX and so on.

 

All you have to do is delete by symbol and the old geocaching data is easy to remove from the GPS. I load up the new "spun" GPX file, sync up the PDA and off I go. It takes very little time to do this, about 10 minutes at the most for 500 caches. I always have fresh data each time. Since I cache by waypoint only and only read the description if I have to, I know what type of cache it is when I go for it by the icon and the first letter of the GC waypoint. If it is a VCXXXX or an MCXXXX or a PCXXXX I get the PDA out right away. If not, I jump out of the car and go for it. More than half the time I never see the description or the name of the cache until I get home and log the cache online.
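The prefix trick can be sketched in a few lines of Python. The type names and prefixes below are examples only, not Spinner's actual tables:

```python
# Illustrative sketch of re-lettering GC codes by cache type so the type is
# visible right on the GPS screen without opening the description.
TYPE_PREFIX = {
    "Traditional Cache": "GC",
    "Virtual Cache": "VC",
    "Multi-cache": "MC",
    "Unknown Cache": "PC",  # puzzle/mystery caches
}

def respin(code, cache_type):
    """Replace the leading 'GC' with a type-specific prefix."""
    return TYPE_PREFIX.get(cache_type, "GC") + code[2:]

print(respin("GC1234", "Multi-cache"))    # MC1234
print(respin("GC1234", "Virtual Cache"))  # VC1234
```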

Link to comment
Spinner will also change the first letter of a waypoint to whatever you want, so traditionals are GCXXXX, virtuals are VCXXXX, multicaches are MCXXXX, puzzles are PCXXXX and so on.

I prefer to use GTXXXX for traditionals, GVCXXXX for virtuals, GMCXXXX for multicaches, GPXXXX for puzzles, and so on. I like the fact that all the geocache waypoints start with G. The main reason I use GSAK instead of something like Spinner (especially since I wrote my own Spinner-like program) is that I want to use its filters. I use the route filtering a lot to pick the few geocaches close to the route. Depending on the trip, this sometimes requires getting more than one pocket query.

Link to comment
Got to agree with Mopar here - If you want fresh up-to-date data - get it from the site. By definition all offline databases will be stale.

 

I think not including archived caches in PQs actually encourages people to return for the "fresh" data.

 

Don't get me wrong - I use GSAK and I like it but I understand that it is simply off-line data and if I want new up-to-date stuff - I have to check with the site. Seems simple to me. I have heard all the arguments for why some think they need large chunks of data that isn't theirs, but the bottom line is: there is only one up-to-date source of the data.

Of course I agree with you agreeing with me! ;)

 

I guess all the automating is an interesting challenge if nothing else; but I have a much simpler approach. I admit I do use GSAK to maintain a personal offline database for my state, mainly to have access to more than the last 5 logs. When I travel I will start building a similar database for that area a few weeks before a trip. I use this data to build the file for my PDA, and for pre-planning/PQ tweaking on trips.

For the PDA I don't really care if I have a few archived caches thrown in.

However, when it comes to loading my GPS and actual routing info I only use the freshest PQs, not the stale GSAK database. If I happen to have archived caches in the PDA it's no big deal since they aren't even in my GPS or displayed on my laptop for me to search for.

I'm not in such a rush to go caching that I can't spare the extra couple of keystrokes it takes me to do it this way.

Link to comment

What you guys are failing to understand is some data will not be any different from one PQ to the next. Any cache that is not visited or changed in any way is no "fresher" in the next PQ. The system allows it to send just the caches that have changed in the last 7 days, BUT not all of the changes. The critical one missing is the one that should be removing the cache from the list because it is no longer available.

 

It's all well and fine that some of you like to live within the limits forced on you, but some of us see a better way of doing it. We will continue to use GSAK even though we are forced to manually remove archived caches. It's a pain, but we feel it's a better solution than starting all over again every week.

 

Sure, you can go to the site, set up a PQ and have it sent to you within minutes. You then run it through Spinner, then Plucker, then upload to your GPS.

 

Or you can do it like us and have it automated through GSAK: when we decide where we are going, we upload directly to the GPS and, if we're using it, CacheMate, and we're off. Boom, done, over and out. Additionally, what if we change our minds as we are going out the door? Well, considering we have a vast database of caches on our laptop that covers our complete stomping grounds, we can change our minds in mid-stride and go somewhere completely different!

 

That's the convenience of GSAK and proper PQ working for one--you don't have to decide what you want to do until just before you get there. Can your scheme do that?

Link to comment

Additionally, you can look at a PQ with ALL of the changes in the past 7 days as a "differential query": it includes ALL of the changes necessary to make the present data match a data set as if it were pulled in its entirety.

 

In other words, a PQ with all changes, including archived caches, would make a database just as "fresh" as if you pulled the complete set. Not getting the archived caches is what is making some of the data "stale."

Link to comment
Sure, you can go to the site, set up a PQ and have it sent to you within minutes. You then run it through Spinner, then Plucker, then upload to your GPS.

You haven't looked at Spinner in a while, have you? The current version of Spinner can run a batch file automatically. The batch file can do the Plucking (or iSilo-ing) and upload it to your GPS via Babel. There is even a sample batch file that includes many common actions.

Link to comment
It's all well and fine that some of you like to live within the limits forced on you, but some of us see a better way of doing it.

I don't like to live within the limits forced on me. Not at all. But I have accepted that some things are just not going to happen, and I am trying to avoid beating my head against a brick wall over them.

 

It would be quite simple for TPTB to put archived caches into what I call "delta" PQs -- those PQs that include all changes in the last week. But it's not going to happen. The classic pattern is playing itself out as it has many times in the past -- the nonsensical excuses for why it has to be the way it is, the piling on of the Greek chorus, etc. etc. etc.

 

So I choose, instead of getting upset over something that is not going to change, to try to come up with clever ways around the constraints. That's what I have attempted to offer here. It's too bad they don't meet your needs. But maybe they will help some people. And I stand by what I wrote earlier in this thread -- I still don't see any substantial issues that can't be solved by appropriate third-party software.

 

But don't think for a second that I buy any "liability" excuse for not including archived caches in PQs, or that I am happy with the limits imposed by the current PQ system. I'm not. I'm just trying to not let it ruin my day.

Link to comment
You haven't looked at Spinner in a while, have you?

Actually, I use it to generate the HTML files I like to upload to iSilo and a custom TXT file for MSS&T.

 

I prefer the single-character rating scheme that GSAK offers: 1, A, 2, B, 3, C, 4, D, 5. There is no translation for the whole numbers.

 

It's too bad they don't meet your needs.

It doesn't work for me because it doesn't let me know the reason the cache wasn't in the PQ.

 

What I've taken to doing is ordering the caches by earliest GPX load. Press F4 to bring up a browser page with that cache on it. Resize the window so I can see both windows. If the cache is archived, click back on the same cache in GSAK and press F3 (twice, I think: once to mark it unavailable, then again for archived). If the cache isn't archived on gc.com, I click the next cache and repeat. After I'm done, I move the caches to my "Archived Caches" database, so that if one comes back I can grab the earlier logs.

 

You can move fairly fast. It's not 100% if you don't do the whole list. I've got 5000 caches to weed through, so I only do the earliest in the sort. But it does provide you with 100% certainty of the status of the caches you do check.

 

Actually, I don't think I'm getting too upset with Jeremy. I think I get more exasperated with those that say the system isn't broken.

Link to comment
It's too bad they don't meet your needs.

It doesn't work for me because it doesn't let me know the reason the cache wasn't in the PQ.

I don't think the system is broken. As explained above, it is possible to quickly verify whether caches have been archived. Once you identify that a specific cache has been archived, it's a trivial task to mark it so in GSAK.

 

Yesterday evening, because I was bored and given that I haven't deleted any of the PQ emails that I receive, I played around with how long it would take to positively identify the archived caches. I was able to identify all of the archived caches received in the previous PQ and mark them as such within two minutes.

 

Therefore, in my mind, this is a problem that is easily solved with third party software and should not require any changes to GC.com.

Link to comment

Neither GSAK (nor any other database manager that works with GPX files) nor the PQs are broken.

 

It is a lack of a small feature, a small piece of information that is not being provided By Design that allows the off-line database to go stale and that's fine.

 

I merely ask that an alternative method be provided for current notification of an archival, with the same feature functionality as the PQ. The Insta-Notification is close to this. If I could choose to receive a GPX of those notifications, and choose a regional value rather than a radius from a center point, that would satisfy me to no end. It's a current notification system already in place, limited to a one-time notice.

 

I didn't care about archived caches before I started maintaining an off-line database, but I would like to update my database without being forced to dump it, by using the tools and options already designed into the GC.COM system.

Link to comment

I would agree with sbell111 in most respects with one exception.

 

You don't really need to know why a cache has been archived via a PQ. If you have a system that can detect that a cache has been archived and you want to know why, the reason is only a couple of clicks away on the GC.com website. And if you regularly generate PQs for an area, you can tell that a cache is archived by the fact that it's missing, as long as you can get around the shrinking-circle problem.

 

The one place I would disagree is for caches archived in the current week. If you run your queries once a week, say on Tuesday when the load is not as high, and a cache is archived on Thursday and you go looking for it on Saturday, there is no way to tell the cache is missing without running all your PQs every day, which is impractical, or scraping the website, which is illegal. I think that being able to create a PQ that showed all the archived caches in the last week would fill that need and reduce the demands on the PQ generators.

 

The closest thing to a work around is the new notification system that will tell you when a cache is archived. Maybe our friends at GSAK could automate a way to read in these emails and mark the caches as archived.

Link to comment

I am a big user of GSAK and have come up with filters and ways to help weed out archived caches based on dates and such. No problem -- it's not automatic, but it is a solution that I can live with.

 

A while back when this thread was discussed before, robertlipe chimed in about the need for archive notifications to "update" another "offline database" -- your memory! I had to laugh at the concept but it is sooo true!

 

I went bike riding with a friend of mine in a park across town. While riding I realized there were about 4 caches which we would pass along the trail. Since she is trying to "break 100" I thought I would play human GPSr for her and get her in the general area of the caches that I had found months ago. Yep, you guessed it -- two of the caches had been archived and were not there!!! I have a pretty good memory and remembered seeing that one had been disabled for a while, so when it was not there I figured out that it had been archived. For the other, I had never seen that it was disabled and archived, so we went on a wild goose chase through the park!!

Link to comment
...

The one place I would disagree is for caches archived in the current week. If you run your queries once a week, say on Tuesday when the load is not as high, and a cache is archived on Thursday and you go looking for it on Saturday, there is no way to tell the cache is missing without running all your PQs every day, which is impractical, or scraping the website, which is illegal. ...

I don't think we're really in disagreement on this. The offline database will always only be as current as the last PQ. In my opinion, however, the problem of having cache data in my PDA that is a few days old is not that big of a deal. After all, most caches are archived after they go missing. They will always be missing for a little while before this is noted on the cache page and the cache is eventually archived. My chances of not finding the cache are not significantly greater because my data is as many as six days old, in my opinion.

 

This actually brings up the point that has been made in several threads recently that anyone can forgo running their PQs regularly and only request them when they need them. Since they hadn't been run recently, they will run very quickly.

Link to comment