
Pocket Queries Future?


bjornkn


I didn't find any forum group for discussing the services of the geocaching.com database, so I'll try to post this here.

 

Pocket Queries is a very nice "tool" which allows you to keep an offline database, like GSAK, updated all the time.

Apparently it must take a lot of CPU power to handle all those queries.

But if you changed it slightly I'm sure you would find that it would need a lot less power, save a lot of bandwidth and also be much faster.

 

Take my case as an example:

I live in Norway, where there are currently about 800 caches.

I use GSAK for maintaining an offline database.

Now I need to run at least 3 queries to get all caches updated, which means a lot of CPU power and wasted bandwidth, because most of it hasn't changed anyway since the last time it was run.

To get around that 500 limit I have to do:

Run a query for all caches placed before Jan 1 2004

Run a query for all caches placed between Jan 1 2004 and Aug 10 2004

Run a query for all caches placed during the last month.

And I have to do this several times each week to be up-to-date!
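Just to make the workaround concrete, here is a rough sketch in Python (purely illustrative; the cap, the random dates and the whole cache listing are made-up stand-ins, not real geocaching.com data) of what "split the country into 'placed during' windows that each stay under 500" boils down to:

# Illustrative only: carve a national cache list into "placed during" date
# windows so each window stays under the 500-cache PQ cap. The cache list,
# the cap and the dates are assumptions for the sketch, not real data.
from datetime import date
import random

PQ_CAP = 500  # the per-query limit discussed in this thread

def date_windows(placed_dates, cap=PQ_CAP):
    """Return (start, end) date pairs, each covering at most `cap` caches."""
    placed_dates = sorted(placed_dates)
    windows, start, count = [], placed_dates[0], 0
    for d in placed_dates:
        count += 1
        if count >= cap:                       # window is full: close it here
            windows.append((start, d))
            start, count = d, 0
    windows.append((start, placed_dates[-1]))  # whatever is left over
    return windows

# Hypothetical example: ~820 caches placed over a few years
random.seed(1)
placed = [date.fromordinal(date(2001, 1, 1).toordinal() + random.randint(0, 1300))
          for _ in range(820)]
for lo, hi in date_windows(placed):
    print("placed between", lo, "and", hi)

The point is only that these windows have to be recomputed and the queries rebuilt by hand every time the country grows past another multiple of 500.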

 

If there was no limit on the number of caches retrieved, or if the limit was much higher, then I could have run one single query for all of Norway.

There could be a limit to how many caches you get packed in each zip file though, so that the email attachments don't get too big.

 

A lot of bandwidth and CPU time could also be saved if there was a "maintenance query" that you could subscribe to, which would send you only the new logs and the new or edited cache pages since your last query.
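For the sake of argument, here is a small sketch of what such a maintenance query could look like, using SQLite and an invented two-table schema as a stand-in (the real geocaching.com schema is of course unknown to me); the whole point is the "changed since your last run" filter:

# A sketch of the "maintenance query" idea: return only caches (and logs)
# touched since the requester's last run. SQLite and the schema here are
# stand-ins invented for illustration; the real geocaching.com schema is unknown.
import sqlite3
from datetime import datetime, timedelta

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE caches (gc_code TEXT PRIMARY KEY, name TEXT, last_updated TEXT);
CREATE TABLE logs   (id INTEGER PRIMARY KEY, gc_code TEXT, logged_at TEXT, text TEXT);
""")
now = datetime(2004, 8, 20)
db.executemany("INSERT INTO caches VALUES (?,?,?)", [
    ("GC1234", "Old cache, untouched", "2004-06-01"),
    ("GC2345", "Edited yesterday",     "2004-08-19"),
])
db.executemany("INSERT INTO logs VALUES (?,?,?,?)", [
    (1, "GC1234", "2004-05-30", "Found it back in May"),
    (2, "GC2345", "2004-08-19", "Found it yesterday"),
])

def maintenance_query(last_run):
    """Only rows changed since `last_run` -- the whole point of the proposal."""
    cutoff = last_run.isoformat()
    changed = db.execute(
        "SELECT gc_code, name FROM caches WHERE last_updated > ?", (cutoff,)).fetchall()
    new_logs = db.execute(
        "SELECT gc_code, logged_at, text FROM logs WHERE logged_at > ?", (cutoff,)).fetchall()
    return changed, new_logs

print(maintenance_query(now - timedelta(days=7)))

A query shaped like that returns a handful of rows for a quiet week instead of the whole country.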

 

Considering the explosive growth of geocaching, at least here in Norway, I have a feeling that the geocaching.com database will soon face severe problems keeping up with the load.

It's already quite slow at times...

Link to comment

Actually, the biggest problem is probably users like you who are attempting to mirror the geocaching database. You've found 14 caches so far, and all but 1 have been within 30 miles of your home. Why must you maintain (and update several times a week) a database of every cache in the country? Geocaching.com is here to maintain the database; you don't need to. The PQs are meant to make it easier to download caches you plan on searching for. I doubt you are going to be visiting every cache in the country in the next 2 days, are you?

Edited by Mopar
Link to comment

Well, it doesn't help much to bury your head in the sand and deny reality.

The reality is that many geocachers (at least here in Norway) who are using GSAK or something similar regularly download (actually, receive PQs for) the entire country several times a week.

As a geo-newbie I also find it very interesting to look at cache descriptions from all over the country.

GSAK is much more flexible than using the database directly, and also much faster. If many people used GSAK offline instead of browsing online it would decrease the load on the database servers a lot, and everything would work much faster.

 

If you read my post you'd see that it was essentially a set of suggestions to the database maintainers that could help decrease the burden on the servers. You never said a single word about that issue.

When you run queries it takes much more CPU time to run two similar queries that each returns 500 rows than one single query that returns 1000 rows.

Having a "maintenance query" would also remove a huge amount of strain on the servers - I would estimate that it could possibly save 99% if we could get only the newl updated logs instead of the entire database each time.

If you think that is a bad idea I don't think we need to discuss this any more.

 

It doesn't matter what the intention of the database was when it was created. It's just like when Bill Gates & co intended DOS to be purely an office tool, and thought that there would never be a need for more than 640KB of RAM.

What counts is the present situation - and thinking ahead.

Link to comment

I agree with Mopar.

 

It makes no sense to download all the caches in your country weekly; I doubt you are looking at all of them. If you are interested in keeping track of the new logs or modifications of each cache, create a PQ with the caches which had at least one visit last week, or a PQ with the newest caches, and then you don't even need to download it; you can preview it on your PQ page.

 

You know the purpose of a PQ is not to use it as a database; it's just a quick, offline query available to you.

 

I only download two PQs a week: the PQ for the area I visit most and the PQ for trying to be First to Find. And when I go on vacation I create a PQ for that area and download it once.

 

If I am interested in other caches I look up their cache pages; we visit cache pages that aren't included in our PQs 3 or 4 times a week. And you are downloading the same 400-cache PQ because you want to avoid "wasting" the bandwidth of visiting 400 cache pages directly on GC.com each week? I can't believe it.

Link to comment

Just to be Devil's Advocate (I download 1-3 medium PQs an average week):

Seems like it would be real easy to stop someone from mirroring the database.

If PQs are only to be used like MOPAR describes, why allow someone to download 2500 caches a day?

 

Why not make it, say, 2 PQs of 300 a day instead, if you don't want people maintaining large databases?

Link to comment

It looks like you're more interested in discussing my "bad habits" rather than my suggestions, which would definitely help me improve on my habits.

Earlier this summer we had 500 caches in Norway, and now it's 820. One year ago it was about 180. With this growth, which I suppose is not unique to my country, something has to be done to the PQ system and the database IMO, or else it will buckle pretty soon.

As Bull Moose writes, why would they allow us to download 17500 caches each week if they wanted us to not keep offline databases?

And why should they stop us from using offline databases if that is what we want?

 

I don't know how your setups are, but I'm using an iPaq with a BTGPS, along with TomTom Navigator (for coarse car/bike navigation) and OziCE. I import PQ ZIPs into GSAK, and from there I can export POI files for TomTom and Waypoint files for OziCE as well as HTML pages to be viewed offline on the iPaq. Works very well.
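As an aside, here is roughly what one leg of that pipeline amounts to, sketched in Python; the file name is hypothetical and the waypoint handling is a bare-bones stand-in for what GSAK actually does:

# Rough sketch of one leg of the pipeline described above: open a PQ zip,
# read the GPX inside, and dump name/lat/lon so another tool can import them.
# The file name "12345.zip" is hypothetical; the namespace is from the GPX 1.0 spec.
import zipfile
import xml.etree.ElementTree as ET

GPX_NS = "{http://www.topografix.com/GPX/1/0}"

def waypoints_from_pq(zip_path):
    with zipfile.ZipFile(zip_path) as zf:
        gpx_name = next(n for n in zf.namelist() if n.lower().endswith(".gpx"))
        root = ET.fromstring(zf.read(gpx_name))
    for wpt in root.iter(GPX_NS + "wpt"):
        yield (wpt.findtext(GPX_NS + "name"),
               float(wpt.get("lat")), float(wpt.get("lon")))

if __name__ == "__main__":
    for name, lat, lon in waypoints_from_pq("12345.zip"):   # hypothetical PQ file
        print(f"{name},{lat:.5f},{lon:.5f}")                 # simple CSV-ish export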

When I go geocaching it's almost always on short notice, when time allows. If I had to go to geocaching.com and download each cache for a certain area one by one, there would be little time left for geocaching by the time I had run it all through GSAK and had updated info. New caches pop up every day, but unfortunately most of them are at least 70 km away. Right now the only way to get them in GPX format right away is to download them one by one. You can't get a GPX/PQ on the fly, and you can only download a selection of caches in LOC format.

Keeping an updated GSAK database is simply currently the best solution for me.

But, as I've tried to emphasize, I don't subscribe to all those PQs because I want to, but because that's currently the only way I can get it the way I want. I want the GSAK functionality, but I also want to get smaller emails - and I want the geocaching database to survive.

 

Why do you all think it is so wrong to change the way things work now?

Why is it so wrong if we could get a daily/weekly "report" with all the new caches, newest logs etc. from our selected area(s) instead of the entire set of data each time?

Link to comment

Oh man I was really trying to stay out of this, but…

 

I have made these and many more points time after time here. I systematically download all caches via PQs (well, that actually stopped when I let my premium membership lapse about two weeks ago) and have a database that is somewhat up to date and I think fairly accurate. And it almost always seems to come down to – Why would you want to do that? You really don’t have a need to do that. What do you need all of that data for? There is no reason to do that. That is not what PQs were designed for. And on and on and on.

 

Why do you all think it is so wrong to change the way things work now?

Why is it so wrong if we could get a daily/weekly "report" with all the new caches, newest logs etc. from our selected area(s) instead of the entire set of data each time?

 

The reason is that there are some people who feel that if they don’t do things that way, don’t understand why you want to do things differently than they do, or just have no ability to think outside the box, then neither should you. They are, after all, superior to everyone else and are comfortable with dictating how others should do things. Typical of this crowd is “I don’t like virtual caches so get rid of them”, “I don’t have a need for more than 500 caches in a PQ so nobody else does either”. See the pattern?

 

My question is, is it the actual sending of the PQ file that bogs down the server or the search to put them together?

Would searching for changes just be another qualifier to tie up the system?

 

That is always the question. And the answer keeps changing depending on the argument that is being made. At first there was a 500 limit because most GPSrs could only hold that many. Then when they could hold more it shifted to being a question of the query taking server resources. Then when PQs were shifted to another server and we were told that their impact is negligible, it became an issue of the email. It was either the server resources being used to zip the attachments, the email resources being used to send the email, or that emails with attachments become too large. There were probably some more ridiculous reasons that made no sense and had nothing to do with reality, or at least with making things more efficient in code or hardware. But that was really not the point: just keep the target moving and nobody is able to pin them down. But in the end it just comes back to “I can’t see any reason I would need that, so you don’t either.”

 

I have suggested on several occasions adding the ability to get a list of caches that were archived. You would only need the waypoint IDs: overall a very small file, and next to no work on the part of the server to produce. Then you could use that to keep your own cache listing up to date. Just grab the new caches that show up maybe once a week, and then run the list of archived caches against it to dump out what you have in your database that should no longer be there. But no, in a quest for greater customer service it must be better to make everyone just grab all of the data and refresh things that way.
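To show how little would be needed on the receiving end, here is a toy sketch (the archived-ID list is exactly the file that does not exist today; it is the feature being asked for):

# Sketch of the "archived list" idea described above: given only the waypoint
# IDs of archived caches (a hypothetical file, i.e. the requested feature),
# drop them from a local listing.
def apply_archived_list(local_caches, archived_ids):
    """local_caches: dict of GC code -> cache data; archived_ids: iterable of GC codes."""
    archived = set(archived_ids)
    removed = [code for code in local_caches if code in archived]
    for code in removed:
        del local_caches[code]
    return removed

# Hypothetical example
local = {"GC1A2B": {"name": "Troll Bridge"}, "GC3C4D": {"name": "Fjord View"}}
print(apply_archived_list(local, ["GC3C4D"]))   # -> ['GC3C4D']; only GC1A2B remains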

 

And yes some people do like to look at data in different ways or throw it up to a map program and view it six ways till Sunday. But as was suggested it must be better to use up the resources of the server here playing around rather than your own PC.

Link to comment

I don't want to be able to download all of the caches in California three times a week, but I would like to be able to download more than 500 waypoints in a single PQ.

 

There are currently about 1400 caches within 50 miles of my home and I would like to be able to download them all at once. I like to look for areas with clusters of caches when planning a day of caching. As it is now, I need to do 5 or 6 PQ's with a total of 2500 to 3000 caches to get all of the 1400 caches. That is a lot of unnecessary redundancy putting extra load on the server.

 

Here is my suggestion:

Instead of limiting it to 5 PQ's a day with 500 caches, why not have total quantity limits in addition to the 5 per day limits (i.e. - 2500 caches per day or 5 PQ's per day whichever comes first with a 2500 cache limit per PQ).

 

Rocket Man

Link to comment
... why not have total quantity limits in addition to the 5 per day limits (i.e. - 2500 caches per day or 5 PQ's per day whichever comes first with a 2500 cache limit per PQ).

Because in a moment of great vision and forward thinking the 500 limit was hard coded into the code and would take a lot of rewriting to change.

Link to comment

Is there a need to download (several thousand) caches in a week? Yes there is.

 

Recently (3 weeks ago) I went on vacation. Now, out here in the western US, driving vacations cover a lot of ground. I planned to drive from Albuquerque, NM to Las Vegas, NV. From there, drive to Southern California, then out I-8 to Tucson, AZ, I-10 to Las Cruces, NM, and finally back home.

I used MS Streets and Trips to plan my route. With it, I also planned my pocket queries. I set PQ's for the major areas I would be passing through (Flagstaff, Laughlin, Las Vegas, Barstow, Inland Empire, San Diego (North County), San Diego (city), Yuma, Tucson, and New Mexico). I downloaded a sample set of these and found where they overlapped. I adjusted some of the queries (some more than once) and set them to run during the 2 days before my trip. I ended up with 10 queries or almost 5000 caches.

Using GSAK, I filtered out caches that were off my route. I ended up with over 2000 caches that were either freeway-close or near a city I was stopping in. I converted them to Plucker and put about 800 in my Garmin Legend (it holds 1000) for my drive to Las Vegas. After finding caches along that portion of the trip, I erased the waypoints from my GPS and loaded the next set (I-15 from Las Vegas to the Inland Empire). I did the same thing when I headed to San Diego.

My trip took a detour from its intended route along the US/Mexico border when my truck started having problems, so I reloaded the I-40 caches for the new route home. I ended up not needing any of the caches in the Yuma and Tucson PQ's. Oh well, my PQ's gave me the flexibility to change my trip like that.

 

How many caches did I find on my 8-day trip? I found 49, plus I hosted 2 cache events. Wow, I only found 1% of the caches I had downloaded. Why waste the server's time on querying so many? Without being able to filter routes directly from the server, I have to do large-area queries and filter them out in my own database using GSAK. That's why some people need the option to download a lot of caches at once.
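For what it's worth, the GSAK filtering step I describe boils down to something like this sketch (route points, caches and the 10 km threshold are all made-up illustration values):

# Rough stand-in for the GSAK step described above: keep only caches within
# some distance of a planned route. Route points, caches, and the 10 km
# threshold are all made-up illustration values.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def near_route(caches, route, max_km=10):
    """caches: [(code, lat, lon)]; route: [(lat, lon)] sampled along the drive."""
    return [c for c in caches
            if any(haversine_km(c[1], c[2], rlat, rlon) <= max_km for rlat, rlon in route)]

route = [(35.2, -111.6), (35.1, -114.6), (36.1, -115.1)]           # Flagstaff -> Laughlin -> Las Vegas (approx.)
caches = [("GCAAAA", 35.19, -111.65), ("GCBBBB", 33.45, -112.07)]  # second one is near Phoenix, off-route
print(near_route(caches, route))   # only GCAAAA survives the filter

If the server could run that kind of route filter itself, the big overlapping circle queries would not be needed.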

Link to comment

Careful where you step.

 

My last post got me a warning. Seems the truth is called rude and a cheap shot here.

No the cheap shot wasn't needed, you could have written what you consider the truth without the shot.

 

you wrote

Because in a moment of great vision and forward thinking the 500 limit was hard coded into the code and would take a lot of rewriting to change.

 

when

Because the 500 limit was hard coded into the code and would take a lot of rewriting to change.

would have said the same thing without being rude

 

It's really quite simple.

Link to comment

It's really quite simple.

 

Not hardcoding something like that is a basic concept. And in the face of the fact that the 500 limit issue had been raised numerous times it was short sighted.

 

I think more to the point it shows a total lack of planning for the future, or even any desire to think about addressing requests from the customer base. Hardcoding that in is a very conscious statement that one has no plans at all in the near future to change one's mind, or even to consider what the future may bring.

 

It's really quite simple -- it is called sarcasm.

Link to comment
It's really quite simple.

 

Not hardcoding something like that is a basic concept. And in the face of the fact that the 500 limit issue had been raised numerous times it was short sighted.

 

I think more to the point it shows a total lack of planning for the future, or even any desire to think about addressing requests from the customer base. Hardcoding that in is a very conscious statement that one has no plans at all in the near future to change one's mind, or even to consider what the future may bring.

 

It's really quite simple -- it is called sarcasm.

sar·casm, n.

1: A cutting, often ironic remark intended to wound.

2: A form of wit that is marked by the use of sarcastic language and is intended to make its victim the butt of contempt or ridicule.

3: The use of sarcasm. See Synonyms at wit.

 

Respect: Respect the guidelines for forum usage, and site usage. Respect Groundspeak, its employees, volunteers, yourself, fellow community members, and guests on these boards. Whether a community member has one post or 5,000 posts, they deserve the same respect.

 

Foul Language and obscene images will not be tolerated. This site is family friendly, and all posts and posters must respect the integrity of the site.

 

Personal Attacks and Flames will not be tolerated. If you want to praise or criticize, give examples as to why it is good or bad, general attacks on a person or idea will not be tolerated.

 

All underlines added by me

Edited by CO Admin
Link to comment
When I go geocaching it's almost always on short notice, when time allows. If I should go to geocaching.com and download each cache for a certain area one by one there would be little time left for geocaching with updated info after I have run it through GSAK. New caches pop up every day, but unfortunately most of them are at least 70km away. Now the only way to get them in GPX format right away is to download one by one. You can't get a GPX/PQ on the fly, and you can only download a selection of caches in LOC format.

Keeping an updated GSAK database is simply currently the best solution for me.

When I go geocaching, it is also almost always on short notice, but I have no problems using the tools provided in a simple manner. Here is how I do it:

  • I decide to go geocaching in a certain area. I know where I am going to be, so I request a pocket query (or pocket queries) for the area I am going to. I have bunches of pocket queries of various areas within 100 miles of home. I just pick the ones I need to activate and click on the checkboxes. I never leave any checked. I only run pocket queries if I am going to cache in an area. This is very friendly on the server.
  • A few minutes later, the PQ(s) arrive. I load them into GSAK. I use GSAK to filter down to a manageable list, usually around 175. I download that to my GPS and then to MapSend to print out some high level maps. Since I only run PQs when I need them, they haven't been run in a while and get a higher priority, which is why they only take a few minutes to arrive.
  • I get in the car, usually within 15 to 30 minutes of when I requested the pocket query(ies).

If you use a PDA, you need one more step to get the info from GSAK to your PDA. I don't, as I just use my phone to access the site live. Still, that step should only add a few minutes to the process.

 

I'm not saying that it's not interesting to look at bunches of caches, but I don't buy the excuse of needing them just because you cache on short notice. I seem to do just fine. :P

 

--Marky

Link to comment
sar·casm, n.

1: A cutting, often ironic remark intended to wound.

2: A form of wit that is marked by the use of sarcastic language and is intended to make its victim the butt of contempt or ridicule.

3: The use of sarcasm. See Synonyms at wit.

sarcasm: n 1: a cutting or contemptuous remark  2: ironical criticism or reproach

The Merriam-Webster Dictionary

Using the second definition, no disrespect is meant by sarcasm.

 

Also any guidelines that FORCE respect are doomed to fail.

 

respect: vb 1: to consider deserving of high regard ...

The Merriam-Webster Dictionary

If someone is wrong or makes a mistake, I must "consider [him] deserving of high regard"? The correct term should be "show consideration or courtesy" - respect is earned by others; consideration/courtesy of others comes from me.

Link to comment
sarcasm: n 1: a cutting or contemptuous remark  2: ironical criticism or reproach

The Merriam-Webster Dictionary

Using the second definition, no disrespect is meant by sarcasm.

I think that "ironical criticism" is only used when disrespect is intended. That's how it seems to me, at least.

 

--Marky

Link to comment
Also any guidelines that FORCE respect are doomed to fail.

That is especially true if it is a one way street.

 

No, for me respect has always been something that is earned, not given because I have to. There are people that I know that I hate with a passion but have a large amount of respect for. And people that I like but have little or no respect for.

 

One's actions, and taking personal responsibility for those actions, is where I start to see a person's respect meter rise in my book. But the times are changing. Today we live in a world of entitlement where people think they deserve something based simply on the fact that they are breathing. When it may actually be more respectful, or perhaps courteous, to slap them upside the head every once in a while so they can learn what it means to earn something and not be handed everything.

 

Boy now this thread has really gone off topic.

Link to comment

Yes, it seems to drift off in the wrong direction.

 

Although it's good to see that I'm not the only one with "bad habits" here, I think the discussion on the future load on the database and how to prevent it from collapsing is very important.

I'm sure none of us with "bad habits" really wants to get the entire set of caches we're interested in several times each week. What I want is the ability to be fairly updated offline, and to achieve that I would be quite happy to receive small emails a few times each week with "What's new".

 

If that 500 limit is hard coded, why not change it? Surely that code has to be changed and recompiled every now and then anyway? Code isn't something that is carved in stone and has to stay like that forever. Some years ago I developed a fairly big database which handled millions of grabbed video images, lots of subscribers, payment info as well as about 30-50k new "logs" every day. Although it worked pretty well I still had to recompile quite often to fix or change something. It wasn't that big a deal to do that.

 

The geocaching database seems to be pretty well organised, although IMO it does show some signs of growing pains. One example is how the waypoint names have grown from just a few characters to the current 6, and they will soon have to grow even larger. And why is there always a "GC" at the front? That shouldn't be necessary.

 

But the main thing is how PQs are handled.

Just a simple thing as adding a "Last updated during" in addition to the "Placed during" search field would save the servers from a lot of work.

I have no idea how many users there are on geocaching.com, or how many logs are inserted each day (there aren't many statistics to be found here...), but it would be very interesting to see the growth rates. My guess is that it is pretty steep, and growing exponentially.

 

Finally, I couldn't resist jumping in on that respect "thread".

Respect: Respect the guidelines for forum usage, and site usage. Respect Groundspeak, its employees, volunteers, yourself, fellow community members, and guests on these boards. Whether a community member has one post or 5,000 posts, they deserve the same respect.

If there was a person who started showing disrespect in this thread it has to be Mopar, with his reply to my first ever post on this forum.

I'm still glad I never sent my first reply...

Edited by bjornkn
Link to comment
Although it's good to see that I'm not the only one with "bad habits" here, I think the discussion on the future load on the database and how to prevent it from collapsing is very important.

 

I gave some nice instructions on how to use the current system in a "server friendly" manner. I haven't seen anyone comment on why this wouldn't work for them. It's extremely simple to do.

 

If there was a person who started showing disrespect in this thread it has to be Mopar, with his reply to my first ever post on this forum.

I see that Mopar's post has been edited, but I see nothing disrespectful in it as it is currently posted. It wasn't a personal attack (in my opinion, although you may think differently), it was more a statement of fact. It is the users who are trying to keep a complete, up to date, offline database of caches who are putting an unnecessary strain on the PQ engine. PQs aren't intended for this, so I don't see features being added to make this easier.

 

--Marky

Link to comment
Although it's good to see that I'm not the only one with "bad habits" here, I think the discussion on the future load on the database and how to prevent it from collapsing is very important.

 

I gave some nice instructions on how to use the current system in a "server friendly" manner. I haven't seen anyone comment on why this wouldn't work for them. It's extremely simple to do.

 

If there was a person who started showing disrespect in this thread it has to be Mopar, with his reply to my first ever post on this forum.

I see that Mopar's post has been edited, but I see nothing disrespectful in it as it is currently posted. It wasn't a personal attack (in my opinion, although you may think differently), it was more a statement of fact. It is the users who are trying to keep a complete, up to date, offline database of caches who are putting an unnecessary strain on the PQ engine. PQs aren't intended for this, so I don't see features being added to make this easier.

 

--Marky

You are right Marky, the current system works quite well for finding caches, which was its intended use.

What I see as extremely funny is that the people who are claiming they NEED thousands of caches stored offline to geocache actually find very few, whereas the people who actually go out and find large amounts of caches have no problems doing so with the current setup. That shows me the system works fine for what it's supposed to be used for. If it doesn't work well for other things, well, to quote a phrase, "tough nuts".

 

As for my initial reply to this thread, the only thing edited were typos a minute or 2 after posting. If there is something disrespectful there, I'd love to know what.

Edited by Mopar
Link to comment
{snip}What I want is the ability to be fairly updated offline, and to achieve that I would be quite happy to receive small emails a few times each week with "What's new".

{snip}

Just a simple thing as adding a "Last updated during" in addition to the "Placed during" search field would save the servers from a lot of work.

Consider using the "Updated in the last seven days" option that is *already provided* in the pocket query generator. It won't give you new finders logs, if that's all that changed, but it will ensure that your offline PQ version of a cache will have the most current coordinates, cache descriptions, etc.

 

I run a pocket query like this weekly, to see what folks have been doing to the descriptions of caches that I reviewed and listed on the website. Sometimes things change rather dramatically.

Link to comment

(And now my turn to wade into this topic)

 

I just subscribed to GeoCaching.com and I (like a lot of people who are looking towards paperless caching) am wanting to give GSAK a good base to work from.

I know that the majority of caches I am likely to visit are probably within 100km of my home, but at the same time I would like to have the ability to find nearby caches if I happen to be on the road and decide to go for a bit of a search.

 

Due to rather high charges associated with data services across GSM within Australia, re-accessing the GeoCaching.com site whilst out and about is not a very effective option.

 

What I (and I think a lot of other people out there) would like to do is to have a base of information for all caches within my state (maybe even within my country), and then only update the local caches through the use of PocketQueries.

This would mean that I would have the most accurate information available for the caches I am most likely to visit, but still means that if I jump a Mystery Flight and end up in Brisbane (another state) I still have a fairly recent list of caches there to try and visit.

 

Why not have the option for Premium Users to download a GPX file for all caches in a selected state. I am not talking about information that is 100% up-to-date, but instead maybe having a batch that runs monthly (or maybe even less frequently) and dumps the GPX files somewhere we can then download them.

 

So, when a newbie like me comes along I would:

1) Download the Base GPX Files for my selected state(s)

2) Design a PocketQuery, or set of PocketQueries to provide updates to the base data (allowing me to run much smaller PQs for updates on local caches)

 

I think it would be a much easier system from my end, and would probably be faster on the system side too, as it would only have to batch these large GPX files once a month rather than up to once a day per person!
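The merge step in that scheme is trivial on the user's side; here is a toy sketch where both the monthly base dump and a weekly update PQ are modelled as dicts keyed by GC code (the real work of parsing the GPX files is left out):

# Sketch of the merge step in the proposal above: a monthly state-wide base
# dump refreshed by smaller weekly update PQs. Both inputs are modelled as
# simple dicts keyed by GC code; the data shown is invented for illustration.
def merge(base, *updates):
    """Later updates win; new caches are added, existing ones refreshed."""
    merged = dict(base)
    for batch in updates:
        merged.update(batch)
    return merged

base_dump = {"GC1111": {"name": "Old listing", "updated": "2004-07-01"}}
week1 = {"GC1111": {"name": "Old listing (coords corrected)", "updated": "2004-08-10"},
         "GC2222": {"name": "Brand new cache", "updated": "2004-08-12"}}
print(merge(base_dump, week1))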

 

What are your thoughts?

Link to comment

The OP believes that expanding the number of caches on a PQ will allow him to run fewer PQ's and reduce the server load. That sounds good at first. I need to run 3 PQ's to get all the caches I want, when simply doubling the number of caches I'm allowed per PQ would probably reduce that number to one PQ.

 

But what tends to happen is that, though increasing the number of caches to 1,500 per PQ will require fewer PQ's for some, nearly everyone else is going to try to pull 1,500 caches per PQ and in many cases, they will still be running multiple PQ's because they can. The problem is that if you give people the capability, they will use it.

 

Instead of limiting it to 5 PQ's a day with 500 caches, why not have total quantity limits in addition to the 5 per day limits (i.e. - 2500 caches per day or 5 PQ's per day whichever comes first with a 2500 cache limit per PQ)

 

Not knowing the details regarding how it would affect the servers, or how much coding is involved, this sounds on the surface like a good idea. It solves the problem of people like the OP and still lets others continue using PQ's the way they currently do. Perhaps Jeremy, or someone else in the know, can comment.

Link to comment

FWIW, I'll tell you how I use GSAK and the PQ system.

 

I have regular GPX deliveries for found and unfound caches in my area coming weekly, and then more frequently for new and watched caches.

 

When I am planning a trip, I start getting weekly GPX files for the area(s) in question. This is so that I can start preparing for the trip. I start marking caches which look like they are in the right location using GSAK userdata. Right before the trip I get updated GPX files, but the old logs are there in GSAK, so that when I am on the trip, I do have the full history of the cache. This is going to be important going to Orlando in the wake of Hurricane Charley.

 

I usually set the number of caches to 500 just because I don't know the geography yet, and then I can fine tune it later. This was especially useful on my trip to England in July, as I had PQs centered on Cambridge, Waddesdon and London, and 500 caches out from Cambridge didn't really reach too far into Hampshire, but the London query gave me a bunch in Hampshire where we were for a few days. This environment is so dense that in the span of two weeks of short day trips we covered a bunch of different caching areas, but in a relatively small area.

 

The current PQ system doesn't really allow a lot of interactivity in terms of defining areas geographically according to a trip. Since I rely on GSAK to do that, and there is no real integration letting GC.com know which caches GSAK is filtering out, much of the info in the PQs is being wasted.

 

I'd have thought that there are a number of ways to improve this process, but I do find that having the past logs is important - especially for very chatty logs on difficult local caches with a lot of no finds and notes.

Link to comment
I gave some nice instructions on how to use the current system in a "server friendly" manner. I haven't seen anyone comment on why this wouldn't work for them. It's extremely simple to do.

 

Marky, it's nice to see that we share the concerns for the well-being of the server.

I believe I have said why that scheme doesn't work so well for me - because I find the versatility and power of GSAK combined with Ozi Explorer so much better.

My suggestions were also made to help decrease the load on the servers, and I'm sure that being able to just get "what's new" instead of the entire dataset every time would help a lot on the load.

What's wrong with that?

What is so wrong with suggesting changes that will make the database more useable for more people, while at the same time decrease the load on both database and mail servers?

 

After following this thread I think I understand GrizzlyJohn's frustrations better.

There seems to be no interest in discussing new and possibly better ways to run this database. It all ends up with the oldtimers telling us how to do it, and to stop complaining.

But what if we want to use it in a different way from you?

Why don't you want to listen to us?

What if we can see it with fresher eyes than you, newbies as we are?

 

I really have no problems with the way it works now, because I can get what I want, although I need to use some workarounds. The problem is that whenever I receive a PQ about 99% of it is data which I already have in my GSAK database. And that is more of a problem for their database and mail servers than it is for me.

You may keep on buying new servers all the time to keep up with load, but if I was running that business I would certainly be willing to look at new and less expensive solutions...

 

And then to Mopar:

How many caches do you think we need to find before we should be allowed to speak, then? Should that be a fixed number, or a percentage of the total number of caches within, say, 100 miles of your home?

As you apparently did a little research on my statistics, just for fun I did the same on you. I picked one of your (if I may say so, surprisingly few for such a hardcore geocaching pro) owned caches, "Highland Woods, Too", which I assume is pretty close to where you live. Showing the nearest caches (I didn't bother making a PQ and entering it into GSAK) revealed that within 100 miles you have 3181 caches to find. A similar search, in GSAK, for the same distance from my home shows that there are 97 caches available (in Norway).

Apparently you've been geocaching for 3 years now.

For someone who has been geocaching for 2 months (I was without a GPS for a month) in such a sparsely populated cache area, and who likes to go hunting for caches while walking his dog (easy to ask) and his son (12 years - and not so easy to ask..), and also likes/needs to do other things than geocaching, I'd say that my statistics aren't that bad?

Maybe I'll even get as good as you one day?

Maybe then you'll be willing to listen without trying to intimidate me?

Link to comment
My suggestions were also made to help decrease the load on the servers, and I'm sure that being able to just get "what's new" instead of the entire dataset every time would help a lot on the load.

What's wrong with that?

Nothing, especially since you can already do it! :blink:

[screenshot: the "Updated in the last 7 days" option on the pocket query form]

While this won't give you the latest logs, it does in fact give you the latest in regards to the cache (changes to the cache description, changes to coords, etc.), which is all you really need for general offline use. With the combination of this method and my methods described above, you should easily be able to accomplish what you want to do without unnecessarily burdening the PQ server.

 

--Marky

Link to comment
Marky, I pointed out the same feature yesterday, only without the pretty picture.  Maybe the original poster will notice this in between rants?  (psssst step it down a notch and stick to the topic, thanks.)

Will that option give you ones that have been archived in the time period?

 

Yes you can keep the data fresh if changes are made to the description, coordinates and that information which is very useful. But tell me how to know which caches have been moved to temp inactive or archived so they can come off the list that a person is keeping.

 

I have ranted I don't know how many times about that. I don't know of a way to do it. Tell me how and I will shut up about it. I always hear about this deep fear of stale data being out there, yet there is no way that I know of to keep the data fresh. Again I will point out: give us a list of archived caches (see somewhere in the middle of the rant above) and then we would only have to run a query for new or updated caches once a week and be done with things. I would also add that I have suggested an option not to have logs. The joins on the server, which are expensive, would not have to be done, and the result would be much smaller files sent. But again that seems to have been ignored also.

 

And Mopar, is it possible that a person does not log all of their finds online? Or maybe logs under a different account? Or maybe does not log any? Maybe, just maybe, things are not always as they seem. It is possible for people to play the game differently than the way you do and still get all of the enjoyment they want from it. And besides, I thought everyone was supposed to be given the same amount of respect no matter the number of finds they have. Or does that only apply to post count? Or maybe it just applies to ... oh nevermind.

 

Edit: After reading a few other topics here today I just wanted to add that it looks like several people did not get their PQs today. Yet another reason for somebody wanting to keep their own up to date database offline. I don't really think the current system can be depended on to always get them out when a person wants them.

Edited by GrizzlyJohn
Link to comment

Trying to avoid the quagmire of mud-slinging that seems to have become an alternate purpose to this thread, I am interested in people's thoughts regarding my suggestion made above.

 

Correct me if I'm wrong, but it seems the main issues people have with PQs (as both Users and from an Admin side) are:

- The load PQs put on the server is quite high due to the depth of the search options and the number of records returned

- There is confusion between Users on how to best utilise PQs to gain the information they want to use offline (whether that is just a small hit-list of nearby caches or, as some would like, a list of every cache in their hemisphere)

- The limits currently imposed by the PQ system (500 records per search, 5 searches per day) mean that people have had to develop work-arounds to gain the information they need.

- The workarounds developed by some seem to cause "overlaps" in the data returned, meaning that the PQs are working harder than they really need to.

 

I think that one thing that we all need to remember here is that, just because someone is doing something in a way we might not doesn't mean it's wrong; it just means that it's different.

We need to look at improving both sides in order to get the most out of this service. We need to try and help other users to develop techniques to get the most out of their PQs whilst at the same time doing so economically. We also need to try and develop ideas for solutions from a system side that may help in making the best use of the system resources.

 

Realising that some people do like to have a "basemap" of caches over a large area, and that gaining this information through a PQ can be quite intensive on the system (ie multiple PQs to get around the 500 result limit), and also realising that some people have difficulty gaining the most up-to-date data for their local caches due to the density of finds in their area (I am sure some people have more than 500 caches within their hunting grounds), my suggestion, as mentioned above, is this.

 

Why not run large-scale batches once a month (or at an interval decided as best by all) which would dump a file containing all caches within a selected State. Provide the option for Premium Users to download this file and use it as a base for their offline systems. Then these users can customise their PQs to specifically return the most recently updated data for the caches they wish to target.

 

I see the benefits of this being that the load on the PQ server is reduced (as only one large-scale query would be performed monthly/whenever), whilst the bandwidth load is roughly equal (people downloading the files would use about the same volume as people receiving them via email).

I also see the benefit that people who do just want to keep a copy of the GC.com database offline can do so quite easily, even if their data only gets updated monthly/whenever.

 

Personally, as a very recent subscriber to GC.com and to the GC scene altogether I am having difficulty trying to get the information I want (which may be different to the information I need, but that's my prerogative) from these PQs.

I think that trying to design a system or technique to assist everyone in getting what they want is what we should be striving for, rather than simply saying "You're doing it wrong" and failing to open ourselves to other options.

 

</RANT>

Link to comment
But tell me how to know which caches have been moved to temp inactive or archived so they can come off the list that a person is keeping.

 

I have ranted I don't know how many times about that. I don't know of a way to do that. Tell me how and I will shut up

I'll tell you the trickery I am going through to deal with this (despite which I do NOT expect you to shut up).

 

I have a separate PQ which I run to send me the "inactive" ones in the same circle as a PQ that gives me the active ones. All goes into my off-line database as managed by GSAK. I filter for the same area that the PQ covered (my database covers a lot more ground than any of my PQs). Within GSAK I sort by last update. Then I have to do some manual work by looking at those caches online that did not get updated with the last PQ. Of these, some are unavailable, others are OK. It's usually easy to find the ones in this set where I have to manually toggle the archived status: they are usually the ones showing a couple of DNF indicators.

 

It is a klutzy work-around for which I would like a better solution, especially since it still means I have to manually interrogate the on-line database at a time when it is typically very busy and when I could be out caching.
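In case it helps anyone, the core of the check is no more than this tiny sketch (field names are my own invention, not GSAK's actual columns):

# Tiny sketch of the check described above: after loading the latest PQ,
# anything in the same area whose "last updated" stamp is older than that
# run is a candidate for manual review online. Field names are assumptions.
from datetime import date

def stale_candidates(local_caches, last_pq_run):
    """Return GC codes not touched by the most recent PQ."""
    return [c["code"] for c in local_caches if c["last_gpx"] < last_pq_run]

local = [{"code": "GC1111", "last_gpx": date(2004, 8, 18)},
         {"code": "GC2222", "last_gpx": date(2004, 8, 1)}]
print(stale_candidates(local, date(2004, 8, 18)))   # -> ['GC2222']: check this one by hand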

Link to comment

Marky and Keystone Approver:

Yes, I knew that option was available.

But because it serves all your offline needs doesn't mean that it serves everyone else's needs. It doesn't really serve mine, because I'd also like to have the logs.

 

When I started this thread I was hoping for (and expecting) a discussion where we could find ways to decrease the load on the servers so that it could go on providing this nice service in the future.

How naive!

It ended up with people telling us what we need and want, instead of discussing ways to improve the system so that we can get what we really want without making the servers collapse.

Oh well then, let's just let the snowball keep on rolling downhill until it hits the wall or collapses under its own weight.

Link to comment
let's just let the snowball keep on rolling downhill until it hits the wall or collapses under its own weight.

That is actually quite a tremendous size; I had one at about 8 ft in diameter when it hit critical mass.

 

now back on topic...

 

I think that any idea that decreases server load and serves the customer's wants/needs is a good idea worth discussing. I agree your idea has merit, but at the same time it may be that right now resources should be directed to just making things run better.

 

stale data is always a potential issue with PQ's

users stealing data is always a potential issue with PQ's

 

Maybe we could get one state download every month, and each week just set up a NEW-cache PQ to merge with that, and once a week get a recently archived cache list you could merge out of the main database.

 

That is how I would approach the problem.

 

Although there is another problem: how many of these PQs would need to be generated every week/month? This could potentially increase server load.

Link to comment
But tell me how to know which caches have been moved to temp inactive or archived so they can come off the list that a person is keeping.

 

I have ranted I don't know how many times about that. I don't know of a way to do that. Tell me how and I will shut up

I'll tell you the trickery I am going through to deal with this (despite which I do NOT expect you to shut up).

<snip>

I kind of do the same thing now. But I keep rerunning nearly the same PQs over and over.

 

I run my searches by date. That way I can monitor and run right up to the 500 limit without going over and not knowing what I missed, and I also don't have to worry about overlap.

 

I have a database with the date ranges I need to run the PQs and a number for each query. I then have some code that takes the PQ and dumps it into the database and adds the current date in a last update field.

 

I enter the query number and the program looks up the date range, then pulls all of the caches in that date range that were not updated on that date and sets them as archived. I don't actually make the online check that you do; I just assume that if it was not in the query it was archived.
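In pseudo-real terms, the flagging step looks roughly like this sketch (SQLite standing in for my database, and the table layout invented for the example):

# Sketch of the workflow described above: each PQ covers a fixed "placed
# during" date range; after importing it, any cache in that range that was
# not touched by this import gets marked archived. Table layout is invented.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE caches
              (gc_code TEXT PRIMARY KEY, placed TEXT, last_update TEXT, archived INTEGER DEFAULT 0)""")
db.executemany("INSERT INTO caches (gc_code, placed, last_update) VALUES (?,?,?)", [
    ("GC1111", "2004-02-01", "2004-08-20"),   # came back in today's PQ
    ("GC2222", "2004-03-15", "2004-08-13"),   # in range but missing from today's PQ
    ("GC3333", "2003-06-01", "2004-08-13"),   # outside this query's date range
])

def mark_missing_as_archived(range_start, range_end, run_date):
    db.execute("""UPDATE caches SET archived = 1
                  WHERE placed BETWEEN ? AND ? AND last_update < ?""",
               (range_start, range_end, run_date))

mark_missing_as_archived("2004-01-01", "2004-08-10", "2004-08-20")
print(db.execute("SELECT gc_code, archived FROM caches").fetchall())
# -> GC2222 flagged; GC1111 untouched; GC3333 left alone (a different query's range)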

Edited by GrizzlyJohn
Link to comment

Just a quick note here, as one of the "guilty" that has been querying large areas on a weekly basis. I also use GSAK and a PDA to maintain a map of caches for my area of geocaching activity. I can see at a glance what I have done and not done.

 

I travel quite a bit within a 200 mile radius of my home and as such, try to keep an active database of caches within that area. I never know where or when I am going someplace and it's a big help to me to be able to call up the caches in my database. One reason for my doing this is not being able to get PQ's when I ask for them. Average response time for a PQ request has been 3 hours, never mind times like this past weekend when I could not get any at all (3 days now with no response to my PQ's).

 

The use of GSAK allows me to have at least a reasonably current database of caches that are within a reasonable "reach" of my normal travel. Bear in mind that my "200 miles of travel" covers 7 states almost entirely.

 

Might I suggest offering GPX-format downloads of a PQ request's "preview" page? All I can get are the LOC files and I do not use them, except in dire need. However, I can always grab a couple of pages of caches from a preview without much complaint for those "quick, on the fly" caching runs.

 

If my PQ requests are a part of this problem, I will cease them. Just blame it on new user ignorance :blink:

 

Final note - this is not a flame, nor a complaint - just a thought.

Link to comment
Marky and Keystone Approver:

Yes, I knew that option was available.

But because it serves all your offline needs doesn't mean that it serves everyone else's needs. It doesn't really serve mine, because I'd also like to have the logs.

Oh well then, let's just let the snowball keep on rolling downhill until it hits the wall or collapses under its own weight.

What possible value could be gained from having every log in Norway? Are you telling me that you read every one? If you need recent logs from a specific location that you are planning on caching in, then just schedule a regular PQ for that area and merge it in to your database and you will be up to date for that area. I'm guessing that you could have a daily or weekly "updated in the last 7 days" PQ scheduled and then have a bunch of normal queries that cover various areas that you often cache that aren't scheduled. Just temporarily activate the normal PQ for the area you are going to be caching in and then deactivate it once it runs.

 

Even though what you are doing is not what the PQs are intended for, I'm not saying that you are wrong to do so. I am trying to give you "server friendly" options that can be done with the current PQ system and should work for you (if your main intent is to go out and find geocaches).

 

I don't think I'll further respond to this thread since we seem to be rehashing the same arguments and you don't seem to think my server friendly solutions will help you.

 

Cache on! :blink:

 

--Marky

 

P.S. If you just like to read logs, why don't you just put every Norway cache on your watch list? Then you'd be reading all the logs! :blink:

Link to comment

Leaving aside the religious debate about how many caches people ought to have access to offline, what about the technical side...?

 

It strikes me that the GC.com servers are struggling to cope with demand, and that the bottlenecks are probably CPU / database access limited, rather than bandwidth limited. Correct? People frequently running large PQs is very CPU/database intensive (it needs a separate server, which itself seems to be struggling), but lots of paying customers seem to want to do this, for whatever reasons they may have. (Here's my reason)

 

So what to do? Three options spring to mind:

  • 1) Slap the wrists of paying members for downloading data too enthusiastically and make them feel bad for downloading caches they'll probably never do.
  • 2) Pre-generate a static GPX file for each country / state, allowing members to mirror the whole of their area with minimal CPU/database hit on your servers.
  • 3) Allow premium members to share GPX files between themselves.

Seems to me that options 2 and 3 are both better server-sense and better business-sense.

Edited by Teasel
Link to comment
Just a quick note here, as one of the "guilty" that has been querying large areas on a weekly basis.  I also use GSAK and a PDA to maintain a map of caches for my area of geocaching activity.  I can see at a glance what I have done and not done.

 

I travel quite a bit within a 200 mile radius of my home and as such, try to keep an active database of caches within that area.  I never know where or when I am going someplace and it's a big help to me to be able to call up the caches in my database.  One reason for my doing this is not being able to get PQ's when I ask for them.  Average response time for a PQ request has been 3 hours, never mind times like this past weekend when I could not get any at all (3 days now with no response to my PQ's). 

 

My average PQ response time is around 5 minutes. Why is mine so much faster? Because I never schedule PQs unless I really need them (i.e. I am going caching in a specific area). Because my queries haven't run in maybe a week, they get a higher priority and fire off quickly. Because I make PQ requests from my home coordinates more often, I have three identical PQs. When I want to run one I pick the one that is the "oldest" so that it will fire off sooner. It works very well for me. (I do remember the days of having more like 3 hour response times when I was scheduling my PQs to run daily.)

 

--Marky

Link to comment

Back to the original post wwwwaaaayyyy back at the top of this thread: for only 800 caches in the whole country, I can't see why you would need more than the 5 a day, for that matter 2 or 3. As for updating them daily, from the others I have spoken with or seen posted, there is not much point in updating a query more than once a week.

 

This is based on a couple of realities I have noticed in my short time GCing (I have just over 170):

 

1. It may be days, many days, before some of us post our finds. Unless I am placing or retrieving a TB I just do it when I get a chance.

 

2. Unless you are hot and heavy to find a Jeep or particular TB, there is just not that much happening to a cache to worry about.

 

3. Set up a daily PQ for new placements. I'm not sure, but I think the limit on the PQ's is # caches, not mileage.

 

I admit, I have a number of Q's set up because I travel a given area and am on a grab-when-I-can basis; however, none is updated more than once a week. If I know I am traveling to a specific area, I do a quick PQ and find that I have rarely waited more than 4 hours, generally less than a half hour, for the query to arrive.

 

I do fall on the side of not seeing a reason to change it. There have been instances of people trying to recreate all the work that was done here through misuse of the system, so it is not unreasonable to expect there to be a "circling of the wagons" type mentality on some things.

 

When all is said and done, geocaching.com can set it up how they want. They do a very good job of trying to accommodate the majority when it falls within reasonable guidelines. It appears on this they have the best of both worlds: they are going with the majority, which happens also to fall within their self interest.

Link to comment
But tell me how to know which caches have been moved to temp inactive or archived so they can come off the list that a person is keeping.

 

How I filter out the caches that have been archived or inactivated is by using GSAK by doing a quick sort by last update date. I use the same pq each time showing only active caches. Any that don't make it in my pocket query are either archived or inactive... I toggle them manually to archived in GSAK. Takes me a minute or two to manually update.

Link to comment
But tell me how to know which caches have been moved to temp inactive or archived so they can come off the list that a person is keeping.

 

How I filter out the caches that have been archived or inactivated is by using GSAK by doing a quick sort by last update date. I use the same pq each time showing only active caches. Any that don't make it in my pocket query are either archived or inactive... I toggle them manually to archived in GSAK. Takes me a minute or two to manually update.

I find that when I go through this process and check each cache that I am about to toggle to archived status, some are active. I know that it makes no sense, but that's what I am getting. BruceS, do you look on-line at the ones you manually toggle and find that they are all truly inactive or archived?

Link to comment
How I filter out the caches that have been archived or inactivated is by using GSAK by doing a quick sort by last update date. I use the same pq each time showing only active caches. Any that don't make it in my pocket query are either archived or inactive... I toggle them manually to archived in GSAK. Takes me a minute or two to manually update.

Oh yea we do almost the same thing. You use GSAK (which from what I have seen is actually a very good program), I do it within some code and database that I have written. The end result is the same.

 

But don't you feel that getting the same PQ, with almost the entire set of data the same, is a bit of a waste? I just think it would be less wasteful to be able to get a list of archived and inactive caches that could be run against one's current list. Then all you would need is to get a PQ of the new ones every week or so to keep your data up to date. Yes, I know you cannot be sure to get all of the logs, but the current system does not allow for that either. You only get the last five, which for busy caches may cover less than a day.

 

Let's say that half of the caches get archived or go inactive over the course of the year. I don't know, I am just making up a number to make a point. To make it easier, let's say it is actually 52% in one year. So every week 1% of the caches would go to archive. So if you have a PQ that gets 500 caches, on average 495 of them will be the same every week. Don't jump all over me, I know there are some basic flaws in the numbers, but I think you see the point.

 

Yes there are ways to do this with third party applications. But my point is don't do the complaining dance about how much hardware has to be thrown at this because of the resources used. Or complain about people that are grabbing more caches than some think they will ever need. And totally ignore ways to make it work better. There is simply no good reason I can see not to implement the many suggestions that have been made over time here.

 

And I would also make the point that if the system was more efficient that in the end it would work better for everybody. Because people are still going to get the data they want even if it is not "healthy" for the gc.com system. If people were offered a better way to do things they would of course take advantage of it. So many people have come up with very creative ways to work around the shortcomings of the system here only because they have to. I don't think they want to and would not if given the choice. Which in the end will put less of a strain on the resources here.

 

So I guess for all those that disagree or don't understand, this is yet again another rant. But click your mind off autopilot for more than a second and tell me that it is an unreasonable position, and why. And no, "because I don't need that data", "because I have only found six caches", "because I don't make three-hour hikes to find a cache", "because I have only been here two months", etc. are not reasons. If the sum change to you is zero then why would you care? And if it makes other cachers happy and allows them to enjoy the sport/hobby/game more, however they want to play, shouldn't that be a good thing?

Link to comment
I find that when I go through this process, and check each cache that I am about to toggle to archived status, that some are active. I know that it makes no sense, but that's what I am getting. BruceS, do you look on-line at the ones you manually toggle and find that they are all truly inactive or archived?

I have noticed this a few times. Most of the time I have found it to be those that were on the edge of the search area before and now have fallen off the list due to new caches. This doesn't seem to happen on my home area searches as I find them faster than they are placed. I do check the on line status as so far I have had fewer than 10 to check for the week.

Link to comment
But don't you feel that getting the same PQ with almost the entire set of data the same to be a bit of a waste? I just think it would be less wasteful...

Wasteful. Now there is a concept. I have doubled the monthly miles I put on my car since I started geocaching. I cannot imagine that most geocachers are not "wasting" substantial resources geocaching (but that is probably another thread).

 

As for the PQs being wasteful: they run on a computer which is specifically designed for it, and yesterday that machine was able to do all of those scheduled for Sunday and Monday in one day. There is no capacity problem. The problem is with transmission. As long as the programmers are not serious about improving efficiency by giving us the tools to better target our queries, why should we be?

 

Looking up a handful of caches at the web site is far more wasteful in terms of the limiting resource: every graphic icon and map uses tens of thousands of bytes. The transmission of a PQ as a zip file does that with a small fraction of bandwidth.

 

What probably should really happen is to establish an East Coast and European mirror site for the web site.

Link to comment

I don’t know why, but I would like to see all of the caches in the 3 surrounding states. Maybe it goes back to being a kid and staring at maps. Just thinking of all the stuff out there made the mind wander.

 

I may never get to them all, but having them listed out may help me plan a weekend. Also, having all the info on my laptop is kinda cool when I travel. The more you have in the PQ, the easier it will be to find the clusters of caches that would make a fun weekend trip.

 

Anyway, I think allowing members to share .gpx files would be a great idea, if the problem is getting too many caches taxes the servers.

 

If it is a matter of the data being proprietary, then I see the point; I don’t agree, but I see the point.

 

Just my 2¢

Link to comment
And I would also make the point that if the system was more efficient that in the end it would work better for everybody. Because people are still going to get the data they want even if it is not "healthy" for the gc.com system. If people were offered a better way to do things they would of course take advantage of it. So many people have come up with very creative ways to work around the shortcomings of the system here only because they have to. I don't think they want to and would not if given the choice. Which in the end will put less of a strain on the resources here.

Everyone has unique needs and unique circumstances. What works for Marky or Mopar or whoever, probably won't work for everyone, and certainly doesn't work for me. I, myself, never know when I'll be called out 10, 100, 200 miles from home with no advance notice and no way to update my pocket queries before having to leave. I live in San Jose, might be in Salinas, and then get called immediately to Menlo Park -- no chance to go home and get updated PQs. That's exactly what happened yesterday as a matter of fact. Without my large number of queries I wouldn't have been able to get the one cache in both those areas that I managed to have time for.

 

I carry data for 900+ caches in my Palm Pilot and GPS for areas ranging from Sacramento to Merced, from San Francisco to Tahoe. That's a lot more caches than will fit in one pocket query, or even gathered in five queries. My solution was to pay for a second membership so I could have a second set of five queries run whenever I need it. Would it be easier on the system if I could query only once a day all caches in the middle third of California? Probably. Do I like using up a lot of resources? Nope. But the system doesn't provide what my unique needs require (nor am I asking it to) so I am forced to be creative and come up with my own solution. I experience no guilt either because I pay twice as much as everyone else.

 

There's too much complaining here about how people should do this, and how people shouldn't do that. If some guy wants to download every cache in his country every day, let him. That's what the membership fee is for, to pay for resources and services.

 

Want to save some system resources? Create a text-only version of the gc website. Stop obfuscating the URLs so that powerusers are forced to navigate through the site, wasting time and bandwidth. Allow more than 20 search results, set perhaps by user preference. Switch to LAMP, away from .NET. Allow for more advanced yet simpler queries: instead of returning 500 caches within 500 miles of a coordinate, allow for the possibility of returning ALL caches within a rectangle of coordinates, supplying the upper left and lower right. Not only is that a faster query, but it reduces the need for overlapping PQs -- another waste.
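To illustrate the rectangle idea: a bounding-box test is just two comparisons per axis and needs no distance math at all (coordinates below are rough illustration values; a real implementation would also have to handle boxes that cross the 180° meridian):

# Sketch of the rectangle query proposed above: a bounding-box test is two
# comparisons per axis, no trigonometry, and naturally avoids the overlapping
# circles that radius-based PQs force. Coordinates are illustrative only.
def in_box(lat, lon, upper_left, lower_right):
    (top, left), (bottom, right) = upper_left, lower_right
    return bottom <= lat <= top and left <= lon <= right

caches = [("GCAAAA", 37.5, -121.9), ("GCBBBB", 39.6, -119.8)]
box_ul, box_lr = (38.5, -123.0), (36.5, -121.0)     # roughly the SF Bay Area
print([c for c, lat, lon in caches if in_box(lat, lon, box_ul, box_lr)])
# -> ['GCAAAA']; GCBBBB (Reno-ish) falls outside the rectangle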

 

I could go on and on...

Link to comment
This topic is now closed to further replies.