
Buxley New Caches Not Being Updated ?


vds


I notice that the Buxley site shows no new caches for the last couple of weeks in the Seattle area. Is there some kind of firewalling going on between geocaching.com and the Buxley folks, where their crawler is being blocked from identifying new caches?

Link to comment

Too bad "normal programming" doesn't also include removing caches from his maps when they have been archived unless you email him. I never asked him to list my caches there, so why do I have to track him down and beg him to remove them when they are archived?

Link to comment

I just use it to get an idea of new ones overlaid on the maps, but use custom queries and GSAK to do the detailed work.

 

But yes, the maps really help if you do a trip down the interstate to a new place and want to plan your trip...

Link to comment
I emailed him when I noticed the GA map wasn't updating and he said he was having some technical difficulties on his side :anibad:

 

Zack

I would imagine he is.

Why? It'd be a shame if gc.com is now somehow blocking it out. His maps are far superior to the ones on here.

 

--RuffRidr

I'd say it's about time they block him. I'd imagine he's putting an unneeded strain on the already overworked servers.

Link to comment

Well, if the two don't have an arrangement, they should. Buxley's maps are perfect when planning a multi-state road trip. GC.com queries just don't cut it for the to and from drives, and take way too long to edit by hand. I click along the roads I'll be travelling on using Buxley's maps and add the caches waypoint by waypoint to form my own query. It's perfect. I would be very upset if this were not available to me.

 

To Admin: Please, until GC.com actually has something equal and operational in place, don't block out others who provide valuable services to us cachers. Thank you!

Link to comment
Why?  It'd be a shame if gc.com is now somehow blocking it out.  His maps are far superior to the ones on here.

My understanding is that gc.com recently tightened up their throttling software. This has impacted several stats sites.

 

Personally, I don't think it is the limited number of such sites that were crushing the servers. There are far more people running their own bots to watch for new caches and bug drops in hopes of getting FTF. The people at the stats sites understand the concept of limiting their page queries. Joe Overload often doesn't.

 

-WR

Link to comment
I'd say it's about time they block him. I'd imagine he's putting an unneeded strain on the already overworked servers.

I'd imagine that you are wrong. Jeremy has already stated that there is plenty of bandwidth - that the site performance woes have to do with the number of simultaneous users.

 

For planning trips Buxley's maps are the best around by a long shot, I hope that they start getting updated again soon. If not it would be an unfortunate loss for the geocaching community.

Link to comment

he's putting an unneeded strain on the already overworked servers

Yeah, right. I could imagine all of us individually would put much greater load on the servers doing the same kinds of things as Buxley's, but often w/o the skills, on our own over and over again.

Well, one more reason not to geocache in other states and on the road. As if the proliferation of guardrail and lamppost caches wasn't an invitation to give up already.

Link to comment
Let's stay on topic please.

This is about problems that Buxley's site is having on their end. Not throttling issues.

I would think Buxley gets the data for his maps the same way stats sites get their stats. If the other stats sites are throttled, chances are Buxley's is too. I read how one stats site was throttled; they were ONLY hitting the database 10,000+ times a day. 6-7 stats sites like that one would put as much load on the website as every single real live person in the world.

Surely other people have noticed that about the same time the stats people started complaining about no stats, the rest of us STOPPED complaining about the website slowing down. Last week was the busiest week EVER, I think. There were over 82,000 logs, and the site barely stumbled. Good job throttling back the 1% of people using 50% of the resources.

Link to comment

I know it's fun to blame things on evil page scrapers, but I've spoken with "Buxley" many times through the years and he's not a doofus. Please don't dump him in the 10,000 page a day bucket.

 

Whether he's been throttled or whether there's something else going on is just speculation at this time as I don't see either side having made a statement yet. I just wanted to point out that Ed's site was (at least at one time) doing what it did in a way that had an exceedingly low resource requirement; far lower than the impact of the humans that it served each manually generating similar requests.

Link to comment

Am I the only one who thinks that these stats sites can take some pressure off the Geocaching.com site? If a stats site looks at geocaching.com once a day and then shows stats and the latest logs to its users, those users no longer need to surf Geocaching.com all the time to see what has happened with their nearby caches, etc...

Link to comment
I know it's fun to blame things on evil page scrapers, but I've spoken with "Buxley" many times through the years and he's not a doofus.  Please don't dump him in the 10,000 page a day bucket. 

 

Whether he's been throttled or whether there's something else going on is just speculation at this time as I don't see either side having made a statement yet.  I just wanted to point out that Ed's site was (at least at one time) doing what it did in a way that had an exceedingly low resource requirement; far lower than the impact of the humans that it served each manually generating similar requests.

No, we don't know if Buxley is throttled or not. I do know many of those stats sites have been throttled. I even see you are an opt-in member of one of those "evil 10,000 page a day" stats sites yourself.

Due to recent changes in Groundspeak policy, fully automated collection of stats is unavailable at this point in time. Stats are presently being collected from the "recent finds by" page on geocaching.com, so most new logs will be added to the queue within 20 minutes of their entry on geocaching.com. The real "magic" of the stats system, the ability of the stats system to "self-reconcile" to correct invalid information has unfortunately been disabled for the time being. As a result, you will need to use the form below to submit any corrections and/or exclusions for inclusion in the stats database (basically we are reverting back to Dan's old style of things for the moment).

 

New members are most affected by this change as the system no longer has any mechanism to locate logs that were entered prior to the member being included in the stats system. Additionally, our data rate has been reduced from one page per 10 seconds to one page per 3 minutes in order to stay within Groundspeak requirements. I am aware of the unfortunate slowness and number of errors these changes will result in. Attempts are being made to resolve these issues with Groundspeak at this time.

 

So, the stats site that Robert joined (and many other people, I'm not really singling him out here) was hitting the site every 10 seconds. 6 times a minute. 360 times an hour. 8640 times a day. 60,480 times a week! On top of that, the "magic" to self-reconcile errors and add old logs for new members meant it had to constantly drill down through a member's finds and hides to see if something had been missed. In Robert's case, that's quite a few more pages to check. The reason it's done like this, I'm told, is because they want something that GC.com doesn't offer. I'm sorry, but anywhere else in life, if someone has something you want, and they won't give or sell it to you, that does NOT give you the right to take it. The same people who I'm certain would never think to take something from their neighbor's yard because "I want it" have no problem taking something from Jeremy and company, and from the rest of the users of this website.
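
The arithmetic above can be checked with a short Python sketch. It assumes the worst case the post describes, a scraper fetching one page at a fixed interval around the clock; `max_requests` is a hypothetical helper written for this illustration, not part of any stats site's code. The 3-minute and 15-minute intervals are the other figures quoted in this thread.

```python
# Hypothetical helper: convert a fixed fetch interval into the largest
# number of page requests it allows if the scraper never pauses.
SECONDS_PER_DAY = 24 * 60 * 60


def max_requests(interval_seconds: int, days: int = 1) -> int:
    """Upper bound on requests when one page is fetched every interval_seconds."""
    return (SECONDS_PER_DAY * days) // interval_seconds


for label, interval in [("10 seconds", 10), ("3 minutes", 180), ("15 minutes", 900)]:
    per_day = max_requests(interval)
    per_week = max_requests(interval, days=7)
    print(f"one page per {label}: at most {per_day} per day, {per_week} per week")
```

These are ceilings only. A scraper that runs in bursts or only during off-peak hours would come in below them.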

Edited by Mopar
Link to comment
our data rate has been reduced from one page per 10 seconds to one page per 3 minutes in order to stay within Groundspeak requirements. I am aware of the unfortunate slowness and number of errors these changes will result in. Attempts are being made to resolve these issues with Groundspeak at this time.

 

So, the stats site that Robert joined (and many other people, I'm not really singling him out here) was hitting the site every 10 seconds. 6 times a minute. 360 times an hour. 8640 times a day. 60,480

Nice hyperbole. Your math is predicated on a false assumption that it does this around the clock without bounds or limits. The above describes the maximum burst rate. The actual fetch rate is far less than you're fussing about and during off hours at that. The full page refetch is a slow drip and the unit of measure is "weeks" not "seconds".

 

I'll emphasize that Buxley was once able to provide his "value add" tiger-based maps by pulling two pages from this site. So even if he provided them hourly around the clock (and I don't know his update frequency) that would have been under 50 page views a day. The typical FTF hound could easily generate way more load than that just clicking reload all day long. Let's make a stretch assumption and say that at least two such people used Buxley's site in a day. Result: fewer pages delivered by the site.
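
A rough sketch of that comparison, in Python. Only the "two pages, hourly" figure comes from the post above; the reload pattern for the FTF hound (one reload every 5 minutes for 8 hours) is an assumption made up for illustration.

```python
def daily_page_views(pages_per_visit: int, visits_per_day: int) -> int:
    """Total pages requested from the site in one day."""
    return pages_per_visit * visits_per_day


# From the post: two pages per update, updated hourly around the clock.
mapping_site = daily_page_views(pages_per_visit=2, visits_per_day=24)

# Assumed, not from the thread: one person reloading a single page
# every 5 minutes (12 times an hour) for 8 hours.
ftf_hound = daily_page_views(pages_per_visit=1, visits_per_day=12 * 8)

print("mapping site:", mapping_site, "pages/day")
print("one FTF hound:", ftf_hound, "pages/day")
print("two FTF hounds:", 2 * ftf_hound, "pages/day")
```

Under those assumed numbers the mapping site stays under 50 page views a day, while each manual reloader generates about twice that.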

 

 

I wish the site would open a dialogue with the programmers capable of offering front-end services. Sites like Buxley, the various local groups offering maps of their area, the cell phone delivery services, and so on should be treated as friends (free programming catering to wants of end users) instead of enemies. Believe me, most of us can think of more desirable ways of doing that than reading pages formatted for humans.

Link to comment
I wish the site would open a dialogue with the programmers capable of offering front-end services. Sites like Buxley, the various local groups offering maps of their area, the cell phone delivery services, and so on should be treated as friends (free programming catering to wants of end users) instead of enemies. Believe me, most of us can think of more desirable ways of doing that than reading pages formatted for humans.

I agree wholeheartedly. The programmers setting up these "value-added" services are just trying to help out the geocaching community, not hinder it. Calling them thieves is a very big stretch in my opinion.

 

--RuffRidr

Link to comment
Both the Danish and the English stats sites have stated publicly that they have been permanently banned from geocaching.com. They say their IPs are blocked by geocaching.com and they can't surf here...

Hedberg,

 

Firstly I wasn't positive if your info was about some stats sites, or about Buxley's Waypoint.

 

If the latter is indeed true, then is this info posted anywhere? Of course when CO Admin mentioned that Buxley's was "having technical difficulties on their end" and refused to answer my specific questions, then the Admin's silence seemed more eloquent than any words.

 

But I wonder if it is already time for mourning, and for revenge? Has the death certificate for Buxley's been finalized? Add'l info still appreciated!

Edited by MOCKBA
Link to comment
But I wonder if it is already time for mourning, and for revenge?

Revenge? What are ya gonna do, letterbomb Groundspeak because they only let page scrapers hit the site once every 3 minutes (according to this page, which RobertLipe says is not accurate)?

 

People, it's a friggin GAME!

Edited by Mopar
Link to comment
Due to recent changes in Groundspeak policy, fully automated collection of stats is unavailable at this point in time. Stats are presently being collected from the "recent finds by" page on geocaching.com, so most new logs will be added to the queue within 20 minutes of their entry on geocaching.com. ...

It also says (out of date now of course)

 

Doesn't the data collection DOS attack Geocaching.com?

No.  Most of the statistics data is collected by a process that runs at a rate of one web page per 15 minutes.  Other processes will never exceed a rate of one page per 10 seconds and always run during off-peak hours.  Great amounts of time were put into designing the collection process so that it would not be a burden on Geocaching.com.  If the data could not be collected humanely, we would not have created this site.

Link to comment
Due to recent changes in Groundspeak policy, fully automated collection of stats is unavailable at this point in time. Stats are presently being collected from the "recent finds by" page on geocaching.com, so most new logs will be added to the queue within 20 minutes of their entry on geocaching.com. ...

It also says (out of date now of course)

 

Doesn't the data collection DOS attack Geocaching.com?

No.  Most of the statistics data is collected by a process that runs at a rate of one web page per 15 minutes.  Other processes will never exceed a rate of one page per 10 seconds and always run during off-peak hours.  Great amounts of time were put into designing the collection process so that it would not be a burden on Geocaching.com.  If the data could not be collected humanely, we would not have created this site.

Well, it sure seems to me that throttling these sites back to one page every 3 minutes (why is 3 minutes such a huge problem if it's mainly one page every 15 minutes now?) has made a HUGE difference in site performance. Like I said earlier, the last 2 weekends have had the most logs ever. Not long ago Jeremy was touting breaking 50,000 logs a week; the last 2 weeks have been over 80,000. No new changes except the throttling software appear to have been made; as a matter of fact, a recent post mentions the new server was backordered. 2 of the busiest weekends ever, and yet the site has run better than any weekend for the last 3-4 months.

I could be wrong, but I suspect most of the 16,344 people who logged in the last week, if given a choice between stats and cache maps that are almost useless for planning since they still show archived caches and caches they already found; or being able to actually use this website, would side with me on which one they want.

Edited by Mopar
Link to comment

Mopar,

 

Where can I find out more about this so-called "throttling" software (or data scraping blockers, or whatever)?

 

Your posts were the first ones I saw mentioning this, although I've seen a few others since then and I'm not sure if they're based on your statement or what.

 

Of course - in response to the original post - I'd say ask Buxley - if he wants to tell you he will. (Shame they don't have forums there, do they?)

 

southdeltan

Link to comment
Well, it sure seems to me that throttling these sites back to one page every 3 minutes (why is 3 minutes such a huge problem if it's mainly one page every 15 minutes now?) has made a HUGE difference in site performance. Like I said earlier, the last 2 weekends have had the most logs ever. Not long ago Jeremy was touting breaking 50,000 logs a week; the last 2 weeks have been over 80,000. No new changes except the throttling software appear to have been made; as a matter of fact, a recent post mentions the new server was backordered. 2 of the busiest weekends ever, and yet the site has run better than any weekend for the last 3-4 months.

I could be wrong, but I suspect most of the 16,344 people who logged in the last week, if given a choice between stats and cache maps that are almost useless for planning since they still show archived caches and caches they already found; or being able to actually use this website, would side with me on which one they want.

What you're leaving out is that there are probably dozens of other site-scraping webpages that were affected by this change as well. I'm sure most of these weren't nearly as nice on the site as Buxley's and the MTGC page were. Lumping Buxley's and the MTGC site with these and blaming them for the weekend's congestion is simply not fair.

 

The question I want to know is why can't GC.com work with these two sites so everyone is happy?

 

--RuffRidr

Link to comment
Mopar,

 

Where can I find out more about this so-called "throttling" software (or data scraping blockers, or whatever)?

 

Your posts were the first ones I saw mentioning this, although I've seen a few others since then and I'm not sure if they're based on your statement or what.

 

Of course - in response to the original post - I'd say ask Buxley - if he wants to tell you he will. (Shame they don't have forums there, do they?)

 

southdeltan

First I saw it mentioned was here, in my regional forum. That post points to a message on a stats site that explains the throttling and says "our data rate has been reduced from one page per 10 seconds to one page per 3 minutes in order to stay within Groundspeak requirements". Since it was posted by the guy running the stats site, I assumed it was correct, but several people here tell me despite what that website's owner said, the info there is inaccurate, so who knows. All I know is that since that was posted, the website has for the most part run fine on weekends, with very few timeouts.

Edited by Mopar
Link to comment
What you're leaving out is that there are probably dozens of other site-scraping webpages that were affected by this change as well. I'm sure most of these weren't nearly as nice on the site as Buxley's and the MTGC page were. Lumping Buxley's and the MTGC site with these and blaming them for the weekend's congestion is simply not fair.

I would think if they were not part of the problem, they wouldn't be having a problem now. If they are only hitting the site every 15 minutes as someone contends, how would limiting them to once every 3 minutes be a problem?

Link to comment
I would think if they were not part of the problem, they wouldn't be having a problem now.

You mean like another time that GC.com swatted flies with sledgehammers (forum signature graphics)?

 

If it's a simple case of IP filtering, you *could* just shut down every .nu IP from ever accessing the website here. Does that solve the scraping/overusage problem from cachestats.nu ... sure. But it also includes people who were not part of the problem. Your statement's validity is dependent on the method applied to handle the automated accesses (which has not been announced by Jeremy et al. for us to know).

 

Given past actions, my guess is that a very broad policy was just implemented to assure that the guilty were among all of the usual suspects that are now in custody. Whether the innocent go free remains to be seen.
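
The difference between a narrow block and a broad one can be sketched in Python with the standard `ipaddress` module. Everything here is hypothetical: the addresses are reserved documentation addresses, and nothing in this thread says how the site's filtering actually works.

```python
import ipaddress

# Documentation-only addresses; neither belongs to a real site or user.
scraper = ipaddress.ip_address("192.0.2.10")    # hypothetical heavy scraper
bystander = ipaddress.ip_address("192.0.2.77")  # hypothetical ordinary user nearby

narrow_block = [ipaddress.ip_network("192.0.2.10/32")]  # just the one address
broad_block = [ipaddress.ip_network("192.0.2.0/24")]    # the whole surrounding range


def is_blocked(addr, blocklist):
    """True if addr falls inside any network on the blocklist."""
    return any(addr in net for net in blocklist)


for name, blocklist in [("narrow", narrow_block), ("broad", broad_block)]:
    print(name, "block -> scraper:", is_blocked(scraper, blocklist),
          "bystander:", is_blocked(bystander, blocklist))
```

Both lists stop the scraper, but only the broad one also catches the bystander, which is the point about the method mattering.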

Link to comment

Nobody seems to be "shut down" from reading local forums. If you start hitting the website too hard (like once every 10 seconds) it throttles you back to only getting a page once every 3 minutes. You aren't banned, just slowed down. This of course is all guesses based on info from the guys getting throttled. I don't think Jeremy has weighed in on this with actual facts yet.
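
As a minimal sketch of the behaviour guessed at above (and only a guess, not Groundspeak's actual software), the Python below flags a client whose requests arrive 10 seconds apart or less and then serves it at most one page per 3 minutes. The class name, its parameters, and the rule that a flagged client stays flagged are all assumptions for illustration.

```python
class Throttle:
    """Illustrative throttle: slow down clients that request pages too fast."""

    def __init__(self, fast_gap: float = 10.0, slow_gap: float = 180.0):
        self.fast_gap = fast_gap    # gaps this short count as "hitting too hard"
        self.slow_gap = slow_gap    # throttled clients get one page per slow_gap seconds
        self.last_request = {}      # client -> time of last request, served or not
        self.last_served = {}       # client -> time a page was last served
        self.throttled = set()      # clients currently slowed down (never cleared here)

    def allow(self, client: str, now: float) -> bool:
        """Return True if the page should be served to client at time now (seconds)."""
        previous = self.last_request.get(client)
        self.last_request[client] = now
        if previous is not None and now - previous <= self.fast_gap:
            self.throttled.add(client)
        if client in self.throttled:
            served = self.last_served.get(client)
            if served is not None and now - served < self.slow_gap:
                return False  # slowed down, not banned: try again later
        self.last_served[client] = now
        return True


throttle = Throttle()
for t in (0.0, 10.0, 20.0, 180.0):
    print("bot at", t, "->", throttle.allow("bot", t))
for t in (0.0, 60.0, 120.0):
    print("human at", t, "->", throttle.allow("human", t))
```

The timestamps are passed in rather than read from a clock so the behaviour is easy to follow. A real throttle would also need some way to release clients that slow down again.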

Link to comment

As I’ve read from the Buxley site, Geocaching.com won’t let him work with their site. Why? I don’t think it’s a question of taking money from GC.com. That site was the first one I used when I started caching. I prefer Buxley’s maps for my area. Someone must be spoiled and wants to take their ball and go home.

Link to comment

Buxley has finally mentioned why his site has not been updated: apparently geocaching.com is blocking his site. He has also asked people to write to Jeremy to lift the block.

 

I personally think Buxley's site is great; however, if Buxley's site is truly affecting the performance of gc.com, I can understand why the site is blocked.

 

Last year there was another site that offered a similar service, except it emailed/paged you about new caches (XX miles from coordinates). That service is no longer available... it was great for FTFs. It would be great if gc.com could offer the same services as Buxley and the paging service.

Link to comment

:blink: I live in Bremerton/Kitsap County, which is West of Seattle and in the "outback" of Puget Sound, and I have relied upon the Buxley/Brillig website for much of my information about new cache locations in my area and others.

 

:huh: I do not understand why Groundspeak sees fit to terminate access to their system.

 

:huh: Is there something going on behind the scenes about which us Geocachers are not being told?

 

Whatever has to be done to solve the problem,

:ninja: JUST BRING BACK THE BUXLEY/BRILLIG ACCESS! :ninja:

That site provided a service that was lacking at GC.COM.

Edited by Fledermaus
Link to comment
  :huh: Is there something going on behind the scenes about which us Geocachers are not being told?

 

I'm sure there is a lot going on behind the scenes that we are not being told. I, for one, really do not want to know how much TP they run through in a day, or the temperature on the server at any given time. I get the impression from the :blink: that you think there is nefariousness going on. I submit that such assumptions are silliness. It *might* be nefarious if they played around with public property or property that was not theirs. The gc.com database IS theirs... the caches are not, but the submitted listings are. So are the servers they are running from. If these things are so important, Navicache or another site can allow it, or if enough people request it, it can be made to happen here at gc.com. See the pinned cache-route thread at the top of the geocaching.com website discussion.

Link to comment

The last time Groundspeak tried to ban Buxley, it made Slashdot news; now they are apparently feeling stronger. What a shame. With no maps outside the US, Buxley and the regional sites that are "scraping" coords from gc.com pages are the only options for visualizing caches in the area.

Oh and I assume that the timing of the fix which prevents nonpaying users from zooming gc.com maps was just coincidental.

Link to comment

It would be nice if GrSpk would enter into an agreement with a couple of outside entities like Buxley and maybe a comprehensive stats site and let them grab what they need. That way there won't be a need for most of these leech apps to pound on GC.COM all day and everyone will have their maps and stats and be happy.

 

As long as they aren't out to make a buck off GC.COM, but only to provide features that GC.COM can't or won't, I don't see the harm.

Edited by briansnat
Link to comment
It would be nice if GrSpk would enter into an agreement with a couple of outside entities like Buxley and maybe a comprehensive stats site and let them grab what they need. That way there won't be a need for most of these leech apps to pound on GC.COM all day and everyone will have their maps and stats and be happy.

 

As long as they aren't out to make a buck off GC.COM, but only to provide features that GC.COM can't or won't, I don't see the harm.

I agree

Link to comment
As long as they aren't out to make a buck off GC.COM, but only to provide features that GC.COM can't or won't, I don't see the harm.

I agree also. This is what irritates me most. If GC.com were to provide the same functionality, then it would be no big deal. But no, they shut down all these sites and just leave us hanging. Or they promise us that that feature will be here some day. How long now have we been waiting for a new cache notification service?

 

Frustrated,

--RuffRidr

Link to comment

I guess it's just me, but my opinion of Buxley's has changed in 3 years. Back then, when there were few caches and I had few finds, I found the site somewhat useful. Now, if I look at his maps for my stomping grounds, they are useless. There are thousands of caches within 100 miles of me, and Buxley doesn't allow me to zoom down close enough to separate many of them. I often can't even easily click down to a zoom level at all: since it's a big blob of caches, if I click anywhere I get a cache, not a zoom level. My second gripe is that it doesn't remove archived or disabled caches. So after 3 years, many of the caches he's showing are no longer there. It's a PITA to click each one and see if it even still exists to get an idea of a route. Which brings me to my 3rd problem. It doesn't know what caches I've found. After 3 years I've found a few caches, and to use Buxley's for any sort of planning I need to click each cache, scroll to the bottom and show all logs, then search them all to see if I'm there. In most of my frequent caching areas, I've already found 1/2 the caches there or more, so it's a real pain (and a waste of gc.com resources) to figure out what caches to do via Buxley's. The GC.com maps (at least in the US) are so much better for that, and personally, using the PQs and my own local mapping is better yet. I just see active caches I haven't found, I click the ones I want to do, and let the software route them out.

Now, I could see that when Buxley started and gc.com didn't have cache maps there was a need, and there still may be a need outside the US where GC is lacking in maps, but for most users his site no longer serves a useful purpose. Despite what he says about being "banned", I suspect his maps collected data the same way the stats scrapers did, and he has the same problem with being throttled that they have. I for one am glad to see it. Since they started cracking down on the site scrapers, the site is running faster and smoother than it has in ages. For the first time all summer the site is actually usable on weekends when most of us want to use it. If Buxley's maps were part of that problem that's now fixed, I'm glad to see them go.

Link to comment
This topic is now closed to further replies.