hedberg

The Servers Overloaded!?


Sometimes the GC.com servers are very sloooooooooooooow. Are there really so many cachers active right now, or is there some other reason?

 

Just curious...


In that case there must have been some drunk cachers on weekdays earlier this week too [:P]

 

It has happened to us a few times over the last couple of days...


We have had some recent problems with people hammering the site with automated bots, which causes performance hits. We treat them like denial of service attacks and kick them out when we find them.

We have had some recent problems with people hammering the site with automated bots, which causes performance hits. We treat them like denial of service attacks and kick them out when we find them.

I would actually go a little farther than that, if I were saying that...

 

"We treat them as denial of service attacks and kick them out when we find them."


Today seems really bad. Taking an average of three tries to get by the server error messages, and then another minute or so to actually load the pages.

We have had some recent problems with people hammering the site with automated bots

Ok. Maybe someone can explain this to me. Why would anyone want to "Hammer the site with automated bots"?? Is this some sort of data collection deal, or did we really tick off someone and they are trying to take the servers down in revenge??

 

Baptist Deacon :P

Today seems really bad. Taking an average of three tries to get by the server error messages, and then another minute or so to actually load the pages.

I have had similar problems. I am going to have Server Too Busy nightmares tonight. It has been excruciating. RM

We have had some recent problems with people hammering the site with automated bots, which causes performance hits. We treat them like denial of service attacks and kick them out when we find them.

I would actually go a little farther than that, if I were saying that...

 

"We treat them as denial of service attacks and kick them out when we find them."

If that's the cause of the problems tonight, I would side with ClayJar. Ban the IP and report them to their ISP, AND post their user account here so we know who to blame for not being able to use the website.

It took me an hour to log 7 finds tonight.

I like friendly bots like C-3PO.

Same here.... on both counts! :smile:

 

Actually took two hours because I had to log my seven, then Faile had to log her seven :mad:

 

What a way to spend the evening.


I had more server errors than logs to post! :smile:

 

Good news is I'm sure they're working on it. I doubt it'll be a problem for long.

Edited by mrkablooey


Bot detection concerns me. When I get my rhythm going I can download a whack of location files fairly quickly. Would I get flagged as a bot and blocked out?

When I get my rhythm going I can download a whack of location files fairly quickly.

Unless you have a version number attached to your name, I doubt you'd be flagged.

We have had some recent problems with people hammering the site with automated bots, which causes performance hits. We treat them like denial of service attacks and kick them out when we find them.

Jeremy, this sounds very unfortunate !! You will most probably need cooperation from your network connection provider !! Or what else can you do ?? Unfortunately this is the present state of things; what will we face in the near future ???

Jeremy, this sounds very unfortunate !! You

It is the nature of the beast, I'm afraid. The solution is being ever vigilant and monitoring the server load to determine issues and stop them before they get out of hand.
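For readers wondering what "monitoring the server load" for bots might look like in practice, here is a minimal sketch of one common approach: a sliding-window request counter per IP. This is purely illustrative; nothing here describes how Groundspeak actually detects bots.

```python
import time
from collections import defaultdict, deque

class RateMonitor:
    """Flag clients that make more than max_requests within window_seconds."""

    def __init__(self, max_requests=60, window_seconds=60):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def record(self, ip, now=None):
        """Record one request from `ip`; return True if it now looks like a bot."""
        now = time.time() if now is None else now
        q = self.hits[ip]
        q.append(now)
        # Discard timestamps that have fallen out of the sliding window.
        while q and q[0] <= now - self.window_seconds:
            q.popleft()
        return len(q) > self.max_requests
```

An IP flagged this way could then be throttled or blocked, which matches the "treat them like denial of service attacks" policy described earlier in the thread.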


I assume that some of these robots are providing different non-GC sites that have statistics about cachers/caches in their area. Couldn't GC instead have stats about different countries/states on their site, or has this been discussed 12737 billion times before at this forum??

 

Just a question.

Jeremy, this sounds very unfortunate !! You

The solution is being ever vigilant and monitoring the server load to determine issues and stop them before they get out of hand.

I honestly hope the servers can stand it !!

I assume that some of these robots are providing different non-GC sites that have statistics about cachers/caches in their area. Couldn't GC instead have stats about different countries/states on their site, or has this been discussed 12737 billion times before at this forum??

 

Just a question.

Hi again Hedberg, I am afraid these people don't necessarily have anything to do with GC people; they just want to cause as much harm as possible to anybody they can find !! And there is not much you can do if they really hit you, other than upgrade your servers and hope for the best !!

We have had some recent problems with people hammering the site with automated bots, which causes performance hits. We treat them like denial of service attacks and kick them out when we find them.

I think my IP has been banned two times since Saturday; I join to the site from another ISP.

 

I don't understand the bots you are talking about at all. Do you mean spyware bots?

 

I downloaded both Spybot and Ad-aware and updated them, and I found some spyware, especially a lot of tracking cookies from unknown domains.

Can Spybot and Ad-aware help with the problem?

How long do your bans last?


I think Jeremy is talking about programs written to hit the site over and over again in an effort to "scrape" data about the geocaches for personal use. Not spyware, adware, etc.

I think my IP has been banned two times since Saturday; I join to the site from another ISP.

 

What do you mean by join to the site?

I think my IP has been banned two times since Saturday; I join to the site from another ISP.

 

What do you mean by join to the site?

Sorry, I meant visiting the geocaching website.


We haven't banned any IP addresses through the weekend, so that isn't it.


I could be well off base here, but the data scrapers probably fall into three categories: grabbing pages because they don't get PQs, grabbing pages because PQs don't have all of the logs, and stats.

 

Can't do much about the first one, but I'll be upfront and tell you why I will do the second one before a long trip. I can't tell you how many times we've gotten to a cache, had trouble, and when consulting the logs found that the coords or description were way off, referring back to a previous log. But now that log is not in the PQ because of the 5-log limit. We've wasted 2 hours for nothing. Getting only 5 logs is like getting only part of the description.

 

So now, before a 200-mile trip, not only do I download a PQ of the area, I massage the PQ to get a list of the caches and run that through an offline browser limited to 2 connections and 1 request per second. I try to run it in the morning during the week, when I suspect the server load is at its lowest. What I do is not like some other scrapers: mine is limited to only the caches in a certain area and is finite. I've only done it twice, and I can tell you that it has worked out very well, which shows that the 5-log limit is just too low.
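The "1 request per second" throttle described above is the polite way to fetch pages in bulk. A minimal sketch of such a throttle follows; the `clock` and `sleep` parameters exist only to make the sketch testable and are not part of any real offline-browser tool:

```python
import time

class Throttle:
    """Enforce a minimum interval between successive requests."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self._last = None

    def wait(self):
        """Block until at least min_interval has passed since the previous call."""
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
        self._last = self.clock()
```

A downloader would call `wait()` before each page request, guaranteeing the server never sees more than one request per second from it.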

 

I suspect this type of scraping is in the minority, though. It could be solved by changing the 5-log limit to something more reasonable. I'd like to see, at the very least, the last 10 logs, with log types other than "found it" included but not counted towards the 10. That would give us a much better selection of the found-it logs without losing the ever-important DNFs and notes.

 

The site scraping for stats only illustrates the desire for stats. Displaying stats directly here on the site, or better yet providing some kind of "feed" of the stats for each cacher, would go a long way toward reducing the scraping done for stats.

 

In short, any data scraping that is going on tells you you're not providing all the services that people want.

 

Trying to block the true data scraper is like the "War on Drugs," you can't stop it and you'll waste far too many resources trying to.

The site scraping for stats only illustrates the desire for stats. Displaying stats directly here on the site, or better yet providing some kind of "feed" of the stats for each cacher, would go a long way toward reducing the scraping done for stats.

I agree 100% with this. I don't feel any sympathy in the least if this is what the cause of the server load is. There are LOTS and LOTS of people everyday saying that they want this. TPTB of this site say that they don't want to make it a competition. Fine, don't. But like CoyoteRed said, make a "feed" where other sites can easily get the data they need. Let the other sites set up the statboards for their local area. People that don't want to compete don't ever have to look at those sites.

 

Maybe it's me, but I just don't see what the problem is with this. Why are you so stubborn on this subject?

 

--RuffRidr

The site scraping for stats only illustrates the desire for stats. Displaying stats directly here on the site, or better yet providing some kind of "feed" of the stats for each cacher, would go a long way toward reducing the scraping done for stats.

I agree 100% with this. I don't feel any sympathy in the least if this is what the cause of the server load is. There are LOTS and LOTS of people everyday saying that they want this. TPTB of this site say that they don't want to make it a competition. Fine, don't. But like CoyoteRed said, make a "feed" where other sites can easily get the data they need. Let the other sites set up the statboards for their local area. People that don't want to compete don't ever have to look at those sites.

 

Maybe it's me, but I just don't see what the problem is with this. Why are you so stubborn on this subject?

 

--RuffRidr

Now that's a great attitude. If a company doesn't desire to provide you with a product, it's ok to steal it from them. No matter that it's costing the company (and indirectly, legitimate users) money and hampering their ability to provide the products they do offer.

Nice.

... it's ok to steal it from them.

Oh, come on, Mopar. You know that's not right. Data scrapers aren't getting data that isn't freely available to everyone. They're just getting it in a way that taxes the resources.

 

It's not stealing. It's more like hogging. They are not taking "product off the shelves." It's more like a library where someone comes in, grabs huge armfuls of books, sits in the corner, and reads only a few passages because that's all they need. They grab all of those books because they might find information in them, and it's easier to do that than to fetch one book at a time. That's a better parallel. It's hogging resources.

 

However, if the library were to compile certain information, it would be easier on both the searcher and the library!

 

Speaking of stats, it wouldn't take much at all to create a link on one's profile page that spit out a comma-delimited list of the found caches with the pertinent information.

 

Or a single canned query twice daily with the stats of all cachers that have been active in the last week.

 

Or the ability to quickly get a PQ of all of one's finds online, so cachers can include themselves on a stats site.

 

There are many ways to provide a service while making it easier on everyone.
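The comma-delimited profile feed suggested above would be trivial to produce server-side. A sketch follows, with made-up field names, since no such feed actually exists on GC.com:

```python
import csv
import io

def finds_to_csv(finds):
    """Render a cacher's found caches as a comma-delimited feed.

    `finds` is a list of dicts; the field names here are hypothetical,
    not an actual GC.com format."""
    fields = ["waypoint", "cache_name", "found_date"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields)
    writer.writeheader()
    for find in finds:
        writer.writerow({k: find.get(k, "") for k in fields})
    return buf.getvalue()
```

One cheap query per profile view, or one canned batch per day, would stand in for thousands of scraped page loads.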

Now that's a great attitude. If a company doesn't desire to provide you with a product, it's ok to steal it from them. No matter that it's costing the company (and indirectly, legitimate users) money and hampering their ability to provide the products they do offer.

Nice.

That's not what I am saying. I personally don't advocate scraping GC.com's website.

 

--RuffRidr

... it's ok to steal it from them.

Oh, come on, Mopar.

Didn't you commit forum suicide?

 

This isn't a place to argue semantics of stupid actions. Whether it is hogging or stealing in your humble opinion, it's considered a denial of service attack and is not tolerated on this site.


So just to clarify.

 

Geocaching.com will ban IP addresses if they are found to be running bots.

 

No IP addresses were banned through the weekend.

 

That would imply that:

 

No bots were running over the weekend that caused the slowness (so the slowness was not caused by bots)

 

Or

 

That no bots were caught over the weekend (so more work needs to be done on the detection side but if no bots were caught who knows if there really was anything to catch)

 

Or

 

The site is just slow

 

Because this is a weekend problem, bots have no concept of time other than whatever their owners may have set up, and this does not seem to happen at other times, my guess is that it is not bots. It may very well be the number of people trying to log their entries after a weekend of caching. Does anybody have a clue how many logs were created over the weekend, and more specifically on Sunday evening? If it was thousands, then maybe there is no cause for alarm; that is just the way things go. But if it was hundreds, then there would seem to be problems.


It's hard to run bots really fast when the site is running really slow.


I'm not sure if this is on-topic since the thread just took a turn, but I just got this when I tried to pull up a local cache page:

 

(The rest of the cache page came back as an empty template: name, owner, coordinates, difficulty/terrain, description, hints, and logs were all blank placeholders. In place of the content was this error:)

Transaction (Process ID 145) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

 

Refreshing the page gave me the correct info.
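The error's own advice ("Rerun the transaction") is the standard cure: a deadlock victim's work is rolled back, so it is safe to retry. Server-side, the fix is usually a retry wrapper along these lines. This is a generic sketch; detecting deadlocks by inspecting the exception message is an assumption, and the details vary by database driver:

```python
import time

def run_with_deadlock_retry(operation, max_attempts=3, backoff_seconds=0.5):
    """Run `operation`, retrying if it is chosen as a deadlock victim.

    `operation` is any callable; deadlocks are assumed to surface as
    exceptions whose message mentions 'deadlock'."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            if "deadlock" not in str(exc).lower() or attempt == max_attempts:
                raise
            # Brief, growing pause so the competing transaction can finish.
            time.sleep(backoff_seconds * attempt)
```

With a wrapper like this in the page code, the user never sees the deadlock message; the refresh the poster did by hand happens automatically.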


No. That is a file locking error that has been around for a while. This is probably the most elusive bug on the site.

It's the first time it's ever happened to me, so I guess it's pretty rare.

Depends on how many cache pages you view every day :unsure:

This isn't a place to argue semantics of stupid actions. Whether it is hogging or stealing in your humble opinion, it's considered a denial of service attack and is not tolerated on this site.

I am confused about a couple of things...

 

At what point is it considered a denial of service attack?? Is it one bot slowing things down or multiple bots? If so, which bot is the one considered guilty?

 

If I am not mistaken, many of the state geocaching associations scrape data for their local stats. Are they being banned too?


 

If I am not mistaken, many of the state geocaching associations scrape data for their local stats. Are they being banned too?

You'll find that many of the state and non-US sites listing stats are approved by Groundspeak, and collect their data in an approved manner and at approved times, using a login to the site, so they can be tracked very easily. In the case of the UK stats site, they have very strict rules on what, when, and how they collect data, so as not to affect the running of the site. Dave


Here we go again!! Been trying to log finds for a while now and keep getting this:

Server Error in '/' Application.

--------------------------------------------------------------------------------

 

Server Too Busy

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

 

Exception Details: System.Web.HttpException: Server Too Busy

 

Source Error:

 

An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.

 

Stack Trace:

 

[HttpException (0x80004005): Server Too Busy]

System.Web.HttpRuntime.RejectRequestInternal(HttpWorkerRequest wr) +147
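"Server Too Busy" generally means ASP.NET's request queue is full, which is a transient condition. The sensible client-side response is to retry with growing delays instead of hammering the reload button. A minimal sketch of that idea (not any real tool's behavior):

```python
import time

def backoff_schedule(max_attempts=5, base_seconds=1.0, factor=2.0):
    """Delays (in seconds) between retries: 1, 2, 4, 8, ...
    so a busy server is not hit with instant back-to-back retries."""
    return [base_seconds * factor ** n for n in range(max_attempts)]

def retry_with_backoff(operation, schedule):
    """Try `operation`; on failure, sleep for the next delay and try again.
    The final attempt's exception, if any, propagates to the caller."""
    for delay in schedule:
        try:
            return operation()
        except Exception:
            time.sleep(delay)
    return operation()  # last attempt
```

Spacing the retries out this way gives the overloaded request queue a chance to drain between attempts.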


Tonight's issue doesn't feel like a real server overload to me; I think there is a bug somewhere. So I'd expect it will be resolved pretty quickly.


I have been getting a number of different but similar errors, like the ones you listed, while trying to do a number of different things on GC.com tonight. It is the first time I have had any real issues with this site. It is also running very+++ slow at times; I think that there are just a lot of people out there that don't have real lives outside of their 'puters. :D

 

Hey wait, that doesn't include me, does it :D ...I mean, I did recognize the problem. So either I am better than everyone else, or I have just started step one of a 12-step program :D:D ...Hi, I'm MedicP1 and I am addicted to techno toys and the smell of plastic in the woods! :D


Jeremy, would a new hosting machine help, or is it just a matter of trying to find a way to keep particular users/apps from hogging CPU cycles? If a new hosting machine is in order, I'd be willing to contribute a fair amount towards one of these bad boys.

 

Although I gather you're running the site on a Windows variant, so that might not appeal to you very much. :D


Flawed reasoning. Bots run all week, but on weekends the combination of bots and regular users makes a "perfect storm" more likely. Also, people run bot clients for downloading bulk pages when they want the data, which is normally on the weekend.


I know it's redundant to say this, as everyone is having the same problems, but it makes me feel better to complain, heh heh.

 

I have been very irritated this morning, spending 30 minutes trying to print out a few cache pages that should have taken 5 minutes without the errors.

 

Thanks for listening to me complain, I feel better now. :ph34r:

This topic is now closed to further replies.