
Cache Page Throttle -- Could we let up a bit?


Moosiegirl


In THIS THREAD, I read about the apparently recent throttling on cache pages:

 

It's not an error. The system has a throttle control, which locks you out if you try to access too many pages within a certain time frame. This is to foil screen-scrapers. If all you get is a white page, with the page's URL address as the page title, then you've been throttled. Come back in 10 or 15 minutes, and the throttle should be released.
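
For anyone wondering what a throttle like the one described above might look like, here is a minimal sketch in Python. It is only a guess at the general technique (a sliding-window counter per client); the site's real rules are not published, and the class name, the limit of 4 pages, and the 60-second window are all made-up values for illustration.

```python
from collections import deque


class SlidingWindowThrottle:
    """Allow at most `limit` requests per `window` seconds for one client.

    Hypothetical model only; not the site's actual implementation.
    """

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.stamps = deque()  # times of recently allowed requests

    def allow(self, now):
        # Forget requests that have aged out of the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) >= self.limit:
            return False  # throttled: this is where a blank page would be served
        self.stamps.append(now)
        return True


# Assumed numbers: 4 pages per 60 seconds.
throttle = SlidingWindowThrottle(limit=4, window=60.0)
results = [throttle.allow(now=t) for t in (0, 1, 2, 3, 4, 61)]
print(results)  # [True, True, True, True, False, True]
```

With these assumed numbers the fifth request inside the window is refused, and a request made after the oldest ones have aged out goes through again.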

 

I made 4 before the throttle kicked in. It's just white. It should say, "You've been throttled. Come back in 5 min" or something like that. Now I can't see anything. Good thing the forum is on another server.

 

I usually only make 10 or 12 before I get cut off. I write my logs off line in GSAK, and I can log MANY more caches in 15 minutes than the system will allow me to pull up. I suppose part of the problem is that after I log a cache, I always download a gpx to update my database.

 

May I suggest a bit longer timeout on the throttling, or addition of some means by which a "real person" can tell the system that they are not a screen-scraper? It's really frustrating not to be able to use the tools I have to make logging quick and painless. I often open 10 windows or so at a time, so I can be pasting a log on one page while another is loading and yet another is posting. I can't do that with the current timeout.

 

Thanks and Happy Trails,

Candy

Link to comment

I'm with you, Candy.

 

A human operator, typing original (though admittedly not chatty) logs or loading a single page at a time using a well-tuned OS/'net connection/browser combination, can so trivially trigger the throttler that it's crazy annoying. It's not a recent thing - it's been this way for a long time.

 

This is the ONLY site I use where I have to consciously think "one banana, two banana, three banana, four banana, submit".

 

I get that the site can't accept its current user load and is attempting self-preservation, but it's really obnoxious how it handles it.

Link to comment

I just attempted to pull up a whopping FIVE cache pages in order to get updated gpx files, and was throttled after the first three! In an attempt to kill time, I went into the "view my stat bar" page, built a different stat bar, and played with it for a bit, then went back to my account page and clicked on my stat bar there to see where it went (I actually had not tried that!). I WAS THROTTLED OUT OF MY PROFILE!

 

This is simply ABSURD. I refuse to even consider that my fingers can call up web pages as fast as some scraper can scrape them. If it's going to take some time to tune the throttling mechanism, may I suggest that it's time to BACK IT OUT UNTIL YOU CAN FIX IT, so that HUMAN BEINGS aren't thwarted at every turn???

 

Any of you others out there who are experiencing the same frustration level, please HELP KEEP THIS TOPIC ON TOP OF THE FORUM. Subscribe to it; keep track of what you are doing, how many pages you pull up before getting the "white screen of death", and post here. User feedback with statistics, as opposed to general griping, is the only thing that's going to get this fixed any time soon.

 

Thanks,

Link to comment

Nope. I've never seen it either. Here are my tests:

 

ENVIRONMENT:

Firefox 2, Win XP, normal network operations, keywords under Firefox for faster page access

 

TEST #1: Logging caches and updating mileage tracker (normal operations)

I typed up all my logs in MS Word and had the GC at the top of each log. I brought up two tabs in Firefox: one for the cache log and the other for my mileage tracker. For each log, I created a Greasemonkey script to select "found it" and my mileage tracker from inventory. I type "log GCXXXX" into the address bar and the log page displays. By that time, I have copied the typed log and am ready to paste it. Three tabs and a space bar will submit the form. Control-tab to my mileage tracker, backspace to the previous filled-out screen, F5 before the log confirmation page returns, paste my log and adjust the cache number before the page refreshes, click the button. Copy the next GC and repeat. I can complete each operation in fifteen to twenty seconds, max. No throttling here.

 

TEST #2: Page loading

I pulled a PQ result search and control-clicked on each cache listed on the page. All pages returned without problem.

 

TEST #3: iframes

I tried constructing a JavaScript that would load 24 cache descriptions into separate iframes. However, script exists on cache pages that will break out of an iframe. Test failed.

 

TEST #4: Clicking a list

Similar to test #2, I created a list of hyperlinks to the 24 test cache descriptions and control-clicked on each one as quickly as I could. Since I could tab and control+enter each link, this is the fastest humanly possible. No problems whatsoever. I was able to click on all the links in under ten seconds.

 

TEST #5: Screen scraping

Okay... Is this even a problem? I created a quick scraper in .Net to see if the site even catches something like that. 11 of the 24 pages returned were blank. Okay. It IS possible to get a blank page. So the throttle does catch pages requested faster than in test #4, which is much faster than I can manage manually.
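
The tallying step of a test like #5 can be sketched as follows. This is not the .Net scraper from the test, just a rough Python illustration; it assumes a throttled response is one whose body has no visible content, and the sample data is stand-in text arranged to match the 11-of-24 ratio reported above rather than anything fetched from the site.

```python
def is_blank(body):
    """Treat a response as throttled if its body is empty or whitespace (assumption)."""
    return body.strip() == ""


def tally(bodies):
    """Return (number of blank responses, total responses)."""
    blank = sum(1 for b in bodies if is_blank(b))
    return blank, len(bodies)


# Stand-in data only: 13 normal pages and 11 blank ones.
sample = ["<html>cache page</html>"] * 13 + ["   \n"] * 11
blank, total = tally(sample)
print(f"{blank} of {total} responses were blank")
```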

 

RESULT:

I do not know how you are experiencing this problem. Do you have GSAK trigger your web browser to open all the caches you are logging at one time? If so, that may be the cause. However, going to the pages themselves through any manual means (clicking a list or using Firefox bookmark keywords) does not seem to trigger this problem.

 

That's as much as I can accomplish on my own for QA. I wanted to find a solution to your problem, but could not repro it manually.

Edited by Ranger Fox
Link to comment

I've done #4 and timed it out in just a few seconds. In IE, you can hold the Ctrl Key and click to your heart's content, opening up new tabs. Just yesterday I got the throttle after only a very small number of cache pages opening. All human - no automatic.

 

So - I slow down a little in my clicking. Everything works fine.

Link to comment

Bump

 

Can Robertlipe and I be the only two people who are ''drowning the engine'' on a regular basis? :(

I guess I can't type as fast as you :D

 

The problem I sometimes have is if I'm looking for a certain cache page (but don't know ID or exact name), and have to find it by going threw all the 'nearest' caches links till I find the right one, it might start returning "page not found" :cry: . Of course this seems to happen most when the pages load slowly (slower than 'normal'), so maybe it's not a me-being-throttled issue?

Link to comment

I ran into this yesterday. I wanted to look over some of the caches nearby that I haven't found. I did the caches near me search. Using tabs in FF, I opened about 10 cache pages. When I went to view the pages, half of them were blank, white pages.

 

I don't normally search like that, but I didn't waste too much time trying to track it down since I'm used to things like that now.

 

It is nice to know why it happened and I agree that at the very least when you are throttled there should be some kind of message letting you know it and why.

Link to comment

Just happened to me, which is why I'm on this thread. It's nothing to do with my ISP, as I use at least 3 through work and home. It usually happens when I'm opening caches from GSAK using the double-click option. It's really, really frustrating, as there is no indication of how long I have to wait until the throttle is let up, meaning I either leave it for ages to be sure or else I go too early and get reset back to the beginning of the waiting period. It's probably the most annoying thing I've come across on GC.com since the delays in PQs last year :D
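
The "go too early and get reset" behavior described above would be consistent with a lockout whose timer restarts on every refused request. Here is a small Python sketch of that idea. It is an assumption about how such a lockout could work, not a description of the site's code, and the 600-second wait is a hypothetical value.

```python
class Lockout:
    """Lockout whose timer restarts whenever the client retries too early.

    Assumed behavior for illustration; the real mechanism is not documented.
    """

    def __init__(self, wait):
        self.wait = wait
        self.locked_until = None

    def trip(self, now):
        self.locked_until = now + self.wait

    def request(self, now):
        if self.locked_until is not None and now < self.locked_until:
            self.locked_until = now + self.wait  # early retry restarts the wait
            return False
        self.locked_until = None
        return True


lock = Lockout(wait=600)         # 600 s = 10 minutes, a hypothetical value
lock.trip(now=0)
early = lock.request(now=300)    # retry after 5 minutes: refused, timer restarts
later = lock.request(now=700)    # would have passed without the early retry
final = lock.request(now=1300)   # 600 s after the last refused attempt
print(early, later, final)       # False False True
```

Under this model, the only safe move is to stay away for the full wait, which is why an on-page message with the remaining time would help.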

Link to comment

Bump

 

Can Robertlipe and I be the only two people who are ''drowning the engine'' on a regular basis? :D

I guess I can't type as fast as you :D

 

The problem I sometimes have is if I'm looking for a certain cache page (but don't know ID or exact name), and have to find it by going threw all the 'nearest' caches links till I find the right one, it might start returning "page not found" :D . Of course this seems to happen most when the pages load slowly (slower than 'normal'), so maybe it's not a me-being-throttled issue?

 

through :tongue::huh::unsure:

Link to comment

Bump

 

Can Robertlipe and I be the only two people who are ''drowning the engine'' on a regular basis? <_<

I guess I can't type as fast as you :yikes:

 

The problem I sometimes have is if I'm looking for a certain cache page (but don't know ID or exact name), and have to find it by going threw all the 'nearest' caches links till I find the right one, it might start returning "page not found" :laughing: . Of course this seems to happen most when the pages load slowly (slower than 'normal'), so maybe it's not a me-being-throttled issue?

 

through :anitongue::anitongue::anitongue:

Put down the red pen before it runs out of ink? :laughing:

Link to comment

I open lots of pages at once all the time, but I've never had a problem with throttling. Don't know why.

I do. You're a Platinum Member, remember?

 

Seriously, I understand the rationale for not publicizing the triggering criteria for a throttle, but if it can be triggered by hand at less than, say, 20 pages, I think some public information is warranted.

 

That said, I don't think I have ever triggered it.

Link to comment

In THIS THREAD, I read about the apparently recent throttling on cache pages:

 

It's not an error. The system has a throttle control, which locks you out if you try to access too many pages within a certain time frame. This is to foil screen-scrapers. If all you get is a white page, with the page's URL address as the page title, then you've been throttled. Come back in 10 or 15 minutes, and the throttle should be released.

 

I made 4 before the throttle kicked in. It's just white. It should say, "You've been throttled. Come back in 5 min" or something like that. Now I can't see anything. Good thing the forum is on another server.

 

I usually only make 10 or 12 before I get cut off. I write my logs off line in GSAK, and I can log MANY more caches in 15 minutes than the system will allow me to pull up. I suppose part of the problem is that after I log a cache, I always download a gpx to update my database.

 

May I suggest a bit longer timeout on the throttling, or addition of some means by which a "real person" can tell the system that they are not a screen-scraper? It's really frustrating not to be able to use the tools I have to make logging quick and painless. I often open 10 windows or so at a time, so I can be pasting a log on one page while another is loading and yet another is posting. I can't do that with the current timeout.

 

Thanks and Happy Trails,

Candy

 

You can use gsak to log caches?

Link to comment

I have never seen this either. My newest cache has 30 links to other cache pages in it. I have clicked many of them, one after another, and they all opened. With my dialup connection it took a while for them to finally load, but they all started loading immediately with no "throttling."

 

I wonder what the variable is that causes this to be a problem for some people . . . ? :ph34r:

Link to comment

I tried really hard to see this behavior on my machines and was only able to see it once when I was madly clicking as fast as I could to open links. Even then it was on just 2 of the 30 or more links I opened. I just can't picture anybody able to log caches that fast. Must be something else going on here.

Link to comment

I've never seen this. Maybe try typing cache logs rather than pasting "TNLNSL TFTC" on every cache??

You posted that twice and nobody took you up the first time which should have been a clue that it wasn't funny :ph34r:

 

To clarify things I write pretty long logs and I get this problem fairly often. Happened today when I downloaded a PQ to GSAK and realised there were 7 new caches. I tried to open each of them in turn from GSAK and only 1 loaded before I got throttled :blink:

 

I hope this isn't just restricted to GSAK as that seems to be common to many of the posts here :ph34r:

Link to comment

There must be something else going on, as I stated above. I frequently click on more than one cache from my GSAK database and have never seen the "throttling." I just clicked more than ten of the links on this cache page and all the pages are loading . . .

 

I'm curious what ISP those of you who have this problem use?

 

Maybe next time I take my laptop to a WiFi connection, I'll see if I can experience "throttling."

 

Edit to fix link . . .

Edited by Miragee
Link to comment

Yeah, I'd say the throttling needs to be worked on a bit...

 

I clicked on a link to a cache page.

 

I clicked on "View this log on a separate page" for the first log. I got a blank page.

 

Now I get nothing but blank pages wherever I go.

 

So what's the limit? One page view every ten minutes? :rolleyes:

 

I have the same sorts of issues. The throttle seems to trigger - often - when it shouldn't, I think. Sometimes only opening one or two cache pages will trigger it. Sometimes not.

 

But I doubt this bug will get fixed or looked at, since this is a 'security feature'.

Edited by benh57
Link to comment

I'd like to think that it's something that will be looked at. Remember that in order to actually work on it Raine needs some information per his post here.

 

I've got firefox set as the default browser from GSAK. I may try to replicate my lockout using IE instead.

 

It's erratic. I came back later in the day, doing exactly the same thing, with exactly the same software running, same browser, and had no problems opening pages 6 at a time over and over.

Link to comment

I've never seen this. Maybe try typing cache logs rather than pasting "TNLNSL TFTC" on every cache??

You posted that twice and nobody took you up the first time which should have been a clue that it wasn't funny :D

 

To clarify things I write pretty long logs and I get this problem fairly often. Happened today when I downloaded a PQ to GSAK and realised there were 7 new caches. I tried to open each of them in turn from GSAK and only 1 loaded before I got throttled :D

 

I hope this isn't just restricted to GSAK as that seems to be common to many of the posts here :lol:

I wasn't even logging caches. I opened up a cache page, scrolled down to view an existing log, and clicked on "View this log on a separate page". Boom... I was locked out.

Link to comment

.....I wasn't even logging caches. I opened up a cache page, scrolled down to view an existing log, and clicked on "View this log on a separate page". Boom... I was locked out.

I tried doing this as fast as I could manually and could not trigger the problem except for one time in 10 very separate tries. Even then I got only 2 blank pages out of 30 I had opened.

 

Tried Win XP Pro with IE6 and IE7

Tried Vista Home Premium and Ultimate With IE7

 

Tried on 3 different ISPs with speeds ranging from a 256Kbps connection up to a 7Mbps connection.

 

Just for the info.

Link to comment

The amount of meat in the logs has nothing to do with it. I logged my weekend's finds on Monday and wrote personalized, thoughtful logs for them, and I still ran into this problem. It seems like the threshold for throttling has been greatly lowered lately. I used to be able to trip it using the Firefox Linky add-on to open the pages for logging, but now I can trip it by manually entering the GC number and hitting enter on multiple pages.

 

Many of us use tabbed browsing to open up the pages to be logged before we begin writing the logs. This is not a length or quality of log issue. This is a website throttling issue, and I am with the others when I say the throttling threshold should not be set so low that a fumble-fingered monkey can trip it while manually entering gc numbers.

Link to comment

I wonder what the variable is that causes this to be a problem for some people . . . ? :)

 

This is the key question here.

 

I've had the blank pages appear to me very often. I guess that in about 50% of my sessions to gc.com this happens. Only for cache pages, though; not for the homepage or the forums. Haven't checked profile pages though.

It's very annoying when I quickly want to print out one cache page but find myself locked out (I then refer to Google's cache but it happened to me once that I had old info in the field with me).

 

I'm not a very fast typer, or the type of person that opens up more than 10 pages at a time or very quickly one after the other. So there must be something else.

 

Could it be that some program on our systems, or maybe the browser itself, makes frequent calls to gc.com's server, causing it to think that there is a spammer/scraper at work, and thus triggering the throttle?

 

It would be very helpful if someone from gc.com itself would shed some light on this issue. Is throttling a possibility, and what can be done to separate the humans from the bots?

Link to comment

The amount of meat in the logs has nothing to do with it. I logged my weekend's finds on Monday and wrote personalized, thoughtful logs for them, and I still ran into this problem. It seems like the threshold for throttling has been greatly lowered lately. I used to be able to trip it using the Firefox Linky add-on to open the pages for logging, but now I can trip it by manually entering the GC number and hitting enter on multiple pages.

 

Many of us use tabbed browsing to open up the pages to be logged before we begin writing the logs. This is not a length or quality of log issue. This is a website throttling issue, and I am with the others when I say the throttling threshold should not be set so low that a fumble-fingered monkey can trip it while manually entering gc numbers.

I wasn't so much referring to the length of the log, but to the fact that a longer log takes more time to write, and thus leaves a longer time between page loads.

 

Nonetheless, I use tabbed browsing too, and usually have three windows open, so I dunno.

Link to comment

I wonder what the variable is that causes this to be a problem for some people . . . ? :)

 

This is the key question here.

 

I've had the blank pages appear to me very often. I guess that in about 50% of my sessions to gc.com this happens. Only for cache pages, though; not for the homepage or the forums. Haven't checked profile pages though.

It's very annoying when I quickly want to print out one cache page but find myself locked out (I then refer to Google's cache but it happened to me once that I had old info in the field with me).

 

I'm not a very fast typer, or the type of person that opens up more than 10 pages at a time or very quickly one after the other. So there must be something else.

 

Could it be that some program on our systems, or maybe the browser itself, makes frequent calls to gc.com's server, causing it to think that there is a spammer/scraper at work, and thus triggering the throttle?

 

It would be very helpful if someone from gc.com itself would shed some light on this issue. Is throttling a possibility, and what can be done to separate the humans from the bots?

Just in case someone wants to "tease" this out, I use Opera 9.1 as my browser on WinXP Media Edition. I have a dialup connection, but can open more than 10 pages, in separate tabs, from that cache page of mine with 30 links on it, and all of those pages start loading. I haven't yet experimented at a WiFi Hotspot to see if I get the "throttling" with a faster connection speed.

Link to comment

Miragee, and others trying to tease this out by running some tests:

 

Make sure you clear your browser's cache before each test.

 

Also, make sure you understand which type of page you pull. If one person tests with the print-friendly version, that is not comparable with the regular cache page. The reason is that when you aim your browser at a particular page, it may pull a whole bunch of data (picture files, icon files, ...): whatever it does not already have in the browser cache.

 

If this is not clear, then open any cache page, right click on some white space in the display, select View Page Source and you can find a list of all the files that have already been pulled for that page (around 50 files). The Print-friendly page uses only 9 files. Clearly it makes a difference if your one click has your browser pulling 50 files rather than 10.
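
If counting files by eye in View Page Source gets tedious, the count can be roughly automated. The sketch below uses Python's standard `html.parser` to collect the `src` and `href` targets of `img`, `script`, and `link` tags. It is a rough count only (it ignores files pulled in from stylesheets, and not every `link` tag is actually fetched), and the sample page is invented for illustration, not copied from a real cache page.

```python
from html.parser import HTMLParser


class ResourceCounter(HTMLParser):
    """Collect URLs of files a browser would likely fetch besides the page itself."""

    def __init__(self):
        super().__init__()
        self.resources = set()  # a set, so a file used twice is counted once

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.add(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            self.resources.add(attrs["href"])


def count_resources(html):
    parser = ResourceCounter()
    parser.feed(html)
    return len(parser.resources)


# Invented sample page: one stylesheet, one script, two distinct images.
page = """
<html><head>
<link rel="stylesheet" href="/css/site.css">
<script src="/js/site.js"></script>
</head><body>
<img src="/images/icon.gif"><img src="/images/icon.gif">
<img src="/images/map.jpg">
</body></html>
"""
print(count_resources(page))  # 4
```

Running it on the saved source of a regular cache page and of the print-friendly page would give a quick way to compare how many requests one click can generate.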

 

In fact, since some are seeing this problem and others are not, it could well be a browser setting issue. If your browser cache is too small or disabled (i.e., loading every referenced resource every time), then you would trigger any throttling mechanism much faster.

 

The folks who are using Firefox along with GSAK need to be aware that GSAK's internal split-screen browser is IE, so your computer may be using two applications to pull data off the website, and the Groundspeak system won't tell those apart. So be sure that the GSAK split screen is showing the display grid and not a particular cache when running the test. If you don't, then also note that IE and Firefox don't share browser caches, so you need to account for the added complexity.

Link to comment

Wow,

I'm glad I found this thread. I used to think the geocaching.com website was just flaky; now I know it's actively throttling me. :D

 

It's very easy for me to trigger the throttling - Using firefox, I do a query for caches (for example, caches near me), then from the results page I open a bunch of caches in new tabs. If I open more than 5 or so, the later ones all come up with a blank page.

 

This is my standard behavior on just about every internet (non-geocaching) web site - I do a query then open the relevant pages in tabs. I find it pretty annoying that I can do this everywhere else on the internet except for gc.com -- and I am a premium member here.

 

So I come back to the title of this thread: "Could we let up a bit?"

Link to comment

I've done a couple of tests, and I believe I understand the source of the throttling behavior. It's not how quickly you request the pages, exactly. It's how many simultaneous connections your browser makes to the gc.com server. Above some limit, a throttle is engaged.

 

So as long as you wait for one page to load before starting the next, you will be unable to trigger the throttle. In Firefox, however, you can start multiple tabs with multiple pages loading at the same time, and it makes you look like a page-scraper.
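
To make the simultaneous-connections idea concrete, here is a toy Python model of it. It is only a sketch of the hypothesis in this post, with a made-up limit of 4 open connections; the real limit and what the server actually counts are unknown. In this simplified model an extra connection is just refused, whereas the real site appears to lock you out for a while.

```python
class ConnectionThrottle:
    """Refuse new connections once a client has `max_open` open at the same time.

    Hypothetical model of the behavior guessed at above.
    """

    def __init__(self, max_open):
        self.max_open = max_open
        self.open = 0

    def connect(self):
        if self.open >= self.max_open:
            return False  # refused; the client would see a blank page
        self.open += 1
        return True

    def close(self):
        self.open -= 1


def one_at_a_time(throttle, pages):
    """Each page finishes loading before the next one starts."""
    ok = 0
    for _ in range(pages):
        if throttle.connect():
            ok += 1
            throttle.close()
    return ok


def all_at_once(throttle, pages):
    """Every tab starts loading before any has finished."""
    return sum(1 for _ in range(pages) if throttle.connect())


sequential = one_at_a_time(ConnectionThrottle(max_open=4), 10)
parallel = all_at_once(ConnectionThrottle(max_open=4), 10)
print(sequential, parallel)  # 10 4
```

The same ten pages all load when requested one after another, but only four get through when requested together, which matches the pattern several people have reported.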

 

It's pretty easy to fix this. Somebody who has been having these troubles should try it and report back. In Firefox, go to the about:config page and find these two values:

 

network.http.max-connections

network.http.max-connections-per-server

 

I'd try changing the latter to 4 or less first, and see if that helps. If not, try changing the maximum total connections to 4 and see if that fixes it. Your tabs will load more slowly, but it should avoid the throttle.

 

BTW, be sure to exit and restart Firefox after making any changes.
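
For those who pull pages with their own tools rather than a browser, the same cap can be applied on the client side. The Python sketch below limits in-flight "downloads" to 4 with a semaphore, mirroring the value suggested above. The `fetch` function is a stand-in that sleeps instead of touching the network, and the names are made up for the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTIONS = 4            # mirrors the value suggested above
slots = threading.Semaphore(MAX_CONNECTIONS)
lock = threading.Lock()
in_flight = 0                  # downloads currently running
peak = 0                       # highest value in_flight ever reached


def fetch(page_id):
    """Stand-in for a page download; sleeps instead of using the network."""
    global in_flight, peak
    with slots:                # wait here until one of the 4 slots is free
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(0.01)       # pretend the page is loading
        with lock:
            in_flight -= 1
    return page_id


# Ten "tabs" are requested at once, but at most four load at any moment.
with ThreadPoolExecutor(max_workers=10) as pool:
    done = list(pool.map(fetch, range(10)))

print(len(done), peak)
```

All ten pages still complete; they just queue up behind the four slots, which is the same trade-off as the Firefox setting: slower tabs, but no burst of simultaneous connections.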

Link to comment

 

It's pretty easy to fix this. Somebody who has been having these troubles should try it and report back. In Firefox, go to the about:config page and find these two values:

 

network.http.max-connections

network.http.max-connections-per-server

 

I'd try changing the latter to 4 or less first, and see if that helps. If not, try changing the maximum total connections to 4 and see if that fixes it. Your tabs will load more slowly, but it should avoid the throttle.

 

BTW, be sure to exit and restart Firefox after making any changes.

 

Changing network.http.max-connections-per-server alone didn't stop the throttling; however changing network.http.max-connections to 4 does indeed stop it.

 

<soapbox>

I think it's pretty silly that the default firefox settings work fine on the other 99.999% of internet sites but on geocaching.com, where I am a paying ('premium') customer, I need to throttle back firefox.

</soapbox>

Edited by Jam Clam
Link to comment

I've had this a few times. Here is what happens for me:

I get the list of caches I want to open up listed in GSAK.

I double-click them and then click the link to the cache page at the top.

I then alt-tab back to GSAK and repeat.

If I do this 6 times in a row, I get throttled.

If I do 4, wait for them to load, and then repeat, I don't get throttled.

Uploading photos on more than one cache log at a time also gets me throttled.

For those saying "write longer logs", the reply is this:

I write all my logs in Word and then copy and paste them.

I should say I did not try to get throttled when logging my recent batch of caches, so I can't comment on whether it's still happening; the above is how I have started to cope.

It does seem strange that a human with mouse and keyboard can trigger this throttle, though.

Link to comment