
Satellite Web Sites Policies


stringcachers


What are the GC policies concerning third party websites? There are a number of them out there for local groups that provide statistics, etc. for local players.

 

They are really community building sites, that all point back to and encourage going to the main GC site. But, they seem to be stymied lately in access to the data from the primary site.

Link to comment
what changed??? :)

They seem to be blocking the source IP addresses of the data mining servers from even looking at the site.

 

I am really curious about this. Is it a question of publishing the actual coords? Is it a load on the servers issue? Is it control of the data?

 

I would think that GC, with special permissions, or a small license fee, or something, would allow groups like this (see link) to work with the site to allow a greater community. http://forums.Groundspeak.com/GC/index.php?showtopic=87157

 

I love them, because I can see who is active in my area. This allows me to share, get to know, and interact with the group in a meaningful way. It also allows these groups to think about different ways of generating statistics that are meaningful to them.

 

Pocket queries come close to what these groups want/need. Create a way to have a special members account with the ability to do full-load pocket queries (all caches in a state, for example, with all logs, including archived caches). If these ran once a day, or three times a week, the groups could do what they wanted with the stats (within limitations), and they still would not be putting too great a load on the GC site.

 

All of these sites seem to come back to GC and help foster additional GC business. I don't understand why GC can't be cooperative with these sites. Geocaching really is about community.

Link to comment
All of these sites seem to come back to GC and help foster additional GC business. I don't understand why GC can't be cooperative with these sites. Geocaching really is about community.

Do they really? I would think that being able to download all the waypoints in the local area in every format one needs them ready converted doesn't encourage people to sign up for a premium membership at GC. So how exactly does this help GC.com's business?

 

Jan

Link to comment
I would think that being able to download all the waypoints in the local area in every format one needs them ready converted doesn't encourage people to sign up for a premium membership at GC.

I do see that as a potential sticking point. I would argue, however, that these are just the coords. They do not include the logs, the page information, the hints, etc. As such, the cachers need to come back to the main page for information. And this information is still available to individuals as PQs.

 

If agreements were to be reached that downloadable files of waypoints were not to be available, would that change the issues with GC? By being able to do a couple of queries most of the goals could be reached:

1) All caches in (name your state) with log lists, including archived caches. This might not include the text of the logs, but rather the type and poster ID.

2) Totals stats of all posters in the (name your state).

 

And if this was on a subscription basis (for a MODEST fee), then a large portion of the goals of the other sites could be reached, and most of the site scraping could be eliminated.

Link to comment
what changed??? :)

They seem to be blocking the source IP addresses of the data mining servers from even looking at the site.

 

I am really curious about this. Is it a question of publishing the actual coords? Is it a load on the servers issue? Is it control of the data?

 

and? :D

 

AFAIK it's always been "don't scrape us". Whenever they get around to cutting off a bot, they do... There was no 'lately'; either it was causing enough draw that someone noticed, or someone was bored and went to see what was hitting the site today. (I don't know if Jeremy et al. explain how they find them, and even if they did it wouldn't have made sense to me :D )

 

Don't get me wrong, it would be nice if they would allow access... the question of how to do this or that has come up before, and there hasn't been a clear answer. Basically, if you want it badly, do it and hope they don't ban your IP.

Link to comment
They are really community building sites, that all point back to and encourage going to the main GC site. But, they seem to be stymied lately in access to the data from the primary site.

 

what changed??? :)

Since all the money that could have been used for larger pipes (bandwidth) was used on Waymarking.com instead, and since bandwidth problems are slowing the site on certain days of the week, and since these satellite sites, in effect, steal bandwidth, Groundspeak decided to block them. Saves Groundspeak from purchasing more bandwidth for awhile.

 

That's my guess. :D

Link to comment
They are really community building sites, that all point back to and encourage going to the main GC site. But, they seem to be stymied lately in access to the data from the primary site.

 

what changed??? :lol:

Since all the money that could have been used for larger pipes (bandwidth) was used on Waymarking.com instead, and since bandwidth problems are slowing the site on certain days of the week, and since these satellite sites, in effect, steal bandwidth, Groundspeak decided to block them. Saves Groundspeak from purchasing more bandwidth for awhile.

 

That's my guess. :(

:lol:

 

:lol: this is why I should always use proper quote tags and not just cut and paste :(

Link to comment
Since all the money that could have been used for larger pipes (bandwidth) was used on Waymarking.com instead, and since bandwidth problems are slowing the site on certain days of the week, and since these satellite sites, in effect, steal bandwidth, Groundspeak decided to block them. Saves Groundspeak from purchasing more bandwidth for awhile.

 

By licensing the access - with queries from within (instead of page scraping) - doesn't the bandwidth problem go away (or at least get greatly reduced)? And that allows GC to eliminate sites that are still scraping while working hand in hand with those sites that do provide a service to local communities. :lol:

Link to comment
Since all the money that could have been used for larger pipes (bandwidth) was used on Waymarking.com instead, and since bandwidth problems are slowing the site on certain days of the week, and since these satellite sites, in effect, steal bandwidth, Groundspeak decided to block them. Saves Groundspeak from purchasing more bandwidth for awhile.

 

By licensing the access - with queries from within (instead of page scraping) - doesn't the bandwidth problem go away (or at least get greatly reduced)? And that allows GC to eliminate sites that are still scraping while working hand in hand with those sites that do provide a service to local communities. :lol:

Can you give us an example of a site that is now having problems?

Link to comment

It seems most local orgs are now formalized and could probably arrange for contributions through their membership to pay for a license. They could even limit the cache information obtained geographically - i.e. tie the license to a region. Getting just the data needed uses a lot less bandwidth than downloading HTML full of lots of data that isn't needed (such as formatting, etc.)

 

Andy

Link to comment

I do not think a handful, or even 100, community sites scraping the HTML coming from GC is a bandwidth problem.

 

There are two ways of scraping the data.

 

1) Scheduled job that runs occasionally and pulls information off of a GC page.

2) Real time scrapes, or cut. A page redisplays a portion of a GC page wrapping the HTML in its own page.

 

Now, considering the tens of thousands of users hitting GC, the amount of bandwidth the above would be using is trivial. Regarding #2, if the user could not get the info on the community site, they would get it from GC directly, and probably go through 2 or 3 pages (home, search, and results) to get it. That’s MORE bandwidth.

 

For #1, even if 100 community sites (are there that many that scrape? I doubt it) check a page 4 times an hour, that’s only 400 measly hits an hour, probably directly to the needed URL. TRIVIAL.

 

GC's reasons for blocking access are probably more an issue on content protection than bandwidth.

 

GC's ongoing technical problems are not caused by a few community sites scraping data.
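
For anyone who wants to check the arithmetic for #1, here is a rough sketch. The site count and polling rate are the ones I used above; the page size is purely a made-up assumption (I have no figures from GC), so treat the megabyte number as illustrative only.

```python
# Back-of-the-envelope load estimate for scheduled scraping (case #1).
# SITES and CHECKS_PER_HOUR come from the post; ASSUMED_PAGE_KB is hypothetical.
SITES = 100
CHECKS_PER_HOUR = 4
ASSUMED_PAGE_KB = 50  # hypothetical average page size, not a measured value


def scheduled_hits_per_hour(sites: int, checks_per_hour: int) -> int:
    """Total requests per hour if every site polls one page at the given rate."""
    return sites * checks_per_hour


hits_per_hour = scheduled_hits_per_hour(SITES, CHECKS_PER_HOUR)
hits_per_day = hits_per_hour * 24
mb_per_day = hits_per_day * ASSUMED_PAGE_KB / 1024

print(f"{hits_per_hour} hits/hour, {hits_per_day} hits/day, "
      f"about {mb_per_day:.0f} MB/day at {ASSUMED_PAGE_KB} KB per page")
```

Change the assumed page size or the polling rate and the totals scale linearly, which is the whole point: the load from scheduled jobs is set by a couple of small numbers.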

Link to comment
They are really community building sites, that all point back to and encourage going to the main GC site. But, they seem to be stymied lately in access to the data from the primary site.

 

what changed??? :lol:

Since all the money that could have been used for larger pipes (bandwidth) was used on Waymarking.com instead, and since bandwidth problems are slowing the site on certain days of the week, and since these satellite sites, in effect, steal bandwidth, Groundspeak decided to block them. Saves Groundspeak from purchasing more bandwidth for awhile.

 

That's my guess. :lol:

Jeremy has repeatedly said that bandwidth is not an issue.

 

The issue here is that scraping and re-publishing the info on other sites violates the Terms of Use and therefore Groundspeak has every right to block their access.

Link to comment

Jeremy has repeatedly said that bandwidth is not an issue.

 

The issue here is that scraping and re-publishing the info on other sites violates the Terms of Use and therefore Groundspeak has every right to block their access.

 

What about just displaying stats? I can understand making GPX files that contain an entire state's caches (if they contain more info than the typical .loc files) available to anyone. But why not team stats and cache stats? Wouldn't this information fall under "Third Party Submissions"? I would like to think that if I create a cache page or submit a found (or DNF) log, those materials would be MY intellectual property.

 

All materials available on or through the Site, other than Third Party Submissions (collectively, the “Site Materials”) are the property of Groundspeak or of its licensors and are protected by copyright, trademark, and other intellectual property laws.

 

Thoughts?

Link to comment

You forgot:

All comments, articles, tutorials, screenshots, pictures, graphics, tools, downloads, and all other materials submitted to Groundspeak in connection with the Site or available through the Site (collectively, “Submissions”) remain the property and copyright of the original author.

 

Hence, put a release on your cache page:

"I hereby give permission to all persons the ability to copy, reproduce in part or in whole any information on MY cache"..

 

Same thing with a log entry....

 

But , aggregate statistics are gc's numbers. They don't have to share them.

Link to comment

I am wondering if Groundspeak knows who they are cutting off.

Is it 'Hey, cut off that IP address for AZgeocaching.com!'

Or is it just 'Hey IP xxx.xx.xxx.xxx is crawling our site, block it!'

 

If TPTB know (or don't know) whom they are blocking, it puts a completely different spin on what has been happening.

 

AZgeocaching.com has no intention of undermining Groundspeak (or Geocaching.com) but only wishes to provide added STATE SPECIFIC benefits to Arizona Geocachers.

 

Yes, added STATE SPECIFIC benefits that Groundspeak seems incapable, or at least unwilling to provide.

 

If you won't pick the apples, let us make our own cider!

Link to comment

Considering most IP addresses used by hosting providers have generic reverse-dns entries, I doubt Groundspeak knows who gets cut off. And with the automatic throttling discussed here in the past, I doubt any human even notices. The first they probably hear about it is when someone complains here or to the contact@ address.
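
To illustrate why an automatic throttle would not know whom it is cutting off, here is a minimal sketch of a per-IP sliding-window limiter. It is only my guess at the general technique, not Groundspeak's actual code; the class name, limits, and addresses are all made up for the example.

```python
from collections import defaultdict, deque


class IpThrottle:
    """Hypothetical per-IP sliding-window limiter (illustration only)."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Timestamps of recently allowed requests, keyed by IP address.
        self._hits = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        """Return True if this request is allowed, False if it is throttled."""
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


throttle = IpThrottle(max_requests=3, window_seconds=60)
# The fourth request inside one minute from the same address is refused.
results = [throttle.allow("203.0.113.7", t) for t in (0, 1, 2, 3)]
print(results)
```

Note that the only key is the address. Nothing in a scheme like this looks up which site owns the IP, so a local stats site and an anonymous crawler would be treated the same.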

Link to comment
I am wondering if Groundspeak knows who they are cutting off.

Is it 'Hey, cut off that IP address for AZgeocaching.com!'

Or is it just 'Hey IP xxx.xx.xxx.xxx is crawling our site, block it!'

 

If TPTB know (or don't know) whom they are blocking, it puts a completely different spin on what has been happening.

 

AZgeocaching.com has no intention of undermining Groundspeak (or Geocaching.com) but only wishes to provide added STATE SPECIFIC benefits to Arizona Geocachers.

 

Yes, added STATE SPECIFIC benefits that Groundspeak seems incapable, or at least unwilling to provide.

 

If you won't pick the apples, let us make our own cider!

I would think you would get permission BEFORE violating the Terms of Use, not after. If they do not respond as quickly as you like, I suggest you wait a little longer. There are more things going on than just your violation of the Terms of Use.

Link to comment
I am wondering if Groundspeak knows who they are cutting off.

Is it 'Hey, cut off that IP address for AZgeocaching.com!'

Or is it just 'Hey IP xxx.xx.xxx.xxx is crawling our site, block it!'

 

If TPTB know (or don't know) whom they are blocking, it puts a completely different spin on what has been happening.

 

AZgeocaching.com has no intention of undermining Groundspeak (or Geocaching.com) but only wishes to provide added STATE SPECIFIC benefits to Arizona Geocachers.

 

Yes, added STATE SPECIFIC benefits that Groundspeak seems incapable, or at least unwilling to provide.

 

If you won't pick the apples, let us make our own cider!

I would think you would get permission BEFORE violating the Terms of Use, not after. If they do not respond as quickly as you like, I suggest you wait a little longer. There are more things going on than just your violation of the Terms of Use.

I'm not sure the terms HAVE been violated.

How does THIS SITE manage to skate by?

Link to comment
I would think you would get permission BEFORE violating the Terms of Use, not after. If they do not respond as quickly as you like, I suggest you wait a little longer. There are more things going on than just your violation of the Terms of Use.

The question I am trying to ask is:

How can we get together, on behalf of all these sites, and reach a cooperative agreement that does not violate the terms of use? If Groundspeak wishes to maintain control, or to block scrapers, then how can we work with them to accomplish the local goals of these types of sites while remaining within bounds?

 

While I have my own agenda for certain sites that I would like to see working, the reality is that there are dozens of sites like these, and they exist worldwide. Can Groundspeak come to terms with these sites in a way that is beneficial to both? That is my question.

 

Jeremy, are you listening?

Link to comment

IMHO

 

The "problem" here is that you want to compile, create and store statistics about cachers and caches within your state. Groundspeak, as a general rule, sees little or no point in such statistics and has on numerous occasions refused to supply these statistics.

 

I would conclude that there is little chance of talking them into new policies that allow certain sites to grab that information. No bandwidth or server load problem at all. No license agreement or fees that need to be hashed out. Simply a problem in philosophy.

 

Your philosophy about numbers and stats versus theirs. They own the aggregate data and you do not.

Edited by StarBrand
Link to comment
Your philosophy about numbers and stats versus theirs. They own the aggregate data and you do not.

I agree that they have control of the data. That is why I am asking the question whether an agreement can be worked out?

 

There is obviously some demand for these kind of statistics, whether Groundspeak sees a point in them or not. I am not asking them to present the stats, just to work in cooperation with those that do.

 

Nor am I suggesting that these sites should obtain the required data outside an agreement. But, sans an available cooperative avenue, they will gather the data however they can.

 

If Groundspeak were to cooperate, then a load on the servers could be reduced (not the real issue). And possibly some small revenue could be generated. And, the local caching groups could still see that data the way they wanted.

 

Can GP deny that these satellite sites have provided some benefit to them as well? By driving cachers to them, training new cachers, adding interaction and friendships between local cachers?

 

Is GP just going to dig in their heels and say no? Or is it going to see an opportunity and/or demand and fill that void?

Link to comment

GC's site. I figure they can serve (open) to anyone they feel compelled to.

 

It's kinda like your local restaurant. "No Shoes. No Shirt. No Service." Or "We have the right to refuse service to anyone." It's a stretch but if it's their site, and it is, then they can govern how they want to. If someone needs data or service, they should seek permission before doing. Just MHO.

 

You gotta also think security. Think to Today's Cacher. It was hacked.

 

Maybe I'm too far to the right on this. Maybe not.

 

:)

Link to comment
If someone needs data or service, they should seek permission before doing.

 

You gotta also think security. Think to Today's Cacher. It was hacked.

Seek permission? Isn't that what I am trying to do?

 

As for security: I am not suggesting that the remote sites do a direct query of the database. I am suggesting that perhaps, similar to a PQ, an agreement can be reached as to the needed queries, and then the data is sent (ftp, email, scp, rsync, whatever) to the target site. Then the target can manipulate that data however it wants to. The database is never open directly to the target licensee.

 

What I am asking is whether Jeremy (Groundspeak) is open to the idea at all. Or, is the topic closed, period.

Link to comment
If someone needs data or service, they should seek permission before doing.

 

You gotta also think security. Think to Today's Cacher. It was hacked.

Seek permission? Isn't that what I am trying to do?

 

Did you send an email to the contact address?

No? do it

Yes? good, now wait

 

If or when someone replies come back and we'll all give our opinions about the reply... until then how does this thread help? <_<

Link to comment
Did you send an email to the contact address?

No? do it

Yes? good, now wait

 

If or when someone replies come back and we'll all give our opinions about the reply... until then how does this thread help? <_<

Yes, I did. I have not received a reply.

 

How can it help? Maybe by getting some interest generated so they will reply? <_<

Edited by stringcachers
Link to comment

It occurs to me that Geocaching.com makes money by selling advertising and selling premium memberships. Groundspeak.com also makes money by encouraging people to pay for premium memberships. Every hit missed by Geocaching.com because some local group has mined the data off Geocaching costs Geocaching.com advertising revenue. If people choose to use local caching bulletin boards rather than Groundspeak, then Groundspeak loses potential sources of revenue from advertising and premium memberships.

 

To put it bluntly, site traffic generates income, if a local group reduces traffic to Geocaching or Groundspeak, it also reduces income to those sites. Why would any rational person or business give away, for free, information they paid to generate, when it will reduce the income they receive for their efforts?

Link to comment
Did you send an email to the contact address?

No? do it

Yes? good, now wait

 

If or when someone replies come back and we'll all give our opinions about the reply... until then how does this thread help? <_<

Yes, I did. I have not received a reply.

 

How can it help? Maybe by getting some interest generated so they will reply? <_<

GC.com generally does not respond to questions like the one you posed. The only site I'm aware of that did work something out was Buxleys. Buxleys does maps. As for stats, Groundspeak doesn't support them and I've never seen them give permission.

 

All topics geocaching and related are open for discussion in the forums. Plus a few that aren't if you paid your dues.

Link to comment

GC.com generally does not respond to questions like the one you posed. The only site I'm aware of that did work something out was Buxleys. Buxleys does maps. As for stats, Groundspeak doesn't support them and I've never seen them give permission.

 

All topics geocaching and related are open for discussion in the forums. Plus a few that aren't if you paid your dues.

Yes, I pay my dues.

 

So, what you are saying is that I am beating my head against the wall for nothing? And that Jeremy is not even willing to come out and give a simple "no" (though I would love it if he would give a "maybe")?

Link to comment
...Yes, I pay my dues.

 

So, what you are saying is that I am beating my head against the wall for nothing? And that Jeremy is not even willing to come out and give a simple "no" (though I would love it if he would give a "maybe")?

He does make his position known every now and then; it's just not likely to be in any one topic such as this. You can bet they read this thread though.

 

However when it comes to stats you would be beating your head against the wall. That Buxleys did work out some kind of deal though, holds out hope that for non stats purpose things can be worked out.

Link to comment

 

You gotta also think security. Think to Today's Cacher. It was hacked.

 

Maybe I'm too far to the right on this. Maybe not.

 

:lol:

 

Not to hijack the topic or anything, but IIRC, the so-called 'hack' was nothing more than someone modifying the URL in their address bar to see if a subsequent issue was there. Using DOS commands in an IE window to exploit a Code Red/Code Red II infected machine...now that's hacking. :ph34r: But I wouldn't know anything about that personally.

Link to comment
He does make his position known every now and then; it's just not likely to be in any one topic such as this. You can bet they read this thread though.

 

However when it comes to stats you would be beating your head against the wall. That Buxleys did work out some kind of deal though, holds out hope that for non stats purpose things can be worked out.

So, two questions do come to mind.

1) In what kind of forum does he make his position known, and can you give me examples?

2) How would I go about having a real discussion so that I can come to an understanding about what kind of deal is workable?

:huh:

Link to comment
...So, two questions do come to mind.

1) In what kind of forum does he make his position known, and can you give me examples?

2) How would I go about having a real discussion so that I can come to an understanding about what kind of deal is workable?

:huh:

1) Do a search on stats or data scraping

2) The best way, if you do have something in mind for your own website, is to contact Groundspeak. While we can all discuss what we would like to see in the forums, only Groundspeak staff can give the permission needed and work out the details.

Link to comment
It's kinda like your local restaurant. "No Shoes. No Shirt. No Service." Or "We have the right to refuse service to anyone." It's a stretch but if it's their site, and it is, then they can govern how they want to.

I think your analogy is a little flawed. A better analogy would be to add the fact that the restaurant is refusing part of the service (perhaps the wine list or desserts) to the people who supplied free raw ingredients for the food to be made.

 

Regarding the suggestion I posted earlier: I estimate it would take probably no more than a week to implement and would give gc.com access to multiple additional revenue streams. It seems to make business sense to me; I'm just not sure why they wouldn't take that approach. Every time a local site is cut off it angers all the users of that site. Over time the number of people who don't like gc.com will gradually build. Implementing a fee-based access system would IMO resolve all problems.

 

Andy

Link to comment
1) Do a search on stats or data scraping

2) The best way, if you do have something in mind for your own website, is to contact Groundspeak. While we can all discuss what we would like to see in the forums, only Groundspeak staff can give the permission needed and work out the details.

 

You'll find that many of the state and non-US sites listing stats are approved by Groundspeak. They collect their data in an approved manner and at approved times, using a login to the site, and so can be tracked very easily. In the case of the UK stats site, they have very strict rules on what, when and how they collect data, so as not to affect the running of the site. Dave

 

Is this true? :huh:

Edited by stringcachers
Link to comment
...Regarding the suggestion I posted earlier: I estimate it would take probably no more than a week to implement and would give gc.com access to multiple additional revenue streams. It seems to make business sense to me; I'm just not sure why they wouldn't take that approach. Every time a local site is cut off it angers all the users of that site. Over time the number of people who don't like gc.com will gradually build. Implementing a fee-based access system would IMO resolve all problems.

 

Andy

What you are failing to grasp here is that TPTB have concluded that overt amounts of statistics are in fact detrimental to caching. No matter how much you want them or think it is an opportunity.

 

You own your data. I own mine. They own the aggregate. Feel free to compile and publish all the stats you want concerning your submissions. Share them with your friends. Combine your stats with your friends' stats.

 

I personally don't care to have my activities over-analyzed by random groups and I know many other cachers feel that way too.

 

I like just the basic numbers - how many did I find - how many have found my hides. Things like that.

 

If you can work out some kind of deal - more power to you. Ask TPTB and be persistent about it but don't be surprised or angry if they continue to be against stats.

 

Just my opinion.

Link to comment

What you are failing to grasp here is that TPTB have concluded that overt amounts of statistics are in fact detrimental to caching. No matter how much you want them or think it is an opportunity.

...

I personally don't care to have my activities over-analyzed by random groups and I know many other cachers feel that way too.

 

If you can work out some kind of deal - more power to you. Ask TPTB and be persistent about it but don't be surprised or angry if they continue to be against stats.

 

Help me understand why stats are bad for caching in general.

 

Is it:

1) A privacy issue? (This is an argument that I can appreciate).

2) "This is not a competition activity" attitude?

3) ????

 

As for being surprised or angry - I am not. I am simply trying to get a clear understanding of the position of GC - from GC. If I am fighting a losing battle, fine. But you don't stop a battle and run away simply because somebody on the sidelines tells you that you will lose.

 

As for it being a battle, I would rather it not be. I just want to know if peace talks are even possible and whom to talk to. That is it.

 

I have greatly enjoyed this game/sport/hobby/activity over the past couple of years. A large part of that enjoyment has come from the friendships I have made with fellow cachers. And local stats have encouraged that friendly interaction. It is as simple as that.

Edited by stringcachers
Link to comment
What you are failing to grasp here is that TPTB have concluded that overt amounts of statistics are in fact detrimental to caching. No matter how much you want them or think it is an opportunity.

 

You own your data. I own mine. They own the aggregate.

 

I personally don't care to have my activities over-analyzed by random groups and I know many other cachers feel that way too.

Horsecrap.

 

Just because you're prejudiced against them doesn't mean they ruin caching.

 

I can show you a great thread on statistics that hasn't kept you up at night crying over how your caches have been included in the data and had you lamenting the oncoming death of geocaching.

 

TPTB have concluded that providing overt amounts of statistics isn't of interest to them, and therefore they put everything else on their agenda for fixing/improving the site above it. They're not going to program what they're not interested in. This is why Waymarking.com has a greater number of statistical implementations planned. The person responsible for that package is interested in stats and Jeremy has let them work towards that end. Some of that *may* at some point even be brought over to GC.com. Who knows, time will tell...but even if you have a "Stats" tab on your profile pages, it won't be the death knell of caching. Even if there were to be *competitive* stats implemented in the future, it wouldn't impact you if *you* don't want it to (even without an opt-out/opt-in policy!).

 

The fact is that you own your information only in so much as I can't copy it without permission. You don't own a pair of coordinates. GS only owns the aggregate in so much as how it's stored, presented for use, and the access to it. They don't own the fact that 40 people have visited a certain cache or that a state has been increasing in cache total faster than its neighbor. Because the dataset is so large, it's very difficult to abide by the terms of use of the site to generate the statistics, but that information and analysis is free (as in "speech", not as in "beer").

 

In fact, I wish I had time to put up a webpage with an in-depth analysis of your caching, including time-find linear regression and your yearly/monthly/weekly average and everything else. I could do it without breaking the TOU/TOS and then they'd be there for everyone to see!

 

If you truly don't care, then stats aren't detrimental to anything and are interesting to those that like to see them. Maybe you meant to say "I personally hate to have my activities over-analyzed...". At that point, I have to wonder how you cope with people comparing themselves to your yardwork, your driving, your childcare, your job performance...people don't need geocaching to have random others analyzing them. There's nothing wrong with statistics and there's nothing detrimental about them to games (e.g., see 'Baseball', pg. 42).

 

EDIT: BBcode not HTML...duh.

Edited by ju66l3r
Link to comment
This topic is now closed to further replies.