
Bad Html In Cache Pages


Shilo


When I run a .gpx file through Plucker, it hangs on bad HTML in some cache pages. I've figured out a workaround, but what should I do to get the bad HTML fixed? I heard that Groundspeak uses something called 'Tidy' to fix bad HTML. Does this exist? I heard they run it on new cache pages, or something to that effect.

Who do I go to for help?

Thanks

Link to comment

This has rendered my PDA utterly useless. Can you link to a forum thread where the workaround is discussed, so that I can use pocket queries on my PDA again? I don't wish to derail this thread.

 

I agree with you; the HTML is quite often a nuisance. Is the problem caused by old cache pages, created before the site began using HTML Tidy? If so, Tidy only does its thing if the owner edits the cache page. Tidy acts on all newly created cache pages and edits to existing cache pages.

Link to comment
This has rendered my PDA utterly useless. Can you link to a forum thread where the workaround is discussed, so that I can use pocket queries on my PDA again?

How long has this been going on for you? I haven't had any troubles at all. But then, I use my own application to generate the HTML for Plucker, and if I find something it doesn't like, I just add that to the list of things it strips out.

 

But it seems to me that this is a big problem. Running Tidy on all existing cache pages is not a viable solution, unfortunately. But maybe there could be some mechanism by which people could report cache pages that have bad HTML in them and have just those pages run through Tidy.
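The "list of things it strips out" approach mentioned above can be sketched roughly as follows. This is only an illustration: the patterns and the name `strip_offending` are made up for the example and are not taken from the poster's application, Spinner, or Plucker.

```python
import re

# Hypothetical strip list: each entry is a regex for markup to delete outright.
# Grow the list whenever the converter chokes on something new.
STRIP_PATTERNS = [
    # Whole <script> elements, including their contents.
    re.compile(r"<script\b.*?</script>", re.IGNORECASE | re.DOTALL),
    # Stand-alone <embed> tags.
    re.compile(r"<embed\b[^>]*>", re.IGNORECASE),
    # IE-only CSS filters such as "filter: progid:...(...);".
    re.compile(r"filter\s*:\s*progid:[^;\"']*\)\s*;?", re.IGNORECASE),
]


def strip_offending(html: str) -> str:
    """Return html with every match of every strip pattern removed."""
    for pattern in STRIP_PATTERNS:
        html = pattern.sub("", html)
    return html
```

Regex stripping like this only nibbles at known offenders one at a time; it does not repair markup that is structurally broken.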

Link to comment


So I guess I heard correctly about Tidy. I wasn't sure. It would be nice to be able to flag specific cache pages to be run through Tidy. I understand that every page couldn't be processed daily. That would kill the server, I'm sure.

Here's my post in the thread I was reading. If you need a better explanation, let me know and I'll give it to you tonight after work.

 

Thanks for all the replies and hopefully we can get someone from Groundspeak to help us on the bad html cache pages :laughing:

Link to comment

I use Watcher, Plucker and Spinner to sort my pocket queries and load them onto my PDA. Since January, I've not been able to do this, because Plucker "hangs" on caches, just as described in the forum thread linked to by Shilo (thanks!). Manually removing each offending cache from a 500-cache query isn't very viable. I stumbled across a similar solution on my own. Remove cache #42, rerun Plucker, watch Plucker hang on cache #85, remove cache #85, etc.
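The remove-and-rerun loop described above could in principle be automated by testing each cache on its own with a time limit. The sketch below assumes each cache has already been split into its own input file, and that `converts_ok` is a check you supply yourself; both function names are hypothetical, and no real Plucker command line is shown because the exact invocation depends on your setup.

```python
import subprocess
from typing import Callable, Iterable, List


def find_offenders(cache_ids: Iterable[str],
                   converts_ok: Callable[[str], bool]) -> List[str]:
    """Return the cache IDs for which converts_ok() reports failure."""
    return [cid for cid in cache_ids if not converts_ok(cid)]


def run_with_timeout(command: List[str], seconds: float = 60.0) -> bool:
    """Run a converter command; a hang (timeout) or non-zero exit counts as failure."""
    try:
        completed = subprocess.run(command, timeout=seconds, capture_output=True)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process when the timeout expires.
        return False
    return completed.returncode == 0
```

You would pass something like `lambda cid: run_with_timeout([your_converter, cid + ".html"])` as `converts_ok`, where `your_converter` stands in for whatever command you normally run by hand.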

 

What some cache page owners regard as "expressions of creativity" effectively prevents me from using pocket queries for their intended purpose. I am using the software tools recommended by Geocaching.com on the "waypoints" page, but they *do not work.* That is unfortunate.

 

Out of frustration, I solved the problem by purchasing a laptop computer, which I now run in the car using a power inverter. Watcher works just fine in the car. But if I hike to the cache and need to consult the cache page, like for a multicache, I am out of luck.

Link to comment
What some cache page owners regard as "expressions of creativity" effectively prevents me from using pocket queries for their intended purpose.

Amen to that! I couldn't have said it better. Cache descriptions should be in plain-vanilla HTML; bells and whistles belong off-site.

 

Out of frustration, I solved the problem by purchasing a laptop computer, which I now run in the car using a power inverter. Watcher works just fine in the car. But if I hike to the cache and need to consult the cache page, like for a multicache, I am out of luck.

 

If we could have a list of caches that cause Plucker to barf, we can write a set of filters to fix the offending things in Spinner. I know LilDevil is somewhat distracted these days, but I'll bet I could give him some pre-packaged filtering code and he could put it into Spinner.

 

It's odd, though, that you have only been experiencing this since January, because all new cache pages have been run through Tidy since well before then. Is there a new version of Plucker that is causing this? If there is stuff that Tidy is letting through that causes Plucker to barf, maybe the Tidy configuration should be changed to disallow it.

Link to comment

Fizzy, thanks for the sympathy.

 

When I tried working with a PQ in January, it was the first time in many months that I had done so. (I took a break from high-power geocache finding from September to January.) So I just figured that something had changed, and I hadn't gotten the memo. I lack the technical expertise that people like you and Lil Devil have to deal with such issues. I think there are others in the same boat.

 

I budgeted this evening to do my route planning for a Memorial Day weekend roadtrip out of state. If I have time, I will run a query or two through Plucker and note any caches that hang up the process. If they sail through, then this may isolate the problem to an amateur HTML jockey in my local area, and I will run a PQ of my local area to note offending caches.

 

But if I don't get to it tonight, it will have to wait until next week.

Link to comment

I've been on both Leprechaun's and Plucker's end of that stick, both as a user who wants it to Just Work and as a developer whose parser has been blown away by wildly incorrect interpretations of HTML. It's a bummer.

 

Though I've contributed a few fixes to Plucker's Python code to make it more resilient, I'm a big fan of pushing the solution to this problem upstream. A Perl jock simply cannot regex away every malformed expression, though they can nick at them one by one. Get a reviewer or cache owner to "fix" the cache page in question. 99% of the time, a simple edit (insert a space or something) will fix the page, once you've identified it.

 

Be prepared for noise from the 1% of offending cache pages that are so horribly malformed that Tidy dogmeatifies them, and for resistance to the argument that those pages were already dogmeat and already didn't work correctly in a variety of browsers. The "but it works in Internet Explorer" crowd can be vocal.

 

In my experience, a small number of cache page owners in an area are frequently responsible for this problem, in case you want to solve it via ignore lists...

Link to comment

I used to get the same result with my Spinner/Plucker/Palm setup. So I thought to myself: what pages use more HTML than others? Puzzles. On the next PQ I downloaded, I excluded that type and wham, it worked. It's been working like a champ ever since, and my caching experience has been much more satisfying. If I do discover a puzzle I want to do, I enter it manually.

Link to comment

Interestingly, I have never had my PDA hang once while using Cachemate.

 

Ed

It's usually not the PDA that hangs. It's the computer-based part of the program that uses the PQ files and crunches them. I've not had a problem once I get the caches to my PDA.

Link to comment

If there is stuff that Tidy is letting through that causes Plucker to barf, maybe the Tidy configuration should be changed to disallow it.

From what I understand, Tidy is used when the cache is sent for approval and that's it. I always change my cache pages after they get approved (because it's usually pretty fast here), so I think this is where the problem gets started.

 

5winters has a good point about puzzle caches. The one cache I had hang from my Gulf Shores PQ wasn't a puzzle cache, though. I didn't have my workaround for my Chattanooga, TN PQ last week, so I don't know what type of cache(s) it was. I will have to go run one and see. I'm curious now.

 

Sorry to hear of your multiple hangs, fizzy. Sounds like a long, drawn-out process trying to do it the way I suggested :sad: . I don't have this issue (so far) here at home.

 

I still wish we could get a technical person from Geocaching to find a way we can send caches to a queue to be run through Tidy. Some cache owners probably wouldn't know how to fix it even if they knew they had a bad HTML tag somewhere.

Link to comment


 

No, HTMLTidy gets run every time you edit your cache page. If you want to run your cache page through it, just edit and save.

Link to comment

I've had a similar problem, as Robert is well aware, since he helped me debug some of them over in the GSAK forums.

 

I've found that Sunrise Desktop helps a lot for me; it seems to be much more resilient than Plucker in handling malformed HTML code. You still need Plucker, inasmuch as Sunrise uses the Plucker reader on the handheld itself. However, Sunrise effectively replaces the desktop component of Plucker. It seems to be a bit faster for me, too.

 

The downside is that it doesn't seem to be able to handle quite as large a volume of input as Plucker. My very largest set contains about 2300 caches and I have to tell Sunrise not to output images in order for it to work without a stack overflow. This is not a big loss to me, though.

 

I believe I downloaded it from sourceforge.net.

Edited by WascoZooKeeper
Link to comment

 

 

No, HTMLTidy gets run every time you edit your cache page. If you want to run your cache page through it, just edit and save.

 

Thanks for clearing this up for me. I wasn't sure.

So how is bad HTML getting through, then? Or is the HTML OK and Plucker Desktop just can't handle it?

Link to comment

I tried running a PQ file for my trip this weekend through Spinner and Plucker, but with the same roadblocked results I've been experiencing for months. The first cache that hung up plucker was this travel bug hotel. Can an HTML expert diagnose this?

 

Shilo's solution (ignoring this cache) doesn't work well. As I am traveling out of town, a bug hotel like this one would be high on my list of caches to visit.

 

I would really like for the site's pocket queries to work properly with the recommended software. :lol:

 

For my trip this weekend I will get by with using my laptop computer in the car and hunting in the field without cache descriptions or hints.

Link to comment
The first cache that hung up plucker was this travel bug hotel. Can an HTML expert diagnose this?
Yuck. The template used by GC.com is a mess. The HTML under control of the cache owner is fairly clean. The main problem is with the markup for "The Selector", which tries to stuff a table element into a paragraph. Since the GC.com template uses XHTML, the paragraph must be closed explicitly with </p> before the table can be opened.
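The table-inside-a-paragraph problem described above is something a short script can look for. What follows is a minimal sketch using Python's standard `html.parser`; the class and function names are invented for the example, and it deliberately checks only the one case named in the post (a `<table>` start tag seen while a `<p>` is still open). It is not how Plucker or Tidy detect this.

```python
from html.parser import HTMLParser


class TableInParagraphFinder(HTMLParser):
    """Record every <table> that starts while a <p> has not been closed."""

    def __init__(self):
        super().__init__()
        self.paragraph_open = False
        self.problems = []  # (line, column) of each offending <table>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.paragraph_open = True
        elif tag == "table" and self.paragraph_open:
            self.problems.append(self.getpos())
            # Treat the paragraph as closed so one <p> is reported once.
            self.paragraph_open = False

    def handle_endtag(self, tag):
        if tag == "p":
            self.paragraph_open = False


def find_tables_in_paragraphs(html: str) -> list:
    """Return the positions of <table> tags nested inside an open <p>."""
    finder = TableInParagraphFinder()
    finder.feed(html)
    finder.close()
    return finder.problems
```

An empty result means only that this particular nesting was not found; other block elements inside paragraphs are not checked.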
Link to comment


 

Hmmmmm, I think I'm seeing a pattern here. The cache of mine in question also has this "Selector" in it. My workaround was just meant to get the PQ to run for the rest of the caches instead of letting 1 or 2 caches ruin the whole PQ run. I would suggest printing out the TB hotel page on paper and putting the waypoint in manually or through EasyGPS if you really want to do this cache. In my case, the cache hanging up my PQ isn't on my route anyway, so I'm just ignoring it altogether. I have 480 other caches, so I will keep busy enough.

I'm not a computer expert, or I would try to find an easier or permanent solution to this. This is what I found to work in my case. I was fortunate it was only 1 cache. Others here have found 5 or 6 in one PQ, so my idea is a lot of work. If it were my home area it would be worth it, but just for a weekend getaway, no. If I come up with something better, I'll let you all know. :tired:

Link to comment

But in just about every query I run, there are dozens and dozens of caches that still insist on using the non-searchable, page-hogging Selector graphics instead of the GC.com attributes. Some are probably excellent caches, and if they meet my PQ criteria, they are caches I'd like to search for.

 

My questions:

 

1. Can someone with more expertise confirm that "The Selector" is the correct diagnosis? I am not qualified due to limited technical skills. I know that I *used* to be able to get pages with Selector graphics to appear on my PDA. They were annoying, as I had to scroll past the pictures and logos, but at least I could get to the cache description.

 

2. In the next update to the Groundspeak GPX namespace, can GC.com include the built-in Cache Attributes as part of the data contained in a pocket query? This might reduce people's perceived need to include Selector graphics on their cache pages.

 

3. Is there a site-side fix, like tweaking HTML Tidy, or is that a "canned" process? I guess I don't understand why Tidy wouldn't insert the missing command, if that is standard HTML or XHTML or whatever the terminology is.

 

4. Are GC.com volunteer cache reviewers authorized to edit cache pages that make pocket queries choke when run through the recommended software such as Plucker?

Edited by The Leprechauns
Link to comment

 

1. Can someone with more expertise confirm that "The Selector" is the correct diagnosis? I am not qualified due to limited technical skills. I know that I *used* to be able to get pages with Selector graphics to appear on my PDA. They were annoying, as I had to scroll past the pictures and logos, but at least I could get to the cache description.

 

I offer my cache, ANOTHER Cemetery Cache?? as a test. I just did a quick edit on it to make sure it's run through Tidy.

It has 2 "Selector" attributes on it.

If it's confirmed that this cache makes Plucker Puke, I'll remove them posthaste.

Edited by PAWSitraction
Link to comment

 


I just ran a PQ of the nearest 20 caches including the one you listed and it ran through plucker just fine :lol:

Link to comment

 


Hmmm... It could be that something in the Selector code is still the problem, though. I stripped it down pretty thoroughly: I took out the "Selector table" coloring, and included only what I wanted in that table.

So...hm.

I guess it's not automatically the images from the Selector that are doing it. I wonder if it's the entire table-code that does it.

I still offer up my stripped-down Selector-coded cache page for testing by others to see if it makes their version of Plucker puke.

Link to comment
PAWS, did you add in that one thingie like niraD suggested? He seems to know what he's talking about, and I gather he diagnosed an HTML tag omission as the cause, rather than images or anything.

I checked. The PAWS cache description is indeed missing the same element, but since it didn't make Plucker puke it isn't the problem.

 

That first cache you showed has a LOT of weird stuff in the Selector table. In particular, at the end of the table is this little gem:

 

<tr>
<td style="width:100%;height:10px;" bgcolor="
#ffff33> </td> </tr> <tr> <td style="
filter:="" endcolorstr="#FFffffff);"></td>
</tr>

 

I have no idea what that does, but it's definitely not legal XHTML and I suspect that Plucker doesn't like it. Maybe we could have somebody add that to the end of their Selector table and see what happens.

 

ETA: Maybe that code snippet I had above is an attempt to duplicate the following from the beginning of the table:

 

<tr>
<td style="
width:100%;height:10px; filter: progid:DXImageTransform.Microsoft.gradient(GradientType=0,startColorstr=#ffffffff, endColorstr=#FFffff33);">
</td></tr>

 

Boy, is that nasty. I confess to being very surprised Tidy let that stuff through.
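One way to put the "not legal XHTML" claim to a rough test is to feed each snippet to an XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name is made up for the example, and the two constants are the snippets quoted above, retyped with ordinary spaces. It checks XML well-formedness only, which is a weaker test than XHTML validity, and it will also reject harmless HTML entities such as `&nbsp;` because no DTD is loaded.

```python
import xml.etree.ElementTree as ET

# The snippet from the end of the table: the bgcolor value is never closed
# properly, so tag characters end up inside an attribute value.
END_OF_TABLE = '''<tr>
<td style="width:100%;height:10px;" bgcolor="
#ffff33> </td> </tr> <tr> <td style="
filter:="" endcolorstr="#FFffffff);"></td>
</tr>'''

# The snippet from the beginning of the table: IE-specific CSS, but the
# markup itself is balanced.
START_OF_TABLE = '''<tr>
<td style="
width:100%;height:10px; filter: progid:DXImageTransform.Microsoft.gradient(GradientType=0,startColorstr=#ffffffff, endColorstr=#FFffff33);">
</td></tr>'''


def well_formedness_error(fragment: str):
    """Return None if the fragment parses as XML, else the parser's message."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError as exc:
        return str(exc)
    return None
```

With this check the end-of-table snippet is rejected, while the beginning-of-table snippet parses: its gradient filter is browser-specific styling, not a markup error, so a well-formedness test says nothing about it.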

Edited by fizzymagic
Link to comment

PAWS, did you add in that one thingie like niraD suggested? He seems to know what he's talking about, and I gather he diagnosed an HTML tag omission as the cause, rather than images or anything.

Nope, I didn't.

 


Wasn't Tidy added early this year or late last year, though? That page was probably generated before Tidy.

 

Hm. I could try to add all that stuff into my Selector Table. Do I just copy the second code you posted before the Selector stuff, and park the first code block at the end of the Selector Stuff?

Link to comment
Hm. I could try to add all that stuff into my Selector Table. Do I just copy the second code you posted before the Selector stuff, and park the first code block at the end of the Selector Stuff?

 

Yeah. They both need to go just INSIDE the <table>....</table> tags.

Link to comment

OK, done. Now what happens to Plucker when it runs that cache?

Link to comment
OK, done. Now what happens to Plucker when it runs that cache?

 

Good job. I am placing my money on Plucker barfing.

 

I ran it through my (very old) copy of Plucker and it didn't gag; in fact, it even displayed the cache page. But I haven't been having the troubles others have, so that likely doesn't mean anything.

 

What's astonishing is that HTML Tidy let that abomination through.

Link to comment

Gotta tell you, I was very surprised, indeed, that it let that through.

I even went back to the page editor to make sure that it did go through.

Link to comment

OH, you must be waiting on me? :ph34r:

Let me go run the same PQ and run it through Plucker and I'll let you all know what happened :huh:

Link to comment

 

It worked fine. No problems :ph34r:

Link to comment

OK. Last try. I've re-generated the "Selector" table altogether by going to the Selector site and picking a couple things, copy-and-pasting their stuff into my cache page and deleting the older Selector table I had.

 

Now what happens?

Link to comment


 

I ran it and it was fine. It stuck on a cache for about 2 seconds and continued. I looked through the log and it kicked back an error on ISQ# 96(GCM9Y7). The ISQ's always have pictures in them and I imagine that was the error. Other than that it ran fine. So I have no idea what the problem is unless Tidy fixed your 'Selector' to work fine and the caches others are having trouble with aren't being edited since Tidy was put into use.

Link to comment


I am back from my Memorial Day weekend trip. "Only" 16 cache finds, but I enjoyed the time with my daughter. Our main purpose for traveling was to attend a competition for one of her other hobbies, and she did very well!

 

Ironically, the example travel bug hotel I mentioned earlier turned out to be one of the 16 caches that we found, out of 900+ that were in my queries. It happened to be at the interstate exit we chose to jump off at for random exploration. It was a wet ammo box(!) in plain sight, hanging from a tree in the woods adjoining a hotel parking lot. It had no travel bugs in it except for the three "ghost" bugs that the owner has parked there (to make it look busy, I guess?). The Selector icons really didn't do much to help this cache.

Link to comment


I've switched to using Sunrise on my pc and still using Plucker on my Palm TX and everything has been running great :D:D

I think Sunrise actually runs faster, and I like that you can make lists of all your PQs and only update the ones you want on the Sunrise page (as long as you don't forget to spin the new GPX file!).

Link to comment
This topic is now closed to further replies.