BK-Hunters

Broken URLs

5 posts in this topic

Ever discover that a URL you entered in a Waymark's URL box doesn't work? The on-page URL boxes, such as "Wikipedia Url:", massage the URL, supposedly to handle "unsafe" or non-standard characters, but the Waymarking code apparently doesn't catch every such character. We came across this last night in a Wiki Waymark we were reviewing. When included in the long description the URL worked just fine, but when entered in the "Wikipedia Url:" field the URL broke. Strangely, the only times we've come across this problem have been with Wiki URLs.

 

Strangely, the URL broke on a "single quote", supposedly a "safe" character. Also in the URL was the word "Musée" which was encoded by Waymarking as "Mus%C3%A9e". I would have encoded it as "Mus%E9e", but "Mus%C3%A9e" works, for some reason. The single quote wasn't encoded and broke the URL, which it shouldn't have done. In any event, this is an explanation of how to fix these little glitches.

 

In the example above, the "%C3%A9" and "%E9" are the encodings for the characters they replace in the URL. The percent sign tells the parser that a hex value follows, and the two characters following the percent sign (%) represent the hexadecimal value of a given character in the ASCII table of characters. Have a peek at the table right now to get an idea of what we're talking about.
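For anyone who would rather see the mechanics than read a table, here is a minimal Python sketch of the idea: each byte of a character becomes a percent sign followed by two hex digits. The helper name `percent_encode_bytes` is made up for illustration, and Python is just a convenient choice here, not anything Waymarking is known to use.

```python
def percent_encode_bytes(text: str, encoding: str) -> str:
    """Percent-encode every byte of `text` (hypothetical helper, for illustration only)."""
    return "".join(f"%{byte:02X}" for byte in text.encode(encoding))


# "é" is a single byte in ISO 8859-1 (Latin-1) but two bytes in UTF-8.
latin1_e_acute = percent_encode_bytes("é", "latin-1")
utf8_e_acute = percent_encode_bytes("é", "utf-8")

print(latin1_e_acute)  # %E9
print(utf8_e_acute)    # %C3%A9
```

The two results match the two forms of "Musée" seen in the URL: the only difference is which character table the bytes were taken from.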

 

You should see that the Hexadecimal code (HEX) for "é" is "E9" - which is how I would have encoded the é in the word "Musée", the result being "Mus%E9e". How this - "Mus%C3%A9e" works I'll never know as that encoding should result in "Musée".

 

Anyhow, to implement a fix, just find the character(s) which break(s) your URL in the ASCII table at the above URL and substitute the Hex value from the table, preceded by the percent sign. EXAMPLE - "Musée" = "Mus%E9e".
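If looking characters up by hand gets tedious, the same substitution can be sketched in Python with the standard library's `urllib.parse.quote`. The title below is a hypothetical example, not the actual URL from the waymark, and note that `quote` uses UTF-8 unless told otherwise, so the é comes out as "%C3%A9" rather than "%E9".

```python
from urllib.parse import quote, unquote

# Hypothetical article title containing both troublesome characters.
title = "Musée_d'Orsay"

# With safe="" nothing beyond letters, digits and "_.-~" is left unencoded,
# so the single quote becomes %27.
encoded = quote(title, safe="")
print(encoded)  # Mus%C3%A9e_d%27Orsay

# Decoding gives back the original text.
decoded = unquote(encoded)
print(decoded == title)  # True
```

Only the troublesome part of the URL should go through `quote`; running a whole URL through it with safe="" would also encode the colon and slashes.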

 

ASCII, incidentally, stands for American Standard Code for Information Interchange and has been around about as long as there have been computers, being changed, extended and modified through the years.

 

EDIT: While we're on the subject, if you have ever wanted to insert a special character, like the copyright symbol (©) or the Euro symbol (€) in a waymark, or anywhere else for that matter, just copy the code from the "HTML Number" or "HTML Name" column of the ASCII table.

 

Examples - Euro symbol - HTML Number = (Ampersand)#128; - HTML Name = (Ampersand)euro;

 

Can't get these to escape properly; "(Ampersand)" stands for the Ampersand character (&). Just copy the text from the appropriate line of either the HTML Number or HTML Name column of the ASCII table, your choice. Couldn't be simpler.

 

Either works in an HTML document. Just remember that the semicolon is part of the code and is necessary.
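To check what a given code will turn into before pasting it anywhere, a small Python sketch using the standard `html` module can help. This only shows how an HTML5-style parser reads the codes; it is an assumption that the page in question behaves the same way. The number 8364 is the Unicode code point for the Euro sign, shown alongside the 128 from the table.

```python
import html

euro_by_name = html.unescape("&euro;")
euro_by_number = html.unescape("&#8364;")  # Unicode code point U+20AC
euro_legacy = html.unescape("&#128;")      # legacy number, remapped to the Euro sign under HTML5 rules
copyright_sign = html.unescape("&copy;")

print(euro_by_name, euro_by_number, euro_legacy, copyright_sign)  # € € € ©

# Drop the semicolon and the Euro name is no longer recognised.
no_semicolon = html.unescape("&euro")
print(no_semicolon)  # &euro
```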


Thank you for the useful tips.

 

Perhaps you have the skills to create another workaround. URLs entered on the waymark page for a French Benchmark no longer work after the waymark is created. The .pdf links are correct but get broken once Waymarking gets involved.

 

WMTCP3



http://geodesie.ign.fr/fiches/pdf/6237101.pdf

This is what I see.

 

http://geodesie.ign.fr/fiches/pdf/0924603.pdf

Here's one that works, which means that the pdf number (6237101) has to be incorrect.

 

There are no bad characters in the URL, so I have no idea what may have gone wrong with it. Can you send me a copy of a good, working URL, or even post it here?

 

I'm really slooooow... I just realized that's mine. I believe I tested the URL when I submitted it and it worked then. Very Strange.



"Mus%E9e" is ISO 8859-1, which is good enough for the languages predominantly used in the Americas and Western Europe (except for Icelandic, Welsh, Esperanto, Unified Canadian Aboriginal Syllabics and a few others).

 

"Mus%C3%A9e" is UTF-8, which can encode almost any script ever used on this planet.

 

Both encodings are correct, but a computer needs to know which one to use. Sometimes they guess wrong and then the result looks broken like "Musée".



Thanks for reminding me of UTF-8. I look at it so seldom that I forget it even exists. I just bookmarked the table on my bookmarks bar to keep it in my face.
