+fizzymagic Posted September 8, 2020

My last attempt at this might have been infelicitous, so I am sorry if I offended anyone at HQ. But I really need to know, both to understand for myself and to help others who are having problems. If there was some information already given about this issue I will be happy to acknowledge my search incompetence.

What is going on with the Geocaching.com-hosted images? Here are the symptoms I am seeing:

- All images that have jpg or png extensions are delivered as jpeg images
- Uploading a png image with transparency results in a jpeg image with the transparent portions turned to black
- JPEG images have been transcoded to remove any non-JFIF-standard data (such as appended zip files)
- GIF images appear to be left alone as long as they fall within size guidelines.

Could somebody who knows what is happening please provide information to the caching community so that we can know how to adjust accordingly?
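For anyone who wants to check the first symptom on their own images, here is a minimal sketch that reports what a file really is from its leading bytes rather than from its extension. The function name `sniff_image_format` is made up for this illustration; the signatures themselves are the standard JPEG, PNG, and GIF magic bytes.

```python
# Hypothetical helper (name not from the thread): identify an image by its
# leading "magic" bytes instead of trusting the file extension.
SIGNATURES = (
    (b"\xff\xd8\xff", "jpeg"),        # JPEG start-of-image marker
    (b"\x89PNG\r\n\x1a\n", "png"),    # 8-byte PNG signature
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
)


def sniff_image_format(data: bytes) -> str:
    """Return 'jpeg', 'png', 'gif', or 'unknown' for the given bytes."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"
```

To use it, read the first few bytes of a downloaded image, for example `sniff_image_format(open("downloaded.png", "rb").read(16))`, where `downloaded.png` is a placeholder file name. If a file saved with a png extension comes back as `jpeg`, it was re-encoded somewhere along the way.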
+Hügh Posted September 9, 2020

Some initial observations.

All the EXIF fields that I tested (title, comment, camera details, geolocation, etc.) were stripped out by the server. The Content-Type header contains information corresponding to the filetype of the original upload, though the image served has the magic bytes of a JPEG.

Digging deeper, I notice that the file upload server is Microsoft IIS 10.0 running ASP.NET v4.0.30319. I've never used C# or the ASP.NET framework, but I think it likely that there are powerful image processing options offered. The libraries listed here all look capable of performing the "cleaning operations" that you've described above.

Regardless, most of these symptoms can be attributed to the uploaded image being decoded to an array of "pixels" before being re-encoded as a JPEG and written out to disc. This Python snippet does a fairly decent job of doing that.

    import sys
    from PIL import Image

    if __name__ == "__main__":
        # load the image data into memory
        im = Image.open(sys.argv[1])
        # convert("RGB") removes the alpha channel, if it exists
        im = im.convert("RGB")
        # re-encode the image as a JPEG and write it out
        im.save("processed.jpg", "JPEG")

    # usage: python process.py upload.ext

Edited September 9, 2020 by Hügh: "Always comment your code," they said.
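To see what a decode-and-re-encode step would throw away from a PNG before uploading it, a puzzle owner could list everything in the file that is not pixel data. The sketch below walks the PNG chunk structure (4-byte big-endian length, 4-byte type, payload, 4-byte CRC) and returns the chunk types plus any bytes appended after the final IEND chunk. The function name `split_png` is an assumption for illustration, CRCs are not verified, and nothing here reproduces what the Geocaching.com server actually does.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def split_png(data: bytes):
    """Return (chunk_types, trailing_bytes) for the bytes of a PNG file.

    Hypothetical helper: it only walks the chunk layout and does not
    validate CRCs or decode any pixel data.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    chunk_types = []
    while pos + 8 <= len(data):
        # Each chunk starts with a big-endian length and a 4-byte type.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        chunk_types.append(ctype.decode("ascii", "replace"))
        pos += 8 + length + 4  # chunk header + payload + CRC
        if ctype == b"IEND":
            # Anything after IEND is ignored by image decoders.
            return chunk_types, data[pos:]
    raise ValueError("truncated PNG: no IEND chunk found")
```

Text chunks such as `tEXt` and any non-empty trailing bytes (an appended zip file, for instance) are not part of the decoded pixel array, so they would not survive a pipeline that keeps only pixels.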
+fizzymagic Posted September 9, 2020 Author

2 hours ago, Hügh said: Regardless; most of these symptoms can be attributed to the uploaded image being decoded to an array of "pixels" before being re-encoded as a JPEG/written out to disc.

Yes, I believe you have diagnosed the problem correctly. What you have described is known as transcoding, and it was common in the early days of the Internet when bandwidth was at a premium.

I have considered a number of theories that explain the observable evidence, but I am anxious to hear the official reason so that I can both adjust my caches and help others adjust theirs. In particular, there are many puzzles that use png files and require lossless compression, which, as I am sure you know, is not used for the jpeg files sent from this server (there is a lossless jpeg standard, but I have never once actually seen such a file in the wild).

I'm not looking to beat on GC or anything; I just need to know exactly what is being done!
+ecanderson Posted September 9, 2020

Perhaps some serious paranoia regarding Trojan Horse images? In the past, vulnerabilities in some image decoders/libraries have been exploited to execute a package held within an image. Of course, that kind of code has to be targeted to a particular app, and most commonly, a particular version of the app, but ...
+Dgwphotos Posted September 9, 2020

3 hours ago, ecanderson said: Perhaps some serious paranoia regarding Trojan Horse images? In the past, vulnerabilities in some image decoders/libraries have been exploited to execute a package held within an image. Of course, that kind of code has to be targeted to a particular app, and most commonly, a particular version of the app, but ...

I wonder if that would undermine some picture based puzzles, though?
+fizzymagic Posted September 9, 2020 Author

2 hours ago, Dgwphotos said: I wonder if that would undermine some picture based puzzles, though?

It has made at least several hundred picture-based puzzles unsolvable already. This cache is an example. The CO has unsuccessfully tried to fix the problem. Without any guidance on what is happening from GC.com, I don't think he knows what to do. I'd love to help, but I also don't know exactly what is happening. Hence the question.

Edited September 9, 2020 by fizzymagic
+Hügh Posted September 9, 2020

14 hours ago, fizzymagic said: What you have described is known as transcoding and it was common in the early days of the Internet when bandwidth was at a premium.

Ah, I was unfamiliar with the term. My millennial self must have grown accustomed to the mega- or even giga-bit/s bandwidth of the modern Internet. Dial-up must have been unbearably slow. (so funny, I know)

Regardless, I am legitimately interested in hearing the "official" answer. As you've said, it would be very helpful for creating and maintaining image-based puzzles.

Edited September 9, 2020 by Hügh