This is a read-only snapshot of the ComputerCraft forums, taken in April 2020.

[1.6.4/CC1.57] File obtained via HTTP differs from original

Started by daviga404, 14 December 2013 - 05:16 PM
daviga404 #1
Posted 14 December 2013 - 06:16 PM
ComputerCraft Version Info:
1.57 Client/1.57 Server

Description of Bug:
When a file is downloaded from a web server with the HTTP API, some bytes in the returned data differ from the original: certain bytes are replaced, and extra bytes appear in their place. This can be seen with Lua's built-in string.byte function, which returns the bytes of a string as integers that can then be printed.

An example of this is a file containing the bytes (represented as hex):
FF FF FF F6

When a string containing these bytes is built in ComputerCraft with Lua's built-in string functions, converting it back to integer byte values gives the correct result. However, if this file is hosted on a web server and obtained using code such as the following:

local handle = http.get("http://website.com/file.txt")
local data = handle.readAll()
handle.close()
the bytes appear to differ from the original ones in the file.

Steps to reproduce:
  1. Create a file on a web server containing the bytes (represented as hex) FF FF FF F6
  2. Read the file using the http API of ComputerCraft and store the data into a variable
  3. Convert the data to an array of bytes represented as integers (can be done using string.byte(variable, 1, -1))
  4. Print out the data, and compare it to the integer representations of the original data
I have created a script which can be used to carry out these steps here, but any script to reproduce these steps should work.
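For reference, the steps above could be sketched roughly as follows. This is not the script mentioned in the post; toHex is a hypothetical helper name, and the URL is the placeholder used earlier. The download part only runs inside ComputerCraft, where the http API exists. The helper loops over the string one byte at a time instead of calling string.byte(variable, 1, -1), which also keeps it usable on large files.

```lua
-- Hypothetical helper: render every byte of a string as two-digit hex.
local function toHex(s)
  local out = {}
  for i = 1, #s do
    out[i] = string.format("%02X", string.byte(s, i))
  end
  return table.concat(out, " ")
end

-- The original file contents, built directly from byte values.
local expected = string.char(0xFF, 0xFF, 0xFF, 0xF6)
print("expected:   " .. toHex(expected))

-- Only attempted when the ComputerCraft http API is available.
if http then
  local handle = http.get("http://website.com/file.txt") -- placeholder URL
  if handle then
    local data = handle.readAll()
    handle.close()
    print("downloaded: " .. toHex(data))
    print(data == expected and "bytes match" or "bytes differ")
  else
    print("download failed")
  end
end
```

If the bug is present, the second line of output should show more than four bytes.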
distantcam #2
Posted 14 December 2013 - 09:19 PM
The problem here is that ComputerCraft is "helpfully" converting your download to a string, according to the encoding given by the server.

In your case the URL you're using returns UTF-8 as the encoding. In UTF-8, the character with code 'FF' is encoded as the two bytes 'C3BF', and 'F6' as 'C3B6'. See http://www.utf8-chartable.de/
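To illustrate the mapping, here is a small sketch that assumes each original byte is treated as a character code of the same value and then written out as UTF-8. That assumption is my reading of the behaviour described here, not confirmed ComputerCraft internals, and the function names are made up for the example.

```lua
-- Encode one character code (0x00-0x7FF only) as UTF-8.
local function utf8Encode(cp)
  if cp < 0x80 then
    return string.char(cp)
  elseif cp < 0x800 then
    -- Two-byte form: 110xxxxx 10xxxxxx
    return string.char(0xC0 + math.floor(cp / 0x40), 0x80 + cp % 0x40)
  end
  error("character code out of range for this sketch")
end

-- Assumed conversion: every input byte becomes one UTF-8 encoded character.
local function reencode(s)
  local out = {}
  for i = 1, #s do
    out[i] = utf8Encode(string.byte(s, i))
  end
  return table.concat(out)
end

-- Render every byte of a string as two-digit hex.
local function toHex(s)
  local out = {}
  for i = 1, #s do
    out[i] = string.format("%02X", string.byte(s, i))
  end
  return table.concat(out, " ")
end

print(toHex(reencode(string.char(0xFF, 0xFF, 0xFF, 0xF6))))
-- prints: C3 BF C3 BF C3 BF C3 B6
```

Under that assumption the four-byte file comes back as eight bytes, and bytes below 80 (plain ASCII) pass through unchanged.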

The best way to solve this would be to have the web file return the values as text, as in "FF" the string rather than 'FF' the byte. Alternatively, if you can control the encoding header of the server's response, change it to UTF-7 or ASCII.
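As a rough sketch of the first suggestion, assume the server hosts the hex text "FF FF FF F6" instead of the raw bytes. The text could then be turned back into bytes on the ComputerCraft side like this (fromHex is a hypothetical helper, not part of any API):

```lua
-- Hypothetical helper: turn hex text such as "FF FF FF F6" into raw bytes.
local function fromHex(hex)
  hex = hex:gsub("%s+", "") -- drop spaces and newlines
  assert(not hex:find("%X"), "input contains non-hex characters")
  assert(#hex % 2 == 0, "hex string must have an even number of digits")
  -- Replace each pair of hex digits with the byte it represents.
  return (hex:gsub("%x%x", function(pair)
    return string.char(tonumber(pair, 16))
  end))
end

local bytes = fromHex("FF FF FF F6")
print(string.byte(bytes, 1, -1))
-- prints: 255 255 255 246
```

Since hex digits are plain ASCII, they should survive the string conversion unchanged. The cost is that the hosted file is at least twice the size of the original data.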
Edited on 14 December 2013 - 08:22 PM