2217 posts
Location
3232235883
Posted 20 November 2012 - 01:23 PM
normally the http api works fine
but
http.request("http://www.google.com/search?q=anything")
always fails instantly. Anyone know why? :s
EDIT:
It's because Google rejects the user agent that the http API sends when connecting,
and that cannot be fixed from Lua
(unless you use a custom proxy).
1688 posts
Location
'MURICA
Posted 20 November 2012 - 01:39 PM
Try https? (if cc can even use https, that is)
2217 posts
Location
3232235883
Posted 20 November 2012 - 01:42 PM
Nope, it complains that it's not http.
The problem could be that it takes a couple of seconds for Google to send results, and the http API queues http_failure before the response arrives.
8543 posts
Posted 20 November 2012 - 02:00 PM
Are you using http.request correctly? You know it throws an event with the results rather than returning them, correct?
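For anyone reading along, here is a minimal sketch of the event-based pattern being described. `waitForHttp` is a hypothetical helper name, not part of the http API, and the event source is passed in as a parameter so the loop can be tried outside ComputerCraft (in-game you would pass `os.pullEvent`). It assumes the result events carry the requested URL as their second value and, on success, a handle with `readAll` and `close`, as in the code posted later in this thread.

```lua
-- Hypothetical helper: wait for the result of an earlier http.request(url).
-- pullEvent is the event source (os.pullEvent inside ComputerCraft).
local function waitForHttp(url, pullEvent)
  while true do
    local event, eventUrl, handle = pullEvent()
    if eventUrl == url then
      if event == "http_success" then
        local body = handle.readAll()
        handle.close()
        return body
      elseif event == "http_failure" then
        return nil, "http_failure"
      end
    end
    -- any other event is ignored and the loop keeps waiting
  end
end
```

In-game usage would look roughly like `http.request(url)` followed by `local body, err = waitForHttp(url, os.pullEvent)`. Returning `nil` plus a message instead of calling `error` is only a design choice for this sketch, so the caller can decide what to do on failure.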
2217 posts
Location
3232235883
Posted 20 November 2012 - 02:19 PM
Yes, I know.
8543 posts
Posted 20 November 2012 - 02:48 PM
We'll need to see the code, then.
2217 posts
Location
3232235883
Posted 20 November 2012 - 04:08 PM
http.request("http://www.google.com/search?q=define+greece")
local response
-- http.request is asynchronous: wait for the result event
while true do
  local event, url, sourceText = os.pullEvent()
  if event == "http_success" then
    response = sourceText.readAll()
    sourceText.close()
    break
  elseif event == "http_failure" then
    error("http_failure")
  end
end
-- plain find (4th argument true): the marker contains "-", which is a Lua pattern character
local marker = '<td valign="top" style="padding-bottom:5px;padding-top:5px"><table class="ts"><tr><td>'
local t, a = string.find(response, marker, 1, true)
if not t then
  print("Definition not found.")
else
  local b = string.find(response, '</td>', a, true)
  print(string.sub(response, a + 1, b - 1))
end
It errors with http_failure,
but when you change line 1 to
http.request("http://www.google.com/")
it prints "Definition not found." instead of erroring.
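Separately from the http_failure, one pitfall worth checking when searching HTML with `string.find`: a marker such as `padding-bottom` contains `-`, which Lua treats as a pattern quantifier unless the search is made plain. The sketch below is only an illustration; `extractBetween` is a hypothetical helper and the HTML string is made up, not Google's real markup.

```lua
-- Hypothetical helper: return the text between two literal markers, or nil.
local function extractBetween(text, startMarker, endMarker)
  -- the 4th argument (true) turns off pattern matching, so "-" is literal
  local _, startEnd = string.find(text, startMarker, 1, true)
  if not startEnd then return nil end
  local endStart = string.find(text, endMarker, startEnd + 1, true)
  if not endStart then return nil end
  return string.sub(text, startEnd + 1, endStart - 1)
end

-- Made-up sample input for illustration only
local html = '<td style="padding-bottom:5px">Greece: a country</td>'

-- As a pattern, "g-" means "as few g as possible", so the literal hyphen never matches:
print(string.find(html, 'padding-bottom'))                     -- nil
-- As plain text the marker is found:
print(extractBetween(html, 'padding-bottom:5px">', '</td>'))   -- Greece: a country
```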
180 posts
Posted 20 November 2012 - 06:14 PM
Does anyone happen to know what HTTP_USER_AGENT is sent by the HTTP API?
If it contains Java or Lua, it is quite possible google is blocking search requests from it. They do that for a lot of scripting languages.
I've run into that problem with both Java and TCL when using the default user agent. I had to change the user agent to match Firefox (which unfortunately is against the Google terms of service) to get a non-error result back.
715 posts
Posted 21 November 2012 - 12:39 AM
It sends a Java User-Agent string.
You can find out the exact string by using the HTTP API on, e.g.
http://whatsmyuseragent.com/ and then looking for "Your User Agent" in the response body.
2217 posts
Location
3232235883
Posted 21 November 2012 - 05:07 AM
The user agent is "java/1.7.0_07",
found with a simple rewrite of the program I posted above:
http.request("http://whatsmyuseragent.com/")
local response
-- wait for the asynchronous result event
while true do
  local event, url, sourceText = os.pullEvent()
  if event == "http_success" then
    response = sourceText.readAll()
    sourceText.close()
    break
  elseif event == "http_failure" then
    error("http_failure")
  end
end
-- plain find (4th argument true) treats the markers as literal text
local t, a = string.find(response, '<strong>Your User Agent:</strong>', 1, true)
if not t then
  print("Tag not found.")
else
  -- search for the closing marker after the end of the label
  local b = string.find(response, '<br /><br />', a, true)
  print(string.sub(response, a + 1, b - 1))
end
180 posts
Posted 21 November 2012 - 12:30 PM
Sorry it's taken so long for me to verify this, but Google search does indeed block the Java user agent.
I've confirmed this using wget and the -U option (to specify a user agent)
This is with a Firefox user agent:
$ wget -U "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20100101 Firefox/16.0" http://www.google.com/search?q=kittens
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `search?q=kittens'
This is the exact same query made 5 seconds later with the Java agent:
$ wget -U "Java/1.7.0_07" http://www.google.com/search?q=kittens
HTTP request sent, awaiting response... 403 Forbidden
2012-11-20 18:25:14 ERROR 403: Forbidden.
Unfortunately there doesn't appear to be any way to "fix" this directly in Lua, short of using a different search engine.
This might help:
http://en.wikipedia.org/wiki/List_of_search_engines
2217 posts
Location
3232235883
Posted 21 November 2012 - 04:07 PM
None of those seem to work :s
Many of them don't define things,
and some (like Bing) aren't accurate.
Bah, lemme see if a proxy works >_>
180 posts
Posted 21 November 2012 - 04:44 PM
Do you know PHP or Perl, and have a web-server you can put scripts on?
If so you could code a little proxy, where the CGI queries Google with an IE or Firefox user agent. Then your Lua program will hit your cgi/php…
The only other way I know of to do this (a huge pita) is to sign up for their developer API.
With a developer API, they give you an API key, and you include that as part of the search query to a special URL.
They give you 100 queries per day for free, and then you have to pay to enable more queries ($5 per 1000 extra queries)
https://developers.google.com/custom-search/v1/overview
After all that mess, you can use a URL like this:
https://www.googleapis.com/customsearch/v1?key={INSERT-YOUR-KEY}&cx=017576662512468239146:omuauf_lfve&q={SEARCH-TERM}
They also have a JSON interface, if you wanted to go that route.
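A rough sketch of how the URL above could be assembled in Lua, with the search term percent-encoded so spaces and punctuation survive. `urlEncode` and `buildSearchUrl` are hypothetical helper names, and "YOUR-KEY" is a placeholder. Note that the endpoint is https, and earlier in this thread the http API was reported to complain about non-http URLs, so whether this is reachable from in-game is untested.

```lua
-- Percent-encode every byte except letters, digits, "-", ".", "_" and "~".
local function urlEncode(s)
  return (string.gsub(s, "[^%w%-%._~]", function(c)
    return string.format("%%%02X", string.byte(c))
  end))
end

-- Hypothetical helper: build a Custom Search request URL from its three parts.
local function buildSearchUrl(key, cx, term)
  return "https://www.googleapis.com/customsearch/v1"
    .. "?key=" .. urlEncode(key)
    .. "&cx=" .. urlEncode(cx)
    .. "&q=" .. urlEncode(term)
end

-- "YOUR-KEY" is a placeholder; the cx value is the one quoted in the post above.
print(buildSearchUrl("YOUR-KEY", "017576662512468239146:omuauf_lfve", "define greece"))
```

The colon in the cx value comes out as `%3A`, which is the encoded form of the same character in a query string.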
2217 posts
Location
3232235883
Posted 21 November 2012 - 05:14 PM
Nah, I'd rather learn PHP and host it on my XAMPP server :3
715 posts
Posted 21 November 2012 - 06:20 PM
I've just posted a suggestion (topic 6259) about including Java-side setRequestProperties on the connection.
This would not only solve the User-Agent problem, but also allow one to set any header values, which enables a lot of other functions, e.g. sending cookies.