This is a read-only snapshot of the ComputerCraft forums, taken in April 2020.

[Http.get] - finding a time between a <div> </div> tag

Started by Goof, 12 February 2014 - 01:37 PM
Goof #1
Posted 12 February 2014 - 02:37 PM
Hello


I've tried to get the time from a website and print it on my computer. But first of all, the http response contains both tables and functions, which I can't split apart, so I don't know how to find the time between two tags.

EDIT: Look at the new question / problem in the comments

My code as it stands:

TimePage="http://www.mytimewebpage.somethingmaybedotcom?"
while true do
	page=http.get(TimePage)
	print(textutils.serialize(page))
	sleep(0.5)
end


This code produces both the function AND the table error.


How can I solve this weird problem?


Thanks in Advance
Edited on 13 February 2014 - 09:35 AM
CometWolf #2
Posted 12 February 2014 - 02:41 PM
http.get returns a handle, the same as fs.open does. To get the HTTP response as text, use handle.readAll(), or in this case, page.readAll().
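A minimal sketch of that, assuming the in-game http API. The names fetchBody and the optional get parameter are hypothetical, added here only so the function can also be tried outside ComputerCraft with a stand-in for http.get:

```lua
-- Sketch: read the response text from the handle that http.get returns.
-- `get` is a hypothetical parameter; in-game it falls back to http.get.
local function fetchBody(url, get)
  get = get or http.get
  local handle = get(url)
  if not handle then
    -- assumption: a failed request yields nil instead of a handle
    return nil, "request failed"
  end
  local body = handle.readAll() -- the whole response as one string
  handle.close()                -- release the handle when done
  return body
end
```

In-game usage would then be something like print(fetchBody(TimePage)) instead of serializing the handle itself.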
Edited on 12 February 2014 - 01:44 PM
Goof #3
Posted 12 February 2014 - 03:00 PM
Uhh! Thanks! Now I just need to get rid of everything after the time output (in this case, everything after this: 21:05:46).


That (.-) isn't working in this situation…

gsub("</div>(.-)","%1") -- there's a closing div right after the time, and after that closing div a lot of other HTML stuff

Hmm… I keep forgetting all the patterns…

Which pattern should I use?
CometWolf #4
Posted 12 February 2014 - 03:10 PM
Post an example string and I can help. Also, you need to use string.match().
Goof #5
Posted 12 February 2014 - 03:14 PM
Well, everything after the 21:24:56 should be removed:

21:24:56</div><div id=i_timea><a href="/library/abbreviations/timezones/eu/cet.html" title="Central European Time">CET</a></div></div><script type="text/javascript">
et=1392408496;function f0(d){return ld[d.getUTCDay()]+' '+d.getUTCDate()+'.'+' '+lm[d.getUTCMonth()]+' '+d.getUTCFullYear()+', kl. '+p2(d.getUTCHours())+':'+p2(d.getUTCMinutes())+':'+p2(d.getUTCSeconds());}
function f1(d){return ld[d.getUTCDay()]+' '+d.getUTCDate()+'.'+' '+lm[d.getUTCMonth()]+' '+d.getUTCFullYear();}
function f2(d){return p2(d.getUTCHours())+':'+p2(d.getUTCMinutes())+':'+p2(d.getUTCSeconds());}
cks={ctu:{t:[{t:0,o:0,a:'UTC'}],f:f0},i_date:{t:[{o:3600,a:'CET'}],f:f1},i_time:{t:[{o:3600,a:'CET'}],f:f2},ct:{t:[{o:3600,a:'CET'}],f:f0,d:12}};
lm=[];lm[1]='februar';ld=[];ld[3]='onsdag';ld[4]='torsdag';ld[5]='fredag';idpref={i_date:1.2,i_time:5};
</script></td></tr></table></td></tr></table></div><script src="http://a.tadst.com/common/wcommon_17.js" type="text/javascript"></script></body></html>
CometWolf #6
Posted 12 February 2014 - 03:31 PM
Try "^%d+:%d+:%d+"
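A quick sketch of what that pattern does, run against a shortened copy of the sample string above. The variable names are only for illustration:

```lua
-- Shortened copy of the sample string posted above.
local sample = '21:24:56</div><div id=i_timea>CET</div>'

-- "^" anchors the match at the start of the string, so this only works
-- when the string begins with the time.
local time = string.match(sample, "^%d+:%d+:%d+")
print(time) -- 21:24:56

-- With markup in front, the anchored pattern finds nothing...
local prefixed = "<div id=i_time>" .. sample
local anchored = string.match(prefixed, "^%d+:%d+:%d+")
print(anchored) -- nil

-- ...while the unanchored one finds the first hh:mm:ss anywhere.
local anywhere = string.match(prefixed, "%d+:%d+:%d+")
print(anywhere) -- 21:24:56
```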
Goof #7
Posted 12 February 2014 - 03:34 PM
Didn't work :(
Still the same output

my code:

page = http.get(site).readAll()
page:match("^%d+:%d+:%d+",1)
print(page)
Edited on 12 February 2014 - 02:38 PM
wieselkatze #8
Posted 12 February 2014 - 03:39 PM
You could just use
page:match("^[^<]*")

This returns everything from the start of the string up to (but not including) the first '<', which here is the one in '</div>'.
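As a small sketch on a shortened copy of the sample string (the variable names are only for illustration):

```lua
local sample = '21:24:56</div><div id=i_timea>CET</div>'

-- [^<]* means "any run of characters that are not '<'".
local beforeTag = sample:match("^[^<]*")
print(beforeTag) -- 21:24:56

-- Caveat: "*" also accepts zero characters, so if the string starts
-- with a tag the result is an empty string rather than nil.
local empty = ("<div id=i_time>21:24:56</div>"):match("^[^<]*")
print(#empty) -- 0
```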
Goof #9
Posted 12 February 2014 - 03:42 PM
Well, that still doesn't work properly… No changes occurred in the output.
Lyqyd #10
Posted 12 February 2014 - 03:43 PM
Didn't work :(
Still the same output

my code:

page = http.get(site).readAll()
page:match("^%d+:%d+:%d+",1)
print(page)

You threw away the result of the match.


print(string.match(page, "^(%d+:%d+:%d+)"))
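To illustrate the difference, here is a short sketch with a shortened stand-in for the page text. Lua strings are immutable, so match can only hand back a new value:

```lua
-- Shortened stand-in for the text returned by readAll().
local page = "21:24:56</div><div id=i_timea>CET</div>"
local original = page

-- The call runs, but its result is discarded and page is untouched.
page:match("^%d+:%d+:%d+")
print(page == original) -- true

-- Keep the return value in a variable (or pass it straight to print).
local time = page:match("^%d+:%d+:%d+")
print(time) -- 21:24:56
```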
wieselkatze #11
Posted 12 February 2014 - 03:45 PM
Sure, that works. I just tested it.
Also, you have to write

xy = page:match(blahblah)

It returns the value; it doesn't change the actual string.

[EDIT]
Heh, Lyqyd was faster ^^
Edited on 12 February 2014 - 02:45 PM
Goof #12
Posted 12 February 2014 - 03:45 PM
Whooop Whooop!

Yaaay. That worked :D

Thank you :D
xD

Thank you everyone
Goof #13
Posted 13 February 2014 - 10:42 AM
Ehhmm.. Well I have a new problem…
I totally forgot everything about the date, which is returned as:

<div id="i_date">TheDay(Thursday) 13. February 2014</div>

But… this markup comes before the time, so I have to fill two variables, one for each.

My Code:

TimePage="SomeTimewebsite.something"
term.clear()
local date=nil
local time=nil
while true do
	page=http.get(TimePage).readAll()
	term.setCursorPos(1,1)
	-- two separate variables should contain the time and the date
	local time_read=page:gsub('(.-)<div id=i_time>','') -- Removes everything before the time
	local date_read=page:gsub('(.-)<div id=i_date>','') -- Removes everything before the date
	print(date_read) -- prints everything after the date
	print("\n --- --- --- --- --- --- \n")
	time=(string.match(time_read, "^(%d+:%d+:%d+)"))
	date=(string.match(date_read, "^(%w+ %w+ %w+)"))
	print(time)
	print(date)
end

I think I still have to use match, but with a very different pattern? The code above doesn't print anything (I guess the result is nil or false).

Thanks in Advance
Lyqyd #14
Posted 13 February 2014 - 12:24 PM
Please paste a complete sample of the page returned by the site. We can help provide a single pattern that will return both pieces of information individually.
surferpup #15
Posted 13 February 2014 - 12:50 PM
I respect what you are doing here for general reasons (pulling a specific piece of info from a web page); however, for time and date, http://www.timeapi.org will give you exactly what you are after with no mucking about in the wilderness of html tags.

I used it in my Real World Time API (http://www.computercraft.info/forums2/index.php?/topic/16936-real-world-time-api-ver-10/). You are welcome to pull it apart and look at that.
Goof #16
Posted 13 February 2014 - 01:18 PM
Please paste a complete sample of the page returned by the site. We can help provide a single pattern that will return both pieces of information individually.

Okay, here you go:
torsdag = Thursday

Everything before that first </div> is the date.
torsdag 13. februar 2014</div><div id=i_time>19:27:39</div><div id=i_timea><a href="/library/abbreviations/timezones/eu/cet.html" title="Central European Time">CET</a></div></div><script type="text/javascript">
et=1392487859;function f0(d){return ld[d.getUTCDay()]+' '+d.getUTCDate()+'.'+' '+lm[d.getUTCMonth()]+' '+d.getUTCFullYear()+', kl. '+p2(d.getUTCHours())+':'+p2(d.getUTCMinutes())+':'+p2(d.getUTCSeconds());}
function f1(d){return ld[d.getUTCDay()]+' '+d.getUTCDate()+'.'+' '+lm[d.getUTCMonth()]+' '+d.getUTCFullYear();}
function f2(d){return p2(d.getUTCHours())+':'+p2(d.getUTCMinutes())+':'+p2(d.getUTCSeconds());}
cks={ctu:{t:[{t:0,o:0,a:'UTC'}],f:f0},i_date:{t:[{o:3600,a:'CET'}],f:f1},i_time:{t:[{o:3600,a:'CET'}],f:f2},ct:{t:[{o:3600,a:'CET'}],f:f0,d:13}};
lm=[];lm[1]='februar';ld=[];ld[4]='torsdag';ld[5]='fredag';ld[6]='lørdag';idpref={i_date:1.2,i_time:5};
</script></td></tr></table></td></tr></table></div><script src="http://a.tadst.com/common/wcommon_17.js" type="text/javascript"></script></body></html>

I respect what you are doing here for general reasons (pulling a specific piece of info from a web page), however, for time and date, http://www.timeapi.org will give you exactly what you are after with no mucking about in the wilderness of html tags.

I used it in my Real World Time API. You are welcome to pull it apart and look at that.

Hmm… Well yeah, but then I have to ask for time zones etc. manually, or else I have to do some weird IP-to-timezone shizz…

I'll stick with my code for now.

Thanks in Advance
Edited on 13 February 2014 - 12:19 PM
Lyqyd #17
Posted 13 February 2014 - 01:29 PM
That doesn't look like the whole response. I did specify that we needed the whole thing and it looks like you cut off everything before the date text. I don't know of many HTML pages where the first tag is a closing div tag. If what you did provide is accurate (despite the missing bits), this should work:


local date, time = string.match(page, "(%a+ %d+%. %a+ %d+).-(%d+:%d+:%d+)")
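A sketch of that pattern applied to a shortened copy of the sample posted above. The Saturday string and the looser pattern at the end are hypothetical additions, not something taken from the real page:

```lua
-- Shortened copy of the posted sample.
local page = 'torsdag 13. februar 2014</div><div id=i_time>19:27:39</div>'

-- First capture: "<word> <number>. <word> <number>" (the date).
-- ".-" then lazily skips the markup up to the first hh:mm:ss (the time).
local date, time = string.match(page, "(%a+ %d+%. %a+ %d+).-(%d+:%d+:%d+)")
print(date) -- torsdag 13. februar 2014
print(time) -- 19:27:39

-- Hypothetical input: a day name containing a non-ASCII letter.
-- Here the day name is matched as "anything that is not whitespace
-- or an angle bracket" instead of %a+.
local saturday = 'lørdag 15. februar 2014</div><div id=i_time>10:00:00</div>'
local looseDate = string.match(saturday, "([^%s<>]+ %d+%. %a+ %d+)")
print(looseDate)
```

One thing to watch, stated as an assumption rather than something confirmed in this thread: in stock Lua, %a only covers ASCII letters under the default C locale, so a day name such as lørdag (it shows up in the page's script block) may come back truncated with the original pattern. How ComputerCraft's Lua handles those bytes was not tested here. Both values are nil when nothing matches, so it is worth checking them before printing.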
Goof #18
Posted 13 February 2014 - 02:26 PM
Okay, Thanks.

That doesn't look like the whole response. I did specify that we needed the whole thing and it looks like you cut off everything before the date text. I don't know of many HTML pages where the first tag is a closing div tag.
Hmm… I was sure I'd told you that I cut everything before the code I posted… But never mind, it's working and that's awesome!

Thanks
Edited on 13 February 2014 - 01:27 PM