Le Truong Sinh - 4 months ago
HTML Question

How to get text from an HTML table using Selenium Python

I have part of an HTML file, shown below:

<div><pre> <b>Home:</b> 28-12 <b>Road:</b> 23-16 <b>ExtrInn:</b> 2-5
<b>vsRHP:</b> 38-18 <b>vsLHP:</b> 13-10 <b>1-Run:</b> 17-5
<b>vsEast:</b> 12-8 <b>vsCntrl:</b> 7-5 <b>vsWest:</b> 26-13 <b>IL:</b> 6-2

<strong>Last 10 Games</strong>
Gm# Date &amp; Box Opp W/L Score Record Place/GB
79 <A CLASS=CL HREF="/boxes/NYA/NYA201606290.shtml">Wed, Jun 29</a> @<A CLASS=CL HREF="/teams/NYY/2016_sched.shtml">NYY</A> L 7-9 51-28 1st 9.0 up
78 <A CLASS=CL HREF="/boxes/NYA/NYA201606280.shtml">Tue, Jun 28</a> @<A CLASS=CL HREF="/teams/NYY/2016_sched.shtml">NYY</A> W 7-1 51-27 1st 10.0 up
77 <A CLASS=CL HREF="/boxes/NYA/NYA201606270.shtml">Mon, Jun 27</a> @<A CLASS=CL HREF="/teams/NYY/2016_sched.shtml">NYY</A> W 9-6 50-27 1st 10.0 up
76 <A CLASS=CL HREF="/boxes/TEX/TEX201606260.shtml">Sun, Jun 26</a> <A CLASS=CL HREF="/teams/BOS/2016_sched.shtml">BOS</A> W 6-2 49-27 1st 10.0 up
75 <A CLASS=CL HREF="/boxes/TEX/TEX201606250.shtml">Sat, Jun 25</a> <A CLASS=CL HREF="/teams/BOS/2016_sched.shtml">BOS</A> W 10-3 48-27 1st 9.0 up
74 <A CLASS=CL HREF="/boxes/TEX/TEX201606240.shtml">Fri, Jun 24</a> <A CLASS=CL HREF="/teams/BOS/2016_sched.shtml">BOS</A> L 7-8 47-27 1st 9.0 up
73 <A CLASS=CL HREF="/boxes/TEX/TEX201606220.shtml">Wed, Jun 22</a> <A CLASS=CL HREF="/teams/CIN/2016_sched.shtml">CIN</A> W 6-4 47-26 1st 10.0 up
72 <A CLASS=CL HREF="/boxes/TEX/TEX201606210.shtml">Tue, Jun 21</a> <A CLASS=CL HREF="/teams/CIN/2016_sched.shtml">CIN</A> L 2-8 46-26 1st 9.5 up
71 <A CLASS=CL HREF="/boxes/TEX/TEX201606200.shtml">Mon, Jun 20</a> <A CLASS=CL HREF="/teams/BAL/2016_sched.shtml">BAL</A> W 4-3 46-25 1st 9.5 up
70 <A CLASS=CL HREF="/boxes/SLN/SLN201606190.shtml">Sun, Jun 19</a> @<A CLASS=CL HREF="/teams/STL/2016_sched.shtml">STL</A> W 5-4 45-25 1st 8.5 up
<b>Last 10:</b> 7-3 <b>Last 20:</b>15-5 <b>Last 30:</b>23-7
</pre></div>


Does anyone know how to get the values for Last 10, Last 20 and Last 30 using Selenium Python?

The results should be 7-3, 15-5 and 23-7.

Answer

That HTML is... something. The text you want isn't wrapped in a tag of its own: each value is bare text sitting after a <b> label inside the <pre>. You are going to have to grab all the text inside the outer DIV and pull out what you want. You can use a regex or just parse the string. The code below should be close.

import re
from selenium.webdriver.common.by import By

alltext = driver.find_element(By.TAG_NAME, "div").text  # locator needs to be more specific
results = re.findall(r'(Last \d{2}:\s*\d+-\d+)', alltext)
print(results)

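The regex is looking for "Last " + 2 digits + ":" + 0 or more whitespace characters + 1 or more digits + "-" + 1 or more digits. findall() returns every match of that regex in the string, so it should return all three. Each match includes its label (for example "Last 10: 7-3"), and the "Last 10 Games" heading is skipped because it has no colon after the digits.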
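If only the win-loss values are needed, one option is to split the pattern into two capture groups so the label and the record come back separately. The sketch below is not part of the original answer: `sample_text` is a hand-written stand-in for what the DIV's `.text` might contain, and `extract_last_n_records` is a hypothetical helper name. It runs without a browser so the regex can be checked on its own.

```python
import re

# Assumption: a stand-in for the text Selenium would return from the outer DIV.
sample_text = (
    "Last 10 Games\n"
    "70 Sun, Jun 19 @STL W 5-4 45-25 1st 8.5 up\n"
    "Last 10: 7-3 Last 20:15-5 Last 30:23-7\n"
)


def extract_last_n_records(text):
    """Map each 'Last N' label to its win-loss string (hypothetical helper)."""
    # Two capture groups make findall() return (label, record) tuples.
    pairs = re.findall(r"(Last \d{2}):\s*(\d+-\d+)", text)
    return dict(pairs)


records = extract_last_n_records(sample_text)
print(records)
```

With a live page, the same helper could be given the `.text` of the located element instead of `sample_text`.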

Python regex info
