John Chen John Chen - 3 years ago 190
Python Question

How to use findall or search to extract data in python?

Here is my string,

str = 'A:[{type:"mb",id:9,name:"John",url:"/mb9/",cur:0,num:83498},
{type:"mb",id:92,name:"Mary",url:"/mb92/",cur:0,num:404},
{type:"mb",id:97,name:"Dan",url:"/mb97/",cur:0,num:139},
{type:"mb",id:268,name:"Jennifer",url:"/mb268/",cur:0,num:0},
{type:"mb",id:289,name:"Mike",url:"/mb289/",cur:0,num:0}],B:
[{type:"mb",id:157,name:"Sue",url:"/mb157/",cur:0,num:35200},
{type:"mb",id:3,name:"Rob",url:"/mb3/",cur:0,num:103047},
{type:"mb",id:2,name:"Tracy",url:"/mb2/",cur:0,num:87946},
{type:"mb",id:26,name:"Jenny",url:"/mb26/",cur:0,num:74870},
{type:"mb",id:5,name:"Florence",url:"/mb5/",cur:0,num:37261},
{type:"mb",id:127,name:"Peter",url:"/mb127/",cur:0,num:63711},
{type:"mb",id:15,name:"Grace",url:"/mb15/",cur:0,num:63243},
{type:"mb",id:82,name:"Tony",url:"/mb82/",cur:0,num:6471},
{type:"mb",id:236,name:"Lisa",url:"/mb236/",cur:0,num:4883}]'


I want to use findall or search to extract all the data under "name" and "url" from str. Here is what I did,

pattern = re.comile(r'type:(.*),id:(.*),name:(.*),url:(.*),cur:(.*),num:
(.*)')

for (v1, v2, v3, v4, v5, v6) in re.findall(pattern, str):
print v3
print v4


But unfortunately, this doesn't do what I want. Is there anything wrong? Thanks for your inputs.

Answer Source

You shouldn't call you string "str," because that's a built-in function. But here's an option for you:

# Find all of the entries
x = re.findall('(?<![AB]:)(?<=:).*?(?=[,}])', s)

['"mb"', '9', '"John"', '"/mb9/"', '0', '83498', '"mb"', '92', '"Mary"', 
'"/mb92/"', '0', '404', '"mb"', '97', '"Dan"', '"/mb97/"', '0', '139', 
'"mb"', '268', '"Jennifer"', '"/mb268/"', '0', '0', '"mb"', '289', '"Mike"', 
'"/mb289/"', '0', '0', '"mb"', '157', '"Sue"', '"/mb157/"', '0', '35200', 
'"mb"', '3', '"Rob"', '"/mb3/"', '0', '103047', '"mb"', '2', '"Tracy"', 
'"/mb2/"', '0', '87946', '"mb"', '26', '"Jenny"', '"/mb26/"', '0', '74870', 
'"mb"', '5', '"Florence"', '"/mb5/"', '0', '37261', '"mb"', '127', '"Peter"', 
'"/mb127/"', '0', '63711', '"mb"', '15', '"Grace"', '"/mb15/"', '0', '63243', 
'"mb"', '82', '"Tony"', '"/mb82/"', '0', '6471', '"mb"', '236', '"Lisa"', 
'"/mb236/"', '0', '4883']

# Break up into each section
y = []
for i in range(0, len(x), 6):
    y.append(x[i:i+6])

[['"mb"', '9', '"John"', '"/mb9/"', '0', '83498']
['"mb"', '92', '"Mary"', '"/mb92/"', '0', '404']
['"mb"', '97', '"Dan"', '"/mb97/"', '0', '139']
['"mb"', '268', '"Jennifer"', '"/mb268/"', '0', '0']
['"mb"', '289', '"Mike"', '"/mb289/"', '0', '0']
['"mb"', '157', '"Sue"', '"/mb157/"', '0', '35200']
['"mb"', '3', '"Rob"', '"/mb3/"', '0', '103047']
['"mb"', '2', '"Tracy"', '"/mb2/"', '0', '87946']
['"mb"', '26', '"Jenny"', '"/mb26/"', '0', '74870']
['"mb"', '5', '"Florence"', '"/mb5/"', '0', '37261']
['"mb"', '127', '"Peter"', '"/mb127/"', '0', '63711']
['"mb"', '15', '"Grace"', '"/mb15/"', '0', '63243']
['"mb"', '82', '"Tony"', '"/mb82/"', '0', '6471']
['"mb"', '236', '"Lisa"', '"/mb236/"', '0', '4883']]

# Name is 3rd value in each list and url is 4th
for i in y:
    name = i[2]
    url = i[3]
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download