Eduard Florinescu - 7 months ago
Python Question

Python: How can I use ggplot with a simple 2 column array?

I am trying to use ggplot for Python. I have the following data:

power_data = [[ 4.13877565e+04, 2.34652000e-01],
[ 4.13877565e+04, 2.36125000e-01],
[ 4.13877565e+04, 2.34772000e-01],
...
[ 4.13882896e+04, 2.29006000e-01],
[ 4.13882896e+04, 2.29019000e-01],
[ 4.13882896e+04, 2.28404000e-01]]


And I want to represent it in ggplot with this:

print ggplot(aes(x='TIME', y='Watts'), data=power_data) + \
geom_point(color='lightblue') + \
geom_line(alpha=0.25) + \
stat_smooth(span=.05, color='black') + \
ggtitle("Power consumption over 13 hours") + \
xlab("Time") + \
ylab("Watts")


but get the error:

File "C:\PYTHON27\lib\site-packages\ggplot\ggplot.py", line 59, in __init__
for ae, name in self.aesthetics.iteritems():
AttributeError: 'list' object has no attribute 'iteritems'
>>>


I don't know what the line aes(x='TIME', y='Watts') should be doing.

How can I format the power_data list so I can use it with ggplot? I want the first column represented on a time x axis and the second column on a power y axis.

If I try the meat example, it doesn't show anything; it only prints:

>>> print (ggplot(aes(x='date', y='beef'), data=meat) + \
... geom_line())
<ggplot: (20096197)>
>>>


What else should I do to actually display the plot?

Answer

There were 3 important steps that I missed:

1) First, the data needs to be in a format with named columns, like this:

[{'TIME': 41387.756495162001, 'Watts': 0.234652},
 {'TIME': 41387.756500821, 'Watts': 0.236125},
 {'TIME': 41387.756506480997, 'Watts': 0.23477200000000001},
 {'TIME': 41387.756512141001, 'Watts': 0.23453099999999999},
...
 {'TIME': 41387.756574386003, 'Watts': 0.23558699999999999},
 {'TIME': 41387.756580046, 'Watts': 0.23508899999999999},
 {'TIME': 41387.756585706004, 'Watts': 0.235041},
 {'TIME': 41387.756591365003, 'Watts': 0.23541200000000001},
 {'TIME': 41387.756597013002, 'Watts': 0.23461699999999999},
 {'TIME': 41387.756602672998, 'Watts': 0.23483899999999999}]

2) Then the data needs to be wrapped in a pandas DataFrame:

powd = DataFrame(data2)

3) Without the plt.show(1) call (with plt being matplotlib.pyplot), the plot will not be displayed.

Here is the code to solve the above:

from ggplot import *
from pandas import DataFrame
import matplotlib.pyplot as plt

# Build one dict per row so that each column gets a name
data2 = []
for time, watts in power_data:
    data2.append({'TIME': time, 'Watts': watts})

powd = DataFrame(data2)
print(powd)

# The loop above can be replaced with this single line
# (see the suggestion in the comments):
# powd = DataFrame(power_data, columns=['TIME', 'Watts'])

print(ggplot(aes(x='TIME', y='Watts'), data=powd) +
    geom_point(color='lightblue') +
    geom_line(alpha=0.25) +
    stat_smooth(span=.05, color='black') +
    ggtitle("Power consumption over 13 hours") +
    xlab("Time") +
    ylab("Watts"))

# Step 3: without this call the plot is not displayed
plt.show(1)
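
Step 1 can also be tried on its own, without ggplot or pandas installed. The sketch below turns a two-column list into the list-of-dicts format shown above. The helper name rows_to_records is only an illustration and is not part of ggplot or pandas; the sample rows are the first three from the question.

```python
# Hypothetical helper (not a ggplot or pandas function):
# give each column of a list of rows a name.
def rows_to_records(rows, columns=("TIME", "Watts")):
    """Return one dict per row, keyed by the given column names."""
    return [dict(zip(columns, row)) for row in rows]


# First three rows of power_data from the question
power_data = [
    [4.13877565e+04, 2.34652000e-01],
    [4.13877565e+04, 2.36125000e-01],
    [4.13877565e+04, 2.34772000e-01],
]

records = rows_to_records(power_data)
print(records[0])
```

A list in this shape can then be passed to DataFrame(...) as in step 2, so that aes(x='TIME', y='Watts') has column names to refer to.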