mvd mvd - 1 month ago 14
Python Question

log2 axis doesn't work for histograms in matplotlib/seaborn

When plotting a histogram using matplotlib/seaborn, I would like to change the x-axis to a log2 scale. The data plotted are here.

When I take the log2 of the values and make a histogram, it works, but if I plot the unlogged values and change the x-axis using set_xscale, it gives the wrong result. The code is:

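import numpy as np
import pandas
import seaborn as sns
import matplotlib.pyplot as plt

df = pandas.read_table("data.csv", sep="\t")
plt.figure()
sns.set_style("ticks")
# top panel: raw values, x-axis switched to log2
ax1 = plt.subplot(2, 1, 1)
plt.hist(df["y"])
# base=2 on Matplotlib >= 3.3; older releases spelled this basex=2
ax1.set_xscale("log", base=2)
# bottom panel: log2-transformed values on a linear axis
ax2 = plt.subplot(2, 1, 2)
plt.hist(np.log2(df["y"]))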


The plot:

[figure: top panel, histogram of the raw values with a log2 x-axis; bottom panel, histogram of the log2 values]

Is this a bug, or did I change the axes incorrectly?

Answer

It is neither! See what happens if you increase the number of bins:

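# same figure as in the question, now with 300 bins in each panel
plt.hist(df["y"], bins=300)
# base=2 on Matplotlib >= 3.3; older releases spelled this basex=2
ax1.set_xscale("log", base=2)
ax2 = plt.subplot(2, 1, 2)
plt.hist(np.log2(df["y"]), bins=300)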

[figure: logbins]

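The data in both histograms are the same, but in the top panel the bins are still equally wide on the linear scale. set_xscale only changes how the axis is drawn; it does not recompute the bins, so on a log2 axis they look wide at the low end and increasingly narrow at the high end.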
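
The effect can be checked without plotting. The following is only a sketch: the log-normal sample is a made-up stand-in for df["y"], and the bin count of 10 is arbitrary. It shows that the default bin edges are evenly spaced on the linear scale, so their widths in log2 space shrink from left to right.

```python
import numpy as np

# Synthetic stand-in for df["y"]: positive, heavy-tailed values (assumption).
rng = np.random.default_rng(0)
y = rng.lognormal(mean=3.0, sigma=1.5, size=1000)

# With an integer bin count, bins are equal-width between min and max
# on the linear scale (plt.hist delegates this to np.histogram).
_, edges = np.histogram(y, bins=10)

linear_widths = np.diff(edges)         # all the same
log2_widths = np.diff(np.log2(edges))  # get smaller towards the right

print("equal width on the linear scale:", np.allclose(linear_widths, linear_widths[0]))
print("widths in log2 space:", np.round(log2_widths, 3))
```

A log2 axis draws each bin with its log2 width, which is why the top panel looks so different from the bottom one.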

How to make the two cases identical? Pass custom bin edges, evenly spaced in log space, to plt.hist:

plt.figure()
sns.set_style("ticks")
ax1 = plt.subplot(2, 1, 1)
# bin edges evenly spaced in log2 space, from min to max of the data
logbins = np.logspace(np.log2(df["y"].min()),
                      np.log2(df["y"].max()),
                      300, base=2)
plt.hist(df["y"], bins=logbins)
# base=2 on Matplotlib >= 3.3; older releases spelled this basex=2
ax1.set_xscale("log", base=2)
ax2 = plt.subplot(2, 1, 2)
plt.hist(np.log2(df["y"]), bins=300)

[figure: fixed]

There are still some minor differences between the two plots, but I believe they are not related to your original issue.
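
One likely contributor to the remaining differences (an assumption, not checked against the original data): np.logspace(..., 300) returns 300 edges, which makes 299 bins, while bins=300 in the bottom panel makes 300. The sketch below uses synthetic data in place of df["y"] and compares the bin counts directly when both panels get the same number of bins; n_bins and the sample are illustrative choices.

```python
import numpy as np

# Synthetic stand-in for df["y"] (assumption).
rng = np.random.default_rng(0)
y = rng.lognormal(mean=3.0, sigma=1.5, size=1000)

n_bins = 300
# n_bins bins need n_bins + 1 edges.
log_edges = np.logspace(np.log2(y.min()), np.log2(y.max()), n_bins + 1, base=2)
# Pin the outer edges so rounding in 2**log2(x) cannot push the
# smallest or largest value just outside the binned range.
log_edges[0], log_edges[-1] = y.min(), y.max()

counts_raw, _ = np.histogram(y, bins=log_edges)        # top panel
counts_log, _ = np.histogram(np.log2(y), bins=n_bins)  # bottom panel

print("bins:", len(counts_raw), len(counts_log))
print("total absolute difference in counts:", np.abs(counts_raw - counts_log).sum())
```

With matching bin counts the two sets of counts should agree, apart from possible floating-point rounding for values that sit almost exactly on a bin edge.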