I'm analyzing a publicly available dataset: an assessment of properties in San Francisco for tax purposes (https://data.sfgov.org/Housing-and-Buildings/Historic-Secured-Property-Tax-Rolls/wv5m-vpq2). It can be downloaded as a CSV file, which is saved under the filename 'Historic_Secured_Property_Tax_Rolls.csv' by default.
Using this file, I'm trying to figure out the annual growth rate of the Land Values, excluding zero values. The dataset is so large that I get errors if I try to plot it, so I'm firstly trying to rely on my understanding of how
import numpy as np  # needed for np.polyfit and np.log below
import pandas as pd
# Read in data downloaded from https://data.sfgov.org/api/views/wv5m-vpq2/rows.csv?accessType=DOWNLOAD
df = pd.read_csv('Historic_Secured_Property_Tax_Rolls.csv')
df_nz = df[df['Closed Roll Assessed Land Value'] > 0] # Only consider non-zero Land Values
p = np.polyfit(df_nz['Closed Roll Fiscal Year'], np.log(df_nz['Closed Roll Assessed Land Value']), 1)
In : p
Out: array([ 4.18802559e-02, -7.23804441e+01])
Returns
-------
p : ndarray, shape (M,) or (M, K)
    Polynomial coefficients, highest power first. If `y` was 2-D, the
    coefficients for `k`-th data set are in ``p[:,k]``.
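To make the "highest power first" ordering concrete, here is a small sketch on made-up points that lie exactly on a line. The data are invented for illustration and have nothing to do with the tax roll; the point is only that for a degree-1 fit the first coefficient is the slope and the second is the intercept.

```python
import numpy as np

# Hypothetical points lying exactly on y = 3*x + 2 (not from the dataset)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 3.0 * x + 2.0

# Degree-1 fit: coefficients are returned highest power first,
# so the slope comes before the intercept
coeffs = np.polyfit(x, y, 1)
slope, intercept = coeffs
print("slope:", slope, "intercept:", intercept)
```

Applied to the fit above, this would mean p[0] is the slope of log(land value) against fiscal year and p[1] is the intercept.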
This tells me that the
4.2% is the coefficient on the log term.
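As a sanity check on that reading, here is a minimal sketch with synthetic data. The years, base value, noise level and the 4% rate are all assumptions chosen for illustration, not values from the dataset. If values grow at a constant rate g per year, then log(value) is linear in the year with slope log(1 + g), so a fitted slope s corresponds to an annual growth rate of exp(s) - 1. For a slope of about 0.0419 that works out to roughly 4.3%, close to the slope itself because the slope is small.

```python
import numpy as np

# Synthetic data: every number here is made up for illustration
rng = np.random.default_rng(0)
true_growth = 0.04                      # assumed 4% annual growth
first_year = 2007
years = np.repeat(np.arange(first_year, first_year + 9), 200)
base_value = 100_000.0                  # assumed typical value in the first year
noise = rng.normal(0.0, 0.1, size=years.size)  # multiplicative scatter in log space

values = base_value * (1.0 + true_growth) ** (years - first_year) * np.exp(noise)

# Fit log(value) = slope * year + intercept
slope, intercept = np.polyfit(years, np.log(values), 1)

# Convert the log-linear slope into an annual growth rate
estimated_growth = np.exp(slope) - 1.0
print("fitted slope:", slope)
print("estimated annual growth:", estimated_growth)
```

Note that a fit on the logs estimates the growth of the geometric mean of the values in each year, which need not match the growth of the arithmetic mean or of the total.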
More to come...