Petr Petrov - 1 month ago
Python Question

Python: error when writing a class

I want to write a function and add it to a class as a method.
This is the code I use:

import pandas as pd
import tldextract

domain = []
df = pd.DataFrame()
df['urls'] = ['ru.vk.com', 'eng.facebook.com', 'ru.ya.ru']
urls = df.urls.values.tolist()

class csv:
    def get_domain(self, list_url, list, df):
        self.list_url = list_url
        self.list = list
        self.df = df
        for i, url in enumerate(list_url):
            get_domain = tldextract.extract(url)
            subdomain = get_domain[0] + '.' + get_domain[1] + '.' + get_domain[2]
            if subdomain.startswith('.'):
                subdomain = subdomain[1:]
            elif subdomain.endswith('.'):
                subdomain = subdomain[:-1]
            elif subdomain.startswith('www.'):
                subdomain = subdomain[4:]
            list.append(subdomain)
        df['subdomain'] = list

df = csv()
df.get_domain(urls, domain, df)


I am trying to extract the domain from each URL, but I get this error:

AttributeError: csv instance has no attribute '__setitem__'


What should I change?

Answer

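You gave the csv instance the same name, df, as the pandas DataFrame. After df = csv(), the name df refers to the csv object and no longer to the DataFrame. So when you call df.get_domain(urls, domain, df), the third argument is the csv instance itself, and the line df['subdomain'] = list inside the method tries item assignment on it. Item assignment needs a __setitem__ method, which your csv class does not define, hence the AttributeError.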
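The rebinding can be reproduced without pandas or tldextract. The following is only a minimal sketch: UrlTable is a hypothetical stand-in for the csv class, and a plain dict stands in for the DataFrame. On Python 3 the same mistake is reported as a TypeError ("object does not support item assignment"); the AttributeError wording in the question comes from a Python 2 old-style class.

```python
class UrlTable:  # hypothetical stand-in for the question's csv class
    def get_domain(self, df):
        # Item assignment calls df.__setitem__, which UrlTable does not define.
        df['subdomain'] = ['vk.com']


df = {'urls': ['ru.vk.com']}  # a plain dict stands in for the DataFrame
df = UrlTable()               # rebinding: df no longer refers to the dict

try:
    df.get_domain(df)         # passes the UrlTable instance to itself
    error_name = None
except (TypeError, AttributeError) as exc:
    error_name = type(exc).__name__
    print(error_name, exc)
```

Using a different name for the instance keeps df pointing at the original table, and the assignment then succeeds.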


In short, use a different variable name in the last two lines, for example:

csv_df = csv()
csv_df.get_domain(urls, domain, df)

By the way, this is not what causes the error, but I'm fairly sure the last line of get_domain was supposed to be

self.df['subdomain'] = self.list

(The same goes for the other variables in the method: if you want to work with the attributes stored on the instance, use self.list_url, self.list and self.df rather than the parameters that were passed in.)

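Also, you shouldn't name a variable list. It is not a reserved keyword, but it is a built-in name, and reusing it hides the built-in type inside that scope, which can cause confusing errors. For the same reason, a class called csv shadows the standard library csv module if you ever import it in the same file.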
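To illustrate the kind of issue shadowing can cause, here is a small sketch. The function names collect_hosts and collect_hosts_fixed are made up for this example and are not part of the question's code.

```python
def collect_hosts(urls, list):
    # The parameter named "list" hides the built-in list type in this scope.
    for url in urls:
        list.append(url)
    # This looks like a call to the built-in, but "list" is now a list object.
    return list(set(list))


try:
    collect_hosts(['vk.com', 'ya.ru'], [])
    outcome = 'no error'
except TypeError as exc:
    outcome = str(exc)
print(outcome)


def collect_hosts_fixed(urls, hosts):
    # With a different parameter name, the built-in list stays available.
    hosts.extend(urls)
    return list(hosts)
```

Renaming the parameter (here to hosts) is enough to make the built-in usable again.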