Null-Hypothesis - 26 days ago
Python Question

Pandas: split a column's value and assign it back to the column

I am working with the following pandas DataFrame, which has a column containing dates as strings. Each date also includes a time.

(Pdb) temp_df_no_na['logged_dt'].head(n=5)
0 01/19/2010 00:00:00.000000
1 03/28/2009 00:00:00.000000
2 09/22/2005 00:00:00.000000
3 12/14/2010 00:00:00.000000
5 02/23/2010 00:00:00.000000


I want to split on the space between the date and the time and keep only the date part.

I wrote the following lambda function and applied it. It worked, but I got a warning, and I am worried that the results might be corrupt. Why would I get a warning like this?

temp_df_no_na['logged_dt'] = temp_df_no_na['logged_dt'].apply(lambda x: x.split(" ")[0] if len(x.split(" ")) > 0 else x)


Here is the warning:

SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
from dataFrameCreator import DataFrameCreator

(Pdb) temp_df_no_na['logged_dt'].head(n=5)
0 01/19/2010
1 03/28/2009
2 09/22/2005
3 12/14/2010
5 02/23/2010

Answer

method 1
get Timestamps with pd.to_datetime (every time is midnight, so pandas displays only the date)

temp_df_no_na.logged_dt = pd.to_datetime(temp_df_no_na.logged_dt)
temp_df_no_na.logged_dt

0   2010-01-19
1   2009-03-28
2   2005-09-22
3   2010-12-14
5   2010-02-23
Name: logged_dt, dtype: datetime64[ns]
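
If the layout of the strings is known, it can be passed to pd.to_datetime explicitly so pandas does not have to infer it. The following is a minimal sketch, assuming every value follows the month/day/year layout shown in the question; the small frame df is a made-up stand-in for temp_df_no_na.

```python
import pandas as pd

# Hypothetical stand-in for temp_df_no_na, using values from the question.
df = pd.DataFrame({
    "logged_dt": [
        "01/19/2010 00:00:00.000000",
        "03/28/2009 00:00:00.000000",
        "09/22/2005 00:00:00.000000",
    ]
})

# Assumed format: month/day/year, then a time with microseconds.
parsed = pd.to_datetime(df["logged_dt"], format="%m/%d/%Y %H:%M:%S.%f")

# normalize() sets the time part to midnight and keeps the datetime dtype.
df["logged_dt"] = parsed.dt.normalize()
print(df["logged_dt"])
```

Keeping the column as datetimes rather than strings means later filtering or sorting by date works without parsing again.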

method 2
format the parsed dates as strings with dt.strftime

temp_df_no_na.logged_dt = pd.to_datetime(temp_df_no_na.logged_dt).dt.strftime('%Y-%m-%d')
temp_df_no_na.logged_dt

0   2010-01-19
1   2009-03-28
2   2005-09-22
3   2010-12-14
5   2010-02-23
Name: logged_dt, dtype: object
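
The format string decides how the resulting strings look. As a small sketch, assuming the goal is to keep the month/day/year layout from the question rather than switch to year-month-day, the same call can be used with a different pattern; the Series s here is a hypothetical sample.

```python
import pandas as pd

# Hypothetical sample using two values from the question.
s = pd.Series(["01/19/2010 00:00:00.000000", "09/22/2005 00:00:00.000000"])

# Parse first, then format back to strings in the original month/day/year layout.
dates_only = pd.to_datetime(s, format="%m/%d/%Y %H:%M:%S.%f").dt.strftime("%m/%d/%Y")
print(dates_only.tolist())
```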

method 3
split on whitespace with str.split and keep the first piece

temp_df_no_na.logged_dt = temp_df_no_na.logged_dt.str.split().str[0]
temp_df_no_na.logged_dt

0   01/19/2010
1   03/28/2009
2   09/22/2005
3   12/14/2010
5   02/23/2010
Name: logged_dt, dtype: object
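
On the warning itself: SettingWithCopyWarning is usually raised when the frame being assigned to was produced by slicing or filtering another frame, so pandas cannot tell whether the assignment is meant to reach the original. The gap in the index (0, 1, 2, 3, 5) suggests temp_df_no_na was built that way, but that is an assumption, since the question does not show how it was created. Under that assumption, a minimal sketch of the usual remedy is to take an explicit copy before assigning. The frame raw_df and the filtering step are hypothetical.

```python
import pandas as pd

# Hypothetical source frame; row 4 has a missing value and gets filtered out.
raw_df = pd.DataFrame({
    "logged_dt": [
        "01/19/2010 00:00:00.000000",
        "03/28/2009 00:00:00.000000",
        "09/22/2005 00:00:00.000000",
        "12/14/2010 00:00:00.000000",
        None,
        "02/23/2010 00:00:00.000000",
    ]
})

# Assumed origin of temp_df_no_na: a filtered subset of another frame.
# .copy() makes it an independent frame, so later assignments are unambiguous.
temp_df_no_na = raw_df[raw_df["logged_dt"].notna()].copy()

# Keep only the date part (str.split, as in method 3), assigning through .loc
# as the warning text suggests.
temp_df_no_na.loc[:, "logged_dt"] = temp_df_no_na["logged_dt"].str.split().str[0]
print(temp_df_no_na["logged_dt"].tolist())
```

In the situation described in the question the values were still written to temp_df_no_na, as the second head() output shows; the warning is about ambiguity over whether the original frame is also affected, not about corrupted results.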