Vladislav Ladenkov Vladislav Ladenkov - 8 months ago 46
Python Question

Extracting date from numbers

I have numbers

array([40472, 40652, 40472, ..., 40282, 40412, 40044])

which are supposed to be DD.MM.YYYY. When this data was in excel( i mean LibreOffice), i tried to reformat it from numeric to data, and, actually first 10% of date's reformatted adequetly. But others were reformatted to 22-23's centuries! Thats not good, because the dates are supposed to be in range of smth like 2005-2016.

Now I want to try
, but it thinks, that timestamps have time also(not only date):
Output:datetime.datetime(1970, 1, 1, 11, 14, 32)

Are there any ways to make it decode only data? Thanks!


Excel converts dates to numbers by using serial numbers. See, for example, comments under the DATEVALUE function:

Excel stores dates as sequential serial numbers so that they can be used in calculations. By default, January 1, 1900 is serial number 1, and January 1, 2008 is serial number 39448 because it is 39,447 days after January 1, 1900.

You can thus find the proper date by adding that number of days. I am subtracting because it is not zero-indexed:

>>> from datetime import datetime, timedelta
>>> start = datetime(1899, 12, 30)
>>> start + timedelta(42677)
datetime.datetime(2016, 11, 3, 0, 0)

You could make a function as follows:

def excel_serial_to_datetime(n):
    start = datetime(1899, 12, 30)
    return start + timedelta(n)

NOTE: this will not work for decimal numbers, which will occur if you are more specific than the day.