DougKruger DougKruger - 24 days ago 17
Python Question

set airflow schedule interval

I have created tasks in Airflow which I scheduled to run hourly, and start_date is set to 2016-11-16:


default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2016, 11, 16),
    'email': ['airflow@airflow.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
    # 'queue': 'bash_queue',
    # 'pool': 'backfill',
    # 'priority_weight': 10,
    # 'end_date': datetime(2016, 1, 1),
}

dag = DAG('test_hourly_job', default_args=default_args, schedule_interval="@hourly")


I kicked off Airflow at the current time, which is 10:00 AM, and I could see Airflow running it from 00:00 AM, then 01:00 AM, and so on:

INFO - Executing command: airflow run test_hourly_job task1 2016-11-16T00:00:00 --local -sd DAGS_FOLDER/test_airflow.py
........
........
INFO - Executing command: airflow run test_hourly_job task1 2016-11-16T01:00:00 --local -sd DAGS_FOLDER/test_airflow.py
.......
.......


How can I configure Airflow to start from, say, the current time and run hourly going forward, instead of starting from 00:00?

Answer

In your question, the default_args dictionary contains the key 'start_date': datetime(2016, 11, 16).

This creates a datetime object from a year, month, and day only. No time is given, so it defaults to 00:00, which is why your job is scheduled starting from 00:00. You can check this in Python:

from datetime import datetime

datetime(2016, 11, 16)

# The datetime object is created with the time set to 00:00:
# datetime.datetime(2016, 11, 16, 0, 0)

# If you need the current date and time as the start of the process, you can set the value as:
'start_date': datetime.now()
# If you want only the current time combined with the given date, you can use the following:

from datetime import datetime, timedelta

from airflow import DAG

current_date = datetime.now()
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    # Same date as before, but with the current hour and minute instead of 00:00.
    'start_date': datetime(2016, 11, 16, current_date.hour, current_date.minute),
    'email': ['airflow@airflow.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
    # 'queue': 'bash_queue',
    # 'pool': 'backfill',
    # 'priority_weight': 10,
    # 'end_date': datetime(2016, 1, 1),
}
dag = DAG('test_hourly_job', default_args=default_args, schedule_interval="@hourly")
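
To see why a midnight start_date leads to runs for 00:00, 01:00 and so on when Airflow is started at 10:00 AM, here is a rough sketch in plain Python. It is only a simplified model of hourly scheduling, not Airflow's own scheduler code. The helper names floor_to_hour and hourly_slots are made up for this illustration, and the 10:00 AM clock value is taken from the question.

```python
from datetime import datetime, timedelta


def floor_to_hour(moment):
    """Drop minutes, seconds and microseconds so the value sits on an hour boundary."""
    return moment.replace(minute=0, second=0, microsecond=0)


def hourly_slots(start, now):
    """List the hourly slots from start up to, but not including, now."""
    slots = []
    current = start
    while current < now:
        slots.append(current)
        current += timedelta(hours=1)
    return slots


# Illustrative clock reading matching the question: 10:00 AM on 2016-11-16.
now = datetime(2016, 11, 16, 10, 0)

# start_date at midnight: the slots 00:00 through 09:00 are already in the past.
midnight_slots = hourly_slots(datetime(2016, 11, 16), now)

# start_date floored to the current hour: no past slots remain to catch up on.
current_hour_slots = hourly_slots(floor_to_hour(now), now)

print(len(midnight_slots), len(current_hour_slots))
```

Under this model, the midnight start leaves ten past slots to work through, while a start floored to the current hour leaves none. Flooring to the hour also keeps the start aligned with the "@hourly" boundaries.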