What is the performance of load_yml_dags when loading an entire DAG folder? #626

@raisinbl

As I understand it, load_yml_dags accepts a DAG folder as an argument, loops through every .yaml file in it, and parses each one to generate a DAG.
A core concept of Airflow is that the Scheduler re-parses the entire DAG folder at a fixed interval, which means Airflow executes every .py file each time.
My concern is that on every such interval Airflow will rerun the code and call load_yml_dags to parse the entire DAG config again, unlike a pure .py DAG file, which has been compiled and cached.
Have you tested the parsing time of that function on a large DAG folder, e.g. 1000+ files?
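
To make the question concrete, here is a rough sketch of how the per-pass cost could be measured. It is only a timing harness under my own assumptions: the helper names `write_sample_configs` and `time_folder_parse` are hypothetical, the YAML content is a placeholder and not necessarily a valid config for this library, and `len` stands in for the real parser (swap in an actual YAML loader or the library's loader to get meaningful numbers).

```python
import tempfile
import time
from pathlib import Path


def write_sample_configs(folder: Path, count: int) -> None:
    """Write `count` small placeholder .yaml files (hypothetical content)."""
    for i in range(count):
        text = (
            f"example_dag_{i}:\n"
            "  description: placeholder config\n"
            "  tasks:\n"
            "    task_1:\n"
            "      command: echo 1\n"
        )
        (folder / f"dag_{i}.yaml").write_text(text)


def time_folder_parse(folder: Path, parse_fn) -> tuple[int, float]:
    """Apply parse_fn to the text of every .yaml file; return (file count, seconds)."""
    start = time.perf_counter()
    count = 0
    for path in sorted(folder.glob("*.yaml")):
        parse_fn(path.read_text())
        count += 1
    return count, time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    write_sample_configs(folder, 1000)
    # `len` is a stand-in parser, so this measures file I/O only.
    n_files, elapsed = time_folder_parse(folder, len)
    print(f"parsed {n_files} files in {elapsed:.3f}s")
```

Running this once with the stand-in and once with the real loader would separate the file I/O cost from the YAML parsing and DAG generation cost, which is the part I am worried about being repeated on every Scheduler pass.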
