Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

setting parameters in SQLExecuteQueryOperator fails (should be the same as using params) #42344

Closed
1 of 2 tasks
schweizersta opened this issue Sep 19, 2024 · 5 comments · Fixed by #42564
Closed
1 of 2 tasks
Labels
area:core-operators Operators, Sensors and hooks within Core Airflow area:providers kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet pending-response provider:postgres

Comments

@schweizersta
Copy link

schweizersta commented Sep 19, 2024

Apache Airflow Provider(s)

standard

Versions of Apache Airflow Providers

apache-airflow-providers-postgres==5.12.0

Apache Airflow version

2.10.1

Operating System

RHEL 9.4

Deployment

Virtualenv installation

Deployment details

Just default airflow with apache-airflow-providers-postgres

What happened

When using "parameters" to provide params to SQLExecuteQueryOperator, they are not available in the template. An exception is thrown: "jinja2.exceptions.UndefinedError: 'dict object' has no attribute '....'"

Only when given as "params" it works.

What you think should happen instead

Both "parameters" and "params" should be usable, but currently only "params" works.

From the docu:

SQLExecuteQueryOperator provides parameters attribute which makes it possible to dynamically inject values into your SQL requests during runtime. The BaseOperator class has the params attribute which is available to the SQLExecuteQueryOperator by virtue of inheritance. Both parameters and params make it possible to dynamically pass in parameters in many interesting ways.

How to reproduce

Define a task using SQLExecuteQueryOperator and provide params using "parameters" like here:

    sql_task = SQLExecuteQueryOperator(
        task_id="task_id",
        conn_id="conn_id",
        sql="file.sql",
        parameters={'param_01':param_01_value},
    )

And the file.sql may look like that:

select * from table where column = '{{ params.param_01 }}'

Once running the DAG an error is thrown. See stacktrace below:

Traceback (most recent call last):
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/models/abstractoperator.py", line 770, in _do_render_template_fields
    rendered_content = self.render_template(
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/template/templater.py", line 171, in render_template
    return self._render(template, context)
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/models/abstractoperator.py", line 725, in _render
    return super()._render(template, context, dag=dag)
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/template/templater.py", line 127, in _render
    return render_template_to_string(template, context)
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/utils/helpers.py", line 301, in render_template_to_string
    return render_template(template, cast(MutableMapping[str, Any], context), native=False)
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/utils/helpers.py", line 296, in render_template
    return "".join(nodes)
  File "<template>", line 13, in root
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/jinja2/runtime.py", line 857, in _fail_with_undefined_error
    raise self._undefined_exception(self._undefined_message)
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'param_01'

Anything else

The problem occurs every time

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@schweizersta schweizersta added area:providers kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Sep 19, 2024
Copy link

boring-cyborg bot commented Sep 19, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@schweizersta schweizersta changed the title setting parameters in SQLExecuteQueryOperator has no effect (should be the same as using params) Sep 19, 2024
@dosubot dosubot bot added area:core-operators Operators, Sensors and hooks within Core Airflow provider:postgres labels Sep 19, 2024
@jscheffl
Copy link
Contributor

jscheffl commented Sep 19, 2024

The {{ params.*** }} you are referring to are the DAG run parameters which can be defined on DAG level as described in https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/params.html#referencing-params-in-a-task

The parameters you are trying to add from the operators need to be passed in form %(parameter_name)s into the query string - as documented in https://airflow.apache.org/docs/apache-airflow-providers-postgres/stable/operators/postgres_operator_howto_guide.html#passing-server-configuration-parameters-into-postgresoperator

Your SQL file need to look like:

select * from table where %(param_01)s
@schweizersta
Copy link
Author

I am not talking about DAG run parameters, but task parameters (given when instantiating SQLExecuteQueryOperator).
With some more investigations I figured out how it works and what I am trying to say:

In the previous section "Passing Parameters into SQLExecuteQueryOperator for Postgres" of the second docu link you referred to, there are 2 examples:

...
  sql="SELECT * FROM pet WHERE birth_date BETWEEN SYMMETRIC %(begin_date)s AND %(end_date)s",
  parameters={"begin_date": "2020-01-01", "end_date": "2020-12-31"},
...
...
  sql="sql/birth_date.sql",
  params={"begin_date": "2020-01-01", "end_date": "2020-12-31"},
...

In the second example (where the sql command is stored in a file), one has to provide values for the sql parameters using params=. But it does not matter how you access the paramters in the sql command, whether by using %(begin_date) or {{ params.begin_date }}, both of them work fine.

What got me confused, is the statement at the beginning of this section, where it says: "Both parameters and params make it possible to dynamically pass in parameters in many interesting ways.". So I thougt, I can use either params or parameters in both cases. But that "parameter" is only valid for the first case and "params" only for the second one, was not obvious to me.

Now that I know how it's meant to be, I leave it up to you, if you want to clarify this more in the docu or not.

@jscheffl
Copy link
Contributor

Thanks for the feedback!
As the community is always happy to receive not only feedback but also direct corrections... would it be possible to open a small PR and propose a change? If you just do this, then I can review and merge it.
Just use the button on the bottom of each page:

image

It is really simple and once you have made your first PR you are joining the crowd and be proud! Might take <5min.

@kunaljubce
Copy link
Contributor

@jscheffl I have raised the PR to make this documentation change. Please review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:core-operators Operators, Sensors and hooks within Core Airflow area:providers kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet pending-response provider:postgres
3 participants