Skip to content

refactor: Improve typing annotations for planner, schema, serde, and transforms modules#7579

Open
OutSquareCapital wants to merge 5 commits intotobymao:mainfrom
OutSquareCapital:improve-annotations
Open

refactor: Improve typing annotations for planner, schema, serde, and transforms modules#7579
OutSquareCapital wants to merge 5 commits intotobymao:mainfrom
OutSquareCapital:improve-annotations

Conversation

@OutSquareCapital
Copy link
Copy Markdown
Contributor

@OutSquareCapital OutSquareCapital commented Apr 29, 2026

Another day, another typing PR!
This one focus more on functions/methods bodies, since the signatures are now well covered.
A lot of variables where implicitly Any due to Expr::args::get usage.

A few non-typing changes had to be made, I explain why in comments.

unrelated, IDK why copilot keep co-authoring my commits, I 100% assume using AI heavily but NOT for typing work. Didn't even prompted him once. Maybe autocomplete, idk

OutSquareCapital and others added 4 commits April 29, 2026 12:11
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Comment thread sqlglot/planner.py
] = [] # final selects in this chain of steps representing a select
operands = {} # intermediate computations of agg funcs eg x + 1 in SUM(x + 1)
aggregations = {}
# final selects in this chain of steps representing a select
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took the liberty to move the comment on the line above because the additional type hints made this very unreadable

Comment thread sqlglot/serde.py
if node.args:
for k, vs in reversed(node.args.items()):
if type(vs) is list:
if isinstance(vs, list):
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type(x) is y is legacy pattern and doesn't work as well on type checkers than isinstance

Comment thread sqlglot/serde.py
return payloads


def _has_parent(node: object) -> TypeIs[exp.Expr]:
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper was the best option to type this correctly

Comment thread sqlglot/transforms.py

select_candidates = exp.Window if expression.is_star else (exp.Window, exp.Column)
for select_candidate in list(qualify_filters.find_all(select_candidates)):
select_candidates = (exp.Window,) if expression.is_star else (exp.Window, exp.Column)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a potential bug: we could pass a tuple as an argument, but we were not unpacking it afterwards.

Comment thread sqlglot/transforms.py
select_candidates = exp.Window if expression.is_star else (exp.Window, exp.Column)
for select_candidate in list(qualify_filters.find_all(select_candidates)):
select_candidates = (exp.Window,) if expression.is_star else (exp.Window, exp.Column)
for select_candidate in qualify_filters.find_all(*select_candidates):
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collecting in a list here served no purpose and was a performance tax

Comment thread sqlglot/transforms.py
alias = join_expr.args.get("alias")
else:
alias = unnest.args.get("alias")
if alias is None:
Copy link
Copy Markdown
Contributor Author

@OutSquareCapital OutSquareCapital Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This had to be done because the only None check that was done on alias was for the alias_col variable.
So later on with the alias.this attribute access, this could crash, and mypy wasn't letting this pass anyways

Comment thread sqlglot/transforms.py
*[
exp.column(col, tables[0]).eq(exp.column(col, tables[1]))
for col in full_outer_join.args.get("using")
for col in t.cast(list[exp.Identifier], full_outer_join.args.get("using"))
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cast is needed here, unless we extract this path in a function helper to correctly type the full_outer_join.args.get("using")

Comment thread sqlglot/transforms.py Outdated
return exp.column(alias_or_name, quoted=identifier.args.get("quoted"))
return alias_or_name

outer_selects = exp.select(*list(map(_select_alias_or_name, expression.selects)))
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collecting in a list here served no purpose

@OutSquareCapital
Copy link
Copy Markdown
Contributor Author

OutSquareCapital commented Apr 29, 2026

mypy don't fail in local for me. I see that there have been recent changes regarding mypy usage in the repo, could this be linked @VaggelisD ?

@georgesittas
Copy link
Copy Markdown
Collaborator

Could be, can you try to reset the venv and install dependencies again?

@OutSquareCapital
Copy link
Copy Markdown
Contributor Author

OutSquareCapital commented Apr 29, 2026

Could be, can you try to reset the venv and install dependencies again?

to be honest I always use uvx for ruff and mypy instead of uv run, i.e the venv in this codebase, due to repeated issues I had with uv, not having make, and very long dependencies resolving times if I want to change versions/run unnitest and have to rebuild sqlglot when the project metadata has changed since the last build.
And the fact that mypy, ruff etc.. aren't in dev dependencies.
Thus I only use it to run unittest.

Anyway, I just tested, it used to work with 3.9, but not anymore. Still work fine with 3.10 tho. But I suspect now uvx target 3.10 instead of 3.9 version for mypy in consequence, so I can't see issues with this specific version.

PS C:\Users\tibo\python_codes\sqlglot> uv sync --python 3.9
Using CPython 3.9.25
Creating virtual environment at: .venv
  × Failed to build `sqlglotc @
  │ file:///C:/Users/tibo/python_codes/sqlglot/sqlglotc`
  ├─▶ Failed to resolve requirements from `build-system.requires``setuptools-scm`, `sqlglot-mypy>=1.20.0.post4`,
  │   `types-python-dateutil`, `sqlglot`
  ╰─▶ Because only the following versions of sqlglot-mypy are
      available:
          sqlglot-mypy<=1.20.0.post4
          sqlglot-mypy==1.20.1.post1
      and sqlglot-mypy==1.20.1.post1 was yanked (reason: It
      should have been 1.20.0.post1), we can conclude that

      Because the current Python version (3.9.25) does not satisfy
      Python>=3.10, we can conclude that sqlglot-mypy==1.20.0.post4
      cannot be used.
      And because we know from (1) that sqlglot-mypy>1.20.0.post4
      cannot be used, we can conclude that
      sqlglot-mypy>=1.20.0.post4 cannot be used.
      conclude that your requirements are unsatisfiable.
PS C:\Users\tibo\python_codes\sqlglot> uv sync --python 3.10
Using CPython 3.10.19
Removed virtual environment at: .venv
Creating virtual environment at: .venv
Resolved 58 packages in 2m 47s
      Built sqlglot @ file:///C:/Users/tibo/python_codes/sqlglot
Prepared 1 package in 5.20s
Installed 1 package in 28ms
 + sqlglot==0.0.1.dev7554 (from file:///C:/Users/tibo/python_codes/sqlglot)

EDIT:
turns out I can target python versions with uvx, still no fail (those errors are expected, so to me it's good to go in local):

PS C:\Users\tibo\python_codes\sqlglot> uvx --python 3.9 mypy sqlglot/ 
Installed 6 packages in 2.96s
sqlglot\optimizer\simplify.py:20: error: Library stubs not installed for "dateutil.relativedelta"  [import-untyped]
sqlglot\optimizer\simplify.py:20: note: Hint: "python3 -m pip install types-python-dateutil"
sqlglot\optimizer\simplify.py:20: note: (or run "mypy --install-types" to install all missing stub packages)
sqlglot\optimizer\simplify.py:20: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
Found 1 error in 1 file (checked 175 source files)
PS C:\Users\tibo\python_codes\sqlglot> 

@georgesittas georgesittas requested a review from VaggelisD April 29, 2026 13:28
@VaggelisD
Copy link
Copy Markdown
Collaborator

VaggelisD commented Apr 29, 2026

Hey @OutSquareCapital, for context, sqlglot-mypy is a fork of mypy for various sqlglot[c] fixes we add on top to unblock mypyc compilation, the rest of the type checking should behave exactly like mypy.

Mypy 1.20 (latest release) dropped support for Python 3.9 and so did we for sqlglot[c]; The latest project metadata changes basically made it so that:

  • Python 3.9 defaults to mypy, whatever latest version it supports 3.9 for
  • Python 3.10+ defaults to sqlglot-mypy which is based on mypy 1.20 + our fixes on top

As for the CI failures, I think they're justified (?). Afaict, it's rejecting | for the alias so maybe we can move it under t.TYPE_CHECKING?

Co-authored-by: Copilot <copilot@github.com>
@OutSquareCapital
Copy link
Copy Markdown
Contributor Author

OutSquareCapital commented Apr 29, 2026

Okay apologies to both of you guys, switching branches again gave me the same errors than in the CI.
Either I was a moron and wasn't in the correct branch before, or I had a weird cache issue with mypy (I saw that happen already, sometimes the Library stubs not installed for "dateutil.relativedelta error don't show up).

@VaggelisD
Copy link
Copy Markdown
Collaborator

@OutSquareCapital Nothing to worry about! I also feel like I get cryptic mypy cache issues from time to time, so either that makes 2 of us or it's bound to happen, especially with the pyproject change.

Not sure what the new CI failure is about, can you repro that locally?

@OutSquareCapital
Copy link
Copy Markdown
Contributor Author

I don't have the same output but running unittest as it is will with uv run -m unittest make my vscode crash.
It seems to come from TestParser::test_max_nodes that take hours to run.

@georgesittas
Copy link
Copy Markdown
Collaborator

What's the status here?

@OutSquareCapital
Copy link
Copy Markdown
Contributor Author

What's the status here?

I will try to fix this tomorrow if I have some time or sunday, will update once fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants