Commit 1a10806

feat(mypyc): Enable incremental compilation, deprecate Python 3.9 (#7574)
Switch sqlglotc's mypyc build to separate=True so each compiled module gets its own shared lib + shim. Clean builds are roughly the same wall-clock; incremental rebuilds after a one-line edit drop from a full ~110s monolithic rebuild to a few seconds once the cache is warm.

Python 3.9 is dropped from `sqlglot[c]` because sqlglot-mypy 1.20+ (the build dep needed for the separate=True codegen fixes) only ships wheels for 3.10+. Pure-Python sqlglot still supports 3.9; on 3.9 `pip install sqlglot[c]` is a no-op via an environment marker, and `make install-devc` short-circuits with a clear "requires Python 3.10+" message.

- sqlglotc/pyproject.toml: requires-python >= 3.10, build dep pinned to sqlglot-mypy >= 1.20.0.post4 (which carries the cached-SCC fix that makes the wheel-build path actually package shared libs).
- setup.py: [c] / [rs] extras gated on python_version >= '3.10'; on 3.9 fall back to upstream `mypy` for type checking instead of sqlglot-mypy.
- Makefile: install-devc / install-devc-release skip the build on 3.9.

sqlglot-side tweaks for the new build:

- sqlglot/__init__.py: drop the legacy *__mypyc*.so bootstrap preloader. It was written for the old monolithic build where a single hash-named .so sat at the package root; under separate=True per-module shared libs sit next to their .py siblings and resolve through the shim.
- sqlglot/optimizer/__init__.py: swap eager top-level re-exports for PEP 562 `__getattr__`. Under separate=True's cross-group init ordering, the previous eager `from sqlglot.optimizer.optimizer import ...` could fire while sqlglot/__init__.py was still mid-bootstrap and trip a circular `ImportError` on `from sqlglot import Schema, exp`. Lazy resolution defers the cycle past sqlglot's init. Concurrent first-access lookups are serialised with an RLock, mirroring sqlglot/dialects/__init__.py.
1 parent e078e5d commit 1a10806

6 files changed

Lines changed: 99 additions & 36 deletions

Makefile

Lines changed: 14 additions & 2 deletions
@@ -38,11 +38,23 @@ install-dev:
 		fi; \
 	fi

+# sqlglotc requires Python 3.10+ (sqlglot-mypy 1.20+ dropped 3.9). On 3.9
+# we skip the C build; tests fall back to pure-Python sqlglot.
+PY_GE_310 := $(shell python -c "import sys; print(int(sys.version_info >= (3, 10)))")
+
 install-devc:
-	cd sqlglotc && MYPYC_OPT=0 python setup.py build_ext --inplace -j $(NPROC)
+	@if [ "$(PY_GE_310)" = "1" ]; then \
+		cd sqlglotc && MYPYC_OPT=0 python setup.py build_ext --inplace -j $(NPROC); \
+	else \
+		echo "Skipping sqlglotc build: requires Python 3.10+"; \
+	fi

 install-devc-release: clean
-	cd sqlglotc && python setup.py build_ext --inplace -j $(NPROC)
+	@if [ "$(PY_GE_310)" = "1" ]; then \
+		cd sqlglotc && python setup.py build_ext --inplace -j $(NPROC); \
+	else \
+		echo "Skipping sqlglotc build: requires Python 3.10+"; \
+	fi

 install-pre-commit:
 	pre-commit install
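
The `PY_GE_310` gate in the Makefile relies on Python's tuple comparison. A minimal sketch of the same check as a function, where `py_ge_310` is a hypothetical name used only for illustration:

```python
import sys


def py_ge_310(version_info=sys.version_info) -> int:
    """Return 1 on Python 3.10 or newer, else 0 (the value the Makefile tests)."""
    # Tuples compare element by element, so (3, 9, 18) >= (3, 10) is False
    # even though the string "3.9" would sort after "3.10".
    return int(tuple(version_info) >= (3, 10))


print(py_ge_310())
```

Comparing tuples rather than version strings is what keeps 3.9 from being treated as newer than 3.10.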

setup.py

Lines changed: 9 additions & 4 deletions
@@ -7,7 +7,10 @@
     extras_require={
         "dev": [
             "duckdb>=0.6",
-            "sqlglot-mypy",
+            # sqlglot-mypy 1.20+ is the build dep for sqlglotc and only ships
+            # for py3.10+; on py3.9 just use upstream mypy for type checking.
+            "sqlglot-mypy >= 1.20.0.post4; python_version >= '3.10'",
+            "mypy; python_version < '3.10'",
             "setuptools_scm",
             "pandas",
             "pandas-stubs",
@@ -21,9 +24,11 @@
             "typing_extensions",
             "pyperf",
         ],
-        # Compiles from source on the user's machine.
-        "c": [f"sqlglotc=={version}"],
+        # Compiles from source on the user's machine. Requires Python 3.10+
+        # because the build dep (sqlglot-mypy 1.20+) dropped 3.9; on 3.9
+        # `pip install sqlglot[c]` is a no-op and you just get pure-Python sqlglot.
+        "c": [f"sqlglotc=={version}; python_version >= '3.10'"],
         # Deprecated: the Rust tokenizer has been replaced by sqlglotc.
-        "rs": ["sqlglotrs==0.13.0", f"sqlglotc=={version}"],
+        "rs": ["sqlglotrs==0.13.0", f"sqlglotc=={version}; python_version >= '3.10'"],
     },
 )
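
To illustrate why the `python_version >= '3.10'` marker turns `pip install sqlglot[c]` into a no-op on 3.9, here is a simplified sketch of how such a marker resolves. It is not pip's implementation; `marker_applies` and `resolve_c_extra` are hypothetical helpers, and the version string passed in is a placeholder.

```python
def marker_applies(python_version: str, minimum: str = "3.10") -> bool:
    """Approximate the marker `python_version >= '3.10'` with a version-aware compare."""

    def as_tuple(value: str) -> tuple:
        return tuple(int(part) for part in value.split("."))

    # Compare numerically: as plain strings, "3.9" would sort after "3.10".
    return as_tuple(python_version) >= as_tuple(minimum)


def resolve_c_extra(version: str, python_version: str) -> list:
    """Return the requirements the [c] extra contributes for an interpreter."""
    requirement = f"sqlglotc=={version}"
    # When the marker is false the requirement is skipped entirely.
    return [requirement] if marker_applies(python_version) else []


print(resolve_c_extra("1.2.3", "3.9"))
print(resolve_c_extra("1.2.3", "3.12"))
```

On an interpreter where the marker is false, the extra contributes nothing, so only pure-Python sqlglot is installed.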

sqlglot/__init__.py

Lines changed: 1 addition & 17 deletions
@@ -1,4 +1,4 @@
-# ruff: noqa: F401, E402
+# ruff: noqa: F401
 """
 .. include:: ../README.md

@@ -7,25 +7,9 @@

 from __future__ import annotations

-# bootstrap mypyc runtime: compiled .so modules do a top-level `import HASH__mypyc`,
-# but the runtime .so lives inside sqlglot/. Pre-load it into sys.modules.
-# this is only needed for editable builds
 from collections.abc import Collection
-import sys
-from pathlib import Path
 from builtins import type as Type

-for path in Path(__file__).parent.glob("*__mypyc*.so"):
-    name = path.stem.split(".")[0]
-    if name not in sys.modules:
-        import importlib.util
-
-        spec = importlib.util.spec_from_file_location(name, path)
-        if spec and spec.loader:
-            mod = importlib.util.module_from_spec(spec)
-            sys.modules[name] = mod
-            spec.loader.exec_module(mod)
-
 import logging
 import typing as t
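
With the preloader gone, one way to confirm that a module was actually loaded from a compiled extension rather than its `.py` sibling is to inspect its `__file__`. This is a sketch under the assumption that a compiled module's `__file__` points at the extension file; `is_compiled` is a hypothetical helper, not part of sqlglot.

```python
import importlib.machinery
import json
import types


def is_compiled(module: types.ModuleType) -> bool:
    """Return True when the module was loaded from a C extension file."""
    # Built-in modules such as `sys` have no __file__ at all.
    path = getattr(module, "__file__", None) or ""
    # EXTENSION_SUFFIXES lists the platform's extension endings, e.g. ".so" or ".pyd".
    return path.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


# The stdlib json package is pure Python, so this reports False.
print(is_compiled(json))
```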

sqlglot/optimizer/__init__.py

Lines changed: 63 additions & 9 deletions
@@ -1,11 +1,65 @@
 # ruff: noqa: F401
+"""Lazy re-exports from optimizer submodules.

-from sqlglot.optimizer.optimizer import RULES as RULES, optimize as optimize
-from sqlglot.optimizer.scope import (
-    Scope as Scope,
-    build_scope as build_scope,
-    find_all_in_scope as find_all_in_scope,
-    find_in_scope as find_in_scope,
-    traverse_scope as traverse_scope,
-    walk_in_scope as walk_in_scope,
-)
+Eager re-exports here trip a circular import under sqlglot[c]: importing
+sqlglot loads compiled `expressions.builders`, which eagerly wires up its
+links to compiled optimizer modules, which causes Python to run this
+file. The eager `from sqlglot.optimizer.optimizer import ...` then asks
+for `sqlglot.Schema`, but `sqlglot/__init__.py` hasn't bound it yet.
+
+PEP 562 `__getattr__` defers the import to first attribute access, by
+which point sqlglot is fully loaded. Tracked upstream in python/mypy#21299.
+"""
+
+from __future__ import annotations
+
+import threading
+import typing as t
+
+# Serialise concurrent first-access lookups; importlib's per-module lock is
+# released before our caching write to globals(), so two threads racing on
+# the same name could otherwise both run the import + getattr.
+_import_lock = threading.RLock()
+
+# Only for type checkers and IDEs; runtime resolution goes through __getattr__.
+if t.TYPE_CHECKING:
+    from sqlglot.optimizer.optimizer import RULES as RULES, optimize as optimize
+    from sqlglot.optimizer.scope import (
+        Scope as Scope,
+        build_scope as build_scope,
+        find_all_in_scope as find_all_in_scope,
+        find_in_scope as find_in_scope,
+        traverse_scope as traverse_scope,
+        walk_in_scope as walk_in_scope,
+    )
+
+# Explicit, because some names collide between optimizer.py and submodules
+# (e.g. `qualify` is both a function and a submodule).
+_LAZY_ATTRS: dict[str, tuple[str, str]] = {
+    "RULES": ("sqlglot.optimizer.optimizer", "RULES"),
+    "optimize": ("sqlglot.optimizer.optimizer", "optimize"),
+    "Scope": ("sqlglot.optimizer.scope", "Scope"),
+    "build_scope": ("sqlglot.optimizer.scope", "build_scope"),
+    "find_all_in_scope": ("sqlglot.optimizer.scope", "find_all_in_scope"),
+    "find_in_scope": ("sqlglot.optimizer.scope", "find_in_scope"),
+    "traverse_scope": ("sqlglot.optimizer.scope", "traverse_scope"),
+    "walk_in_scope": ("sqlglot.optimizer.scope", "walk_in_scope"),
+}
+
+
+def __getattr__(name: str) -> t.Any:
+    import importlib
+
+    with _import_lock:
+        target = _LAZY_ATTRS.get(name)
+        if target is not None:
+            module_name, attr = target
+            value = getattr(importlib.import_module(module_name), attr)
+        else:
+            # Submodule fallback so `from sqlglot.optimizer import qualify` works.
+            try:
+                value = importlib.import_module(f"{__name__}.{name}")
+            except ModuleNotFoundError:
+                raise AttributeError(f"module {__name__!r} has no attribute {name!r}") from None
+        globals()[name] = value
+        return value
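
The PEP 562 pattern used in this file can be tried in isolation. The sketch below is not sqlglot code: `lazy_demo` is a hypothetical stand-in module built with `types.ModuleType`, and the lazy targets are stdlib functions chosen only so the example runs anywhere.

```python
import importlib
import threading
import types

# Hypothetical lazy table: exported name -> (module to import, attribute to fetch).
_LAZY_ATTRS = {
    "sqrt": ("math", "sqrt"),
    "dumps": ("json", "dumps"),
}

# Same role as the commit's _import_lock: serialise first-access lookups.
_import_lock = threading.RLock()

# Stand-in for a package __init__ module.
lazy_demo = types.ModuleType("lazy_demo")


def _module_getattr(name: str):
    """Resolve `name` on first access, then cache it in the module namespace."""
    with _import_lock:
        target = _LAZY_ATTRS.get(name)
        if target is None:
            raise AttributeError(f"module 'lazy_demo' has no attribute {name!r}")
        module_name, attr = target
        value = getattr(importlib.import_module(module_name), attr)
        # Once cached, normal attribute lookup succeeds and __getattr__ is skipped.
        vars(lazy_demo)[name] = value
        return value


# PEP 562: a `__getattr__` found in a module's namespace handles missing attributes.
lazy_demo.__getattr__ = _module_getattr

print(lazy_demo.sqrt(9))
```

Nothing is imported until `lazy_demo.sqrt` is first touched, which is the property the commit relies on to push the optimizer imports past sqlglot's own initialisation.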

sqlglotc/pyproject.toml

Lines changed: 9 additions & 3 deletions
@@ -4,17 +4,23 @@ dynamic = ["version"]
 description = "mypyc-compiled extensions for sqlglot"
 authors = [{ name = "Toby Mao", email = "toby.mao@gmail.com" }]
 license = "MIT"
-requires-python = ">= 3.9"
+requires-python = ">= 3.10"

 [project.optional-dependencies]
-dev = ["setuptools >= 61.0", "setuptools_scm", "sqlglot-mypy"]
+dev = ["setuptools >= 61.0", "setuptools_scm", "sqlglot-mypy >= 1.20.0.post4"]

 [project.urls]
 Homepage = "https://sqlglot.com/"
 Repository = "https://github.com/tobymao/sqlglot"

 [build-system]
-requires = ["setuptools >= 61.0", "setuptools_scm", "sqlglot-mypy", "types-python-dateutil", "sqlglot"]
+requires = [
+    "setuptools >= 61.0",
+    "setuptools_scm",
+    "sqlglot-mypy >= 1.20.0.post4",
+    "types-python-dateutil",
+    "sqlglot",
+]
 build-backend = "setuptools.build_meta"

 [tool.setuptools]

sqlglotc/setup.py

Lines changed: 3 additions & 1 deletion
@@ -129,6 +129,8 @@ def run(self):
 setup(
     name="sqlglotc",
     packages=[],
-    ext_modules=mypycify(_source_paths(), opt_level=os.environ.get("MYPYC_OPT", "2")),
+    ext_modules=mypycify(
+        _source_paths(), opt_level=os.environ.get("MYPYC_OPT", "2"), separate=True, verbose=True
+    ),
     cmdclass={"build_ext": build_ext, "sdist": sdist},
 )
