PyTorch main-branch nightly integration verification #141
…h upstream
- Add builder Dockerfiles (manylinux2_28, x86_64/aarch64) for torch_npu wheel builds
- Add test Dockerfiles (ubuntu:22.04 + Miniforge3 conda py_3.10, aligned with upstream CUDA 13.0 image)
- Switch test images from system Python to conda + named env pattern
- Align requirements-test.txt with upstream requirements-ci.txt (py3.10/jammy profile)
- Switch to PyTorch nightly index for latest daily builds
- Add docker_build.sh with explicit case-statement tag mapping
- Add .ci/pytorch/ build scripts (common.sh, build_pytorch.sh, build_torch_npu.sh, build.sh, integration_verify.sh)
- Update _build.yml with real build commands replacing simulated placeholders
- Add build-docker-images.yml workflow: PR/push triggers build+push all 8 images, aarch64 on ARM native runners
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
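The "explicit case-statement tag mapping" in docker_build.sh could look roughly like the sketch below. The function name `image_tag_for`, the target names, and the tag strings are all placeholders invented for illustration; the real script's values are not shown in this conversation.

```shell
# Hypothetical sketch: map a build target name to a Docker image tag.
# Target names and tags are placeholders, not the PR's actual values.
image_tag_for() {
  local target="$1"
  case "$target" in
    aarch64-a2) echo "example-aarch64-a2" ;;
    aarch64-a3) echo "example-aarch64-a3" ;;
    *)
      # An explicit default branch makes unknown targets fail loudly.
      echo "unknown target: $target" >&2
      return 1
      ;;
  esac
}
```

Listing every supported target explicitly means a typo in a target name fails loudly rather than producing an unexpected tag.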
…function
GHA expressions do not support 'split'. matrix-prep now outputs a JSON array directly, consumed by fromJSON() in the build matrix.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
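A minimal sketch of how a matrix-prep step might produce such a JSON array in plain shell. The helper name `to_json_array` is hypothetical, and the sketch assumes item names contain no quotes or backslashes, since it does no escaping.

```shell
# Hypothetical helper: turn its arguments into a compact JSON array string.
# Assumes items contain no double quotes or backslashes (no escaping is done).
to_json_array() {
  local out="" item
  for item in "$@"; do
    # Prepend a comma only when something has already been emitted.
    out="${out}${out:+,}\"${item}\""
  done
  printf '[%s]\n' "$out"
}
```

In a workflow step, the result would typically be appended to the file named by `$GITHUB_OUTPUT` as `name=value`, and the consuming job would pass that output through `fromJSON()` when defining its matrix.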
…6_64
- Remove builder Dockerfiles and requirements-builder.txt
- Remove test/Dockerfile.x86_64
- Simplify docker_build.sh to only aarch64 A2/A3
- Simplify build-docker-images.yml to aarch64-only, ubuntu-24.04-arm runner
- Update _build.yml default image tag to aarch64
- Update README
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Docker image no longer installs PyTorch (built separately at CI runtime)
- Move torchvision, torch_geometric, torch-scatter to requirements-post.txt
- requirements-post.txt installed after PyTorch + torch_npu are built
- Add post-build dependency step to build.sh and _build.yml
- Dockerfile now only installs base requirements (no torch needed)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace generic DOCKER_REGISTRY push with quay.io/kerer/pytorch using docker/login-action@v3. Upload each built image path as an artifact, and add a summary job that prints image names and docker pull commands in a Markdown table via GITHUB_STEP_SUMMARY.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add test-build-trigger.yml (PR trigger on master) and _build.yml (reusable workflow), which check out upstream PyTorch main HEAD, build it from source, then check out downstream torch_npu master and verify that both packages import successfully.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CLA Signature Pass
kerer-ai, thanks for your pull request. All authors of the commits have signed the CLA. 👍
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add ccache to accelerate C/C++ compilation across runs
- Add pip cache via actions/cache@v4 for faster dependency installs
- Add BUILD_WITHOUT_SHA=1 env for more cacheable builds
- Capture build logs to /tmp and upload as artifacts for debugging
- Add proper exit code handling in build steps
- Enhance build summary with ccache statistics
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…erfile
- fetch-depth: 0 prevents shallow clone conflicts with recursive submodules
- Install ccache in Dockerfile so future images have it pre-installed
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Setting CC="ccache gcc" causes ccache to be invoked directly when CMake builds .S assembly files (qnnpack confu loses the "gcc" part), making ccache reject -D preprocessor flags as invalid options. CMAKE_C_COMPILER_LAUNCHER tells CMake to prefix the compiler natively, which correctly handles C, C++, and assembly compilation.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
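The launcher-based setup can be sketched as environment variables set before the build. Only CMAKE_C_COMPILER_LAUNCHER is named in the commit message; setting the C++ counterpart and unsetting CC/CXX are assumptions made here for illustration.

```shell
# Sketch: let CMake prepend ccache itself instead of embedding it in CC/CXX.
# CMake 3.17+ reads these environment variables as initial values for the
# CMAKE_<LANG>_COMPILER_LAUNCHER variables on first configuration.
unset CC CXX                              # assumption: drop any "ccache gcc" override
export CMAKE_C_COMPILER_LAUNCHER=ccache   # named in the commit message
export CMAKE_CXX_COMPILER_LAUNCHER=ccache # assumption: C++ counterpart
```

Because CMake keeps the compiler path and the launcher as separate values, generators that split or rewrite the compiler command do not end up invoking ccache without a compiler behind it.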
When the source-built PyTorch is newer than what torch_npu master expects, the code generator adds TORCH_FEATURE_VERSION guards to C shim headers that differ from checked-in files, causing a RuntimeError. Run codegen with --update_aoti_c_shim before ci/build.sh to sync headers first.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When the op-plugin submodule lacks the config directory for the source-built PyTorch version (e.g. no v2r13 for 2.13.0a0), fall back to the latest available config and symlink it so ci/build.sh can also resolve the expected path.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
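The fallback described above might be sketched as follows. The function name `ensure_config_dir` and its arguments are hypothetical, and the sketch assumes config directories sit side by side under one root with version-sortable names such as v2r12; the actual op-plugin layout is not shown here.

```shell
# Hypothetical sketch: make <config_root>/<wanted> resolvable, falling back
# to the highest-versioned sibling directory through a relative symlink.
ensure_config_dir() {
  local config_root="$1" wanted="$2" latest
  if [ -d "$config_root/$wanted" ]; then
    return 0   # the expected directory already exists
  fi
  # sort -V orders version-like names numerically (v2r9 before v2r12).
  latest="$(find "$config_root" -mindepth 1 -maxdepth 1 -type d \
              -exec basename {} \; | sort -V | tail -n 1)"
  if [ -z "$latest" ]; then
    echo "no config directories under $config_root" >&2
    return 1
  fi
  ln -s "$latest" "$config_root/$wanted"
  echo "fell back to $latest for $wanted"
}
```

For example, `ensure_config_dir path/to/config v2r13` would leave an existing v2r13 untouched and otherwise link it to the newest directory present.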
The manual gen_backend_stubs invocation required constructing version-specific config paths (v2r13 etc.), which break when the op-plugin submodule structure changes. Instead, patch generate_code.sh in-place with sed to add the --update_aoti_c_shim flag, then let ci/build.sh run the full codegen pipeline, which handles all version mapping correctly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Don't touch source code in CI. Let the build fail naturally with full error output for diagnosis.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Allow parallel runs during testing phase.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…orch 2.11+ compat
Upstream PR pytorch/pytorch#181782 changed SavedVariable::unpack() to accept c10::intrusive_ptr<Node> instead of std::shared_ptr<Node>. Update the codegen template to match, fixing VariableType_0.cpp.o and python_functions_*.cpp.o build failures.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…+ compat
Same upstream PR pytorch/pytorch#181782 removed autograd::deleteNode and changed set_history() to accept intrusive_ptr<Node>. Replace shared_ptr with make_intrusive for WarnNotImplemented grad_fn.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… PyTorch 2.11+ compat
set_history() now expects c10::intrusive_ptr<Node>. Replace shared_ptr with intrusive_ptr for Identity grad_fn, same upstream change #181782.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…yTorch 2.11+ compat
Upstream autograd::impl::grad_accumulator() now returns intrusive_ptr<Node> instead of shared_ptr<Node>. Update member types in reducer.hpp to match.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The torch_npu._C shared library depends on libhccl.so from CANN, which is not on the default library path. Source set_env.sh before importing.
Replace the single "Verify installation" step with two steps:
1. Verify NPU device (source env, npu-smi info)
2. Verify NPU availability (source env, import torch + torch_npu)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two changes:
1. Wheel artifact upload:
- PyTorch build now uses 'pip wheel' + 'pip install' instead of editable
install, producing a wheel at /tmp/wheels/
- torch_npu wheel is copied to /tmp/wheels/ after build
- Both wheels uploaded as artifact 'wheels-<run>' with 7-day retention
2. Checkout speed optimization:
- Submodule init is deferred to a separate step with --depth=1, avoiding
full history clones of huge submodules (third_party/*)
- Combined with --jobs=$(nproc) for parallel submodule clone
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
chmod -R 777 ~/.cache recursively walks the entire ccache directory tree (hundreds of thousands of cached object files), taking ~7 minutes on self-hosted runners with persistent home directories. Replace it with a non-recursive chmod on directories only.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
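One way to express a non-recursive, directories-only chmod is sketched below. The function name `chmod_cache_dirs` and the choice of depth (the cache root plus its immediate subdirectories) are assumptions; the commit does not say exactly which directories it targets.

```shell
# Sketch: open permissions on a cache root and its immediate subdirectories
# only, without descending into the (potentially huge) tree of cached files.
chmod_cache_dirs() {
  local cache_root="$1"
  # -maxdepth 1 stops find at the first level; -type d skips regular files.
  find "$cache_root" -maxdepth 1 -type d -exec chmod 777 {} +
}
```

The cost of this call depends on the number of top-level entries rather than on the total number of cached objects.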
The Post Cache ccache step uploads the entire ccache directory (~/.cache/ccache) back to GitHub Cache, taking ~8min for a full 10G cache over a limited uplink.
Changes:
- Reduce ccache max size from 10G to 5G (halves upload volume)
- Add CCACHE_COMPRESSLEVEL=6 (zstd level 6, was default 1)
- Add pre-summary ccache cleanup step (ccache -c) to prune old entries before the post-job cache upload kicks in
- Report ccache disk usage after cleanup for visibility
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
fetch-depth: 0 triggers a full clone (all branches, all tags, all history). PyTorch has thousands of branches and a long commit history, so git fetch alone takes ~3min. fetch-depth: 1 fetches only the single target SHA, reducing checkout to ~15s.
Also apply the same submodule optimization (depth=1, parallel) to the torch_npu checkout.
Expected improvement:
- PyTorch checkout: ~3min → ~15s
- torch_npu checkout: ~30s → ~10s
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
pip wheel + pip install puts torch in site-packages, but Python run from the pytorch/ source dir imports the local torch/ directory (which contains a source-only _C folder, not the compiled .so). This fails with: "Failed to load PyTorch C extensions"
Fix: cd /tmp before import, then cd back.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
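A small sketch of the same idea using a subshell, which is a variation on the "cd /tmp, then cd back" fix described in the commit: the directory change is scoped to the subshell, so no explicit cd back is needed. The helper name `run_outside_source_tree` is hypothetical.

```shell
# Sketch: run a command from /tmp so Python resolves packages from
# site-packages rather than from a source checkout in the current directory.
run_outside_source_tree() {
  # The parentheses start a subshell; the caller's working directory
  # is left unchanged once the command finishes.
  ( cd /tmp && "$@" )
}

# Example invocation (not executed here):
#   run_outside_source_tree python -c "import torch; print(torch.__version__)"
```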
Cache restore (ccache 5.5GB) took ~7min via network, while full recompile takes <5min. Cache save/upload also cost another ~8min post-job.
Remove: pip cache, ccache cache, ccache install, ccache configuration in both build steps, ccache cleanup, and ccache summary output.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Include the dynamically generated AOTI C shim header in the build artifact upload for debugging AOTInductor compatibility issues.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Now that c_shim_npu.h is kept in sync with the repo, the CI runs in validation mode: it regenerates the header and fails if the result differs from the checked-in version, instead of silently overwriting it.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
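The validation-mode behaviour can be illustrated with a generic compare-and-fail check. This is a stand-in sketch, not the code generator's own mechanism; the function name `validate_generated_header` and its two path arguments are hypothetical.

```shell
# Sketch: fail when a freshly generated header differs from the checked-in one.
# Generic stand-in for illustration; not the generator's built-in check.
validate_generated_header() {
  local checked_in="$1" generated="$2"
  # diff exits non-zero on any difference and prints a unified diff for review.
  if ! diff -u "$checked_in" "$generated"; then
    echo "error: $checked_in is out of date; regenerate and commit it" >&2
    return 1
  fi
}
```

Printing the diff before failing gives reviewers the exact lines that drifted, which matches the intent of failing visibly rather than overwriting.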
Update the checked-in AOTI C shim header to match the version generated by the current PyTorch nightly's torchgen fallback ops list.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sync with latest upstream Ascend/pytorch master.
No description provided.