Merged
49 commits
dda0543
fix: Parallelization broken in lpca and different parallelism backend…
nasiryahm Mar 18, 2026
a4e1851
Tests: Update to CI for new parallel library changes plus fix for ran…
nasiryahm Mar 24, 2026
6aac4b1
speedup: Parallelization of post-processing neighbour distance measur…
nasiryahm Mar 18, 2026
8035701
speedup: Vectorizing pdist
nasiryahm Mar 18, 2026
01ab3c8
fix: Improving error handling and variable descriptions
nasiryahm Mar 18, 2026
0e2313b
style: Code cleanups for neatness and readability. Removed unused pro…
nasiryahm Mar 18, 2026
e425db7
CI: New continuous integration testing for speed. Tests speed across …
nasiryahm Mar 18, 2026
cc84dd7
speedup: Dhruv's eigh implementation
nasiryahm Mar 18, 2026
40db85a
fix: Imports fix
nasiryahm Mar 18, 2026
4641346
speedup: Vectorizing B construction and avoiding sparse to dense conv…
nasiryahm Mar 18, 2026
26528b3
speedup: Speedup of zeta function by replacing pdist with measurement…
nasiryahm Mar 18, 2026
5bdc3d7
Merging zeta change
nasiryahm Mar 19, 2026
cb6fcd1
tests: Updating benchmarking tests to include windows, mac, and linux…
nasiryahm Mar 19, 2026
a1a2806
Test: Adding a test benchmark script to give timings across different…
nasiryahm Mar 19, 2026
ef23418
Fix: Issue with sha code
nasiryahm Mar 19, 2026
bfca429
Matrixifying refs for benchmarks and removing Dhruv's
nasiryahm Mar 19, 2026
274f841
Efficiency: Memory efficiency via Joshua code and updated sha for com…
nasiryahm Mar 24, 2026
1254199
Fix: new mode of batch all and importing
nasiryahm Mar 18, 2026
a82f827
Fix: Past git repo commit needed init
nasiryahm Mar 24, 2026
d1f3123
Update: Cleaner method for src reconstruction for commit comparisons
nasiryahm Mar 24, 2026
81018d5
Test: Alternative memory efficient, with parallelization over num thr…
nasiryahm Mar 25, 2026
3325634
Style: Updating parameter names to match sklearn
nasiryahm Mar 25, 2026
d7e10d6
Fix: Test var names
nasiryahm Mar 25, 2026
2d7695b
style: Naming change of buml_obj to model
nasiryahm Mar 25, 2026
d7fcb10
feat: New method for dynamic allocation of memory to enable large dat…
nasiryahm Mar 25, 2026
d8f85a1
Fix: Dynamic memory allocation fix
nasiryahm Mar 25, 2026
87b2e81
style: progress bar update and verbose cleanup
nasiryahm Mar 25, 2026
1648a79
Update to readme and pyproject (tqdm addition)
nasiryahm Mar 25, 2026
a074cc9
style: Cleaner tqdms
nasiryahm Mar 25, 2026
9203f56
speedup test: bottlenecks re-introduced. Testing shift back.
nasiryahm Mar 25, 2026
a46dceb
feat: add global distortion measurement
ffOj Apr 12, 2026
293c317
Fix for the breakages of parallelization in batched calls
nasiryahm May 3, 2026
5369ea3
Memory Limiting Overhaul Check
nasiryahm May 3, 2026
2069762
Allowing multithreads within blas workers
nasiryahm May 3, 2026
f01c358
ci fixes
nasiryahm May 3, 2026
a2e7add
Testing more efficient allocation and memory checks
nasiryahm May 3, 2026
84a46f4
Setting cpu core checker and warning
nasiryahm May 3, 2026
7db1ddf
Fix param.v update
chiggum May 4, 2026
39534f8
Cast Utildeg to boolean after matrix multiplication
chiggum May 4, 2026
cf7eced
Fix output collection
chiggum May 4, 2026
07bb35e
Enhance comments on singular value handling in overlaps
chiggum May 4, 2026
8fcf478
Reverting to old parallel lib for exact matching to Dhruv's initial c…
nasiryahm May 21, 2026
02ca601
Memory cache update for memory testing
nasiryahm May 21, 2026
5d35f4c
Merge branch 'main' into efficiencies
nasiryahm May 21, 2026
898a7bb
Merge remote-tracking branch 'origin/chiggum-patch-1' into efficiencies
nasiryahm May 21, 2026
9c4107c
Merge remote-tracking branch 'origin/chiggum-patch-2' into efficiencies
nasiryahm May 21, 2026
c78c0ce
Merge remote-tracking branch 'origin/chiggum-patch-3' into efficiencies
nasiryahm May 21, 2026
c9890a7
Merge remote-tracking branch 'origin/feature/global-distortion' into …
nasiryahm May 21, 2026
101ba26
tqdm cleanup
nasiryahm May 21, 2026
58 changes: 58 additions & 0 deletions .github/workflows/benchmark.yml
@@ -0,0 +1,58 @@
name: Performance Benchmark

on:
  push:
    branches: [ '**' ]
  pull_request:
    branches: [ 'main' ]
  workflow_dispatch:

permissions:
  contents: write

jobs:
  benchmark:
    name: RATS Benchmark (${{ matrix.os }}, ${{ matrix.ref_name }})
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        ref: ["${{ github.sha }}", "5eaa1c6a82eb0a316ff2a6911e0ef7379add0bad"]
        include:
          - ref: "${{ github.sha }}"
            ref_name: "Current HEAD"
          - ref: "5eaa1c6a82eb0a316ff2a6911e0ef7379add0bad"
            ref_name: "5eaa1c6 (Joshua V1)"
    runs-on: ${{ matrix.os }}
    defaults:
      run:
        shell: bash
    steps:
      - name: Checkout Code
        uses: actions/checkout@v4
        with:
          ref: ${{ github.sha }}
          fetch-depth: 0

      - name: Overlay src/ from Comparison Ref
        if: matrix.ref != github.sha
        run: |
          git fetch origin ${{ matrix.ref }} --depth=1
          git checkout FETCH_HEAD -- src/

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install Dependencies
        run: |
          python -m pip install --upgrade pip
          python -m pip install -e .[test]

      - name: Run Benchmark
        run: |
          echo "## RATS Performance Benchmarks (${{ matrix.os }})" >> $GITHUB_STEP_SUMMARY
          echo "Commit: ${{ matrix.ref_name }} (${{ matrix.ref }})" >> $GITHUB_STEP_SUMMARY
          echo "Dataset: Klein Bottle (4D)" >> $GITHUB_STEP_SUMMARY
          python tests/scripts/benchmark.py >> $GITHUB_STEP_SUMMARY
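The workflow redirects the stdout of `tests/scripts/benchmark.py` into `$GITHUB_STEP_SUMMARY`, so whatever the script prints is rendered as Markdown on the run page. The script itself is not part of this excerpt. The following is only a minimal sketch of that pattern, with hypothetical helper names (`time_call`, `to_markdown`, `placeholder_workload`) and a stand-in workload rather than a real RATS fit.

```python
import time
from statistics import mean, stdev


def time_call(fn, repeats=3):
    """Return the wall-clock duration (seconds) of each of `repeats` calls."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def to_markdown(results):
    """Render {stage_name: [durations]} as a Markdown table."""
    lines = ["| Stage | Mean (s) | Std (s) | Runs |", "|---|---|---|---|"]
    for stage, durations in results.items():
        # stdev needs at least two samples; report 0.0 for a single run.
        spread = stdev(durations) if len(durations) > 1 else 0.0
        lines.append(
            f"| {stage} | {mean(durations):.3f} | {spread:.3f} | {len(durations)} |"
        )
    return "\n".join(lines)


def placeholder_workload():
    # Hypothetical stand-in; a real benchmark would fit the model here.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    # Printing to stdout lets the workflow append the table to the summary.
    print(to_markdown({"placeholder": time_call(placeholder_workload)}))
```

Because the table goes to stdout, any progress bars or logging in such a script would need to go to stderr to keep the summary clean.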
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
@@ -29,8 +29,8 @@ jobs:

      - name: Install Dependencies
        run: |
          pip install --upgrade pip
          pip install -e .[test]
          python -m pip install --upgrade pip
          python -m pip install -e .[test]

      - name: Set Fast Mode Flag
        id: set-flag
1 change: 1 addition & 0 deletions .gitignore
@@ -1,5 +1,6 @@
src/pyRATS/__pycache__/
src/pyRATS.egg-info
tests/data
.vscode
.DS_Store
__pycache__
11 changes: 10 additions & 1 deletion README.md
@@ -33,7 +33,7 @@ import datasets, vis
X, labels, _ = datasets.Datasets().kleinbottle4d(n=5000)

# create a RATS object projecting the data to 2d while tearing the manifold
model = rats.RATS(d=2, k=28, eta_min=5, to_tear=True)
model = rats.RATS(n_components=2, n_neighbors=28, min_cluster_size=5, tear=True)
y = model.fit_transform(X)

# compute the gluing instructions along the tear
@@ -61,6 +61,15 @@ python -m http.server 8000
```
and opening http://localhost:8000 in your browser.

# Memory Management
`pyRATS` implements dynamic chunking to prevent memory spikes on large or high-dimensional datasets. By default, it attempts to use up to 75% of available system RAM; if available RAM cannot be measured, it falls back to a 4GB limit. To set the limit explicitly, set the `PYRATS_MEMORY_LIMIT` environment variable (in bytes):

```bash
# Limit pyRATS memory usage to 40 GiB (40 * 1024^3 bytes)
export PYRATS_MEMORY_LIMIT=42949672960
python your_script.py
```
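
The fallback order described above (explicit environment variable, then 75% of available RAM, then 4GB) could be sketched as follows. This is an illustration, not the pyRATS implementation: the function names, the `os.sysconf` probe (which only works on platforms exposing `SC_AVPHYS_PAGES`, such as Linux), the reading of 4GB as `4 * 1024**3` bytes, and the row-chunking helper are all assumptions.

```python
import os

FALLBACK_BYTES = 4 * 1024**3  # assumed reading of the "4GB" fallback
RAM_FRACTION = 0.75           # default share of available RAM


def probe_available_ram():
    """Best-effort probe of available RAM in bytes; None if unsupported."""
    try:
        pages = os.sysconf("SC_AVPHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # No os.sysconf (Windows) or an unknown name (e.g. macOS).
        return None
    if pages <= 0 or page_size <= 0:
        return None
    return pages * page_size


def resolve_memory_limit(env=None, probe=probe_available_ram):
    """Return the memory budget in bytes, following the documented order."""
    env = os.environ if env is None else env
    raw = env.get("PYRATS_MEMORY_LIMIT")
    if raw is not None:
        return int(raw)  # an explicit setting wins
    available = probe()
    if available is None or available <= 0:
        return FALLBACK_BYTES  # measurement failed
    return int(available * RAM_FRACTION)


def rows_per_chunk(limit_bytes, n_cols, itemsize=8):
    """Rows of an (n_rows x n_cols) float64 block that fit in the budget."""
    return max(1, limit_bytes // (n_cols * itemsize))


if __name__ == "__main__":
    limit = resolve_memory_limit()
    print(f"limit: {limit} bytes, rows per chunk at 5000 columns: "
          f"{rows_per_chunk(limit, 5000)}")
```

Passing the probe in as an argument keeps the fallback path testable without depending on the host machine.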


Citation
----------
Empty file added examples/__init__.py
Empty file.
16 changes: 8 additions & 8 deletions examples/barbell_example.ipynb
@@ -58,14 +58,14 @@
"metadata": {},
"outputs": [],
"source": [
"buml_obj = rats.RATS(\n",
" d=2,\n",
" k=28,\n",
" eta_min=5,\n",
" to_tear=True,\n",
"model = rats.RATS(\n",
" n_components=2,\n",
" n_neighbors=28,\n",
" min_cluster_size=5,\n",
" tear=True,\n",
")\n",
"\n",
"y = buml_obj.fit_transform(X=X)\n",
"y = model.fit_transform(X=X)\n",
"y"
]
},
@@ -85,7 +85,7 @@
"outputs": [],
"source": [
"tear_color_eig_inds = [0,1,2] # interior has green color so assigned smallest value of i to g-channel\n",
"color_of_pts_on_tear = buml_obj.compute_color_of_pts_on_tear(\n",
"color_of_pts_on_tear = model.compute_color_of_pts_on_tear(\n",
" y,\n",
" tear_color_eig_inds\n",
")\n",
@@ -129,4 +129,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
20 changes: 10 additions & 10 deletions examples/curvedtorus_example.ipynb
@@ -58,17 +58,17 @@
"metadata": {},
"outputs": [],
"source": [
"buml_obj = rats.RATS(\n",
" d=2,\n",
" k=21,\n",
" eta_min=5,\n",
"model = rats.RATS(\n",
" n_components=2,\n",
" n_neighbors=21,\n",
" min_cluster_size=5,\n",
" nu=3,\n",
" max_iter=40,\n",
" to_tear=True,\n",
" cost_fn_name='distortion'\n",
" n_iter=40,\n",
" tear=True,\n",
" cost_function='distortion'\n",
")\n",
"\n",
"y = buml_obj.fit_transform(X=X)\n",
"y = model.fit_transform(X=X)\n",
"y"
]
},
@@ -88,7 +88,7 @@
"outputs": [],
"source": [
"tear_color_eig_inds = [6, 1, 4] # interior has green color so assigned smallest value of i to g-channel\n",
"color_of_pts_on_tear = buml_obj.compute_color_of_pts_on_tear(\n",
"color_of_pts_on_tear = model.compute_color_of_pts_on_tear(\n",
" y,\n",
" tear_color_eig_inds\n",
")\n",
@@ -132,4 +132,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
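
The notebook diffs rename the constructor arguments to scikit-learn-style names (`d` to `n_components`, `k` to `n_neighbors`, `eta_min` to `min_cluster_size`, `to_tear` to `tear`, `max_iter` to `n_iter`, `cost_fn_name` to `cost_function`). For code still written against the old names, a small translation shim is one way to migrate. The sketch below is hypothetical: neither `translate_legacy_kwargs` nor the deprecation warning appears in the PR, and it only encodes the six renames visible in these diffs.

```python
import warnings

# Old keyword -> new keyword, as seen in the notebook diffs.
LEGACY_TO_NEW = {
    "d": "n_components",
    "k": "n_neighbors",
    "eta_min": "min_cluster_size",
    "to_tear": "tear",
    "max_iter": "n_iter",
    "cost_fn_name": "cost_function",
}


def translate_legacy_kwargs(kwargs):
    """Return a copy of kwargs with legacy names mapped to the new ones."""
    translated = {}
    for name, value in kwargs.items():
        new_name = LEGACY_TO_NEW.get(name, name)
        if new_name in translated:
            # e.g. both d=2 and n_components=2 were supplied
            raise TypeError(f"{new_name!r} given more than once")
        if new_name != name:
            warnings.warn(
                f"{name!r} is renamed to {new_name!r}",
                DeprecationWarning,
                stacklevel=2,
            )
        translated[new_name] = value
    return translated
```

With such a helper, an old call site could be migrated as `rats.RATS(**translate_legacy_kwargs(old_kwargs))`, assuming the constructor accepts exactly the new names shown in the diffs. Unrenamed arguments such as `nu` pass through unchanged.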
50 changes: 32 additions & 18 deletions examples/kleinbottle_example.ipynb
@@ -7,7 +7,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install git+https://github.com/Mishne-Lab/pyRATS\n",
"# %pip install git+https://github.com/Mishne-Lab/pyRATS\n",
"# if not already installed on the system:\n",
"# %pip install pandas imageio seaborn matplotlib"
]
@@ -19,9 +19,12 @@
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"sys.path.insert(0, '../src/')\n",
"\n",
"from matplotlib import pyplot as plt\n",
"\n",
"from pyRATS import rats\n",
"from pyRATS import rats, global_distortion\n",
"import datasets, vis"
]
},
@@ -58,17 +61,36 @@
"metadata": {},
"outputs": [],
"source": [
"buml_obj = rats.RATS(\n",
" d=2,\n",
" k=28,\n",
" eta_min=5,\n",
" to_tear=True,\n",
"model = rats.RATS(\n",
" n_components=2,\n",
" n_neighbors=28,\n",
" min_cluster_size=5,\n",
" tear=True,\n",
")\n",
"\n",
"y = buml_obj.fit_transform(X=X)\n",
"y = model.fit_transform(X=X)\n",
"y"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0fdf51f9",
"metadata": {},
"outputs": [],
"source": [
"n_nbrs = global_distortion.get_n_nbrs_for_distortion_analysis(X)\n",
"\n",
"global_distortion.compute_global_distortion(\n",
" X,\n",
" y,\n",
" n_nbrs,\n",
" model,\n",
" metric=\"euclidean\",\n",
" max_crossings=20,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -77,7 +99,7 @@
"outputs": [],
"source": [
"tear_color_eig_inds = [7, 2, 4]\n",
"color_of_pts_on_tear = buml_obj.compute_color_of_pts_on_tear(\n",
"color_of_pts_on_tear = model.compute_color_of_pts_on_tear(\n",
" y,\n",
" tear_color_eig_inds, \n",
")\n",
@@ -90,14 +112,6 @@
")\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c9b56299",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -121,4 +135,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
18 changes: 9 additions & 9 deletions examples/small_noisyswissroll3_example.ipynb
@@ -58,15 +58,15 @@
"metadata": {},
"outputs": [],
"source": [
"buml_obj = rats.RATS(\n",
" d=2,\n",
" k=28,\n",
" eta_min=5,\n",
" to_tear=True,\n",
" cost_fn_name='alignment'\n",
"model = rats.RATS(\n",
" n_components=2,\n",
" n_neighbors=28,\n",
" min_cluster_size=5,\n",
" tear=True,\n",
" cost_function='alignment'\n",
")\n",
"\n",
"y = buml_obj.fit_transform(X=X)\n",
"y = model.fit_transform(X=X)\n",
"y"
]
},
@@ -78,7 +78,7 @@
"outputs": [],
"source": [
"tear_color_eig_inds = [0, 1, 2] # interior has green color so assigned smallest value of i to g-channel\n",
"color_of_pts_on_tear = buml_obj.compute_color_of_pts_on_tear(\n",
"color_of_pts_on_tear = model.compute_color_of_pts_on_tear(\n",
" y,\n",
" tear_color_eig_inds, \n",
")\n",
@@ -122,4 +122,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
16 changes: 8 additions & 8 deletions examples/small_spherewithhole_example.ipynb
@@ -58,14 +58,14 @@
"metadata": {},
"outputs": [],
"source": [
"buml_obj = rats.RATS(\n",
" d=2,\n",
" k=28,\n",
" eta_min=5,\n",
" to_tear=True,\n",
"model = rats.RATS(\n",
" n_components=2,\n",
" n_neighbors=28,\n",
" min_cluster_size=5,\n",
" tear=True,\n",
")\n",
"\n",
"y = buml_obj.fit_transform(X=X)\n",
"y = model.fit_transform(X=X)\n",
"y"
]
},
@@ -87,7 +87,7 @@
"figsize = (3,3)\n",
"\n",
"tear_color_eig_inds = [7, 1, 2]\n",
"color_of_pts_on_tear = buml_obj.compute_color_of_pts_on_tear(\n",
"color_of_pts_on_tear = model.compute_color_of_pts_on_tear(\n",
" y,\n",
" tear_color_eig_inds, \n",
")\n",
@@ -131,4 +131,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
16 changes: 8 additions & 8 deletions examples/small_swissrollwithhole.ipynb
@@ -58,14 +58,14 @@
"metadata": {},
"outputs": [],
"source": [
"buml_obj = rats.RATS(\n",
" d=2,\n",
" k=28,\n",
" eta_min=5,\n",
" to_tear=True,\n",
"model = rats.RATS(\n",
" n_components=2,\n",
" n_neighbors=28,\n",
" min_cluster_size=5,\n",
" tear=True,\n",
")\n",
"\n",
"y = buml_obj.fit_transform(X=X)\n",
"y = model.fit_transform(X=X)\n",
"y"
]
},
@@ -85,7 +85,7 @@
"outputs": [],
"source": [
"tear_color_eig_inds = [0, 1, 2] # interior has green color so assigned smallest value of i to g-channel\n",
"color_of_pts_on_tear = buml_obj.compute_color_of_pts_on_tear(\n",
"color_of_pts_on_tear = model.compute_color_of_pts_on_tear(\n",
" y,\n",
" tear_color_eig_inds, \n",
")\n",
@@ -129,4 +129,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}