Distributed inoperable in Docker Hub version #65

@dimalvovs

The Docker Hub version of pycogaps fails when run in distributed mode. A newer Docker image is available in GHCR, so it would make sense to maintain only one of them.

Steps to reproduce:

  1. Pull the image: docker pull fertiglab/pycogaps
  2. Run the image: docker run -it --entrypoint /bin/bash fertiglab/pycogaps
  3. Verify that the standard (non-distributed) version works:
echo "if __name__ == '__main__':
    from PyCoGAPS.parameters import *
    from PyCoGAPS.pycogaps_main import CoGAPS
    import scanpy as sc

    modsimpath = 'data/ModSimData.txt'
    modsim = sc.read_text(modsimpath)

    params = CoParams(path=modsimpath)
    params.printParams()

    setParams(params, {
        'nIterations':10000,
        'seed': 42,
        'nPatterns': 3
    })

    params.printParams()
    start = time.time()
    result = CoGAPS(modsimpath, params)
    end = time.time()
    print('TIME:', end - start)

    result.write('data/dist_modsim.h5ad')" > test2.py

python3 test2.py 

______      _____       _____   ___  ______  _____ 
| ___ \    /  __ \     |  __ \ / _ \ | ___ \/  ___|
| |_/ /   _| /  \/ ___ | |  \// /_\ \| |_/ /\ `--. 
|  __/ | | | |    / _ \| | __ |  _  ||  __/  `--. |
| |  | |_| | \__/\ (_) | |_\ \| | | || |    /\__/ /
\_|   \__, |\____/\___/ \____/\_| |_/\_|    \____/ 
       __/ |                                       
      |___/             
                                 
                    

-- Standard Parameters --
nPatterns:  3
nIterations:  1000
seed:  0
sparseOptimization:  False


-- Sparsity Parameters --
alpha: 0.01
maxGibbsMass:  100.0

setting distributed parameters - call this again if you change nPatterns
if you wish to perform genome-wide distributed cogaps, please run setParams(params, "distributed", "genome-wide")

-- Standard Parameters --
nPatterns:  3
nIterations:  10000
seed:  42
sparseOptimization:  False


-- Sparsity Parameters --
alpha: 0.01
maxGibbsMass:  100.0

This is pycogaps version  0.0.1
Running Standard CoGAPS on ModSimData.txt ( 25 genes and 20 samples) with parameters: 

-- Standard Parameters --
nPatterns:  3
nIterations:  10000
seed:  42
sparseOptimization:  False


-- Sparsity Parameters --
alpha: 0.01
maxGibbsMass:  100.0

Data Model: Dense, Normal
Sampler Type: Sequential
Loading Data...Done! (00:00:00)
-- Equilibration Phase --
1000 of 10000, Atoms: 64(A), 45(P), ChiSq: 1830, Time: 00:00:00 / 00:00:00
2000 of 10000, Atoms: 69(A), 42(P), ChiSq: 1466, Time: 00:00:00 / 00:00:00
3000 of 10000, Atoms: 80(A), 50(P), ChiSq: 1229, Time: 00:00:00 / 00:00:00
4000 of 10000, Atoms: 73(A), 54(P), ChiSq: 1212, Time: 00:00:00 / 00:00:00
5000 of 10000, Atoms: 86(A), 52(P), ChiSq: 1151, Time: 00:00:00 / 00:00:00
6000 of 10000, Atoms: 81(A), 52(P), ChiSq: 1151, Time: 00:00:00 / 00:00:00
7000 of 10000, Atoms: 75(A), 48(P), ChiSq: 1178, Time: 00:00:00 / 00:00:00
8000 of 10000, Atoms: 70(A), 57(P), ChiSq: 1155, Time: 00:00:00 / 00:00:00
9000 of 10000, Atoms: 73(A), 54(P), ChiSq: 1173, Time: 00:00:00 / 00:00:00
10000 of 10000, Atoms: 79(A), 58(P), ChiSq: 1159, Time: 00:00:00 / 00:00:00
-- Sampling Phase --
1000 of 10000, Atoms: 74(A), 51(P), ChiSq: 1125, Time: 00:00:00 / 00:00:00
2000 of 10000, Atoms: 78(A), 56(P), ChiSq: 1161, Time: 00:00:00 / 00:00:00
3000 of 10000, Atoms: 79(A), 57(P), ChiSq: 1166, Time: 00:00:00 / 00:00:00
4000 of 10000, Atoms: 69(A), 55(P), ChiSq: 1176, Time: 00:00:00 / 00:00:00
5000 of 10000, Atoms: 80(A), 55(P), ChiSq: 1175, Time: 00:00:00 / 00:00:00
6000 of 10000, Atoms: 81(A), 48(P), ChiSq: 1168, Time: 00:00:00 / 00:00:00
7000 of 10000, Atoms: 73(A), 56(P), ChiSq: 1151, Time: 00:00:00 / 00:00:00
8000 of 10000, Atoms: 72(A), 51(P), ChiSq: 1156, Time: 00:00:00 / 00:00:00
9000 of 10000, Atoms: 75(A), 60(P), ChiSq: 1155, Time: 00:00:00 / 00:00:00
10000 of 10000, Atoms: 80(A), 50(P), ChiSq: 1179, Time: 00:00:00 / 00:00:00

GapsResult result object with 25 features and 20 samples
3 patterns were learned

TIME: 0.9086663722991943
  4. Create a test script with the distributed configuration:
echo "if __name__ == '__main__':
    from PyCoGAPS.parameters import *
    from PyCoGAPS.pycogaps_main import CoGAPS
    import scanpy as sc

    modsimpath = 'data/ModSimData.txt'
    modsim = sc.read_text(modsimpath)

    params = CoParams(path=modsimpath)
    params.printParams()

    setParams(params, {
        'nIterations':10000,
        'seed': 42,
        'nPatterns': 3,
        'useSparseOptimization': True,
        'distributed': 'genome-wide'
    })

    params.setDistributedParams(nSets=2)
    params.printParams()
    start = time.time()
    result = CoGAPS(modsimpath, params)
    end = time.time()
    print('TIME:', end - start)

    result.write('data/dist_modsim.h5ad')" > test.py
  5. Run the program: python3 test.py
  6. The observed output contains an error:
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 313, in callInternalCoGAPS
    gapsresult = standardCoGAPS(adata, params, uncertainty, transposeData=params.coparams["transposeData"])
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 166, in standardCoGAPS
    result = GapsResultToAnnData(gapsresultobj, adata, prm)
  File "/home/user/pycogaps-docker/PyCoGAPS/helper_functions.py", line 434, in GapsResultToAnnData
    Pmean = toNumpy(gapsresult.Pmean)[prm.coparams["subsetIndices"], :]
IndexError: index 22 is out of bounds for axis 0 with size 20
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 23, in <module>
    result = CoGAPS(modsimpath, params)
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 44, in CoGAPS
    result = distributedCoGAPS(path, params, uncertainty=None)
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 197, in distributedCoGAPS
    result = list(result)
  File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
IndexError: index 22 is out of bounds for axis 0 with size 20
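
One reading consistent with the numbers in the traceback, offered as an assumption rather than a confirmed diagnosis: the run has 25 genes and 20 samples, so an index of 22 fits the gene axis but not an array with 20 rows. The sketch below uses plain NumPy to show that kind of mismatch. `p_mean` is only a stand-in for `toNumpy(gapsresult.Pmean)`, the values in `subset_indices` are made up, and `select_rows` is a hypothetical helper that is not part of PyCoGAPS.

```python
import numpy as np

# Shapes taken from the log above: 25 genes, 20 samples, 3 patterns.
n_genes, n_samples, n_patterns = 25, 20, 3

# Stand-in for toNumpy(gapsresult.Pmean); assumed to have one row per sample.
p_mean = np.zeros((n_samples, n_patterns))

# Made-up genome-wide subset: indices drawn from the gene axis (0..24).
subset_indices = np.array([1, 5, 22])


def select_rows(matrix, indices):
    """Hypothetical helper: select rows, failing early if an index is too large."""
    indices = np.asarray(indices)
    if indices.size and indices.max() >= matrix.shape[0]:
        raise ValueError(
            f"subset index {indices.max()} exceeds axis 0 of size {matrix.shape[0]}"
        )
    return matrix[indices, :]


# Applying gene-axis indices to the sample-sized array raises IndexError.
try:
    p_mean[subset_indices, :]
except IndexError as err:
    message = str(err)
    print(message)
```

A check like `select_rows` would not fix the underlying problem, but it would surface the mismatch with a message that names the offending index and the axis size before the result is sliced.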

Labels: bug (Something isn't working)
