Implement custom caching strategy #1

@polyrand

I'm not very happy with the caching strategies that fsspec provides. fsspec optimizes for certain memory access patterns that, I believe, are not always optimal for a DB file.

The mmap caching caught my attention, but it was very slow in the tests I've run. I think the mmap logic can be optimized.

I think I should implement my own caching strategy. Some ideas:

  • Least-Frequently-Used strategy: This could be useful for DB files that can't/shouldn't be fully copied to disk.
  • Incremental full mmap: an mmap-ed local file that is filled in as pages are fetched and eventually holds the full database. (Have an mmap-ed bitset to store which pages have already been fetched?)
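
To make the Least-Frequently-Used idea concrete, here is a minimal in-memory sketch. It is not tied to fsspec's caching API: `LFUPageCache`, `fetch` and `capacity` are hypothetical names, and `fetch` stands in for whatever reads one page from remote storage.

```python
from collections import Counter
from typing import Callable, Dict


class LFUPageCache:
    """Keep at most `capacity` pages and evict the least-frequently-used one."""

    def __init__(self, fetch: Callable[[int], bytes], capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be >= 1")
        self._fetch = fetch  # hypothetical callback: page number -> page bytes
        self._capacity = capacity
        self._pages: Dict[int, bytes] = {}
        self._hits: Counter = Counter()

    def get(self, page_no: int) -> bytes:
        if page_no not in self._pages:
            if len(self._pages) >= self._capacity:
                # Linear scan for the page with the fewest accesses.
                # Ties go to the page that was inserted first.
                victim = min(self._pages, key=lambda p: self._hits[p])
                del self._pages[victim]
                del self._hits[victim]
            self._pages[page_no] = self._fetch(page_no)
        self._hits[page_no] += 1
        return self._pages[page_no]
```

The eviction scan is O(n) in the number of cached pages, which is probably fine for a first experiment. This version lives in a single process, so sharing it between processes would need a different backing store.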

Both approaches would need to enable sharing a cache between multiple processes in the same VM.
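
A rough sketch of the incremental mmap idea, using two mmap-ed files: one for the page data and one for the bitset of fetched pages. Everything here is an assumption for illustration: the class name, the `.bitset` sidecar file, the 4096-byte `PAGE_SIZE`, and the `fetch` callback that must return exactly one page of bytes.

```python
import mmap
import os
from typing import Callable

PAGE_SIZE = 4096  # assumed page size; a real DB file may use a different one


class IncrementalMmapCache:
    """Local mmap-ed copy of a remote file, filled in one page at a time."""

    def __init__(self, path: str, n_pages: int, fetch: Callable[[int], bytes]) -> None:
        if n_pages < 1:
            raise ValueError("n_pages must be >= 1")
        self._fetch = fetch  # hypothetical callback: page number -> PAGE_SIZE bytes
        self._n_pages = n_pages
        self._data = self._map(path, n_pages * PAGE_SIZE)
        # One bit per page, stored in a sidecar file next to the data file.
        self._bits = self._map(path + ".bitset", (n_pages + 7) // 8)

    @staticmethod
    def _map(path: str, size: int) -> mmap.mmap:
        # Create (or reuse) a file of the right size and map it read/write.
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.ftruncate(fd, size)
            return mmap.mmap(fd, size)
        finally:
            os.close(fd)

    def _is_cached(self, page_no: int) -> bool:
        return bool(self._bits[page_no // 8] & (1 << (page_no % 8)))

    def read_page(self, page_no: int) -> bytes:
        if not 0 <= page_no < self._n_pages:
            raise IndexError(page_no)
        start = page_no * PAGE_SIZE
        if not self._is_cached(page_no):
            # Write the data first, then set the bit, so a reader that sees
            # the bit set should also find the page contents in place.
            self._data[start:start + PAGE_SIZE] = self._fetch(page_no)
            self._bits[page_no // 8] |= 1 << (page_no % 8)
        return self._data[start:start + PAGE_SIZE]

    def close(self) -> None:
        self._data.close()
        self._bits.close()
```

Because both files are ordinary file-backed mappings, several processes on the same machine can open the same paths and see each other's pages. The sketch does no locking, though. Two processes may fetch the same page at the same time, which is only harmless if the remote file does not change, and two processes updating bits in the same byte can race, so a real implementation would need some coordination there.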
