Blog

Introducing BunchStore: A Bunch with Persistency

The BunchStore class is a simple key-value store that mirrors a Bunch object in memory and on the filesystem. Its dictionary-like interface makes it convenient for caching API responses, and because every stored value is persisted to a file, the data remains available across sessions.
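The caching use case mentioned above can be sketched with a small helper that works on any dictionary-like object. This is an illustration only: `cached_fetch` and `fake_api_call` are hypothetical names, not part of MLtraq, and the demonstration uses a plain `dict` so that it runs without any dependencies. Passing a BunchStore instead should behave the same way, assuming it supports `in` membership tests like a regular dictionary.

```python
from typing import Any, Callable, MutableMapping


def cached_fetch(
    store: MutableMapping[str, Any],
    key: str,
    fetch: Callable[[], Any],
) -> Any:
    """Return store[key], calling fetch() only on a cache miss."""
    if key not in store:
        # Cache miss: perform the expensive call once and remember the result
        store[key] = fetch()
    return store[key]


# fake_api_call is a hypothetical stand-in for a real network request;
# `calls` records how many times it actually runs.
calls = []


def fake_api_call() -> dict:
    calls.append(1)
    return {"status": "ok", "items": [1, 2, 3]}


cache: dict = {}  # a plain dict here; any dictionary-like store would do
first = cached_fetch(cache, "GET /items", fake_api_call)
second = cached_fetch(cache, "GET /items", fake_api_call)
print(first == second, len(calls))
```

With a persistent store in place of the plain dictionary, the second lookup could be served from the file even after the process restarts, which is the point of combining a dictionary-like interface with file-based persistence.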

Tip

The DATAPAK storage format used for serialization supports a wide range of complex data types. You can store text, images, dictionaries, lists, sets, arrays, dataframes, and more.

BunchStore example

import numpy as np

from mltraq.opts import options
from mltraq.storage.serialization import deserialize
from mltraq.utils.bunch import BunchStore
from mltraq.utils.fs import tmpdir_ctx

with tmpdir_ctx():

    # Default location of BunchStore on filesystem
    print("Pathname:", options().get("bunchstore.pathname"))

    # Initialize object, creating file
    bs = BunchStore()

    # Set two keys
    bs["A"] = 123
    bs.B = np.array([4, 5, 6])

    # Reinitialize the object, reloading the file
    bs = BunchStore()
    bs["C"] = 789

    # Access a previously stored value
    print("bs.A:", bs.A)

    # Read the file back and deserialize it to inspect its contents
    with open(options().get("bunchstore.pathname"), "rb") as f:
        data = deserialize(f.read())
    print(f"File contents: {data}")
Output
Pathname: bunchstore.data
bs.A: 123
File contents: {'A': 123, 'B': array([4, 5, 6]), 'C': 789}
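The behaviour visible in this output, where a fresh instance sees keys written by an earlier one, can be imitated with a few lines of standard-library Python. The sketch below is not how MLtraq implements BunchStore: `TinyStore` is a hypothetical class, it uses `pickle` instead of the DATAPAK format, and it omits attribute-style access such as `bs.B`. It only illustrates the general idea of reloading a file on initialization and rewriting it on every assignment.

```python
import os
import pickle
import tempfile


class TinyStore(dict):
    """Hypothetical stand-in: a dict that rewrites a pickle file on every set."""

    def __init__(self, pathname: str):
        super().__init__()
        self._pathname = pathname
        if os.path.exists(pathname):
            # Reload whatever a previous instance wrote to the file
            with open(pathname, "rb") as f:
                super().update(pickle.load(f))

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        # Persist the whole mapping after each change
        with open(self._pathname, "wb") as f:
            pickle.dump(dict(self), f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "store.pkl")

    s1 = TinyStore(path)  # creates an empty store
    s1["A"] = 123

    s2 = TinyStore(path)  # reloads the file written by s1
    s2["C"] = 789

    reloaded = dict(TinyStore(path))

print(reloaded)
```

Rewriting the full file on each assignment keeps the sketch short; it would be wasteful for large stores, and nothing here should be read as a statement about how BunchStore handles writes.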