Sometimes, serializing the contents of an entire directory as a bytes field in your experiment is a convenient way to share code and other small files across different environments.
The Archive interface simplifies the creation and extraction of in-memory TAR archives. The example below demonstrates how to archive a src directory and extract it to src_archived.
Warning
Anything below 100 MB can easily fit in a field as a binary blob with Archive.
For larger archives, we recommend relying on the DataStore interface to persist and move them.
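One way to act on this guideline is to measure a directory before archiving it. The sketch below uses only the Python standard library and is not part of MLtraq: the names `MAX_BLOB_BYTES`, `dir_size_bytes`, and `fits_in_field` are hypothetical, and reading "100 MB" as 100 * 1024 * 1024 bytes is an assumption. A TAR archive adds headers and padding, so the real blob is slightly larger than the sum of the file sizes.

```python
import os
import tempfile

# Assumption: "100 MB" is read as 100 * 1024 * 1024 bytes (illustrative constant).
MAX_BLOB_BYTES = 100 * 1024 * 1024


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files below `path`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def fits_in_field(path: str, limit: int = MAX_BLOB_BYTES) -> bool:
    """Return True if the directory contents are below `limit` bytes."""
    return dir_size_bytes(path) < limit


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "simple_print.py"), "w") as f:
            f.write("print(1 + 2)\n")
        print(dir_size_bytes(tmp), fits_in_field(tmp))
```

If the check fails, the directory is a candidate for the DataStore interface rather than an Archive field.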
Archive example
from os import mkdir

from mltraq import create_session
from mltraq.storage.archivestore import Archive
from mltraq.utils.fs import glob, tmpdir_ctx

with tmpdir_ctx():
    # Work in a temporary directory

    # Create a directory with a file
    mkdir("src")
    with open("src/simple_print.py", "w") as f:
        f.write("print(1 + 2)\n")

    # Create an experiment
    s = create_session()
    e = s.create_experiment("test")

    # Create the archive
    e.fields.src = Archive.create(src_dir="src", arc_dir="src_archived")

    # Persist the experiment, including the binary TAR blob
    e.persist()

    # Load the experiment
    e = s.load_experiment("test")

    # Extract the contents of the archive
    e.fields.src.extract()

    # Print contents of current directory
    print("Contents of current directory:")
    for idx, name in enumerate(glob("**", root_dir=".", recursive=True)):
        print(f"[{idx:2d}] {name}")
Output
Contents of current directory:
[ 0] src_archived
[ 1] src_archived/simple_print.py
[ 2] src
[ 3] src/simple_print.py
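To illustrate what an in-memory TAR archive involves, here is a minimal sketch built on the standard library's `tarfile` and `io.BytesIO`. It is not MLtraq's implementation: the helpers `create_archive`, `extract_archive`, and `demo` are hypothetical names, and the way `arc_dir` renames the top-level directory inside the archive is an assumption modeled on the output shown above.

```python
import io
import os
import tarfile
import tempfile


def create_archive(src_dir: str, arc_dir: str) -> bytes:
    """Pack `src_dir` into an in-memory TAR, storing its contents under `arc_dir`."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        # arcname replaces the on-disk directory name inside the archive
        tar.add(src_dir, arcname=arc_dir)
    return buffer.getvalue()


def extract_archive(blob: bytes, target_dir: str = ".") -> None:
    """Unpack a TAR blob produced by create_archive into `target_dir`."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        tar.extractall(path=target_dir)


def demo() -> list:
    """Round-trip a small directory and return the extracted relative file paths."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.mkdir(src)
        with open(os.path.join(src, "simple_print.py"), "w") as f:
            f.write("print(1 + 2)\n")

        # The blob is plain bytes, so it can be stored wherever bytes are accepted
        blob = create_archive(src, "src_archived")

        out = os.path.join(tmp, "out")
        os.mkdir(out)
        extract_archive(blob, out)

        found = []
        for root, _dirs, files in os.walk(out):
            for name in files:
                found.append(os.path.relpath(os.path.join(root, name), out))
        return sorted(found)


if __name__ == "__main__":
    print(demo())
```

Because the archive never touches disk until extraction, the resulting bytes can travel with the experiment like any other field value.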