Reading and Writing

Reading and writing AnnData objects.

Reading

Use anndata.io.read_h5ad to load a previously saved .h5ad file, or val.datasets.polis.load to import directly from a Polis conversation.

Writing

valency_anndata.write

write(
    filename: Path | str,
    adata: AnnData,
    *,
    include: Sequence[str] | None = None,
    ext: Literal["h5", "csv", "txt", "npz"] | None = None,
    compression: Literal["gzip", "lzf"] | None = "gzip",
    compression_opts: int | None = None,
) -> None

Write an AnnData object to file with automatic sanitization.

Wraps scanpy.write but first copies and sanitizes adata so that problematic fields (mixed-type uns["statements"] columns, categorical kmeans_* labels with NA) do not cause serialization errors.
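
To make the two problem cases concrete, here is a minimal sketch of how such sanitization could look for a single pandas DataFrame (for example `adata.obs` or `uns["statements"]`). It is an illustration under assumptions, not the library's `_sanitize_for_export`: the helper name `sanitize_frame`, the choice to cast mixed-type columns to strings, and the `"NA"` placeholder label are all hypothetical.

```python
import pandas as pd


def sanitize_frame(df: pd.DataFrame, na_label: str = "NA") -> pd.DataFrame:
    """Return a copy of `df` whose columns are easier to serialize.

    Hypothetical helper, NOT the library's implementation.
    """
    out = df.copy()  # never touch the caller's frame
    for col in out.columns:
        s = out[col]
        if isinstance(s.dtype, pd.CategoricalDtype):
            # Assumption: missing labels become an explicit string category.
            if s.isna().any():
                filled = s.astype(object).where(s.notna(), na_label)
                out[col] = filled.astype(str).astype("category")
        elif s.dtype == object:
            # Assumption: a column holding several Python types is cast to str.
            kinds = {type(v) for v in s.dropna()}
            if len(kinds) > 1:
                out[col] = s.astype(str)
    return out


frame = pd.DataFrame(
    {
        "txt": [1, "two", 3.0],                       # mixed-type column
        "kmeans_2": pd.Categorical([0, 1, None]),     # categorical with NA
    }
)
clean = sanitize_frame(frame)
```

Because the helper works on a copy, the input frame is left unchanged, which mirrors the "not mutated" guarantee described for `adata`.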

Parameters:

filename (Path | str, required)
    Output path. If the filename has no file extension, it is interpreted the same way as in scanpy.write.

adata (AnnData, required)
    Annotated data matrix. Not mutated; a sanitized copy is written.

include (Sequence[str] | None, default None)
    When not None, only the listed "namespace/key" paths are kept in the written file. Glob patterns are supported (e.g. "obsm/X_*", "obs/kmeans_*"). Valid namespaces are obs, var, obsm, varm, layers, uns, obsp, and varp. X and raw are always retained.

ext (Literal['h5', 'csv', 'txt', 'npz'] | None, default None)
    File extension from which to infer file format.

compression (Literal['gzip', 'lzf'] | None, default 'gzip')
    See the h5py dataset docs: https://docs.h5py.org/en/latest/high/dataset.html

compression_opts (int | None, default None)
    See the h5py dataset docs: https://docs.h5py.org/en/latest/high/dataset.html

Examples:

Basic — write everything:

val.write("conversation.h5ad", adata)

Advanced — selectively include keys with glob patterns:

val.write(
    "export.h5ad",
    adata,
    include=["obsm/X_pca", "obsm/X_pacmap", "obs/kmeans_*", "uns/*"],
)
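
How the `include` patterns in the example above select keys can be sketched with the standard library's `fnmatch` module. This is only an approximation under assumptions: the helper `select_paths` is hypothetical, the list of available paths is made up for illustration, and the library's internal `_filter_adata` may match patterns differently (for instance, it may allow globs in the namespace part, which this sketch rejects).

```python
from fnmatch import fnmatchcase

# Namespaces listed in the `include` parameter description.
VALID_NAMESPACES = ("obs", "var", "obsm", "varm", "layers", "uns", "obsp", "varp")


def select_paths(available: list[str], include: list[str]) -> list[str]:
    """Return the "namespace/key" paths matched by at least one pattern.

    Hypothetical helper for illustration only.
    """
    for pattern in include:
        namespace, sep, _key = pattern.partition("/")
        # Assumption: the namespace must be spelled out literally.
        if not sep or namespace not in VALID_NAMESPACES:
            raise ValueError(f"invalid include pattern: {pattern!r}")
    # Case-sensitive glob match; order of `available` is preserved.
    return [p for p in available if any(fnmatchcase(p, pat) for pat in include)]


# Made-up inventory of what an AnnData object might contain.
available = [
    "obsm/X_pca",
    "obsm/X_pacmap",
    "obsm/X_umap",
    "obs/kmeans_2",
    "obs/group",
    "uns/statements",
]
kept = select_paths(
    available,
    ["obsm/X_pca", "obsm/X_pacmap", "obs/kmeans_*", "uns/*"],
)
```

In this sketch `obsm/X_umap` and `obs/group` are dropped because no pattern matches them, while everything under `uns` is kept by `"uns/*"`.
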
Source code in src/valency_anndata/_write.py
def write(
    filename: Path | str,
    adata: AnnData,
    *,
    include: Sequence[str] | None = None,
    ext: Literal["h5", "csv", "txt", "npz"] | None = None,
    compression: Literal["gzip", "lzf"] | None = "gzip",
    compression_opts: int | None = None,
) -> None:
    """Write an [AnnData][anndata.AnnData] object to file with automatic sanitization.

    Wraps [scanpy.write][] but first copies and sanitizes `adata` so that
    problematic fields (mixed-type ``uns["statements"]`` columns, categorical
    ``kmeans_*`` labels with ``NA``) do not cause serialization errors.

    Parameters
    ----------
    filename
        Output path.  If the filename has no file extension it is interpreted
        the same way as [scanpy.write][].
    adata
        Annotated data matrix.  **Not** mutated — a sanitized copy is written.
    include
        When not ``None``, only the listed ``"namespace/key"`` paths are kept
        in the written file.  Glob patterns are supported
        (e.g. ``"obsm/X_*"``, ``"obs/kmeans_*"``).  Valid namespaces are
        ``obs``, ``var``, ``obsm``, ``varm``, ``layers``, ``uns``, ``obsp``,
        and ``varp``.  ``X`` and ``raw`` are always retained.
    ext
        File extension from which to infer file format.
    compression
        See `h5py dataset docs <https://docs.h5py.org/en/latest/high/dataset.html>`_.
    compression_opts
        See `h5py dataset docs <https://docs.h5py.org/en/latest/high/dataset.html>`_.

    Examples
    --------
    Basic — write everything:

    ```py
    val.write("conversation.h5ad", adata)
    ```

    Advanced — selectively include keys with glob patterns:

    ```py
    val.write(
        "export.h5ad",
        adata,
        include=["obsm/X_pca", "obsm/X_pacmap", "obs/kmeans_*", "uns/*"],
    )
    ```
    """
    sanitized = _sanitize_for_export(adata)
    if include is not None:
        _filter_adata(sanitized, include)
    sc.write(
        filename,
        sanitized,
        ext=ext,
        compression=compression,
        compression_opts=compression_opts,
    )