Data Access

Download scMORA data on demand

Full 10x Multiome datasets are distributed as paired RNA-ATAC MuData objects through Hugging Face. Use the scmora-db Python package to search metadata, download selected datasets, and load files directly into analysis workflows.

Recommended install

pip install scmora-db

Files are fetched from shiny321/genome-db only when requested, keeping local storage under user control.
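
The on-demand behavior can be pictured as a cache check that runs before any transfer. The sketch below illustrates that pattern only and is not the scmora-db implementation: `fetch_if_missing`, the `fetch` callback, and the file name `example.h5mu` are hypothetical, and the fake fetcher stands in for a real download from the Hugging Face repository.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def fetch_if_missing(
    filename: str,
    cache_dir: Path,
    fetch: Callable[[str], bytes],
) -> Path:
    """Return the cached path for `filename`, downloading it only if absent.

    `fetch` is a hypothetical callback that returns the file's bytes; it
    stands in for whatever remote transfer a real client performs.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename
    if not target.exists():
        # First request for this file: download it and store it locally.
        target.write_bytes(fetch(filename))
    # Later requests reuse the local copy and never call `fetch`.
    return target


downloads: List[str] = []  # records which files the fake fetcher was asked for


def fake_fetch(name: str) -> bytes:
    """Pretend download that returns placeholder bytes."""
    downloads.append(name)
    return b"placeholder bytes"


with tempfile.TemporaryDirectory() as tmp:
    first = fetch_if_missing("example.h5mu", Path(tmp), fake_fetch)
    second = fetch_if_missing("example.h5mu", Path(tmp), fake_fetch)
    same_path = first == second

# The second call found the cached file, so only one download was recorded.
print(downloads)
```

Because the cache directory is an explicit argument, deleting that directory is all it takes to reclaim the space.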

Search the catalog

Filter datasets by dataset ID, sample source, biological condition, or model-usage labels before downloading large files.
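
To make the filtering idea concrete, here is a minimal sketch over an in-memory list. It is not the package's search code: the `CatalogRow` class, its field names, the AND-combination of filters, and the three rows are all assumptions chosen for illustration, and the rows are placeholders rather than real catalog entries.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CatalogRow:
    """One catalog entry; the field names are assumptions for illustration."""

    dataset_id: str
    detail_source: str
    detailed_condition: str
    usage_tag: str


def filter_rows(
    rows: Iterable[CatalogRow],
    *,
    dataset_id: Optional[str] = None,
    detail_source: Optional[str] = None,
    detailed_condition: Optional[str] = None,
    usage_tag: Optional[str] = None,
) -> List[CatalogRow]:
    """Keep rows that match every filter that was supplied (logical AND)."""
    wanted = {
        "dataset_id": dataset_id,
        "detail_source": detail_source,
        "detailed_condition": detailed_condition,
        "usage_tag": usage_tag,
    }
    # Filters left as None are ignored.
    active = {field: value for field, value in wanted.items() if value is not None}
    return [
        row
        for row in rows
        if all(getattr(row, field) == value for field, value in active.items())
    ]


# Placeholder rows, not real catalog entries.
rows = [
    CatalogRow("EXAMPLE_A", "Cell line", "Control", "control"),
    CatalogRow("EXAMPLE_B", "Tissue", "Treated", "training"),
    CatalogRow("EXAMPLE_C", "Tissue", "Control", "control"),
]

matches = filter_rows(rows, usage_tag="control", detailed_condition="Control")
print([row.dataset_id for row in matches])
```

Narrowing the list this way before any download means only the IDs in `matches` would need to be fetched.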

Download selected data

Retrieve only the matched .h5mu files you need from the Hugging Face dataset repository.

Load into Python

Open downloaded MuData objects directly for downstream RNA, ATAC, and paired multiome analyses.

Python API

Search, download, and load datasets

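from scmora_db import search_datasets, download_datasets, load_datasets

# Search the catalog metadata for control datasets.
catalog = search_datasets(
    usage_tag="control",
    detailed_condition="Control",
)

# Download one dataset's .h5mu file into a local cache directory.
paths = download_datasets(
    dataset_id="GSM5085810_GM12878_rep1",
    cache_dir="./scmora-cache",
)

# Load the same dataset as a MuData object.
mdata = load_datasets(
    dataset_id="GSM5085810_GM12878_rep1",
)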

Command Line

Use scMORA from the terminal

scmora-db search --usage-tag control
scmora-db search --detail-source "GM12878 (Cell Line)"
scmora-db download --dataset-id GSM5085810_GM12878_rep1
scmora-db load --dataset-id GSM5085810_GM12878_rep1 --backed r

scmora-db list usage-tags
scmora-db list detailed-conditions
scmora-db list detail-sources
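
When these commands are driven from a script, building the argument list in Python avoids shell-quoting mistakes for values that contain spaces or parentheses. The helper below is a hypothetical sketch, not part of scmora-db: `build_command` simply turns keyword names into the hyphenated flags shown above, and it only prints the command rather than running it.

```python
import shlex
from typing import List


def build_command(subcommand: str, **options: str) -> List[str]:
    """Build an argument list for the scmora-db CLI.

    Keyword names are converted to flags by replacing underscores with
    hyphens, e.g. detail_source -> --detail-source. Only flags that the
    CLI actually accepts should be passed in.
    """
    argv = ["scmora-db", subcommand]
    for name, value in options.items():
        argv += ["--" + name.replace("_", "-"), value]
    return argv


search_argv = build_command("search", detail_source="GM12878 (Cell Line)")
download_argv = build_command("download", dataset_id="GSM5085810_GM12878_rep1")

for argv in (search_argv, download_argv):
    # shlex.join quotes each argument so the printed line is safe to paste
    # into a POSIX shell.
    print(shlex.join(argv))
```

To execute a command instead of printing it, the same list can be passed to `subprocess.run(argv, check=True)`, which bypasses the shell entirely.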

Repository Structure

What is provided

Paired multiome files

.h5mu objects preserve barcode-matched RNA and ATAC measurements from the same cells.
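
Barcode matching means that each cell barcode identifies the same cell in both modalities. The sketch below shows the idea with plain lists. The function names and the barcodes are placeholders, and reading barcodes out of an actual .h5mu file is outside the scope of this example.

```python
from typing import List, Sequence


def shared_barcodes(rna: Sequence[str], atac: Sequence[str]) -> List[str]:
    """Return barcodes present in both modalities, in RNA order."""
    atac_set = set(atac)  # set lookup keeps the scan linear in len(rna)
    return [barcode for barcode in rna if barcode in atac_set]


def is_fully_paired(rna: Sequence[str], atac: Sequence[str]) -> bool:
    """True when both modalities cover exactly the same set of cells."""
    return set(rna) == set(atac)


# Placeholder barcodes for illustration only.
rna_barcodes = ["CELL_1", "CELL_2", "CELL_3"]
atac_barcodes = ["CELL_3", "CELL_1", "CELL_4"]

paired = shared_barcodes(rna_barcodes, atac_barcodes)
print(paired)
print(is_fully_paired(rna_barcodes, atac_barcodes))
print(is_fully_paired(paired, list(reversed(paired))))
```

In this toy input only two cells are measured in both modalities, so the two lists are not fully paired until they are restricted to the shared barcodes.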

Curated metadata

Catalog fields include dataset ID, GSE accession, sample source, condition, cell count, and model-usage labels.

Reusable access layer

The package provides the same search and download logic for scripts, notebooks, and reproducible pipelines.