True paired multiome profiles
Each accepted dataset preserves same-cell RNA and ATAC measurements, avoiding post hoc matching of separate scRNA-seq and scATAC-seq experiments.
scMORA curates public 10x Multiome datasets in which RNA expression and ATAC accessibility are measured from the same cell barcodes. The database is designed for browsing paired single-cell regulatory profiles, selecting datasets by biological context, and reusing curated data for model training and benchmark construction.
Entries are organized by processed sample or analysis directory, while study accessions and biological source labels are retained for traceability.
Curated usage tags indicate whether a dataset is suitable for model training, controls, perturbation analysis, disease modeling, cancer studies, or related tasks.
The website serves precomputed UMAP, QC, gene-expression, gene-activity and nearby-peak summaries for fast interactive exploration.
scMORA focuses on datasets generated by 10x Multiome or equivalent paired RNA-ATAC workflows where both modalities share cell barcode identifiers.
Barcode matching means that one cell barcode maps to one RNA profile and one ATAC profile. This pairing is the core data unit used by scMORA.
cell_barcode_i
-> RNA expression profile_i
-> ATAC accessibility profile_i
-> shared metadata_i
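The pairing rule above can be illustrated with a minimal Python sketch. The barcodes, values and the helper name pair_by_barcode are made up for illustration and are not part of scMORA; the point is only that a cell is kept when the same barcode appears in both modalities.

```python
# Toy per-modality profiles keyed by cell barcode (all values are made up).
rna_profiles = {"AAAC-1": [3, 0, 1], "AAAG-1": [0, 2, 5], "AACT-1": [1, 1, 0]}
atac_profiles = {"AAAG-1": [1, 0], "AACT-1": [0, 1], "AAGG-1": [1, 1]}


def pair_by_barcode(rna, atac):
    """Keep only barcodes found in both modalities, preserving RNA order."""
    return {bc: {"rna": rna[bc], "atac": atac[bc]} for bc in rna if bc in atac}


paired = pair_by_barcode(rna_profiles, atac_profiles)
print(list(paired))  # barcodes that carry both an RNA and an ATAC profile
```

Barcodes seen in only one modality are dropped, so every remaining entry is a same-cell RNA and ATAC pair.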
The website reads curated sample metadata from data/metadata/metadata.csv. Public dataset cards display clean dataset IDs rather than local processing-folder suffixes.
Dataset_id: sample-level display ID and URL query ID
GSE_id: GEO study accession
Detail_source: curated source, tissue, cell type or model label
Condition and Detailed_condition: broad and specific biological context
Usage_primary and Usage_tags: model-usage labels
Usage labels describe how each dataset can be reused for model training, evaluation or biological comparison. A dataset can have multiple tags.
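As a hedged illustration of how these fields might be used, the Python sketch below selects datasets by usage tag from a small in-memory CSV. The rows, the semicolon delimiter inside Usage_tags and the helper name datasets_with_tag are assumptions; the real metadata.csv may contain more columns and a different tag separator.

```python
import csv
import io

# Hypothetical rows: IDs and tags are invented, and ";" as the tag delimiter is an assumption.
SAMPLE_CSV = """Dataset_id,GSE_id,Condition,Usage_primary,Usage_tags
example_a,GSE_example_1,control,training,training;control
example_b,GSE_example_2,cancer,cancer,cancer;disease
"""


def datasets_with_tag(csv_text, tag, sep=";"):
    """Return Dataset_id values whose Usage_tags field contains the given tag."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [row["Dataset_id"] for row in rows if tag in row["Usage_tags"].split(sep)]


print(datasets_with_tag(SAMPLE_CSV, "control"))
```

Splitting the field before matching avoids partial matches, for example a tag that is a substring of another tag.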
Home: provides the database overview, key statistics, searchable entry access and interactive metadata summaries.
Browser: filters datasets by condition, sample source and model-usage label, then opens sample-level visualization pages.
Sample-level visualization pages: display Joint, RNA and ATAC UMAPs with metadata coloring, QC summaries and composition charts.
Analysis: supports gene-centered RNA FeaturePlot, ATAC-derived gene activity and nearby peak summaries.
Full paired .h5mu datasets are accessible through Hugging Face and the scmora-db package.
Dataset pages load lightweight precomputed files instead of opening full .h5mu objects at request time.
data/visual/umap_qc/h5mu_plot/
  GSE166797/
    GSM5085810_GM12878_rep1/
      cell_embeddings.csv.gz
      rna_gene_index.csv.gz
      atac_peak_index.csv.gz
      dataset_summary.json
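For readers scripting against a local copy of these assets, the following Python sketch builds the expected file paths for one sample. It assumes the listing above nests as root / study accession / sample directory / files; the function name asset_paths is hypothetical, and the contents of each file are not described in the source, so nothing is parsed here.

```python
from pathlib import PurePosixPath

# Root directory and file names are taken from the listing above.
ASSET_ROOT = PurePosixPath("data/visual/umap_qc/h5mu_plot")
ASSET_FILES = (
    "cell_embeddings.csv.gz",
    "rna_gene_index.csv.gz",
    "atac_peak_index.csv.gz",
    "dataset_summary.json",
)


def asset_paths(gse_id, dataset_id):
    """Map each precomputed file name to its assumed path for one sample."""
    return {name: ASSET_ROOT / gse_id / dataset_id / name for name in ASSET_FILES}


paths = asset_paths("GSE166797", "GSM5085810_GM12878_rep1")
print(paths["dataset_summary.json"])
```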
Gene-level analysis is served from the Gene_Activity asset store, which is generated before website deployment.
data/visual/Gene_Activity/
  GSM5085810_GM12878_rep1/
    gene_metadata.parquet
    rna_values/
    activity_values/
    peak_summary.parquet
    peak_cluster_summary.parquet
Full analysis-ready .h5mu files are hosted externally to keep the website lightweight. Users can search and retrieve selected datasets with the scmora-db Python package.
pip install scmora-db
scmora-db search --usage-tag control
scmora-db download --dataset-id GSM5085810_GM12878_rep1
Repository: shiny321/genome-db
Gene activity is computed as an ATAC peak-based approximation for web visualization and fast interactive querying.
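As a rough illustration, the distance weight and log1p transform listed in this section could be combined as in the Python sketch below. The source gives only the two expressions; summing weighted peak counts per gene before applying log1p, and the names peak_weight, gene_activity and scale_bp, are assumptions made for illustration and not the scMORA implementation.

```python
import math


def peak_weight(distance_bp, scale_bp=5000):
    """Distance weight from this section: exp(-abs(distance) / 5000) + exp(-1)."""
    return math.exp(-abs(distance_bp) / scale_bp) + math.exp(-1)


def gene_activity(peaks):
    """Assumed aggregation: log1p of the weighted sum of peak counts.

    peaks is an iterable of (distance to the gene in bp, accessibility count).
    """
    weighted = sum(peak_weight(distance) * count for distance, count in peaks)
    return math.log1p(weighted)


# Toy peaks near one gene: (distance in bp, count); values are made up.
example_peaks = [(0, 2), (-5000, 1), (20000, 3)]
print(round(peak_weight(0), 4))     # 1.3679
print(round(peak_weight(5000), 4))  # 0.7358
print(gene_activity(example_peaks))
```

The constant exp(-1) term keeps distant peaks from being weighted to zero, so the weight decays from about 1.37 at the gene toward a floor of about 0.37.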
exp(-abs(distance) / 5000) + exp(-1)
log1p
/api/genomes/examples: dataset list
/api/genomes/filter: condition, source and label filtering
/api/visualization/meta: dataset visualization metadata
/api/visualization/points: UMAP points and composition summaries
/api/visualization/gene-feature: RNA FeaturePlot payload
/api/visualization/gene-activity: ATAC-derived gene activity payload
/api/visualization/gene-peaks: nearby peak payload
If scMORA is used in published work, please cite the database manuscript and include the accessed version/date of the website and downloaded datasets.
scMORA: a barcode-matched 10x Multiome resource for paired RNA-ATAC data.
Version 0.1.0, 2026.