SNH_AIH_spatial_transcriptome

This repository contains the analysis code and workflows associated with the manuscript:

“Spatial transcriptomics links hepatocyte-macrophage interactions to viral signatures in seronegative hepatitis” Submitted to Nature Communications, 2025

The repository includes workflows for CosMx single-cell spatial transcriptomics, Visium whole-transcriptome profiling, and metatranscriptomic analysis of explanted liver tissue, along with scripts for data processing, integration, and figure generation.

Overall Repository Structure

SNH_AIH_spatial_transcriptome/
├── CosMx/
├── Visium/
├── Metatranscriptomics/
├── .gitignore
├── LICENSE
└── README.md

CosMx Processing the Data and Figure Generation

This repository provides an easy-to-use pipeline for pre-processing CosMx data and generating figures. It is designed for liver disease research (e.g., autoimmune hepatitis (AIH), seronegative (SN) liver disease, and donor (D) samples) and can produce UMAPs, barplots, boxplots, heatmaps, dotplots, and CellChat-based cell-cell interaction visualizations.

The scripts are structured for reproducibility, with figures saved automatically to a dedicated folder. This repository is ideal for immunology researchers looking to visualize cell types, gene expression patterns, and intercellular communication in spatial transcriptomics data.

CosMx Repository Structure

The CosMx directory contains the following files:

├── 2.1.combine_all.r          # Script to combine all samples
├── 2.2.cell_label.r           # Script to label cell types
├── 2.3.singleR.r              # Script to use singleR to identify cell types
├── 2.4.immune.r               # Script to split cells into immune and non-immune to help with cell labelling
├── 2.5.polygon.r              # Script to 
├── 2.6.de.r                   # Script to run differential expression analysis
├── Choosing_CosMx_areas.py    # Script to choose areas of interest on the spatial CosMx plot
└── CosMx_plots.R              # Script to generate all figures

Data structure

├── data/
│   └── seurat_obj_processed.rds   # Preprocessed Seurat object
├── scripts/                       # Scripts for the workflow
│   ├── 2.1.combine_all.r          # Script to combine all samples
│   ├── 2.2.cell_label.r           # Script to label cell types
│   ├── 2.3.singleR.r              # Script to use singleR to identify cell types
│   ├── 2.4.immune.r               # Script to split cells into immune and non-immune to help with cell labelling
│   ├── 2.5.polygon.r              # Script to 
│   ├── 2.6.de.r                   # Script to run differential expression analysis
│   └── CosMx_plots.R              # Script to generate all figures
└── figures/                       # Output folder for all generated plots

data/: Store your preprocessed Seurat object here. Must contain a final_cell_types column in meta.data for annotated cell types.
figures/: All output figures (UMAP, barplot, boxplot, heatmap, dotplot, CellChat networks) will be automatically saved here.
scripts/: Contains the combined plotting script. Designed to run as a single R script for all figure types.

Requirements

R ≥ 4.2
Required packages:

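install.packages(c("dplyr", "ggplot2", "pheatmap", "patchwork", "ggrepel", "Seurat", "BiocManager", "remotes"))
BiocManager::install("SingleR")
# CellChat is not on CRAN or Bioconductor; install it from GitHub
remotes::install_github("jinworks/CellChat")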

• Input data: a Seurat object with:

• Annotated cell types in meta.data$final_cell_types

• Expression layers including data and counts

Usage

1. Clone the repository:

git clone https://github.com/AmberBozward/SNH_AIH_spatial_transcriptome.git
cd SNH_AIH_spatial_transcriptome

2. Place your preprocessed Seurat/CosMx object in data/:

data/seurat_obj_processed.rds

3. Run the combined plotting script in R:

source("scripts/CosMx_plots.R")

4. Check the figures/ folder for all generated outputs:

• UMAPs

• Barplots

• Boxplots

• Heatmaps

• Dotplots

• CellChat network plots

Customisation

• Modify genes for plots: Edit the genes_to_plot vector in the script for boxplots and dotplots.

• Change cell types: Ensure final_cell_types contains the desired annotations for coloring and subsetting.

Approximate run time for a CosMx Seurat object

The following estimates are based on an R session running on the Birmingham BlueBEAR supercomputing service with 72 cores (4 GB of memory allocated per core).

For a Seurat object containing the centroids, segmentation and molecules data in the images slot:

• Reading the object from disk into the R session ~ 6 mins

• Integrating the data on individual samples using Harmony ~ 90 mins

• Plotting single molecules spatially ~ 3 mins

• All other plots that do not use the molecules slot ~ 1-2 mins

Choosing areas on CosMx spatial plot

'Choosing_CosMx_areas.py' is a Python script, created by John Cole, for interactively selecting and extracting specific spatial regions of interest (ROIs) from CosMx spatial transcriptomics data.

The script allows users to:

• Load the CosMx segmentation image and associated spatial coordinates.

• Visually select regions within the tissue using polygon or rectangular selection tools.

• Extract cell IDs or molecular data corresponding to the selected region.

Dependencies:

• Python ≥ 3.8

• matplotlib

• numpy

• pandas

• opencv-python (for image handling)

Purpose: This tool was developed to facilitate manual spatial subsetting of CosMx datasets, enabling region-specific analyses (e.g., comparing parenchyma vs non-parenchyma areas in liver tissue).
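
As an illustration of region-based subsetting, the sketch below selects the cells whose centroids fall inside a polygon using a standard ray-casting test. It is not the code in Choosing_CosMx_areas.py: the function names, the dictionary of cell ID to (x, y) centroid, and the example coordinates are all hypothetical, and only the Python standard library is used.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test for whether a point lies inside a polygon.

    Points exactly on the boundary may be classified either way.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def cells_in_region(centroids: Dict[str, Point], polygon: Sequence[Point]) -> List[str]:
    """Return the sorted IDs of cells whose centroid is inside the polygon."""
    return sorted(
        cell_id for cell_id, xy in centroids.items() if point_in_polygon(xy, polygon)
    )


# Hypothetical ROI (a 10 x 10 square) and three made-up cell centroids.
roi = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
centroids = {"c_1": (2.0, 3.0), "c_2": (12.0, 5.0), "c_3": (9.5, 9.5)}
selected = cells_in_region(centroids, roi)
print(selected)  # ['c_1', 'c_3']
```

The returned cell IDs could then be used to subset the Seurat object in R, for example to compare parenchyma and non-parenchyma regions.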

Neighbourhood Enrichment Analysis and Ripley's Spatial Statistics

All relevant Python code for the neighbourhood enrichment analysis and Ripley's spatial statistics, used to characterise cellular spatial organisation patterns, can be found in the following project repository:

https://github.com/kyliesavoye/hepatitis-spatial-transcriptomics/
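
For readers unfamiliar with Ripley's K function: for a radius r it counts how many other cells lie within r of each cell and scales the average by the point density, so that values well above πr² suggest clustering relative to complete spatial randomness. The sketch below is a naive estimator without edge correction, written with the standard library only. It is not taken from the linked repository; the function name, coordinates and area are made up for illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def ripley_k(points: Sequence[Point], radius: float, area: float) -> float:
    """Naive Ripley's K estimate (no edge correction).

    K(r) = area / (n * (n - 1)) * number of ordered pairs (i, j), i != j,
    whose distance is at most r.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= radius:
                pairs += 1
    return area * pairs / (n * (n - 1))


# Four hypothetical cells packed into one corner of a 10 x 10 field of view.
cells = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
k_obs = ripley_k(cells, radius=1.0, area=100.0)
k_csr = math.pi * 1.0 ** 2  # expectation under complete spatial randomness
print(k_obs > k_csr)  # True: the cells are more clustered than random
```

The double loop is quadratic in the number of cells, so a real CosMx dataset would call for a spatial index and an edge-corrected estimator from a dedicated library.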

Visium Processing the Data and Figure Generation

This repository provides an easy-to-use pipeline for pre-processing Visium data and generating figures. It is designed for liver disease research (e.g., autoimmune hepatitis (AIH), seronegative (SN) liver disease, and donor (D) samples) and can produce UMAPs, barplots, boxplots, heatmaps and dotplots.

Visium Repository Structure

The Visium directory contains the following files:

├── .gitignore                 
├── 1.QC.r                                      # Script for initial QC
├── 2a.integration.light.r                      # Script for light integration
├── 2b.integration.strict.r                     # Script for strict integration
├── 3a.cell_types_public.r                      # Script to label cell types
├── 3b.spatial_deconvolution_strict.r           # Script to strictly deconvolute cell types
├── 3c.spatial_deconvolution_light.r            # Script to lightly deconvolute cell types
├── 3d.pseudobulk_linneages_deconvolution.r     # Script to pseudobulk deconvoluted cell types
├── 3f_spatial_deconvolution_lineagges_light.r  # Script to pseudobulk deconvoluted cell types
├── 4a_region_barcodes.r                        # Script to generate barcodes from regions of interest (parenchyma vs non-parenchyma)
├── 4b_region_barcodes_pseudo.r                 # Script to pseudobulk chosen regions
├── 5a.plots.r                                  # Script to generate all figures
├── LICENSE                                     # License information for this repository
└── README.md                                   # This document

Visium Requirements

R Version

R >= 4.2.0

Required R Packages

Make sure the following packages are installed before running the scripts:

Seurat (>= 5.0)
ggplot2 (4.0.0)
dplyr (1.1.4)
tidyr (1.3.1)
patchwork (1.3.0)
Matrix (1.7-3)
stringr (1.5.2)
cowplot (1.1.3)
ggrepel (0.9.6)
gridExtra (2.3)
hdf5r (1.3.12)
spatstat (if using spatial statistics)
SeuratDisk (if working with .h5Seurat or converting from AnnData)

You can install missing packages with:

install.packages(c("ggplot2", "dplyr", "tidyr", "patchwork",
                   "Matrix", "stringr", "cowplot", "ggrepel", "gridExtra", "hdf5r"))

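Seurat and the optional packages can be installed with:

install.packages("Seurat")
install.packages("spatstat")
# SeuratDisk is not on CRAN or Bioconductor; install it from GitHub (requires the remotes package)
remotes::install_github("mojaveazure/seurat-disk")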

Input Data

The scripts assume 10x Genomics Space Ranger outputs in the following format:

data/
├── sample1/
│   ├── filtered_feature_bc_matrix.h5
│   └── spatial/
│       ├── tissue_hires_image.png
│       └── scalefactors_json.json
└── sample2/
    └── ...
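
Before running the Visium scripts it can help to confirm that every sample folder has the files listed above. The following is a minimal sketch of such a check using only the Python standard library; it is not part of the repository, and the function name `missing_inputs` and the `EXPECTED_FILES` constant are hypothetical.

```python
from pathlib import Path
from typing import Dict, List, Union

# Files each sample directory is expected to contain, relative to the sample folder.
EXPECTED_FILES = (
    "filtered_feature_bc_matrix.h5",
    "spatial/tissue_hires_image.png",
    "spatial/scalefactors_json.json",
)


def missing_inputs(data_dir: Union[str, Path]) -> Dict[str, List[str]]:
    """Map each sample directory name to the expected files it lacks.

    Samples with all expected files present are omitted from the result.
    """
    report: Dict[str, List[str]] = {}
    sample_dirs = sorted(p for p in Path(data_dir).iterdir() if p.is_dir())
    for sample_dir in sample_dirs:
        missing = [rel for rel in EXPECTED_FILES if not (sample_dir / rel).is_file()]
        if missing:
            report[sample_dir.name] = missing
    return report
```

Calling `missing_inputs("data")` from the repository root returns an empty dictionary when every sample is complete, and otherwise lists the missing files per sample.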

Metatranscriptomics

This repository provides an easy-to-use pipeline for pre-processing metatranscriptomics data and generating figures. It is designed for liver disease research (e.g., autoimmune hepatitis (AIH), seronegative (SN) liver disease, and donor (D) samples).

Metatranscriptomics Repository Structure

└── Metatranscriptomics.R # Script for all analysis and figures

Metatranscriptomics Code Usage

Barcode trimming and QC with fastp:

fastp -i input_R1.fastq.gz -I input_R2.fastq.gz \
  -o output_R1_trimmed.fastq.gz -O output_R2_trimmed.fastq.gz \
  --detect_adapter_for_pe --trim_poly_g --html fastp_report.html \
  --length_required 40 --qualified_quality_phred 20 --thread 10

Creating a bowtie2 index and aligning reads to it:

bowtie2-build /path/to/GRCh38.fasta GRCh38_index
bowtie2 -x GRCh38_index -1 trimmed_R1.fastq.gz -2 trimmed_R2.fastq.gz --very-sensitive-local -k 100 --score-min L,0,1.6 -S output.sam
samtools view -bS output.sam | samtools sort -o output_sorted.bam
samtools index output_sorted.bam

Running Telescope:

telescope assign /path/to/output_sorted.bam /path/to/GTF_file/transcripts.gtf --ncpu 12

Note: obtain the HERV and L1 annotation file (transcripts.gtf) by following the instructions available at: https://github.com/mlbendall/telescope_annotation_db/tree/af3c359/builds/retro.hg38.v1

Host gene quantification with Salmon:

Preparing metadata:

grep "^>" <(gunzip -c primary_assembly.genome.fa.gz) | cut -d " " -f 1 > decoys.txt
sed -i.bak -e 's/>//g' decoys.txt

Preparing the concatenated transcriptome and genome reference file for the index:

zcat gencode.transcripts.fa.gz primary_assembly.genome.fa.gz | gzip -c > gentrome.fa.gz

Running Salmon:

salmon index -t gentrome.fa.gz -d decoys.txt -p 12 -i salmon_index --gencode
salmon quant -i salmon_index -l A -1 trimmed_R1.fastq.gz -2 trimmed_R2.fastq.gz -p 16 -o salmon_output
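
The decoy list in the metadata step is simply the sequence names of the genome FASTA, one per line. For readers who find the grep/cut/sed chain opaque, the sketch below does the equivalent in Python with the standard library only. It is offered as an illustration rather than as part of the pipeline, and the function name `decoy_names` is hypothetical.

```python
import gzip
from typing import List


def decoy_names(genome_fasta_gz: str) -> List[str]:
    """Return the sequence names from a gzipped FASTA file.

    For each header line, the leading '>' is dropped and only the text
    before the first space is kept, as in the grep/cut/sed commands.
    """
    names: List[str] = []
    with gzip.open(genome_fasta_gz, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                names.append(line[1:].rstrip("\n").split(" ")[0])
    return names
```

Writing `"\n".join(decoy_names("primary_assembly.genome.fa.gz"))` to decoys.txt should give a file with the same content as the shell commands produce for standard FASTA headers.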

Import Salmon output into R using tximport and run DESeq2 (see the R script for full details; the main steps are listed below):

• Use tximport and tx2gene

• Run the DESeq2 pipeline

• Visualisation and combining Telescope counts with Salmon gene counts

Contributors

Dr Amber Bozward - CosMx and Visium workflows, integration, figure generation

Dr Mahboobeh Behruznia - Metatranscriptomics sequencing analysis

John Cole - CosMx and Visium workflows, integration, figure generation

Chiranjit Das - Figure generation

Kylie Savoye - CosMx neighbourhood analysis, figure generation

Professor Ye Oo - Supervisory role

License

This repository is licensed under the MIT License (see LICENSE file).

Citation

If you use this code, please cite:

Bozward et al., “Spatial transcriptomics links hepatocyte-macrophage interactions to viral signatures in seronegative hepatitis”, Nature Communications (2025).
