Semantic Bio — Search Biological Databases


Summary

Enrichment libraries and Hi-C libraries were processed using identical scripts. First, the different read types were demultiplexed based on identification sequences in the R2 read.

For RNA data, we first removed reads lacking a poly-A sequence after the GAT5-RT primer sequence. We then aligned Read 1 to the GRCm38 reference genome using STAR. Duplicates were removed based on UMIs with UMI-tools. The RNA count matrix was generated with featureCounts using the parameters “-O -M --fraction”. We used RNAsnpSplit for phasing of allele-specific RNA counts.

For Hi-C data, reads were mapped to the GRCm38 reference genome with BWA-mem2 in “5SP” mode. Contacts and 3D genome structures were generated with Hickit using default parameters. To remove potential contamination from RNA reads, we removed contacts between two exons of the same transcript, which accounted for ~0.04% of all contacts. For quality control of the reconstructed 3D genome structures, five replicates were generated with random seeds, and the median RMSD was calculated for each combination of three replicates. The first replicate in the combination with the minimum median RMSD was used for downstream analysis, provided that value was below 1.5. 3D proximity and 3D distance maps were generated as previously described with custom code.

For accessibility reads and histone modification reads, R2 reads, which represent Tn5 insertion sites, were aligned using BWA-mem2 with default parameters. A custom Python script was then used to remove PCR duplicates and generate single-cell signal fragment files. For haplotype phasing, fragments were first assigned to a haplotype using SNP information from the R2 read. If the haplotype could not be determined from R2, SNP information from the R1 read was used instead.

All reads were mapped to the mm10 (GRCm38) reference genome. All processing steps were organized and implemented using Snakemake and are available at https://github.com/skelviper/CHARM.
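The replicate-selection rule used for quality control of the 3D structures can be illustrated with a minimal Python sketch. It is not the code from the CHARM repository, and it rests on several assumptions: each replicate is an (N, 3) NumPy array over the same N particles, the replicates are already superimposed (no alignment or mirror correction is done here), the "median RMSD" of a combination is taken as the median of its pairwise RMSDs, and the 1.5 cutoff is in whatever units the coordinates use. The function names `rmsd` and `select_replicate` are hypothetical.

```python
from itertools import combinations
from statistics import median

import numpy as np


def rmsd(a, b):
    """Plain RMSD between two (N, 3) coordinate arrays.

    Assumption: both arrays describe the same particles in the same order
    and are already superimposed; no alignment is performed here.
    """
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def select_replicate(structures, group_size=3, cutoff=1.5):
    """Pick one replicate from the most self-consistent group of replicates.

    For every combination of `group_size` replicates, compute the median of
    the pairwise RMSDs inside that combination. Return the index of the
    first replicate of the combination with the smallest median, together
    with that median. If the smallest median is not below `cutoff`, return
    (None, median) to signal that the cell fails quality control.
    """
    best_combo, best_score = None, float("inf")
    for combo in combinations(range(len(structures)), group_size):
        pair_rmsds = [
            rmsd(structures[i], structures[j])
            for i, j in combinations(combo, 2)
        ]
        score = median(pair_rmsds)
        if score < best_score:
            best_combo, best_score = combo, score
    if best_combo is None or best_score >= cutoff:
        return None, best_score
    return best_combo[0], best_score
```

With five replicates there are ten combinations of three, so an exhaustive search like this is cheap. Calling `select_replicate(list_of_five_arrays)` returns the index of the replicate to keep, or `None` when no combination is consistent enough.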

Gene regulatory landscape dissected by single-cell four-omics sequencing
