UMAP


Tables













Expression Plot


In Situ Projection


























































































































Facet UMAP


Dotplot

Gene Differential Expression Tests

As of 2021-01-06 we have swapped out the "pseudo-Bulk" differential testing for a simpler testing system: the scran findMarkers tool. Why? Frankly, as we used the "pseudo-Bulk" results more and more, we regularly found odd results with low p values that were driven by very few cells with high expression. The findMarkers tool, in comparison, is more robust and reliable. It uses the Wilcoxon test. Genes are required to (1) be expressed in >20% of the cells in the group (e.g. Cluster or Cell Type) and (2) have a mean expression at least twice as high as in the other cells (e.g. GeneA has a mean expression of "100" in cells in ClusterX and "40" in cells-not-in-ClusterX).
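
The two gene requirements above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the code plae runs: the function name `passes_marker_filter`, its arguments, and the choice to count any non-zero value as "expressed" are assumptions made for the example.

```python
def passes_marker_filter(expr_in_group, expr_out_group,
                         min_fraction=0.20, min_fold=2.0):
    """Check one gene against the two requirements described above.

    expr_in_group / expr_out_group are hypothetical lists holding the gene's
    expression for cells inside and outside the group (e.g. ClusterX).
    """
    n_in = len(expr_in_group)

    # Requirement 1: expressed (assumed here to mean non-zero) in >20% of
    # the cells in the group.
    fraction_expressing = sum(1 for x in expr_in_group if x > 0) / n_in

    # Requirement 2: mean expression in the group is at least twice the
    # mean expression in all other cells.
    mean_in = sum(expr_in_group) / n_in
    mean_out = sum(expr_out_group) / len(expr_out_group)

    return fraction_expressing > min_fraction and mean_in >= min_fold * mean_out


# The GeneA example from the text: mean of 100 in ClusterX, 40 elsewhere.
print(passes_marker_filter([100, 100, 100, 100], [40, 40, 40, 40]))
```

A gene with one very high cell and nothing else in the group fails the first requirement, which is the kind of result the switch away from "pseudo-Bulk" testing was meant to avoid.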

What is that "AUC" column? Area Under the Curve. Instead of reporting differential expression, which can be skewed by a low number of cells with very high expression, we report the power of the gene of interest to distinguish between the base group (e.g. Rods) and the comparison group (e.g. Cones). An AUC of 1 means that the marker can perfectly (100%) distinguish cells between the two groups. An AUC of 0 (or missing) means that the gene has no power.
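
For intuition, the textbook AUC for a pair of groups can be computed as the chance that a randomly picked base-group cell has higher expression than a randomly picked comparison-group cell. The sketch below uses that textbook definition with a hypothetical helper named `auc`; it is not taken from the plae or scran code. In this form, a gene that does not separate the groups at all scores 0.5, and a score near 0 means the gene is consistently lower in the base group, so it is of no use as a positive marker for it.

```python
def auc(base_values, comparison_values):
    """Textbook pairwise AUC for one gene.

    Returns the fraction of (base cell, comparison cell) pairs in which the
    base cell has the higher expression; ties count as half a win.
    """
    wins = 0.0
    for b in base_values:
        for c in comparison_values:
            if b > c:
                wins += 1.0
            elif b == c:
                wins += 0.5
    return wins / (len(base_values) * len(comparison_values))


# Hypothetical expression values for a gene in Rods (base) vs Cones.
print(auc([5, 6, 7], [1, 2, 3]))  # every Rod is higher than every Cone
```

Because the score depends only on which cell is higher in each pair, a handful of cells with extreme expression cannot dominate it the way they can dominate a difference in means.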

Why are all the PValues and FDR the same for each gene? The p value is calculated at the gene level, while the AUC is calculated for each pairwise test. That means that if you test Rods vs not-Rods, the FDR for GeneA could be 0.01, which means that GeneA "significantly" (well actually means you can reject the null hypothesis that....) distinguishes Rods from all other cells. If you want to dive into Rods vs some-specific-Cell-Type, then use the AUC values, which are calculated for each pairwise combination of groups.











Data


The codebase for the creation of the scEiaD dataset is on GitHub.

If a size is not given, the file is less than 1 GB.

Run plae Locally

If you have 200 GB of free hard drive space, you can run plae on your own computer. Installation instructions are available in our GitHub repository (this is the codebase for the app you are using now).

Seurat Objects

AnnData Objects

Diff Testing Results

Metadata

Counts



















plae v0.72


PLatform for Analysis of scEiad

Logo: plae (pronounced "play"). An eyeball with arms running to a slide made of retina cells.

What is scEiaD?

single cell Eye in a Disk

The light-sensitive portion of the eye is the retina. The retina itself is not a monolithic tissue: there are over 10 major cell types. The cones and rods, which convert light into signal, are supported by a wide variety of neural cell types with distinct roles in interpreting and transmitting the visual signal to the brain. Behind the retina are the RPE and vasculature, which support the high energetic needs of the rods and cones. scEiaD is a meta-atlas that compiles 1.2 million single-cell back of the eye transcriptomes across 28 studies, 18 publications, and 3 species. Deep metadata mining, rigorous quality control analysis, differential gene expression testing, and deep learning based batch effect correction in a unified bioinformatic framework allow the universe of retina single cell expression information to be analyzed in one location.

tldr

You can look up gene expression by retina cell type across loads of different studies, three organisms, and multiple developmental stages.

Preprint of the data creation and benchmarking now on bioRxiv!


Data Sources

Citation | PMID | SRA Accession | Organism | Platform | Count | Post QC Count | Labels
Voigt AP, Whitmore SS, Flamme- ... | 31075224 | SRP194595 | Homo sapiens | 10xv3 | 146353 | 242 | Yes
Cowan CS, Renner M, De Gennaro ... | 32946783 | EGAD00001006350 | Homo sapiens | 10xv2 | 99878 | 44679 | Yes
Lu Y, Shiau F, Yi W, Lu S et a ... | 32386599 | SRP223254 | Homo sapiens | 10xv2 | 88202 | 74624 | Yes
Yan W, Peng YR, van Zyl T, Reg ... | 32555229 | SRP255195 | Homo sapiens | 10xv3 | 78706 | 16 | Yes
Lu Y, Shiau F, Yi W, Lu S et a ... | 32386599 | SRP151023 | Homo sapiens | 10xv2 | 47217 | 43742 | Yes
Voigt AP, Whitmore SS, Mulfaul ... | 32531351 | SRP257883 | Homo sapiens | 10xv3 | 34044 | 3882 | Yes
Yan W, Peng YR, van Zyl T, Reg ... | 32555229 | SRP255195 | Homo sapiens | 10xv2 | 32195 | 15262 | Yes
Sridhar A, Hoshino A, Finkbein ... | 32023475 | SRP238587 | Homo sapiens | 10xv2 | 23682 | 16616 | No
Voigt AP, Mulfaul K, Mullin NK ... | 31712411 | SRP218652 | Homo sapiens | 10xv3 | 16633 | 1539 | Yes
Lukowski SW, Lo CY, Sharov AA ... | 31436334 | E-MTAB-7316 | Homo sapiens | 10xv2 | 13803 | 10158 | Yes
Menon M, Mohammadi S, Davila-V ... | 31653841 | SRP222001 | Homo sapiens | 10xv2 | 8737 | 680 | Yes
Menon M, Mohammadi S, Davila-V ... | 31653841 | SRP222958 | Homo sapiens | DropSeq | 8016 | 1012 | Yes
Lu Y, Shiau F, Yi W, Lu S et a ... | 32386599 | SRP170761 | Homo sapiens | 10xv2 | 7452 | 5372 | No
Voigt AP, Binkley E, Flamme-Wi ... | 32069977 | SRP238409 | Homo sapiens | 10xv3 | 1696 | 28 | No
 |  | OGVFB_Hufnagel_iPSC_RPE | Homo sapiens | 10xv2 | 697 | 694 | Yes
Hu Y, Wang X, Hu B, Mao Y et a ... | 31269016 | SRP125998 | Homo sapiens | SMARTSeq_v2 | 16 | 11 | No
Peng YR, Shekhar K, Yan W, Her ... | 30712875 | SRP158528 | Macaca fascicularis | 10xv2 | 112058 | 109275 | Yes
Clark BS, Stein-O'Brien GL, Sh ... | 31128945 | SRP158081 | Mus musculus | 10xv2 | 185546 | 176668 | Yes
Tabula Muris Consortium., Over ... | 30283141 | SRP131661 | Mus musculus | 10xv2 | 78567 | 64314 | Yes
Yan W, Laboulaye MA, Tran NM, ... | 32457074 | SRP259930 | Mus musculus | 10xv2 | 39596 | 38254 | Yes
 |  | SRP257758 | Mus musculus | 10xv2 | 37006 | 36010 | No
Shekhar K, Lapan SW, Whitney I ... | 27565351 | SRP075719 | Mus musculus | DropSeq | 36295 | 24772 | Yes
Tran NM, Shekhar K, Whitney IE ... | 31784286 | SRP212151 | Mus musculus | 10xv2 | 32526 | 30395 | Yes
Heng JS, Hackett SF, Stein-O'B ... | 31843893 | SRP200499 | Mus musculus | 10xv2 | 22344 | 21244 | Yes
Macosko EZ, Basu A, Satija R, ... | 26000488 | SRP050054 | Mus musculus | DropSeq | 12831 | 11093 | Yes
Lehmann GL, Hanke-Gogokhia C, ... | 32196081 | SRP216903 | Mus musculus | 10xv2 | 10543 | 10275 | No
Fadl BR, Brodie SA, Malasky M, ... | 33088174 | SRP269635 | Mus musculus | 10xv2 | 9019 | 7832 | No
Buenaventura DF, Corseri A, Em ... | 31260032 | SRP200599 | Mus musculus | 10xv2 | 7415 | 7400 | No
Lo Giudice Q, Leleu M, La Mann ... | 31399471 | SRP168426 | Mus musculus | 10xv2 | 5575 | 5355 | No
O'Koren EG, Yu C, Klingeborn M ... | 30850344 | SRP186407 | Mus musculus | 10xv2 | 3793 | 2596 | No
Clark BS, Stein-O'Brien GL, Sh ... | 31128945 | SRP158081 | Mus musculus | SMARTSeq_v2 | 864 | 790 | No
Lo Giudice Q, Leleu M, La Mann ... | 31399471 | SRP186396 | Mus musculus | SMARTSeq_v2 | 800 | 797 | No
Fadl BR, Brodie SA, Malasky M, ... | 33088174 | SRP269634 | Mus musculus | 10xv2 | 399 | 386 | No
Shekhar K, Lapan SW, Whitney I ... | 27565351 | SRP075720 | Mus musculus | SMARTSeq_v2 | 384 | 363 | No
Shekhar K, Lapan SW, Whitney I ... | 27565351 | SRP073242 | Mus musculus | SMARTSeq_v2 | 291 | 236 | No
Dharmat R, Kim S, Liu H, Fu S ... | bioRxiv 774950 | SRP220355 | Mus musculus | SCRBSeq | 3 | 3 | No

scEiaD Curated Published Cell Type Labels

CellType | Species | Studies | Count
Retinal Ganglion Cells | HS, MF, MM | 11 | 88253
Amacrine Cells | HS, MF, MM | 14 | 67036
Rods | HS, MF, MM | 12 | 54622
Muller Glia | HS, MF, MM | 13 | 45484
Early RPCs | MM | 1 | 28643
Bipolar Cells | HS, MF, MM | 13 | 27535
Late RPCs | MM | 1 | 21643
RPCs | HS | 2 | 17927
Cones | HS, MF, MM | 12 | 15738
Neurogenic Cells | HS, MM | 3 | 10604
Rod Bipolar Cells | HS, MM | 2 | 9164
B-Cell | HS, MM | 4 | 8656
Horizontal Cells | HS, MF, MM | 11 | 8323
Photoreceptor Precursors | HS, MM | 3 | 7123
T-Cell | HS, MM | 4 | 5443
Endothelial | HS, MF, MM | 10 | 4565
Fibroblasts | HS, MM | 5 | 1879
Red Blood Cells | HS, MM | 3 | 1873
Pericytes | HS, MF, MM | 6 | 1608
Amacrine/Horizontal Precursors | HS | 2 | 1450
Astrocytes | HS, MM | 4 | 1197
Microglia | HS, MF, MM | 9 | 602
Macrophage | HS | 2 | 561
Vein | HS | 1 | 490
Monocyte | HS | 1 | 317
Schwann | HS | 2 | 288
Melanocytes | HS | 3 | 259
Choriocapillaris | HS | 1 | 225
Natural Killer | HS | 1 | 168
Artery | HS | 1 | 167
Mast | HS | 2 | 101
Smooth Muscle Cell | HS | 1 | 45

Labelled cell types from published papers were pulled, where possible, from a combination of the Sequence Read Archive (SRA), lab web sites, and personal correspondence, then adjusted to be consistent (e.g. MG to Muller Glia) between all studies.
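
The label adjustment described above amounts to mapping each study's spelling onto one shared name. A minimal sketch, assuming a simple lookup table: only the "MG" to "Muller Glia" pair comes from the text, and the other entries, the table name `LABEL_SYNONYMS`, and the function `harmonize_label` are hypothetical.

```python
# Hypothetical synonym table. Only the "MG" entry is taken from the text;
# the remaining entries are assumed variants added for illustration.
LABEL_SYNONYMS = {
    "MG": "Muller Glia",
    "Mueller glia": "Muller Glia",
    "RGC": "Retinal Ganglion Cells",
}


def harmonize_label(raw_label, synonyms=LABEL_SYNONYMS):
    """Map a study-specific label onto the shared name, if one is known."""
    cleaned = raw_label.strip()
    # Compare case-insensitively so "mg" and "MG" resolve to the same name.
    lookup = {key.lower(): value for key, value in synonyms.items()}
    # Labels without a known synonym are passed through unchanged.
    return lookup.get(cleaned.lower(), cleaned)


print(harmonize_label("MG"))
```

Passing unknown labels through unchanged keeps the original study label visible, so anything the table misses can still be reviewed by hand.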


scEiaD Machine Learned Cell Type Labels

CellType | Species | Studies | Count
Rods | HS, MF, MM | 29 | 141472
Amacrine Cells | HS, MF, MM | 27 | 111445
Retinal Ganglion Cells | HS, MF, MM | 28 | 63915
Muller Glia | HS, MF, MM | 28 | 60858
Bipolar Cells | HS, MF, MM | 26 | 56770
RPCs | HS, MF, MM | 21 | 41290
Early RPCs | HS, MF, MM | 13 | 35984
Late RPCs | HS, MM | 9 | 34762
Neurogenic Cells | HS, MF, MM | 13 | 34078
Cones | HS, MF, MM | 28 | 26460
Photoreceptor Precursors | HS, MF, MM | 9 | 22648
Horizontal Cells | HS, MF, MM | 22 | 17793
Rod Bipolar Cells | HS, MM | 16 | 12438
Fibroblasts | HS, MF, MM | 21 | 5530
Pericytes | HS, MF, MM | 17 | 5514
Endothelial | HS, MF, MM | 22 | 5344
Amacrine/Horizontal Precursors | HS, MF, MM | 8 | 3072
Red Blood Cells | HS, MF, MM | 24 | 2554
RPE | HS, MF, MM | 15 | 2101
Astrocytes | HS, MF, MM | 18 | 2089
Microglia | HS, MF, MM | 24 | 2013
T-Cell | HS, MF, MM | 15 | 1311
Vein | HS, MM | 8 | 1058
Monocyte | HS, MM | 17 | 823
Schwann | HS, MF, MM | 12 | 790
Macrophage | HS, MM | 12 | 677
B-Cell | HS, MF, MM | 17 | 439
Mast | HS, MM | 12 | 393
Melanocytes | HS, MF, MM | 10 | 327
Natural Killer | HS, MM | 7 | 225

The labels above were used to train a machine learning model, which was then used to relabel all* cells in scEiaD (*those above a confidence threshold of 0.5).
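
The confidence threshold can be illustrated as follows. This sketch assumes the model outputs one probability per cell type for each cell and that "confidence" means the highest of those probabilities; the text confirms neither, and the function name `assign_labels` is made up for the example.

```python
def assign_labels(probabilities, class_names, threshold=0.5):
    """Pick the most probable cell type per cell, or None if not confident.

    probabilities: one row per cell, one (assumed) model probability per
    entry in class_names.
    """
    labels = []
    for row in probabilities:
        # Index of the highest-scoring cell type for this cell.
        best_index = max(range(len(row)), key=lambda i: row[i])
        # Keep the label only when the score is above the threshold.
        if row[best_index] > threshold:
            labels.append(class_names[best_index])
        else:
            labels.append(None)
    return labels


# Three hypothetical cells scored against two cell types.
print(assign_labels([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]], ["Rods", "Cones"]))
```

Cells that come back as `None` stay unlabelled, which matches the asterisk in the sentence above: not every cell receives a machine learned label.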


Using and Extending plae and the scEiaD


All Links are External

Analyses Colab Bash Web Guide
Using plae Go
Using scEiaD Seurat object Go
UMAP projection of your data on scEiaD Go
Auto cell type label your data Go Go

Contact

If you have questions about the scEiaD dataset or the plae application, please contact David McGaughey, Ph.D.

Otherwise the National Eye Institute's Office of Science Communications, Public Liaison and Education responds directly to requests for information on eye diseases and vision research in English and Spanish. We cannot provide personalized medical advice to individuals about their condition or treatment.

Phone: 301-496-5248 — English and Spanish
Mail: National Eye Institute
Information Office
31 Center Drive MSC 2510
Bethesda, MD 20892-2510








































Change log


0.72 (2021-04-29): Added more content and a table organization to the Info -> Analysis... section.

0.71 (2021-04-14): scEiaD preprint on bioRxiv! Added a filter in the "exp plot" section to remove data points with fewer than a user-selected minimum number of cells (default 50). I was finding that data points (e.g. cell type - study) with low N would often have "outlier" results. Added a new section to the web page - Analyses (under the "Info" tab)!

0.70 (2021-03-22): New scEiaD built with corrected fastq file sets (potential bug in 10x bamtofastq tool resulted in a few datasets getting scrambled barcodes). Removed a macaque dataset (SRR7733526) with odd behavior (clustering in 2D UMAP space largely alone). Tweaked the UMAP 2D gene view with a darker "background" cell color scheme to reduce "over-emphasis" on cells with low expression of a gene. CPM replaced with counts, as some odd behavior was detected for some genes in the UMAP view where there was high "background" expression. Counts have more consistent behavior. Removed the hard filter that tossed cells with >2500 detected genes.

0.60 (2021-02-08): New scEiaD built with more studies. Removed several retinal organoid datasets that had snuck in. Added a filter option for the diff searching to search, for example, one cluster directly against another cluster.

0.52 (2021-01-13): Added "missing" genes. We had only retained genes which were expressed in all three species, which naturally led to many genes (some important, like OPN1MW) being dropped. That has been fixed. Tweaked "Expression Plot" dot size to prevent crazy tiny point sizes.

0.51 (2021-01-06): Hello 2021! Adam Gayoso kindly pointed out that I was using scVI in a non-optimal manner, so I updated the scVI modeling to match their recommended "scArches" parameters. This (fortunately for my sanity) only subtly changes the downstream result. The more significant change is that we have totally changed the diff testing section to use the scran findMarkers test instead of our complicated and compute-expensive pseudo-Bulk testing, which continually gave odd results.

0.50 (2020-12-31): Goodbye 2020! Major update to the scVI-based UMAP projection which improves data quality. Removed non-tissue samples (e.g. organoid/cell lines). They will be added back later once I figure out a logical/simple way to do it. Fixed major bug in QC filtering which failed to remove cells with high mitochondrial counts (likely apoptosing cells). Dot plot tweaked to improve relative dot sizes. Cowan et al. dataset added.

0.43 (2020-11-09): Downloadable diff results added to "Data." The diff results reactive data table now has a "Download all ..." button which replaces the "CSV" button that only downloaded the viewable data (100 max).

0.42 (2020-10-16): Alt text added to each button, tweaked UMAP-Tables layout again. Slide logo added. Site went public at this version on 2020-11-02!

0.41 (2020-10-06): Download buttons added for each plot.

0.40 (2020-10-05): UI and text labels tweaked in UMAP-Tables to improve tab selection order. Dot plot given a bar plot to show category size. Error handling improved when user fails to provide a category value to filter on. Data Table help button added. Help buttons moved to bottom of page with consistent visual - tabbing order. Colors of UI elements tweaked to improve contrast.

0.39 (2020-09-18): Fixed calculation error in dotplot where expression was not scaled by the number of cells in the grouping variable.

0.38 (2020-09-02): Contact section and footer added for compliance.

0.37 (2020-08-24): Help pop section populated with text. Put white halo back around text in UMAP - Meta section. Loading circles added to plots. Row names added to tables. Diff testing filtered to only return results with FDR < 0.05 and abs(logFC) > 0.5.

0.36 (2020-08-17): Data download section added. Change log moved to separate section. CSS tweaked to show links in blue. First overview table updated to improve contrast. UMAP plots axis fixed.

0.35 (2020-08-14): Moved Overview tables to html for improved rendering and switched over to a color-blind friendly palette. Temporarily removed Temporal Plotting section. Improved filtering for Facet Plot. DotPlot plotting fixed and improved. Back-end server.R code moved into separate functions. Colors fixed so they stay consistent when filtering/subsetting the plots. Site now starts from scratch in under 5 seconds with improved fst-based data loading and pre-calculation of more operations.

0.34 (2020-08-03): Fixed issue with TabulaMuris labels not appearing. Scanned app with koa11y for 508 compliance - changed headers from h2 to h1 to comply.

0.33 (2020-07-30): Exp plot can now take space or comma separated Genes as input. User can select the number of columns in Exp Plot. Diff Table formatting improved with rounding, and PB_Test can now be selected as a drop down in the data table search.

0.32 (2020-07-29): In situ Projection viz added courtesy of Zachary Batz! It's a simulated cross section of the retina with each cell type colored by intensity of scRNA expression! Move table draw button under filtering in UMAP - Tables. Sort Diff Exp results by FDR. Filtering on numeric column now returns slider UI. Remove super dangerous ability to create faceted plots on numeric values.

0.31 (2020-07-24): Re-created scEiaD with better internal (Hufnagel) transwell RPE labelling (there are roughly two groups - mature RPE with high TTR expression and less (?) mature RPE with lower TTR), removal of the SRP166660 study as it was *all* non-normal (injured retina) (confirmed with correspondence with Dr. Poche), removed the pan RGC CellType labelling for the SRP212151 as I see post-hoc that there are LOADS of non-RGC cells. Did the same for SRP186407, which has substantial non-microglia. Generally, FACS != 100% celltype purity. Added differential testing against all Tabula Muris cell types. Removing clusters/cells with high doublet scores. Added cell cycle phase (G1/G2M/S) assignment. More study level metadata.

0.30 (2020-07-20): Huge update. Hundreds of thousands of cells added. The Tabula Muris project data (pan mouse) has been added to facilitate non-eye comparison. Filtering options added to most of the plotting views to allow for quick slicing into this huge dataset. Differential expression testing totally reworked - now uses "pseudoBulk" approach to better utilize the large number of studies we have.

0.23 (2020-06-16): Remove low N cell type from diff expression tables, tweak Overview with spacing alterations and updated text.

0.22 (2020-06-15): Added expression plot by user selected groups plot view. Fixed bug in mean cpm expression calculation for Viz -> UMAP - Table gene tables.

0.21 (2020-06-15): Added subcluster diff testing tables, temporal gene expression by celltype plot section.

0.20 (2020-06-06): New 2D UMAP projection that includes the full Yu - Clark Human scRNA dataset. Added tables to "Overview" section showing data stats. Added "filtering" functionality to UMAP plot section.