Gene expression atlases have transformed our understanding of the cellular and molecular components of human tissues. As the field enters an era of large transcriptional datasets and moves towards single-cell resolution, one challenge is adapting and integrating current knowledge frameworks to include new cell types or better-defined cell states. Projects such as the Human Cell Atlas exemplify efforts to build reference atlases that can be used to benchmark many types of assayed cells and to identify novel cell types and their functions.
Motivated by the idea that a truly integrated approach would allow comparison of information derived from old and new data types, we created integrated atlases of blood cells using high-quality transcriptional datasets from microarray and bulk RNA-seq platforms. This approach also takes advantage of the deep functional phenotyping that accompanies more traditional profiling methods. The resulting atlases provide a multi-scale way to visualise and analyse the relationships between sets of genes and blood cell lineages, including the maturation and activation of leukocytes in vivo and in vitro. The variance partition method we used for data integration appears robust to technical artefacts, so that the atlases reflect real biological states.
Projection of new data onto the atlas allows users to benchmark cell isolation or derivation methods, evaluate cell line models, and assess new cell activation states. Single-cell RNA-seq samples can also be projected successfully, which illustrates how new data types can be combined with existing knowledge. The atlases and the projection tools are available through intuitive, interactive interfaces at stemformatics.org.