
Red blood cells clustering #2

Open
Opened 9/19/2024, 0 comments, by AlbertoFabbri93
AlbertoFabbri93

The following code was used to cluster the red blood cells separately from the other cells so that they can be removed:

```
analyze_proteins <- function(patient_data) {

  metadata_proteins_columns <- c(
    "Mean.PanCK", "Max.PanCK",
    "Mean.CD68", "Max.CD68",
    "Mean.Membrane", "Max.Membrane",
    "Mean.CD45", "Max.CD45",
    "Mean.DAPI", "Max.DAPI")

  for (col in metadata_proteins_columns) {
    patient_data[[col]] <- as.numeric(as.factor(patient_data@meta.data[[col]]))
  }

  # Extract proteins features and replace underscores in column names
  proteins_features <- patient_data@meta.data[, metadata_proteins_columns]
  colnames(proteins_features) <- gsub("_", "-", colnames(proteins_features))
  proteins_matrix <- as.matrix(proteins_features)

  # Ensure row names (cell names) match the Seurat object cell names
  rownames(proteins_matrix) <- rownames(patient_data@meta.data)

  # Create a new assay with the proteins data
  proteins_assay <- CreateAssay5Object(counts = t(proteins_matrix))

  # Add the new assay to the Seurat object with the name "proteins"
  patient_data[["proteins"]] <- proteins_assay
  DefaultAssay(patient_data) <- "proteins"

  # Scale the raw metadata values
  patient_data <- ScaleData(
    patient_data,
    layer = "counts",
    assay = "proteins",
    features = metadata_proteins_columns,
    do.center = TRUE,
    do.scale = TRUE)

  # Run PCA and specify the number of PCs to compute
  # Only 10 features are available, so compute 9 PCs
  n_pcs <- 9
  patient_data <- RunPCA(
    patient_data,
    assay = "proteins",
    features = metadata_proteins_columns,
    npcs = n_pcs,
    reduction.name = "pca_proteins",
    reduction.key = "PCPR_",
    seed.use = 2)

  # Use the available number of PCs for FindNeighbors and clustering
  patient_data <- FindNeighbors(
    patient_data,
    dims = 1:n_pcs,
    assay = "proteins",
    reduction = "pca_proteins")
  patient_data <- FindClusters(
    patient_data,
    resolution = 0.5,
    assay = "proteins",
    cluster.name = "protein_clusters",
    graph.name = "proteins_snn",
    random.seed = 2)

  # Run UMAP for visualization
  patient_data <- RunUMAP(
    patient_data,
    assay = "proteins",
    dims = 1:n_pcs,
    reduction = "pca_proteins",
    reduction.name = "umap_proteins",
    reduction.key = "UMAPPR_",
    seed.use = 2)

  return(patient_data)
}
```

which resulted in the following plots:

![Patient_1_protein_clusters_core_M2_stamp_1](https://github.com/user-attachments/assets/67f5c202-5ef5-4919-9585-871199afeaaa)
![Patient_1_protein_clusters_umap](https://github.com/user-attachments/assets/fe213d39-b2e9-4ba8-9757-f1a42044190b)
![Patient_1_featureplots_umap_proteins](https://github.com/user-attachments/assets/e955eae4-52d3-4d99-823d-2be03fbb383e)

where the red blood cells are grouped in cluster 5.

----

If I remove this code:

```
for (col in metadata_proteins_columns) {
  patient_data[[col]] <- as.numeric(as.factor(patient_data@meta.data[[col]]))
}
```

which converts the columns used for clustering from integer to numeric (WARNING: because it goes through `as.factor()`, this method may assign numeric values based on the factor levels of the column, not the original numeric content), I get the following plots:

![Patient_1_protein_clusters_core_M2_stamp_1](https://github.com/user-attachments/assets/e662c08e-4840-4373-bf2e-e6cf62e119b2)
![Patient_1_protein_clusters_umap](https://github.com/user-attachments/assets/93f577bc-f990-41aa-a2ab-2ed6b1f2ed2c)
![Patient_1_featureplots_umap_proteins](https://github.com/user-attachments/assets/ece3d48e-c86e-4dd0-a9f0-053045f0c047)

where the red blood cells are also in cluster 5.

Is it necessary to convert the metadata columns from integer to numeric?
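
To make the concern in the warning concrete, here is a minimal base-R sketch. The `intensities` vector is made up for illustration and is not taken from the patient data. It only shows how base R treats integer vectors and factor conversion; whether Seurat's `ScaleData` handles integer and double storage identically is not checked here.

```r
# Hypothetical intensity values (an assumption, not data from this issue)
intensities <- c(120L, 7L, 7L, 3500L, 42L)

# Integer vectors already count as numeric in base R
is.numeric(intensities)  # TRUE

# Direct conversion keeps the measured values
direct <- as.numeric(intensities)

# Going through a factor returns, for each value, its position among the
# sorted unique values (a dense rank), not the measurement itself
via_factor <- as.numeric(as.factor(intensities))

print(direct)
print(via_factor)

# Base R scale() gives the same z-scores for integer and double input ...
same_scaled <- isTRUE(all.equal(scale(intensities), scale(direct)))

# ... but different z-scores for the factor-coded version
rank_scaled_differs <- !isTRUE(all.equal(
  as.vector(scale(direct)),
  as.vector(scale(via_factor))))
```

If a type change is wanted at all, `as.numeric(x)` applied directly to the integer column would preserve the values, whereas the factor round trip replaces them with ranks and discards the distances between intensities.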

