I used the following code to cluster the red blood cells separately from the other cells so that they can be removed:

```
analyze_proteins <- function(patient_data) {

  metadata_proteins_columns <- c(
    "Mean.PanCK", "Max.PanCK",
    "Mean.CD68", "Max.CD68",
    "Mean.Membrane", "Max.Membrane",
    "Mean.CD45", "Max.CD45",
    "Mean.DAPI", "Max.DAPI")

  # Convert each metadata column via factor level codes
  for (col in metadata_proteins_columns) {
    patient_data[[col]] <- as.numeric(as.factor(patient_data@meta.data[[col]]))
  }

  # Extract proteins features and replace underscores in column names
  proteins_features <- patient_data@meta.data[, metadata_proteins_columns]
  colnames(proteins_features) <- gsub("_", "-", colnames(proteins_features))
  proteins_matrix <- as.matrix(proteins_features)

  # Ensure row names (cell names) match the Seurat object cell names
  rownames(proteins_matrix) <- rownames(patient_data@meta.data)

  # Create a new assay with the proteins data
  proteins_assay <- CreateAssay5Object(counts = t(proteins_matrix))

  # Add the new assay to the Seurat object with the name "proteins"
  patient_data[["proteins"]] <- proteins_assay
  DefaultAssay(patient_data) <- "proteins"

  # Scale the raw metadata values
  patient_data <- ScaleData(
    patient_data,
    layer = "counts",
    assay = "proteins",
    features = metadata_proteins_columns,
    do.center = TRUE,
    do.scale = TRUE)

  # Run PCA and specify the number of PCs to compute
  # Only 10 features are available, so compute 9 PCs
  n_pcs <- 9
  patient_data <- RunPCA(
    patient_data,
    assay = "proteins",
    features = metadata_proteins_columns,
    npcs = n_pcs,
    reduction.name = "pca_proteins",
    reduction.key = "PCPR_",
    seed.use = 2)

  # Use the available number of PCs for FindNeighbors and clustering
  patient_data <- FindNeighbors(
    patient_data,
    dims = 1:n_pcs,
    assay = "proteins",
    reduction = "pca_proteins")
  patient_data <- FindClusters(
    patient_data,
    resolution = 0.5,
    assay = "proteins",
    cluster.name = "protein_clusters",
    graph.name = "proteins_snn",
    random.seed = 2)

  # Run UMAP for visualization
  patient_data <- RunUMAP(
    patient_data,
    assay = "proteins",
    dims = 1:n_pcs,
    reduction = "pca_proteins",
    reduction.name = "umap_proteins",
    reduction.key = "UMAPPR_",
    seed.use = 2)

  return(patient_data)
}
```

which resulted in the following plots:

where the red blood cells are grouped in cluster 5.

----

If I remove this loop:

```
for (col in metadata_proteins_columns) {
  patient_data[[col]] <- as.numeric(as.factor(patient_data@meta.data[[col]]))
}
```

which converts the columns used for clustering from integer to numeric (WARNING: `as.numeric(as.factor(x))` returns the factor level codes, i.e. the rank of each distinct value, not the original numeric content), I get the following plots:

where the red blood cell cluster is again number 5.

Is it necessary to convert the metadata columns from integer to numeric?
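To make the warning about factor levels concrete, here is a minimal base-R sketch. The vector `x` holds made-up toy values (an assumption for illustration, not data from the patient object). It compares the conversion used in the loop with a direct `as.numeric()` call, and checks that base R's `scale()` gives the same result for integer and double input. It does not check how the Seurat functions themselves handle integer input.

```r
# Toy integer intensities (hypothetical values, not taken from the dataset)
x <- c(10L, 200L, 10L, 3000L, 45L)

# Conversion used in the loop: each value is replaced by the index of its
# level, where levels are the sorted distinct values (10, 45, 200, 3000)
codes <- as.numeric(as.factor(x))
print(codes)

# Direct conversion: the original magnitudes are preserved
direct <- as.numeric(x)
print(direct)

# Base R centering and scaling gives the same result for integer and double input
same_scaled <- isTRUE(all.equal(as.vector(scale(x)), as.vector(scale(direct))))
print(same_scaled)
```

In this toy case the factor route maps 3000 to 4 and 200 to 3, so the distances between values are lost and only their order remains. This may explain why the two sets of plots differ even though the red blood cells end up in their own cluster both times.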
The issue was opened by AlbertoFabbri93 and has received 0 comments.