STACAS semi-supervised integration completes but during anchor finding throws Error in if (totalCols == 0) return(NULL) : argument is of length zero

Hi,
I am running STACAS on a large Seurat object (≈281k cells) split into 57 batches with very unbalanced sample sizes, and I would like to clarify whether the behaviour I observe is expected or indicates a problem with the integration.
Below I summarize the three approaches I tried. All three run to completion, but the two semi-supervised runs print an error during anchor finding, even though the integrated object and downstream UMAP are still produced.

**Dataset / setup**
- Seurat / SeuratObject recently updated to Seurat v5; RNA assay, log-normalized
- Total cells in the object: ~281,600
- Number of samples (`orig.ident`, used as batch): 57
- Highly unbalanced batches:
  - 11/57 samples have < 1,000 cells
  - Smallest sample: 265 cells
  - Largest sample: 13,669 cells
- `anchor.features = 1000`

**Cell labels used for semi-supervised mode**
- The `clusters` metadata column contains ~38 annotated clusters
- Annotated cells: ~38,800
- Unannotated cells (`NA`): 242,828 (the majority of the dataset)
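
To make the label coverage easier to inspect, here is a minimal base-R sketch that counts annotated cells per sample and the number of labels shared by each pair of samples. The `meta` data frame is a toy stand-in (an assumption for illustration); on the real object it would be `All1@meta.data`. Whether STACAS actually skips pairs with no shared labels is exactly what I am asking, so this only summarizes the data and does not reproduce any STACAS logic.

```r
# Toy stand-in for All1@meta.data; on the real object use: meta <- All1@meta.data
meta <- data.frame(
  orig.ident = c("S1", "S1", "S1", "S2", "S2", "S3", "S3", "S3"),
  clusters   = c("T", "B", NA, "T", NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Number of annotated (non-NA) cells per sample
labeled_per_sample <- tapply(!is.na(meta$clusters), meta$orig.ident, sum)

# Distinct labels present in each sample (empty for fully unannotated samples)
labels_by_sample <- lapply(split(meta$clusters, meta$orig.ident),
                           function(x) unique(x[!is.na(x)]))

# Number of labels shared by every pair of samples (symmetric matrix)
samples <- names(labels_by_sample)
shared <- sapply(samples, function(a)
  sapply(samples, function(b)
    length(intersect(labels_by_sample[[a]], labels_by_sample[[b]]))))

print(labeled_per_sample)
print(shared)
```

Samples with zero annotated cells, or pairs with zero shared labels, would be the first candidates to check against the error described below.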

**Method 1: Stepwise STACAS, semi-supervised, ndim = 28**
ndim was chosen from the PCA as the number of components needed to reach 95% cumulative variance.
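
For illustration, a 95% cumulative-variance cutoff can be computed as in the sketch below. It uses toy data and base-R `prcomp`, so it is not the exact code run on the real object; with Seurat the standard deviations would come from `Stdev(All1, reduction = "pca")`, in which case the percentages are relative to the computed PCs only, not to the total variance of the data.

```r
# Toy matrix standing in for the scaled expression data (cells x features)
set.seed(1)
x <- matrix(rnorm(200 * 10), nrow = 200)

pca <- prcomp(x)

# Fraction of variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Smallest number of components whose cumulative variance reaches 95%
ndim_95 <- which(cumsum(var_explained) >= 0.95)[1]
print(ndim_95)
```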

```{r}
library(Seurat)
library(STACAS)

nfeatures <- 1000
ndim <- 28

# One Seurat object per sample (batch)
obj.list <- SplitObject(All1, split.by = "orig.ident")
for (n in seq_along(obj.list)) {
  print(n)
  print(obj.list[[n]])
  Idents(obj.list[[n]]) <- "clusters"
}

# Semi-supervised anchor finding, using the "clusters" metadata column as labels
stacas_anchors <- FindAnchors.STACAS(obj.list,
                                     anchor.features = nfeatures,
                                     dims = 1:ndim,
                                     cell.labels = "clusters")

# Guide tree that sets the integration order
st1 <- SampleTree.STACAS(
  anchorset = stacas_anchors,
  obj.names = names(obj.list))

object_integrated <- IntegrateData.STACAS(stacas_anchors,
                                          sample.tree = st1,
                                          dims = 1:ndim)

object_integrated <- object_integrated %>% ScaleData() %>%
  RunPCA(npcs = ndim) %>% RunUMAP(dims = 1:ndim)
```
This finishes successfully, but during `FindAnchors.STACAS` I see errors like:
`Error in if (totalCols == 0) return(NULL) : argument is of length zero`
The pipeline does not stop and still produces an integrated object.
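
For context, base R produces this message whenever the condition of an `if` has length zero, for example when the value being compared is `NULL`. The sketch below reproduces the message in isolation; the variable name `totalCols` is borrowed from the error text, and I do not know where in STACAS or its dependencies the real check lives, so this only illustrates the mechanism.

```r
# ncol() returns NULL for a plain vector, so `totalCols == 0` has length zero
v <- c(1, 2, 3)
totalCols <- ncol(v)

# Capture the error message instead of stopping the script
res <- tryCatch(
  if (totalCols == 0) "empty" else "non-empty",
  error = function(e) conditionMessage(e)
)
print(res)
```

This suggests the failing check received a missing value rather than a literal zero, which is why I am unsure whether the affected comparison was skipped or silently dropped.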

**Method 2: One-liner Run.STACAS, semi-supervised, ndim = 20**

```{r}
library(Seurat)
library(STACAS)
nfeatures <- 1000
ndim <- 20
Idents(All1) <- "clusters"
object_integrated1 <- All1 %>%
  SplitObject(split.by = "orig.ident") %>%
  Run.STACAS(dims = 1:ndim, anchor.features = nfeatures, cell.labels = "clusters") %>%
  RunUMAP(dims = 1:ndim)
```
This also runs to completion, but the same error message appears again during the run:
Warning: sparse->dense coercion: allocating vector of size 1.0 GiB
Warning: pseudoinverse used at -2.2162
Warning: neighborhood radius 0.30103
Warning: reciprocal condition number  1.4523e-14
Preparing PCA embeddings for objects...
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=44s  
  |+++++++                                           | 13% ~07h 26m 18s
Error in if (totalCols == 0) return(NULL) : argument is of length zero
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=06h 03m 47s

**Method 3: One-liner Run.STACAS, unsupervised, ndim = 20**

```{r}
object_integrated2 <- All1 %>%
  SplitObject(split.by = "orig.ident") %>%
  Run.STACAS(dims = 1:ndim, anchor.features = nfeatures) %>%
  RunUMAP(dims = 1:ndim)
```

This version:
- finishes without explicit errors
- produces an integrated object and UMAP

**Questions**
1. Is it expected that STACAS completes even when `FindAnchors.STACAS` raises this error at the `totalCols == 0` check?
2. When using `cell.labels`, does this error indicate that some batch pairs have no compatible anchors and are effectively skipped?
3. Is there a recommended way to diagnose which datasets or batch pairs fail to form anchors?
4. For large, heterogeneous datasets, is semi-supervised STACAS still recommended, or should the unsupervised mode be preferred?

Thank you very much for your help, and for developing STACAS!
