To identify doublets, we implement a mixture model that calculates the likelihood that the sequence reads in a droplet originated from two individuals; the singlet and doublet likelihoods are then compared to determine whether the droplet contains cells from one or two samples. If sequence reads from the c-th droplet originate from two different samples, s_1 and s_2, with mixing proportions (1 − α) : α, then the likelihood in (1) can be represented as the following mixture distribution [18]:

$$
L_c(s_1, s_2, \alpha) = \prod_{v=1}^{V} \left[ \sum_{g_1, g_2} \left\{ \prod_{i=1}^{d_{cv}} \left( \sum_{e=0}^{1} \bigl[ (1-\alpha)\Pr(b_{cvi} \mid g_1, e) + \alpha \Pr(b_{cvi} \mid g_2, e) \bigr] \right) P_{s_1 v}(g_1)\, P_{s_2 v}(g_2) \right\} \right]
$$
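The structure of this likelihood can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the data layout (one list of 0/1 allele flags per variant), and the default error rate are hypothetical. The emission term is also an assumption: an error (e = 1) is modelled as flipping the observed allele, and each error state is weighted by an assumed per-read error probability `err`, which the formula leaves implicit.

```python
import math
from itertools import product


def read_prob(is_alt, g, e, err):
    """Assumed emission term for one read: Pr(b | g, e) weighted by Pr(e).

    g is the alternate-allele count (0, 1 or 2); e = 1 marks a base-call error.
    Hypothetical model: an error flips the observed allele.
    """
    p_alt = g / 2.0 if e == 0 else 1.0 - g / 2.0
    p_b = p_alt if is_alt else 1.0 - p_alt
    return p_b * (err if e == 1 else 1.0 - err)


def doublet_log_likelihood(reads, gp1, gp2, alpha, err=0.01):
    """Return log L_c(s1, s2, alpha) for a single droplet.

    reads[v]       : list of 0/1 flags (1 = alternate allele) at variant v
    gp1[v], gp2[v] : genotype probabilities (g = 0, 1, 2) of s1 and s2 at v
    alpha          : mixing proportion of sample s2
    """
    total = 0.0
    for v_reads, p1, p2 in zip(reads, gp1, gp2):
        site = 0.0
        # Sum over all genotype pairs (g1, g2) at this variant.
        for g1, g2 in product(range(3), repeat=2):
            per_reads = 1.0
            # Product over reads; each read is a (1 - alpha) : alpha mixture.
            for b in v_reads:
                per_reads *= sum(
                    (1 - alpha) * read_prob(b, g1, e, err)
                    + alpha * read_prob(b, g2, e, err)
                    for e in (0, 1)
                )
            site += per_reads * p1[g1] * p2[g2]
        # The product over variants becomes a sum in log space.
        total += math.log(site)
    return total
```

With `alpha = 0` the inner mixture depends only on `g1`, so the sum over `g2` collapses and the function reduces to a single-sample likelihood. A doublet could then be assessed by evaluating the function over a small grid of `alpha` values and comparing the best value with the single-sample result.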