rank.logFC.detected Produces Dubious Rankings #116

Open
DarioS opened this issue Jan 24, 2024 · 4 comments

Comments


DarioS commented Jan 24, 2024

There is already an lfc filtering parameter. A useful missing counterpart would be min.detected, which would apply to the self or other detected proportion.

> orderDetected <- order(cluster1$rank.logFC.detected)
> cluster1[orderDetected[1:10], c(1:4, 9, 14, 19)]
DataFrame with 10 rows and 7 columns
        self.average other.average self.detected other.detected rank.logFC.cohen  rank.AUC rank.logFC.detected
           <numeric>     <numeric>     <numeric>      <numeric>        <integer> <integer>           <integer>
PAX7       0.0188465  0.0000877382     0.0115113   0.0000830314            13833     12014                   1
ECRG4      0.0542055  0.0015406061     0.0400046   0.0008963788            12084     10431                   1
NNMT       4.3730723  2.1111045671     0.9563483   0.6148204061                1         1                   1
MYF5       2.2181507  0.0330166809     0.5924322   0.0156052101                5         5                   1
RPS27L     4.2351264  2.9271708476     0.9731023   0.7666684260                1         1                   1
CALM2      4.5810915  3.3270106207     0.9658081   0.7676900297                2         2                   2
MLIP       0.0222212  0.0031465257     0.0178938   0.0033352642            14273     15185                   2
SOD2       3.9477973  2.6287379749     0.8990198   0.7366703714               19         5                   2
RARRES2    1.8958659  0.3565642800     0.4678596   0.1429341960                9        16                   2
MAG        0.0231304  0.0008944649     0.0153864   0.0008786545            13578     11628                   2

PAX7 is biologically a dubious marker gene if it only appears in 1.15% of cells of a cluster. MAG is another case.

Collaborator

LTLA commented Jan 25, 2024

lfc doesn't actually do any filtering, but is instead a TREAT-like threshold for the calculation of p-values. Basically it's the mu in t.test. Nothing is explicitly filtered out when you set lfc, the shape of the DataFrame remains unchanged.
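
To make the mu analogy concrete, here is a minimal sketch using base R's t.test on simulated values. The data, the group means and the threshold of 1 are made up for illustration; this is not scran's own code, only the same idea of shifting the null hypothesis instead of filtering.

```r
# Simulated log-expression values for one gene in two groups (illustrative only).
set.seed(1)
self  <- rnorm(50, mean = 1.5)
other <- rnorm(50, mean = 0)

# Null hypothesis: the difference in means is at most 0.
p_zero <- t.test(self, other, mu = 0, alternative = "greater")$p.value

# Null hypothesis: the difference in means is at most 1 (a shifted threshold).
p_shifted <- t.test(self, other, mu = 1, alternative = "greater")$p.value

# The gene is still tested and still reported; only its p-value grows.
c(mu0 = p_zero, mu1 = p_shifted)
```

Raising mu never removes a gene; it only makes the test harder to pass, which is why the DataFrame keeps the same shape.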

If you want an equivalent experience for the minimum detected proportion, you'd have to figure out what the null hypothesis becomes. I suppose we could just require a minimum absolute increase in the detected proportion, equivalent to bumping up the p for a one-sided binom.test (which is the analogous test for the detected proportions).
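
A rough sketch of that idea with base R's binom.test follows. The cell counts and the required increase (min.delta) are hypothetical, chosen to resemble the PAX7 row above; scran does not currently expose such an argument.

```r
# Hypothetical counts: the gene is detected in 12 of 1000 cells of the cluster.
n_cells    <- 1000
n_detected <- 12

# Detected proportion in the other clusters, taken as the baseline.
other_prop <- 0.0001

# Hypothetical minimum absolute increase in the detected proportion.
min.delta <- 0.1

# Plain one-sided test: is the proportion greater than in the other clusters?
p_plain <- binom.test(n_detected, n_cells, p = other_prop,
                      alternative = "greater")$p.value

# Same test with p bumped up by the required increase.
p_bumped <- binom.test(n_detected, n_cells, p = other_prop + min.delta,
                       alternative = "greater")$p.value

c(plain = p_plain, bumped = p_bumped)
```

Under the plain test a gene detected in about 1% of cells looks highly significant, while the bumped null makes it unremarkable, without dropping it from the results.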

If you really just want to filter, you can do that outside of the function. It's a pretty complicated function already and I don't want to add more arguments, and also, I like keeping the DataFrame shape consistent across all clusters.
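
For example, filtering after the fact might look like the sketch below. It uses a plain data.frame holding five rows copied from the output above so that it runs without Bioconductor; a DataFrame should accept the same subsetting calls. The cutoff min.detected is an arbitrary illustrative value, not a scran parameter.

```r
# Five rows copied from the reported output for cluster1.
cluster1 <- data.frame(
  self.detected = c(0.0115113, 0.0400046, 0.9563483, 0.5924322, 0.9731023),
  other.detected = c(0.0000830314, 0.0008963788, 0.6148204061,
                     0.0156052101, 0.7666684260),
  rank.logFC.detected = c(1L, 1L, 1L, 1L, 1L),
  row.names = c("PAX7", "ECRG4", "NNMT", "MYF5", "RPS27L")
)

# Arbitrary illustrative cutoff on the proportion of cells expressing the gene.
min.detected <- 0.25

# Keep only genes detected in enough cells of the cluster, then order by rank.
keep     <- cluster1$self.detected >= min.detected
filtered <- cluster1[keep, , drop = FALSE]
filtered <- filtered[order(filtered$rank.logFC.detected), , drop = FALSE]

rownames(filtered)
```

The unfiltered table is left untouched, so its shape stays consistent across clusters and only the filtered copy differs.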

Author

DarioS commented Jan 25, 2024

> a minimum absolute increase in the detected proportion

That sounds good.

Collaborator

LTLA commented Feb 3, 2024

Update on this: while I think it's a good idea, I just don't have the time to work on it. I'd be happy to take a PR, though someone will have to delve into the C++ code to implement this change.

I'll also note that the proposed libscran-based replacement for scran will use the "delta detected" as one of its effect sizes for ranking, which is pretty much what is being proposed here, so you could just wait until that hits the shelves.
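
As a rough illustration of why such an effect size helps, the sketch below ranks five genes from the output above by the simple difference self.detected minus other.detected. This is an assumption about what "delta detected" means in its simplest form; how libscran computes and summarizes it across pairs of clusters may differ.

```r
# Detected proportions copied from the reported output for cluster1.
detected <- data.frame(
  self.detected = c(0.0115113, 0.0400046, 0.9563483, 0.5924322, 0.9731023),
  other.detected = c(0.0000830314, 0.0008963788, 0.6148204061,
                     0.0156052101, 0.7666684260),
  row.names = c("PAX7", "ECRG4", "NNMT", "MYF5", "RPS27L")
)

# Simple difference in detected proportions (assumed form of "delta detected").
detected$delta.detected <- detected$self.detected - detected$other.detected

# Rank genes from the largest to the smallest difference.
ranked <- detected[order(detected$delta.detected, decreasing = TRUE), ]
rownames(ranked)
```

On these rows MYF5 comes first and PAX7 last, because an absolute difference cannot be large when the gene is detected in only about 1% of cells.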

Author

DarioS commented Feb 4, 2024

libscran certainly looks worth waiting for! I have not written C++ code in over twelve years, so it is best that I not meddle with it.
