LINCS hit query #14

Open
gwaybio opened this issue Jun 24, 2021 · 7 comments

@gwaybio
Member

gwaybio commented Jun 24, 2021

@michaelbornholdt presented an analysis on querying "hits" during the profiling check-in today.

I'd love to be able to include this analysis in the LINCS complementarity paper. Michael estimated that his analysis would take ~2 hours.

Specification

  • Perform hit analysis in L1000 data
  • Perform hit analysis in Cell Painting data (CellProfiler features)

Input data

  • L1000 profiles here
  • Cell Painting profiles here

Output

I think I need to understand the "hit" analysis better. Given a compound, you're asking if you match another replicate as the top hit? So essentially the output is a ranked list of matches per compound and whether or not they map to the same category?

If so, then can you output the following data frame?

| target_compound | compound_to_match | rank | same_replicate | same_MOA |
|---|---|---|---|---|
| X | Y | 1 | True | True |
| X | Z | 2 | False | True |
| and so on... | ... | ... | ... | ... |

Let's iterate on the final output specifications if my understanding above is limited in any way.

A couple preliminary figures and statistics would also be helpful.
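
A minimal sketch of how the ranked-match table described above could be built, for illustration only. It assumes Pearson correlation as the similarity measure (the thread does not specify one), reads `same_replicate` as "the match is another profile of the same compound", and uses made-up toy profiles. The names `pearson`, `rank_matches`, `compound_of`, and `moa_of` are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences.

    Assumes neither sequence is constant (zero variance would divide by zero).
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def rank_matches(profiles, compound_of, moa_of):
    """Rank every other profile against each query profile, most similar first.

    profiles:    profile id -> list of feature values
    compound_of: profile id -> compound name
    moa_of:      compound name -> MOA label
    """
    rows = []
    for query_id, query in profiles.items():
        scored = [
            (pearson(query, features), other_id)
            for other_id, features in profiles.items()
            if other_id != query_id
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        target = compound_of[query_id]
        for rank, (_, other_id) in enumerate(scored, start=1):
            match = compound_of[other_id]
            rows.append({
                "target_compound": target,
                "compound_to_match": match,
                "rank": rank,
                "same_replicate": match == target,
                "same_MOA": moa_of[match] == moa_of[target],
            })
    return rows


# Toy data (invented for this sketch, not taken from the LINCS profiles).
profiles = {
    "X_rep1": [1.0, 2.0, 3.0, 4.0],
    "X_rep2": [1.1, 2.1, 2.9, 4.2],
    "Y_rep1": [1.0, 3.0, 2.0, 4.0],
    "Z_rep1": [4.0, 3.0, 2.0, 1.0],
}
compound_of = {"X_rep1": "X", "X_rep2": "X", "Y_rep1": "Y", "Z_rep1": "Z"}
moa_of = {"X": "moa_a", "Y": "moa_a", "Z": "moa_b"}

rows = rank_matches(profiles, compound_of, moa_of)
```

The resulting list of dicts has one row per (query, match) pair and can be passed to a data frame constructor if a df is preferred.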

Motivation

  • I think it'll be great to include this in the LINCS complementarity project b/c it aligns with one of the LINCS goals of being able to query signatures and infer function by guilt-by-association
  • It will be a cool analysis to compare the EfficientNet features vs. CellProfiler features (even though we will not include EfficientNet in this paper)
  • If you perform this analysis and it makes it into the paper, we would offer you authorship (which would also require you to review the final submission and stand by its claims)

@michaelbornholdt

Thanks for setting up this issue. I agree with the motivation and the authorship, if we include this.

I can definitely give you that output, but it will take a bit longer to build that df. Your proposed df also has a lot more information than needed for the simple histogram.
But overall it's best because I will want to put this onto cytominer-eval anyway.

@michaelbornholdt

We should also add the MOAs of both compounds to that df, since that will allow for more MOA-focused plots instead of counting the compounds.

@gwaybio
Member Author

gwaybio commented Jun 24, 2021

Sounds good! I am aiming for a July 1 submission, so in order for us to include it, I will need it before then.

Two additional points that you might want to consider:

  • One path to getting this on cytominer-eval is to use this analysis as a test case. Apply it without the constraints of a package, learn how you need to wrangle the input and output data, what info is required, etc. and only after you've validated the output, code the general case.
  • The MOA information can be included in this output for sure. I was thinking not to, since it can be acquired with a simple join on compound name and it will substantially increase data size. I agree it is nice to have though, so it's definitely up to you.
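
For reference, the "simple join on compound name" could look roughly like the sketch below. The column names `target_moa` and `match_moa`, the helper `add_moa_columns`, and the toy rows are all hypothetical. With a df, the same step would be a merge on the compound column.

```python
# Hypothetical compound -> MOA annotation lookup.
moa_by_compound = {"X": "moa_a", "Y": "moa_a", "Z": "moa_b"}

# Hit table stored without MOA columns to keep it small.
hits = [
    {"target_compound": "X", "compound_to_match": "Y", "rank": 1},
    {"target_compound": "X", "compound_to_match": "Z", "rank": 2},
]


def add_moa_columns(hits, moa_by_compound):
    """Left-join MOA labels onto each hit row by compound name.

    Compounds missing from the lookup get None, and same_MOA is False for them.
    The input rows are not modified.
    """
    annotated = []
    for row in hits:
        target_moa = moa_by_compound.get(row["target_compound"])
        match_moa = moa_by_compound.get(row["compound_to_match"])
        annotated.append({
            **row,
            "target_moa": target_moa,
            "match_moa": match_moa,
            "same_MOA": target_moa is not None and target_moa == match_moa,
        })
    return annotated


annotated_hits = add_moa_columns(hits, moa_by_compound)
```

Keeping the MOA labels in a separate lookup means the stored hit table stays small, and the labels are only attached when a plot needs them.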

Thanks!

@michaelbornholdt

Yea good points!
Not very familiar with test-driven development but I could try it :)

@michaelbornholdt

@gwaygenomics
The data you linked up there is level3. I just require level5 data. I misspoke in the meeting!

Can you point me to which level5 data I should be using?

@gwaybio
Member Author

gwaybio commented Jun 24, 2021

level 5 links:
Cell Painting
L1000

Thanks!

@gwaybio
Member Author

gwaybio commented Jul 1, 2021

@michaelbornholdt produced the analysis here: broadinstitute/neural-profiling#2 (comment)

I will ingest the output files in this repo for visualization
