Add Instanovo #51796

Merged
merged 24 commits into from
Nov 19, 2024
Changes from 6 commits
Commits
08b2877
Add `instanovo` recipe
BioGeek Oct 22, 2024
6c34a71
Add TODO comments
BioGeek Oct 22, 2024
cf1a46c
dependencies of matchms have been updated
BioGeek Oct 22, 2024
04119e0
Remove duplicate python lines, remove upper version bound
BioGeek Oct 29, 2024
3bcc1d4
Add upper python bound again
BioGeek Oct 29, 2024
d846578
Merge branch 'master' into instanovo
BioGeek Oct 29, 2024
6e0da45
unpin all packages
BioGeek Oct 29, 2024
f6a93d4
re-add pinned packages, add eigen, add doi
BioGeek Oct 29, 2024
3d9b497
remove eigen, use pytorch-gpu
BioGeek Oct 29, 2024
5b33ce3
Merge branch 'master' into instanovo
BioGeek Oct 30, 2024
26ddf5a
Merge branch 'master' into instanovo
BioGeek Oct 31, 2024
9c2b661
Merge branch 'master' into instanovo
hechth Nov 7, 2024
28477ca
Update recipes/instanovo/meta.yaml
hechth Nov 7, 2024
a5bff4f
Update recipes/instanovo/meta.yaml
hechth Nov 7, 2024
92e9668
Merge branch 'master' into instanovo
hechth Nov 7, 2024
ac53d18
Update meta.yaml
hechth Nov 7, 2024
0fbcd01
Bump version of matchms to 0.28.0
BioGeek Nov 7, 2024
01c40d1
Use pytorch instead of pytorch-gpu
BioGeek Nov 7, 2024
177596f
Update meta.yaml
hechth Nov 7, 2024
87ba17a
make numpy version a lower bound
BioGeek Nov 14, 2024
2587292
Dropping all bounds
BioGeek Nov 16, 2024
c7c05e4
Explicitly add pytorch-gpu
BioGeek Nov 16, 2024
7f423f2
Add packages which `pip check` complained about
BioGeek Nov 16, 2024
217a56f
remove `pip check`, `pytorch-gpu`, `rdkit`, `lxml`, and `scipy` from …
BioGeek Nov 17, 2024
64 changes: 64 additions & 0 deletions recipes/instanovo/meta.yaml
@@ -0,0 +1,64 @@
{% set name = "instanovo" %}
{% set version = "1.0.0" %}

package:
  name: {{ name|lower }}
  version: {{ version }}

source:
  url: https://pypi.org/packages/source/{{ name[0] }}/{{ name }}/instanovo-{{ version }}.tar.gz
  sha256: fd9cfc377d9f8da5272f96b2eb4c14c08b579d7a65466aa402601ec6c4b42672

build:
  noarch: python
  script: {{ PYTHON }} -m pip install . -vv --no-deps --no-build-isolation
  number: 0
  run_exports:
    - {{ pin_subpackage(name | lower, max_pin="x.x") }}

mfansler marked this conversation as resolved.

requirements:
  host:
    - python >=3.10,<3.12
    - setuptools >=69.1.1
    - pip
  run:
    - click >=8.1.7
    - datasets >=3.0.1
    - hydra-core >=1.3.2
    - jaxtyping >=0.2.34
    - jiwer >=3.0.4
    - matchms >=0.27.0
hechth marked this conversation as resolved.
    - neptune >=1.12.0
    - numpy >=1.26.4
    - omegaconf >=2.3.0
    - pandas >=2.2.3
    - polars >=1.9.0
    - pyopenms >=3.2.0
    - python-dotenv >=1.0.1
    - pytorch-lightning >=2.4.0
    - s3fs >=2024.6.1
    - scikit-learn >=1.5.2
    - spectrum_utils >=0.4.2
    - tensorboard >=2.18.0
    - pytorch >=2.4.1
⚠️ Potential issue

Update PyTorch and CUDA configuration

Current issues:

  1. pytorch version >=2.4.1 is not yet available
  2. cudatoolkit should have a version constraint
  3. Consider making CUDA optional for CPU-only installations

Recommendations:

  1. Use a more conservative PyTorch version that's available in conda-forge
  2. Add version constraint for cudatoolkit
  3. Consider splitting into CPU/GPU variants

Apply this diff to update the configuration:

-    - pytorch >=2.4.1
-    - cudatoolkit
+    - pytorch >=2.0.0
+    - cudatoolkit >=11.8

Also applies to: 46-46
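
The first issue above (whether any published build satisfies `pytorch >=2.4.1`) can be checked mechanically once the list of published versions is known. The following is a minimal sketch, assuming GNU `sort -V` is available and that the version list has already been collected, for example from `conda search`. The helper names `version_ge` and `any_satisfies` are hypothetical, and the version numbers in the usage lines are placeholders, not real channel contents.

```shell
#!/bin/bash
# version_ge A B: succeed when version A >= version B.
# Relies on GNU sort -V; the smaller of the two versions sorts first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# any_satisfies BOUND V1 V2 ...: succeed when at least one listed
# version meets the lower bound BOUND.
any_satisfies() {
  local bound="$1"; shift
  local v
  for v in "$@"; do
    if version_ge "$v" "$bound"; then
      return 0
    fi
  done
  return 1
}

# Placeholder version list for illustration only, NOT real channel contents.
if any_satisfies 2.4.1 2.3.1 2.4.0 2.5.1; then
  echo "lower bound is satisfiable"
else
  echo "no listed version meets the bound"
fi
```

Note that `sort -V` only approximates conda's version ordering. Pre-release or build-string suffixes may sort differently, so a borderline result should be confirmed with the conda solver itself.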

    - tqdm >=4.66.5
💡 Codebase verification

Based on the verification results, I can now provide a final response about the dependency availability concerns:

Several dependencies require attention in conda channels

Some required dependencies are not readily available in the standard conda channels:

  • pytorch package name might be pytorch-cpu or pytorch-gpu in conda-forge
  • matchms, spectrum-utils, and pyopenms were not found in conda-forge or bioconda
  • Other core dependencies like numpy, pandas, scikit-learn, pytorch-lightning, jiwer, and jaxtyping are available in conda-forge

Please ensure these packages are available in the specified versions through conda channels or consider providing alternative installation methods in the documentation.

🔗 Analysis chain

Verify dependency availability in conda channels.

Please ensure all runtime dependencies are available in conda-forge or bioconda channels with the specified versions.


Since the sandbox environment doesn't have conda installed, I'll try to verify the package availability through a different approach by checking the conda-forge feedstock repositories which contain the package build recipes.


Let me check the bioconda channel for the missing packages, particularly for the bioinformatics-related ones like matchms, spectrum-utils, and pyopenms.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check availability of key dependencies
for pkg in pytorch-lightning pytorch numpy pandas scikit-learn; do
  echo "Checking $pkg..."
  conda search -c conda-forge $pkg
done

Length of output: 1322


Script:

#!/bin/bash
# Check existence of conda-forge feedstock repositories for the dependencies
for pkg in pytorch-lightning pytorch numpy pandas scikit-learn matchms spectrum-utils pyopenms jiwer jaxtyping; do
  echo "Checking $pkg feedstock..."
  gh repo view conda-forge/${pkg}-feedstock --json name,description,url 2>/dev/null || echo "Not found: $pkg"
done

Length of output: 3695


Script:

#!/bin/bash
# Check existence of bioconda packages
for pkg in matchms spectrum-utils pyopenms; do
  echo "Checking $pkg in bioconda..."
  gh repo view bioconda/bioconda-recipes --json name,description,url 2>/dev/null && \
  rg -l "name:\s*$pkg\s*$" recipes/
done

# Also check for PyTorch which wasn't found in conda-forge
echo "Checking for pytorch package..."
rg -l "name:\s*pytorch\s*$" recipes/

Length of output: 1224
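
The scripts above hard-code the package names, so the checked list can drift from the recipe. One possible alternative is to read the names from the `run:` section of `meta.yaml` itself. This is a minimal sketch under stated assumptions: the recipe has a single `run:` list with one `- name spec` entry per line, and no Jinja expressions or selectors inside that list. The function name `run_deps` and the `sample` text are hypothetical and only illustrate the idea.

```shell
#!/bin/bash
# run_deps: read a meta.yaml on stdin and print the package names
# listed under "run:", one per line.
run_deps() {
  awk '
    /^[ \t]*run:[ \t]*$/ { in_run = 1; next }   # start of the run list
    in_run && /^[ \t]*-[ \t]/ {                 # list item, e.g. "- numpy >=1.26.4"
      sub(/^[ \t]*-[ \t]*/, "")                 # drop the leading dash
      print $1                                  # first field is the package name
      next
    }
    in_run && NF { in_run = 0 }                 # any other non-blank line ends the list
  '
}

# Illustrative excerpt shaped like the recipe in this PR (abbreviated).
sample='requirements:
  host:
    - python >=3.10,<3.12
    - pip
  run:
    - click >=8.1.7
    - matchms >=0.27.0
    - pytorch >=2.4.1

test:
  imports:
    - instanovo'

for pkg in $(printf '%s\n' "$sample" | run_deps); do
  echo "Would check: $pkg"
  # With conda installed, a real check could be:
  #   conda search -c conda-forge -c bioconda "$pkg"
done
```

Because `run_exports:` does not match the `run:` pattern, its entries are skipped, and the `host:` and `test:` lists are ignored as well.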


test:
  imports:
    - instanovo
  commands:
    - pip check
    - python -c "import instanovo; print(instanovo.__version__)"
  requires:
    - pip

about:
  home: https://github.com/instadeepai/instanovo
  summary: De novo peptide sequencing with InstaNovo
  license: Apache-2.0
  license_file: LICENSE.md

extra:
  recipe-maintainers:
    - BioGeek