Implement MOL-to-SDF file conversion; change the crawl domain to the old domain #4

Open
wants to merge 1 commit into
base: main
from

@dsxksss dsxksss commented Sep 11, 2024

Fix issues caused by an outdated pandas dependency
Delete the data files generated by crawling

Summary by Sourcery

Implement conversion of MOL files to SDF format and update the domain used for data retrieval to the old domain.

New Features:

  • Add functionality to convert MOL files to SDF format using RDKit and save them to a specified path.
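The PR's RDKit-based code is not shown in this conversation, so the following is only a format-level sketch and deliberately does not use RDKit. It relies on the fact that one SDF record is a MOL block, followed by optional data items, followed by a `$$$$` delimiter line. The function name `mol_block_to_sdf_record` and its `properties` parameter are hypothetical, chosen for illustration.

```python
from typing import Mapping, Optional


def mol_block_to_sdf_record(mol_block: str,
                            properties: Optional[Mapping[str, str]] = None) -> str:
    """Wrap one MOL block as one SDF record (hypothetical helper, not the PR's code)."""
    lines = mol_block.rstrip("\n").splitlines()
    # A MOL block is terminated by an "M  END" line; reject input without one.
    if not any(line.strip() == "M  END" for line in lines):
        raise ValueError("input does not look like a MOL block (no 'M  END' line)")
    out = list(lines)
    for name, value in (properties or {}).items():
        out.append(f">  <{name}>")  # data item header
        out.append(str(value))      # data item value
        out.append("")              # a blank line ends the data item
    out.append("$$$$")              # record delimiter
    return "\n".join(out) + "\n"
```

Concatenating several such records yields a multi-molecule SDF file. A toolkit such as RDKit additionally parses and validates the structure, which this sketch does not attempt.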

Enhancements:

  • Update the root URL and other related URLs in the TcmspSpider class to use the old domain for data retrieval.
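One way to keep a domain switch like this to a single-line change is to derive every request URL from one root value. The sketch below is an assumption about how that could look, not the PR's `TcmspSpider`: the class name `SpiderConfig`, the method `url_for`, the `https` scheme, and the page name `example.php` are all placeholders; only the host names come from the review summary.

```python
from urllib.parse import urljoin

# Assumption: the https scheme is a placeholder; the host is the "old" domain
# named in the review summary.
OLD_ROOT_URL = "https://old.tcmsp-e.com/"


class SpiderConfig:
    """Hypothetical holder for the root URL (not the PR's TcmspSpider)."""

    def __init__(self, root_url: str = OLD_ROOT_URL) -> None:
        self.root_url = root_url

    def url_for(self, page: str) -> str:
        # Build every request URL from root_url so a domain change touches one value.
        return urljoin(self.root_url, page)
```

With this shape, a second module that fetches data can import the same root value instead of hard-coding the host a second time.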

Implement MOL-to-SDF file conversion
Fix issues caused by outdated dependencies
Delete the data files generated by crawling

sourcery-ai bot commented Sep 11, 2024

Reviewer's Guide by Sourcery

This pull request implements changes to the TCMSP (Traditional Chinese Medicine Systems Pharmacology) spider, including updating the domain URL, adding functionality to convert MOL files to SDF format, and making minor adjustments to existing code. The changes primarily focus on improving data retrieval and export capabilities.

File-Level Changes

Change: Updated the domain URL from 'tcmsp-e.com' to 'old.tcmsp-e.com'
Details:
  • Changed root_url in TcmspSpider class
  • Updated URL in get_data function
Files: src/tcmsp.py, src/get_all_data.py

Change: Added functionality to convert MOL files to SDF format
Details:
  • Imported required libraries (rdkit, tempfile, tqdm)
  • Implemented mol2sdf method in TcmspSpider class
  • Added call to mol2sdf method in get_herb_data
Files: src/tcmsp.py

Change: Made minor adjustments to existing code
Details:
  • Added trailing commas to function calls
  • Improved error handling and logging in mol2sdf method
Files: src/tcmsp.py


@sourcery-ai sourcery-ai bot left a comment

Hey @dsxksss - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider improving error handling in the mol2sdf function. Currently, errors are just printed, which might not be sufficient for production use.
  • The mol2sdf function processes files sequentially. For large datasets, consider implementing batch processing or parallel execution to improve performance.
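
Both comments can be addressed together in a rough sketch: record failures in a returned report instead of only printing them, and submit the per-file work to a pool. Everything named here (`convert_all`, `convert_one`, `ConversionReport`, `max_workers`) is hypothetical and not taken from the PR; the per-file converter is passed in as a callable so that an RDKit-based routine could be supplied by the caller. A thread pool is used for simplicity; if the conversion is CPU-bound, a process pool may be the better fit.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict, Iterable, List

logger = logging.getLogger(__name__)


@dataclass
class ConversionReport:
    """Outcome of a batch run: written files and per-file error messages."""
    converted: List[Path] = field(default_factory=list)
    failed: Dict[Path, str] = field(default_factory=dict)


def convert_all(mol_paths: Iterable[Path],
                out_dir: Path,
                convert_one: Callable[[Path, Path], None],
                max_workers: int = 4) -> ConversionReport:
    """Run convert_one(src, dst) for every MOL path, collecting failures."""
    out_dir.mkdir(parents=True, exist_ok=True)
    report = ConversionReport()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {}
        for src in mol_paths:
            dst = out_dir / (src.stem + ".sdf")
            futures[pool.submit(convert_one, src, dst)] = (src, dst)
        for future in as_completed(futures):
            src, dst = futures[future]
            try:
                future.result()  # re-raises any exception from the worker
            except Exception as exc:
                # Log and record the failure so the caller can act on it.
                logger.warning("failed to convert %s: %s", src, exc)
                report.failed[src] = str(exc)
            else:
                report.converted.append(dst)
    return report
```

The caller can then inspect `report.failed` and decide whether to retry, skip, or abort, rather than relying on console output.
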
Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good
