BBB example for LIME #142

Open
wants to merge 42 commits into
base: main
Commits
a5787ac
Groomed solubility notebook for docs website
hgandhi2411 May 7, 2022
92c3a61
Merge with origin main
hgandhi2411 May 11, 2022
1449bbd
Merge branch 'main' of https://github.com/ur-whitelab/exmol
hgandhi2411 May 12, 2022
b6440e0
Merge branch 'main' of https://github.com/ur-whitelab/exmol
hgandhi2411 May 20, 2022
51e830a
Merge branch 'main' of https://github.com/ur-whitelab/exmol
hgandhi2411 Jun 1, 2022
dcacfdf
Merge branch 'main' of https://github.com/ur-whitelab/exmol
hgandhi2411 Jun 9, 2022
5e19275
Merge branch 'main' of https://github.com/ur-whitelab/exmol
hgandhi2411 Jun 20, 2022
f9176c7
Merge branch 'main' of https://github.com/ur-whitelab/exmol
hgandhi2411 Nov 1, 2022
d9558a5
Merge branch 'main' of https://github.com/ur-whitelab/exmol
hgandhi2411 Nov 19, 2022
9d12551
Merge branch 'main' of https://github.com/ur-whitelab/exmol
hgandhi2411 Dec 27, 2022
835f5a8
Merge branch 'main' of https://github.com/ur-whitelab/exmol
hgandhi2411 Feb 19, 2023
9467f57
Added langchain and improve NLE
whitead Feb 22, 2023
7ccd2a1
Tweaked prompt
whitead Feb 22, 2023
0940717
Updated notebook
whitead Feb 22, 2023
2b06865
Added correlation direction
whitead Feb 22, 2023
1131b67
More work on prompt
whitead Feb 22, 2023
e75480b
Updated prompt for single vs multi molecules
whitead Feb 24, 2023
3a8c20f
More prompt refinement
whitead Feb 24, 2023
92f4203
Model changes
whitead Feb 24, 2023
1a37f5b
Removed oai key dependency
whitead Feb 24, 2023
c321469
Updated notebook experiments
whitead Mar 2, 2023
3338de8
Merge branch 'nle-lc' of https://github.com/ur-whitelab/exmol into nl…
hgandhi2411 Mar 8, 2023
27b775c
Fixed old text generate code
whitead Mar 8, 2023
174cd59
Addded OAI key
whitead Mar 8, 2023
7b552e5
merge with main
hgandhi2411 Mar 9, 2023
86fbfb7
Merge branch 'nle-lc' of https://github.com/ur-whitelab/exmol into nl…
hgandhi2411 Mar 9, 2023
3b55017
Added openai to dev requirements
whitead Mar 9, 2023
6e83594
Added OAI Key and uncommented explains
whitead Mar 9, 2023
2de9cb5
Adding BBB notebook
hgandhi2411 Nov 27, 2023
218883b
merge with main
hgandhi2411 Nov 27, 2023
83096c4
Merge branch 'nle-lc' of https://github.com/ur-whitelab/exmol into nl…
hgandhi2411 Nov 27, 2023
1a40851
testing pre-commit
hgandhi2411 Dec 4, 2023
2e7a69a
remove langchain dependency
geemi725 Dec 7, 2023
a932c8b
remove langchain from setup.py
geemi725 Dec 7, 2023
a6869dd
Merge branch 'main' of https://github.com/ur-whitelab/exmol into nle-lc
hgandhi2411 Dec 7, 2023
bf6aeb3
Fixed pre-commit phew
hgandhi2411 Dec 7, 2023
2a414aa
Merge branch 'issue-144' of https://github.com/ur-whitelab/exmol into…
hgandhi2411 Dec 7, 2023
c14cbd7
merge with Geemi's changes
hgandhi2411 Dec 7, 2023
558393b
Ran openai migrate, test_synspace_anybond failing?
hgandhi2411 Dec 7, 2023
89924f4
made suggested change in chat completion response output
hgandhi2411 Dec 7, 2023
8ff1371
py310 seems to resolve typing TypeAliasType import error
hgandhi2411 Dec 7, 2023
2ec0797
Try MolToRandomSmilesVect with a ramsom seed for consist chem space
hgandhi2411 Jun 2, 2024
4 changes: 2 additions & 2 deletions .github/workflows/paper.yml
@@ -13,10 +13,10 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v2
-      - name: Set up Python 3.8
+      - name: Set up Python 3.10
         uses: actions/setup-python@v2
         with:
-          python-version: "3.8"
+          python-version: "3.10"
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -1,12 +1,12 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v2.2.3
rev: v4.5.0
hooks:
- id: trailing-whitespace
- id: check-yaml
- id: end-of-file-fixer
- id: trailing-whitespace
- id: mixed-line-ending
- repo: https://github.com/psf/black
- repo: https://github.com/psf/black-pre-commit-mirror
rev: "23.11.0"
hooks:
- id: black
30 changes: 20 additions & 10 deletions exmol/exmol.py
@@ -27,9 +27,10 @@
 from rdkit.Chem.Draw import MolToImage as mol2img, DrawMorganBit # type: ignore
 from rdkit.Chem import rdchem # type: ignore
 from rdkit.DataStructs.cDataStructs import BulkTanimotoSimilarity, TanimotoSimilarity # type: ignore
-import langchain.llms as llms
-import langchain.prompts as prompts

+from openai import OpenAI
+
+client = OpenAI()
 from . import stoned
 from .plot_utils import _mol_images, _image_scatter, _bit2atoms
 from .data import *
@@ -392,6 +393,7 @@ def _check_alphabet_consistency(
     alphabet_symbols = _alphabet_to_elements(set(alphabet_symbols))
     # find all elements in smiles (Upper alpha or upper alpha followed by lower alpha)
     smiles_symbols = set(re.findall(r"[A-Z][a-z]?", smiles))
+
     if check and not smiles_symbols.issubset(alphabet_symbols):
         # show which symbols are not in alphabet
         raise ValueError(
@@ -1410,7 +1412,7 @@ def merge_text_explains(
 def text_explain_generate(
     text_explanations: List[Tuple[str, float]],
     property_name: str,
-    llm: Optional[llms.BaseLLM] = None,
+    llm_model: str = "gpt-4",
     single: bool = True,
 ) -> str:
     """Insert text explanations into template, and generate explanation.
@@ -1430,14 +1432,22 @@ def text_explain_generate(
             for x in text_explanations
         ]
     )
-    prompt_template = prompts.PromptTemplate(
-        input_variables=["property", "text"],
-        template=_single_prompt if single else _multi_prompt,
-    )
+
+    prompt_template = _single_prompt if single else _multi_prompt
     prompt = prompt_template.format(property=property_name, text=text)
-    if llm is None:
-        llm = llms.OpenAI(temperature=0.05)
-    return llm(prompt)
+
+    messages = [
+        {
+            "role": "system",
+            "content": "Your goal is to explain which molecular features are important to its properties based on the given text.",
+        },
+        {"role": "user", "content": prompt},
+    ]
+    response = client.chat.completions.create(
+        model=llm_model, messages=messages, temperature=0.05
+    )
+
+    return response.choices[0].message.content


 def text_explain(
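Review note: the module-level `client = OpenAI()` added in this file runs at import time. As far as I can tell, the OpenAI client looks for an API key when it is constructed, so importing `exmol` without `OPENAI_API_KEY` set may fail even when no text explanation is requested. One possible alternative is to create the client lazily, or to accept it as an argument. The sketch below is only an illustration under those assumptions, not the PR's code: `build_messages`, `generate_explanation`, `_SYSTEM_PROMPT`, `_TEMPLATE`, and the `client` parameter are hypothetical names, and the template is a placeholder rather than exmol's `_single_prompt`.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Optional

# Placeholder prompts (assumptions); exmol's real prompts live in the package.
_SYSTEM_PROMPT = "Explain which molecular features matter for the property."
_TEMPLATE = "Property: {property}\n{text}"


def build_messages(property_name: str, text: str) -> List[Dict[str, str]]:
    """Format the prompt and wrap it in chat-style messages."""
    prompt = _TEMPLATE.format(property=property_name, text=text)
    return [
        {"role": "system", "content": _SYSTEM_PROMPT},
        {"role": "user", "content": prompt},
    ]


def generate_explanation(
    property_name: str,
    text: str,
    llm_model: str = "gpt-4",
    client: Optional[Any] = None,
) -> str:
    """Call a chat-completions client, creating one only when needed."""
    if client is None:
        # Deferred import and construction: the API key is not needed
        # until an explanation is actually requested.
        from openai import OpenAI

        client = OpenAI()
    response = client.chat.completions.create(
        model=llm_model,
        messages=build_messages(property_name, text),
        temperature=0.05,
    )
    return response.choices[0].message.content


class _FakeCompletions:
    """Offline stand-in that mimics the response shape used above."""

    def create(self, model: str, messages: List[Dict[str, str]], temperature: float):
        reply = f"{model}: {messages[-1]['content']}"
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


# Exercise the function without network access or an API key.
fake_client = SimpleNamespace(chat=SimpleNamespace(completions=_FakeCompletions()))
result = generate_explanation("solubility", "Has hydroxyl group.", client=fake_client)
```

Passing the client in also makes the function testable offline, which might help with the CI key handling that several commits in this PR touch.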
8 changes: 5 additions & 3 deletions exmol/stoned/stoned.py
@@ -214,6 +214,7 @@
 from rdkit import Chem # type: ignore
 from rdkit.Chem import MolFromSmiles as smi2mol # type: ignore
 from rdkit.Chem import MolToSmiles as mol2smi # type: ignore
+from rdkit.Chem import MolToRandomSmilesVect # type: ignore

 from rdkit.Chem import AllChem # type: ignore
 from rdkit.DataStructs.cDataStructs import TanimotoSimilarity # type: ignore
@@ -237,9 +238,10 @@ def randomize_smiles(mol):
     if not mol:
         return None

-    return mol2smi(
-        mol, canonical=False, doRandom=True, isomericSmiles=True, kekuleSmiles=True
-    )
+    # return mol2smi(
+    #     mol, canonical=False, doRandom=True, isomericSmiles=True, kekuleSmiles=True
+    # )
+    return MolToRandomSmilesVect(mol, 1, isomericSmiles=True, kekuleSmiles=True, randomSeed=np.random.randint(0,100))[0]


 def largest_mol(smiles):
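Review note: `np.random.randint(0,100)` draws the seed from NumPy's global state, so the sampled chemical space is repeatable only if the caller seeds NumPy beforehand, and at most 100 distinct seeds are possible. One way to make the intent explicit is to derive per-call seeds from a single seeded generator. The sketch below uses only the standard library and is an assumption-laden illustration: `make_seed_stream` and `shuffle_tokens` are hypothetical names, and `shuffle_tokens` merely stands in for a seeded randomizer such as `MolToRandomSmilesVect(..., randomSeed=seed)`, which is not called here.

```python
import random
from typing import Callable, List


def make_seed_stream(master_seed: int, upper: int = 2**31 - 1) -> Callable[[], int]:
    """Return a function producing a repeatable sequence of integer seeds."""
    rng = random.Random(master_seed)  # private generator, independent of global state

    def next_seed() -> int:
        return rng.randrange(upper)

    return next_seed


def shuffle_tokens(tokens: List[str], seed: int) -> List[str]:
    """Stand-in for a seeded randomizer: the same seed gives the same ordering."""
    shuffled = list(tokens)
    random.Random(seed).shuffle(shuffled)
    return shuffled


tokens = ["C", "C", "O", "N", "c1ccccc1"]

# Two streams built from the same master seed yield the same seeds,
# so both runs produce identical sequences of randomized orderings.
stream_a = make_seed_stream(master_seed=0)
stream_b = make_seed_stream(master_seed=0)
run_a = [shuffle_tokens(tokens, stream_a()) for _ in range(3)]
run_b = [shuffle_tokens(tokens, stream_b()) for _ in range(3)]
```

With this shape, a single user-facing seed would control the whole sampling run, while each call still receives a different seed.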