Geometry optimizations on AWS return incorrect structures. #1151

Open
amyjystad opened this issue Dec 19, 2024 · 6 comments
Labels
unconfirmed: This report has not yet been confirmed by the developers

Comments

@amyjystad

amyjystad commented Dec 19, 2024

Describe the bug
I have an example of the geometry output of a water molecule on
Image

To Reproduce
Steps to reproduce the behaviour:

  1. Spin up AWS parallel cluster using the following config file:
Region: us-east-1
Imds:
 ImdsSupport: "v2.0"
Image:
 Os: alinux2
HeadNode:
 InstanceType: m7a.medium
 Imds:
   Secured: true
Scheduling:
 Scheduler: slurm
 SlurmQueues:
 - Name: queue1
   ComputeResources:
   - Name: workers
     InstanceType: c7g.4xlarge
     MinCount: 0
     MaxCount: 20
   Networking:
     PlacementGroup:
       Enabled: true
     SubnetIds:
     - subnet-061b349a7d1667bc8
SharedStorage:
 - MountDir: /shared
   Name: pcluster-fxsl
   StorageType: FsxLustre
   FsxLustreSettings:
     StorageCapacity: 1200
     DataCompressionType: LZ4
     WeeklyMaintenanceStartTime: '7:03:00'
     AutoImportPolicy: NEW_CHANGED_DELETED
     DeletionPolicy: Retain

Install conda, then run 'conda install xtb'.
2. Input coordinates: H2O_C0_M1_input.txt

3. 'xtb H2O_C0_M1.xyz --opt > H2O_C0_M1.out'
4. Output:
H2O_C0_M1_output.txt
xtbopt.log

  5. No error is reported in this case (geometry optimizations of other structures often fail outright), but the image in the issue description should suffice to show the problem with the geometry optimization.

Expected behaviour
Image
H2O_C0_M1_expected_output.txt
expected_xtbopt.log

I have other, more exaggerated examples if desired.
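
To put a number on the difference between the AWS result and the expected result, the two optimized geometries can be compared through their O-H bond lengths and H-O-H angle. The following is a minimal sketch, assuming the optimized structures are in standard XYZ format (atom count, comment line, then one `symbol x y z` line per atom). The function names `parse_xyz` and `water_internal_coords` are hypothetical helpers, and the example coordinates are illustrative, not taken from the attached files.

```python
import math


def parse_xyz(text):
    """Parse XYZ-format text into a list of (element, (x, y, z)) tuples."""
    lines = text.strip().splitlines()
    natoms = int(lines[0].split()[0])
    atoms = []
    # Line 0 is the atom count, line 1 is a free-form comment line.
    for line in lines[2:2 + natoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms


def water_internal_coords(atoms):
    """Return (r_OH1, r_OH2, angle_HOH in degrees) for one water molecule."""
    oxygen = next(pos for sym, pos in atoms if sym.upper() == "O")
    h1, h2 = [pos for sym, pos in atoms if sym.upper() == "H"]
    v1 = [a - b for a, b in zip(h1, oxygen)]
    v2 = [a - b for a, b in zip(h2, oxygen)]
    r1 = math.hypot(*v1)
    r2 = math.hypot(*v2)
    cos_theta = sum(a * b for a, b in zip(v1, v2)) / (r1 * r2)
    # Clamp against rounding error before taking the arccosine.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return r1, r2, math.degrees(math.acos(cos_theta))


# Illustrative geometry only (NOT from H2O_C0_M1_output.txt).
example = """3
illustrative geometry
O 0.000 0.000 0.000
H 0.960 0.000 0.000
H 0.000 0.960 0.000
"""

r1, r2, angle = water_internal_coords(parse_xyz(example))
print(f"r(OH) = {r1:.3f}, {r2:.3f} Angstrom; angle(HOH) = {angle:.1f} deg")
```

Running the same two functions on the AWS output and on the locally produced output gives three numbers per structure that can be compared directly, instead of relying on the rendered images alone.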

amyjystad added the unconfirmed label Dec 19, 2024
@amyjystad
Author

The files under 'Expected behaviour' were produced locally on my Mac with a conda-installed xtb.

@awvwgk
Member

awvwgk commented Dec 20, 2024

Since this is a slurm cluster, are you submitting parallel jobs using MPI? In this case xtb will fail because parallelization with MPI is not supported. Instead xtb uses OpenMP for parallelization.
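
As a rough illustration of an OpenMP-only launch, the sketch below builds the command from the issue and sets the thread count through the standard `OMP_NUM_THREADS` environment variable, with no MPI launcher involved. The helper name `build_xtb_run` is hypothetical, and the script only starts xtb if an executable and the input file are actually present.

```python
import os
import shutil
import subprocess


def build_xtb_run(xyz_file, threads=1):
    """Return (command, environment) for a single-process, OpenMP-only run."""
    env = dict(os.environ)
    # OMP_NUM_THREADS is the standard OpenMP variable for the thread count.
    env["OMP_NUM_THREADS"] = str(threads)
    # xtb is started directly as one process, without an MPI wrapper.
    command = ["xtb", xyz_file, "--opt"]
    return command, env


command, env = build_xtb_run("H2O_C0_M1.xyz", threads=1)

# Only launch when xtb is on PATH and the input file exists.
if shutil.which("xtb") is not None and os.path.exists("H2O_C0_M1.xyz"):
    with open("H2O_C0_M1.out", "w") as out:
        subprocess.run(command, env=env, stdout=out, check=False)
```

Setting `threads=1` corresponds to a single-threaded run, which is a useful baseline when checking whether a wrong geometry depends on the parallel setup at all.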

@amyjystad
Author

No, I am submitting with OpenMP. The same behavior also occurred when I ran it from the command line with only 1 processor.

I spun up a new cluster with instances more similar to my local laptop (a Mac with ARM64), and xtb is behaving as expected. This is peculiar behavior, but it no longer needs to be addressed urgently.

@foxtran
Contributor

foxtran commented Dec 22, 2024

@amyjystad, your calculation on AWS was restarted. You are also using different versions of xtb, which may be another reason you do not see this issue on your Mac. Try running your calculation on AWS from scratch, preferably with xtb 6.7.0.

@awvwgk, which compiler (version + opt.flags) was used for compiling xtb 6.7.1 for Arm? If it is GCC>=13, Arm has a problem :(

@foxtran
Contributor

foxtran commented Dec 22, 2024

https://conda-metadata-app.streamlit.app/?q=conda-forge%2Flinux-aarch64%2Fxtb-6.7.1-h0fb133d_2.conda

As far as I can see, GCC 13 is used. Unfortunately, I cannot see the optimization flags, but I would expect -O3, in which case the Arm build is broken. (Fixed in #1121)
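
For reference, the metadata link above already encodes which build is being discussed. The sketch below splits that URL into channel, platform subdirectory, package name, version, and build string, assuming the usual conda `name-version-build` file naming. The helper name `parse_conda_artifact` is hypothetical and not part of any conda tooling.

```python
from urllib.parse import parse_qs, urlparse


def parse_conda_artifact(url):
    """Split a metadata-app URL into channel, subdir, name, version, build."""
    query = parse_qs(urlparse(url).query)
    # parse_qs decodes %2F, so the value reads channel/subdir/filename.
    channel, subdir, filename = query["q"][0].split("/")
    stem = filename.rsplit(".", 1)[0]  # drop the ".conda" extension
    name, version, build = stem.rsplit("-", 2)
    return {
        "channel": channel,
        "subdir": subdir,
        "name": name,
        "version": version,
        "build": build,
    }


url = (
    "https://conda-metadata-app.streamlit.app/"
    "?q=conda-forge%2Flinux-aarch64%2Fxtb-6.7.1-h0fb133d_2.conda"
)
info = parse_conda_artifact(url)
print(info)
```

The `subdir` field identifies the platform the package was built for, which is the piece that ties the report to the Arm build rather than to xtb in general.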

@awvwgk
Member

awvwgk commented Dec 22, 2024

I guess we can patch the xtb build on conda-forge, either by including the patch already or by reducing the optimization level.
