
Website: 2nd quarter improvements #67

Closed · 9 of 10 tasks
siddharth-krishna opened this issue Dec 2, 2024 · 5 comments · Fixed by #69
siddharth-krishna (Contributor) commented Dec 2, 2024

  • On commit 59f1e54 on Full Results: I can't see results for HiGHS versions '1.5.0.dev0' and '1.6.0.dev0'; I only see '1.8.1'
  • On the above commit, there's also a bug in the MIP information table
  • On the above commit, I can't see any of the new benchmarks (non pypsa-*) in the scaling page, is there a bug?
  • benchmarks: filter the sizes table to show only those that are in the results CSV
  • Home: automatically compute number of solvers, benchmarks (also add sizes), and time out from the results CSV file
  • scaling: the y-axis label should not say `Run Solver`; it should say either runtime or memory, as appropriate
  • scaling: can we split the plots to be 1 plot per row, instead of 3 per row? Now with a lot of data the plots are rather small
  • Home: combine the SGM tables into one table, which has columns "Solver", (Solver) "Version", (normalized) "SGM Runtime", (normalized) "SGM Memory", (number of benchmarks) "Solved"; and sort this table by SGM runtime; and give it the caption "Results"
  • History: also add a plot with "Number of benchmarks solved" (i.e. status = ok) on the y-axis and years on the x-axis, and plot all solvers on the same chart like we do with the existing charts. The aim is to see how the number solved increases with newer solver versions.
  • To discuss: Should we remove warnings from SGM calculation? Or use TO value? (But then what do we do for memory?) See GLPK errors on JuMP-HiGHS MPS benchmarks #68
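The "History" item above (number of benchmarks solved per solver release year, all solvers on one chart) could be computed along these lines. This is a minimal sketch, assuming column names ("Solver", "Solver Release Year", "Status") that match the results table shown later in this thread; the small inline DataFrame stands in for the real results CSV.

```python
import pandas as pd

# Tiny stand-in for the results CSV (column names are assumptions
# based on the table shown later in this thread).
df = pd.DataFrame({
    "Solver": ["highs", "highs", "scip", "scip", "highs"],
    "Solver Release Year": [2023, 2024, 2024, 2024, 2024],
    "Status": ["ok", "ok", "ok", "TO", "TO"],
})

# Benchmarks solved (status == "ok") per release year, one column per
# solver so that all solvers can be drawn on the same chart.
solved = (
    df[df["Status"] == "ok"]
    .groupby(["Solver Release Year", "Solver"])
    .size()
    .unstack("Solver", fill_value=0)
)
print(solved)
# solved.plot(marker="o") would then give years on the x-axis and
# number of benchmarks solved on the y-axis, like the existing charts.
```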
jacek-oet (Collaborator) commented:
benchmarks: filter the sizes table to show only those that are in the results CSV

This means the validation check will only check the benchmark names and ignore cases where some sizes within a benchmark do not appear in the CSV, right?

```python
# Assert that the set of benchmark names in the metadata matches those in the data
csv_benchmarks = set(data_df["Benchmark"].unique())
metadata_benchmarks = set(metadata_df["Benchmark Name"].unique())
# Assertion to check if both sets are the same
assert csv_benchmarks == metadata_benchmarks, (
    f"Mismatch between CSV benchmarks and metadata benchmarks:\n"
    f"In CSV but not metadata: {csv_benchmarks - metadata_benchmarks}\n"
    f"In metadata but not CSV: {metadata_benchmarks - csv_benchmarks}"
)
```
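The actual filtering of the sizes table could then be done per (benchmark, size) pair, independently of the names-only validation above. A minimal sketch, assuming hypothetical "Size" columns in both frames (the real schema may differ):

```python
import pandas as pd

# Hypothetical frames mirroring the validation check; the "Size"
# columns are an assumption about the schema.
data_df = pd.DataFrame({
    "Benchmark": ["pypsa-eur", "pypsa-eur", "tulipa"],
    "Size": ["small", "large", "small"],
})
metadata_df = pd.DataFrame({
    "Benchmark Name": ["pypsa-eur", "pypsa-eur", "pypsa-eur", "tulipa"],
    "Size": ["small", "medium", "large", "small"],
})

# Keep only the (benchmark, size) rows that appear in the results CSV;
# sizes missing from the CSV (here "medium") are simply not displayed.
present = set(zip(data_df["Benchmark"], data_df["Size"]))
mask = [
    (b, s) in present
    for b, s in zip(metadata_df["Benchmark Name"], metadata_df["Size"])
]
filtered = metadata_df[mask]
print(filtered)
```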

Home: automatically compute number of solvers, benchmarks (also add sizes), and time out from the results CSV file

We will display it like this, right?

Solvers: 3: HiGHS, GLPK, SCIP
...
Number running benchmarks: xx
Number timeout: yy
  • Regarding the sizes:
    Do you mean we will have a table showing all the sizes we have, or will we add a table with benchmarks and their size information?

siddharth-krishna (Contributor, Author) commented Dec 2, 2024

This means the validation check will only check the benchmark names and ignore cases where some sizes within a benchmark do not appear in the CSV, right?

Yes, sounds good.

Home: automatically compute number of solvers, benchmarks (also add sizes), and time out from the results CSV file

Yes, but to clarify, I meant the Timeout value, not the number of benchmarks that time out. (Thanks for checking!) I think the following rows of the Details table in the home page can be computed from the results CSV:

Solvers: 3 (6, including versions)
Benchmarks: X (Y, including sizes)
Timeout: 15 min

So X is the number of benchmark names, and Y is the number of (benchmark name, benchmark size) combinations. If we had 2 benchmarks with 3 sizes each, X = 2, Y = 6. We don't need to add another table for sizes.
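The counts described above (X, Y, solvers with and without versions) fall out of a few pandas one-liners. A sketch under assumed column names; the timeout value here is hard-coded rather than read from configuration:

```python
import pandas as pd

# Minimal stand-in for the results CSV; column names are assumptions.
df = pd.DataFrame({
    "Solver": ["highs", "highs", "scip", "glpk"],
    "Solver Version": ["1.8.1", "1.6.0", "9.1.1", "5.0"],
    "Benchmark": ["a", "b", "a", "a"],
    "Size": ["s1", "s1", "s1", "s2"],
})
TIMEOUT_MIN = 15  # the configured timeout; a hard-coded assumption here

n_solvers = df["Solver"].nunique()
n_solver_versions = df[["Solver", "Solver Version"]].drop_duplicates().shape[0]
X = df["Benchmark"].nunique()                             # benchmark names
Y = df[["Benchmark", "Size"]].drop_duplicates().shape[0]  # (name, size) combos

print(f"Solvers: {n_solvers} ({n_solver_versions}, including versions)")
print(f"Benchmarks: {X} ({Y}, including sizes)")
print(f"Timeout: {TIMEOUT_MIN} min")
```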

siddharth-krishna (Contributor, Author) commented Dec 3, 2024

  • I noticed another bug on commit 59f1e54 in the Benchmark Details page: the MIP information table is blank even though the Full Results page for this benchmark shows some non-zero values

(Screenshots: the blank MIP information table on the Benchmark Details page, and the non-zero values shown for the same benchmark on the Full Results page.)
jacek-oet (Collaborator) commented:

@siddharth-krishna
We also need to add these new metrics to the website. For now, perhaps the simplest way to do this is to add a little table to the Benchmark details page, with columns "Solver, Max Integrality Violation, Duality Gap". You can display the results from the last iteration for each solver, if the results CSV has multiple iterations. Also, make sure you only display this info for Benchmarks that are MIPs, not for LPs. Thanks!

For now, the MIP information table only displays data from the last iteration, but it contains blank values. Should we filter out these blank values first? Additionally, should we display the information for each size of the benchmarks and include a column for the size information?

| Benchmark | Size | Solver | Solver Version | Solver Release Year | Status | Termination Condition | Runtime (s) | Memory Usage (MB) | Objective Value | Max Integrality Violation | Duality Gap |
|---|---|---|---|---|---|---|---|---|---|---|---|
| tulipa-1_EU_investment_simple | 28-24h | highs | 1.8.1 | 2024 | ok | optimal | 8.176160335540771 | 206.452 | 223314143.52914777 | 0.5 | 9.676663361641651e-05 |
| tulipa-1_EU_investment_simple | 28-24h | scip | 9.1.1 | 2024 | ok | optimal | 8.024469375610352 | 209.644 | 223310171.76019543 | 0.5 | 0.0 |
| tulipa-1_EU_investment_simple | 28-168h | highs | 1.8.1 | 2024 | ok | optimal | 38.88190937042236 | 553.792 | 520255461.06727207 | 0.5 | 1.6326766790540713e-05 |
| tulipa-1_EU_investment_simple | 28-168h | scip | 9.1.1 | 2024 | ok | optimal | 21.625496864318848 | 530.328 | 520247745.0896578 | 0.5 | 0.0 |
| tulipa-1_EU_investment_simple | 28-672h | highs | 1.8.1 | 2024 | TO | Timeout | 60 | 1087.94 | | | |
| tulipa-1_EU_investment_simple | 28-672h | scip | 9.1.1 | 2024 | TO | Timeout | 60 | 919.94 | | | |
| tulipa-1_EU_investment_simple | 28-2016h | highs | 1.8.1 | 2024 | TO | Timeout | 60 | 1570.64 | | | |
| tulipa-1_EU_investment_simple | 28-2016h | scip | 9.1.1 | 2024 | TO | Timeout | 60 | 2412.604 | | | |
| tulipa-1_EU_investment_simple | 28-4032h | highs | 1.8.1 | 2024 | TO | Timeout | 60 | 2253.336 | | | |
| tulipa-1_EU_investment_simple | 28-4032h | scip | 9.1.1 | 2024 | TO | Timeout | 60 | 4710.044 | | | |
| tulipa-1_EU_investment_simple | 28-8760h | highs | 1.8.1 | 2024 | TO | Timeout | 60 | 4438.652 | | | |
| tulipa-1_EU_investment_simple | 28-8760h | scip | 9.1.1 | 2024 | TO | Timeout | 60 | 7325.508 | | | |

siddharth-krishna (Contributor, Author) commented Dec 3, 2024

For now, the MIP information table only displays data from the last iteration, but it contains blank values. Should we filter out these blank values first?

Iteration? I think on commit 59f1e54 I ran only one iteration per benchmark, so there should only be one iteration. Perhaps the issue is that there are rows for each size, and some sizes have blank values? Then maybe it is solved by the below point.

Additionally, should we display the information for each size of the benchmarks and include a column for the size information

Good point, please add a size column to the table.
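Putting the two points together (a Size column, and dropping rows whose MIP metrics are blank, e.g. timed-out sizes), the table could be built roughly like this. A sketch with assumed column names, using a small inline frame modeled on the tulipa rows above:

```python
import pandas as pd

# Stand-in for the MIP rows of the results CSV; column names are
# assumptions modeled on the table shown above. The 28-672h rows have
# blank MIP metrics because those runs timed out.
df = pd.DataFrame({
    "Benchmark": ["tulipa-1_EU_investment_simple"] * 4,
    "Size": ["28-24h", "28-24h", "28-672h", "28-672h"],
    "Solver": ["highs", "scip", "highs", "scip"],
    "Max Integrality Violation": [0.5, 0.5, None, None],
    "Duality Gap": [9.676663361641651e-05, 0.0, None, None],
})

# Drop rows with blank MIP metrics, keep a Size column alongside Solver.
mip_table = df.dropna(subset=["Max Integrality Violation", "Duality Gap"])[
    ["Size", "Solver", "Max Integrality Violation", "Duality Gap"]
]
print(mip_table)
```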

drifter089 pushed a commit to drifter089/solver-benchmark that referenced this issue Dec 9, 2024
closes open-energy-transition#67 
### Summary
 
- [x] On commit
[59f1e54](open-energy-transition@59f1e54)
on Full Results: I can't see results for Highs versions '1.5.0.dev0' and
'1.6.0.dev0', I only see '1.8.1'

![image](https://github.com/user-attachments/assets/cfae7b93-e1ce-4460-b929-d32930c7785b)



- [x] On the above commit, there's also a [bug in the MIP information
table](open-energy-transition#67 (comment))


![image](https://github.com/user-attachments/assets/238ba668-9069-4984-9751-5d71821aaa2c)

- [x] benchmarks: filter the sizes table to show only those that are in
the results CSV

![image](https://github.com/user-attachments/assets/9b82a778-081d-4368-bc8a-17a53fc12460)


- [x] Home: automatically compute number of solvers, benchmarks (also
add sizes), and time out from the results CSV file


![image](https://github.com/user-attachments/assets/6de2b836-19b2-4895-a824-ec78064df5fa)

- [x] scaling: y-axis label should not say `Run Solver` it should say
either runtime or memory as appropriate
- [x] scaling: can we split the plots to be 1 plot per row, instead of 3
per row? Now with a lot of data the plots are rather small

![image](https://github.com/user-attachments/assets/7f965ab3-30c1-409a-b852-edd2ceb2f5fa)

![image](https://github.com/user-attachments/assets/ab136e61-6c08-415b-bd95-72b603469793)

- [x] Home: combine the SGM tables into one table, which has columns
"Solver", (Solver) "Version", (normalized) "SGM Runtime", (normalized)
"SGM Memory", (number of benchmarks) "Solved"; and sort this table by
SGM runtime; and give it the caption "Results"


![image](https://github.com/user-attachments/assets/70be6d0d-a251-4491-8322-69ab664a17da)

- [x] History: also add a plot with "Number of benchmarks solved" (i.e.
status = ok) on the y-axis and years on the x-axis, and plot all solvers
on the same chart like we do with the existing charts. The aim is to see
how the number solved increases with newer solver versions.


![image](https://github.com/user-attachments/assets/4337e4a3-bc8b-4a7c-b76e-d572f487a83b)