
Website: 2nd quarter improvements #67

Closed · 9 of 10 tasks
siddharth-krishna opened this issue Dec 2, 2024 · 5 comments · Fixed by #69
siddharth-krishna (Contributor) commented Dec 2, 2024

  • On commit 59f1e54 on Full Results: I can't see results for HiGHS versions '1.5.0.dev0' and '1.6.0.dev0'; I only see '1.8.1'
  • On the above commit, there's also a bug in the MIP information table
  • On the above commit, I can't see any of the new benchmarks (non pypsa-*) in the scaling page, is there a bug?
  • benchmarks: filter the sizes table to show only those that are in the results CSV
  • Home: automatically compute number of solvers, benchmarks (also add sizes), and time out from the results CSV file
  • scaling: the y-axis label should not say `Run Solver`; it should say either runtime or memory, as appropriate
  • scaling: can we split the plots to be 1 plot per row, instead of 3 per row? Now with a lot of data the plots are rather small
  • Home: combine the SGM tables into one table, which has columns "Solver", (Solver) "Version", (normalized) "SGM Runtime", (normalized) "SGM Memory", (number of benchmarks) "Solved"; and sort this table by SGM runtime; and give it the caption "Results"
  • History: also add a plot with "Number of benchmarks solved" (i.e. status = ok) on the y-axis and years on the x-axis, and plot all solvers on the same chart like we do with the existing charts. The aim is to see how the number solved increases with newer solver versions.
  • To discuss: Should we remove warnings from SGM calculation? Or use TO value? (But then what do we do for memory?) See GLPK errors on JuMP-HiGHS MPS benchmarks #68
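The "History" item above (number of benchmarks solved per solver release year, all solvers on one chart) could be computed along these lines. This is a minimal sketch, assuming column names ("Solver", "Solver Release Year", "Status") that match the results table shown later in this thread; the small inline DataFrame stands in for the real results CSV.

```python
import pandas as pd

# Tiny stand-in for the results CSV (column names are assumptions
# based on the table shown later in this thread).
df = pd.DataFrame({
    "Solver": ["highs", "highs", "scip", "scip", "highs"],
    "Solver Release Year": [2023, 2024, 2024, 2024, 2024],
    "Status": ["ok", "ok", "ok", "TO", "TO"],
})

# Benchmarks solved (status == "ok") per release year, one column per
# solver so that all solvers can be drawn on the same chart.
solved = (
    df[df["Status"] == "ok"]
    .groupby(["Solver Release Year", "Solver"])
    .size()
    .unstack("Solver", fill_value=0)
)
print(solved)
# solved.plot(marker="o") would then give years on the x-axis and
# number of benchmarks solved on the y-axis, like the existing charts.
```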
jacek-oet (Collaborator) commented:
benchmarks: filter the sizes table to show only those that are in the results CSV

This means the validation check will only check the benchmark names and ignore cases where some sizes within a benchmark do not appear in the CSV, right?

```python
# Assert that the set of benchmark names in the metadata matches those in the data
csv_benchmarks = set(data_df["Benchmark"].unique())
metadata_benchmarks = set(metadata_df["Benchmark Name"].unique())
# Assertion to check if both sets are the same
assert csv_benchmarks == metadata_benchmarks, (
    f"Mismatch between CSV benchmarks and metadata benchmarks:\n"
    f"In CSV but not metadata: {csv_benchmarks - metadata_benchmarks}\n"
    f"In metadata but not CSV: {metadata_benchmarks - csv_benchmarks}"
)
```
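The actual filtering of the sizes table could then be done per (benchmark, size) pair, independently of the names-only validation above. A minimal sketch, assuming hypothetical "Size" columns in both frames (the real schema may differ):

```python
import pandas as pd

# Hypothetical frames mirroring the validation check; the "Size"
# columns are an assumption about the schema.
data_df = pd.DataFrame({
    "Benchmark": ["pypsa-eur", "pypsa-eur", "tulipa"],
    "Size": ["small", "large", "small"],
})
metadata_df = pd.DataFrame({
    "Benchmark Name": ["pypsa-eur", "pypsa-eur", "pypsa-eur", "tulipa"],
    "Size": ["small", "medium", "large", "small"],
})

# Keep only the (benchmark, size) rows that appear in the results CSV;
# sizes missing from the CSV (here "medium") are simply not displayed.
present = set(zip(data_df["Benchmark"], data_df["Size"]))
mask = [
    (b, s) in present
    for b, s in zip(metadata_df["Benchmark Name"], metadata_df["Size"])
]
filtered = metadata_df[mask]
print(filtered)
```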

Home: automatically compute number of solvers, benchmarks (also add sizes), and time out from the results CSV file

We will display it like this, right?

Solvers: 3: HiGHS, GLPK, SCIP
...
Number running benchmarks: xx
Number timeout: yy
  • Regarding the sizes:
    Do you mean we will have a table showing all the sizes we have, or will we add a table with benchmarks and their size information?

siddharth-krishna (Contributor, Author) commented Dec 2, 2024

This means the validation check will only check the benchmark names and ignore cases where some sizes within a benchmark do not appear in the CSV, right?

Yes, sounds good.

Home: automatically compute number of solvers, benchmarks (also add sizes), and time out from the results CSV file

Yes, but to clarify, I meant the Timeout value, not the number of benchmarks that time out. (Thanks for checking!) I think the following rows of the Details table in the home page can be computed from the results CSV:

Solvers: 3 (6, including versions)
Benchmarks: X (Y, including sizes)
Timeout: 15 min

So X is the number of benchmark names, and Y is the number of (benchmark name, benchmark size) combinations. If we had 2 benchmarks with 3 sizes each, X = 2, Y = 6. We don't need to add another table for sizes.
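The counts described above (X, Y, solvers with and without versions) fall out of a few pandas one-liners. A sketch under assumed column names; the timeout value here is hard-coded rather than read from configuration:

```python
import pandas as pd

# Minimal stand-in for the results CSV; column names are assumptions.
df = pd.DataFrame({
    "Solver": ["highs", "highs", "scip", "glpk"],
    "Solver Version": ["1.8.1", "1.6.0", "9.1.1", "5.0"],
    "Benchmark": ["a", "b", "a", "a"],
    "Size": ["s1", "s1", "s1", "s2"],
})
TIMEOUT_MIN = 15  # the configured timeout; a hard-coded assumption here

n_solvers = df["Solver"].nunique()
n_solver_versions = df[["Solver", "Solver Version"]].drop_duplicates().shape[0]
X = df["Benchmark"].nunique()                             # benchmark names
Y = df[["Benchmark", "Size"]].drop_duplicates().shape[0]  # (name, size) combos

print(f"Solvers: {n_solvers} ({n_solver_versions}, including versions)")
print(f"Benchmarks: {X} ({Y}, including sizes)")
print(f"Timeout: {TIMEOUT_MIN} min")
```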

siddharth-krishna (Contributor, Author) commented Dec 3, 2024

  • I noticed another bug on commit 59f1e54 in the Benchmark Details page: the MIP information table is blank even though the Full Results page for this benchmark shows some non-zero values

(Screenshots: the blank MIP information table on the Benchmark Details page, and the non-zero values shown for the same benchmark on the Full Results page.)
jacek-oet (Collaborator) commented:

@siddharth-krishna
We also need to add these new metrics to the website. For now, perhaps the simplest way to do this is to add a little table to the Benchmark details page, with columns "Solver, Max Integrality Violation, Duality Gap". You can display the results from the last iteration for each solver, if the results CSV has multiple iterations. Also, make sure you only display this info for Benchmarks that are MIPs, not for LPs. Thanks!

For now, the MIP information table only displays data from the last iteration, but it contains blank values. Should we filter out these blank values first? Additionally, should we display the information for each size of the benchmarks and include a column for the size information?

| Benchmark | Size | Solver | Solver Version | Solver Release Year | Status | Termination Condition | Runtime (s) | Memory Usage (MB) | Objective Value | Max Integrality Violation | Duality Gap |
|---|---|---|---|---|---|---|---|---|---|---|---|
| tulipa-1_EU_investment_simple | 28-24h | highs | 1.8.1 | 2024 | ok | optimal | 8.176160335540771 | 206.452 | 223314143.52914777 | 0.5 | 9.676663361641651e-05 |
| tulipa-1_EU_investment_simple | 28-24h | scip | 9.1.1 | 2024 | ok | optimal | 8.024469375610352 | 209.644 | 223310171.76019543 | 0.5 | 0.0 |
| tulipa-1_EU_investment_simple | 28-168h | highs | 1.8.1 | 2024 | ok | optimal | 38.88190937042236 | 553.792 | 520255461.06727207 | 0.5 | 1.6326766790540713e-05 |
| tulipa-1_EU_investment_simple | 28-168h | scip | 9.1.1 | 2024 | ok | optimal | 21.625496864318848 | 530.328 | 520247745.0896578 | 0.5 | 0.0 |
| tulipa-1_EU_investment_simple | 28-672h | highs | 1.8.1 | 2024 | TO | Timeout | 60 | 1087.94 | | | |
| tulipa-1_EU_investment_simple | 28-672h | scip | 9.1.1 | 2024 | TO | Timeout | 60 | 919.94 | | | |
| tulipa-1_EU_investment_simple | 28-2016h | highs | 1.8.1 | 2024 | TO | Timeout | 60 | 1570.64 | | | |
| tulipa-1_EU_investment_simple | 28-2016h | scip | 9.1.1 | 2024 | TO | Timeout | 60 | 2412.604 | | | |
| tulipa-1_EU_investment_simple | 28-4032h | highs | 1.8.1 | 2024 | TO | Timeout | 60 | 2253.336 | | | |
| tulipa-1_EU_investment_simple | 28-4032h | scip | 9.1.1 | 2024 | TO | Timeout | 60 | 4710.044 | | | |
| tulipa-1_EU_investment_simple | 28-8760h | highs | 1.8.1 | 2024 | TO | Timeout | 60 | 4438.652 | | | |
| tulipa-1_EU_investment_simple | 28-8760h | scip | 9.1.1 | 2024 | TO | Timeout | 60 | 7325.508 | | | |

siddharth-krishna (Contributor, Author) commented Dec 3, 2024

For now, the MIP information table only displays data from the last iteration, but it contains blank values. Should we filter out these blank values first?

Iteration? I think on commit 59f1e54 I ran only one iteration per benchmark, so there should only be one iteration. Perhaps the issue is that there are rows for each size, and some sizes have blank values? Then maybe it is solved by the below point.

Additionally, should we display the information for each size of the benchmarks and include a column for the size information

Good point, please add a size column to the table.
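Putting the two points together (a Size column, and dropping rows whose MIP metrics are blank, e.g. timed-out sizes), the table could be built roughly like this. A sketch with assumed column names, using a small inline frame modeled on the tulipa rows above:

```python
import pandas as pd

# Stand-in for the MIP rows of the results CSV; column names are
# assumptions modeled on the table shown above. The 28-672h rows have
# blank MIP metrics because those runs timed out.
df = pd.DataFrame({
    "Benchmark": ["tulipa-1_EU_investment_simple"] * 4,
    "Size": ["28-24h", "28-24h", "28-672h", "28-672h"],
    "Solver": ["highs", "scip", "highs", "scip"],
    "Max Integrality Violation": [0.5, 0.5, None, None],
    "Duality Gap": [9.676663361641651e-05, 0.0, None, None],
})

# Drop rows with blank MIP metrics, keep a Size column alongside Solver.
mip_table = df.dropna(subset=["Max Integrality Violation", "Duality Gap"])[
    ["Size", "Solver", "Max Integrality Violation", "Duality Gap"]
]
print(mip_table)
```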

drifter089 pushed a commit to drifter089/solver-benchmark that referenced this issue Dec 9, 2024
closes open-energy-transition#67 
### Summary
 
- [x] On commit
[59f1e54](open-energy-transition@59f1e54)
on Full Results: I can't see results for Highs versions '1.5.0.dev0' and
'1.6.0.dev0', I only see '1.8.1'

![image](https://github.com/user-attachments/assets/cfae7b93-e1ce-4460-b929-d32930c7785b)



- [x] On the above commit, there's also a [bug in the MIP information
table](open-energy-transition#67 (comment))


![image](https://github.com/user-attachments/assets/238ba668-9069-4984-9751-5d71821aaa2c)

- [x] benchmarks: filter the sizes table to show only those that are in
the results CSV

![image](https://github.com/user-attachments/assets/9b82a778-081d-4368-bc8a-17a53fc12460)


- [x] Home: automatically compute number of solvers, benchmarks (also
add sizes), and time out from the results CSV file


![image](https://github.com/user-attachments/assets/6de2b836-19b2-4895-a824-ec78064df5fa)

- [x] scaling: y-axis label should not say `Run Solver` it should say
either runtime or memory as appropriate
- [x] scaling: can we split the plots to be 1 plot per row, instead of 3
per row? Now with a lot of data the plots are rather small

![image](https://github.com/user-attachments/assets/7f965ab3-30c1-409a-b852-edd2ceb2f5fa)

![image](https://github.com/user-attachments/assets/ab136e61-6c08-415b-bd95-72b603469793)

- [x] Home: combine the SGM tables into one table, which has columns
"Solver", (Solver) "Version", (normalized) "SGM Runtime", (normalized)
"SGM Memory", (number of benchmarks) "Solved"; and sort this table by
SGM runtime; and give it the caption "Results"


![image](https://github.com/user-attachments/assets/70be6d0d-a251-4491-8322-69ab664a17da)

- [x] History: also add a plot with "Number of benchmarks solved" (i.e.
status = ok) on the y-axis and years on the x-axis, and plot all solvers
on the same chart like we do with the existing charts. The aim is to see
how the number solved increases with newer solver versions.


![image](https://github.com/user-attachments/assets/4337e4a3-bc8b-4a7c-b76e-d572f487a83b)