\chapter{Conclusion}\label{sec:conclusion}
The results discussed in \Cref{sec:eval:perf} are in line with those obtained for static bound unrolling~\cite{aebi18bachelorarbeit}.
Together, they provide empirical evidence that, independent of the unrolling method and the chosen factor, loop unrolling does not yield a significant performance benefit in the current state of~\libFIRM.
Even though the added loop optimization made more loops unrollable, still only about one in ten loops was unrolled, which is certainly a contributing factor to the underwhelming improvements.
Some restrictions, such as disallowing \texttt{break}-like structures, are probably too strict and could be lifted through further development.
Other restrictions, such as the conservative checks for manipulation of the bound through aliases or calls, are unavoidable if the semantics are to be preserved, and are thus inherent to the task at hand.
Regardless of these restrictions, even the benchmarks with a high share of unrollable loops did not seem to benefit, with \texttt{h264ref} being an exception.
Further, it can be concluded that the choice of the fixup code strategy seems to have a negligible impact on performance.
Given the very low standard deviations across all benchmarks, the obtained results can be considered trustworthy and hence provide a solid foundation for empirical conclusions.
Thus, the most pressing challenge seems to be the lack of performance gain from unrolling loops.
A natural starting point would therefore be to optimize the bodies of the unrolled loops further.
An optimization could be created that takes advantage of the semantics implicitly added by unrolling: within each copied block, the iteration number has a specific, known remainder modulo~$f$.
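As a hedged illustration of this property, assume a counting loop with start value $a$ and constant step $s$ that is unrolled by the factor $f$, with the fixup code placed after the unrolled loop; the symbols $a$, $s$, $q$, $k$, and $j$ are introduced only for this sketch.
In pass $q \geq 0$ of the unrolled loop, the $k$-th copy of the body ($0 \leq k < f$) executes the original iteration $j = qf + k$, so that
\[
  j \equiv k \pmod{f}, \qquad i_j = a + (qf + k)\,s .
\]
Under these assumptions, each copy only ever sees iteration numbers of a single residue class modulo $f$; for instance, with $f = 2$, the first copy handles the even and the second copy the odd iterations.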
Before this potential is exploited, it would likely be more rewarding to focus on simpler optimizations that can take advantage of the unrolled loop structure, such as automatically parallelizing non-conflicting operations.
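What such non-conflicting operations could look like is sketched in the following hand-written C fragment.
It is not output of~\libFIRM; the function names, the factor of four, and the use of separate accumulators are assumptions made purely for illustration.
The four copies of the body in the unrolled loop touch disjoint array elements and disjoint accumulators, so they do not depend on each other, while the fixup loop handles the remaining iterations.

```c
#include <stddef.h>

/* Reference loop: every addition depends on the previous one via `sum`. */
unsigned long sum_rolled(const unsigned int *a, size_t n)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Hand-unrolled by an illustrative factor of 4 (hypothetical example). */
unsigned long sum_unrolled4(const unsigned int *a, size_t n)
{
    unsigned long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    /* Unrolled loop: the four copies use disjoint elements and
     * accumulators, so they are independent of each other. */
    for (; i + 3 < n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    /* Fixup loop: at most three leftover iterations. */
    for (; i < n; i++)
        s0 += a[i];

    /* Unsigned addition is associative and commutative, so the
     * regrouped sum equals the result of the reference loop. */
    return s0 + s1 + s2 + s3;
}
```

Both functions return the same value for any input, since unsigned arithmetic permits the reordering; for floating-point data, such a regrouping would generally change the result and is therefore not a semantics-preserving transformation.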
Another factor that might have influenced the results is the method used to determine the unroll factor.
Future work could evaluate whether a more sophisticated unroll factor selection, based on a multi-parameter cost function, would improve performance.
Once these changes are in place, the feasibility of loop unrolling in~\libFIRM{} should be reevaluated.
Currently, the effort required to increase the number of unrollable loops seems to exceed the benefits.
However, should the desire for more unrollability pick up again, a good starting point would be to look at loop structures other than the ones examined in this thesis, such as loops with breaks or non-counting loops.