diff --git a/main.tex b/main.tex index a91c657..81a6284 100644 --- a/main.tex +++ b/main.tex @@ -306,6 +306,8 @@ \chapter*{Acknowledgments} \include{pysipfennTutorial2} +\include{nimcsotutorial} + \end{appendices} diff --git a/nimcsotutorial.tex b/nimcsotutorial.tex new file mode 100644 index 0000000..7b9a418 --- /dev/null +++ b/nimcsotutorial.tex @@ -0,0 +1,514 @@ +\chapter{\texttt{nimCSO} Tutorial} \label{chap:nimplextutorial2} + +The purpose of this guide is to demonstrate some common use cases of +\texttt{nimCSO} and go a bit more into the details +of how it could be used, but it is by no means exhaustive. If +something is not covered but you would like to see it here, please do +not hesitate to open an issue on GitHub and let us know! + +\hypertarget{dataset-config-and-compilation}{% +\section{Dataset, Config, and +Compilation}\label{nimcsotutorial:dataset-config-and-compilation}} + +To get started, let's first recap what we need to do to get +\texttt{nimCSO} up and running. + +\textbf{1.} Install Nim and dependencies, but \textbf{that's already +done for you if you are in the Codespace}. You can see what was run to +get the environment set up in the +\href{../.devcontainer/Dockerfile}{\texttt{Dockerfile}}. + +\textbf{2.} Create the dataset. For now, let's just use the default one +(based on the ULTERA Database) that comes with the package. Relative to this +notebook, the dataset is located at +\texttt{../dataList.txt}. Let's have a look at the +first few lines of the file to see what it looks like. 
+ +\begin{minted}[xleftmargin=3\parindent, linenos=true, fontsize=\small]{python} +!head -n 8 ../dataList.txt +\end{minted} + +\begin{minted}[xleftmargin=3\parindent, fontsize=\small, bgcolor=subtlegray]{output} +Al,Co,Cr,Cu,Fe,Ni +Nb,Ta,Ti +Co,Cr,Ni +Al,Co,Cr,Fe,Mn,Ni +Al,Co,Fe,Mo,Ni +Hf,Nb,Ti,V +Co,Cr,Fe,Nb,Ni +Al,Co,Cr,Cu,Fe,Ni +\end{minted} + +\textbf{3.} Now, we need to create a task +\texttt{config.yaml} file that will describe what we +are doing and point to our data file. That was already done for you in +the \href{config.yaml}{\texttt{config.yaml}} file, but +you are more than welcome to play with it and modify it. + +\textbf{4.} Finally, we can run the \texttt{nimCSO} +package to get the results. To do so, we will use the one long command you +can see below. Let's break it down: + +\begin{itemize} +\item + \passthrough{\lstinline"!"} is the Jupyter Notebook (IPython) shell + escape that allows us to run shell commands from within the notebook. +\item + \texttt{nim} is the official Nim language compiler. +\item + \texttt{c} instructs the \texttt{nim} + compiler to use a \texttt{C} compiler to optimize and + compile the intermediate code. You can also use + \texttt{cpp} to use a \texttt{C++} + compiler or \texttt{objc} to use an + \texttt{Objective-C} compiler. If you want, you can + also compile directly with LLVM using + \href{https://github.com/arnetheduck/nlvm}{\texttt{nlvm}}, + but it isn't pre-installed for you here. +\item + \texttt{-f} is a flag to force the compiler to + compile everything, even if the code didn't change. We want this + because \texttt{config.yaml}, which tells + \texttt{nimCSO} how to write itself, is not tracked + by the compiler, but is critical to the compilation process (see two + points below). +\item + \texttt{-d:release} is a flag that tells the compiler + to optimize the code for release. 
You can also use + \texttt{-d:debug} to compile the code with better + debugging support, but it will be slower and it will not prevent bugs + from happening. 
There is also \texttt{-d:danger} that + will disable all runtime checks and run a bit faster, but you no + longer get memory safety guarantees. +\item + \texttt{-d:configPath=config.yaml} is a flag pointing + to \textbf{\texttt{config.yaml} that is read and + tells \texttt{nimCSO} (not the compiler!) how to + write itself \emph{before} the compilation starts.} That's the magic + metaprogramming sauce enabling us to write functions which + the \texttt{C}/\texttt{C++} compiler can + then turn into a single, deterministically allocated and executed + piece of machine code through + \href{https://en.wikipedia.org/wiki/Inline_expansion}{inlining}. +\item + \texttt{--out:nimcso} tells the compiler to + output the compiled binary right here and name it + \texttt{nimcso}. You can name it whatever you want, + but it's a good idea to name it something that makes sense. +\item + \texttt{../src/nimcso} points to the source code + of the \texttt{nimCSO} package to compile, relative to + this notebook. +\end{itemize} + +Let's run the command and see what happens! It shouldn't take more than a +few seconds. + +\begin{minted}[xleftmargin=3\parindent, linenos=true, fontsize=\small]{python} +!nim c -f -d:release -d:configPath=config.yaml --out:nimcso ../src/nimcso +\end{minted} + +\begin{minted}[xleftmargin=3\parindent, fontsize=\small, bgcolor=subtlegray]{output} +Hint: used config file '/opt/conda/nim/config/nim.cfg' [Conf] +Hint: used config file '/opt/conda/nim/config/config.nims' [Conf] +........................................................................................................................................................... +/root/.nimble/pkgs2/nimblas-0.3.0-d5033749759fc7a2a316acf623635dcb6d69d32a/nimblas/private/common.nim(52, 7) Hint: Using BLAS library with name: lib(blas|cblas|openblas).so(||.3|.2|.1|.0) [User] +........................................................................... 
+config.yaml +CC: ../../../opt/conda/nim/lib/system/exceptions.nim +CC: ../../../opt/conda/nim/lib/std/private/digitsutils.nim +CC: ../../../opt/conda/nim/lib/std/assertions.nim +CC: ../../../opt/conda/nim/lib/system/dollars.nim +CC: ../../../opt/conda/nim/lib/std/syncio.nim +CC: ../../../opt/conda/nim/lib/system.nim +CC: ../../../opt/conda/nim/lib/pure/parseutils.nim +CC: ../../../opt/conda/nim/lib/pure/math.nim +CC: ../../../opt/conda/nim/lib/pure/unicode.nim +CC: ../../../opt/conda/nim/lib/pure/strutils.nim +CC: ../../../opt/conda/nim/lib/pure/hashes.nim +CC: ../../../opt/conda/nim/lib/pure/collections/sets.nim +CC: ../../../opt/conda/nim/lib/pure/times.nim +CC: ../../../opt/conda/nim/lib/std/private/ospaths2.nim +CC: ../../../opt/conda/nim/lib/std/envvars.nim +CC: ../../../opt/conda/nim/lib/std/cmdline.nim +CC: ../../../opt/conda/nim/lib/pure/collections/sequtils.nim +CC: ../../../opt/conda/nim/lib/pure/random.nim +CC: ../../../opt/conda/nim/lib/pure/collections/heapqueue.nim +CC: ../../../opt/conda/nim/lib/pure/strformat.nim +CC: ../../../opt/conda/nim/lib/pure/terminal.nim +CC: ../../../root/.nimble/pkgs2/yaml-2.1.1-302727fcd74c79d0697a4e909d26455d61a5b979/yaml/presenter.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/laser/dynamic_stack_arrays.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/laser/private/memory.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/laser/tensor/datatypes.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/laser/tensor/initialization.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/init_cpu.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/higher_order_applymap.nim 
+CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/private/p_shapeshifting.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/init_copy_cpu.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/private/p_accessors_macros_read.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/private/p_accessors_macros_write.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/shapeshifting.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/private/functional.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/private/p_display.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/display.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/ufunc.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/laser/cpuinfo_x86.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/operators_broadcasted.nim +CC: ../../../root/.nimble/pkgs2/arraymancer-0.7.28-d4a45ada1c7a6abebe60bcdd5ee2d7c4680799a4/arraymancer/tensor/aggregate.nim +CC: nimcso/bitArrayAutoconfigured.nim +CC: nimcso.nim +Hint:  [Link] +Hint: mm: orc; threads: on; opt: speed; options: -d:release +87026 lines; 7.635s; 257.383MiB peakmem; proj: /workspaces/nimCSO/src/nimcso; out: /workspaces/nimCSO/examples/nimcso [SuccessX] +\end{minted} + +Now, let's run \texttt{nimCSO} and see what happens! 
+ +\begin{minted}[xleftmargin=3\parindent, linenos=true, fontsize=\small]{python} +!./nimcso +\end{minted} + +\begin{minted}[xleftmargin=3\parindent, fontsize=\small, bgcolor=subtlegray]{output} +Using 1 uint64s to store 19 elements. +Configured for task: QuickStart (Just a copy of RCCA Palette from Senkov 2018 Review) +***** nimCSO (Composition Space Optimization) ***** +To use form command line, provide parameters. Currently supported usage: + +--covBenchmark | -cb --> Run small coverage benchmarks under two implementations. +--expBenchmark | -eb --> Run small node expansion benchmarks. +--leastPreventing | -lp --> Run a search for single-elements preventing the least data, i.e. the least common elements. +--mostCommon | -mc --> Run a search for most common elements. +--bruteForce | -bf --> Run brute force algorithm after getting ETA. Note that it is not feasible for more than 25ish elements. +--bruteForceInt | -bfi --> Run brute force algorithm with faster but not extensible uint64 representation after getting ETA. Up to 64 elements only. +--geneticSearch | -gs --> Run a genetic search algorithm. +--algorithmSearch | -as --> Run a custom problem-informed best-first search algorithm. +--singleSolution | -ss --> Evaluate a single solution based on the elements provided as arguments after the flag. It can be stacked on itself like: + ./nimcso -ss Ta W Hf Si -ss V W Hf Si --singleSolution Ta V +\end{minted} + +You should have seen a neat \texttt{help} message that +tells you how to use \texttt{nimCSO}. Let's start with +a ``coverage'' benchmark to see how fast we can check how many +datapoints will be removed from the dataset if we remove the first 6 +elements of \texttt{elementOrder}. 
+ +\begin{minted}[xleftmargin=3\parindent, linenos=true, fontsize=\small]{python} +!./nimcso -cb +\end{minted} + +\begin{minted}[xleftmargin=3\parindent, fontsize=\small, bgcolor=subtlegray]{output} +Using 1 uint64s to store 19 elements. 
+Configured for task: QuickStart (Just a copy of RCCA Palette from Senkov 2018 Review) +***** nimCSO (Composition Space Optimization) ***** +Running coverage benchmark with int8 Tensor representation +Tensor[system.int8] of shape "[1, 19]" on backend "Cpu" +|1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0| +CPU Time [arraymancer+randomizing] 133.6μs +Prevented count:995 + +Running coverage benchmark with BitArray representation +CPU Time [bitty+randomizing] 13.6μs + | 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|16|17|18|19| + 19 | 1| 1| 1| 1| 1| 1| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| +MoTaVWTiZr->995 +Prevented count:995 + +Running coverage benchmark with bool arrays representation (BitArray graph retained) +CPU Time [bit&boolArrays+randomizing] 16.1μs + | 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|16|17|18|19| + 19 | 1| 1| 1| 1| 1| 1| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| +MoTaVWTiZr->995 +Prevented count:995 + +nimCSO Done! +\end{minted} + +\hypertarget{key-routines-and-brute-forcing}{% +\section{Key Routines and Brute +Forcing}\label{nimcsotutorial:key-routines-and-brute-forcing}} + +And if you were able to run that, you are all set to start using +\texttt{nimCSO}! + +Let's try the simplest routine \texttt{mostCommon} or +\emph{What are the most common elements in the dataset?} + +\begin{minted}[xleftmargin=3\parindent, linenos=true, fontsize=\small]{python} +!./nimcso --mostCommon +\end{minted} + +\begin{minted}[xleftmargin=3\parindent, fontsize=\small, bgcolor=subtlegray]{output} +Using 1 uint64s to store 19 elements. +Configured for task: QuickStart (Just a copy of RCCA Palette from Senkov 2018 Review) +***** nimCSO (Composition Space Optimization) ***** + +Running search for single-elements preventing the most data. + 0: Cr->667 + 1: Ti->649 + 2: Fe->622 + 3: Ni->620 + 4: Nb->587 + 5: Co->573 + 6: Al->569 + 7: Mo->466 + 8: Zr->346 + 9: Ta->330 +10: V->256 +11: Hf->219 +12: W->207 +13: Si->92 +14: B->69 +15: Re->55 +16: C->36 +17: Y->3 +18: N->1 + +nimCSO Done! 
+\end{minted} + +If you didn't modify anything, you should now see that elements like +\texttt{N}, \texttt{Y}, +\texttt{C}, and \texttt{Re} are not +very common in the dataset, while \texttt{Cr}, +\texttt{Ti}, \texttt{Fe}, and +\texttt{Ni} are very common. For these, it's +pretty obvious that removing the first group will be the first choice, +while removing the latter will be the last, if we want to keep the dataset as +extensive as possible. + +The critical question here is, \emph{which of the intermediate elements +like \texttt{Hf}, \texttt{V}, +\texttt{Ta}, or \texttt{Zr} should we +remove first?} + +With a dataset spanning 19 elements, the solution space is around 0.5M +($2^{19}-1$ combinations), so we can actually just brute force it in seconds :) + +\begin{minted}[xleftmargin=3\parindent, linenos=true, fontsize=\small]{python} +!./nimcso -bfi +\end{minted} + +\begin{minted}[xleftmargin=3\parindent, fontsize=\small, bgcolor=subtlegray]{output} +Using 1 uint64s to store 19 elements. +Configured for task: QuickStart (Just a copy of RCCA Palette from Senkov 2018 Review) +***** nimCSO (Composition Space Optimization) ***** + +Running brute force algorithm for 19 elements and 1349 data points. +Solution space size: 524287 +Task ETA Estimate: 7 seconds and 30 milliseconds + 0: ->0 + 1: N->1 + 2: YN->4 + 3: YCN->39 + 4: ReYCN->89 + 5: ReYBCN->142 + 6: SiReYBCN->203 + 7: WSiReYBCN->340 + 8: WHfSiReYBCN->511 + 9: TaWHfSiReYBCN->630 +10: TaWZrHfSiReYBCN->735 +11: TaVWZrHfSiReYBCN->816 +12: TaVWZrHfNbSiReYBCN->859 +13: TaVWTiZrHfNbSiReYBCN->952 +14: MoTaVWTiZrHfNbSiReYBCN->1038 +15: MoTaVWTiZrHfNbAlSiReYBCN->1304 +16: TaVWTiZrHfNbCrAlCoNiReFeYCN->1327 +17: MoTaVWTiZrHfNbCrAlSiCoNiReFeYB->1349 +18: MoTaVWTiZrHfNbCrAlSiCoNiReFeYBC->1349 +19: MoTaVWTiZrHfNbCrAlSiCoNiReFeYBCN->1349 +CPU Time [Brute Force] 7308.5ms + +nimCSO Done! +\end{minted} + +Let's look at the result! 
As expected, \texttt{N}, +\texttt{Y}, \texttt{C}, and +\texttt{Re} are removed first (steps 0--4) and then the trend +follows for a bit to \texttt{Hf}. \textbf{The first +break is \texttt{V}: you can notice that it's better to +remove either or both of \texttt{Ta} and +\texttt{Zr} first, despite the fact that they are +roughly 30\% more common than \texttt{V}.} That's +because they often co-occur with \texttt{Re} and +\texttt{Hf}, which are not common. + +We can test exactly how much more data we will have if we remove +\texttt{Ta} instead of \texttt{V} by +using the \texttt{--singleSolution} / +\texttt{-ss} routine. + +\begin{minted}[xleftmargin=3\parindent, linenos=true, fontsize=\small]{python} +!./nimcso -ss Ta W Hf Si Re Y B C N -ss V W Hf Si Re Y B C N +\end{minted} + +\begin{minted}[xleftmargin=3\parindent, fontsize=\small, bgcolor=subtlegray]{output} +Using 1 uint64s to store 19 elements. +Configured for task: QuickStart (Just a copy of RCCA Palette from Senkov 2018 Review) +***** nimCSO (Composition Space Optimization) ***** +Testing solution with @[@["Ta", "W", "Hf", "Si", "Re", "Y", "B", "C", "N"], @["V", "W", "Hf", "Si", "Re", "Y", "B", "C", "N"]] + 9: TaWHfSiReYBCN->630 + 9: VWHfSiReYBCN->697 + +nimCSO Done! +\end{minted} + +Wow! Looking at the \texttt{--mostCommon} output from +earlier, we can see that \textbf{\texttt{Ta} is present +in 74 more datapoints than \texttt{V}, but after +removing \texttt{WHfSiReYBCN}, picking +\texttt{V} as one of 10 elements to model will result +in 67 \emph{more} datapoints.} Relative to a dataset without +interdependencies, that's a 141 datapoint difference! + +Another case that breaks from the ordering is +\texttt{Mo}, which is better to keep than the much more +common \texttt{Nb} and, after +\texttt{Nb} is removed, even better to keep than +\texttt{Ti}, which is the second most common element in +the dataset! 
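
To make these breaks from the single-element ordering more concrete, here is a minimal Python sketch. It is not how \texttt{nimCSO} is implemented, and the helper name \texttt{prevented} is made up for illustration. It counts prevented datapoints on just the eight compositions printed by \texttt{head} at the start of this tutorial, assuming that a datapoint is prevented when it contains at least one removed element.

```python
# Toy illustration only: the eight compositions printed by
# `head -n 8 ../dataList.txt` earlier in this tutorial.
lines = [
    "Al,Co,Cr,Cu,Fe,Ni",
    "Nb,Ta,Ti",
    "Co,Cr,Ni",
    "Al,Co,Cr,Fe,Mn,Ni",
    "Al,Co,Fe,Mo,Ni",
    "Hf,Nb,Ti,V",
    "Co,Cr,Fe,Nb,Ni",
    "Al,Co,Cr,Cu,Fe,Ni",
]
compositions = [frozenset(line.split(",")) for line in lines]


def prevented(removed):
    """Count datapoints containing at least one removed element.

    Hypothetical helper for this sketch, not part of nimCSO.
    """
    removed = set(removed)
    return sum(1 for comp in compositions if comp & removed)


# Single-element counts: Cu appears in more datapoints than Ta.
print("Cu:", prevented(["Cu"]), "Ta:", prevented(["Ta"]))

# After Al is removed, compare the cost of additionally removing Cu or Ta.
print("Al:", prevented(["Al"]))
print("Al+Cu:", prevented(["Al", "Cu"]), "Al+Ta:", prevented(["Al", "Ta"]))
```

In this toy sample \texttt{Cu} is more common than \texttt{Ta}, yet once \texttt{Al} is gone, dropping \texttt{Cu} prevents nothing new because it never appears without \texttt{Al}, while dropping \texttt{Ta} does. The same kind of co-occurrence effect, at the scale of the full dataset, is what the \texttt{-ss} comparisons expose.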
+ +Similarly to what we did with \texttt{V} +vs.~\texttt{Ta}, we can test how much more data we will +have if we remove \texttt{Nb} instead of +\texttt{Mo} or \texttt{Ti} by using the +\texttt{--singleSolution} / +\texttt{-ss} routine. + +\begin{minted}[xleftmargin=3\parindent, linenos=true, fontsize=\small]{python} +!./nimcso -ss Ta V W Zr Hf Nb Si Re Y B C N -ss Ta V W Zr Hf Mo Si Re Y B C N -ss Ta V W Zr Hf Ti Si Re Y B C N +\end{minted} + +\begin{minted}[xleftmargin=3\parindent, fontsize=\small, bgcolor=subtlegray]{output} +Using 1 uint64s to store 19 elements. +Configured for task: QuickStart (Just a copy of RCCA Palette from Senkov 2018 Review) +***** nimCSO (Composition Space Optimization) ***** +Testing solution with @[@["Ta", "V", "W", "Zr", "Hf", "Nb", "Si", "Re", "Y", "B", "C", "N"], @["Ta", "V", "W", "Zr", "Hf", "Mo", "Si", "Re", "Y", "B", "C", "N"], @["Ta", "V", "W", "Zr", "Hf", "Ti", "Si", "Re", "Y", "B", "C", "N"]] +12: TaVWZrHfNbSiReYBCN->859 +12: MoTaVWZrHfSiReYBCN->935 +12: TaVWTiZrHfSiReYBCN->938 + +nimCSO Done! +\end{minted} + +We can see that \textbf{\texttt{Nb} is present in 121 +more datapoints than \texttt{Mo}, but after removing +\texttt{TaVWZrHfSiReYBCN}, picking +\texttt{Mo} as one of 7 elements to model will result +in 76 \emph{more} datapoints.} Relative to a dataset without +interdependencies, that's a 197 datapoint difference, which is even more +than the \texttt{Ta} vs.~\texttt{V} +case! Additionally, we can see that \texttt{Ti} is only +3 datapoints better than \texttt{Mo}, despite being +present in 183 more datapoints than \texttt{Mo}. 
+ +\hypertarget{algorithm-search}{% +\section{Algorithm Search}\label{nimcsotutorial:algorithm-search}} + +The +\texttt{--bruteForceInt}/\texttt{-bfi} +routine we used to find the solutions worked great for our 19-element +dataset and took only a few seconds on the low-performance Codespace +machine, but in many cases the dimensionality of the problem will be too +high to brute force it. 
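
To see why brute force stops scaling, note that the solution space size reported by \texttt{-bfi}, 524287, matches $2^{19}-1$, i.e. every non-empty subset of the 19 elements. The short Python sketch below evaluates the same expression for larger element counts. The formula is inferred from that single logged number, and the per-solution time is a naive linear extrapolation from the 7308.5 ms run logged in this tutorial, so treat the printed ETAs as rough guesses rather than measurements.

```python
def solution_space(n_elements):
    """Number of non-empty subsets of n_elements (inferred from the -bfi log)."""
    return 2 ** n_elements - 1


# Timing taken from the brute force log in this tutorial (19 elements).
logged_ms = 7308.5
per_solution_ms = logged_ms / solution_space(19)

# Assumption: cost per evaluated solution stays constant as the problem grows.
for n in (19, 25, 32, 40):
    size = solution_space(n)
    eta_s = size * per_solution_ms / 1000
    print(f"{n:2d} elements: {size:>16,d} solutions, ~{eta_s:,.0f} s")
```

Every added element doubles the number of candidate solutions, which is consistent with the help message's remark that brute force is not feasible beyond roughly 25 elements.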
+ +Let's now try to use the +\texttt{--algorithmSearch}/\texttt{-as} +routine, which takes advantage of some assumptions known to be valid or +likely to be valid (see manuscript), to limit the search space and find +the solution in a reasonable time. Let's try it now! + +\begin{minted}[xleftmargin=3\parindent, linenos=true, fontsize=\small]{python} +!./nimcso -as +\end{minted} + +\begin{minted}[xleftmargin=3\parindent, fontsize=\small, bgcolor=subtlegray]{output} +Using 1 uint64s to store 19 elements. +Configured for task: QuickStart (Just a copy of RCCA Palette from Senkov 2018 Review) +***** nimCSO (Composition Space Optimization) ***** + +Running Algorithm Search for 19 elements. + 1: N->1 (tree size: 19) + 2: YN->4 (tree size: 52) + 3: YCN->39 (tree size: 113) + 4: ReYCN->89 (tree size: 275) + 5: ReYBCN->142 (tree size: 581) + 6: SiReYBCN->203 (tree size: 690) + 7: WSiReYBCN->340 (tree size: 1818) + 8: WHfSiReYBCN->511 (tree size: 3873) + 9: TaWHfSiReYBCN->630 (tree size: 5213) +10: TaWZrHfSiReYBCN->735 (tree size: 4833) +11: TaVWZrHfSiReYBCN->816 (tree size: 4192) +12: TaVWZrHfNbSiReYBCN->859 (tree size: 3784) +13: TaVWTiZrHfNbSiReYBCN->952 (tree size: 2955) +14: MoTaVWTiZrHfNbSiReYBCN->1038 (tree size: 1765) +15: MoTaVWTiZrHfNbAlSiReYBCN->1304 (tree size: 45) +16: MoTaVWTiZrHfNbAlSiReFeYBCN->1338 (tree size: 8) +17: MoTaVWTiZrHfNbCrAlSiReFeYBCN->1349 (tree size: 4) +18: MoTaVWTiZrHfNbCrAlSiNiReFeYBCN->1349 (tree size: 4) +CPU Time [exploring] 109.7ms + +nimCSO Done! +\end{minted} + +As you can see, \textbf{the algorithm reproduced the same results as the +brute force search roughly 67 times faster} (109.7 ms vs.~7308.5 ms), except for the third-to-last +step (16 removed elements), because the dataset had points with at least 3 elements breaking its +backtracking assumptions. 
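
This comparison can be checked directly from the two logs. The plain Python sketch below uses dictionaries transcribed by hand from the \texttt{-bfi} and \texttt{-as} outputs shown in this tutorial (level mapped to prevented count); it lists the levels at which the two routines disagree and the ratio of the two logged CPU times.

```python
# Prevented counts per number of removed elements, copied from the -bfi log.
brute_force = {
    1: 1, 2: 4, 3: 39, 4: 89, 5: 142, 6: 203, 7: 340, 8: 511, 9: 630,
    10: 735, 11: 816, 12: 859, 13: 952, 14: 1038, 15: 1304,
    16: 1327, 17: 1349, 18: 1349,
}
# Same quantity copied from the -as log.
algorithm_search = {
    1: 1, 2: 4, 3: 39, 4: 89, 5: 142, 6: 203, 7: 340, 8: 511, 9: 630,
    10: 735, 11: 816, 12: 859, 13: 952, 14: 1038, 15: 1304,
    16: 1338, 17: 1349, 18: 1349,
}

# Levels where the search result is not the brute force optimum.
differing = [
    level for level in sorted(algorithm_search)
    if algorithm_search[level] != brute_force[level]
]

# Ratio of the logged CPU times (brute force / algorithm search), in ms.
speedup = 7308.5 / 109.7

print("levels that differ:", differing)
print(f"speedup: {speedup:.1f}x")
```

Only one level differs, and there the search prevents more data than the brute force optimum, so the search result is the slightly worse one at that level.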
+ +\hypertarget{genetic-search}{% +\section{Genetic Search}\label{nimcsotutorial:genetic-search}} + +For cases where the dimensionality of the problem is too high to either +brute-force or use the algorithm search, we can still use the +\texttt{--geneticSearch}/\texttt{-gs} +routine to find good solutions in a reasonable time. Let's try it now! + +Please note that the results are stochastic, so you might get different +results than the ones shown below if you run the command again. + +\begin{minted}[xleftmargin=3\parindent, linenos=true, fontsize=\small]{python} +!./nimcso -gs +\end{minted} + +\begin{minted}[xleftmargin=3\parindent, fontsize=\small, bgcolor=subtlegray]{output} +Using 1 uint64s to store 19 elements. +Configured for task: QuickStart (Just a copy of RCCA Palette from Senkov 2018 Review) +***** nimCSO (Composition Space Optimization) ***** + +Running Genetic Search algorithm for 19 elements and 1349 data points. +Initiating each level with 100 random solutions and expanding 100 solutions at each level for up to 1000 iterations. 
+ 1: N->1 + 2: YN->4 (queue size: 256) + 3: YCN->39 (queue size: 615) + 4: ReYCN->89 (queue size: 869) + 5: ReYBCN->142 (queue size: 929) + 6: SiReYBCN->203 (queue size: 1379) + 7: WSiReYBCN->340 (queue size: 1267) + 8: WHfSiReYBCN->511 (queue size: 1631) + 9: TaWHfSiReYBCN->630 (queue size: 1578) +10: TaWZrHfSiReYBCN->735 (queue size: 1835) +11: TaVWZrHfSiReYBCN->816 (queue size: 1621) +12: TaVWZrHfNbSiReYBCN->859 (queue size: 1746) +13: VWCrAlSiCoNiReFeYBCN->1176 (queue size: 1713) +14: MoTaVWTiZrHfNbSiReYBCN->1038 (queue size: 1565) +15: MoTaVWCrAlSiCoNiReFeYBCN->1320 (queue size: 1028) +16: TaVWTiZrHfNbCrAlCoNiReFeYCN->1327 (queue size: 1575) +17: MoTaVWTiZrHfNbCrAlSiCoFeYBCN->1349 (queue size: 268) +18: MoTaVWTiZrHfNbCrAlSiCoNiReFeBCN->1349 (queue size: 18) +CPU Time [Genetic Search] 766.9ms + +nimCSO Done! 
+\end{minted} + +\hypertarget{summary}{% +\subsection{Summary}\label{nimcsotutorial:summary}} + +Now, you should be able to apply \texttt{nimCSO} to +your own dataset and get some valuable insights on how to model it! + +If you are working in a Codespace, you can do everything right in +this notebook by simply modifying the +\texttt{config.yaml} file and running the commands you +just learned about. The Codespace will be persisted until you explicitly +delete it, so you can come back to it later and continue your work by +clicking on the link in the ``Open in Codespaces'' badge in the README +of the repository.