
Step by Step Instruction

TimThuering edited this page Aug 18, 2022 · 4 revisions

Requirements:

  • Python 3.9 or higher
  • Installation of the requirements listed in the file requirements.txt, e.g. with pip install -r requirements.txt

  1. Run source/model2arch.py from a console (e.g. with python model2arch.py). No command-line arguments are needed.

  2. Optional input of a previously created extraction. Enter the path to the respective file or skip this input with Enter.

    • Generic Model (pickle) (.dat)
    • MiSim-architecture (.json)

    Note that a MiSim architecture description does not contain all the information needed for some additional features such as Resource-Demand-Estimation. Taking a MiSim architecture as input is mainly intended for converting it into a RESIRIO architecture description.

  3. Input of traces (only if no previous extraction was entered in the previous step). For details on the expected trace format, see the Trace Format Description.

    Traces can be entered by typing the path to each file, one at a time. After all files are entered, finish the trace input with Enter.

    Alternatively, if the trace type is Jaeger or Zipkin, the respective HTTP-API can be used to fetch the traces directly. The Zipkin HTTP-API requires a limit on the number of traces to retrieve, whereas the Jaeger HTTP-API always returns all traces. You can save the traces fetched from the API locally as a backup. A backup from the Zipkin API can be imported again like regular Zipkin traces. A backup from the Jaeger API has to be imported by selecting the backup option b when asked whether to retrieve Jaeger traces from the API.

  4. Input of which type the output architecture should be. The following architecture description formats are supported:

    • MiSim
    • RESIRIO
  5. Input whether to store the Generic Model in a format (pickle) that this tool can read in a future execution. (yes or no)

    This can be useful because it allows reusing the trace input of the current execution without entering all traces again and repeating the extraction process. For example, suppose you entered your traces, generated a MiSim model from them and also stored the Generic Model. You can then run the tool again, enter the .dat file at the beginning, and e.g. create a RESIRIO model from it. Alternatively, you could rerun Resource-Demand-Estimation with different parameters (e.g. other CPU-utilizations).

  6. [RESIRIO only] Additional RESIRIO settings.

    • RESIRIO export type (either .js or .json)
    • Output graph should be lightweight (yes or no)
    • Output architecture file should be pretty-printed (yes or no)
  7. [Trace input only (no model)] Input of a Pattern as Python-RegEx to ignore spans whose names match the given pattern.

    Skipping this will result in the use of a default Pattern, depending on the used trace type. It's ^GET$ for Jaeger and ^get$ for Zipkin. OPEN.xtrace traces do not require this step.

    For example, this can be useful to ignore HTTP GET-Request operations or similar. Usually, a GET-Request span is merely an intermediate operation that leads to another span, the actual operation call. If the GET-Request span in the respective trace is not ignored, this tool will interpret GET as a dependency of the operation that made the GET-Request, and the operation that is actually called will not be explicitly represented. To avoid this, we recommend ignoring the respective spans by specifying a suitable regular expression.

  8. [MiSim only] Optional input of network-latency. For the correct syntax, consider looking at MiSim's architecture description.

  9. [MiSim only] Input whether the custom-delay should be constant (mean) or include a standard deviation. For the semantics, consider looking at MiSim's architecture description.

  10. Input whether the model should be analyzed and/or validated.

    • The validation of the architecture is only possible when the output architecture is for RESIRIO
    • The output of the analysis will only be shown as part of the RESIRIO architecture

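The ignore pattern from step 7 can be illustrated with a minimal sketch. The default patterns are the ones quoted above; the helper name filter_spans, the example span names, and the use of re.search are assumptions made for illustration and may differ from how the tool applies the pattern internally.

```python
import re

# Default ignore patterns as quoted in step 7.
DEFAULT_PATTERNS = {"jaeger": r"^GET$", "zipkin": r"^get$"}


def filter_spans(span_names, pattern):
    """Return the span names that do NOT match the ignore pattern.

    Hypothetical helper: matching with re.search is an assumption.
    """
    compiled = re.compile(pattern)
    return [name for name in span_names if not compiled.search(name)]


# Hypothetical span names, for illustration only.
spans = ["GET", "getOrder", "POST", "OrderService.placeOrder"]

# With the Jaeger default, only the bare "GET" span is dropped.
kept = filter_spans(spans, DEFAULT_PATTERNS["jaeger"])
print(kept)

# A broader custom pattern that also drops other bare HTTP verbs.
kept_no_verbs = filter_spans(spans, r"^(GET|POST|PUT|DELETE)$")
print(kept_no_verbs)
```

Because the patterns are anchored with ^ and $, a span named getOrder is kept, while a span whose whole name is GET is ignored. Trying a pattern on a handful of real span names in this way before entering it can help avoid dropping operations by accident.
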
The basic input for the extractor tool is now complete. The following steps describe the input for further functionality. This input is needed because it cannot be derived from the previously entered model or traces.

  1. Input of the capacity of the instances of each service of the microservice-application. Type one of the following:

    • <Positive integer> to set as default for all services. (e.g. "1000")

    • "manual" to manually enter a capacity for each service (the command line interface will ask for a positive integer for each service one by one)

    • <Path to csv-file> which contains the capacities of each service.

      E.g. a csv-file with capacities for the services A, B, C, D. Make sure that all service names match the service names from the input traces.

      A, 1000
      B, 760
      C, 800
      D, 1100
      
  2. Resource-Demand-Estimation

    • <Positive integer> to set a default demand for all operations. (e.g. "100")

    • "manual" to manually enter a demand for each operation (analogous to the manual input of capacities)

    • <Path to csv-file> which contains the demands of all operations.

      E.g. Service A with operations a1, a2, B with b1 and C with c1, c2. Make sure that all service and operation names match the names from the input traces.

      A, a1, 112
      A, a2, 78
      B, b1, 900
      C, c1, 543
      C, c2, 654
      
    • "y" to use LibReDE to estimate a value for the demand for each operation.

      Using LibReDE requires the input of the following:

    1. CPU-Utilizations over time

      • <Number in [0, 1]> to set as fixed CPU-utilization for all hosts. (e.g. "0.4" for 40% CPU-utilization)
      • "manual" to set a fixed CPU-utilization for each host independently (analogous to the manual input of capacities)
      • "csv" to set the CPU-utilization for each host independently. Using this method, the extractor asks for a path to a .csv-file for each host. The first column has to contain UNIX-timestamps (seconds) and the second column the utilization at the respective time, given as a fraction in [0, 1] as in the example below.

      E.g. a csv-file of a single host:

      16443323223, 0.4123
      16443323334, 0.233
      16443323576, 0.33
      16443323690, 0.646
      16443332712, 0.2323
      
    2. <Path to a LibReDE installation> (e.g. "C:\Users\Max\Downloads\librede") - How to install LibReDE is described in the LibReDE Installation Guide. LibReDE will now calculate the estimations. This may take some time.

    3. After an overview of the results is shown, you can decide which approaches' results you want to use. If you select the results of several approaches, they are aggregated by averaging.

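Before passing csv-files to the tool, it can help to check them against the layouts shown in the steps above. The following is a rough sketch that parses the three sample files (capacities, demands, CPU-utilization) and compares the service names with those from the traces. All function names and the set services_in_traces are hypothetical; the tool's own parsing may be stricter or more lenient.

```python
import csv
import io


def _rows(lines):
    """Yield non-empty CSV rows with surrounding whitespace stripped."""
    for row in csv.reader(lines):
        cells = [cell.strip() for cell in row]
        if any(cells):
            yield cells


def read_capacities(lines):
    """Parse 'service, capacity' rows into {service: capacity}."""
    capacities = {}
    for service, capacity in _rows(lines):
        value = int(capacity)
        if value <= 0:
            raise ValueError(f"capacity of {service!r} must be a positive integer")
        capacities[service] = value
    return capacities


def read_demands(lines):
    """Parse 'service, operation, demand' rows into {(service, operation): demand}."""
    return {(service, operation): int(demand)
            for service, operation, demand in _rows(lines)}


def read_utilization(lines):
    """Parse 'timestamp, utilization' rows into a list of (timestamp, utilization)."""
    return [(int(timestamp), float(utilization))
            for timestamp, utilization in _rows(lines)]


# The sample files from the steps above, inlined so the sketch is self-contained.
capacities = read_capacities(io.StringIO("A, 1000\nB, 760\nC, 800\nD, 1100\n"))
demands = read_demands(io.StringIO(
    "A, a1, 112\nA, a2, 78\nB, b1, 900\nC, c1, 543\nC, c2, 654\n"))
utilization = read_utilization(io.StringIO(
    "16443323223, 0.4123\n16443323334, 0.233\n16443323576, 0.33\n"
    "16443323690, 0.646\n16443332712, 0.2323\n"))

# Hypothetical: service names as they appear in the input traces.
services_in_traces = {"A", "B", "C", "D"}
unknown = sorted({service for service, _ in demands} - services_in_traces)
missing = sorted(services_in_traces - set(capacities))
print("services in demand file but not in traces:", unknown)
print("services without a capacity:", missing)
```

To check real files, replace the io.StringIO objects with open(path, newline="") handles. A name that shows up in either printed list points to a mismatch between the csv-file and the traces.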

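The aggregation by average mentioned in the last LibReDE step can be pictured as follows. This is only a sketch of the arithmetic: the approach names, the demand values, and the function aggregate_by_average are made up for illustration and are not taken from the tool or from LibReDE.

```python
from statistics import mean


def aggregate_by_average(results, selected):
    """Average the per-operation demand over the selected approaches.

    results maps an approach name to {(service, operation): demand}.
    Assumes every selected approach reports the same operations.
    """
    operations = results[selected[0]].keys()
    return {op: mean(results[approach][op] for approach in selected)
            for op in operations}


# Hypothetical estimation results of three approaches.
results = {
    "approach_1": {("A", "a1"): 100.0, ("B", "b1"): 900.0},
    "approach_2": {("A", "a1"): 120.0, ("B", "b1"): 940.0},
    "approach_3": {("A", "a1"): 500.0, ("B", "b1"): 100.0},
}

# Selecting two approaches yields their mean for each operation.
combined = aggregate_by_average(results, ["approach_1", "approach_2"])
print(combined)
```

Selecting a single approach leaves its values unchanged, since the mean of one value is the value itself.
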
Finally, an overview of your use of the tool is printed, and you should find your output in the same directory as the file model2arch.py.
