JoshuaTurner3/TeensyGProf

This repository is a fork of a Teensy 3/4 gprof implementation by ftrias.

What is gprof?

gprof is a profiler for applications on Unix systems. A profiler is a software analysis tool that shows where a program spends most of its execution time, so that the slowest functions can be identified and optimized. gprof is a simple profiler and works most easily with the GCC compiler because of GCC’s built-in ‘-pg’ compilation option. The ‘-pg’ option enables gprof profiling by inserting a call to the instrumentation function ‘mcount’ at the entry of each compiled function. These calls allow the number of times each function is called, and by which caller, to be tracked and later reported.

In addition to tracking function calls, gprof uses a periodic interrupt to pause the program at set intervals and record the value of the processor’s ‘program counter’ (‘pc’) register. The ‘pc’ register holds the memory address of the next instruction to be executed by the running program. When the program exits, a gmon.out file is written. gprof analyzes this file together with the compiled program binary, resolving each recorded ‘pc’ sample to a particular function and estimating the run time of each function.

From these inputs gprof generates a flat profile and a call graph. The flat profile is a table that lists the estimated time spent in each function of the program, whereas the call graph lists the relationships between each function’s callers and callees. The two tables can be used together to determine the slowest functions of the profiled program.
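
To make the sampling idea concrete, here is a minimal Python sketch (not gprof's actual implementation) of how recorded ‘pc’ samples could be resolved to functions and summarized as a flat profile. The symbol table, the addresses, and the 1 ms sample period are hypothetical values chosen only for illustration.

```python
import bisect
from collections import Counter

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS = [(0x1000, "setup"), (0x1040, "loop"), (0x10C0, "read_sensor"), (0x1200, "fft")]
TEXT_END = 0x1400  # hypothetical end of the code section


def resolve(pc):
    """Map a sampled program counter to the function that contains it."""
    if pc < SYMBOLS[0][0] or pc >= TEXT_END:
        return None  # sample fell outside the profiled code
    starts = [start for start, _ in SYMBOLS]
    index = bisect.bisect_right(starts, pc) - 1
    return SYMBOLS[index][1]


def flat_profile(pc_samples, sample_period_s):
    """Return (function, estimated seconds, percent of samples), busiest first."""
    counts = Counter(resolve(pc) for pc in pc_samples)
    counts.pop(None, None)  # discard samples that matched no function
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (name, hits * sample_period_s, 100.0 * hits / total)
        for name, hits in counts.most_common()
    ]


# Six hypothetical samples taken 1 ms apart; the last one is outside the code.
samples = [0x1204, 0x1210, 0x1300, 0x10C4, 0x1044, 0x2000]
for name, seconds, percent in flat_profile(samples, 0.001):
    print(f"{name:12s} {seconds:.3f} s {percent:5.1f} %")
```

The estimate is statistical: a function that runs only briefly between two samples may not appear at all, which is why longer profiling runs tend to give more trustworthy percentages.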

What is gprof2dot?

gprof2dot is a program that makes gprof output easier to interpret by converting it into a graph of the program. The generated graph summarizes both the call graph and the percentage of time spent in each function in a single image, giving a holistic overview of the program’s execution.
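
As an illustration of how the tools fit together, the sketch below chains gprof, gprof2dot, and Graphviz's dot the way a shell pipeline would. It is an assumption about a typical invocation, not the code used by gprof_read.py; the file names are hypothetical, and it presumes that arm-none-eabi-gprof, gprof2dot, and dot are all on the PATH.

```python
import subprocess


def build_pipeline(elf_path, gmon_path, image_path):
    """Return the three commands: gprof report -> dot graph text -> PNG image."""
    return [
        ["arm-none-eabi-gprof", elf_path, gmon_path],
        ["gprof2dot"],  # reads gprof output on stdin by default
        ["dot", "-Tpng", "-o", image_path],
    ]


def run_pipeline(stages):
    """Run the stages in order, feeding each stage's stdout to the next stdin."""
    procs = []
    upstream = None
    for argv in stages:
        proc = subprocess.Popen(argv, stdin=upstream, stdout=subprocess.PIPE)
        if upstream is not None:
            upstream.close()  # the child process now owns this end of the pipe
        upstream = proc.stdout
        procs.append(proc)
    procs[-1].communicate()
    return [proc.wait() for proc in procs]


# Hypothetical file names; call run_pipeline(stages) once the tools are installed.
stages = build_pipeline("/tmp/build.elf", "gmon.out", "profile.png")
print(" | ".join(" ".join(argv) for argv in stages))
```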

Installation

To install and prepare gprof for use with the Teensy 4.1 microcontroller board, the following requirements must be met:

  1. The Arduino Legacy IDE must be installed (it is known to work with version 1.8.19).
  2. Python 3.0 or higher must be installed.
  3. The host computer's OS must list connected devices in ‘/dev/’ (most Linux distributions satisfy this requirement).
  4. Git and the GitHub CLI must be installed on the host computer.
  5. gprof2dot must be installed; installation instructions are in the provided link.
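
Some of these requirements can be checked from a script before continuing. The following is a small sketch under the assumption that Git, the GitHub CLI, and gprof2dot are exposed on the PATH as git, gh, and gprof2dot respectively; it does not check the Arduino IDE or the ‘/dev/’ requirement.

```python
import shutil
import sys

# Assumed executable names for Git, the GitHub CLI, and gprof2dot.
REQUIRED_TOOLS = ["git", "gh", "gprof2dot"]


def check_requirements(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    report = {"python>=3.0": sys.version_info >= (3, 0)}
    for tool in tools:
        report[tool] = which(tool) is not None
    return report


for requirement, satisfied in check_requirements().items():
    print(f"{requirement:12s} {'ok' if satisfied else 'MISSING'}")
```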

If all requirements are satisfied, then the following steps should be undertaken:

  1. Navigate to where you would like to install gprof and run the following command in your terminal:

     git clone https://github.com/JoshuaTurner3/TeensyGProf
    
  2. Open your installed Legacy Arduino IDE and open its preferences via File > Preferences. Under 'Additional Boards Manager URLs', add https://www.pjrc.com/teensy/package_teensy_index.json and then click OK to close the preferences menu.

  3. Add the TeensyGProf library by selecting Sketch > Include Library > Add .ZIP library... and selecting the TeensyGProf.zip file in the repository cloned in step #1.

  4. Close the Legacy Arduino IDE.

  5. Open your file explorer and navigate to where you installed the TeensyGProf repository.

  6. Copy boards.local.txt and platform.local.txt from the TeensyGProf repository folder and then navigate to your .arduino15 folder. It is likely located at ~/.arduino15 (you may need to enable Show Hidden Files in your file explorer).

  7. Once in the .arduino15 folder, navigate to packages/teensy/hardware/avr/<teensy-board-version>/ and paste the files copied in step #6 into this folder.

  8. To install the cross-compiled ARM gprof binary, open your terminal and run the following command (or an equivalent command for a different package manager):

     sudo apt-get install -y binutils-arm-none-eabi
    

    Confirm the file was installed correctly by running the following command and ensuring the output is not empty:

     ls /usr/bin | grep arm-none-eabi-gprof
    

Your Legacy Arduino IDE is now set up to profile Teensy 4.1 programs.
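
For reference, the manual copy in steps 5 to 7 could also be scripted. The sketch below is one possible approach rather than part of the repository: the function name install_local_files is hypothetical, and it copies the two files into every version folder found under packages/teensy/hardware/avr, which is an assumption made for simplicity.

```python
import shutil
from pathlib import Path

LOCAL_FILES = ("boards.local.txt", "platform.local.txt")


def install_local_files(repo_dir, arduino15_dir=None):
    """Copy the *.local.txt files into each installed Teensy board version folder."""
    if arduino15_dir is None:
        arduino15_dir = Path.home() / ".arduino15"
    avr_dir = Path(arduino15_dir) / "packages" / "teensy" / "hardware" / "avr"
    version_dirs = sorted(p for p in avr_dir.iterdir() if p.is_dir())
    if not version_dirs:
        raise FileNotFoundError(f"no Teensy board version found in {avr_dir}")
    copied = []
    for version_dir in version_dirs:
        for name in LOCAL_FILES:
            destination = version_dir / name
            shutil.copy2(Path(repo_dir) / name, destination)
            copied.append(destination)
    return copied
```

Calling install_local_files("path/to/TeensyGProf") with the Arduino IDE closed would return the list of files written, which makes it easy to confirm where they ended up.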

Using gprof

  1. Open your terminal (Ctrl+Alt+T), cd to the TeensyGProf repository cloned in step #1 of Installation, and run the following command with your desired selection of parameters (listed below).

     python3 gprof_read.py <your-choice-of-options>
    
    Parameter  Function                                                                                  Example
    --hex      Convert from ASCII to HEX                                                                 python3 gprof_read.py --hex
    --serial   Read serial device with given name                                                        python3 gprof_read.py --serial /dev/ttyACM0
    --elf      The ELF file to read (default is /tmp/build.elf)                                          python3 gprof_read.py --elf /path/to/elf.elf
    --img      Generate an image from gmon.out                                                           python3 gprof_read.py --img "test_1"
    --save     Save gprof output to a text file                                                          python3 gprof_read.py --save "test_1"
    --project  Saves gmon.out, gprof2dot image, and gprof text output in a folder with the project name  python3 gprof_read.py --project "test_1"
    --exclude  Excludes named functions from gprof processing                                            python3 gprof_read.py --exclude "foo bar func1 func2"

    The recommended command is:

     python3 gprof_read.py --serial /dev/<serial-name> --project <project-name>
    

    The serial device name for the connected Teensy 4.1 can be most easily found from the Arduino IDE under Tools > Port and finding the detected Teensy 4.1 port.

  2. Open the sketch you would like to profile and select Tools > Board > Teensyduino > Teensy 4.1 and Tools > Profile > On.

  3. Edit your sketch to resemble the following:

    // Other code above...
    // Duration of time (ms) to profile the program for
    unsigned long gprof_duration(30000);
    // Start time (ms) of the profiling window; set to 0 once profiling has ended
    unsigned long gprof_timer_start(millis());
    void loop()
    {
        if(gprof_timer_start && millis() - gprof_timer_start > gprof_duration)
        {
            // Clear the start time so that gprof_end() is only called once
            gprof_timer_start = 0;
            if(gprof_end() != 0)
            {
                // Failure
                Serial.println("gprof failure");
                Serial.println("Requested memory: ");
                Serial.println(gprof_memory());
            }
        }
        // Rest of your loop code below
    }
    // Other code below...

  4. Upload the sketch and observe the terminal running gprof_read.py from step #1. The program will exit when it is done processing, and its results can be found in the TeensyGProf repository folder.
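
The serial device lookup and the recommended command from step 1 can also be assembled in a script. This is a sketch under the assumption that the Teensy 4.1 appears as a /dev/ttyACM* device, as in the --serial example above; the helper names are hypothetical and the Arduino IDE's Tools > Port menu remains the authoritative source.

```python
import glob


def find_serial_candidates(pattern="/dev/ttyACM*"):
    """List device nodes that match the assumed Teensy serial naming pattern."""
    return sorted(glob.glob(pattern))


def recommended_command(serial_device, project_name):
    """Build the recommended gprof_read.py invocation as an argument list."""
    return [
        "python3", "gprof_read.py",
        "--serial", serial_device,
        "--project", project_name,
    ]


candidates = find_serial_candidates()
if candidates:
    print(" ".join(recommended_command(candidates[0], "test_1")))
else:
    print("No /dev/ttyACM* device found; check Tools > Port in the Arduino IDE.")
```

If more than one device matches, the first candidate may not be the Teensy, so the printed command should be checked before it is run.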

Implementation details

  1. TeensyGProf will commandeer the systick_isr handler. It will process its own profiling before handing control to the original systick_isr handler. Thus EventResponder and all other timing code should be unaffected. The sampling data will be stored in RAM.

  2. It adds the -pg compiler flag to cpp files. This causes the compiler to add a call to __gnu_mcount_nc at the start of every function. That's how it keeps track of which function called which. These caller/callee pairs (called arcs) are also stored in RAM.

  3. You can configure the amount of RAM used by the sampler described in item 1 and the call tracker described in item 2. The more memory you allocate, the more accurate your results. Look at the file gmon.h and modify HASHFRACTION and ARCDENSITY.

  4. If you call gprof.begin() and pass a duration in milliseconds, it will start a timer that, upon expiry, executes gprof.end(), which outputs the data. Otherwise you must call gprof.end() yourself. That processes all the data and outputs the contents of gmon.out to the desired port in the format requested. This file, along with a copy of the ELF file, is used by gprof to generate a report. You can customize the output method by subclassing class GProfOutput. For example, you could send this file over a network or via HTTP.

  5. For some reason, Teensy 4 puts its code in a section called .text.itcm. gprof expects it in a section called .text, which is the standard on Linux. Teensy 3 puts it in the right place. So the platform.local.txt file tells Arduino to run objcopy to rename the section.
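
To get a feel for the memory trade-off mentioned in item 3, the sketch below estimates buffer sizes using the sizing rules of the classic BSD gmon implementation. This is an assumption: the gmon.h in this repository may use different formulas or defaults, and HISTFRACTION, the arc limits, and the 12-byte arc record size are taken from the BSD code rather than from this project.

```python
def gmon_buffer_sizes(text_size, hashfraction=2, arcdensity=2, histfraction=2,
                      min_arcs=50, max_arcs=65534, arc_record_size=12):
    """Estimate profiling buffer sizes in bytes for a code section of text_size bytes."""
    histogram = text_size // histfraction        # 'pc' sample counters
    froms = text_size // hashfraction            # caller lookup table
    arc_limit = text_size * arcdensity // 100    # maximum number of arcs
    arc_limit = max(min_arcs, min(arc_limit, max_arcs))
    tos = arc_limit * arc_record_size            # arc (callee) records
    return {
        "histogram": histogram,
        "froms": froms,
        "tos": tos,
        "total": histogram + froms + tos,
    }


# Hypothetical 100,000-byte code section with two HASHFRACTION settings.
for fraction in (2, 4):
    print(fraction, gmon_buffer_sizes(100_000, hashfraction=fraction))
```

Under these assumed rules, raising HASHFRACTION shrinks the caller table while raising ARCDENSITY allows more arcs to be recorded, so the two values pull memory use in opposite directions.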

References

For the ARM solution this project is based on, see: https://mcuoneclipse.com/2015/08/23/tutorial-using-gnu-profiling-gprof-with-arm-cortex-m/

For an interesting overview of gprof: http://wwwcdf.pd.infn.it/localdoc/gprof.pdf

Original repository: https://github.com/ftrias/TeensyGProf
