fix(gptimer): race on FSM state in gptimer_start() (IDFGH-13929) #14767

Closed
wants to merge 1 commit into from


Lsitar
Contributor

@Lsitar Lsitar commented Oct 22, 2024

Actual behaviour

gptimer_start() may be interrupted between starting the timer and setting the FSM state to GPTIMER_FSM_RUN, so other functions may see the wrong gptimer state and misbehave.
For example, gptimer_start() may be interrupted by the gptimer callback, and gptimer_stop() called from that callback then sees the wrong FSM state.
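
The window described above can be modelled on a host machine. The following is only a minimal sketch under stated assumptions, not the driver's code: `model_timer_t`, the `FSM_*` constants, `model_start_racy()`, `model_start_ordered()` and `model_stop()` are hypothetical names, and the "interrupt" is simulated by calling a hook at the point where the alarm ISR could run. It shows that a stop issued inside the window fails its state check, while the same stop succeeds if the running state is published before the interrupt can be serviced.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical FSM states for this model (not the driver's definitions). */
enum { FSM_ENABLE, FSM_RUN_WAIT, FSM_RUN };

typedef struct {
    atomic_int fsm;   /* state observed by other API calls */
    bool counting;    /* stands in for the hardware counter-enable bit */
} model_timer_t;

typedef void (*irq_hook_t)(model_timer_t *t);

/* Stop succeeds only if the timer is observed in FSM_RUN. */
static bool model_stop(model_timer_t *t)
{
    int expected = FSM_RUN;
    if (!atomic_compare_exchange_strong(&t->fsm, &expected, FSM_ENABLE)) {
        return false; /* wrong state: the failure reported above */
    }
    t->counting = false;
    return true;
}

/* Racy ordering: the counter runs and the alarm may fire before FSM_RUN is stored. */
static bool model_start_racy(model_timer_t *t, irq_hook_t alarm_irq)
{
    int expected = FSM_ENABLE;
    if (!atomic_compare_exchange_strong(&t->fsm, &expected, FSM_RUN_WAIT)) {
        return false;
    }
    t->counting = true;
    if (alarm_irq) {
        alarm_irq(t); /* simulated interrupt landing inside the window */
    }
    atomic_store(&t->fsm, FSM_RUN);
    return true;
}

/* Alternative ordering: FSM_RUN is visible before the interrupt can be serviced. */
static bool model_start_ordered(model_timer_t *t, irq_hook_t alarm_irq)
{
    int expected = FSM_ENABLE;
    if (!atomic_compare_exchange_strong(&t->fsm, &expected, FSM_RUN_WAIT)) {
        return false;
    }
    t->counting = true;
    atomic_store(&t->fsm, FSM_RUN); /* published while the IRQ is assumed masked */
    if (alarm_irq) {
        alarm_irq(t);
    }
    return true;
}

static bool g_stop_ok; /* result of the stop attempted from the simulated callback */

static void stop_in_callback(model_timer_t *t)
{
    g_stop_ok = model_stop(t);
}

/* Runs one start with a stop attempted from the simulated alarm callback. */
static bool run_scenario(bool ordered)
{
    model_timer_t t;
    atomic_init(&t.fsm, FSM_ENABLE);
    t.counting = false;
    g_stop_ok = false;
    bool started = ordered ? model_start_ordered(&t, stop_in_callback)
                           : model_start_racy(&t, stop_in_callback);
    assert(started);
    return g_stop_ok;
}
```

Calling `run_scenario(false)` exercises the racy ordering and `run_scenario(true)` the reordered one. The model says nothing about whether reordering alone is sufficient in the real driver, where other APIs share the same state.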

Expected behaviour

"APIs provided by the driver are guaranteed to be thread safe, (...) are allowed to run under ISR context."
as in https://docs.espressif.com/projects/esp-idf/en/latest/esp32s3/api-reference/peripherals/gptimer.html#thread-safety

Description

It has been 6 months since I encountered and described the problem, but it got no attention:
https://esp32.com/viewtopic.php?f=13&t=39428

Now I need a newer ESP-IDF and therefore need to fix the race condition in gptimer, as described on the forum.
I'm sure there is a race, but I'm not sure my fix is the right one: the issue is caused by the FSM design, so similar issues may remain.

Testing

  1. Init the gptimer at 1 MHz and set a callback on alarm:
    gptimer_config_t gptimer_config = {
        .clk_src = GPTIMER_CLK_SRC_DEFAULT,
        .direction = GPTIMER_COUNT_UP,
        .resolution_hz = 1000 * 1000,
        .flags.intr_shared = 0,
    };
    gptimer_new_timer(&gptimer_config, &timer_handle);

    gptimer_event_callbacks_t callbacks = {
        .on_alarm = timer_callback,
    };
    gptimer_register_event_callbacks(timer_handle, &callbacks, NULL);
    gptimer_enable(timer_handle); // enable but not start yet
  2. Set an alarm action with a short time (e.g. 1 - 10 us), clear the timer, and start it:
    gptimer_alarm_config_t alarm_config = {
        .alarm_count = 1, // tick = 1 us
        .flags.auto_reload_on_alarm = 0,
    };
    gptimer_set_alarm_action(timer_handle, &alarm_config);
    gptimer_set_raw_count(timer_handle, 0);
    gptimer_start(timer_handle);
  3. In the callback, stop the timer:
static bool IRAM_ATTR timer_callback(gptimer_handle_t timer, const gptimer_alarm_event_data_t* edata, void* user_data)
{
    if (gptimer_stop(timer) != ESP_OK)
    {
        ESP_DRAM_LOGE(TAG, "X");
    }
    return pdFALSE;
}

When I first observed the problem, ADC samples were being received at 16 kSPS while this timer code ran, triggering a GPIO ISR on every sample, which may make the race easier to reproduce. However, I also tried it without the SPI transfer and without the ADC running, and the GPTimer race is the same.

There is an alternative fix at the application level: suspending the scheduler while the timer starts avoids the race, although it only masks the problem:

vTaskSuspendAll();
gptimer_start(timer_handle);
xTaskResumeAll();

@CLAassistant

CLAassistant commented Oct 22, 2024

CLA assistant check
All committers have signed the CLA.


Warnings
⚠️

Some issues found for the commit messages in this PR:

  • the commit message "Fix race on FSM state in gptimer_start()":
    • summary looks empty
    • type/action looks empty

Please fix these commit messages - here are some basic tips:

  • follow Conventional Commits style
  • correct format of commit message should be: <type/action>(<scope/component>): <summary>, for example fix(esp32): Fixed startup timeout issue
  • allowed types are: change,ci,docs,feat,fix,refactor,remove,revert,test
  • sufficiently descriptive message summary should be between 20 and 72 characters and start with upper case letter
  • avoid Jira references in commit messages (unavailable/irrelevant for our customers)
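
As a rough illustration of the format rules listed above, here is a small checker. It is only a sketch of one reading of those rules, not the linter's implementation: `commit_summary_ok()` is a hypothetical helper, and it assumes the 20 to 72 character limit applies to the summary part after `": "` and that a non-empty scope is required.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Types taken from the list in the linter output. */
static const char *const k_allowed_types[] = {
    "change", "ci", "docs", "feat", "fix", "refactor", "remove", "revert", "test",
};

/* Returns true if msg looks like "<type>(<scope>): <Summary>" (assumed reading). */
static bool commit_summary_ok(const char *msg)
{
    const char *open = strchr(msg, '(');
    const char *close = open ? strchr(open, ')') : NULL;
    if (!open || !close || close == open + 1) {
        return false; /* no scope, or empty scope */
    }
    if (close[1] != ':' || close[2] != ' ') {
        return false; /* scope must be followed by ": " */
    }

    /* The text before '(' must be exactly one of the allowed types. */
    size_t type_len = (size_t)(open - msg);
    bool type_ok = false;
    for (size_t i = 0; i < sizeof k_allowed_types / sizeof k_allowed_types[0]; i++) {
        if (strlen(k_allowed_types[i]) == type_len &&
            strncmp(msg, k_allowed_types[i], type_len) == 0) {
            type_ok = true;
        }
    }
    if (!type_ok) {
        return false;
    }

    /* Summary: 20 to 72 characters, starting with an upper case letter. */
    const char *summary = close + 3;
    size_t len = strlen(summary);
    return len >= 20 && len <= 72 && isupper((unsigned char)summary[0]);
}
```

Under these assumptions, the flagged message "Fix race on FSM state in gptimer_start()" is rejected because it has no `<type>(<scope>):` prefix, while the linter's own example "fix(esp32): Fixed startup timeout issue" is accepted.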

TIP: Install pre-commit hooks and run this check when committing (uses the Conventional Precommit Linter).

👋 Hello Lsitar, we appreciate your contribution to this project!


📘 Please review the project's Contributions Guide for key guidelines on code, documentation, testing, and more.

🖊️ Please also make sure you have read and signed the Contributor License Agreement for this project.

Click to see more instructions ...


This automated output is generated by the PR linter DangerJS, which checks if your Pull Request meets the project's requirements and helps you fix potential issues.

DangerJS is triggered with each push event to a Pull Request and modifies the contents of this comment.

Please consider the following:
- Danger mainly focuses on the PR structure and formatting and can't understand the meaning behind your code or changes.
- Danger is not a substitute for human code reviews; it's still important to request a code review from your colleagues.
- Resolve all warnings (⚠️ ) before requesting a review from human reviewers - they will appreciate it.
- To manually retry these Danger checks, please navigate to the Actions tab and re-run last Danger workflow.

Review and merge process you can expect ...


We do welcome contributions in the form of bug reports, feature requests and pull requests via this public GitHub repository.

This GitHub project is a public mirror of our internal git repository.

1. An internal issue has been created for the PR, we assign it to the relevant engineer.
2. They review the PR and either approve it or ask you for changes or clarifications.
3. Once the GitHub PR is approved, we synchronize it into our internal git repository.
4. In the internal git repository we do the final review, collect approvals from core owners and make sure all the automated tests are passing.
- At this point we may do some adjustments to the proposed change, or extend it by adding tests or documentation.
5. If the change is approved and passes the tests it is merged into the default branch.
6. On the next sync from the internal git repository, the merged change will appear in this public GitHub repository.

Generated by 🚫 dangerJS against 4fd888b

@espressif-bot espressif-bot added the Status: Opened Issue is new label Oct 22, 2024
@github-actions github-actions bot changed the title fix(gptimer): race on FSM state in gptimer_start() fix(gptimer): race on FSM state in gptimer_start() (IDFGH-13929) Oct 22, 2024
@suda-morris suda-morris added the PR-Sync-Merge Pull request sync as merge commit label Oct 23, 2024
@suda-morris
Collaborator

sha=4fd888b7a9b58692f8c52370034e6f214fec626b

@espressif-bot espressif-bot added Status: In Progress Work is in progress and removed Status: Opened Issue is new labels Oct 23, 2024
@suda-morris suda-morris removed the PR-Sync-Merge Pull request sync as merge commit label Oct 23, 2024
@Lsitar
Contributor Author

Lsitar commented Nov 4, 2024

@suda-morris do you know, or have an estimate of, when this will be merged to master or rejected? I ask because we are holding off on our IDF update. I know I can modify the IDF another way, but I prefer to take a "clean" official revision into CI.

@espressif-bot espressif-bot added Status: Reviewing Issue is being reviewed Status: Done Issue is done internally Resolution: NA Issue resolution is unavailable and removed Status: In Progress Work is in progress Status: Reviewing Issue is being reviewed labels Nov 18, 2024
espressif-bot pushed a commit that referenced this pull request Dec 3, 2024
espressif-bot pushed a commit that referenced this pull request Dec 3, 2024
espressif-bot pushed a commit that referenced this pull request Dec 3, 2024
espressif-bot pushed a commit that referenced this pull request Dec 4, 2024
@Alvin1Zhang
Collaborator

Thanks again for your contribution; the changes have been merged with 0f8e6f6.

@Alvin1Zhang Alvin1Zhang closed this Dec 6, 2024
espressif-bot pushed a commit that referenced this pull request Dec 11, 2024
espressif-bot pushed a commit that referenced this pull request Dec 11, 2024