Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce memory writes and short-lived allocations. #17

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

cbiffle
Copy link

@cbiffle cbiffle commented Feb 8, 2025

I noticed that hot parts of the program (allocation and evaluation) were doing more work than is strictly necessary, so I've removed some of it.

  • Removed the memset from tcl_append_string, since all but one byte of it is immediately overwritten by strncpy. Just zeroed the terminating byte instead.

  • Loosened the return type of tcl_string to char* (not const), since many uses of it mutate the returned buffer.

  • Replaced the initializing copy in tcl_append_string with memcpy, which is slightly faster than strncpy in cases like this where the length is known in advance. (This is about a 5% improvement in my tests.)

  • Altered tcl_append to not free the right-hand argument, after noticing that almost every use of it created an otherwise-unused copy.

  • In TCMD handling, rely on tcl_list_at to detect an empty command by returning NULL. This avoids a second tokenization pass over the string in tcl_list_length.

  • Avoid making a short-lived copy of the value in tcl_var, just take ownership of the pointer instead.

On a small collection of microbenchmarks, this improved performance by about 25-30% on my machine.

@cbiffle cbiffle force-pushed the fewer-allocs branch 2 times, most recently from e8e8b6f to 4ce0dd9 Compare February 8, 2025 18:24
I noticed that hot parts of the program (allocation and evaluation) were
doing more work than is strictly necessary, so I've removed some of it.

- Removed the memset from tcl_append_string, since all but one byte of
  it is immediately overwritten by strncpy. Just zeroed the terminating
  byte instead.

- Loosened the return type of tcl_string to char* (not const), since
  many uses of it mutate the returned buffer.

- Replaced the initializing copy in tcl_append_string with memcpy, which
  is slightly faster than strncpy in cases like this where the length is
  known in advance. (This is about a 5% improvement in my tests.)

- Altered tcl_append to _not_ free the right-hand argument, after
  noticing that almost every use of it created an otherwise-unused copy.

- In TCMD handling, rely on tcl_list_at to detect an empty command by
  returning NULL. This avoids a second tokenization pass over the string
  in tcl_list_length.

- Avoid making a short-lived copy of the value in tcl_var, just take
  ownership of the pointer instead.

On a small collection of microbenchmarks, this improved performance by
about 25-30% on my machine.
@cbiffle
Copy link
Author

cbiffle commented Feb 10, 2025

(Noticed your suggestion to run clang-format on PRs, updated the commit accordingly -- it's done something odd to the TCMD case block, but now it's consistent.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant