tree view is unusably slow on a large monorepo #8242

Closed
1 task done
bradzacher opened this issue Feb 23, 2024 · 24 comments
Labels
bug [core label] large projects For anything relating to large volumes of files or multiple subprojects in one main project. performance Feedback for performance issues, speed, memory usage, etc project panel Feedback for files tree view

Comments

@bradzacher

Check for existing issues

  • Completed

Describe the bug / provide steps to reproduce it

Screen.Recording.2024-02-23.at.14.16.31.mov

Sorry about the blurring - internal company monorepo so can't really share the names out.

When opening zed it takes around 20s for zed to show the 1st-level contents of the folder.

Problem number one here is that there is no progress bar or activity indicator to show that something is happening.
The first time I added the folder to the IDE I thought I'd done something wrong because nothing showed up. I switched away to read the docs and when I went back to zed it showed the folders.

I then try to expand a folder. The folder I clicked on has JUST ONE file in it. The video cuts off before the folder contents show up - I let it go for about a minute before I got bored and stopped recording so I don't know how long it took overall (I trimmed the video down).

Sadly this makes the IDE unusable as a daily driver for me.

Environment

Zed: v0.123.3 (Zed)
OS: macOS 14.3.1
Memory: 64 GiB
Architecture: aarch64

If applicable, add mockups / screenshots to help explain your vision of the feature

No response

If applicable, attach your ~/Library/Logs/Zed/Zed.log file to this issue.

If you only need the most recent lines, you can run the zed: open log command palette action to see the last 1000.

The last 7000 lines of the log are just this repeated:

2024-02-23T14:29:06+10:30 [ERROR] crates/fs/src/repository.rs:147: Error { code: -1, klass: 10, message: "invalid data in index - calculated checksum does not match expected" }
2024-02-23T14:29:06+10:30 [ERROR] crates/fs/src/repository.rs:215: Error { code: -1, klass: 10, message: "invalid data in index - calculated checksum does not match expected" }
@bradzacher bradzacher added admin read Pending admin review bug [core label] triage Maintainer needs to classify the issue labels Feb 23, 2024
@JosephTLyons JosephTLyons added performance Feedback for performance issues, speed, memory usage, etc project panel Feedback for files tree view and removed triage Maintainer needs to classify the issue admin read Pending admin review labels Feb 23, 2024
@jianghoy

It looks like the repo is in some state that causes the underlying libgit2 rs crate (used to manipulate git) to error.
I tried to replicate this by:

  1. cloning a fairly large repo onto my machine (in this case I used k8s)
  2. opening it in Zed.

The result is here: loom recording of opening up k8s in zed, which looks way faster than your experience.

Could you share what the git repo looks like when you started? e.g. is it stuck in rebasing? etc etc.

@jbkkd

jbkkd commented Feb 25, 2024

I suffer from the same issue on a large monolith, with the same error in the log.

This may or may not be helpful, but in my case 99% of the files are Python files.

@bradzacher
Author

bradzacher commented Feb 25, 2024

Could you share what the git repo looks like when you started? e.g. is it stuck in rebasing? etc etc.

Nothing weird - the repo was in a clean state.

libgit2 rs crate

Oof - this is likely the problem.

From my experience libgit2 is missing many performance features implemented in the git CLI which makes it unusable on our monorepo.

I recently tried to use it in an internal CLI I was building. I built a command similar to git status and it took ~40s to run - compared with ~5-10s for the native git CLI.

I also found that some of the newer features of git aren't supported in libgit2 - which caused crashes in libgit2.

@bradzacher

This comment was marked as resolved.

@bradzacher
Author

bradzacher commented Feb 26, 2024

$ cd company_repo
$ git ls-files | wc -l
 311564
$ du -sh .
 47G    .
$ cd kubernetes
$ git ls-files | wc -l
 24149
$ du -sh .
 1.3G	.

Just to illustrate - our company's monorepo has over 10x as many files as the k8s repo, and is many times larger on disk.

Another illustration of the weight difference:

$ cd company_repo
$ time git status
 0.70s user 0.06s system 97% cpu 0.777 total
$ time git status
 0.67s user 0.05s system 96% cpu 0.755 total

$ cd kubernetes
$ time git status
 0.02s user 0.04s system 74% cpu 0.086 total
$ time git status
 0.02s user 0.03s system 61% cpu 0.089 total

Note that this is with both folders in a clean, no changes, not rebasing, all pushed state after already having run status a few times.

@jianghoy

jianghoy commented Feb 26, 2024 via email

@jianghoy

jianghoy commented Feb 26, 2024

OK, I created a feature branch that updates libgit2 to 0.16.2+1.7.2, which seems to be the current stable version of the libgit2 Rust binding: https://github.com/jianghoy/zed/tree/update-libgit2. The build passes on my test machine. Maybe you could give it a spin and see if it works better?
If you follow these instructions and build the code: https://github.com/jianghoy/zed/blob/update-libgit2/docs/src/developing_zed__building_zed_macos.md, running cargo run at the end will launch a debug build of Zed. If everything works out I'll create a PR.

@hferreiro
Contributor

This can be more or less reproduced using chromium. The waits are not that large but noticeable.

@Liam-Breen

Liam-Breen commented Mar 14, 2024

I was not running into this issue with the Zed Preview build that I had used for quite a while. I've just had to wipe my mac and reinstall Zed Preview (using the same version as before the wipe) and have now run into the issue.

EDIT: As mentioned above, it looks like libgit2 is the issue. I found this issue on the git-branchless repo which is essentially what we're running into here. I tried what is mentioned in this comment and it worked perfectly for me (downgrading to git 2.39.3), no more performance issues. Libgit2 has supposedly fixed this issue in version 1.7.3 or later. I'm unsure how to check what version Zed uses.

@osiewicz
Contributor

osiewicz commented Mar 18, 2024

@Liam-Breen We're using 1.5.1:
cargo tree -p libgit2-sys:

libgit2-sys v0.14.2+1.5.1

We may have to wait until 1.8.0 is out: rust-lang/git2-rs#1032
I'll keep an eye on it.

@mikayla-maki mikayla-maki added the large projects For anything relating to large volumes of files or multiple subprojects in one main project. label May 9, 2024
@jh3y

jh3y commented May 29, 2024

Dropping in to say also seeing this on a larger monorepo (React, Next.js, etc.)

When I try to navigate a few levels deep, the folder switches to open, nothing happens, the CPU usage spikes in Activity Monitor (macOS), and after maybe 30 seconds or so, the files show up. All directories at that level will then open quickly until you want to go a level deeper and then it's the same thing again.

It takes some time for the CPU usage to drop below 95% after this 😢

@filipwiech

filipwiech commented Jun 17, 2024

There have been some recent improvements made to the project view by @osiewicz in #12980, perhaps they will help. 👍

BTW, a new version of libgit2-sys was released a few days ago: libgit2-sys-0.17.0+1.8.1. 👍

Also, a few other potentially related improvements from @maxbrunsfeld (including replacing some libgit2 functionality with a native git call): #12266, #12444 , #12489 and #12513. 👍

@jh3y

jh3y commented Jun 17, 2024

Will update and keep my fingers crossed 🤞 fwiw - love using it for my other projects where it handles it all fine 🤙

osiewicz added a commit that referenced this issue Jun 17, 2024
Related to: #8242

Release Notes:

- N/A
fallenwood pushed a commit to fallenwood/zed that referenced this issue Jun 18, 2024
@osiewicz
Contributor

I can reproduce the stalls when opening an artificial project with 100k files in it. Notably this project doesn't have a .git folder, which means that we're not necessarily bound by git perf (though historically it has been a bit of an issue for us).
Pretty much everything @filipwiech mentioned is spot on; these changes should improve your experience when working with large projects. Note though that the project panel change doesn't have any effect on the initial load time; the rendering of the project panel should simply be significantly lighter.

I will try to push forward another fix which should help with that 100k repro I'm seeing. The project itself can be created with:

mkdir test-rust-project
cd test-rust-project

# Initialize Rust project
cargo init

# Create 100,000 files, each containing a deliberate type error,
# and declare each one as a module in src/main.rs
for i in $(seq 1 100000); do
  echo 'fn food() { let x: i32 = "error"; }' > src/file_$i.rs
  echo "mod file_$i;" >> src/main.rs
done

echo "Rust project with errors created."

@SomeoneToIgnore
Contributor

SomeoneToIgnore commented Jun 18, 2024

On a general note about the project panel slowness (after implementing my own way with the outline panel tree): almost every operation in the project tree goes through update_visible_entries, which

clears its entries

self.visible_entries.clear();

and goes through every entry of every worktree to repopulate (and sort) things,

let mut entry_iter = snapshot.entries(true, 0);

all on the main thread, it seems, and without any kind of back-off or cancellations in case of the event flood.

Also, this happens on every active project entry changed, even if the panel is not focused.

project::Event::ActiveEntryChanged(Some(entry_id)) => {
    if ProjectPanelSettings::get_global(cx).auto_reveal_entries {
        this.reveal_entry(project, *entry_id, true, cx);
    }
}

"Almost every operation" means:

  • project::Event::WorktreeRemoved/WorktreeUpdatedEntries/WorktreeAdded/WorktreeOrderChanged
  • reveal_entry (hence, every caret change also project::Event::RevealInProjectPanel and project::Event::ActiveEntryChanged)
  • toggling of expanded and folded states of directories
  • renaming
  • maybe something else

I would say that this approach is incompatible with large workspaces.
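
A minimal sketch of one way to address the "no back-off on an event flood" point above: record that something changed, and rebuild the visible list at most once per burst. This is an illustration only, not Zed's implementation. The names VisibleEntries, on_event, and flush are hypothetical, and the sketch assumes some caller (a frame callback or timer, for example) invokes flush periodically.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a panel's visible-entry list.
/// None of these names come from Zed's code base.
pub struct VisibleEntries {
    all_paths: BTreeSet<String>,
    visible: Vec<String>,
    dirty: bool,
    rebuild_count: usize,
}

impl VisibleEntries {
    pub fn new() -> Self {
        Self {
            all_paths: BTreeSet::new(),
            visible: Vec::new(),
            dirty: false,
            rebuild_count: 0,
        }
    }

    /// Record a change cheaply and defer the expensive rebuild.
    pub fn on_event(&mut self, added_path: &str) {
        self.all_paths.insert(added_path.to_string());
        self.dirty = true;
    }

    /// Rebuild the visible list only if something changed since the
    /// last flush. Returns true when a rebuild actually happened.
    pub fn flush(&mut self) -> bool {
        if !self.dirty {
            return false;
        }
        // BTreeSet iterates in sorted order, so no separate sort is needed.
        self.visible = self.all_paths.iter().cloned().collect();
        self.dirty = false;
        self.rebuild_count += 1;
        true
    }

    pub fn visible(&self) -> &[String] {
        &self.visible
    }

    pub fn rebuild_count(&self) -> usize {
        self.rebuild_count
    }
}
```

With this shape, a burst of a thousand on_event calls followed by a single flush costs one rebuild instead of a thousand. A real panel would also need to move the rebuild off the main thread and support cancellation, which this sketch does not attempt.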

@jh3y

jh3y commented Jun 18, 2024

Some great info and digging @SomeoneToIgnore @osiewicz 👏 Thanks!

@osiewicz
Contributor

You're welcome @jh3y - did the recent changes improve the situation anyhow for you?

@jh3y

jh3y commented Jun 18, 2024

You're welcome @jh3y - did the recent changes improve the situation anyhow for you?

Massively! Jus' tried it out 🙌 I can now use it for everything 🤞 Very happy with that because I've been really enjoying its speed and aesthetic 😍

@maxbrunsfeld
Collaborator

maxbrunsfeld commented Jun 18, 2024

@bradzacher @jbkkd When you get a chance, could you check if things are faster in the latest Zed Preview v0.140.0 or later?

Now, when opening Zed in a huge repo like chromium, there is still a delay before the file tree is populated (we run git status once), but things are fairly responsive. We've moved away from using libgit2 for git status in favor of the git CLI, and made a number of other optimizations.
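
As a rough illustration of the "run git status once via the CLI" approach described above, the sketch below shells out to git a single time and parses the porcelain output. It is not Zed's code: StatusEntry, parse_porcelain, and git_status are hypothetical names, and the parser is deliberately simplified (rename lines of the form "old -> new" and quoted paths are kept verbatim rather than decoded).

```rust
use std::path::Path;
use std::process::Command;

/// One line of `git status --porcelain=v1`: two status characters and a path.
#[derive(Debug, PartialEq)]
pub struct StatusEntry {
    pub index_status: char,
    pub worktree_status: char,
    pub path: String,
}

/// Parse porcelain v1 output, where each line has the form "XY PATH".
/// Lines that do not fit that shape are skipped.
pub fn parse_porcelain(output: &str) -> Vec<StatusEntry> {
    output
        .lines()
        .filter_map(|line| {
            let mut chars = line.chars();
            let index_status = chars.next()?;
            let worktree_status = chars.next()?;
            // A single space separates the status pair from the path.
            if chars.next()? != ' ' {
                return None;
            }
            let path: String = chars.collect();
            if path.is_empty() {
                return None;
            }
            Some(StatusEntry {
                index_status,
                worktree_status,
                path,
            })
        })
        .collect()
}

/// Run `git status` once for the whole repository and parse the result.
pub fn git_status(repo: &Path) -> std::io::Result<Vec<StatusEntry>> {
    let output = Command::new("git")
        .arg("-C")
        .arg(repo)
        .args(&["status", "--porcelain=v1"])
        .output()?;
    Ok(parse_porcelain(&String::from_utf8_lossy(&output.stdout)))
}
```

Calling git_status once at startup and caching the result keeps the cost to a single git invocation, which matches the delay described above.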

@Liam-Breen

@bradzacher @jbkkd When you get a chance, could you check if things are faster in the latest Zed Preview v0.140.0 or later?

Now, when opening Zed in a huge repo like chromium, there is still a delay before the file tree is populated (we run git status once), but things are fairly responsive. We've moved away from using libgit2 for git status in favor of the git CLI, and made a number of other optimizations.

I work with a very large repo and performance has definitely improved for me @maxbrunsfeld

@JosephTLyons
Collaborator

JosephTLyons commented Jun 27, 2024

For those of you who experienced this issue, do things feel good enough for us to close this issue out?

@JosephTLyons JosephTLyons added the needs info / awaiting response Issue that needs more information from the user label Jun 27, 2024
@jbkkd

jbkkd commented Aug 19, 2024

Tried again now on our huge repo (same one @Liam-Breen works on) and Zed is usable.

Thanks!

@Yura52

Yura52 commented Sep 10, 2024

For those of you who experienced this issue, do things feel good enough for us to close this issue out?

Unfortunately, Zed is still very slow to start in my case. I am working on a repo similar to this one, but like 10x larger. The main thing is that all TOML and JSON files in the exp/ folder are tracked by git, and there are 430K+ such files in my current repo (at the moment of writing this).

The structure of the repo is not very friendly to git, to say the least. Nevertheless, VSCode starts pretty quickly, and as far as I understand this is achieved by not running git too often (because all basic git commands, starting from git status, take a lot of time in this repo).

Also, not directly related to this issue, but it is very helpful that in VSCode in settings.json I can do:

{
    ...
    "python.analysis.exclude": [
        "**/__pycache__",
        "exp",
    ],
}

and the Python language server becomes responsive relatively quickly.

@notpeter
Member

I'm going to close this out as resolved since there hasn't been any activity in a few months.

@Yura52 You can similarly provide settings to pyright via python.analysis settings passed to the LSP. There is a similar example in Zed Python Docs.

Similarly, file_scan_exclusions is your friend. Add directories you want to skip when searching here:

"file_scan_exclusions": [
  "**/.git",
  "**/.svn",
  "**/.hg",
  "**/CVS",
  "**/.DS_Store",
  "**/Thumbs.db",
  "**/.classpath",
  "**/.settings"
],
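
To show why such exclusions help on a huge tree, here is a deliberately simplified sketch of how a scanner might test a path against patterns of the form "**/NAME". It is an assumption-laden illustration, not how Zed matches globs: is_excluded is a hypothetical helper, it only understands the "**/NAME" shape used in the list above, and Zed's actual glob matching is more general.

```rust
use std::path::Path;

/// Simplified exclusion check: a pattern "**/NAME" excludes a path when any
/// component of that path equals NAME. Other pattern shapes are ignored.
pub fn is_excluded(path: &Path, patterns: &[&str]) -> bool {
    patterns.iter().any(|pattern| match pattern.strip_prefix("**/") {
        Some(name) => path.components().any(|c| c.as_os_str() == name),
        None => false,
    })
}
```

A scanner that checks each directory with something like this before descending never visits the excluded subtree at all, which is where the savings come from on repositories with hundreds of thousands of files.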

For anyone who has a repo where git commands are very slow I recommend you disable the following settings:

{
  "git": {
    "git_gutter": "hide",
    "inline_blame": {
      "enabled": false
    }
  },
  "project_panel": {
    "git_status": false
  }
}

Thanks for reporting!

@notpeter notpeter removed the needs info / awaiting response Issue that needs more information from the user label Nov 26, 2024