docs: Update linux.md to include NVIDIA PRIME workaround #23438

Open
wants to merge 1 commit into
base: main

@Vanuan Vanuan commented Jan 22, 2025

Related to #22900

Release Notes:

  • N/A

cla-bot bot commented Jan 22, 2025

We require contributors to sign our Contributor License Agreement, and we don't have @Vanuan on file. You can sign our CLA at https://zed.dev/cla. Once you've signed, post a comment here that says '@cla-bot check'.

@maxdeviant changed the title from "Update linux.md to include NVIDIA PRIME workaround" to "docs: Update linux.md to include NVIDIA PRIME workaround" on Jan 22, 2025
@Vanuan
Author

Vanuan commented Jan 22, 2025

@cla-bot check

@ConradIrwin
Member

Thanks!

For clarity let's:

  • Remove all the background information
  • Merge this into the existing GPU section
  • Keep the text succinct. How about:
If your system uses NVIDIA Optimus hybrid graphics, Zed may fail to render correctly. You can fix this either by running `export __NV_PRIME_RENDER_OFFLOAD=1` before launching Zed, which causes Zed to use the discrete graphics card, or (if you use GNOME) by right-clicking the Zed icon and choosing "Launch Using Discrete Graphics".

When you say "fail to render correctly" in the original issue, does it just not show up? Or are there different symptoms?
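
As a minimal sketch of the environment-variable workaround suggested above, the variable can be set for a single process instead of being exported for the whole shell session. The helper name `run_with_offload` is hypothetical, and the binary name `zed` in the usage comment is an assumption (some packages install the editor under a different name).

```shell
# Hypothetical helper: run a command with NVIDIA PRIME render offload
# requested for that one process only, leaving the current shell untouched.
run_with_offload() {
    env __NV_PRIME_RENDER_OFFLOAD=1 "$@"
}

# Usage (assumes the editor binary is on PATH as "zed"):
#   run_with_offload zed
```

Scoping the variable this way means other applications started from the same shell are not affected.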

@Vanuan
Author

Vanuan commented Jan 22, 2025

When you say "fail to render correctly" in the original issue, does it just not show up? Or are there different symptoms?

I submitted the screenshot. The window shows up, but it renders nothing. It's not transparent; it just freezes whatever was displayed as a static picture. Maybe it initializes a framebuffer image; I don't know how to explain it. When I run the example gpui app, the window is just transparent, but it still doesn't render until I force the NVIDIA card.

@Vanuan
Author

Vanuan commented Jan 22, 2025

Here's the driver for Intel:

{
    "ICD": {
        "api_version": "1.3.255",
        "library_path": "/usr/lib/x86_64-linux-gnu/libvulkan_intel.so"
    },
    "file_format_version": "1.0.0"
}

Maybe it doesn't fully support the Vulkan API?

NVIDIA:

{
    "file_format_version" : "1.0.1",
    "ICD": {
        "library_path": "libGLX_nvidia.so.0",
        "api_version" : "1.3.280"
    }
}
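
To see at a glance which driver library each installed manifest points at, something like the following sketch may help. The function name `list_icd_libraries` is hypothetical, and the default directory `/usr/share/vulkan/icd.d` is an assumption; the Vulkan loader can also pick up manifests from other locations.

```shell
# Hypothetical helper: print "<manifest file> -> <library_path>" for every
# Vulkan ICD manifest in a directory (default: /usr/share/vulkan/icd.d).
list_icd_libraries() {
    local dir=${1:-/usr/share/vulkan/icd.d}
    local manifest lib
    for manifest in "$dir"/*.json; do
        # Skip the unexpanded glob when the directory has no manifests.
        [ -e "$manifest" ] || continue
        # Extract the value of "library_path" with sed; this assumes the key
        # and its value sit on one line, as in the manifests shown above.
        lib=$(sed -n 's/.*"library_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$manifest")
        printf '%s -> %s\n' "$(basename "$manifest")" "$lib"
    done
}

# Usage:
#   list_icd_libraries
#   list_icd_libraries /some/other/icd.d
```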

@Vanuan
Author

Vanuan commented Jan 22, 2025

I found a way to force the Intel card for vkcube:

~$ export VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/intel_icd.x86_64.json
~$ vkcube
Selected GPU 0: Intel(R) UHD Graphics 630 (CFL GT2), type: 1

image
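
The same trick can be wrapped so that it applies to one command only and fails early when the manifest path is wrong. This is a sketch under assumptions: the helper name `run_with_icd` is hypothetical, and it relies only on the `VK_ICD_FILENAMES` variable used above.

```shell
# Hypothetical helper: run a command with the Vulkan loader restricted to a
# single ICD manifest, refusing to start if the manifest file is missing.
run_with_icd() {
    local manifest=$1
    shift
    if [ ! -f "$manifest" ]; then
        printf 'ICD manifest not found: %s\n' "$manifest" >&2
        return 1
    fi
    env VK_ICD_FILENAMES="$manifest" "$@"
}

# Usage, with the manifest path from the session above:
#   run_with_icd /usr/share/vulkan/icd.d/intel_icd.x86_64.json vkcube
```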

@Vanuan
Author

Vanuan commented Jan 22, 2025

I think this might be a smoking gun:

~$ glxinfo | grep "OpenGL renderer"
OpenGL renderer string: NVIDIA GeForce GTX 1050 Ti with Max-Q Design/PCIe/SSE2

So, I'm using performance mode in the NVIDIA PRIME settings. It probably means that the Intel GPU is actually disabled for rendering.

Here's vulkaninfo output:

Presentable Surfaces:
=====================
GPU id : 0 (Intel(R) UHD Graphics 630 (CFL GT2)):
	Surface types: count = 2
		VK_KHR_xcb_surface
		VK_KHR_xlib_surface
	Formats: count = 2
		SurfaceFormat[0]:
			format = FORMAT_B8G8R8A8_SRGB
			colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
		SurfaceFormat[1]:
			format = FORMAT_B8G8R8A8_UNORM
			colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
	Present Modes: count = 4
		PRESENT_MODE_IMMEDIATE_KHR
		PRESENT_MODE_MAILBOX_KHR
		PRESENT_MODE_FIFO_KHR
		PRESENT_MODE_FIFO_RELAXED_KHR
	VkSurfaceCapabilitiesKHR:
	-------------------------
		minImageCount = 3
		maxImageCount = 0
		currentExtent:
--
GPU id : 1 (llvmpipe (LLVM 15.0.7, 256 bits)):
	Surface types: count = 2
		VK_KHR_xcb_surface
		VK_KHR_xlib_surface
	Formats: count = 2
		SurfaceFormat[0]:
			format = FORMAT_B8G8R8A8_SRGB
			colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
		SurfaceFormat[1]:
			format = FORMAT_B8G8R8A8_UNORM
			colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
	Present Modes: count = 4
		PRESENT_MODE_IMMEDIATE_KHR
		PRESENT_MODE_MAILBOX_KHR
		PRESENT_MODE_FIFO_KHR
		PRESENT_MODE_FIFO_RELAXED_KHR
	VkSurfaceCapabilitiesKHR:
	-------------------------
		minImageCount = 3
		maxImageCount = 0
		currentExtent:
--
GPU id : 2 (NVIDIA GeForce GTX 1050 Ti with Max-Q Design):
	Surface types: count = 2
		VK_KHR_xcb_surface
		VK_KHR_xlib_surface
	Formats: count = 2
		SurfaceFormat[0]:
			format = FORMAT_B8G8R8A8_UNORM
			colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
		SurfaceFormat[1]:
			format = FORMAT_B8G8R8A8_SRGB
			colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
	Present Modes: count = 3
		PRESENT_MODE_FIFO_KHR
		PRESENT_MODE_FIFO_RELAXED_KHR
		PRESENT_MODE_IMMEDIATE_KHR
	VkSurfaceCapabilitiesKHR:
	-------------------------
		minImageCount = 2
		maxImageCount = 8
		currentExtent:
			width  = 256
~$ prime-select query
nvidia

So apparently the Vulkan Loader is used by default. It discovers ICD profiles and selects the first compatible ICD it finds, which happens to be NVIDIA when PRIME is set to performance mode. That means Zed has its own GPU selection logic that disregards PRIME being set to performance mode and tries to render to the Intel card, which is disabled in this mode.
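
When comparing what the loader exposes with what an application ends up using, it can help to reduce the vulkaninfo output to just the device list. The filter below is a sketch: the name `list_vulkan_gpus` is hypothetical, and it assumes header lines in exactly the `GPU id : N (name):` form shown above, which may differ between vulkaninfo versions.

```shell
# Hypothetical helper: read vulkaninfo text on stdin and print one
# "<id>: <device name>" line per GPU header such as
#   GPU id : 0 (Intel(R) UHD Graphics 630 (CFL GT2)):
# Duplicate headers (the same GPU listed in several sections) are collapsed.
list_vulkan_gpus() {
    sed -n 's/^GPU id : \([0-9][0-9]*\) (\(.*\)):$/\1: \2/p' | sort -u
}

# Usage:
#   vulkaninfo | list_vulkan_gpus
```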

@ConradIrwin
Member

ConradIrwin commented Jan 24, 2025

The selection logic is inherited from blade: https://github.com/kvark/blade/pull/210/files#diff-1e88249f181d87457b584699773e7fd5eac7933f9c19079a483f2fde18f51c0cR59 We already do some things to work around known bugs, and I imagine that if we can narrow down the cases that are broken for you precisely enough that we won't be breaking anyone else, we could fix the logic.

In the meantime (or in addition) I'd love to update the docs with any workarounds that work (though as above, I don't want to add digressions and background information; though links out to external resources would be helpful).

@Vanuan
Author

Vanuan commented Jan 24, 2025

@ConradIrwin
As I understand it, the PR you referenced only surfaces the API that allows forcing device_id selection, doesn't it? So it's still the app's responsibility to select the device.

Currently, there are multiple ways to influence device selection / rendering mode in a hybrid graphics environment:

  1. Select NVIDIA hybrid graphics mode: prime-select, nvidia-settings
  2. NVIDIA-specific offload rendering: the GNOME context menu entry "Use discrete ..." uses switcheroo under the hood and probably has the same effect as the NVIDIA offload rendering environment variable
  3. Native Vulkan / MESA driver env variables
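
A quick way to check which of these knobs are active in a given session is to print the relevant environment variables. This is only a sketch: the helper name `report_env_vars` is hypothetical, and of the variables in the usage comment only `__NV_PRIME_RENDER_OFFLOAD` and `VK_ICD_FILENAMES` come from this thread; `DRI_PRIME` and `MESA_VK_DEVICE_SELECT` are Mesa variables added here as assumptions about what the third item covers.

```shell
# Hypothetical helper: print NAME=value for each exported variable named on
# the command line, or NAME=(unset) when it is not in the environment.
report_env_vars() {
    local name value
    for name in "$@"; do
        if value=$(printenv "$name"); then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s=(unset)\n' "$name"
        fi
    done
}

# Usage:
#   report_env_vars __NV_PRIME_RENDER_OFFLOAD VK_ICD_FILENAMES DRI_PRIME MESA_VK_DEVICE_SELECT
#   prime-select query   # PRIME mode; only where prime-select is installed
```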

I think it is now more crucial than ever to document functionality in plain text rather than in code. I understand that a Linux troubleshooting guide might not be a great place for background information. But if Zed somehow influences device selection in addition to OS-level settings, the troubleshooting steps should at least reference the whole device selection pipeline at the concept level.

@ConradIrwin
Member

I was trying to link to "inspect_adaptor" code, not the PR, sorry!

Do you have a personal blog you could put the background information on? We could link out. The problem with this level of detail in the top level document is that most people don't need it, and one of the 5 other random env var hacks is what they're looking for.
