forked from Ttl/leela-zero
Tensorcore -> fastexit #73
Merged
Should fix issue leela-zero#1921.
Required on macOS, and probably other platforms. Fixes issue leela-zero#1901. Pull request leela-zero#1910.
Implement a few more parameters that can be set via lz-setoption: visits, playouts, pondering, resign threshold and the lag buffer. The provided values are currently not checked against the reported min/max values; we rely on the UI to send valid ones. This could be addressed in a refactoring. Similarly, command-line and setoption values should probably be treated in a unified way. Remove the bogus boolean return value from GTP processing functions. Minor style fix for old GTP code. Pull request leela-zero#1927.
We need to call UCTNode::create_children() even if we aren't expanding because that moves our node's state from INITIAL to EXPANDED. Pull request leela-zero#1928.
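The state transition described above can be pictured with a minimal sketch. The type `SketchNode`, its fields, and the `expand` flag are hypothetical names chosen for illustration; they are not Leela Zero's actual `UCTNode` interface, which may track more states than the two named in the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Only the two states named in the commit message are modeled here.
enum class ExpandState { INITIAL, EXPANDED };

// Hypothetical stand-in for a search tree node.
struct SketchNode {
    ExpandState state = ExpandState::INITIAL;
    std::size_t child_count = 0;

    // Assumption: 'expand' tells us whether children should really be added.
    // The state moves to EXPANDED either way, so callers that test the state
    // never see a visited node stuck in INITIAL.
    void create_children(std::size_t legal_moves, bool expand) {
        if (expand) {
            child_count = legal_moves;
        }
        state = ExpandState::EXPANDED;
    }

    bool is_expanded() const { return state == ExpandState::EXPANDED; }
};
```

Skipping the call entirely would leave the node in `INITIAL`, which is the situation the commit avoids.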
If the tuner failed during precision autodetection, its error output on stdout was read as a GTP message. Pull request leela-zero#1935.
* Fall back to single precision when the GPU claims fp16 support but it doesn't work.
* Net initialization fixes:
  - Try at least one selfcheck eval when autodetecting precision
  - Revive selfcheck when using Eigen

Pull request leela-zero#1934.
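A rough sketch of the fallback logic, assuming a hypothetical `selfcheck` callback that reports whether a test evaluation at a given precision produced sane output. The names `Precision`, `choose_precision`, and the callback signature are illustrative assumptions, not the project's actual code.

```cpp
#include <cassert>
#include <functional>

enum class Precision { Half, Single };

// Hypothetical helper: pick half precision only if the device claims fp16
// support AND at least one selfcheck evaluation at half precision passes.
// Otherwise fall back to single precision.
Precision choose_precision(bool device_claims_fp16,
                           const std::function<bool(Precision)>& selfcheck) {
    if (device_claims_fp16 && selfcheck(Precision::Half)) {
        return Precision::Half;
    }
    return Precision::Single;
}
```

The key point is that the device's claim alone is not trusted; the selfcheck result decides.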
Fixes issue leela-zero#1938.
Many lz-setoption commands forget to send the closing GTP "=" response when they succeed. This freezes GUIs, which keep waiting for the reply. Fixes issue leela-zero#1940.
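For context, GTP replies begin with `=` on success or `?` on failure, optionally followed by the command id, and end with a blank line. A minimal sketch of that framing follows; `gtp_format_response` is a hypothetical helper written for illustration, not a function from the Leela Zero source.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build a GTP reply string.
// A negative id means the command was sent without an id.
std::string gtp_format_response(bool success, int id, const std::string& message) {
    std::string out = success ? "=" : "?";
    if (id >= 0) {
        out += std::to_string(id);
    }
    out += " ";
    out += message;
    out += "\n\n";  // the blank line terminates the reply
    return out;
}
```

A successful command with nothing to report still has to emit the reply, for example `gtp_format_response(true, -1, "")`; a controller that never receives it has no way to know the command finished.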
We've supported HTTPS on the server side for a while now; make it the default.
Colab has updated and the instructions here probably no longer work. They should probably be hosted elsewhere, too.
Define a variable closer to the usage point.
* Separate FPU-reduction setting for root.
* Removed fpu_root_reduction.

Pull request leela-zero#1960.
Link to Google Cloud tutorial on Google Docs. Pull request leela-zero#1961.
Delete outdated questions and answers. Pull request leela-zero#196.
Disabling input buffering on Windows causes breakage that looks like input buffering stays enabled. This was accounted for in the code, but the #define check was against a non-default flag, and a different one than is used elsewhere.
Even though SGF defaults to size 19 boards, we should not try to set up a board that size if LZ has not been compiled to support it. Pull request leela-zero#1964.
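The check can be sketched as follows. `COMPILED_BOARD_SIZE`, its value of 9, and `can_set_up_board` are assumptions made for this example and do not mirror the project's real identifiers.

```cpp
#include <cassert>
#include <optional>

// Assumption for this sketch: the engine was built for 9x9 boards.
constexpr int COMPILED_BOARD_SIZE = 9;

// SGF files without an explicit size property default to 19x19.
constexpr int SGF_DEFAULT_BOARD_SIZE = 19;

// Hypothetical helper: only set up a board whose size matches the
// size the engine was compiled for.
bool can_set_up_board(std::optional<int> sgf_size) {
    const int size = sgf_size.value_or(SGF_DEFAULT_BOARD_SIZE);
    return size == COMPILED_BOARD_SIZE;
}
```

With this build assumption, an SGF file that omits the size resolves to 19 and is rejected instead of being set up on an unsupported board.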
Without this, it's empirically not possible to load the current 256x40 networks on a 32-bit machine.
If we are trying to auto-select the best device for OpenCL, never select a CPU. This will cause the engine to refuse to run when people are trying to run the OpenCL version without a GPU or without GPU drivers, instead of selecting any slow and suboptimal (and empirically extremely broken) OpenCL-on-CPU drivers. Falling back to CPU-only would be another reasonable alternative, but doesn't provide an alert in case the GPU drivers are missing. Improves behavior of issue leela-zero#1994.
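The selection rule can be illustrated without touching the OpenCL API at all. In this sketch `DeviceInfo`, `DeviceType`, the `score` field, and `pick_best_device` are hypothetical; the real code queries OpenCL platforms and devices and uses its own scoring.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

enum class DeviceType { CPU, GPU };

// Hypothetical description of an enumerated compute device.
struct DeviceInfo {
    std::string name;
    DeviceType type;
    int score;  // assumed heuristic: higher is better
};

// Return the best-scoring GPU, or nothing if no GPU is present.
// CPU devices are skipped even when they would score highest.
std::optional<DeviceInfo> pick_best_device(const std::vector<DeviceInfo>& devices) {
    std::optional<DeviceInfo> best;
    for (const auto& d : devices) {
        if (d.type != DeviceType::GPU) {
            continue;
        }
        if (!best || d.score > best->score) {
            best = d;
        }
    }
    return best;
}
```

An empty result is the signal for the caller to stop with an error, which is what alerts the user to missing GPU drivers.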
Fix full tuner for heterogeneous GPUs and auto precision detection.

* --full-tuner implies --tune-only
* --full-tuner requires an explicit precision

Fixes leela-zero#1973. Pull request leela-zero#2004.
Very minor speedup of about 2% with batch size of 1. With batch size of 5 there is a speedup of about 5% with half precision and 12% with single precision. Out transformation memory accesses are almost completely coalesced with the new kernel. Pull request leela-zero#2014.
From upstream a807dcf0f8623d40dc5ce9d1eb00ffd0e46150c7.
* CPUPipe: change Winograd transformation constants to an equation. Combined with a series of strength reduction changes, improves netbench by about 8%.
* Convert some std::array into individual variables. For some reason this allows gcc to optimize the code better, improving netbench by 2%.

Pull request leela-zero#2021.
Use hard-coded equations instead of matrix multiplication. Pull request leela-zero#2023.
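To illustrate the general idea, here is a sketch using the small one-dimensional Winograd F(2,3) input transform, chosen only because its matrix is short. This is an assumption for illustration: the transform in the actual commit may use a different tile size and different constants. Because the matrix contains only 0, 1, and -1, the product collapses to a few additions and subtractions.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Generic version: multiply the input tile by the F(2,3) matrix B^T.
Vec4 input_transform_matmul(const Vec4& d) {
    static constexpr float Bt[4][4] = {
        {1.0f,  0.0f, -1.0f,  0.0f},
        {0.0f,  1.0f,  1.0f,  0.0f},
        {0.0f, -1.0f,  1.0f,  0.0f},
        {0.0f,  1.0f,  0.0f, -1.0f},
    };
    Vec4 out{};
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            out[i] += Bt[i][j] * d[j];
        }
    }
    return out;
}

// Hard-coded version: the same result written out as equations,
// with no multiplications and no work spent on zero entries.
Vec4 input_transform_equations(const Vec4& d) {
    return {d[0] - d[2], d[1] + d[2], d[2] - d[1], d[1] - d[3]};
}
```

Both functions compute the same values; the second simply gives the compiler less to do.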
Fix Validation -k option by reading its value before the parser is reused. Pull request leela-zero#2024.
See pull request leela-zero#2031.
No description provided.