forked from Ttl/leela-zero
Update squeeze-excitation + ladder/legality/liberty to gcp/next #97
Open: alreadydone wants to merge 102 commits into patch-31 from patch-35
The previous method is too strict for fp16 compute. Since the lower precision of fp16 is still good enough to play at the same strength as fp32, relax the self-check. Pull request leela-zero#1698.
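As an illustration of what a relaxed self-check can look like, here is a minimal sketch that compares a reference output against a lower-precision output using a relative error bound. The function names, the denominator floor, and the idea of passing the tolerance as a parameter are assumptions for illustration, not the code from the pull request.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: largest relative error between a reference output
// (e.g. computed in fp32) and a candidate output (e.g. computed in fp16).
inline float max_relative_error(const std::vector<float>& reference,
                                const std::vector<float>& candidate) {
    assert(reference.size() == candidate.size());
    float worst = 0.0f;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        // Floor the denominator so near-zero reference values do not
        // inflate the ratio (the 1e-3 floor is an assumed value).
        const float denom = std::max(std::fabs(reference[i]), 1e-3f);
        worst = std::max(worst, std::fabs(reference[i] - candidate[i]) / denom);
    }
    return worst;
}

// The tolerance is an assumed parameter: a caller would pass a looser
// bound for fp16 than for fp32.
inline bool self_check_passes(const std::vector<float>& reference,
                              const std::vector<float>& candidate,
                              float tolerance) {
    return max_relative_error(reference, candidate) <= tolerance;
}
```

With this shape, relaxing the check for fp16 amounts to choosing a larger `tolerance` for that precision rather than changing the comparison itself.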
* Fix error calculation (missing batch_size divider).
* Better error reporting when no working configuration could be found.
* Change reference data to have fewer rounding errors with half precision.
* Replace BLAS reference SGEMM with custom code that gives transposed output like the OpenCL SGEMM.

Pull request leela-zero#1710.
Should save a tiny bit of memory. Pull request leela-zero#1716.
Fall back to single precision net when half precision is broken, at least when detection mode is auto. Pull request leela-zero#1726.
Pull request leela-zero#1721.
Some OpenCL buffers were allocated too big. Tested with oclgrind that the new sizes are correct. Pull request leela-zero#1727.
Use smaller precision to store the weights to decrease the file size. See discussion in issue leela-zero#1733. Pull request leela-zero#1736.
* Network initialization restructuring
  - Create one net at a time when doing fp16/fp32 autodetect. Saves some GPU memory.
  - Create an internal lambda which initializes the nets.
  - Use std::copy to copy vectors to reduce runtime.
* zeropad_U: loop reordering for performance optimization. Plus other optimizations for zero-copying initialization.

Pull request leela-zero#1750.
Minor fixes to incorrect comments, and reduce some excessively long lines.
* Changed Validation and Game to support multiple GTP commands at start up but left the Validation options untouched.
* Separated engine options (as positional arguments) from match options. Replaced time settings option with ability to specify any GTP commands.
* Added --gtp-command options using the existing option parser. Also changed default binary options from -p 1600 to -v 3200.
* Each binary argument has to be preceded by "--".
* Changes to use Engine Objects.
* Exits on failed GTP command. Added printing of GTP commands in gameStart() so users can see what commands are actually sent to each engine.

Pull request leela-zero#1652.
* Don't refer to stone locations as "squares".
* Use "vertex" for those in the "letterbox" representation.
* Otherwise, mostly use "intersection".
* Also, capture all possible moves (i.e. including pass) in its own explicit constant.
* Clean up network constants.

Pull request leela-zero#1723.
Pull request leela-zero#1765.
Case-sensitive coordinates are a thing in SGF, not GTP. Pull request leela-zero#1793.
* Install to ${CMAKE_INSTALL_BINDIR}, some distros like to put games in /usr/games.
* Store/load leelaz_opencl_tuning and load weights file from system directories, i.e. ~/.local/share/leela-zero on Unix.
* Better error reporting when network weights file is not found.

Pull request leela-zero#1618.
* MCTS: Skip the currently expanding child when doing UCT select. The search thread should explore other nodes in this case, which saves the search from some useless work. It also benefits batching support. Before this change, all threads could be busy waiting for the first node being expanded.
* Give the expanding node a huge virtual loss instead, to avoid a crash when only one child exists.

Pull request leela-zero#1794.
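The selection idea described here can be sketched as follows. This is a simplified illustration under stated assumptions: the `Child` struct, the `select_child` function, and the penalty constant are hypothetical and do not reproduce the engine's UCTNode code. The point it shows is that penalising an expanding child, rather than skipping it, still leaves a valid choice when that child is the only one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified child record (not the engine's UCTNode).
struct Child {
    double value;    // selection score before any penalty
    bool expanding;  // true while another thread is expanding this child
};

// Returns the index of the child with the best penalised score.
// An expanding child is penalised instead of skipped, so a lone
// expanding child is still returned rather than leaving no candidate.
inline std::size_t select_child(const std::vector<Child>& children) {
    assert(!children.empty());
    constexpr double kExpandingPenalty = 1e6;  // assumed "huge virtual loss"
    std::size_t best = 0;
    double best_score = 0.0;
    for (std::size_t i = 0; i < children.size(); ++i) {
        const double score =
            children[i].value - (children[i].expanding ? kExpandingPenalty : 0.0);
        if (i == 0 || score > best_score) {
            best = i;
            best_score = score;
        }
    }
    return best;
}
```

When a non-expanding sibling exists it wins the comparison, so other threads move on to different nodes instead of waiting on the expansion.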
Necessary for Clang. Fixes issue leela-zero#1809. Pull request leela-zero#1811.
As demanded by GTP, improving the input handling of GameState::play_textmove in the process (which now would crash if given a pass or resignation). Pull request leela-zero#1814.
Updated README to compile under Linux with Boost filesystem. Required after 73f1f93. Pull request leela-zero#1813.
Pull request leela-zero#1824.
According to @Atarust at leela-zero#1806 this fixes kernel compilation error with his configuration. No performance difference. Pull request leela-zero#1820.
Copying on weight construction keeps a copy of the weights on the host memory, at least for recent NVIDIA GPUs. Creating a buffer and then copying later on doesn't, and this saves memory. Pull request leela-zero#1818.
* Thread-safe UCTNodePointer

This makes almost all UCTNodePointer operations thread-safe. The only exceptions are destructors and when it is 'moved out'. Should even handle concurrent inflate() calls properly. Uses atomic operations to emulate locks only when needed. This includes support for re-expansion by forcibly moving the state back to INITIAL in a single-threaded context.

Pull request leela-zero#1764.
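A minimal sketch of emulating a lock with an atomic state variable, in the spirit of the description above. Only the INITIAL state name appears in the commit message; the `LazySlot` class, the other state names, and the integer payload are assumptions for illustration and are not the actual UCTNodePointer implementation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical lazily built slot. Only the INITIAL state name comes from
// the commit message; everything else is assumed for illustration.
class LazySlot {
public:
    enum State : int { INITIAL, INFLATING, INFLATED };

    // Safe to call concurrently: exactly one caller builds the payload,
    // the others wait until it has been published.
    void inflate() {
        for (;;) {
            int expected = INITIAL;
            if (m_state.compare_exchange_strong(expected, INFLATING)) {
                m_payload = 42;  // stand-in for the expensive construction
                ++m_builds;
                m_state.store(INFLATED);  // publish the result
                return;
            }
            if (expected == INFLATED) {
                return;  // somebody else already finished
            }
            std::this_thread::yield();  // another thread is inflating
        }
    }

    // Only valid when no other thread is using the slot.
    void reset_single_threaded() {
        assert(m_state.load() != INFLATING);
        m_state.store(INITIAL);
    }

    int state() const { return m_state.load(); }
    int payload() const { return m_payload; }
    int builds() const { return m_builds; }

private:
    std::atomic<int> m_state{INITIAL};
    int m_payload{0};
    int m_builds{0};
};
```

The compare-and-swap acts as a try-lock that is only contended during inflation; once the state is INFLATED, readers pay for a single atomic load.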
Avoid having duplicate copies of the network weights in memory. Pull request leela-zero#1795.
Fixes issue leela-zero#1837. Pull request leela-zero#1838.
Fixes clang warning. Pull request leela-zero#1841.
When doing auto precision detection, make sure the prior implementation is destroyed before trying a new implementation. Pull request leela-zero#1842.
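The ordering issue can be sketched with a smart pointer. The `Backend` type, its instance counters, and `select_backend` are hypothetical stand-ins, not the engine's classes; the sketch only shows why releasing the first attempt before constructing the fallback keeps at most one implementation alive at a time.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>

// Hypothetical stand-in for a GPU-backed network implementation.
struct Backend {
    static int live;  // instances currently alive
    static int peak;  // highest number alive at the same time
    explicit Backend(bool half_precision) : half(half_precision) {
        peak = std::max(peak, ++live);
    }
    ~Backend() { --live; }
    bool half;
};
int Backend::live = 0;
int Backend::peak = 0;

// fp16_works is an assumed input standing in for the result of a self-check.
inline std::unique_ptr<Backend> select_backend(bool fp16_works) {
    std::unique_ptr<Backend> backend(new Backend(true));  // try fp16 first
    if (!fp16_works) {
        backend.reset();                    // destroy the fp16 attempt first
        backend.reset(new Backend(false));  // then build the fp32 fallback
    }
    return backend;
}
```

Writing `backend.reset(new Backend(false))` without the preceding `backend.reset()` would construct the fallback while the first attempt still exists, so both would briefly occupy memory.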
* Count memory consumption of a search tree by introducing a referencer for UCTNodePointer and UCTNode.
* NNCache: Add method to get estimated memory consumption.
* Extend Network with methods to estimate network size, network cache size and resize cache.
* Estimate total memory consumption as estimated network size + number of gpus * 85MB + estimated tree size * 1.1 + estimated cache size * 1.1.
* Add command `lz-setoption` which behaves like set_option from UCI spec.
* Add option to set maximum memory consumption in MB.
* Add option to configure ratio of memory reserved for nn cache and search tree.
* Add command `lz-estimatememory` which shows estimated memory consumption.
* Initialize maximum tree size and cache size after the network initialization.

Pull request leela-zero#1741.
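The estimate formula above can be written out as a small function. The struct, the function name, and the choice of 1 MB = 1024 * 1024 bytes are assumptions for illustration; only the terms of the formula come from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: "MB" is taken as 1024 * 1024 bytes in this sketch.
constexpr std::size_t kMiB = 1024 * 1024;

// Hypothetical inputs, all sizes in bytes.
struct MemoryInputs {
    std::size_t network_bytes;  // estimated network size
    std::size_t gpu_count;      // number of GPUs
    std::size_t tree_bytes;     // estimated search tree size
    std::size_t cache_bytes;    // estimated NN cache size
};

// network + gpus * 85MB + tree * 1.1 + cache * 1.1
inline std::size_t estimate_total_bytes(const MemoryInputs& in) {
    assert(in.gpu_count >= 1);
    const std::size_t per_gpu = 85 * kMiB;
    // The 1.1 factor is applied as x + x / 10 to stay in integer arithmetic.
    return in.network_bytes + in.gpu_count * per_gpu
         + (in.tree_bytes + in.tree_bytes / 10)
         + (in.cache_bytes + in.cache_bytes / 10);
}
```

For example, a 100 MB network on 2 GPUs with a 10 MB tree and a 20 MB cache gives 100 + 170 + 11 + 22 = 303 MB.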
Follow up to pull request leela-zero#1741.
* CPUPipe: change winograd transformation constants to an equation. Combined with a series of strength reduction changes, improves netbench by about 8%.
* Convert some std::array into individual variables. For some reason this allows gcc to optimize the code better, improving netbench by 2%.

Pull request leela-zero#2021.
Use hard-coded equations instead of matrix multiplication. Pull request leela-zero#2023.
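To illustrate the general idea of replacing a transform matrix multiplication with hard-coded equations, here is a sketch using the small 1-D Winograd F(2,3) input transform. This is a swapped-in, simpler transform chosen for brevity; it is not claimed to be the tile size or the code used in the engine, and the function names are hypothetical.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Reference version: generic matrix-vector product with the B^T matrix
// of the 1-D Winograd F(2,3) input transform.
inline Vec4 transform_matmul(const Vec4& d) {
    static constexpr float Bt[4][4] = {
        {1.0f,  0.0f, -1.0f,  0.0f},
        {0.0f,  1.0f,  1.0f,  0.0f},
        {0.0f, -1.0f,  1.0f,  0.0f},
        {0.0f,  1.0f,  0.0f, -1.0f}};
    Vec4 out{};
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            out[r] += Bt[r][c] * d[c];
        }
    }
    return out;
}

// Hard-coded version: the same transform written as explicit equations,
// which drops every multiplication by 0 and 1.
inline Vec4 transform_hardcoded(const Vec4& d) {
    return Vec4{{d[0] - d[2], d[1] + d[2], d[2] - d[1], d[1] - d[3]}};
}
```

The hard-coded form needs four subtractions or additions, where the generic loop performs sixteen multiply-adds, most of them with zero.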
Fix Validation -k option by reading its value before the parser is reused. Pull request leela-zero#2024.
See pull request leela-zero#2031.
Pull request leela-zero#2034.
Pull request leela-zero#2035.
Simplify instructions, especially related to building and running when wanting to contribute. Based on pull request leela-zero#1983.
* Move Engine to Game.h and refactor autogtp to use it too.
* Fix initialization of job engines.

Pull request leela-zero#2029.
Generally speaking, passing a character pointer directly as the first (format) argument can cause a format string bug (FSB). Pull request leela-zero#2063.
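A short sketch of the pattern in question, using the standard `std::snprintf`. The helper name and buffer size are hypothetical; the point is that caller-supplied text goes in as an argument to a fixed `"%s"` format, never as the format itself.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: copies caller-supplied text into a buffer.
// The text is passed as an argument to a fixed "%s" format, so any
// '%' sequences inside it are treated as plain characters.
inline std::string format_message_safely(const char* text) {
    assert(text != nullptr);
    char buffer[256];  // assumed size; longer text is truncated
    // Unsafe variant (do not use): std::snprintf(buffer, sizeof buffer, text);
    std::snprintf(buffer, sizeof buffer, "%s", text);
    return std::string(buffer);
}
```

With the unsafe variant, text such as `"100%s done"` would make the formatter read an argument that was never passed, which is undefined behavior.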
Update from upstream f0b7045. Fixes warnings related to CL_TARGET_OPENCL_VERSION.
Pull request leela-zero#2033.
* Make AutoGTP URL parametric.
* Support for the sgfhash and movescount parameters in get-task.
* Automatic downloading of sgf and training files.
* Fix Management.cpp for older Qt5 versions.
* Added starting match games from specified initial position.
* Tidy ValidationJob::init() like ProductionJob::init().
* Use existing QUuid method of generating random file names instead of QTemporaryFile when fetching game data. Moreover, we do not load training data in LeelaZ since it is not needed to start from an arbitrary position.

Pull request leela-zero#2052.
* Add optional separate options for white in match game.
* Fixed loading of saved match order with optionsSecond.

Pull request leela-zero#2078.
See issue leela-zero#2032. All contributors to the core engine have given their permission to add an additional permission to link with NVIDIA's CUDA/cuDNN/TensorRT libraries. This makes it possible to distribute the engine when built to use those libraries. Update the copyright notices to 2019.
Pull request leela-zero#2147.
Although the OpenCL driver is generally installed as part of the driver install, mention the requirement explicitly in case it wasn't. See pull request leela-zero#2138.
Pull request leela-zero#2135.
Fixes leela-zero#2020. Pull request leela-zero#2133.
See pull request leela-zero#2122.
Calling get_eval() on a zero-visit node will assert-fail. The original code could assert-fail on b.get_eval() if 'a' and 'b' both had zero visits but 'a' suddenly gained an additional visit. Pull request leela-zero#2110.
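One way to avoid this kind of race is to read each visit count once and base every later decision on those snapshots. The sketch below assumes a simplified `Node` with a policy prior as the zero-visit tie-breaker; the struct, its fields, and `less_than` are hypothetical and are not the engine's actual comparator.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified node (not the engine's UCTNode).
struct Node {
    std::atomic<int> visits{0};
    double eval_sum{0.0};
    float policy{0.0f};

    // Mirrors the constraint from the commit message: evaluating a
    // zero-visit node is an error.
    double get_eval() const {
        const int v = visits.load();
        assert(v > 0);
        return eval_sum / v;
    }
};

// Orders nodes by visits, then by policy (zero visits) or eval.
// Visit counts are read once, so a concurrent increment between two
// reads cannot steer the code into get_eval() on a zero-visit node.
inline bool less_than(const Node& a, const Node& b) {
    const int a_visits = a.visits.load();
    const int b_visits = b.visits.load();
    if (a_visits != b_visits) {
        return a_visits < b_visits;
    }
    if (a_visits == 0) {
        return a.policy < b.policy;  // both snapshots are zero
    }
    return a.get_eval() < b.get_eval();  // both snapshots are non-zero
}
```

Because visit counts only grow, a node whose snapshot was non-zero is still safe to evaluate later, even if other threads keep adding visits.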
Fixes issue leela-zero#2001. Pull request leela-zero#2114.
Follow up fix for pull request leela-zero#2114.
* AutoGTP: Allow specifying initial GTP commands. Also add support for white taking the first move in handicapped job games.
* AutoGTP: Refactored core loop for match games to avoid code duplication.
* Fixed white using black's match game settings after loading from an SGF by moving SGF loading into Game::gameStart() to before sending GTP commands (except handicap commands).
* Changed so that when an SGF file is loaded, AutoGTP determines whether handicap is in use from the SGF rather than from any starting GTP commands.

Pull request leela-zero#2096.
This includes some optimization improvements for newer GCC/Clang that may be relevant to a lot of our users. Pull request leela-zero#2151.
Fixes issue leela-zero#2167. I could swear I fixed this before. Maybe I forgot to push?
* AutoGTP: Added full engine options and starting GTP commands to SGF comments that are produced.
* Refactored Game::fixSgf().

Pull request leela-zero#2160.
No description provided.