Update squeeze-excitation + ladder/legality/liberty to gcp/next #97

alreadydone · 2019-01-31T03:43:38Z

No description provided.

The previous method is too strict for fp16 compute. Since lower precision of fp16 is still good enough to play at the same strength as fp32 relax the self check. Pull request leela-zero#1698.

* Fix error calculation (Missing batch_size divider).
* Better error reporting when no working configuration could be found.
* Change reference data to have less rounding errors with half precision.
* Replace BLAS reference SGEMM with custom code that gives transposed output like the OpenCL SGEMM.

Pull request leela-zero#1710.

Should save a tiny bit of memory. Pull request leela-zero#1716.

Fall back to single precision net when half precision is broken, at least when detection mode is auto. Pull request leela-zero#1726.

Pull request leela-zero#1721.

Some OpenCL buffers were allocated too big. Tested with oclgrind that the new sizes are correct. Pull request leela-zero#1727.

Use smaller precision to store the weights to decrease the file size. See discussion in issue leela-zero#1733. Pull request leela-zero#1736.

* Network initialization restructuring
  - Create one net at a time when doing fp16/fp32 autodetect. Saves some GPU memory.
  - Create an internal lambda which initializes the nets.
  - Use std::copy to copy vectors to reduce runtime.
* zeropad_U : loop reordering for performance optimization. Plus other optimizations for zero-copying initialization.

Pull request leela-zero#1750.

Minor fixes to incorrect comments, and reduce some excessively long lines.

* Changed Validation and Game to support multiple GTP commands at start up but left the Validations options untouched.
* Separated engine options (as positional arguments) from match options. Replaced time settings option with ability to specify any GTP commands.
* Added --gtp-command options using the existing option parser. Also changed default binary options from -p 1600 to -v 3200.
* Each binary argument has to be preceded by "--".
* Changes to use Engine Objects.
* Exits on failed GTP command. Added printing of GTP commands in gameStart() so users can see what commands are actually sent to each engine.

Pull request leela-zero#1652.

* Don't refer to stone locations as "squares".
* Use "vertex" for those in the "letterbox" representation.
* Otherwise, mostly use "intersection".
* Also, capture all possible moves (i.e. including pass) in its own explicit constant.
* Clean up network constants.

Pull request leela-zero#1723.

Pull request leela-zero#1765.

Case-sensitive coordinates are a thing in SGF, not GTP. Pull request leela-zero#1793.

* Install to ${CMAKE_INSTALL_BINDIR}, some distros like to put games in /usr/games.
* Store/load leelaz_opencl_tuning and load weights file from system directories, i.e. ~/.local/share/leela-zero on Unix.
* Better error reporting when network weights file is not found.

Pull request leela-zero#1618.

* MCTS: Skip current expanding child when doing uct select. A search thread should explore other nodes in this case, which saves the search from some useless work. It benefits batching support too. Before this change, all threads could be busy waiting for the first node being expanded.
* Give expanding node a huge virtual loss instead to avoid crash when only one child exists.

Pull request leela-zero#1794.

Necessary for Clang. Fixes issue leela-zero#1809. Pull request leela-zero#1811.

As demanded by GTP, improving the input handling of GameState::play_textmove in the process (which now would crash if given a pass or resignation). Pull request leela-zero#1814.

Updated README to compile under Linux with Boost filesystem. Required after 73f1f93. Pull request leela-zero#1813.

Pull request leela-zero#1827.

Pull request leela-zero#1824.


According to @Atarust at leela-zero#1806 this fixes kernel compilation error with his configuration. No performance difference. Pull request leela-zero#1820.

Copying on weight construction keeps a copy of the weights on the host memory, at least for recent NVIDIA GPUs. Creating a buffer and then copying later on doesn't, and this saves memory. Pull request leela-zero#1818.

Pull request leela-zero#1826.

* Thread-safe UCTNodePointer

This makes almost all UCTNodePointer operations thread-safe. The only exceptions are destructors and when it is 'moved out'. Should even handle concurrent inflate() calls properly. Uses atomic operations to emulate locks only when needed. This includes support for re-expansion by forcibly moving the state back to INITIAL on a single-thread context.

Pull request leela-zero#1764.

Avoid having duplicate copies of the network weights in memory. Pull request leela-zero#1795.

Fixes issue leela-zero#1837. Pull request leela-zero#1838.

Fixes clang warning. Pull request leela-zero#1841.

When doing auto precision detection, make sure the prior implementation is destroyed before trying a new implementation. Pull request leela-zero#1842.

* Count memory consumption of a search tree by introducing a referencer for UCTNodePointer and UCTNode.
* NNCache: Add method to get estimated memory consumption.
* Extend Network with methods to estimate network size, network cache size and resize cache.
* Estimate total memory consumption as estimated network size + number of gpus * 85MB + estimated tree size * 1.1 + estimated cache size * 1.1
* Add command `lz-setoption` which behaves like set_option from UCI spec.
* Add option to set maximum memory consumption in MB.
* Add option to configure ratio of memory reserved for nn cache and search tree.
* Add command 'lz-estimatememory' which shows estimated memory consumption.
* Initialize maximum tree size and cache size after the network initialization.

Pull request leela-zero#1741.
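The estimate formula in the list above can be written out as a small function. This is only a sketch of the arithmetic: the function name, the parameter names and the choice of 1 MB = 1024 * 1024 bytes are assumptions, not taken from the pull request, and the 1.1 factor is approximated in integer arithmetic as x + x / 10.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not the engine's actual code.
// total = network + gpus * 85MB + tree * 1.1 + cache * 1.1
std::size_t estimate_total_memory(std::size_t network_bytes,
                                  std::size_t num_gpus,
                                  std::size_t tree_bytes,
                                  std::size_t cache_bytes) {
    // Assumption: "MB" means 1024 * 1024 bytes here.
    const std::size_t mib = 1024 * 1024;
    // Fixed per-GPU overhead from the formula.
    const std::size_t gpu_overhead = num_gpus * 85 * mib;
    // 10% margin on tree and cache, done with integers (x + x / 10).
    const std::size_t tree_with_margin = tree_bytes + tree_bytes / 10;
    const std::size_t cache_with_margin = cache_bytes + cache_bytes / 10;
    return network_bytes + gpu_overhead + tree_with_margin + cache_with_margin;
}
```

For example, a 1000-byte network with no GPU, a 100-byte tree and a 200-byte cache gives 1000 + 110 + 220 = 1330 bytes.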

Follow up to pull request leela-zero#1741.

* CPUPipe : change winograd transformation constants to an equation. Combined with a series of strength reduction changes, improves netbench by about 8%.
* Convert some std::array into individual variables. For some reason this allows gcc to optimize the code better, improving netbench by 2%.

Pull request leela-zero#2021.

Use hard-coded equations instead of matrix multiplication. Pull request leela-zero#2023.

Fix Validation -k option by reading its value before the parser is reused. Pull request leela-zero#2024.

See pull request leela-zero#2031.

Pull request leela-zero#2034.

Pull request leela-zero#2035.

Simplify instructions, especially related to building and running when wanting to contribute. Based on pull request leela-zero#1983.

* Move Engine to Game.h and refactor autogtp to use it too.
* Fix initialization of job engines.

Pull request leela-zero#2029.

Generally speaking, providing character pointers as the first argument directly might cause FSB (Format String Bug). Pull request leela-zero#2063.

Update from upstream f0b7045. Fixes warnings related to CL_TARGET_OPENCL_VERSION.

Pull request leela-zero#2033.

* Make AutoGTP URL parametric.
* Support for the sgfhash and movescount parameters in get-task.
* Automatic downloading of sgf and training files.
* Fix Management.cpp for older Qt5 versions.
* Added starting match games from specified initial position
* Tidy ValidationJob::init() like ProductionJob::init()
* Use existing QUuid method of generating random file names instead of QTemporaryFile when fetching game data. Moreover, we do not load training data in LeelaZ since it is not needed to start from an arbitrary position.

Pull request leela-zero#2052.

* Add optional separate options for white in match game.
* Fixed loading of saved match order with optionsSecond.

Pull request leela-zero#2078.

Pull request leela-zero#2072.

Pull request leela-zero#2093.

See issue leela-zero#2032. All contributors to the core engine have given their permission to add an additional permission to link with NVIDIA's CUDA/cuDNN/TensorRT libraries. This makes it possible to distribute the engine when built to use those libraries. Update the copyright notices to 2019.

Pull request leela-zero#2147.

Although the OpenCL driver is generally installed as part of the driver install, mention the requirement explicitly in case it wasn't. See pull request leela-zero#2138.

Pull request leela-zero#2135.

Pull request leela-zero#2134.

Fixes leela-zero#2020. Pull request leela-zero#2133.

See pull request leela-zero#2122.

Calling get_eval() on zero-visit node will assert-fail. The original code could assert-fail on b.get_eval() if 'a' and 'b' both had zero visits but suddenly 'a' gained an additional visit. Pull request leela-zero#2110.

Fixes issue leela-zero#2001. Pull request leela-zero#2114.

Follow up fix for pull request leela-zero#2114.

* AutoGTP: Allow specifying initial GTP commands. Also add support for white taking the first move in handicapped job games.
* AutoGTP: Refactored core loop for match games to avoid code duplication.
* Fixed white using black's match game settings after loading from an SGF by moving SGF loading into Game::gameStart() to before sending GTP commands (except handicap commands).
* Changed so that when an SGF file is loaded, AutoGTP determines whether handicap is in use from the SGF rather than from any starting GTP commands.

Pull request leela-zero#2096.

This includes some optimization improvements for newer GCC/Clang that may be relevant to a lot of our users. Pull request leela-zero#2151.

Fixes issue leela-zero#2167. I could swear I fixed this before. Maybe I forgot to push?

* AutoGTP: Added full engine options and starting GTP commands to SGF comments that are produced.
* Refactored Game::fixSgf().

Pull request leela-zero#2160.

Ttl and others added 30 commits August 6, 2018 15:14

Use L2-norm in self check.

488de43

The previous method is too strict for fp16 compute. Since lower precision of fp16 is still good enough to play at the same strength as fp32 relax the self check. Pull request leela-zero#1698.
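A relaxed check of this kind could look roughly like the sketch below, which compares a reference output against a tested output by relative L2 error instead of element by element. The function names and the idea of a caller-supplied threshold are assumptions for illustration; the actual self check and its tolerance live in the pull request.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: relative L2 error between a reference output
// (e.g. from the CPU/fp32 path) and a tested output (e.g. fp16).
double relative_l2_error(const std::vector<float>& reference,
                         const std::vector<float>& tested) {
    assert(reference.size() == tested.size());
    double diff_sq = 0.0;
    double ref_sq = 0.0;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        const double r = reference[i];
        const double d = r - static_cast<double>(tested[i]);
        diff_sq += d * d;
        ref_sq += r * r;
    }
    // Guard against an all-zero reference: fall back to the absolute error.
    if (ref_sq == 0.0) {
        return std::sqrt(diff_sq);
    }
    return std::sqrt(diff_sq / ref_sq);
}

// Hypothetical wrapper: the threshold is chosen by the caller.
bool self_check_passes(const std::vector<float>& reference,
                       const std::vector<float>& tested,
                       double max_relative_error) {
    return relative_l2_error(reference, tested) <= max_relative_error;
}
```

A single noisy element then only fails the check if it is large relative to the norm of the whole output, which is what makes this more tolerant of fp16 rounding than a strict per-element comparison.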

Change policy vector to array.

87c95c4

Should save a tiny bit of memory. Pull request leela-zero#1716.

Fall back to single precision net on breakage.

e72496d

Fall back to single precision net when half precision is broken, at least when detection mode is auto. Pull request leela-zero#1726.

AutoGTP: use compressed weights networks.

681229a

Pull request leela-zero#1721.

Fix OpenCL buffer sizes.

07c908e

Some OpenCL buffers were allocated too big. Tested with oclgrind that the new sizes are correct. Pull request leela-zero#1727.

Script for quantizing weights.

f85a685

Use smaller precision to store the weights to decrease the file size. See discussion in issue leela-zero#1733. Pull request leela-zero#1736.

Fix comments, code style.

7e889c7

Minor fixes to incorrect comments, and reduce some excessively long lines.

Don't use "void" as function parameter.

6eecb1e

Pull request leela-zero#1765.

Isolate and clean up text-to-vertex conversion.

0549816

Case-sensitive coordinates are a thing in SGF, not GTP. Pull request leela-zero#1793.
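For illustration, a case-insensitive GTP vertex parser might look like the sketch below. The function name and signature are hypothetical and not the engine's own; it relies only on the GTP convention that column letters skip "I", and it does not handle "pass" or "resign".

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: parse a GTP vertex such as "Q16" or "q16" into
// 0-based column and row. Returns false on malformed or out-of-range input.
bool parse_gtp_vertex(const std::string& text, int board_size,
                      int& column, int& row) {
    assert(board_size > 0 && board_size <= 25);
    if (text.size() < 2 || text.size() > 3) {
        return false;
    }
    // GTP is case-insensitive, so normalize the column letter.
    const int letter = std::toupper(static_cast<unsigned char>(text[0]));
    if (letter < 'A' || letter > 'Z' || letter == 'I') {
        return false;  // "I" is skipped in GTP column lettering.
    }
    int col = letter - 'A';
    if (letter > 'I') {
        --col;  // Compensate for the skipped "I".
    }
    // The remainder must be a 1- or 2-digit row number.
    int number = 0;
    for (std::size_t i = 1; i < text.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(text[i]))) {
            return false;
        }
        number = number * 10 + (text[i] - '0');
    }
    if (col >= board_size || number < 1 || number > board_size) {
        return false;
    }
    column = col;
    row = number - 1;
    return true;
}
```

Under this sketch "q16" and "Q16" resolve to the same intersection, while an SGF reader would need to treat the two cases as distinct coordinates.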

Convert string before variadic function call.

1042cb6

Necessary for Clang. Fixes issue leela-zero#1809. Pull request leela-zero#1811.

Always expect 2 arguments after "play" command.

51cba90

As demanded by GTP, improving the input handling of GameState::play_textmove in the process (which now would crash if given a pass or resignation). Pull request leela-zero#1814.

Update README with new boost dependencies.

b290f47

Updated README to compile under Linux with Boost filesystem. Required after 73f1f93. Pull request leela-zero#1813.

Fix boost package reference for VS2017 build.

5d4bd2f

Pull request leela-zero#1827.

Added missing files to MSVC 2015 projects.

5bd2ef4

Pull request leela-zero#1824.

Make Winograd matrices global.

dd95cab

According to @Atarust at leela-zero#1806 this fixes kernel compilation error with his configuration. No performance difference. Pull request leela-zero#1820.

OpenCL : Don't copy on weight construction.

5412e66

Copying on weight construction keeps a copy of the weights on the host memory, at least for recent NVIDIA GPUs. Creating a buffer and then copying later on doesn't, and this saves memory. Pull request leela-zero#1818.

Winograd filter transform and CPU in transform optimization.

7e13bf0

Pull request leela-zero#1826.

Pass network weight as a std::shared_ptr class.

cd48427

Avoid having duplicate copies of the network weights in memory. Pull request leela-zero#1795.

Fix vectorized Winograd transform.

0a0d134

Fixes issue leela-zero#1837. Pull request leela-zero#1838.

Remove unused lambda capture.

c21c8a4

Fixes clang warning. Pull request leela-zero#1841.

Reduce network memory usage when autodetecting.

cff3917

When doing auto precision detection, make sure the prior implementation is destroyed before trying a new implementation. Pull request leela-zero#1842.
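The ordering issue can be shown with a small sketch. `Backend` and `select_backend` are hypothetical stand-ins, not the engine's classes; the instance counters exist only to make the peak number of live objects observable.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>

// Hypothetical stand-in for a network implementation that holds GPU memory.
struct Backend {
    static int live;  // Currently alive instances.
    static int peak;  // Highest number of simultaneously alive instances.
    bool half_precision;
    explicit Backend(bool half) : half_precision(half) {
        ++live;
        peak = std::max(peak, live);
    }
    ~Backend() { --live; }
};
int Backend::live = 0;
int Backend::peak = 0;

// Try half precision first; on failure, destroy it BEFORE building the
// single precision replacement so both never coexist in memory.
std::unique_ptr<Backend> select_backend(bool half_precision_works) {
    auto backend = std::make_unique<Backend>(true);
    if (half_precision_works) {
        return backend;
    }
    backend.reset();  // Release the failed implementation first.
    backend = std::make_unique<Backend>(false);
    return backend;
}
```

Assigning the new `std::unique_ptr` directly, without the `reset()`, would construct the replacement while the old object is still alive, which is the transient double allocation the commit avoids.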

Assorted style nits and minor bugfixes.

c6999fc

Follow up to pull request leela-zero#1741.

ihavnoid and others added 30 commits November 17, 2018 23:45

Convolve in/out performance optimization.

304f9c7

Use hard-coded equations instead of matrix multiplication. Pull request leela-zero#2023.

Validation: fix -k option.

fc8d080

Fix Validation -k option by reading its value before the parser is reused. Pull request leela-zero#2024.

Add link to Azure free trial instructions.

6e88b95

See pull request leela-zero#2031.

Cleanup atomics and dead if.

666c0c6

Pull request leela-zero#2034.

Const in SGFTree.

8670a40

Pull request leela-zero#2035.

Make the README more clear.

77582b9

Simplify instructions, especially related to building and running when wanting to contribute. Based on pull request leela-zero#1983.

Refactor to allow AutoGTP to use Engine.

8daa0cd

* Move Engine to Game.h and refactor autogtp to use it too.
* Fix initialization of job engines.

Pull request leela-zero#2029.

Fix printf call style.

64097f0

Generally speaking, providing character pointers as the first argument directly might cause FSB (Format String Bug). Pull request leela-zero#2063.
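A minimal sketch of the safe call style, assuming a hypothetical helper name: the text to print is passed as an argument to a constant "%s" format instead of being used as the format string itself.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper showing the safe pattern.
// Unsafe (do not do this): std::snprintf(buffer, size, text);
// because any '%' sequence inside `text` would be interpreted as a
// conversion specifier with no matching argument.
std::string format_safely(const char* text) {
    assert(text != nullptr);
    char buffer[256];
    // Constant format string; `text` is only ever treated as data.
    std::snprintf(buffer, sizeof(buffer), "%s", text);
    return std::string(buffer);
}
```

With this form, input such as "100%s" is copied through unchanged rather than being parsed as a format directive.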

Update Khronos OpenCL C++ headers.

c157d0b

Update from upstream f0b7045. Fixes warnings related to CL_TARGET_OPENCL_VERSION.

Cleanup loop code.

bc3e750

Pull request leela-zero#2033.

Support separate options for white in match games.

08efb53

* Add optional separate options for white in match game.
* Fixed loading of saved match order with optionsSecond.

Pull request leela-zero#2078.

Add O(sqrt(log(n))) scaling to tree search.

39be654

Pull request leela-zero#2072.

Option to get network output without writing to cache.

21e3580

Pull request leela-zero#2093.

Add link to GoReviewPartner.

ce41cc1

Pull request leela-zero#2147.

Reminder to install OpenCL driver if separate.

4ca0734

Although the OpenCL driver is generally installed as part of the driver install, mention the requirement explicitly in case it wasn't. See pull request leela-zero#2138.

Fixed leelaz_file on Android.

d4c0380

Pull request leela-zero#2135.

Fix 'catching polymorphic type by value' warning.

f944b97

Pull request leela-zero#2134.

Fixed converter script for minigo removing bias.

4f12925

Fixes leela-zero#2020. Pull request leela-zero#2133.

Add zlib to the mac OS X build instructions.

44d0e6a

See pull request leela-zero#2122.

UCTNodePtr rare race condition fix.

d192fc6

Calling get_eval() on zero-visit node will assert-fail. The original code could assert-fail on b.get_eval() if 'a' and 'b' both had zero visits but suddenly 'a' gained an additional visit. Pull request leela-zero#2110.
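One way to avoid this kind of race is to read each visit count exactly once and base every later decision on those snapshots. The sketch below uses a simplified, hypothetical `Node` type and comparison function; it illustrates the pattern and is not the engine's actual comparator.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified node: only what the comparison needs.
struct Node {
    std::atomic<int> visits{0};
    std::atomic<double> eval_sum{0.0};
    float policy = 0.0f;

    double get_eval() const {
        const int v = visits.load();
        assert(v > 0);  // Mirrors the assert-fail described above.
        return eval_sum.load() / v;
    }
};

// Orders nodes by visits, then by policy prior (if unvisited) or eval.
bool less_than(const Node& a, const Node& b) {
    // Snapshot both counters once; re-reading them later could observe a
    // visit added by another thread between the two checks.
    const int a_visits = a.visits.load();
    const int b_visits = b.visits.load();
    if (a_visits != b_visits) {
        return a_visits < b_visits;
    }
    if (a_visits == 0) {
        // Both were unvisited at snapshot time: never call get_eval().
        return a.policy < b.policy;
    }
    // Both snapshots were non-zero, so get_eval() is safe as long as
    // visit counts never decrease (an assumption of this sketch).
    return a.get_eval() < b.get_eval();
}
```

Because the zero-visit branch is taken from the snapshot rather than from a second read of `a.visits`, a visit that lands on `a` in between can no longer steer the code into `b.get_eval()` while `b` is still unvisited.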

Make sure analysis is printed at least once.

bd0d667

Fixes issue leela-zero#2001. Pull request leela-zero#2114.

Don't post if not requested.

1960e93

Follow up fix for pull request leela-zero#2114.

Update Eigen to 3.3.7.

c7feb53

This includes some optimization improvements for newer GCC/Clang that may be relevant to a lot of our users. Pull request leela-zero#2151.

Fix lz-setoption name playouts.

085d71b

Fixes issue leela-zero#2167. I could swear I fixed this before. Maybe I forgot to push?

AutoGTP: More info in SGF comments.

9831c96

* AutoGTP: Added full engine options and starting GTP commands to SGF comments that are produced.
* Refactored Game::fixSgf().

Pull request leela-zero#2160.

copy branch

885b9eb
