merge master -> libedax_sensuikan1973 #24

Closed
wants to merge 288 commits into from
7307503
merge with edax 4.5.2
abulmo Aug 31, 2024
8deaca7
merge with edax 4.5.2
abulmo Aug 31, 2024
0076d81
merge with edax 4.5.2
abulmo Aug 31, 2024
aab3c71
-
abulmo Aug 31, 2024
5ec14bc
merge with edax 4.5.2
abulmo Aug 31, 2024
319eb54
-
abulmo Aug 31, 2024
0ac04f2
merge with edax 4.5.2
abulmo Aug 31, 2024
9e0a936
merge with edax 4.5.2
abulmo Aug 31, 2024
bfdbbc6
merge with edax 4.5.2
abulmo Aug 31, 2024
b8f33e9
merge with edax 4.5.2
abulmo Aug 31, 2024
5079b4c
merge with edax 4.5.2
abulmo Aug 31, 2024
b7c4af8
merge with edax 4.5.2
abulmo Aug 31, 2024
51a3c75
merge with edax 4.5.2
abulmo Aug 31, 2024
f89b006
merge with edax 4.5.2
abulmo Aug 31, 2024
e824c0d
merge with edax 4.5.2
abulmo Aug 31, 2024
96a1199
merge with edax 4.5.2
abulmo Aug 31, 2024
6b16df1
merge with edax 4.5.2
abulmo Aug 31, 2024
a2e165b
merge with edax 4.5.2
abulmo Aug 31, 2024
b42f2b7
merge with edax 4.5.2
abulmo Aug 31, 2024
dc1e734
merge with edax 4.5.2
abulmo Aug 31, 2024
00777be
merge with edax 4.5.2
abulmo Aug 31, 2024
e13120c
merge with edax 4.5.2
abulmo Aug 31, 2024
e975690
merge with edax 4.5.2
abulmo Aug 31, 2024
b2ae628
merge with edax 4.5.2
abulmo Aug 31, 2024
f164471
merge with edax 4.5.2
abulmo Aug 31, 2024
b55c935
merge with edax 4.5.2# Prochaines commandes à effectuer (217 commande…
abulmo Aug 31, 2024
66e1258
merge with edax 4.5.2
abulmo Aug 31, 2024
a7d06da
merge with edax 4.5.2
abulmo Aug 31, 2024
3f1b308
merge with edax 4.5.2
abulmo Aug 31, 2024
433c006
merge with edax 4.5.2
abulmo Aug 31, 2024
3c18fd9
merge with edax 4.5.2
abulmo Aug 31, 2024
859a925
merge with edax 4.5.2
abulmo Aug 31, 2024
4032c07
merge with edax 4.5.2
abulmo Aug 31, 2024
b349808
merge with edax 4.5.2
abulmo Aug 31, 2024
e86b5cd
merge with edax 4.5.2
okuhara Dec 12, 2014
5bfb580
CONDITION_VARIABLE ifdef'd out for mingw64 3.0+.
okuhara Dec 15, 2014
3c868dc
count_last_flip_bmi2 and transpose_avx2 added
abulmo Aug 31, 2024
980457b
merge with edax 4.5.2
abulmo Aug 31, 2024
29db7e1
merge README.md
abulmo Aug 31, 2024
9aba07c
merge with edax 4.5.2
abulmo Aug 31, 2024
99c74ef
merge with edax 4.5.2
abulmo Aug 31, 2024
b26c8ce
Update README.md
okuhara Apr 25, 2017
9b1d7ff
filp_sse_bitscan.c (experimental) added; Makefile modified.
abulmo Aug 31, 2024
27a9cec
Some cleanups for clang / android build
abulmo Aug 31, 2024
ae8afa8
Update Doxygen docs
okuhara Jul 21, 2017
b1771d3
merge with edax 4.5.2
abulmo Aug 31, 2024
56ca3ea
copyright changes
abulmo Aug 31, 2024
e0dd108
merge with edax 4.5.2
abulmo Aug 31, 2024
7b5554a
merge with edax 4.5.2
abulmo Aug 31, 2024
8e88c3d
merge with edax 4.5.2
abulmo Aug 31, 2024
50a80c5
merge with edax 4.5.2
abulmo Aug 31, 2024
6ab6f6b
merge with edax 4.5.2
abulmo Aug 31, 2024
eafe436
merge with edax 4.5.2
abulmo Aug 31, 2024
8331875
merge with edax 4.5.2
abulmo Aug 31, 2024
03884cb
merge with edax 4.5.2
abulmo Aug 31, 2024
fb48e53
merge with edax 4.5.2
abulmo Aug 31, 2024
411d0ce
merge with edax 4.5.2
abulmo Aug 31, 2024
7def8b1
merge with edax 4.5.2
abulmo Aug 31, 2024
390fe17
merge with edax 4.5.2
abulmo Aug 31, 2024
b84ca1f
merge with edax 4.5.2
abulmo Aug 31, 2024
b04793b
merge with edax 4.5.2
abulmo Aug 31, 2024
072f84e
merge with edax 4.5.2
abulmo Aug 31, 2024
abd6db7
merge with edax 4.5.2
abulmo Aug 31, 2024
4ecee9b
merge with edax 4.5.2
abulmo Aug 31, 2024
e87d224
merge with edax 4.5.2
abulmo Aug 31, 2024
7768763
HashData and HashStoreData rearranged, TYPE_PUNING now uses union
abulmo Aug 31, 2024
4c85062
Add x64-popcnt build (for Athlon64)
okuhara Mar 14, 2020
179c189
Make eval_swap public and inline some
abulmo Aug 31, 2024
12863f5
minor optimize in search_eval_1/2 and search_shallow
abulmo Aug 31, 2024
26beb66
Change pointer-linked empty list to index-linked
abulmo Aug 31, 2024
49e2694
Change 32-bit get_moves_mmx/sse parameters to 64 bits
abulmo Aug 31, 2024
5987c94
Move n_empties into Eval; tweak eval_open and eval_set
abulmo Aug 31, 2024
144ac19
Halves EVAL_WEIGHT table by n_empties parity instead of eval.player.
abulmo Aug 31, 2024
7f565e4
Change parameter order for vectorcall; use PMOVZXBW in AVX build
okuhara Mar 18, 2020
56b57fd
Clearer Hash align for non-pow-2 sizeof(HASH)
abulmo Aug 31, 2024
a3fb463
mirror_byte added for 1 byte bit reverse
abulmo Aug 31, 2024
c748489
Optimize endgame (esp. 2 empties) score comparisons
abulmo Aug 31, 2024
e7e9843
Fix microbench not to be optimized out
abulmo Aug 31, 2024
69b497f
Add experimemtal AVX2 pcmpeqq get_stability
okuhara Mar 24, 2020
ac0fd8b
Change popcnt build to k10 build using flip_bitscan
abulmo Aug 31, 2024
8c6377c
Change store order to reduce register saving
abulmo Aug 31, 2024
f34f750
Reduce flip table by rotated outflank; revise lzcnt & rol8 defs
abulmo Aug 31, 2024
86b4a03
Drop /GL from clang build
abulmo Aug 31, 2024
803bb96
Update last_bit from chessprogramming wiki
abulmo Aug 31, 2024
3e0e0ce
adding back MSB to get flip mask
abulmo Aug 31, 2024
cb448e8
Use same OutflankToFlip as flip_bitscan, and fix typo bug
abulmo Aug 31, 2024
97a24ef
Satisfy msys2 and gcc 9 warnings
abulmo Aug 31, 2024
c7958a8
Faster flip_avx (ppfill) and variants added
abulmo Aug 31, 2024
15ed33f
table lookup bit_count for non-POPCOUNT from stockfish
abulmo Aug 31, 2024
a31d4ca
fix cr/lf in repository to lf
abulmo Aug 31, 2024
30ccd11
Slightly reduce devendency in flip_avx_ppfill.c
okuhara May 5, 2020
9c06808
Small fix on debug build, etc.
abulmo Aug 31, 2024
5b6eead
Refine arm builds adding neon support.
abulmo Aug 31, 2024
32cb7ef
Static link to pthread on MSYS2 x86 build
abulmo Aug 31, 2024
efbc181
Avoid modern compliler warnings
abulmo Aug 31, 2024
3c6850b
Experimental AVX512VL/CD version of move generator
abulmo Aug 31, 2024
d886645
More neon optimizations; split bit_intrinsics.h from bit.h
abulmo Aug 31, 2024
d0fc28d
Minor optimization from ubiquitin Blog
okuhara Sep 22, 2020
7660c20
More neon/sse optimizations; neon dispatch added for arm32
abulmo Aug 31, 2024
10b0ee4
Optimize search_shallow in endgame.c; revise eval_update parameters
abulmo Aug 31, 2024
b0f8a63
Groups out accumlate_eval subroutine
abulmo Aug 31, 2024
db97a4c
Fix macro expansion; correct comments
abulmo Aug 31, 2024
a38030f
Fix x_to_bit to table where x may be PASS
okuhara Dec 2, 2021
26bec7a
4.5.0: Use CRC32c for board hash
abulmo Aug 31, 2024
171cab1
Use computation or optional pdep to unpack A1_A8
abulmo Aug 31, 2024
5ca3be9
inline board_update and omit restore
abulmo Aug 31, 2024
4b90d15
split get_all_full_lines from get_stability
abulmo Aug 31, 2024
5ff34ed
Dogaishi hash reduction by Matsuo & Narazaki; edge-precise get_full_line
abulmo Aug 31, 2024
805b543
Kindergarten last flip for arm32; MSVC arm Windows build (not tested)
abulmo Aug 31, 2024
2eac4bf
Use player bits only in board_score_1
abulmo Aug 31, 2024
ea7cff5
Share all full lines between get_stability and Dogaishi hash reduction
abulmo Aug 31, 2024
f72e25a
Store solid-normalized hash in PVS_midgame
abulmo Aug 31, 2024
dd9878a
SWAR vector eval update; more restore in search_restore_midgame
abulmo Aug 31, 2024
ff0213e
Inlining move_evaluate; skip movelist_evaluate if empty = 1
abulmo Aug 31, 2024
b0627b0
Backport endgame_sse optimizations into endgame.c
abulmo Aug 31, 2024
96bebd1
Fill struct Search AVX alignment hole
abulmo Aug 31, 2024
ddbf717
Change EVAL_FEATURE to struct for readability; decrease EVAL_N_PLY
abulmo Aug 31, 2024
d12962c
VPGATHERDD accumlate_eval
abulmo Aug 31, 2024
2d761fe
Fix 'nboard pass not parsed' bug, crc32c for game hash too
abulmo Aug 31, 2024
6268100
VPGATHERDD accumlate_eval
abulmo Aug 31, 2024
6623e3b
Returns all full lines in full[4]
abulmo Aug 31, 2024
47ff8ec
Correct errors causing heap corrupt on MSVC builds
abulmo Aug 31, 2024
e559c82
Revise foreach_bit_r and first_bit_32
abulmo Aug 31, 2024
396b35a
Loop out rounding score
abulmo Aug 31, 2024
53425fe
Fix alignment fault
okuhara May 24, 2022
5da98a2
Unify eval_update_sse 0 & 1
abulmo Aug 31, 2024
152421a
Restore eval by copy in search_restore_pass_midgame
abulmo Aug 31, 2024
fa2169e
add AVX get_potential_mobility; revise foreach_bit for CPU32/C99
abulmo Aug 31, 2024
ad5df69
Revise get_corner_stability and hash_cleanup
abulmo Aug 31, 2024
58f5746
add hash_prefetch; revise AVX flip & full_lines
abulmo Aug 31, 2024
84c4828
Use get_moves in search_shallow
abulmo Aug 31, 2024
98559ed
Revise PASS handling; prioritymoves in shallow; optimize Neighbour test
abulmo Aug 31, 2024
10dafdf
Expand board to 2 ULLs in non-SSE search_solve_3 and _4
abulmo Aug 31, 2024
381a4d8
Experimental bruteforce board_score_sse_1 from ubiquitin's
okuhara Jun 7, 2022
e106ae0
Fix android build; revise copyright in title
abulmo Aug 31, 2024
9baf378
Omit unpack from get_edge_stability
abulmo Aug 31, 2024
7dd1489
Revise comments and readme
abulmo Aug 31, 2024
0b9bdf5
Omit restore board/parity in search_shallow; tweak NWS_STABILITY
abulmo Aug 31, 2024
e8c2187
skip hash access if n_moves <= 1 in NWS_endgame
abulmo Aug 31, 2024
20428d2
Rearrange PVS_shallow loop
abulmo Aug 31, 2024
de924c5
small optimizations in endgame
abulmo Aug 31, 2024
152ea28
Increase hash_table and decrease shallow_table; fix NO_SELECTIVITY hack
abulmo Aug 31, 2024
0729b82
Imply NO_SELECTIVITY in shallow searches
abulmo Aug 31, 2024
bca8f32
Split movelist_evaluate_fast from movelist_evaluate
abulmo Aug 31, 2024
7f43af0
Exclude hash init time from count games; other minor size opts
abulmo Aug 31, 2024
f6af2af
flip_avx_shuf_max.c added; small improvements in other flip's
abulmo Aug 31, 2024
1cf90a7
Omit eval_weight table for ply > 53
abulmo Aug 31, 2024
01e3bc3
Ad hoc restore of eval_builder
abulmo Aug 31, 2024
aab48ac
add minimax option to eval_builder
abulmo Aug 31, 2024
2e1185b
Split v3hi_empties from search_solve_3 & moved to solve_4
abulmo Aug 31, 2024
d9db054
Fix equalize, unbias squared in eval_builder
abulmo Aug 31, 2024
1e3f4dc
Split 5 empties search_shallow loop; tune stabiliby cutoff
abulmo Aug 31, 2024
0748811
Add evalgame command to eval_builder
abulmo Aug 31, 2024
86da918
vector call version of board_next & get_moves
abulmo Aug 31, 2024
7b74534
BMI2 and mm_LastFlip version of board_score_sse_1 added (but not enab…
abulmo Aug 31, 2024
aca234d
Exclude corners from unpackA2A7/H2H7 to ease CPU_64 kindergarten
abulmo Aug 31, 2024
9f72bde
Experimental BMI2/AVX2/AVX512 lastflip inlined in endgame_sse.c
abulmo Aug 31, 2024
740a6d5
More avx512 optimization using mask register
abulmo Aug 31, 2024
124d2a4
Experimental branchless AVX512 lastflip in endgame_sse.c
abulmo Aug 31, 2024
d28b449
Add more AVX512 builds; fix modern compiler's warnings
abulmo Aug 31, 2024
6c381bc
minor AVX512/SSE optimizations
abulmo Aug 31, 2024
31a51fb
dirty fix for ICC linux optimization bug
abulmo Aug 31, 2024
7148fcd
Include gcc linux to get_moves_avx with mm256 params
abulmo Aug 31, 2024
34e2a23
SSE optimized search_pass
abulmo Aug 31, 2024
b4a8ee1
Change NodeType to char; next node_type TLU to trinary Op
abulmo Aug 31, 2024
445762c
exit search_shallow/search_eval loop when all bits processed
abulmo Aug 31, 2024
782b69b
Change NodeType to unsigned char to fix gcc warning
abulmo Aug 31, 2024
d0e53e5
fix gcc x86 build; add x86-sse build to makefile
abulmo Aug 31, 2024
8f0b2a9
more precise rboard/vboard opt; reexamine neon vboard_next
abulmo Aug 31, 2024
b0a6a65
Add build options and files for new count_last_flips
abulmo Aug 31, 2024
b6126c6
uint_fast8_t to acc last flip; unsigned char cast to 0xFF mask
abulmo Aug 31, 2024
54cccb8
Negative score in endgame solve 2/3/4; offset beta in score_1
abulmo Aug 31, 2024
f60e82c
pass flag in gamebase; increase MAX_N_GAMES in eval_builder
abulmo Aug 31, 2024
2df6722
add vectorcall to inline functions in case not inlined
abulmo Aug 31, 2024
bbbe93b
Fix score after pass bug in eval_builder
abulmo Aug 31, 2024
a6d2484
Refactor endgame_sse/neon solve 4 to 3 interface
abulmo Aug 31, 2024
8c42ec4
Use minimax instead of negamax for solve 4 or less
abulmo Aug 31, 2024
bb9981f
minimax from 5 empties and swap min/max stages
abulmo Aug 31, 2024
d5ad237
lazy high cut version of board_score_sse_1
abulmo Aug 31, 2024
0c94625
leave failed max-flip cutoff in endgame_sse as a comment
okuhara Oct 5, 2023
1ce95bc
reduce a mm256 constant in flip_avx_ppfill
okuhara Oct 5, 2023
63676a6
AVX flip reduction after TESTZ in endgame_sse.c
abulmo Aug 31, 2024
b3090a9
AVX/SSE optimized hash_cleanup
abulmo Aug 31, 2024
0bb5952
add sfence to be sure; correct comments
abulmo Aug 31, 2024
9b150b0
add hash_prefetch to NWS_endgame
abulmo Aug 31, 2024
401368f
calc solid stone only when stability cutoff tried
abulmo Aug 31, 2024
c329d2b
Fix MAX_MOVE
abulmo Aug 31, 2024
c2f23b1
Refactor get_full_lines; fix get_stability MMX
abulmo Aug 31, 2024
d7d35f1
use appropriate _mm_set1
abulmo Aug 31, 2024
0acf284
Tune NWS_stability_thres to work best with solid stone
abulmo Aug 31, 2024
5a8ec4d
get_spreaded_mobility for SSE/32, bit_count_si64 for SSE2
abulmo Aug 31, 2024
2f3acdd
Use same hash_data for R/W; reduce movelist in NWS_endgame
abulmo Aug 31, 2024
d2a153d
differed movelist sort in PVS/NWS_shallow
abulmo Aug 31, 2024
c0ce88b
refactor movelist_sort and other sorts
abulmo Aug 31, 2024
85e9862
refactor NWS_endgame loop
abulmo Aug 31, 2024
8516faa
AVX2 board_equal; delayed hash lock code
abulmo Aug 31, 2024
ca36773
new get_corner_stability for both 64&32 bit
abulmo Aug 31, 2024
4db4b3f
board_get_moves for AVX2; rename board_get_move_flip
abulmo Aug 31, 2024
492086f
new get_moves_and_potential for AVX2
abulmo Aug 31, 2024
b49005d
Fix w32-modern build and gcc build
abulmo Aug 31, 2024
ef5d1db
Replace VPERMQ due to MSVC's code and for Zen
abulmo Aug 31, 2024
a6f6ce5
New vectored bit_weighted_count_sse
abulmo Aug 31, 2024
ed65393
Fix occasional freezes
abulmo Aug 31, 2024
94e84f7
Fix non-SSE build
okuhara Oct 31, 2023
9d92bfc
Fix gcc AVX build
okuhara Nov 1, 2023
b0d7871
vboard opt using union V2DI; MSVC can assign it to XMM
abulmo Aug 31, 2024
b4dbc62
add vectorcall interface to hash functions
abulmo Aug 31, 2024
e21ff42
consistent vboard usage for eval_1 and eval_2
abulmo Aug 31, 2024
2b5a011
fix AVX512 build; use 256bit AVX512 to avoid downclocking
okuhara Nov 6, 2023
5a4f12b
More HBOARD hash functions
abulmo Aug 31, 2024
9ad14c6
_mm_cvtsi64_si128 x86 sim using loadl, requires lvalue
abulmo Aug 31, 2024
d568eb9
Drop HBOARD opt; little gain and too many changes
abulmo Aug 31, 2024
181a94b
Reformat #if's
abulmo Aug 31, 2024
50cbea0
Drop some excessive 32bit optimizations
abulmo Aug 31, 2024
286276c
Use DISPATCH_NEON, not hasNeon, for android arm32 build
abulmo Aug 31, 2024
ab6b02c
Use lrmask instead of mask_dvhd for LASTFLIP_HIGHCUT
okuhara Nov 17, 2023
e7e778a
Rewrite AVX512 LASTFLIP_HIGHCUT not to use kortest
abulmo Aug 31, 2024
a991a7a
Refine AVX512 SIMULLASTFLIP
okuhara Nov 22, 2023
eeb6c22
Initial 4.5.2; some reformats
abulmo Aug 31, 2024
cbe8eb7
Update move.c
abulmo Aug 31, 2024
f9719b0
Update root.c
abulmo Aug 31, 2024
8d9489f
Revise board0 usage; fix unused flips
abulmo Aug 31, 2024
afd02b1
minimax search_eval_1; feed moves to search_eval_1/2
abulmo Aug 31, 2024
e8dc5f7
modify movelist_evaluate calc; affects fingerprint
okuhara Dec 16, 2023
9fb9704
Add _mm_extract_epi64 to x86 sim
abulmo Aug 31, 2024
0529875
Split board_flip_* from board_symetry
abulmo Aug 31, 2024
56b0399
SSE optimized board_symetry again
abulmo Aug 31, 2024
8af90af
MSC ARM64 still missing _arm_rbit
abulmo Aug 31, 2024
569fcc2
Renew version string and copyright year
abulmo Aug 31, 2024
decdcca
AVX optimized board_unique
abulmo Aug 31, 2024
1d6e1dd
Change macos test build to arm
okuhara May 19, 2024
003de24
Init 4.5.3: abandon size_reduced_movelist which confuses gcc warn
abulmo Aug 31, 2024
82e7ed0
AVX512 last flip with lastflip_highcut
abulmo Aug 31, 2024
7c6e177
Add flip-sve-lzcnt.c for arm SVE build
abulmo Aug 31, 2024
4281514
Replace broadcast which confuses gcc warn with set1
okuhara Jun 15, 2024
4bbcfa9
Add 512bit version to AVX_LAST_FLIP_AVX512 SIMULLASTFLIP
okuhara Jun 15, 2024
fc7268f
Add SVE SIMULLASTFLIP to endgame_neon (but not enabled)
abulmo Aug 31, 2024
5af8f91
Refine last square score in AVX512 SIMULLASTFLIP
okuhara Jun 19, 2024
9618d7c
Replace broadcast from memory with set1
abulmo Aug 31, 2024
80f911e
Revise avx512 mask usage to ease ternarylogic opt
abulmo Aug 31, 2024
2845441
Revise last square score adjustment using sign flag
okuhara Jun 28, 2024
ef8b39a
Fall down to SSE board_score_1 for AVX512 lazy LC w/o DEFs
abulmo Aug 31, 2024
7dbf1b4
Revert AVX Flip results to __m128i, keeping reduce_vflip partially
abulmo Aug 31, 2024
ca7911a
Replace board_score_sse_1 param PO with OP
abulmo Aug 31, 2024
bfb9d37
Replace mm_flip OP param unpack with _mm_set_epi64x
abulmo Aug 31, 2024
e0259ee
Add TEST1_EPI8_MASK32 and remove TESTNOT_EPI8_MASK32 in board_score_s…
abulmo Aug 31, 2024
0e5f16c
Add acepck's pcmpgtq flips (but not enabled)
abulmo Aug 31, 2024
95190e2
rebase to version 4.5.2
abulmo Sep 1, 2024
00f132f
Update README.md
abulmo Sep 1, 2024
ef94063
Merge remote-tracking branch 'abulmo/master'
sensuikan1973 Sep 29, 2024
2 changes: 1 addition & 1 deletion .github/workflows/build.yaml
@@ -21,7 +21,7 @@ jobs:
- os: windows-latest
build_command: make build ARCH=x64 COMP=gcc OS=windows
- os: macos-latest
-build_command: make build ARCH=x64-modern COMP=gcc OS=osx
+build_command: make build ARCH=arm COMP=gcc OS=osx

steps:
- uses: actions/checkout@v2
37 changes: 37 additions & 0 deletions .gitignore
@@ -1,7 +1,44 @@
<<<<<<< HEAD
<<<<<<< HEAD
bin/
data/
doc/

*.7z

.DS_Store
=======
*.o
*.s
*.exe
*~
pgopti*
*.dyn
all.gc*
generate_flip.*
generate_count_flip.*
bcnttest.*
*.utf8.c
bin/
doc/
<<<<<<< HEAD
>>>>>>> dd6b636 (Bcc32 friendly and minor improvement on Flip_32.)
=======
problem/
>>>>>>> 48873fa (calc opponent_feature once in eval_open)
=======
*.o
*.s
*.exe
*~
pgopti*
*.dyn
all.gc*
generate_flip.*
generate_count_flip.*
bcnttest.*
*.utf8.c
bin/
doc/
problem/
>>>>>>> 3e1ed4f (fix cr/lf in repository to lf)
108 changes: 108 additions & 0 deletions README.md
@@ -51,3 +51,111 @@ cd src
doxygen
open ../doc/html/index.html
```
=======
# edax-reversi-AVX
Automatically exported from code.google.com/p/okuharaandroid-edax-reversi

=======
# edax-reversi-AVX
Automatically exported from code.google.com/p/okuharaandroid-edax-reversi

Edax is a strong othello program. Its main features are:

- fast bitboard based & multithreaded engine.
- accurate midgame-evaluation function.
- opening book learning capability.
- text based rich interface.
- multi-protocol support to connect to graphical interfaces or play on Internet (GGS).
- multi-OS support to run under MS-Windows, Linux and Mac OS X.

>>>>>>> 81dec96 (Kindergarten last flip for arm32; MSVC arm Windows build (not tested))
This is an SSE/AVX optimized version of Edax 4.4.0. It is functionally equivalent to the parent project, provided no bugs have been introduced.

Thanks to AVX2, the x64-modern build solves fforum-40-59.obf 60% faster than the official edax-4.4 on Haswell, and runs level 30 autoplay 80% faster.

See http://www.amy.hi-ho.ne.jp/okuhara/bitboard.htm and http://www.amy.hi-ho.ne.jp/okuhara/edaxopt.htm for optimization details in Japanese.

## 1. Mobility (board_sse.c, board_mmx.c)

### 1.1 new SSE2 version of get_moves
Diagonals are SIMD'd using vertical mirroring by bswap.
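
As a rough illustration of the mirroring idea (a sketch, not the code in board_sse.c): with one byte per rank, a byte swap flips the board vertically and turns every diagonal into an anti-diagonal, so both diagonal directions can be processed with the same shift pattern. The names `vertical_mirror` and `shift_toward_h1_via_mirror`, and the A1 = bit 0 layout, are assumptions made for this example; a real build would use a single bswap instruction instead of the portable swap below.

```c
#include <assert.h>
#include <stdint.h>

/* Flip a bitboard vertically (rank 1 <-> rank 8).
 * Portable equivalent of a 64-bit byte swap (bswap). */
static uint64_t vertical_mirror(uint64_t b)
{
    b = ((b >> 8) & 0x00FF00FF00FF00FFULL) | ((b & 0x00FF00FF00FF00FFULL) << 8);
    b = ((b >> 16) & 0x0000FFFF0000FFFFULL) | ((b & 0x0000FFFF0000FFFFULL) << 16);
    return (b >> 32) | (b << 32);
}

/* Hypothetical helper: a one-step shift toward H1 (>> 7) expressed as a
 * shift toward H8 (<< 9) on the mirrored board. The H file is masked out
 * first so that stones cannot wrap around to the A file. */
static uint64_t shift_toward_h1_via_mirror(uint64_t b)
{
    const uint64_t not_h_file = 0x7F7F7F7F7F7F7F7FULL;
    return vertical_mirror((vertical_mirror(b) & not_h_file) << 9);
}
```

Because the mirrored board needs only the `<< 9` pattern, the mirrored and unmirrored boards can sit in the two 64-bit halves of one SSE2 register and be shifted together.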

Athlon -get_moves_sse
problem\fforum-20-39.obf: 111349635 nodes in 0:07.998 (13922185 nodes/s).
mobility: 81.10 < 81.28 +/- 0.17 < 82.03
Athlon +get_moves_sse
problem\fforum-20-39.obf: 111349635 nodes in 0:07.889 (14114544 nodes/s).
mobility: 71.08 < 71.72 +/- 0.34 < 73.53
Core2 -get_moves_sse
problem/fforum-20-39.obf: 111349635 nodes in 0:10.180 (10938078 nodes/s).
mobility: 78.06 < 78.18 +/- 0.08 < 78.41
Core2 +get_moves_sse
problem/fforum-20-39.obf: 111349635 nodes in 0:09.978 (11159514 nodes/s).
mobility: 60.84 < 61.19 +/- 0.13 < 61.47

### 1.2 can_move
can_move now calls the SIMD'd get_moves on x86/x64 builds.

## 2. Stability (board.c, board_sse.c, board_mmx.c)

### 2.1 get_full_lines_h, get_full_lines_v
get_full_lines for the horizontal and vertical directions is simplified. The latter compiles into rotate instructions.
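
A minimal sketch of how such full-line detection can be written, assuming one byte per rank and an `occupied` bitboard (player | opponent). The function names are made up for this example and the code is not claimed to match board.c line for line.

```c
#include <assert.h>
#include <stdint.h>

/* Ranks that are completely occupied: AND each byte with itself shifted
 * by 1, 2 and 4, so bit 0 of a byte survives only if all 8 bits were set,
 * then spread that bit over the whole byte. */
static uint64_t full_lines_h_sketch(uint64_t occupied)
{
    uint64_t full = occupied;
    full &= full >> 1;
    full &= full >> 2;
    full &= full >> 4;
    return (full & 0x0101010101010101ULL) * 0xFF;
}

/* Files that are completely occupied: AND the board with itself rotated
 * by 8, 16 and 32 bits. Each (x >> n) | (x << (64 - n)) pair is the
 * pattern compilers typically turn into a single rotate instruction. */
static uint64_t full_lines_v_sketch(uint64_t occupied)
{
    uint64_t full = occupied;
    full &= (full >> 8) | (full << 56);
    full &= (full >> 16) | (full << 48);
    full &= (full >> 32) | (full << 32);
    return full;
}
```

Using rotations rather than plain shifts means every bit of a full file stays set, so no extra spreading step is needed for the vertical case.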

### 2.2 rearranged loop
The last while loop is rearranged so that bit_count is not called when stable == 0.

### 2.3 new SSE2 version with bswap and pcmpeqb
Athlon -get_stability_sse
stability: 90.10 < 90.28 +/- 0.24 < 91.20
Athlon +get_stability_sse
stability: 81.59 < 81.93 +/- 0.73 < 86.25
Core2 -get_stability_sse
stability: 79.24 < 79.39 +/- 0.15 < 79.93
Core2 +get_stability_sse
stability: 71.80 < 71.85 +/- 0.06 < 72.07

### 2.4 get_corner_stability
Kindergarten version eliminates bit_count call.

### 2.5 find_edge_stable
Loop optimization, and flips computed using carry propagation. This runs only once, but it affects total solving time.
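
One common form of the carry-propagation trick, sketched for a single 8-square edge. Whether find_edge_stable uses exactly this form is not confirmed by the text above; `edge_flip_up` and its parameters are hypothetical. Adding a 1 just above the move square lets the carry ripple through the contiguous run of opponent discs and stop on the first square that is not an opponent disc.

```c
#include <assert.h>

/* Discs flipped toward the higher bits when the player moves on square x
 * of an 8-bit edge. P = player discs, O = opponent discs, square x is
 * assumed to be empty. Returns the mask of flipped opponent discs. */
static unsigned edge_flip_up(unsigned P, unsigned O, int x)
{
    assert(x >= 0 && x <= 7 && (P & O) == 0);
    const unsigned start = 1u << (x + 1);
    /* The carry clears the opponent run above x and sets the next square;
     * the result is non-zero only if that square holds a player disc. */
    const unsigned outflank = ((O & 0xFFu) + start) & P & 0xFFu;
    /* outflank - start sets exactly the bits between x and the outflanker. */
    return outflank ? outflank - start : 0u;
}
```

For example, with the player on bit 4, opponent discs on bits 1 to 3 and a move on bit 0, the call `edge_flip_up(0x10, 0x0E, 0)` yields `0x0E`. The opposite direction can be handled by bit-reversing the edge first, which is not shown here.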

## 3. eval.c (4.4.5)
Eval feature calculation using SSE2 / AVX2 (now in eval_sse.c) improves midgame by 15-30% and endgame by 8-12%.
Eval is now restored from a backup instead of being rewound.
eval_open (executed only once) is also optimized.

## 4. hash.c
I think hash->data.move[0] on line 677 should be hash->data.move[1].

## 5. board_symetry, board_unique (board.c, board_sse.c)
SSE optimization and mirroring reduction. (Not used when solving a game.)

## 6. endgame_sse.c (4.4.7)
Keep more variables in SSE registers. SSE optimized count_last_flip. Parity sort by shuffle.

## 7. board_get_hash_code (4.5.0)
Changed to use CRC32c. This enables hardware acceleration on the modern build.
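
For reference, a bitwise software CRC32c (Castagnoli polynomial, reflected form 0x82F63B78) and a hypothetical way to hash a position with it. This is a sketch under stated assumptions: `crc32c_sw` and `board_hash_sketch` are illustrative names, and the byte order and seeding used by the real board_get_hash_code are not described above. SSE4.2 provides a crc32 instruction for this same polynomial, which is where the hardware acceleration comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC32c with the usual 0xFFFFFFFF initial value and
 * final inversion. */
static uint32_t crc32c_sw(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int k = 0; k < 8; ++k)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical board hash: feed both bitboards, least significant byte
 * first, through the CRC (an assumption made for this example). */
static uint32_t board_hash_sketch(uint64_t player, uint64_t opponent)
{
    unsigned char bytes[16];
    for (int i = 0; i < 8; ++i) {
        bytes[i] = (unsigned char)(player >> (8 * i));
        bytes[8 + i] = (unsigned char)(opponent >> (8 * i));
    }
    return crc32c_sw(bytes, sizeof bytes);
}
```

The standard check input "123456789" gives 0xE3069283 with this routine, which is a convenient way to validate a software fallback against a hardware path.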

## 8. AVX2 versions (x64-modern build only)
In many cases the AVX2 version is the simplest, thanks to variable shift instructions (although they are 3 micro-op instructions).
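
To show why per-lane shifts simplify things, here is a scalar sketch of a move generator in which each loop iteration plays the role of one 64-bit lane. With AVX2, `_mm256_sllv_epi64` and `_mm256_srlv_epi64` can apply the four different shift counts in a single instruction each. The function name `get_moves_lanes` is an assumption for this example, and the code is a plain-C model rather than the AVX2 implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Legal moves for player P against opponent O (A1 = bit 0).
 * Lane 0: horizontal, lane 1: vertical, lanes 2 and 3: the two diagonals. */
static uint64_t get_moves_lanes(uint64_t P, uint64_t O)
{
    static const int shift[4] = { 1, 8, 7, 9 };
    static const uint64_t mask[4] = {
        0x7E7E7E7E7E7E7E7EULL,  /* no wrap across the A/H files */
        0x00FFFFFFFFFFFF00ULL,  /* ranks 2 to 7 */
        0x007E7E7E7E7E7E00ULL,  /* inner 6x6 */
        0x007E7E7E7E7E7E00ULL   /* inner 6x6 */
    };
    uint64_t moves = 0;

    assert((P & O) == 0);
    for (int lane = 0; lane < 4; ++lane) {
        const int s = shift[lane];
        const uint64_t m = O & mask[lane];
        /* Opponent discs adjacent to a player disc along this line... */
        uint64_t flip = ((P << s) | (P >> s)) & m;
        /* ...extended through runs of up to 6 opponent discs. */
        for (int i = 0; i < 5; ++i)
            flip |= ((flip << s) | (flip >> s)) & m;
        /* Squares just beyond those runs are move candidates. */
        moves |= (flip << s) | (flip >> s);
    }
    return moves & ~(P | O);  /* keep empty squares only */
}
```

From the initial position with black to move, `get_moves_lanes(0x0000000810000000ULL, 0x0000001008000000ULL)` returns `0x0000102004080000ULL`, that is D3, C4, F5 and E6.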

Benchmarks were run on a Core i5-4260U (Haswell), 1.4GHz (Turbo Boost 2.7GHz), single thread.

4.4.0 original x64-modern clang
problem/fforum-20-39.obf: 111349635 nodes in 0:05.726 (19446321 nodes/s).
+optimizations 1-5 above, no-avx2
problem/fforum-20-39.obf: 111349635 nodes in 0:05.342 (20844185 nodes/s).
+get_moves (board_sse.c)
problem/fforum-20-39.obf: 111349635 nodes in 0:05.142 (21654927 nodes/s).
+flip_avx.c
problem/fforum-20-39.obf: 111349635 nodes in 0:04.946 (22513068 nodes/s).
+count_last_flip_sse.c
problem/fforum-20-39.obf: 111349635 nodes in 0:04.906 (22696624 nodes/s).

## 9. makefile
gcc-old: the x86 build should use -m32, not -m64. Some flags and defines were added for optimization.
<<<<<<< HEAD
>>>>>>> b9d48c1 (Create README.md)
=======
>>>>>>> 81dec96 (Kindergarten last flip for arm32; MSVC arm Windows build (not tested))
8 changes: 8 additions & 0 deletions src/Android.mk
@@ -0,0 +1,8 @@
LOCAL_PATH := $(call my-dir)
include $(CLEAR_VARS)
LOCAL_MODULE := aEdax # should be renamed to lib..aEdax..so afterwards
LOCAL_CFLAGS += -DUNICODE
LOCAL_SRC_FILES := all.c board_sse.c.neon eval_sse.c.neon flip_neon_bitscan.c.neon android/cpu-features.c
LOCAL_ARM_NEON := false
# cmd-strip :=
include $(BUILD_EXECUTABLE)
3 changes: 3 additions & 0 deletions src/Application.mk
@@ -0,0 +1,3 @@
APP_ABI := armeabi-v7a arm64-v8a x86 x86_64
APP_PLATFORM := android-14
APP_BUILD_SCRIPT := Android.mk
8 changes: 4 additions & 4 deletions src/Doxyfile
@@ -687,7 +687,7 @@ RECURSIVE = NO
# Note that relative paths are relative to the directory from which doxygen is
# run.

-EXCLUDE =
+EXCLUDE = _*.c

# The EXCLUDE_SYMLINKS tag can be used to select whether or not files or
# directories that are symbolic links (a Unix file system feature) are excluded
@@ -1599,7 +1599,7 @@ HIDE_UNDOC_RELATIONS = YES
# toolkit from AT&T and Lucent Bell Labs. The other options in this section
# have no effect if this option is set to NO (the default)

-HAVE_DOT = YES
+HAVE_DOT = NO

# The DOT_NUM_THREADS specifies the number of dot invocations doxygen is
# allowed to run in parallel. When set to 0 (the default) doxygen will
@@ -1688,15 +1688,15 @@ INCLUDED_BY_GRAPH = YES
# the time of a run. So in most cases it will be better to enable call graphs
# for selected functions only using the \callgraph command.

-CALL_GRAPH = YES
+CALL_GRAPH = NO

# If the CALLER_GRAPH and HAVE_DOT tags are set to YES then
# doxygen will generate a caller dependency graph for every global function
# or class method. Note that enabling this option will significantly increase
# the time of a run. So in most cases it will be better to enable caller
# graphs for selected functions only using the \callergraph command.

-CALLER_GRAPH = YES
+CALLER_GRAPH = NO

# If the GRAPHICAL_HIERARCHY and HAVE_DOT tags are set to YES then doxygen
# will generate a graphical hierarchy of all classes instead of a textual one.