Saturday, March 17, 2012

Murphys law

As I now do some code fine tuning in iCE I run long test series to see that the changes does not make the engine weaker. I started to see iCE losing some games because of illegal moves.

I was absolutely dumbfounded by that. The move generator of iCE passes all perft test, I use a 64 bit hash signature to make undetected collisions very unlikely and as an additional safety belt I verify each move from the hash again for legality. So illegal moves should just be not possible.

After some investigation I found that is was always the same illegal move in the same position and it actually came from the new opening book of iCE. So when I assembled the book I generated an illegal move and put it in the book.

In this position the move to play was specified as 5. bxa6. The iCE book creator translated that move into Pawn from b2 to a6. So when it looked what pawn actually captures on a6 it picked the wrong one. The reason was an uninitialized bitboard and this was a quick fix.

But it still did not explain why iCE actually used that move and did not detect it as illegal. After debugging I found that the move validator had also a small bug (a bracket was closed at the wrong spot) that just effected white pawn captures and so the move went ok through validation.

I fixed that bug too and corrected the books but it showed "whenever something can go wrong it will".

Sunday, March 4, 2012

Another reference match

In my efforts to establish a performance baseline I ran another reference match against a strong engine. This time I picked gaviota and decided to run a few more games at a slightly longer time control to get a more accurate result.

600 games at a TC of 100 moves in 10 sec + 0.5 sec increment per move, ponder off, own book

Score of gaviota v0.85 vs iCE 0.3 v2394 : 392 - 111 - 97  [0.73] 600
ELO difference: 176

which seems about petty correct. So this is the level I want to improve my engine from.

Saturday, March 3, 2012

Windows TimeResolution Trouble

In recent tournaments iCE 0.2 sometimes lost on time which is a very bad thing. Its time control is implemented aggressively so it is using all available time on the last move before the time control but it should not overstep it from the way it was implemented, but it did. As I wanted to improve the time allocation anyway I introduced a soft and a hard time limit. Under certain conditions the engine might continue to search beyond the soft but never beyond the hard limit. The hard limit is verified to always be less than the available remaining time. On the last move before the time control a small safety buffer is used in addition. So I was pretty sure time loss are now a thing of the past.

I was wrong. In a little private reference tournament at very fast time controls (1 sec + 0.1 sec increment per move) iCE was losing almost every game on time. After debugging the GUI and the Engine logs it showed that the engine was assuming to spend less time on a move as the GUI did. So the engine thought it has calculated for 90 ms where the GUI recorded 120 ms for a move. So the initial buffer was getting smaller and smaller with each move until there was almost no buffer anymore and the engine overstepped it.

I investigated the issue a bit and found 2 root causes for this behavior
  • it seems that the Windows XP time resolution has a default quantum of about 10 ms, so the time related functions have a smallest resolution of 10 ms. So for 9 ms the engine thinks it did spend no time at all and 1 ms later it spent already 10.
  • the c++ Sleep(int wait) function is only specifying the minimum time the OS scheduler does not schedule the thread that called Sleep. The actual time the thread is suspended can be much greater
If the engine is idle iCE is sending the worker thread to sleep until the IO thread is signaling new input. Otherwise and idle engine would consume a whole CPU core just by looping over a hasNewInput flag. The Sleep interval is actually pretty small, I was using 10 ms but as this is not reliable the GUI might have sent a search command and started the GUI clock while the engine was still sleeping for some ms. When the engine finally woke up realized the new command and started its clock the GUI clock was already ticking for some ms.

Those 2 facts introduced a significant error margin in short TC games. I investigated some alternatives to either lower the time resolution of the OS (there is an API call for that) or eliminate the Sleep call in the main thread but all had its Cons. At the end I decided to introduce a TIME_RESOLUTION_ERROR_MARGIN of 25 ms. All time limits assume that the engine in reality is using 25 ms more than it thinks and the internal limits are adjusted for that. This works pretty well and eliminates almost all time losses even in very fast TC games. The downside is that the engine is sometimes using some ms less time than it could. But nothing is perfect.

All this is only relevant in very short TC games anyway. In real tournaments the engine has much more time and 25 ms don't make a difference at all. But for engine tuning and change evaluation a lot of games are required and to finish 1000+ games in a reasonable amount of time you have to limit the time per move very much.

With the new TC I ran a quick match against a pair of engines to establish a baseline which I measure engine changes against. iCE was actually doing not so bad, but I think this is TC related. At longer TCs the other engines would probably be stronger. For instance I wasn't able to include the engine Aristarch 4.50 at all because it is very weak at short TCs. It was mated in 9 out of 10 games by iCE (no time losses but real mates). At longer TCs Aristarch seems much stronger.

Those are the results of iCE against 5 engines with 200 games each at a TC of 100 moves in 3 sec + 0.3 sec per move. All engines 32 bits and 1 core, PonderOff, own book where existing.

Rank Name                        ELO   Games   Score   Draws
   1 Quazar 0.4 w32              282     200     84%     16%
   2 cheng3 1.07 JA               96     200     64%     12%
   3 iCE 0.3 v2394                19    1000     53%     17%
   4 Abrok 5.0                   -92     200     37%     24%
   5 Ufim 8.02                  -177     200     26%     18%
   6 Eeyore 1.52 UCI            -186     200     26%     15%
Finished match

Congratulations to Martin Sedlak. It's cheng engine is really very strong and greatly improved in Version 1.07.