Mantis Bug Tracker

View Issue Details
ID: 0010061
Project: Dwarf Fortress
Category: Technical -- General
View Status: public
Date Submitted: 2016-11-02 11:47
Last Update: 2019-02-12 12:32
Reporter: Solra Bizna
Assigned To: lethosor
Platform: Linux
OS: Debian
OS Version: Stretch
Product Version: 0.43.05
Target Version:
Fixed in Version:
Summary: 0010061: This save is seconds away from crashing DF
Description: I'm running DF on Debian Linux. A thousand or so ticks after loading this save, the game crashes.
Steps To Reproduce:
1. Load save.
2. Wait. DF will crash soon after the yak is slaughtered, if not before.
Additional Information: Save is here: [^]

64-bit, PRINT_MODE 2D: Xlib abort with a multithreading-related error message... unless I'm running DF through gdb, in which case there's a floating point exception
64-bit, dfstream: Floating point exception
32-bit, PRINT_MODE 2D: Floating point exception

Changing Z-levels seems to slightly alter the timing of the crash.

All of the floating point exceptions take place with a backtrace similar to this:

#0 0x08c749a1 in ?? ()
#1 0x08d87946 in ?? ()
#2 0x0897d591 in ?? ()
#3 0x0897dad5 in ?? ()
#4 0x083f93e2 in ?? ()
#5 0xf7b29ea1 in interfacest::loop() ()
   from /home/sbizna/df_linux32/libs/
#6 0x08665d4f in mainloop() ()
#7 0xf7b0cb92 in enablerst::async_loop() ()
   from /home/sbizna/df_linux32/libs/
#8 0xf7b0cf8d in call_loop(void*) ()
   from /home/sbizna/df_linux32/libs/
#9 0xf7efb155 in ?? () from /usr/lib/i386-linux-gnu/
#10 0xf7f3f048 in ?? () from /usr/lib/i386-linux-gnu/
#11 0xf735e2da in start_thread () from /lib/i386-linux-gnu/
#12 0xf781691e in clone () from /lib/i386-linux-gnu/
Tags: 0.44.09, 0.44.12
Attached Files

Relationships
duplicate of 0008410 (resolved, lethosor): Crash due to zero-size weasel
has duplicate 0010859 (resolved, lethosor): Constant Crashes

Notes
Solra Bizna (reporter)
2016-11-03 12:59

With the help of dwarf_fortress_unfuck I did a little investigating. The exceptions occur within interfacest::loop()'s call to currentscreen->logic(). I hacked in a SIGFPE handler that prints the type of error that occurred (for diagnostic purposes), and throws a catchable exception, then wrapped the currentscreen->logic() call in a try/catch block for that exception. With this in place, if currentscreen->logic() triggers a SIGFPE, it is simply skipped until the next time the loop runs.

From this, I learned two things:

1. As I suspected, the SIGFPE is being caused by an integer divide by zero. (The si_code is FPE_INTDIV.)
2. With this hack in place, instead of crashing, the game... "freezes". The interface is responsive, and it seems to believe it's still processing ticks, but creatures stop moving. Days still seem to pass normally, and at random intervals the game "unfreezes" for a few dozen ticks.

Interestingly, if I stop blocking the river (by pulling the northmost lever in Asob's office), fluid flow continues during the "freeze".

The outpost liaison is visiting the fort. I suspected he might be the cause of the problem, so I ordered my crummy militia to bludgeon him to death. This didn't prevent the divide by zero from eventually occurring, and _did_ cause a loyalty cascade.

Deleting all of my work orders in the new manager didn't change anything either.

If I just power through the "freezes", the SIGFPEs eventually stop happening.
lethosor (manager)
2018-04-12 09:22
edited on: 2018-04-18 08:54

Confirmed in vanilla 0.44.09 on OS X.
For what it's worth, it appears to crash consistently at 0x0000000100c966e1 (at least 3 times now), although that's probably not very useful.
This was brought to my attention by [^], which may be the same issue as this one.

lethosor (manager)
2018-08-08 09:39

Save from 0.44.12, 0010859: [^]
risusinf (reporter)
2019-02-07 02:19
edited on: 2019-02-10 21:36

Both saves from OP and from the comment above stop crashing after
[DFHACK]# exterminate weasel
which is very similar to 0008410. That report says something about zero body size; how do I check that?

Also see 0010253

lethosor (manager)
2019-02-12 12:27
edited on: 2019-02-12 12:31

Highlighting the weasel and running "lua ~unit.body.size_info" in DFHack prints
size_cur               	 = 0
size_base              	 = 1
area_cur               	 = 0
area_base              	 = 1
length_cur             	 = 0
length_base            	 = 21

Killing the weasel with DFHack stops the crash. From running "for _,u in ipairs(world.units.all) do if u.body.size_info.size_cur == 0 then print( end end" in Lua, this is the only zero-sized unit.
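
As an illustration of the check above and of why a zero size is dangerous, here is a small C++ sketch. The SizeInfo and Unit structs are hypothetical stand-ins that only borrow the field names printed above; they are not DFHack's or DF's real types. Where DF actually divides by a size value is not known from this report, so scaled_by_size is purely an assumed example of a guarded integer division.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical mirror of the fields shown by "lua ~unit.body.size_info".
struct SizeInfo {
    int32_t size_cur;
    int32_t size_base;
    int32_t area_cur;
    int32_t area_base;
    int32_t length_cur;
    int32_t length_base;
};

// Hypothetical unit record: an id plus its size information.
struct Unit {
    int32_t id;
    SizeInfo size_info;
};

// Collect the ids of all units whose current size is zero.
std::vector<int32_t> find_zero_sized(const std::vector<Unit>& units) {
    std::vector<int32_t> ids;
    for (const Unit& u : units) {
        if (u.size_info.size_cur == 0) {
            ids.push_back(u.id);
        }
    }
    return ids;
}

// Assumed example of a size-dependent calculation. An unguarded
// "amount / s.size_cur" is undefined behaviour when size_cur is 0 and
// raises SIGFPE (FPE_INTDIV) on x86. This version refuses non-positive
// sizes and reports failure instead of dividing.
bool scaled_by_size(int32_t amount, const SizeInfo& s, int32_t& out) {
    if (s.size_cur <= 0) {
        return false;
    }
    out = amount / s.size_cur;
    return true;
}
```

With the values printed above (size_cur = 0, size_base = 1), find_zero_sized would report the weasel and scaled_by_size would return false rather than trap.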

Thanks for investigating! I'll close this as a duplicate of 0008410.

Issue History
Date Modified Username Field Change
2016-11-02 11:47 Solra Bizna New Issue
2016-11-03 12:59 Solra Bizna Note Added: 0036028
2018-04-12 09:22 lethosor Note Added: 0038154
2018-04-12 09:22 lethosor Assigned To => lethosor
2018-04-12 09:22 lethosor Status new => confirmed
2018-04-12 09:22 lethosor Tag Attached: 0.44.09
2018-04-13 09:57 Huntthetroll Issue Monitored: Huntthetroll
2018-04-18 08:54 lethosor Note Edited: 0038154
2018-08-08 09:38 lethosor Relationship added has duplicate 0010859
2018-08-08 09:39 lethosor Note Added: 0038709
2018-08-08 09:39 lethosor Tag Attached: 0.44.12
2019-02-06 00:04 risusinf Note Added: 0039190
2019-02-07 02:19 risusinf Note Deleted: 0039190
2019-02-07 02:19 risusinf Note Added: 0039194
2019-02-10 21:36 risusinf Note Edited: 0039194
2019-02-12 12:27 lethosor Note Added: 0039208
2019-02-12 12:31 lethosor Note Edited: 0039208
2019-02-12 12:32 lethosor Relationship added duplicate of 0008410
2019-02-12 12:32 lethosor Status confirmed => resolved
2019-02-12 12:32 lethosor Resolution open => duplicate
2019-02-24 18:42 Huntthetroll Issue End Monitor: Huntthetroll
