|
1. A problem in malloc calls
|
|
Fri Nov 11, 2011 [5:02 AM]
|
nosty
Email not supplied
member since: Dec 12, 2000
|
#0 0x00000032404edadc in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x000000324047cb31 in _L_lock_9745 () from /lib64/libc.so.6
#2 0x000000324047aaf2 in malloc () from /lib64/libc.so.6
#3 0x000000324046f60d in open_memstream () from /lib64/libc.so.6
#4 0x00000032404dbc3b in __vsyslog_chk () from /lib64/libc.so.6
#5 0x0000003240470b0f in __libc_message () from /lib64/libc.so.6
#6 0x000000324047703a in malloc_printerr () from /lib64/libc.so.6
#7 0x0000003240478dd3 in _int_malloc () from /lib64/libc.so.6
#8 0x000000324047ab00 in malloc () from /lib64/libc.so.6
#9 0x00000000005d40d2 in alloc_mem (sMem=15) at db.c:2671
#10 0x000000000050e0dd in str_dup (str=0x7fff651dacb5 "'cone of cold'") at ssm.c:290
From what I understand of the limited Google hits for this, it seems that malloc can occasionally deadlock on itself, creating this condition. What has me wondering, though, is how this condition comes into existence in the first place, since my mud codebase doesn't do any threading. Has anyone else run into this? We've been getting these since moving to a more up-to-date server using the libc 6.* libraries. Also, since it's in malloc itself, I'm at a loss as to how I can prevent it from occurring.
It happens once in a blue moon (maybe once in every few million string allocations), but it's still rather annoying.
|
|
|
|
|
2. RE: A problem in malloc calls
|
|
Fri Nov 11, 2011 [8:26 AM]
|
dentin
soda@xirr.com
member since: Aug 21, 2008
|
Nosty,
This is the critical line:
#6 0x000000324047703a in malloc_printerr () from /lib64/libc.so.6
It's deadlocking trying to print an error message, probably because the allocator's internal memory state is fouled up and the print function requires memory to function.
It is -extremely- likely that you've used a chunk of memory that has already been free'd, damaging the internal state of the memory allocator. Later (possibly much later), when the allocator attempts to reuse that chunk, the damaged state causes the allocator to panic and dump an error message, which it is unable to do because it can't get some critical memory resource (because the internal state of the allocator is hosed).
I've not used it, but a lot of people swear by Valgrind for debugging problems like this. It will probably tell you exactly what to look for the next time it comes up.
There's another thread here today concerning memory allocation issues; you may want to read that as well.
Good luck!
-dentin
Alter Aeon MUD
http://www.alteraeon.com
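For readers following along, here is a minimal C sketch of the kind of use-after-free described above and one common way to avoid it. It is not taken from nosty's codebase: `dup_string`, `struct spell`, and `spell_set_name` are hypothetical names made up for illustration. The risky aliasing pattern appears only in a comment; the compiled code shows the record taking its own copy so that the caller can free its buffer safely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a MUD's str_dup(): returns a private heap copy. */
static char *dup_string(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, src, len);
    return copy;
}

/* Hypothetical record that owns its name string. */
struct spell {
    char *name;
};

/*
 * Risky pattern (shown only as a comment, never executed):
 *     sp->name = buf;   // sp->name aliases the caller's buffer
 *     free(buf);        // sp->name now dangles; writing through it can
 *                       // damage the allocator's internal bookkeeping
 *
 * Safer pattern: the record takes its own copy, so the caller may free
 * its buffer without affecting the record.
 */
static int spell_set_name(struct spell *sp, const char *buf)
{
    char *copy = dup_string(buf);
    if (copy == NULL)
        return -1;
    free(sp->name);      /* free(NULL) is a no-op */
    sp->name = copy;
    return 0;
}
```

With this arrangement each pointer has exactly one owner, which is usually what makes a later Valgrind report easy to trace back to a single call site.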
|
|
|
|
|
3. RE: A problem in malloc calls
|
|
Fri Nov 11, 2011 [9:53 AM]
|
nosty
Email not supplied
member since: Dec 12, 2000
|
Hah, so it might be on my end after all, eh? When I was googling around, the only hits I'd get were about a malloc bug in a particular version of the library where it called back into itself to generate an error. But yeah, the timeline also fits with something else that went into our code, which does a lot of trickery with string manipulation.
Coincidentally, in that very thread I was extolling the virtues of Valgrind. I didn't even consider it as a possible solution here (since I'd convinced myself the problem wasn't in my own code but in malloc), but I'll give that a run and see what it tells me.
Now to remember how to use it properly :P
Much thanks for sure.
(Comment added by nosty on Fri Nov 11 10:52:38 2011)
It's all coming back to me. Neat, I forgot that this tool also finds uninitialized values that are being used in that state; I've found several interesting little things with that so far.
But it found a grand total of 1 string out of place, from something I fully knew was doing what it was doing: 85 bytes lost after running Xaos in turbo mode (a setting that disables the sync to clock, so the mud runs as fast as it can) for about the last hour. I'm not saying it's off the hook, especially since that deadlock is so darn rare, but I'm actually somewhat impressed with our operations. 85 bytes out of the 188,452,677 it was taking care of ain't bad, ain't bad at all.
Either way, thank you again for suggesting I try that out.
|
|
|
|
|
4. RE: A problem in malloc calls
|
|
Fri Nov 11, 2011 [6:03 PM]
|
Tyche
Email not supplied
member since: Apr 4, 2000
|
...from /lib64/libc.so.6
Are you compiling and running in 64-bit mode?
|
|
|
|
|
5. RE: A problem in malloc calls
|
|
Sat Nov 12, 2011 [9:48 AM]
|
nosty
Email not supplied
member since: Dec 12, 2000
|
I am, yeah, and the move to 64-bit is also when the problem started to occur.
|
|
|
|
|
6. RE: A problem in malloc calls
|
|
Sat Nov 12, 2011 [10:30 AM]
|
Tyche
Email not supplied
member since: Apr 4, 2000
|
A double free of a pointer can cause corruption of the heap and is hard to detect until much later. Have you tried using MALLOC_CHECK_ = 1 and recompiling?
(Comment added by Tyche on Sat Nov 12 10:40:36 2011)
It's odd that ssm calls alloc_mem instead of malloc. In the code I've seen, ssm manages its own memory, and of course alloc_mem manages its own memory. That's why I think it's a double free. Anyway, the 64-bit issue might be a problem in the pointer arithmetic done in either memory management system. Just wild guesses without seeing the code.
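As a rough illustration of guarding against the double free suspected above, one common C idiom is to have the free routine take the address of the pointer and reset it to NULL. The name `free_string_safe` is hypothetical and is not the codebase's actual `free_string`; treat this as a sketch of the idiom, not the fix for this particular crash.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical variant of free_string(): it takes the address of the
 * pointer so the pointer itself can be reset after the free.
 */
static void free_string_safe(char **pstr)
{
    assert(pstr != NULL);
    free(*pstr);     /* free(NULL) is defined to do nothing */
    *pstr = NULL;    /* a repeated call now frees NULL, not a stale chunk */
}
```

This only protects the one pointer variable that is passed in. Any other copy of the same address still dangles, so it complements tools like Valgrind or MALLOC_CHECK_ and does not replace them.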
|
|
|
|
|
7. RE: A problem in malloc calls
|
|
Sat Nov 12, 2011 [11:55 AM]
|
dentin
soda@xirr.com
member since: Aug 21, 2008
|
The reason I didn't mention double free is that most modern libraries I've seen come with double free detection already enabled. I had assumed that a double free would have thrown a more obvious error on his system, as it does on ours. My mistake if that's in fact what was going on. Sorry about that.
-dentin
Alter Aeon MUD
http://www.alteraeon.com
|
|
|
|
|
8. RE: A problem in malloc calls
|
|
Sat Nov 12, 2011 [12:31 PM]
|
Tyche
Email not supplied
member since: Apr 4, 2000
|
You're right. It is built into newer libs and he shouldn't have to recompile. It is distro dependent though, and he may have to set the environment variable in his startup script:
export MALLOC_CHECK_=1
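Since the variable has to be in the environment before the process starts, it can be handy to confirm at boot that the startup script actually passed it through. The sketch below only reads the environment with the standard `getenv` call; the function names `malloc_check_value_enabled` and `log_malloc_check_status` are hypothetical, and nothing here changes the allocator's behaviour.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if the given value is a non-empty string, 0 otherwise. */
static int malloc_check_value_enabled(const char *val)
{
    return val != NULL && val[0] != '\0';
}

/*
 * Hypothetical startup hook: report whether MALLOC_CHECK_ reached this
 * process. It only inspects the environment; glibc itself decides what
 * to do with the variable when the allocator initializes.
 */
static void log_malloc_check_status(FILE *out)
{
    assert(out != NULL);
    const char *val = getenv("MALLOC_CHECK_");
    if (malloc_check_value_enabled(val))
        fprintf(out, "MALLOC_CHECK_=%s\n", val);
    else
        fprintf(out, "MALLOC_CHECK_ is not set\n");
}
```

Calling `log_malloc_check_status(stderr)` near the top of `main` would make a missing export obvious in the boot log. Setting the variable from inside the program after startup would likely be too late to have any effect.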
|
|
|
|
|
9. RE: A problem in malloc calls
|
|
Sat Nov 12, 2011 [1:23 PM]
|
nosty
Email not supplied
member since: Dec 12, 2000
|
Oh sorry, I should have mentioned something about the reference to the ssm.c file: it's a red herring. At one point in the past we were using SSM, but eventually its CPU usage outweighed its benefits, so at this point our strings don't actually use that stuff at all. The two functions str_dup and free_string still live in that source file, though. I just never bothered moving them out of it or removing ssm.c from the makefile (but nothing SSM related is involved; none of its functions are called anywhere).
I'll try running it with MALLOC_CHECK_ set to 1 and see if it exposes anything.
|
|
|
|
|