
Member Discussions



1. System Resource Hogging by Sysadmin Mon Nov 1, 2004 [8:39 AM]
Xenophon
Email not supplied
member since: Oct 31, 2003
Alright, here's the story. For the past two weeks or so, the sysadmin has been playing one or two games that take up over 90% of the CPU (and generally push memory to its limit, leaving *very little* left over). In that time, our mud has begun crashing constantly. We've been getting a lot of SIGSEGV crashes (memory), and a couple of SIGHUP and SIGPIPE crashes. We're also seeing a lot of 'Write_to_descriptor: Resource temporarily unavailable' messages (from perror("Write_to_descriptor");).
Now, my question is simple: will a lack of system resources crash our mud? I'm asking in a very general sense. If the system runs out of memory, would our mud crash? Could the mud then try to access memory that is already in use and crash? If the mud cannot get any CPU time, would it crash, or could some function inside it crash? We're on a free host right now, so by no means am I bashing the host. He or she has every right to do whatever they wish with their server, even if it disrupts everything else.
I installed a signal handler and a very primitive last_command function (which hooks into the interpret function to capture the last player-issued command). Several times the mud has crashed on simple commands such as kill, quit, save, or berserk. In other instances it has crashed without any input, which leads me to believe the cause could be system-resource related. For example, the last command may have been quit, but the mud crashed 3 minutes after the last player quit, so chances are the two aren't directly related.
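For readers who want to try the same approach, here is a minimal sketch of a crash handler that reports the last command. It is not the code from this mud: record_last_command(), install_crash_handler() and the buffer names are hypothetical, and it assumes a POSIX system. The handler only calls write(), which is async-signal-safe, and then re-raises the signal so the default action (and core dump) still happens. It also ignores SIGPIPE, so a write to a closed socket returns an error instead of killing the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical globals: filled in by the command interpreter. */
static char last_command[256] = "(none)";
static size_t last_command_len = 6;

/* Call this at the top of the interpreter, before running the command. */
void record_last_command(const char *who, const char *input)
{
    snprintf(last_command, sizeof last_command, "%s: %s", who, input);
    last_command_len = strlen(last_command);
}

/* Runs on SIGSEGV/SIGBUS. Only async-signal-safe calls are used here. */
static void crash_handler(int sig)
{
    static const char msg[] = "Crash; last command was: ";
    ssize_t r = write(STDERR_FILENO, msg, sizeof msg - 1);
    r = write(STDERR_FILENO, last_command, last_command_len);
    r = write(STDERR_FILENO, "\n", 1);
    (void)r;
    /* SA_RESETHAND already restored the default action, so this
     * re-raise terminates the process the usual way. */
    raise(sig);
}

/* Returns 0 on success, -1 if a handler could not be installed. */
int install_crash_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = crash_handler;
    sa.sa_flags = SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;
    /* A write to a closed socket now fails with EPIPE instead of
     * delivering a fatal SIGPIPE. */
    signal(SIGPIPE, SIG_IGN);
    return 0;
}
```

Note that the last command only says what the mud did most recently, not what corrupted memory earlier, so a crash minutes after the last input is consistent with either explanation.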
Anyone care to enlighten me on the reason we're crashing, be it the fault of the mud, or the fault of the system? (We've never had problems before with the mud, in terms of crashing/bugs, but I won't rule it out. We're only human.)
Xenophon

(Comment added by Xenophon on Mon Nov 1 10:48:23 2004)

An example of the output from top:
CPU states:     cpu    user    nice  system     irq softirq  iowait    idle
              total   62.8%    1.1%   35.9%    0.0%    0.0%    0.0%    0.0%
Mem: 514560k av, 510000k used, 4560k free, 0k shrd, 10600k buff
102260k active, 335132k inactive
Swap: 2120480k av, 253732k used, 1866748k free 146664k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM  TIME CPU COMMAND
12648 *******   16   0 74016  72M  5308 S    53.1 14.3  2:41   0 nwmain
 7770 root      21   0 98764  29M 23580 R    42.0  5.8 2211m   0 X
18995 *******   15   0  3268 1772  1344 S     1.3  0.3 97:31   0 kmix
19767 ***       15   0 20400  18M  4080 S     0.7  3.6  7:44   0 mcloud


2. RE: System Resource Hogging by Sysadmin Mon Nov 1, 2004 [9:01 AM]
Kastagaar
Email not supplied
member since: Jul 29, 1999
> As well, a lot of 'Write_to_descriptor: Resource temporarily
> unavailable'(perror('Write_to_descriptor');).

perror writes a message to standard error and then aborts (causes the program to core dump). IOW, you've written (or not unwritten) it to crash when it goes wrong. Consider a retry loop.
There are two ways of constructing software: to make it so simple that there are obviously no errors, and to make it so complex that there are no obvious errors.


3. RE: System Resource Hogging by Sysadmin Mon Nov 1, 2004 [4:27 PM]
SuperPele
Email not supplied
member since: Apr 7, 2003
perror does write a message to stderr, but it doesn't cause the program to abort. Depending on how much memory you have been given by the admin, it could crash the mud if you go over that limit, or I guess if the admin uses up all the memory. Easiest way is to just run it under gdb and check the info there when it crashes.
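To illustrate how running out of memory can turn into a SIGSEGV: malloc() returns NULL when it fails, and code that uses the result without checking will dereference a null pointer. Below is a minimal sketch of a checked allocation; checked_alloc() and copy_string() are hypothetical names, not functions from any Diku-derived codebase. One caveat: on Linux with memory overcommit, malloc() can succeed and the process can still be killed later by the kernel, so this check covers only one failure mode.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: logs the failure and returns NULL so the
 * caller can back out instead of dereferencing a null pointer. */
void *checked_alloc(size_t size, const char *where)
{
    void *p = malloc(size);
    if (p == NULL)
        fprintf(stderr, "checked_alloc: %zu bytes failed in %s\n",
                size, where);
    return p;
}

/* Example caller: duplicates a string, or returns NULL on failure. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = checked_alloc(n, "copy_string");
    if (copy == NULL)
        return NULL;            /* caller must handle this case */
    memcpy(copy, s, n);
    return copy;
}
```

Whether the mud should shut down cleanly or try to carry on after a failed allocation is a design choice; the point is only that the failure is noticed at the allocation site rather than at some later crash.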


4. RE: System Resource Hogging by Sysadmin Mon Nov 1, 2004 [8:33 PM]
Razzer_9
Email not supplied
member since: Mar 5, 2001
> perror does write a message to stderr, but it doesn't cause
> the program to abort.

While you are correct, Kas has a bit more knowledge than that about DIKUs. He was talking about how the DIKU error reporting system uses perror and subsequently aborts.


5. RE: System Resource Hogging by Sysadmin Tue Nov 2, 2004 [12:02 AM]
eiz
eiz@codealchemy.org
member since: Dec 24, 2002
Except in this case it does no such thing: it simply closes the socket. In fact, it never aborts on a network error. It calls exit(1), which will not produce a core (and an abort would raise SIGABRT anyway, not SIGSEGV).

On the other hand, the advice to add a retry loop to write_to_descriptor is more or less sound. Even better would be to move to a system that isn't running NWN.
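A retry loop along those lines could look roughly like the sketch below. This is not the stock write_to_descriptor(); write_with_retry() and its max_waits parameter are hypothetical, and it assumes a POSIX system. It retries on EINTR, and on EAGAIN/EWOULDBLOCK ("Resource temporarily unavailable") it waits briefly for the socket to become writable before trying again.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper. Returns the number of bytes written (equal to
 * len on success), or -1 on a hard error or when the descriptor stays
 * unwritable for more than max_waits polls. */
ssize_t write_with_retry(int fd, const char *buf, size_t len, int max_waits)
{
    size_t done = 0;
    int waits = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n > 0) {
            done += (size_t)n;              /* partial write: keep going */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;                       /* interrupted by a signal */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            if (++waits > max_waits)
                return -1;                  /* give up on this client */
            (void)poll(&pfd, 1, 100);       /* wait up to 100 ms */
            continue;
        }
        return -1;                          /* hard error (EPIPE, EBADF, ...) */
    }
    return (ssize_t)done;
}
```

Waiting inside the write stalls the whole game loop for every other player, so keep max_waits small. An alternative design is to leave the unsent bytes in the descriptor's output buffer and try again on the next pass through the loop.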


6. RE: System Resource Hogging by Sysadmin Tue Nov 2, 2004 [2:01 AM]
Kastagaar
Email not supplied
member since: Jul 29, 1999
> perror does write a message to stderr, but it doesn't cause
> the program to abort.

My mistake. It looks like you get what you pay for, and that counts for free advice, too. Sorry about that.

Reading through Envy code, I see that, in this case, it should actually cause the client's socket to be closed rather than crash. Look at process_output().

However, if that's failing, then so may this code:
if ( select( maxdesc+1, &in_set, &out_set, &exc_set, &null_time ) < 0 )
{
    perror( "Game_loop: select: poll" );
    exit( 1 );
}




7. RE: System Resource Hogging by Sysadmin Tue Nov 2, 2004 [8:43 AM]
lindahlb
Email not supplied
member since: Mar 2, 2001
I'm guessing write_to_descriptor (or send/write) bailed out because it wasn't able to allocate memory it needed - perhaps a system buffer. Are you sure this is the source of the crash, though? A crash could have occurred just after this point.



