    G-AVLN in front of her home

    Mostly Unix and Linux topics. But flying might get a mention too.

    Tuesday, June 10, 2008

    When troubleshooting - work methodically....

    I am working on a machine that is brand new to me (albeit second-hand): an HP-UX (Itanium) box. I have never really configured it; all I've done with it so far is interrogate it a bit, and I don't really know which of the UNIX flavours familiar to me is going to be the closest to this one... So anything I do will require an element of investigation. All these wonderful HP-UX books that I've bought are sitting next to me, and no doubt will come in handy at some point. In the meantime, let's play...

    First of all, the machine booted into run-level 3, and seems to be defaulting to DHCP, whereas I am sitting in a classroom with fixed private addresses. I need to assign a static address:

    # ifconfig lan0 192.168.1.43

    Pinging the gateway and other machines on the network works a treat. However, some 30 seconds after the IP address change, the machine starts to emit horrible, ear-piercing beeps, as if some application or server got terribly unhappy with the new address. I rebooted the box and repeated the exercise a couple of times: the same, very predictable sequence of events was reproduced every time. Eventually, instead of rebooting, I put it into single-user mode for a quick look under the hood:

    # init 1

    The first job is to find out how the rc (start-up and shutdown) machinery works, and the inittab file is usually the best place to start:

    # cat /etc/inittab
    init:3:initdefault
    ...
    sqnc::wait:/sbin/rc # system init
    ... :respawn:... # several of these and few more
    ...

    The master rc scripts are, as expected, in the /etc/init.d directory. The run-level links, however, live under /sbin (rather than under /etc, as in other versions of UNIX).

    # ls -ld rc*
    rc rc.utils rc0.d rc1.d rc2.d rc3.d rc4.d

    Most of the multi-user run-level work is done in the rc2.d directory (i.e. as part of going into run-level 2). Run-level 3 adds some of the network server start-up:
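
    To make the sequencing concrete, here is a minimal sketch of which start scripts would be run when climbing from one run-level to a higher one. It assumes the sequencer works cumulatively (run-level 2 scripts first, then run-level 3) and runs the S* links in sorted order with a "start" argument. The function name list_start_scripts is made up for illustration, and it only prints what would be run; it is not how /sbin/rc itself is written.

```shell
#!/bin/sh
# Sketch only: print the start scripts a System V style sequencer would run
# when moving up from run-level $2 to run-level $3.
# $1 is the directory that holds the rcN.d directories (/sbin on this machine).
# list_start_scripts is a hypothetical helper, not an HP-UX command.
list_start_scripts() {
    rc_root=$1
    level=$2
    target=$3
    while [ "$level" -lt "$target" ]; do
        level=$((level + 1))
        # The shell expands S* in sorted order, so S100... comes before S990...
        for script in "$rc_root/rc$level.d"/S*; do
            if [ -e "$script" ]; then
                printf '%s start\n' "$script"
            fi
        done
    done
}
```

    Calling list_start_scripts /sbin 1 3 would print the run-level 2 scripts followed by the run-level 3 scripts, without executing any of them.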

    # ls /sbin/rc3.d
    S100nfs.server S823hpws_apache S823hpws_webmin S990dtlogin.rc
    S200tps.rc S823hpws_tomcat S823hpws_xmltools S999kwdbd

    So the first test is to change the IP address in run-level 1, and see if anything gets upset:

    # ifconfig lan0 192.168.1.43

    Pinging the gateway and other machines is working OK again, and no alarm is heard! OK, then, let's push it one notch further, and switch into the basic multi-user mode (run-level 2):

    # init 2

    DHCP was attempted, but the message scrolled off before I could read it, so I will need to look at the log files. I am surprised to see NIS being started here, but pleased to see the SSH server started: I should be able to connect from my laptop soon. The mail daemon took several minutes before timing out, and so did the SNMP HP-UNIX Network Management Subagent.
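
    For messages that scroll off the console, a small helper along these lines could pull them back out of a log file. This is only a sketch: the default path /etc/rc.log (where HP-UX normally keeps the output of the start-up scripts) is an assumption about this machine, and search_rc_log is a made-up name.

```shell
#!/bin/sh
# Sketch only: show numbered lines that mention a pattern in the start-up log.
# /etc/rc.log is an assumed default location; pass another file as $2.
# search_rc_log is a hypothetical helper, not an HP-UX command.
search_rc_log() {
    pattern=$1
    log=${2:-/etc/rc.log}
    # -i ignores case, -n prefixes each match with its line number
    grep -i -n "$pattern" "$log"
}
```

    For example, search_rc_log dhcp would list every line of the assumed log that mentions DHCP.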

    Eventually, after an excruciatingly long time, I got the message "Transition to run-level 2 is complete". But the box is not talking to anybody else... The IP address has reverted to the last known one, and my command line changes have disappeared. I am not sure whether I am surprised; perhaps a bit. This was not a full-blown reboot and my networking worked fine in run-level 1, so I didn't necessarily expect it to go down and come back up again...

    But, before looking closer at the configuration to see which script was responsible, I need to remember why I'm doing this: testing the impact of changing IP at the command line. I am now in run-level 2 – will the change in this state raise the alarm?

    # ifconfig lan0 192.168.1.43

    No problems, my ears are safe. So it must be something in run-level 3. Well, I don't actually need run-level 3 at the moment, but when I'm ready to get the box on-line for our courses, I will need the NFS server running. That one is started in run-level 3, so I might as well sort it out now.

    # /sbin/rc3.d/S100nfs.server start
    starting NFS SERVER networking
    ...

    So, this worked, and I think this is all that we will need long term. However, I can't stop yet – my curiosity won't let me. I need to know what the horrible noise was about.

    Remember what is in run-level 3?

    # ls /sbin/rc3.d
    S100nfs.server S823hpws_apache S823hpws_webmin S990dtlogin.rc
    S200tps.rc S823hpws_tomcat S823hpws_xmltools S999kwdbd

    So, next: try tps.rc. This is what starts the X server in HP-UX.

    # /sbin/rc3.d/S200tps.rc start

    No problem. Somehow, I didn't think apache, tomcat, etc. would be at fault, so I went straight for the desktop (GUI) login script:

    # /sbin/rc3.d/S990dtlogin.rc start

    Bingo!
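
    The same one-script-at-a-time approach can be sketched as a small loop that starts each script of a run-level directory in order and pauses after every one, so the culprit is the last script started before the trouble begins. This is an illustration only, not something run on the machine above; step_through_rc is a made-up name, and it assumes every S* link accepts a "start" argument.

```shell
#!/bin/sh
# Sketch only: start the scripts of one run-level directory one at a time,
# pausing after each so that a misbehaving service can be pinned down.
# step_through_rc is a hypothetical helper, not an HP-UX command.
step_through_rc() {
    rc_dir=$1
    for script in "$rc_dir"/S*; do
        # Skip anything that is missing or not executable
        if [ ! -x "$script" ]; then
            continue
        fi
        printf 'Starting %s\n' "$script"
        "$script" start
        printf 'Exit status %s. Press Enter for the next script, or q to stop: ' "$?"
        # Stop on end of input or when the user types q
        read answer || break
        if [ "$answer" = "q" ]; then
            break
        fi
    done
}
```

    Run as step_through_rc /sbin/rc3.d, it gives time to listen for the alarm after each service before moving on.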

    Quick – init 1!

    Next – I will need to find out how to make the static IP address permanent, and see if the box behaves without my command line changes...

