27 February 2004

For several days I have noticed that #direktif gets fewer visits than it did before the trouble with my server. The "search word" report in my Analog statistics is also empty. So yesterday I searched Google for some common words from #direktif, and... oops, no results for direktif.web.id!

Ah, this is the price I have to pay for the problems with Atijembar from the last week of January until mid-February: Google's crawler failed to read #direktif (only #direktif), so from now on (until the next reindex of their database) #direktif is unreachable from Google. The other major search engines? Unfortunately they are usually slow to update.

24 February 2004

The MRTG installed on my firewall has been stopped, following the replacement of degromiest as a router by a Sweex broadband router. Besides the Sweex being smaller and portable enough, I no longer need any additional services on that old computer.

Thanks to degromiest, which replaced lawanggeni for months and did a good job!

17 February 2004

I hope this will be the end of the long, hard-to-predict trilogy of my server, Atijembar.

I went to Norrod Computer, Groningen, for a RAM replacement, because the module is still under their guarantee. The replacement has a higher frequency, 400 MHz, and at 15.30 Atijembar was ready to serve with 640MB of memory. After a one-day run on its 128MB RAM, /var/log/syslog contained no system-related errors such as segmentation faults or kernel crashes.
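The syslog check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used on Atijembar: the pattern list is an assumption (real kernel and application messages differ between versions), and the function names are made up for this example.

```python
import re

# Assumed patterns; adjust them to the messages your own system really logs.
SUSPICIOUS = re.compile(
    r"segfault|segmentation fault|kernel BUG|Oops|I/O error",
    re.IGNORECASE,
)


def find_suspicious_lines(lines):
    """Return (line_number, text) pairs for log lines matching SUSPICIOUS."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if SUSPICIOUS.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log(path="/var/log/syslog"):
    """Scan a log file on disk; reading syslog usually needs root or group adm."""
    with open(path, errors="replace") as handle:
        return find_suspicious_lines(handle)
```

Calling scan_log() returns an empty list when nothing matches, which corresponds to the "no error" result after the one-day test run.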

Go, Atijembar, go!

16 February 2004

I was still rethinking what is happening with Atijembar. It is very strange that even a simple program run by a user with limited privileges can cause damage in kernel space. The other unusual thing happens when the computer is rebooted: more often than before, it hangs seconds after the reboot instruction is issued. At first, I thought this was caused by a reboot script that may not be compliant with the APM in my computer. Yesterday I tried to diagnose it with the MSI utilities inside Windows XP: no problem found, but restarting Windows hit a similar hang!

At first I suspected the IDE cables, which are used by two hard disks and two CD drives. Can't the motherboard handle their total size? Its specification says nothing about that, so I assumed it must be safe.

While searching the motherboard user guide for configuration or jumper positions, I came across the diagnostic LEDs that come with the motherboard. This is a common feature of the latest motherboards, reporting the boot steps. I still had not noticed where these LEDs were located. *Blink* Ah, when the computer room is darker, I can see blinking red and yellow lights at the back of the casing! And, yes! There are four square LEDs beside the USB ports!

Together with Ismail Fahmi (who was visiting me at that moment), I identified the LED pattern after a failed reboot: it stopped at "fail memory testing". I opened Atijembar's casing again and tested with one RAM module alone; it failed, so that module is suspected to be damaged. The other module (the smaller one, 128MB) then replaced the failed one, and the reboot went through without problems. I tested for a day, and no segmentation fault occurred.

14 February 2004

This week of running and testing my Debian Web server got worse and worse. It was somewhat strange that a file system operation failed only about half a day after the file system repair. Besides that, some applications crashed with segmentation faults and made the system unstable. The two most frequently failing applications were Spamassassin and Jdresolve, both of which happen to be invoked by cron. I tried to debug them by running each one individually with its proper privileges (for example Jdresolve from the www-data account), but the problem still arose.

Another thing I tried was reinstalling some of the applications and libraries that I suspected were corrupt or misconfigured. Debian's apt-get install --reinstall is a very helpful command for that. But the result was no better at all!

So I came to the conclusion that I should reinstall Debian, after first fixing the backup script so that it easily saves valuable files in tar format on a safe partition (i.e. the FAT32 partition). Finally, on Friday, 13 February, I reinstalled Debian, with one mistake I had made in the backup script: I forgot to include /var/log/apache.
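For illustration, a backup step of this kind could look like the Python sketch below. It is not the original script: the source list, the archive naming and the function name are assumptions made for this example, and only /var/log/apache is taken from the entry itself. Wrapping everything in a tar archive is useful here because FAT32 does not store Unix ownership or permissions, while the archive keeps them inside the file.

```python
import tarfile
import time
from pathlib import Path

# Hypothetical list; /var/log/apache is the directory the entry says was forgotten.
BACKUP_SOURCES = ["/etc", "/home", "/var/www", "/var/log/apache"]


def make_backup(sources, dest_dir, label="backup"):
    """Write one gzip-compressed tar archive of the given paths into dest_dir.

    Returns (archive_path, missing), where missing lists sources that do not exist.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d")
    archive = dest / f"{label}-{stamp}.tar.gz"
    missing = []
    with tarfile.open(str(archive), "w:gz") as tar:
        for source in sources:
            path = Path(source)
            if not path.exists():
                missing.append(source)
                continue
            # Store names without the leading slash so the archive unpacks anywhere.
            tar.add(str(path), arcname=str(path).lstrip("/"))
    return archive, missing
```

A call such as make_backup(BACKUP_SOURCES, "/mnt/fat32/backup") would then write one dated archive, where the mount point is again only an assumed example. Printing the returned missing list catches mistyped paths, though it cannot catch a directory that was never put on the list. Note also that a single file on FAT32 cannot exceed 4 GiB, so very large sets may need to be split into several archives.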

Today Debian is in its second run-and-test period with a very limited set of packages. XWindows is not yet installed, and Apache's log is being written but left untouched by applications such as log analyzers. Still, I really miss the previous week of Apache logs.

10 February 2004

Atijembar still has problems in this week of operation (and testing). This morning, with the last command executed at 06.25, the main partition /dev/hda1 was suddenly remounted read-only, so the journal could not be written. On startup after a reboot, fsck detected some errors and repaired some i-nodes.

No additional report in logfiles.
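A silent read-only remount like the one above can be detected from the kernel's mount table. The sketch below is only an illustration and assumes a Linux system that exposes /proc/mounts, where each line holds the device, mount point, file system type and a comma-separated option list. The function names are made up for this example.

```python
def read_only_mounts(mounts_text):
    """Return (device, mount_point) pairs whose mount options include 'ro'."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        if "ro" in options.split(","):
            result.append((device, mount_point))
    return result


def check_system(path="/proc/mounts"):
    """Read the mount table (Linux only) and report read-only mounts."""
    with open(path) as handle:
        return read_only_mounts(handle.read())
```

The result lists every read-only mount, including ones that are read-only on purpose (a mounted CD, for instance), so in practice one would compare it against the partitions expected to be writable, such as /dev/hda1 here.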

06 February 2004

On Thursday, 29 January, at 04.00, my Web server, Atijembar, ran into a problem. I was working inside XFree when Mozilla suddenly quit unexpectedly and my Mutt switched its active mailbox to read-only status. Checking from the console, I found that the mounted partition used by Linux had an error, so it had been automatically remounted read-only, as configured in /etc/fstab.

Actually I had seen a similar error message some days before and, without paying it much attention, I simply rebooted and the computer ran again as usual. But this time, after rebooting, my Debian GNU/Linux asked me to enter single-user mode and run a disk scan. After running fsck, which corrected a great many i-nodes and saved them in lost+found, I noticed that the Apache daemon failed to start, as did MySql. Minutes later I came to the conclusion that something worse than I had thought had happened: most of the important binary files in /usr/sbin were missing.

I did not know exactly what had happened inside the /home partition, so the best I could do was quickly run the backup routine. I made a slight modification to the backup script to include the physical storage of MySql's databases. This was needed because the last backup had been made about a month earlier using mysqldump, and there had been a great deal of database activity in the days since.

It was a complete accident: the module for the CD writer, ide-scsi, had become corrupt, so no backup to CD could be performed under Linux.

The only thing I could do was copy all the .tar.gz files to Windows XP's FAT32 partition and burn them to CD later. For the rest of that morning I let my children use Atijembar for games under Windows!

While investigating my hard disk, I found no physical error, so it was a little strange to get the I/O errors reported by the ext3 journaling file system. Without thinking too long, I drew the simple conclusion on Thursday evening that the hard disk was broken and had to be replaced. I even tried Windows XP (which uses another partition): the operating system ran well at first, but then disk operations failed, leaving Windows running from RAM alone, and minutes later it crashed.

The day after, Friday 30 January, I replaced my hard disk with a new and larger one. The Windows installation still failed, with annoying error messages saying that my computer was not compliant with some of Windows' hardware specifications. I joked to myself: do I have to replace the whole machine? Do I have to beg in front of Mr. Gates?

After several broken installations, I left the Windows partition unused and skipped to installing Linux with Debian. The result was no better at all: the installation was mostly interrupted at unpredictable spots, and even when it succeeded, the result was very unstable. The I/O error messages still appeared, so it was far too early to say the computer was ready to operate. I wondered whether the new hard disk was too large for my computer to handle. How come a two-year-old computer suddenly becomes out of date? Funny!

Ismail Fahmi, one of my colleagues, suggested that I borrow his old hard disk for testing. While waiting for him to come on Saturday, 31 January, I had the idea of testing my computer with Knoppix. This was the second test since my computer broke, under a different condition: now with a new hard disk. Knoppix detected the configuration correctly, but... a moment after XFree initialization, my monitor flashed and blinked several times and XWindows could not be started. Knoppix itself was not broken; it simply reported on the console that XWindows could not be launched. After two experiments with Knoppix, I reached a new conclusion: the problem seemed to come from the VGA card. Duh!

On the evening of Iedul Adha 1424H, I contacted Elfahmi Yaman, another colleague who has a PC, and asked whether I could test my computer with his VGA card. Without waiting long, half an hour later I brought Atijembar to Elfahmi's house and ran some tests, both with a Windows XP installation and with Knoppix running directly from CD. Blah, blah... one hour of testing was enough: the problem was identified more precisely. It came from the VGA card, which had side effects on I/O transactions.

Back at home, I found the hard disk borrowed from Ismail Fahmi on my table. With enough evidence from the previous test with the VGA card, I simply confirmed the conclusion by seeing that even with an older, smaller hard disk, Atijembar with the failing VGA card could not operate well.

Then came a long wait until Monday, 2 February, at noon, because shops in Groningen usually open at 13.00. I chose the cheapest card on offer in several computer shops: a Radeon 7000, 64MB. After I fixed the hardware configuration and tested its stability using Windows XP, Debian GNU/Linux was installed, the configurations were reset to the proper ones, and at last the backup was restored.

On Tuesday afternoon I opened the firewall for Atijembar's services. Some of my friends noticed this and notified me via email. On Wednesday, 4 February, I announced to our community that our Website was active again.

Because I also use this computer for work (development), I am still struggling to install this and that to make it as comfortable as before. It has some tricky configurations and settings, but that is another story. The most important thing, the web server function, has been reactivated successfully.

My sympathy goes to one of my friends in Bandung, who also had a hardware problem, caused by badly fluctuating electricity that did physical damage.