[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Strange lockup with 2.6.0


> > > > I usually do a backup of each filesystem simply using tar.  I attempted to
> > > > backup a machine I had that's running 2.6.0 and it hard locked.
> > >
> > > Are sysrq-keys enabled? If so, could you catch the tar backtrace during
> > > the lock-up (ALT-SysRq-t)? What was the latest kernel-version that worked?
> >
> > Yes, but the machine hard locks.  sysrq does not work.  I have a small
> 
> __THAT__ hard...:-)

Yup.  That hard.

> > utility I wrote that will set the state of the parport (I used this to tell
> > if it locks up) using outb to the port (This does not effect it in anyway,
> > it will lockup w/o it running)
> 
> You mean it just toggles a bit periodically?

I have a set of LEDs attached to the parport (12) and this program writes to
it in a way that makes it bounce the 'on' led every .25 seconds  I'll send
you the program if you're interested.

> > > Can you just try to write some data over NFS? Would it lock if you write 1
> >
> > I am constantly accessing NFS with this machine.  Read and write.  It was
> 
> How much data at one go (max)?

Dunno.  I've never given   it that much thought.  I have the completed
backup on the jaz.  I can attempt to dump it to the server to see if that
makes a difference.

> > only when I backed it up with tar.  In the event it doesn't lock, tar
> > crashes w/o error/warning (over NFS).
> 
> So, it locks not always?

Most of the time, yes it does.  I'd say 90% of the time it hard locks.  If
it doesn't and I attempt it again it always hard locks (except one time I
did it).  I've done the tests numberous times.

> > > byte or 1K or 1M? Does it lock immediately as you start the backup or
> >
> > It locks up usually at one point, but not always.
> 
> Since you could backup to Jazz, looks like your filesystem is ok, NFS also
> works in principle...

Before one test, I did: cp /dev/sda /dev/null
to see if it has any problems with the disk.  It was fine.

> > > after some time (you could start some process in the background
> > > periodically printing some info on the terminal, like vmstat, cat
> > > /proc/interrupts, free, tcpdump on both ends to a file...) Can you try NFS
> >
> > I can do this I think.  It's fun when running with init being bash.  It will
> > take some time to do since I can't scroll backwards.
> 
> You could also attach a serial console and direct the output there (then
> you also can scroll).

I thought about this.  Hopefully compiling in serial doesn't add another
variable to this.  I currently have serial compiled as a module.

> > > over TCP? Are other machines, where backup works, also running 2.6,
> >
> > I can try TCP, but I'm not sure about the server accepting TCP (was there a
> > compile time option for NFSD to use TCP?)  These 2 machines are the only
> 
> Yes.

I did not compile the server with TCP support.

> > ones I have on 2.6.
> >
> > > 10/100mbps?
> >
> > 100 FD always.
> 
> Why I am interested in your experiences is that I also have a problem
> transferring large (several M) files over NFS when the server is 2.6 and
> both ends have 100 FD. (You can see my posts this week about 2.6 NFS.) And
> in my case it TCP fixed it. But I never had hard-locks, just cp hanged in
> D, and tcpdump showed timed out reassembly on the receiving side. But I
> was reading from the server.

That's interesting.  I hope it doesn't matter if the server is a diskless
machine.  Interesting you mention the server being 2.6.  The NFS I did above
was to a different (also diskless) server.  The 2.6 one I threw a hard disk
on so I could do backups of all my machines (and w/o shutting another down). 
Out of the 5 machines on this network, only 2 have usable IDE ports (one has
none, one's a laptop, one is full of cdroms which is the machine that's
hanging on me)

On a side note, I have a 2.4.x (x>=20) using knfsd and nohide on directories.
A 2.4.x client can see those contents, a 2.6.x client can't w/o mounting each
individually.

-- 
 Lab tests show that use of micro$oft causes cancer in lab animals
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo _at_ vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


この情報があなたの探していたものかどうか選択してください。
yes/まさにこれだ!   no/違うなぁ   part/一部見つかった   try/これで試してみる

あなたが探していた情報はどのようなことか、ご自由に記入下さい。特に「まさにこれだ!」と言う場合は記入をお願いします。
例:「複数のマシンからCATV経由でipmasqueradeを利用してWebを参照したい場合の設定について」
References: