
[microblaze-uclinux] uClinux crashing w/ bus errors/ hanging in pty_init



I am currently trying to complete a project running uClinux on an ML505. I have found uClinux to be extremely unstable, and would like to know if anyone has seen similar problems. I do not believe this problem is hardware related, as my standalone code can run for days with no issues.

The most annoying issue I am having is with the function pty_init, and I was wondering who else might have seen this. There are 3 possible results I see from calling this function:

- About 20% of the time, the function completes after about 17 seconds, and boot up continues as normal.
- About 50% of the time, the function never completes and simply hangs forever.
- About 30% of the time, the function appears to hang, and eventually I get an exception that originates from no_work_pending (this is obviously another thread).

When I see the exception, ESR is set to 0x00000984. The documentation I have lists this as a data bus exception, and the exception specific status area has no description listed. This would seem to imply that a memory or peripheral access has timed out. However, the location passed in R17 when this occurs points to the instruction:

lwi  r7,r1,32

Interestingly, there are at least a dozen similar instructions above it that never seem to have an issue. This function is apparently the scheduler reloading the registers from the stack. When the exception occurs, it always seems to happen at exactly this instruction. r1 is valid when this happens, and points into the DDR RAM. I have run an exhaustive memory test on the DDR RAM and it shows no problems. I have the caches disabled.
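
For reference, the ESR value above can be split into its fields with a few shifts and masks. This is only a sketch: it assumes the ESR layout from the MicroBlaze reference guide as I understand it (exception cause in the low 5 bits, exception specific status in the next 7 bits, then the delay slot flag), and the function names are made up for illustration. Please check the field positions and cause codes against the guide for your MicroBlaze version.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed ESR layout, numbering bits from the least significant end
 * (the Xilinx documentation numbers bits from the most significant end):
 *   bits [4:0]  EC  - exception cause
 *   bits [11:5] ESS - exception specific status
 *   bit  12     DS  - exception occurred in a delay slot
 */
unsigned esr_cause(uint32_t esr)      { return esr & 0x1Fu; }
unsigned esr_ess(uint32_t esr)        { return (esr >> 5) & 0x7Fu; }
unsigned esr_delay_slot(uint32_t esr) { return (esr >> 12) & 0x1u; }

/* Names for the cause codes this sketch assumes; anything else is
 * reported as "other/unknown" rather than guessed. */
const char *esr_cause_name(unsigned ec)
{
    switch (ec) {
    case 1:  return "unaligned data access";
    case 2:  return "illegal opcode";
    case 3:  return "instruction bus error";
    case 4:  return "data bus error";
    case 5:  return "divide";
    case 6:  return "floating point unit";
    default: return "other/unknown";
    }
}

/* Print the decoded fields of one ESR value. */
void esr_print(uint32_t esr)
{
    unsigned ec = esr_cause(esr);
    printf("ESR=0x%08lx EC=%u (%s) ESS=0x%02x DS=%u\n",
           (unsigned long)esr, ec, esr_cause_name(ec),
           esr_ess(esr), esr_delay_slot(esr));
}
```

With 0x00000984, esr_cause() gives 4, which matches the data bus exception reading above, the delay slot flag is 0, and the status field comes out as 0x4C. I do not know what, if anything, that status value means for this exception type.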

Two questions. First, is it possible that the exception is really occurring during an interrupt and being delayed until this point? If so, why is it so consistent that it always shows up on this instruction? Adding printk calls to the code does not change the place where the exception occurs, which I would expect it to do if the problem were timing related.

Second, does anyone know why pty_init would sometimes appear to hang during half the attempts? Could the scheduler be having a different issue on these occasions and actually be crashing the kernel? And if so, why does it take 17 seconds to return even when it works? That seems an incredibly long time for a processor running at 125 MHz.
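
To put the 17 seconds in perspective, here is the arithmetic as a small sketch. The helper name is hypothetical, and it assumes the processor really is clocked at 125 MHz for the whole time.

```c
#include <stdint.h>

/* Clock cycles that elapse in a given number of seconds
 * at a given clock frequency in Hz. */
uint64_t elapsed_cycles(uint64_t seconds, uint64_t clock_hz)
{
    return seconds * clock_hz;
}
```

elapsed_cycles(17, 125000000) comes to 2,125,000,000, so roughly two billion clock cycles pass inside pty_init even on the runs that succeed.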

How would everyone recommend going about debugging this situation? Again, I don't believe there is truly a hardware problem, as this situation only occurs when running uClinux. It is of course possible that uClinux exercises some specific pattern which exposes a hardware problem, but if so, I have no idea how to resolve this.

Thank you in advance for any assistance.

Chris

___________________________
microblaze-uclinux mailing list
microblaze-uclinux@xxxxxxxxxxxxxx
Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
Mailing List Archive : http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/