
Re: [microblaze-uclinux] Bad page state in process 'swapper'



Mike,

it is not that I was doubting the hardware. I have tested my case with an extension of the memory test program.

The point I wanted to make is that in the SDRAM case, compared to the DDR case, some different information is passed from the Xilinx hardware build to the Petalinux software build. This could result, for example, in the cache being controlled differently, as you already stated. However, I would expect that a cache can never go wrong in a flat memory system, unless it is controlled in a wrong way (perhaps due to a difference in the SDRAM/DDR info?).

I think that consistent crash behaviour can only be expected as long as no interrupts occur. Once they do, there will be multiple interacting "processes" influenced by the factor "time". However, you wrote that you zeroed memory (something that I was also thinking of) and disabled interrupts.

Did you disable the interrupts in hardware? In software they could be re-enabled by a regular part of the kernel.

Cor

----- Original Message ----- From: "Mike Thompson" <mike@xxxxxxxxxx>
To: <microblaze-uclinux@xxxxxxxxxxxxxx>
Sent: Saturday, November 10, 2007 5:43 PM
Subject: Re: [microblaze-uclinux] Bad page state in process 'swapper'


Cor,

I'll try to provide more details on the Suzaku (not working) and Memec (working) differences when I get back into the office on Monday. However, I am using the exact same FPGA bitstream with the Petalinux 2.6 kernel that I am using with the more ancient Suzaku uClinux 2.4 kernel. Since it is working fine with the Suzaku 2.4 kernel, it at least assures me that the hardware is stable and working when the drivers are properly configured.

The difference between SDRAM and DDR SDRAM didn't occur to me. I'll look into this as well on my side. Something I have noticed is that my crash results are not entirely consistent and sometimes it will crash prior to the Dentry/Inode-cache entry appearing on the console. It never gets past those two entries though and the 'Bad page state in process 'swapper'' will appear. Such inconsistent crashing could potentially indicate some memory problem.

One diagnostic attempt we made was to set all memory to a precise known state and disable the enabling of interrupts and see what the result was. We still were getting inconsistent crashing although we expected to find the system to crash in exactly the same place each time. This is perhaps another clue that memory is unstable with the 2.6 kernel.

Unfortunately, I'm not seeing where the memory controller is exposed as a peripheral to the Microblaze so I don't see how the 2.4 kernel could be doing anything different in memory than the 2.6 kernel. Perhaps the issue is in the handling of the instruction/data cache controller as that could also cause memory issues. Just a wild guess though on my part at this time.

I'll provide more details on the other information you mention in a follow up email.

Mike

Cor Venner wrote:
Hello Mike and everyone who is interested,

let us share some more details to see if we can identify a pattern:
- I use image.srec as the source for the kernel and fsrom; I expect that most of us use image.bin or image.ub
- networking is disabled in "menuconfig"
- initially the board had 128 Mbyte of SDRAM (not DDR) configured, which is quite a lot compared to the evaluation boards mostly used. I have brought this down to 32 Mbyte in Kconfig. The difference I could notice was that the downsized memory version was able to reach the "Memory: 29378k/32768k available" and "Calibrating delay loop.." messages while backtracing. So it seems that some multiprocessing is going on.
- This SDRAM<>DDR difference is quite intriguing. As far as I can remember, the Memec board is a DDR board; the SZ130 is an SDRAM board as far as I could find out on the web. My Gemini board is also SDRAM. Could it be something in the configuration info of the IP?
- I have applied the peripheral test functions as provided by Xilinx. They run successfully. Unfortunately they do not test the serial interface interrupt. I expect that the timer and interrupt controller are running correctly. See the previous statement.
- Xilinx tests the memory in a simple way: it checks whether char, short and long accesses work. I have extended this test and it seems to run successfully on 128 Mbyte of SDRAM.
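
For reference, a test built around char, short and long accesses, as described above, could look roughly like the sketch below. This is only an illustration and not the Xilinx test or the extended version mentioned here: the function names (mem_test and the pass_* helpers) and the address-derived patterns are assumptions. It expects a 4-byte aligned region whose size is a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: each pass fills the whole region with an
 * address-derived pattern at one access width, then reads it back.
 * The return value is the number of mismatching locations. */
static size_t pass_8(volatile uint8_t *p, size_t n)
{
    size_t i, errors = 0;
    for (i = 0; i < n; i++)
        p[i] = (uint8_t)(i ^ 0xA5u);
    for (i = 0; i < n; i++)
        if (p[i] != (uint8_t)(i ^ 0xA5u))
            errors++;
    return errors;
}

static size_t pass_16(volatile uint16_t *p, size_t n)
{
    size_t i, errors = 0;
    for (i = 0; i < n; i++)
        p[i] = (uint16_t)(i ^ 0xA5A5u);
    for (i = 0; i < n; i++)
        if (p[i] != (uint16_t)(i ^ 0xA5A5u))
            errors++;
    return errors;
}

static size_t pass_32(volatile uint32_t *p, size_t n)
{
    size_t i, errors = 0;
    for (i = 0; i < n; i++)
        p[i] = (uint32_t)i ^ 0xA5A5A5A5u;
    for (i = 0; i < n; i++)
        if (p[i] != ((uint32_t)i ^ 0xA5A5A5A5u))
            errors++;
    return errors;
}

/* Run the 8-, 16- and 32-bit passes over [base, base + bytes).
 * Returns the total number of mismatches (0 means all passes agreed). */
size_t mem_test(void *base, size_t bytes)
{
    /* Assumption of this sketch: aligned start, size multiple of 4. */
    assert(((uintptr_t)base & 3u) == 0 && (bytes & 3u) == 0);

    return pass_8((volatile uint8_t *)base, bytes)
         + pass_16((volatile uint16_t *)base, bytes / 2)
         + pass_32((volatile uint32_t *)base, bytes / 4);
}
```

On a target one would call mem_test() with the base address and size of the SDRAM region (both board specific and not given here). Note that a test like this runs long before the kernel and with its own cache settings, so passing it does not rule out a cache-handling problem later in the boot.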

I don't quite understand the relation between printk() and early_printk(). At what point does printk() have to take over the role of early_printk()? And what happens when you apply the "keep" option in init_early_printk() on a correctly running system? Is the info displayed twice?

Mike, is this something which you can try on the Memec board?

Best regards, Cor Venner


___________________________
microblaze-uclinux mailing list
microblaze-uclinux@xxxxxxxxxxxxxx
Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
Mailing List Archive : http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/


