Re: [microblaze-uclinux] kernel BUG at sched.c:687!
Hi Prasad,
This is my fault. I sent an email an hour ago, but it seems to have been
delayed. There was a mistake in my instructions: the code must go AFTER
SAVE_STATE. Otherwise the stack pointer in use could still be the user stack
pointer, and the check would fail with a false positive.
Sorry again.
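
To illustrate the false positive, here is a rough C model of the arithmetic the assembly check performs. It is only a sketch: the function names and the addresses in the usage note are made up for illustration, and the only values taken from my code are 0x2000 and 0x1d4c.

```c
#include <stdint.h>

/* Size of the region holding the task descriptor and the kernel stack. */
#define KSTACK_REGION 0x2000u   /* 8192 bytes */
/* Threshold used by the check. */
#define WARN_LIMIT    0x1d4cu   /* 7500 bytes */

/* Bytes in use, computed as the assembly does: (current + 0x2000) - sp.
 * The result is only meaningful if sp points into the kernel stack. */
uint32_t stack_used(uint32_t current, uint32_t sp)
{
    return (current + KSTACK_REGION) - sp;
}

/* Returns 1 if the check would enter the endless loop, 0 otherwise.
 * The assembly branches on a signed comparison (blei), so the
 * difference is interpreted as a signed 32-bit value here too. */
int check_trips(uint32_t current, uint32_t sp)
{
    return (int32_t)(stack_used(current, sp) - WARN_LIMIT) > 0;
}
```

With a hypothetical current of 0x02200000, a kernel stack pointer of 0x02201f00 gives 256 bytes in use and the check passes. A user stack pointer such as 0x02100000, which lies below the region, gives a huge "usage" and the check trips even though the kernel stack is fine.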
On Friday 21 April 2006 09:41, DeviPrasad Natesan wrote:
> Hi Alejandro,
> I added your stack-checking code (with the "current" pointer updated to
> mine) and, strange as it seems, the code goes into the loop just as Linux
> is coming up. (The system works normally without this code.) I checked
> on our custom hardware and also on the ML403 eval board, and it
> behaves the same. I removed all the custom applications, so it is a
> simple system with Ethernet and UART, with root mounted on jffs2. Even
> this seems to corrupt the kernel stack (according to this test
> code).
>
> Does it mean that the last "current" task corrupts the stack? I've got to
> continue debugging tomorrow. It seems totally illogical at this point.
>
> - Prasad
>
> On 4/20/06, Alejandro Lucero <alucero@xxxxxxxxx> wrote:
> > On Thursday 20 April 2006 13:29, Brettschneider Falk wrote:
> > > Hi Alejandro,
> > > thanks! Am I right, it will infinitely loop in case of a kernel stack
> > > overflow? If yes, how could I write a fixed value to a register address
> > > in that loop (to e.g. switch an LED on)? Going to try that soon...
> > > Cheers, F@lk
> >
> > If the CPU executes this infinite loop, it means the kernel stack usage is
> > more than 7500 bytes, which is very close to 8192. Moreover, the real
> > limit is 8192 - sizeof(struct task_struct), since the process descriptor
> > is at the bottom. This does not prove a stack overflow, but the odds are
> > high.
> >
> > If you know the LED address (look at
> > arch/microblaze/platform/uclinux-auto/autoconfig.in) you can use it with
> > a swi instruction:
> >
> > addi r11, r0, 0x1; /* Assuming that writing 0x00000001 turns an LED on */
> > swi r11, r0, led_address;
> >
> > You can add this before the first nop instruction.
> >
> > > > -----Original Message-----
> > > > From: owner-microblaze-uclinux@xxxxxxxxxxxxxx
> > > > [mailto:owner-microblaze-uclinux@xxxxxxxxxxxxxx]On Behalf Of
> > > > Alejandro Lucero
> > > > Sent: Thursday, April 20, 2006 1:50 PM
> > > > To: microblaze-uclinux@xxxxxxxxxxxxxx
> > > > Subject: Re: [microblaze-uclinux] kernel BUG at sched.c:687!
> > > >
> > > > On Thursday 20 April 2006 10:31, Brettschneider Falk wrote:
> > > > > Hi,
> > > > >
> > > > > Alejandro Lucero wrote:
> > > > > > I assumed you are using the entry.S without my patch reported
> > > > > > two days ago.
> > > > > > aren't you?
> > > > >
> > > > > > I've tried JWs version of your patch but it doesn't help as a bugfix. My
> > > > > > environment is one active user app with several threads (SCHED_RR), a high
> > > > > > IRQ frequency (about 2 per millisecond), many thread switches, many
> > > > > > locks/unlocks of semaphores and mutexes. From time to time one thread of
> > > > > > that application calls pthread_cancel() to another thread.
> > > > > > Often (about after 20 kill actions) this leads to either a Linux crash
> > > > > > (with several versions of "kernel BUG at sched.c:***"), or just a total
> > > > > > hang or an exit of the app with return code 5. (The statistical
> > > > > > distribution is: displaying of scheduler bug = 0,01%, Linux hang = 60%,
> > > > > > process exit = rest.) I haven't the problems if either the IRQ frequency is
> > > > > > very low or no threads are cancelled(). That's why I asked you if you ever
> > > > > > tried to kill threads in your application, this increases the chance of a
> > > > > > Linux crash extremely here.
> > > >
> > > > Perhaps you could do some tests to rule out a kernel stack
> > > > overflow. Try putting this in your entry.S file, but update the
> > > > "current" pointer and make sure you are not using memory at 0x554,
> > > > 0x558, 0xc64 and 0xc68 (surely LMB memory).
> > > > This code looks at the kernel stack size, and if it is greater
> > > > than 0x1d4c (7500 bytes) the system will execute an endless loop with
> > > > interrupts disabled.
> > > > The maximum kernel stack size used is stored at 0xc64.
> > > >
> > > > Remember to update "current", which in my kernel is at address
> > > > 0x0213472c. Use
> > > > objdump -t image.elf | grep current
> > > >
> > > > Try to put this in ENTRY(irq) just after swi r1, r0, ENTRY_SP
> > > > and before SAVE_STATE
> > > >
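> > > > swi r11, r0, 0x554;        /* save r11 in scratch memory */
> > > > swi r12, r0, 0x558;        /* save r12 in scratch memory */
> > > > lwi r11, r0, 0x0213472c;   /* r11 = current */
> > > > addi r11, r11, 0x2000;     /* r11 = top of the 8192-byte region */
> > > > rsub r11, r1, r11;         /* r11 = top - sp = stack bytes in use */
> > > > lwi r12, r0, 0xc64;        /* r12 = maximum recorded so far */
> > > > swi r11, r0, 0xc68;        /* stash the current usage */
> > > > rsub r11, r11, r12;        /* r11 = maximum - usage */
> > > > bgei r11, 1f;              /* no new maximum: skip the update */
> > > > lwi r11, r0, 0xc68;
> > > > swi r11, r0, 0xc64;        /* record the new maximum */
> > > > 1:
> > > > lwi r11, r0, 0x0213472c;
> > > > addi r11, r11, 0x2000;
> > > > rsub r11, r1, r11;         /* r11 = stack bytes in use, recomputed */
> > > > addi r12, r0, 0x1d4c;      /* r12 = 7500 */
> > > > rsub r11, r12, r11;        /* r11 = usage - 7500 */
> > > > blei r11, 2f;              /* within the limit: continue normally */
> > > > lwi r11, r0, 0;            /* r11 = word at address 0 */
> > > > mts rmsr, r11;             /* write it to MSR (meant to disable interrupts) */
> > > > nop;
> > > > nop;
> > > > nop;
> > > > bri -8;                    /* endless loop */
> > > > 2:
> > > > lwi r11, r0, 0x554;        /* restore r11 */
> > > > lwi r12, r0, 0x558;        /* restore r12 */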
> > > >
> > > > > Cheers, F@lk
> > > > > > ___________________________
> > > > > > microblaze-uclinux mailing list
> > > > > > microblaze-uclinux@xxxxxxxxxxxxxx
> > > > > > Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
> > > > > > Mailing List Archive : http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/
> >
> > --
> > Alejandro Lucero
> > Technical Director
> > +34 665 68 71 68
> > Valencia (SPAIN)
> > www.os3sl.com
>
--
Alejandro Lucero
Technical Director
+34 665 68 71 68
Valencia (SPAIN)
www.os3sl.com