
RE: [microblaze-uclinux] Bug at line 497 of mmnommu/slab.c?



Hi John,

Thanks for your continued help on this; you've been a wonderful resource!

I've attached an archive of my XPS project, in the hope that you might have
time to look through it for whatever mistake of mine is keeping this from
working. Thanks in advance for any time you can give it.

In response to the questions you asked in your reply to my original post:

1) Kernel updated recently?

Kernel and distribution were both CVS updated and had your patch applied
immediately before running the tests that led to my original post.


2) Compiler version?

The following output was generated by running a 'mb-gcc -v' command at a
Linux command prompt on the VMware virtual machine, which I'm using to
compile the kernel/distribution:

--------- Command output begins here. ---------------------------
Reading specs from /usr/local/microblaze-elf-tools/bin/../lib/gcc-lib/microblaze/2.95.3-4/specs
gcc version 2.95.3-4 Xilinx EDK 6.3 Build EDK_Gmm.11.2
--------- Command output ends here. ---------------------------


3) 'slab.o' disassembly?

Here's the section of the 'mb-objdump -S' output for 'slab.o' that pertains
to the 'kmem_cache_sizes_init()' function:

-------- Command output begins here. ----------------
void __init kmem_cache_sizes_init(void)
{
      ac:	d9e00800 	sw	r15, r0, r1
      b0:	12e50000 	addk	r23, r5, r0
      b4:	e877004c 	lwi	r3, r23, 76
      b8:	be03004c 	beqid	r3, 76		// 104
      bc:	13060000 	addk	r24, r6, r0
	cache_sizes_t *sizes = cache_sizes;
	char name[20];
	/*
	 * Fragmentation resistance on low memory - only use bigger
	 * page orders on machines with more than 32MB of memory.
	 */
	if (num_physpages > (32 << 20) >> PAGE_SHIFT)
      c0:	e8770020 	lwi	r3, r23, 32
      c4:	12c00000 	addk	r22, r0, r0
      c8:	16561803 	cmpu	r18, r22, r3
      cc:	bc720038 	blei	r18, 56		// 104
      d0:	e8770018 	lwi	r3, r23, 24
      d4:	e898000c 	lwi	r4, r24, 12
      d8:	4063b000 	mul	r3, r3, r22
      dc:	e917004c 	lwi	r8, r23, 76
      e0:	be080014 	beqid	r8, 20		// f4
      e4:	10a41800 	addk	r5, r4, r3
      e8:	10d70000 	addk	r6, r23, r0
      ec:	99fc4000 	brald	r15, r8
      f0:	10e00000 	addk	r7, r0, r0
      f4:	e8770020 	lwi	r3, r23, 32
      f8:	32d60001 	addik	r22, r22, 1
      fc:	16561803 	cmpu	r18, r22, r3
     100:	bc92ffd0 	bgti	r18, -48		// d0
     104:	e8d8000c 	lwi	r6, r24, 12
     108:	e8b80008 	lwi	r5, r24, 8
     10c:	e8970028 	lwi	r4, r23, 40
     110:	30600001 	addik	r3, r0, 1
     114:	45032400 	bsll	r8, r3, r4
     118:	14a53000 	rsubk	r5, r5, r6
     11c:	b0000200 	imm	512
     120:	30650000 	addik	r3, r5, 0
     124:	6463000c 	bsrli	r3, r3, 12
     128:	b0000000 	imm	0
     12c:	e8800000 	lwi	r4, r0, 0
     130:	6063002c 	muli	r3, r3, 44
     134:	3108ffff 	addik	r8, r8, -1
     138:	10e41800 	addk	r7, r4, r3
     13c:	aa48ffff 	xori	r18, r8, -1
     140:	bc120030 	beqi	r18, 48		// 170
     144:	94910002 	msrclr	r4, 2
     148:	80000000 	or	r0, r0, r0
     14c:	e8670018 	lwi	r3, r7, 24
     150:	a463feff 	andi	r3, r3, -257
     154:	f8670018 	swi	r3, r7, 24
     158:	9404c001 	mts	rmsr, r4
     15c:	80000000 	or	r0, r0, r0
     160:	30e7002c 	addik	r7, r7, 44
     164:	3108ffff 	addik	r8, r8, -1
     168:	aa48ffff 	xori	r18, r8, -1
     16c:	bc32ffd8 	bnei	r18, -40		// 144
     170:	e8d70028 	lwi	r6, r23, 40
     174:	b0000000 	imm	0
     178:	b9f40000 	brlid	r15, 0
     17c:	80000000 	or	r0, r0, r0
     180:	e877001c 	lwi	r3, r23, 28
     184:	b0000001 	imm	1
		slab_break_gfp_order = BREAK_GFP_ORDER_HI;
     188:	a4630000 	andi	r3, r3, 0
     18c:	bc030010 	beqi	r3, 16		// 19c
     190:	e8b7003c 	lwi	r5, r23, 60
     194:	b9f40e70 	brlid	r15, 3696	// 1004 <kmem_cache_free>
     198:	10d80000 	addk	r6, r24, r0
     19c:	eb01002c 	lwi	r24, r1, 44
     1a0:	eae10028 	lwi	r23, r1, 40
	do {
		/* For performance, all the general caches are L1 aligned.
		 * This should be particularly beneficial on SMP boxes, as it
		 * eliminates "false sharing".
		 * Note for systems short on memory removing the alignment will
		 * allow tighter packing of the smaller caches. */
		snprintf(name, sizeof(name), "size-%lu",(unsigned long)sizes->cs_size);
     1a4:	eac10024 	lwi	r22, r1, 36
     1a8:	c9e00800 	lw	r15, r0, r1
     1ac:	b60f0008 	rtsd	r15, 8
     1b0:	30210030 	addik	r1, r1, 48

000001b4 <kmem_cache_create>:
     1b4:	3021ffa8 	addik	r1, r1, -88
     1b8:	fbe10054 	swi	r31, r1, 84
     1bc:	fbc10050 	swi	r30, r1, 80
     1c0:	fba1004c 	swi	r29, r1, 76
     1c4:	fb810048 	swi	r28, r1, 72
     1c8:	fb610044 	swi	r27, r1, 68
     1cc:	fb410040 	swi	r26, r1, 64
     1d0:	fb21003c 	swi	r25, r1, 60
     1d4:	fb010038 	swi	r24, r1, 56
     1d8:	fae10034 	swi	r23, r1, 52
     1dc:	fac10030 	swi	r22, r1, 48
     1e0:	d9e00800 	sw	r15, r0, r1
     1e4:	13060000 	addk	r24, r6, r0
     1e8:	13470000 	addk	r26, r7, r0
     1ec:	13c50000 	addk	r30, r5, r0
     1f0:	f921006c 	swi	r9, r1, 108
     1f4:	f9410070 	swi	r10, r1, 112
		if (!(sizes->cs_cachep =
			kmem_cache_create(name, sizes->cs_size,
					0, SLAB_HWCACHE_ALIGN, NULL, NULL))) {
			BUG();
		}

		/* Inc off-slab bufctl limit until the ceiling is hit. */
		if (!(OFF_SLAB(sizes->cs_cachep))) {
			offslab_limit = sizes->cs_size-sizeof(slab_t);
			offslab_limit /= 2;
		}
		snprintf(name, sizeof(name), "size-%lu(DMA)",(unsigned long)sizes->cs_size);
		sizes->cs_dmacachep = kmem_cache_create(name, sizes->cs_size, 0,
			      SLAB_CACHE_DMA|SLAB_HWCACHE_ALIGN, NULL, NULL);
		if (!sizes->cs_dmacachep)
			BUG();
		sizes++;
	} while (sizes->cs_size);
}

-------- Command output ends here. ----------------


4) Bogus value passed to lower level function?

Here is the output of my gdb session, which includes a post-crash stack
frame analysis showing the erroneously received values:

-------- gdb session output begins here. ----------------

dbanas@linux:~/NuHo/MBO/S3-1500/uClinux-dist> mb-gdb -nw images/image.elf
GNU gdb 5.3Xilinx EDK 6.3 Build EDK_Gmm.10
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "--host=i686-pc-linux-gnu --target=microblaze"...
(gdb) target remote win:1234
Remote debugging using win:1234
0xfe000000 in start ()
(gdb) c
Continuing.

>>>>>>>>>>>> Note) I hit 'Ctrl-C' here. <<<<<<<<<<<<<<<

Program received signal SIGTRAP, Trace/breakpoint trap.
0xfe00b828 in machine_halt () at machine.c:234
234                     asm ("nop; nop; nop; nop; nop");
(gdb) bt
#0  0xfe00b828 in machine_halt () at machine.c:234
#1  0xfe005b54 in __bug (file=0x1d "\bR\220", line=1, data=0x359) at bug.c:30
#2  0xfe02188c in kmem_cache_create (name=0xfe121fa0 "eeee", size=4262789612, offset=4262690804, flags=4262559768, ctor=0,
    dtor=0) at slab.c:857
#3  0xfe127364 in kmem_cache_sizes_init () at slab.c:495
#4  0xfe123568 in start_kernel () at init/main.c:453
#5  0xfe123568 in start_kernel () at init/main.c:453
#6  0xfe123568 in start_kernel () at init/main.c:453
#7  0xfe123568 in start_kernel () at init/main.c:453
#8  0xfe123568 in start_kernel () at init/main.c:453
#9  0xfe123568 in start_kernel () at init/main.c:453
#10 0xfe123568 in start_kernel () at init/main.c:453
#11 0xfe123568 in start_kernel () at init/main.c:453
#12 0xfe123568 in start_kernel () at init/main.c:453
#13 0xfe123568 in start_kernel () at init/main.c:453
#14 0xfe123568 in start_kernel () at init/main.c:453
#15 0xfe123568 in start_kernel () at init/main.c:453
#16 0xfe123568 in start_kernel () at init/main.c:453
#17 0xfe123568 in start_kernel () at init/main.c:453
#18 0xfe123568 in start_kernel () at init/main.c:453
#19 0xfe123568 in start_kernel () at init/main.c:453
#20 0xfe123568 in start_kernel () at init/main.c:453
#21 0xfe123568 in start_kernel () at init/main.c:453
#22 0xfe123568 in start_kernel () at init/main.c:453
#23 0xfe123568 in start_kernel () at init/main.c:453
#24 0xfe123568 in start_kernel () at init/main.c:453
#25 0xfe123568 in start_kernel () at init/main.c:453
#26 0xfe123568 in start_kernel () at init/main.c:453
#27 0xfe123568 in start_kernel () at init/main.c:453
#28 0xfe123568 in start_kernel () at init/main.c:453
#29 0xfe123568 in start_kernel () at init/main.c:453
#30 0xfe123568 in start_kernel () at init/main.c:453
#31 0xfe123568 in start_kernel () at init/main.c:453
#32 0xfe123568 in start_kernel () at init/main.c:453
#33 0xfe123568 in start_kernel () at init/main.c:453
#34 0xfe123568 in start_kernel () at init/main.c:453
#35 0xfe123568 in start_kernel () at init/main.c:453
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) frame 3
#3  0xfe127364 in kmem_cache_sizes_init () at slab.c:495
495                     if (!(sizes->cs_cachep =
(gdb) l
490                      * This should be particularly beneficial on SMP boxes, as it
491                      * eliminates "false sharing".
492                      * Note for systems short on memory removing the alignment will
493                      * allow tighter packing of the smaller caches. */
494                     snprintf(name, sizeof(name), "size-%lu",(unsigned long)sizes->cs_size);
495                     if (!(sizes->cs_cachep =
496                             kmem_cache_create(name, sizes->cs_size,
497                                             0, SLAB_HWCACHE_ALIGN, NULL, NULL))) {
498                             BUG();
499                     }
(gdb) p sizes->cs_size
$1 = 64

-------- gdb session output ends here. ----------------

As you can see, the caller passes '64' and '0' for 'size' and 'offset',
respectively. However, the values that 'kmem_cache_create()' receives for
both of these arguments (frame #2 above) are nonsense.

By the way, do you know why the stack trace is filled with repeated calls
to 'start_kernel()'?
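
For reference, here is a minimal host-side sketch that views the received
values in hex and compares them with the SDRAM window. The values come from
frame #2 of the backtrace and the window from the opb_sdram_0 entry in the
XMD address map; the helper names 'in_sdram' and 'show_arg' are hypothetical
and not part of any tool. It diagnoses nothing by itself; it only makes the
values easier to read.

```c
#include <stdint.h>
#include <stdio.h>

/* SDRAM window copied from the XMD address map (opb_sdram_0). */
#define SDRAM_BASE 0xfe000000u
#define SDRAM_END  0xfeffffffu

/* Hypothetical helper: returns 1 if v lies inside the SDRAM window. */
int in_sdram(uint32_t v)
{
    return v >= SDRAM_BASE && v <= SDRAM_END;
}

/* Hypothetical helper: print one received argument in hex, plus whether
 * it falls inside the SDRAM window. */
void show_arg(const char *label, uint32_t v)
{
    printf("%-6s = 0x%08lx  in SDRAM: %s\n",
           label, (unsigned long)v, in_sdram(v) ? "yes" : "no");
}
```

Calling 'show_arg()' on 4262789612, 4262690804 and 4262559768 gives
0xfe1501ec, 0xfe137ff4 and 0xfe118018. All three lie inside
0xfe000000-0xfeffffff, so the received values look like SDRAM addresses
rather than small integers such as 64 and 0.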

Here's a copy of the Windows command terminal, in which XMD was run, for
reference:

-------- XMD terminal output begins here. -----------------

Xilinx Microprocessor Debug (XMD) Engine
Xilinx EDK 6.3 Build EDK_Gmm.12.3
Copyright (c) 1995-2004 Xilinx, Inc.  All rights reserved.

XMD%
Loading XMP File..
Loading MHS File..
Processor(s) in System ::

Microblaze(1) : microblaze_0
Address Map for Processor microblaze_0
  (0x00000000-0x00001fff) dlmb_cntlr    dlmb
  (0x00000000-0x00001fff) ilmb_cntlr    ilmb
  (0xc0000000-0xc0003fff) ethernet      mb_opb
  (0xfe000000-0xfeffffff) opb_sdram_0   mb_opb
  (0xff800000-0xffbfffff) sram_flash    mb_opb
  (0xffff0000-0xffff01ff) sram_flash    mb_opb
  (0xffff1000-0xffff10ff) system_timer  mb_opb
  (0xffff2000-0xffff20ff) console_uart  mb_opb
  (0xffff3000-0xffff30ff) system_intc   mb_opb
  (0xffff5000-0xffff50ff) system_gpio   mb_opb
  (0xffffc000-0xffffc0ff) debug_module  mb_opb

Loading MSS File..
Executing Connect Cmd: connect mb mdm -cable type xilinx_parallel port LPT1 -debugdevice cpunr 1 -pfsl port 0 type s
Connecting to cable (Parallel Port - LPT1).
Checking cable driver.
 Driver windrvr6.sys version = 6.2.2.2. LPT base address = 03BCh.
 ECP base address = 07BCh.
 ECP hardware is detected.
Cable connection established.
Connecting to cable (Parallel Port - LPT1) in ECP mode.
Checking cable driver.
 Driver xpc4drvr.sys version = 1.0.3.0. LPT base address = 03BCh.
 Cable Type = 1, Revision = 3.
Cable connection established.

JTAG chain configuration
--------------------------------------------------
Device   ID Code        IR Length    Part Name
 1       05046093           8        XCF04S
 2       05046093           8        XCF04S
 3       01434093           6        XC3S1500
Assuming, Device No: 3 contains the MicroBlaze system
Connected to the JTAG MicroBlaze Debug Module (MDM)
No of processors = 1

MicroBlaze Processor 1 Configuration :
-------------------------------------
Version............................3.00.a
No of PC Breakpoints...............2
No of Read Addr/Data Watchpoints...1
No of Write Addr/Data Watchpoints..1
Instruction Cache Support..........off
Data Cache Support.................off
MBsfsl(0)-MDMmfsl(0) Connected..........Yes
JTAG MDM Connected to MicroBlaze 1
Connected to "mb" target. id = 0
Starting GDB server for "mb" target (id = 0) at TCP port no 1234
INFO:EDK - MHS File already loaded

Processor (1) Already Connected


XMD% rst
Target reset successfully
XMD% dow image.elf
        section, .text: 0xfe000000-0xfe114ed0
        section, .intv: 0xfe114ed0-0xfe114f08
        section, .init: 0xfe123000-0xfe12e000
Checking if Program I-Side Memory within Address Range....PASSED

        section, .sdata2: 0xfe114f08-0xfe116a78
        section, .data: 0xfe116a78-0xfe122020
        section, .bss: 0xfe12e000-0xfe14f4bc
        section, .romfs: 0xfe12e000-0xfe1e2000
Checking if Program D-Side Memory within Address Range....PASSED

Downloaded Program image.elf
Setting PC with program start addr = 0xfe000000
XMD%

>>>>>>> Note) gdb 'target remote' command was executed now. <<<<<<<<<

Accepted a new GDB connection from 192.168.78.128 on port 32773

-------- XMD terminal output ends here. -----------------


Thanks, again, for the help, John,

David Banas
Field Applications Engineer
Nu Horizons Electronics Corp.
2070 Ringwood Avenue
San Jose, CA 95131
(408)434-0800 - office
(415)846-5837 - cell
http://www.nuhorizons.com
-----Original Message-----
From: owner-microblaze-uclinux@itee.uq.edu.au
[mailto:owner-microblaze-uclinux@itee.uq.edu.au] On Behalf Of John Williams
Sent: Friday, March 11, 2005 5:33 PM
To: microblaze-uclinux@itee.uq.edu.au
Subject: Re: [microblaze-uclinux] Bug at line 497 of mmnommu/slab.c?

Hi David,

David Banas wrote:

> I think I just found a bug at line 497 of mmnommu/slab.c:
>    "0" -> "0L"
> 
> It was preventing my kernel from booting during the 'kmem_cache_init()'
> phase by giving a bogus cache size to a lower level function, which was, in
> turn, bailing out to 'machine_halt' after being asked to create a cache of
> over 3 billion bytes.

The compiler should promote the 0 to 0L automagically.  What was the 
bogus value passed to the lower level function?
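
A minimal sketch of that point, assuming a prototype is in scope at the call
site: standard C converts each argument to the declared parameter type, so a
plain 0 and 0L arrive as the same value. 'echo_offset' is a hypothetical
stand-in, not the kernel function, and its 'size_t' parameters are an
assumption about how 'size' and 'offset' are declared.

```c
#include <stddef.h>

/* Hypothetical stand-in for a callee with size_t parameters (assumed to
 * resemble the 'size' and 'offset' arguments of kmem_cache_create). */
size_t echo_offset(const char *name, size_t size, size_t offset)
{
    (void)name;
    (void)size;
    return offset;   /* hand back what the callee actually received */
}

/* With the definition above visible, both literals are converted to
 * size_t before the call, so the callee sees the same value either way. */
int literal_forms_agree(void)
{
    return echo_offset("size-64", 64, 0) == echo_offset("size-64", 64, 0L);
}
```

If 'literal_forms_agree()' returns 1 on a given toolchain, then changing "0"
to "0L" at a prototyped call should not by itself change what the callee
receives.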

> What's curious is this: the fix doesn't appear in the current CVS archive.
> So, my question is, how is anyone running successfully with this bug
> present?

That code is stable and works across all nommu arches, microblaze 
included.  I'm more inclined to suspect something specific to your setup.

When did you last update your kernel sources (not that it should matter 
in this instance)?

What compiler version are you using?

Also, can you run mb-objdump -S over the slab.o file, and see what sort 
of assembly is being generated?

Thanks,

John




___________________________
microblaze-uclinux mailing list
microblaze-uclinux@itee.uq.edu.au
Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
Mailing List Archive :
http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/



Attachment: uClinux_auto_6_30_b.zip
Description: Zip compressed data