RE: [microblaze-uclinux] Kernel BRAM usage
Very interesting!
> -----Original Message-----
> From: owner-microblaze-uclinux@xxxxxxxxxxxxxx [mailto:owner-microblaze-uclinux@xxxxxxxxxxxxxx] On Behalf Of John Williams
> Sent: Monday, April 07, 2008 3:00 AM
> To: microblaze-uclinux@xxxxxxxxxxxxxx
> Subject: [microblaze-uclinux] Kernel BRAM usage
>
>
> Question: What do you do with all that BRAM after FS-boot is finished
> with it?
>
> Answer: Use it for time-critical kernel functionality, of course!
>
> Attached patch should apply cleanly to a recent petalinux-v0.20 era
> kernel (not the MMU test kernel yet, sorry). Apply at patchlevel -p0
> from the linux-2.6.x directory.
>
> It creates a few new kernel config menu options under "processor type
> and features":
>
> [*] Allow placing code/data in BRAM
> [ ] Place interrupt entry path in BRAM
> [ ] Place low level signal handling and delivery path in BRAM
> [ ] Place system call entry path and related routines in BRAM
> [ ] Place cache flush code in BRAM
> [ ] Place exception handling code in BRAM
> [ ] Place kernel FASTCALL symbols in BRAM
>
> These options should be fairly self-explanatory. The "cache flush"
> option is a good one to try. Because of the MicroBlaze architecture, we
> have to disable caches while flushing them. Flushing is done in a loop,
> so we run a big loop with the caches off, which is very slow. Moving the
> flush code to BRAM speeds things up nicely, and gives a noticeably
> "snappier" feel to application loading (the main client of the cache
> invalidation API).
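
A minimal host-side sketch of the flush pattern described above, assuming a power-of-two cache line size. The names cache_disable, cache_enable, invalidate_line and flush_range_sketch are hypothetical stand-ins, not the kernel's actual routines; on real hardware the stubs would be the MicroBlaze cache-control operations. It only illustrates the shape of the problem: the whole loop runs with the caches off, so each pass gets no help from the instruction cache unless the code sits in BRAM.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side stand-ins for the real cache controls. */
static unsigned lines_invalidated;
static int cache_enabled = 1;

static void cache_disable(void) { cache_enabled = 0; }
static void cache_enable(void)  { cache_enabled = 1; }

/* On hardware this would invalidate one cache line; here it only counts. */
static void invalidate_line(uintptr_t addr)
{
    (void)addr;
    lines_invalidated++;
}

/*
 * Invalidate every line overlapping [start, end).
 * Assumes line_size is a power of two.
 * Returns the number of lines touched, i.e. the number of loop
 * iterations executed while the caches were disabled.
 */
unsigned flush_range_sketch(uintptr_t start, uintptr_t end, size_t line_size)
{
    uintptr_t addr = start & ~((uintptr_t)line_size - 1); /* align down */

    lines_invalidated = 0;
    cache_disable();               /* caches stay off for the whole loop */
    for (; addr < end; addr += line_size)
        invalidate_line(addr);
    cache_enable();

    return lines_invalidated;
}
```

The iteration count grows with the size of the range being flushed, which fits the observation that application loading is where the difference is felt.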
>
> The exception handling code should also have a nice effect if you are
> using unaligned exceptions (and causing them!).
>
> Basically, anything asynchronous that runs once - and is therefore
> usually cache-cold - should see a win. Tight loops not so much: the
> subsequent iterations will be cache-hot and there's no benefit to using
> BRAM unless the loop footprint is huge.
>
> Also the last option is an interesting one - a number of functions in
> the kernel are marked as FASTCALL - this is currently used only by i386
> to specify some sort of optimised register-based function call ABI.
> However, by overloading it so that all functions marked FASTCALL get
> loaded into BRAM, we again should see a bit of a speedup. This option
> will use at least 16 Kbyte of BRAM, so if you don't have enough room
> you'll get link errors.
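
One way such an overload could look, as a sketch only. The section name .bram.text, the helper macro BRAM_TEXT and the function demo_wake are assumptions made for illustration; the macro bodies in the actual patch are not shown in this post. The idea is that wherever FASTCALL or fastcall would otherwise expand to nothing, it attaches a section attribute instead, and the linker script is then expected to map that section into BRAM.

```c
/* Section attributes with this naming only make sense on ELF targets. */
#ifdef __ELF__
#define BRAM_TEXT __attribute__((section(".bram.text"))) /* assumed name */
#else
#define BRAM_TEXT
#endif

/* Overloaded markers: tag the symbol for BRAM placement instead of
 * selecting a register-based calling convention. */
#define FASTCALL(x) x BRAM_TEXT
#define fastcall    BRAM_TEXT

/* Prototype and definition of a hypothetical FASTCALL function. */
FASTCALL(int demo_wake(int nr));

int fastcall demo_wake(int nr)
{
    return nr + 1;
}
```

Because every tagged function lands in the same section, the total size is fixed by how many FASTCALL functions the kernel contains, which is why an undersized BRAM shows up as a link error rather than a runtime problem.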
>
> I haven't benchmarked it thoroughly but earlier work on a related
> patchset for the 2.4 kernel did show improved interrupt latencies when
> the IRQ handling was moved to BRAM.
>
> In addition to the standard choices I've created, you can also tag your
> own functions (driver IRQ handlers, for example) and data structures as
> __bram_code__ or __bram_data__ respectively and they too will receive
> the magic treatment.
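
A rough illustration of what tagging your own code might look like. The macro definitions below are guesses (GCC section attributes with made-up section names); in the patched kernel the real __bram_code__ and __bram_data__ definitions would come from the patch's headers, so you would not define them yourself. demo_irq_count, demo_irq_handler and demo_irq_total are hypothetical names.

```c
#include <stdint.h>

/* Assumed definitions, for illustration only. */
#ifdef __ELF__
#define __bram_code__ __attribute__((section(".bram.text")))
#define __bram_data__ __attribute__((section(".bram.data")))
#else
#define __bram_code__
#define __bram_data__
#endif

/* Data touched on every interrupt: candidate for BRAM placement. */
static __bram_data__ volatile uint32_t demo_irq_count = 0;

/* Hypothetical driver interrupt handler placed in BRAM. */
__bram_code__ int demo_irq_handler(int irq)
{
    (void)irq;
    demo_irq_count++;
    return 1; /* report the interrupt as handled */
}

/* Ordinary accessor, left in normal memory. */
uint32_t demo_irq_total(void)
{
    return demo_irq_count;
}
```

Keeping the tagged set small (the handler and the data it touches, not the whole driver) seems the sensible default, given how limited BRAM is.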
>
> I'll fold this into the next PetaLinux release but any experience
> reports before then will be greatly appreciated.
>
> Cheers,
>
> John
>
___________________________
microblaze-uclinux mailing list
microblaze-uclinux@xxxxxxxxxxxxxx
Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
Mailing List Archive : http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/