author | John Rigby <john.rigby@linaro.org> | 2012-03-27 09:36:17 -0600 |
---|---|---|
committer | John Rigby <john.rigby@linaro.org> | 2012-03-27 09:36:17 -0600 |
commit | 891a674c57a0a1e4a2b871d5b6e874857e977ef5 (patch) | |
tree | 4cdc25fb8fe7352086da9a16cc1a9555a2d1f400 /linaro/arm-virt-bl/docs | |
parent | f0d8629d72fea2d76060d97c2f463e17c67ec844 (diff) |
Add linaro directory which contains the bootwrapper
Signed-off-by: John Rigby <john.rigby@linaro.org>
Diffstat (limited to 'linaro/arm-virt-bl/docs')
-rw-r--r-- | linaro/arm-virt-bl/docs/01-Usage.txt | 90 | ||||
-rw-r--r-- | linaro/arm-virt-bl/docs/02-Code-layout.txt | 563 | ||||
-rw-r--r-- | linaro/arm-virt-bl/docs/03-Linux-kernel-build.txt | 58 | ||||
-rw-r--r-- | linaro/arm-virt-bl/docs/04-Cache-hit-rate-howto.txt | 198 | ||||
-rw-r--r-- | linaro/arm-virt-bl/docs/05-FAQ.txt | 20 | ||||
-rw-r--r-- | linaro/arm-virt-bl/docs/06-Optional-rootfs-build.txt | 122 |
6 files changed, 1051 insertions, 0 deletions
diff --git a/linaro/arm-virt-bl/docs/01-Usage.txt b/linaro/arm-virt-bl/docs/01-Usage.txt new file mode 100644 index 0000000..39d3b7f --- /dev/null +++ b/linaro/arm-virt-bl/docs/01-Usage.txt @@ -0,0 +1,90 @@ +Usage +===== + +1. Requirements/Pre-requisites + + 1. A Linux development environment. This release has been + build tested on the following Linux host environments: + + 1. Linux Ubuntu 10.10 + 2. Red Hat Enterprise Linux WS release 4 (Nahant Update 4) + + This release is not intended to be used on development + environments other than Linux. + + 2. An installation of the ARM RealView Development Suite. This + release was built and tested with version 4.1 [Build 514]. + + 3. An installation of the Perl scripting language. This release + was built and tested with v5.10.1. + + 4. An installation of the GNU coreutils suite + <http://www.gnu.org/software/coreutils/>. This release was + built and tested with v8.5. + +2. Build instructions + + Note that this release relies on the 'env' utility, which is + a part of the coreutils suite. The 'env' utility is expected + to be located at '/bin/env'. If it is at a different + location then this must be reflected in the first line of + the following file: + + arm-virtualizer-v2_2-160212/bootwrapper/makemap + + Failure to make this modification will result in a build + failure. + + To build the software: + + $ tar -jxf arm-virtualizer-v2_2-160212.tar.bz2 + $ cd arm-virtualizer-v2_2-160212/bootwrapper + $ make clean && make + + The resulting file is 'img.axf'. + + This image may be loaded and executed on the model debugger + as explained in section 3 below. + + Note that the pre-built stub kernel image is located at: + + arm-virtualizer-v2_2-160212/bootwrapper/payload/kernel + + .. 
and the placeholder dummy root filesystem image is located + at: + + arm-virtualizer-v2_2-160212/bootwrapper/payload/fsimg + + These may be replaced with custom built images such as a + suitably configured linux kernel image and a root filesystem + image. + + Look at docs/03-Linux-kernel-build.txt for instructions on + building a suitable Linux kernel. + + Look at docs/06-Optional-rootfs-build.txt for optionally + building a complete root filesystem. + +3. Usage + + If the Real-Time System Model v7.0.1 (RTSM_VE_Cortex_A15x1_A7x1 + and RTSM_VE_Cortex_A15x4_A7x4) is installed, the resulting + img.axf file may be loaded, executed and debugged on the + model and the associated model debugger. + + This model may be obtained from ARM by separate arrangement. + + Steps to run the software: + + a. Depending upon whether the MPx1 or MPx4 model is being used, + update the big-little-mp<x>.mxscript file (x is 1 or + 4 as the case may be) with the absolute + path to the model and the img.axf file. (Comments in the + file indicate where the changes have to be made) + + b. Invoke the modeldebugger and the script file as follows: + + $ <path to modeldebugger> -s <path to big-little-mp<x>.mxscript> + + The default build simultaneously switches clusters + every 12 million cycles (appx). diff --git a/linaro/arm-virt-bl/docs/02-Code-layout.txt b/linaro/arm-virt-bl/docs/02-Code-layout.txt new file mode 100644 index 0000000..ba1690e --- /dev/null +++ b/linaro/arm-virt-bl/docs/02-Code-layout.txt @@ -0,0 +1,563 @@ +Code layout +=========== + +A Introduction + + The software contained in the 'bootwrapper' directory allows + the execution of a software payload e.g. a Linux stack to + alternate between two multi-core clusters of ARM Cortex-A15 + & Cortex-A7 processors connected by a coherent + interconnect. To achieve this aim it provides the ability + to: + + 1. 
Save the processor context on one cluster (henceforth + called the outbound cluster) and restore it on the other + cluster (henceforth called the inbound cluster). + + 2. Hide any software visible microarchitectural differences + between the Cortex-A15 & Cortex-A7 processors. + + 3. Use the ARM Virtualization Extensions to perform 1. and 2. + in a payload software agnostic manner. + + This software is intended to be executed on the Real-Time + System Model v7.0.1 (RTSM_VE_Cortex_A15x1_A7x1 and + RTSM_VE_Cortex_A15x4_A7x4). + + In addition to switching the payload software execution + between the two clusters, the software also contains support + for executing the payload software simultaneously on the two + clusters. + + This is called the MP configuration. In its current state, + it mainly involves making the payload software believe that + the A15 cluster includes the cpus present on the Cortex-A7 cluster + i.e. there is one cluster with more cpus than there + physically are. [Note that MP support is highly experimental + and unstable. It is NOT the focus of this release and is + intended for purely informational purposes. The cluster + switching mode of operation remains the focus of this + release.] + + The Virtualizer software needs initialization prior to being + used to perform any of the above functions. The + initialization needs to be done before the payload software + is executed. Hence, it makes sense to do this from the + existing boot firmware being used on the platform. The code + in the 'bootwrapper' directory is a bare-minimal bootloader + that: + + 1. Sets up the environment for execution of the payload + software in the Non-secure world by programming the + appropriate coprocessor and memory mapped peripheral + registers from the Secure world. + + 2. Invokes the entry point of the Virtualizer software + (bl_setup()) which does the necessary initialization. + + 3. Passes control to the payload software in the Non-secure + world. 
+ +B Code layout overview + + 1. bootwrapper/ + + Apart from containing the bootloader, this directory + also contains scatter files to load the bootloader, + Virtualizer and the payload software correctly on the + target platform as a single ELF file (img.axf). + + The important files here are: + + 1. vectors.S + + 1. Implements the Secure world exception vectors + which are loaded to the base of physical memory + (0x80000000) at reset. + + 2. boot.S + + 1. Handles a power-on reset. + + 2. Initialises the I-Cache, sets up the stack & + passes control to the C handler for performing + the rest of the initialization. + + 3. c_start.c + + 1. Picks up from where the start() routine left off in + the previous file. + + 2. Programs the exception vector tables for the + Secure world. + + 3. Provides Non-secure access to certain + coprocessor registers and memory mapped + peripherals e.g. access to the cache + coherent interconnect registers, coprocessors + etc. + + 4. Enables functionality which can be initialised + only in the Secure world e.g. configuration of + interrupts as Non-secure. + + 5. Synchronises execution with the secondary cpus + (if present) so that any global peripheral is + accessed by them only after the primary has + initialised it. + + 6. Enters the non-secure HYP mode and initialises + the Virtualizer. + + 7. Enters the non-secure SVC mode and jumps to the + payload software entry point. + + 4. payload/ + + 1. Contains two files 'fsimg' and 'kernel'. + + 2. The 'kernel' is a raw Linux kernel binary image. + The instructions to build this Linux image can + be found in docs/03-Linux-kernel-build.txt. + This image can be replaced with a raw binary + image of any other software payload which is + desired to be run on this system. + + 3. The 'fsimg' is an empty filesystem stub. If + desired, it can be replaced with a suitable + filesystem image in a Linux initramfs format. A + custom busybox filesystem was used for testing. 
+ More complex filesystems may be used if needed + but will require the use of MMC emulation with + the ARM FastModels. + See docs/06-Optional-rootfs-build.txt for + details. + + 5. boot.map.template + + 1. Scatter file which combines the payload + software, Virtualizer and the bootloader into a + single ELF file (img.axf) which can + then be loaded on the relevant platform. + + 6. makemap + + 1. Simple perl script that takes an ELF image of + the Virtualizer, parses through the relevant + sections & adds those sections to the scatter + file so that a consolidated image can be + created. + + 2. big-little/common + + This directory mainly deals with setting up of the HYP + processor mode and the Virtual GIC. This allows the + payload software to run unmodified while either the + Switching or the MP mode is active in the background. + + The important files here are: + + 1. hyp_vectors.s + + 1. Implements the HYP mode vector table. + + 2. It contains the entry point "bl_setup()" which + is invoked by the bootwrapper to initialise the + Virtualizer software. + + 3. The exception vector for interrupts + [irq_entry()] is the entry point for all + physical interrupts. The exception vector for + hypervisor traps [hvc_entry()] is the entry + point for all accesses made by the payload + software that need to be handled in the HYP + mode. + + 4. Also contained is rudimentary support for fault + exception handlers [dabt_entry(), iabt_entry() & + undef_entry()]. + + 2. hyp_setup.c + + 1. Extends the initialization of the Virtualizer + software into C code after a cold reset. + + 2. If switching is being done asynchronously then + the HYP timer interrupt is setup to periodically + (~12 million instructions) trigger a switchover + to the other cluster. + + 3. If in MP mode, then CCI snoops are enabled for + both the clusters. + + 3. vgic_handle.c + + 1. Extends handling of physical interrupts into C + code from irq_entry(). 
Interrupts are + acknowledged (optionally EOI'ed) and queued as + virtual interrupts. The HYP timer interrupt is + handled differently. When received, it is used as + a trigger to initiate the switchover process. + + 4. vgiclib.c + + 1. Implements handling of virtual interrupts once + they have been queued up in the vGIC HYP view + list registers. It maintains the list registers + and also saves and restores the context of the + vGIC HYP view interface. + + 5. pagetable_setup.c + + 1. Creates and sets up the HYP mode and 2nd stage + translation page tables. Accesses by the payload + software to the vGIC physical cpu interface are + mapped to the vGIC virtual cpu interface using + the 2nd stage translation page tables. + + 2. In the MP configuration, the translation tables + are shared by all the cpus in the two clusters. + Hence the first cpu in only one of the clusters + creates them. + + 6. vgic_setup.c + + 1. Enables virtual interrupts and exceptions, + initialises the physical cpu interface and the + HYP view interface. + + 3. big-little/lib + + This directory implements common functionality that is + used across all the Virtualizer code. This includes: + + 1. Locks which can be used with Strongly Ordered and + Device memory. + + 2. Code tracing support on the Fast Models platform + through the use of memory mapped TUBE registers & + the Generic Trace plugin. + Details of this feature can be found in + docs/04-Cache-hit-rate-howto.txt. + + 3. Events to synchronise the switching process between + the clusters and within the clusters. They are also used + to synchronise the setup phase after a cold reset in + the MP configuration. + + 4. UART routines to support semihosting of the + printf family of functions. + + 5. Cache maintenance, Stack manipulation and Locking + routines. + + 4. big-little/include + + 1. This directory contains the headers specific to HYP + mode setup, Switching process and common helper + routines. 
Most importantly, context.h contains the + data structures which are used to save and restore + the processor context. + + 5. big-little/switcher + + This directory implements code to save and restore + processor context and to initiate/handle an + async/synchronous switchover request. + + 1. context/ + + 1. ns_context.c + + 1. Contains top level routines to save and + restore the Non-secure world context. + + 2. It requests the secure world to save its own + context and bring the inbound cluster out of + reset. It also uses events to synchronise + the switching process between the inbound + and outbound clusters. + + 2. gic.c + + 1. Contains routines to save and restore the + context of the vGIC physical distributor and + cpu interfaces. + + 3. sh_vgic.c + + 1. The two clusters share the interrupt + controller instead of each cluster having + its own. A consequence of this is that there + is no longer a 1 to 1 mapping between cpu + ids and cpu interface ids e.g. on an + MPx1+MPx1 cluster configuration, + cpu0 of the Cortex-A7 cluster would + correspond to cpuinterface1 on the shared + vGIC. This in turn affects routing of + peripheral and software generated + interrupts. This file implements code to + allow use of the shared vGIC correctly + keeping this limitation in mind. + + 2. trigger/ + + 1. async_switchover.c + + 1. Contains code to use the HYP timer interrupt + as a trigger to initiate a switchover + asynchronously. + + 2. sync_switchover.c + + 1. Contains code to handle HVC instructions + executed by the payload software: + + a. to initiate a synchronous switchover. + ("HVC #1") + + b. to find the id of the cluster on which it is + currently executing. ("HVC #2") + + 3. handle_switchover.s + + 1. Contains code to start saving the non-secure + world context and request the secure world to + power down the outbound cluster once the + inbound cluster is up and running. + + 6. 
big-little/virtualisor + + This directory implements code that, using the ARM + Virtualization extensions: + + 1. Hides any microarchitectural differences between the + Cortex-A15 & Cortex-A7 processors visible to the + payload software. + + 2. Provides a different view of the underlying hardware + than what really exists e.g. in the switching mode + it traps accesses made by the host cluster + (Cortex-A7 cluster currently) to the shared vGIC + physical distributor interface, so that routing of + interrupts can take place correctly. In the MP mode, + the L2 control and MPIDR registers are virtualized + to tell the payload software that there is one + cluster with multiple processors instead of two. + + The ARM Virtualization extensions provide a set of trap + registers (HCPTR (Hyp Coprocessor Trap Register), HSTR + (Hyp System Trap Register), HDCR (Hyp Debug + Configuration Register)) to be able to select what + accesses made by the payload software to the coprocessor + block will be trapped in the HYP mode. + + Accesses to memory mapped peripherals e.g. shared vGIC + can be trapped into the HYP mode by populating + appropriate entries in the 2nd stage translation tables. + This is how microarchitectural differences between the + two processor sets are resolved. + + Whenever a trap into HYP mode is taken, the HSR (Hyp + Syndrome Register) contains enough information about the + type of trap taken for the software to take appropriate + action. + + The Virtualizer design centres around the traps + recognized by the HSR. Also, to deal with + microarchitectural differences the concept of a HOST + cluster is introduced. It is possible for each + cpu to find out the system topology using the Kingfisher + System Control Block. Once it knows the host cluster id + & whether the software is expected to switch execution + or run in the MP mode (provided at compile time), the + CPU can configure itself accordingly. 
+ + The processor cluster which the payload software has + been built to run on [assumed to be Cortex-A15 for this + release] is termed the TARGET while the cluster on + which the differences are expected to crop up is called + the HOST (assumed to be Cortex-A7 for this release). + The HOST environment variable is used to specify + the host cluster. The target cluster is assumed to be + the logical complement of the host i.e. cluster ids can + only take the values of 0 and 1. + + The HOST processor emulates the TARGET processor by + trapping the accesses to differing processor features + into the HYP mode. Most of the microarchitectural + differences & registers that need to be virtualized are + handled in a generic (CPU independent) layer of + code. Additionally, each processor exports functions to + setup, handle & optionally save/restore context of each + trap that the HSR recognises. These handlers are invoked + whenever the software runs + on that processor. + + 1. virt_setup.c + + 1. Generic function that initialises the required + traps. This is done once each on both the host + and target clusters if the trap handler needs + to obtain some information about the target + cluster to be able to work correctly e.g. the + Cortex-A7 processor cluster needs to find out + the cache geometry of the Cortex-A15 + processor cluster to be able to handle cache + maintenance operations by set/way correctly. This + function further calls any setup function that + has been exported by the processor the code is + executing on. + + 2. virt_handle.c + + 1. Generic function that extends the hvc_entry() + routine to C code. It calls the generic trap + handler (if registered) and then any trap + handlers exported by the processor on + which the trap has been invoked. + + 3. virt_context.c + + 1. Generic function that saves and restores traps + on the host cluster & then calls any + save/restore function that has been exported by + the processor the code is executing on. 
+ + 4. cache_geom.c + + 1. Generic function that detects cache geometries + on the host and target clusters & then maps + cache maintenance operations by set/way from the + target to the host cache. + + 5. mem_trap.c + + 1. Generic function that sets up any memory traps + by editing the 2nd stage translation tables. + + 6. vgic_trap_handler.c + + 1. Generic function that handles trapped accesses + to the shared vGIC. + + 7. include/ + + Header files specific to the Virtualisor code. + + 8. cpus/ + + Placeholders for any traps that the Cortex-A7 or A15 processor + cluster might want to setup. No traps need to be setup + at the moment. + + 9. big-little/secure_world + + Since both Cortex-A7 & Cortex-A15 processors support ARM + TrustZone Security Extensions, there is certain context + that needs to be setup, saved & restored in the Secure + world. + + This context allows access to certain coprocessor and + peripheral registers to the Non-secure world. It also + configures the shared vGIC for use by the Non-secure + world. + + Execution shifts to the Secure world through the SMC + instruction which is a part of the ARMv7 ISA. + + 1. monmode_vectors.s + + 1. Implements the monitor mode vector table. It + contains the secure entry point [do_smc()] for + the SMC instruction along with rudimentary + support for other fault exceptions taken while + executing in the secure world. + + 2. Three types of SMC exceptions are expected (type + of exception is contained in r0): + + 1. SMC_SEC_INIT + + Called once after a power on reset to + initialise the Secure world stacks, + coherency, pagetables, and to configure some + coprocessor and memory mapped peripheral + (Coherent interconnect & shared vGIC) + registers for use of these features by + the Non-secure world. + + 2. SMC_SEC_SAVE + + Called from ns_context.c to request the + secure world to save its context and bring + the corresponding core in the inbound + cluster out of reset so that it can start + restoring the saved state. 
+ + 3. SMC_SEC_SHUTDOWN + + Called from handle_switchover.s to request + the secure world to flush the L1 and L2 caches + and power down the outbound cluster. + + Also implemented is a function to handle warm + resets on the inbound cluster. Bare-minimal + context is initialised while the rest is restored + before control is passed to the Non-secure world + handler for restoring context [restore_context()] + in ns_context.c. + + 2. secure_context.c + + Implements code to save and restore the secure world + context. + + 3. secure_resets.c + + Implements code to power down the outbound cluster + and bring individual cores in the inbound cluster + out of reset. + + 4. ve_reset_handler.s + + Base of physical memory in the Versatile Express + memory map is at 0x80000000. The processors are + brought out of reset at 0x0 which points to Secure + RAM/Flash memory. This file implements a small stub + function that is placed at 0x0 so that execution + jumps to 0x80000000 after a cold reset and to the + warm_reset() handler in monmode_vectors.s + after a warm reset. + + The secure world code is built into a separate ELF image + to maintain its distinction from the Virtualizer code + that executes in the Non-secure world. + + 10. big-little/bl.scf.template + + 1. Scatter file that is used to build the Non-secure + world code in the Virtualizer software. The + resultant image is bl.axf. + + 11. big-little/bl-sec.scf.template + + 1. Scatter file that is used to build the Secure world + code in the Virtualizer software. The resultant + image is bl_sec.axf. + + 12. acsr/ + + The secure world code is built into a separate ELF image + to maintain its distinction from the Virtualizer code + that executes in the Non-secure world. + + 1. helpers.s + + Helper functions to access the CP15 coprocessor + space. + + 2. v7.s + + Contains routines to save and restore ARM processor + context. 
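Editorial illustration (not part of the commit): the sh_vgic.c entry above notes that cpu ids no longer map 1 to 1 onto cpu interface ids of the shared vGIC, and gives one data point (cpu0 of the Cortex-A7 cluster maps to cpuinterface1 on an MPx1+MPx1 configuration). The Python sketch below shows one mapping that is consistent with that data point. It assumes, without confirmation from the source, that interface ids are allocated contiguously with the Cortex-A15 cpus first; the function name and the 'a15'/'a7' labels are hypothetical.

```python
def shared_vgic_cpu_interface(cluster, cpu, cpus_per_cluster):
    """Return a shared vGIC cpu interface id for (cluster, cpu).

    Assumption (not confirmed by the source): interface ids are handed
    out contiguously, Cortex-A15 cpus first, then Cortex-A7 cpus.
    """
    if cluster not in ("a15", "a7"):
        raise ValueError("cluster must be 'a15' or 'a7'")
    if not 0 <= cpu < cpus_per_cluster:
        raise ValueError("cpu id out of range for this cluster size")
    # The A7 cluster's interfaces start after all A15 interfaces.
    base = 0 if cluster == "a15" else cpus_per_cluster
    return base + cpu
```

Under this assumption, shared_vgic_cpu_interface("a7", 0, 1) gives 1, matching the MPx1+MPx1 example in the text. The real mapping is whatever sh_vgic.c implements.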
diff --git a/linaro/arm-virt-bl/docs/03-Linux-kernel-build.txt b/linaro/arm-virt-bl/docs/03-Linux-kernel-build.txt new file mode 100644 index 0000000..876cce3 --- /dev/null +++ b/linaro/arm-virt-bl/docs/03-Linux-kernel-build.txt @@ -0,0 +1,58 @@ +Building and installing a Linux kernel +====================================== + +A suitable Linux kernel image for use with the virtualizer +can be built as follows (GCC toolchain used for these steps is: +CodeSourcery Sourcery G++ Lite 2010.09 v4.5.1) + +$ tar -jxf arm-virtualizer-v2_2-160212.tar.bz2 +$ cd arm-virtualizer-v2_2-160212/bootwrapper +$ make clean +$ pushd /tmp +$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git arm-platforms.git +$ cd arm-platforms.git +$ git checkout -b ael-11.06 origin/ael-11.06 +$ yes | make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- vexpress-new_defconfig +$ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- -j4 +$ popd +$ cp $OLDPWD/arch/arm/boot/Image payload/kernel + +The virtualizer can now be built as usual by invoking: + +$ make clean && make + +.. in the top bootwrapper directory. + +This will result in a file called img.axf located at +arm-virtualizer-v2_2-160212/bootwrapper/img.axf. + +To launch the ARM FastModel with the virtualizer, first modify +arm-virtualizer-v2_2-160212/bootwrapper/big-little-MP<x>.mxscript +as usual to fill in paths to the model binary and the img.axf files. +The mxscript file is adequately commented to assist with this. 
+ +In case of an MP1 model, we would use the big-little-MP1.mxscript file +and we would specify the path to the model in a manner similar to: + +string model = "/home/working_dir/RTSM_VE_Cortex-A15x1-A7x1"; + +Similarly, in case of an MP4 model, we would use the big-little-MP4.mxscript +and we would specify the path to the model in a manner similar to: + +string model = "/home/working_dir/models/RTSM_VE_Cortex-A15x4-A7x4"; + +The path to the img.axf file is specified using the app directive as +follows: + +string app = "arm-virtualizer-v2_2-160212/bootwrapper/img.axf"; + +The model can then be launched using: + +modeldebugger -s arm-virtualizer-v2_2-160212/bootwrapper/big-little-MP<x>.mxscript + +Where 'x' is 1 or 4 respectively in the case of an MP1 model run or an +MP4 model run. + +This will result in the Linux kernel console messages appearing in the ARM +FastModel UART emulation window. The virtualizer will switch execution +between the two clusters at ~12 million instruction intervals. diff --git a/linaro/arm-virt-bl/docs/04-Cache-hit-rate-howto.txt b/linaro/arm-virt-bl/docs/04-Cache-hit-rate-howto.txt new file mode 100644 index 0000000..27c83b4 --- /dev/null +++ b/linaro/arm-virt-bl/docs/04-Cache-hit-rate-howto.txt @@ -0,0 +1,198 @@ +Cache hit-rate HOWTO +==================== + +A Introduction + + The ARM Fast Models are accompanied with a trace infrastructure + referred to as the Model Trace Interface (MTI). The MTI trace + provides a mechanism to dynamically register to events from the + model. The GenericTrace.so MTI trace plugin provides a number of + trace events whose output can be logged in a simple text file. + The usage of this plugin is given in Section B. + + In this document we will consider how the GenericTrace.so plugin + can be used during a cluster switchover to calculate the number + of cache hits in the outbound cluster L2 cache originating from + the inbound cluster before the outbound L2 is flushed and the + cluster placed in reset. 
+ +B Plugin Usage + + The GenericTrace plugin is loaded using the "--trace-plugin" + parameter in the command line to launch the model. + + A list of trace sources provided by the plugin can be listed as + follows: + + "RTSM_VE_Cortex-A15x1-A7x1 --trace-plugin GenericTrace.so + --parameter TRACE.GenericTrace.trace-sources= " + + A list of parameters supported by the Generic Trace plugin can + be listed as follows: + + "RTSM_VE_Cortex-A15x1-A7x1 --trace-plugin GenericTrace.so -l" + + Some of the interesting parameters are: + + TRACE.GenericTrace.trace-file: The trace file to write into. If + empty will print to console / STDOUT. + + TRACE.GenericTrace.perf-period: Print performance every N + instructions. Since the instruction count and the global counter + have the same value on the Fast Models, this parameter provides + a good approximation of time. + + TRACE.GenericTrace.flush: If set to true then the trace file will be + flushed after every event. + +C Plugin Trace sources + + The GenericTrace plugin provides events which allow each cluster + to trace snoop requests originating from a different cluster that + hit in its caches. For snoops originating from the Cortex-A7 cluster + that hit in the A15 cluster, the event is 'read_for_4_came_from_snoop' + & for the opposite case the event is 'read_for_3_came_from_snoop'. + The numbers '3' & '4' in the name of the trace sources are the ids + of the CCI slave interfaces from where the snoop originated. + + These trace sources are the per-cluster implementation of the + event id '0xA' "(Read data last handshake - data returned + from the cache rather than from downstream)" of the CCI PMU. + Please refer to the "Cache Coherent Interconnect (CCI-400) + Architecture Specification" for further details. + + The plugin also provides the ability to trace code execution through + a memory mapped "tube" interface. 
This interface defines a list of + registers which when written to in a particular sequence and the + 'sw_trace_event' trace source selected during model invocation will + print out the register values in the trace file. + + The "tube" interface defines: + + - Three LE 64 bit registers of arbitrary data that can be + written (and retain their values). + + - A tube-like char register which when written with '\0' + will generate an event with the current state of the + 64-bit registers and with the characters sent to the + device with a unique sequence_id. + + All of these registers are banked and write-only, the trace + event will also output the cluster id and the CPU id. ARM + FastModels implement 1 to 4 TUBE interfaces. Please refer to + Section E for supported interfaces in the current model + release. The memory map of these registers can be found in + big-little/include/misc.h. + + The 'write_trace' function in big-little/lib/tube.c implements the + software sequence to program the tube interface. This function is + called at various points in the switchover process. It prints out a + message which indicates that an event is about to start or has + completed along with the value of the global counter in one of the + 64 bit registers. To enable this functionality, the environment + variable "TUBE" needs to be defined to TRUE prior to code compilation. + +D Putting it all together + + The list of steps to use the above mentioned functionality is: + + 1. Build the Virtualizer code with "TUBE" support. On the + tcsh shell, this is as follows: + + $ setenv TUBE TRUE; make clean && make + + 2. Launch the model with the MTI trace plugin support and a + selection of the right trace sources using a suitable + MXScript file in the 'bootwrapper' directory. + + Once the switchover process starts, the trace file will contain output + that looks like this (not including the comments): + + . + . + . + . 
+ // Lines beginning with "PERFORMANCE" are a result of the value of the + // "TRACE.GenericTrace.perf-period" parameter. This string is printed + // every <value> number of instructions (200 in this case) in the trace + // file. It indicates at what rate is the model executing instructions + // & the number of instructions executed thus far. + PERFORMANCE: 2.8 MIPS (Inst:67216767) + . + . + . + // Lines beginning with "sw_trace_event<x>" are a result of enabling + // "TUBE" support in the code and selecting the "sw_trace_event" source + // while invoking the model. The interpretation of this message is: + // + // <x> : indicates the "TUBE" interface number. + // sequence_id : a unique number assigned to each message + // cluster_and_cpu_id : in the format 0x<cluster id><cpu id>. Each id + // occupies 8 bits. + // data0 : first 64-bit register value. Programmed with + // the value of the global counter. + // data1 : second 64-bit register value. Not used. + // data2 : third 64-bit register value. Not used. + // message : String written to the TUBE register + sw_trace_event2: sequence_id=0x00000001 cluster_and_cpu_id=0x0000 data0=0x000000000401a3dc data1=0x0000000000000000 data2=0x0000000000000000 message="Secure Coherency Enable Start":30 + . + . + . + PERFORMANCE: 0.2 MIPS (Inst:67217079) + sw_trace_event2: sequence_id=0x00000002 cluster_and_cpu_id=0x0000 data0=0x000000000401a581 data1=0x0000000000000000 data2=0x0000000000000000 message="Secure Coherency Enable End":28 + PERFORMANCE: 0.9 MIPS (Inst:67217301) + PERFORMANCE: 5.8 MIPS (Inst:67217511) + . + . + . + // Lines beginning with "read_for_<x>_came_from_snoop" are a result of + // enabling the event sources for monitoring the cache hits resulting + // from snoops originating from master interface <x> on the CCI. + // The following line indicates that a snoop from the Cortex-A7 cluster + // hit in the caches of the A15 cluster. 
It also prints the cache line + // address and whether the access was Secure or Non-secure. + read_for_4_came_from_snoop: Bus address=0x000000008ff02440 Is non secure=N + read_for_4_came_from_snoop: Bus address=0x000000008ff02440 Is non secure=N + read_for_4_came_from_snoop: Bus address=0x000000008ff02240 Is non secure=N + read_for_4_came_from_snoop: Bus address=0x000000008ff02240 Is non secure=N + read_for_4_came_from_snoop: Bus address=0x000000008ff012c0 Is non secure=N + PERFORMANCE: 0.0 MIPS (Inst:135292834) + sw_trace_event: sequence_id=0x00000010 cluster_and_cpu_id=0x0000 data0=0x000000000810672e data1=0x0000000000000000 data2=0x0000000000000000 message="L2 Flush Begin":15 + PERFORMANCE: 5.5 MIPS (Inst:135293056) + PERFORMANCE: 7.2 MIPS (Inst:135293374) + PERFORMANCE: 7.4 MIPS (Inst:135293587) + PERFORMANCE: 12.4 MIPS (Inst:135293800) + PERFORMANCE: 10.0 MIPS (Inst:135294118) + read_for_4_came_from_snoop: Bus address=0x0000000080054a80 Is non secure=Y + read_for_4_came_from_snoop: Bus address=0x0000000080054a80 Is non secure=Y + read_for_4_came_from_snoop: Bus address=0x0000000080054ac0 Is non secure=Y + read_for_4_came_from_snoop: Bus address=0x0000000080054ac0 Is non secure=Y + read_for_4_came_from_snoop: Bus address=0x0000000080074c80 Is non secure=Y + PERFORMANCE: 0.5 MIPS (Inst:135294331) + . + . + . + . + PERFORMANCE: 10.5 MIPS (Inst:135541612) + PERFORMANCE: 3.3 MIPS (Inst:135541929) + sw_trace_event: sequence_id=0x00000011 cluster_and_cpu_id=0x0000 data0=0x0000000008143442 data1=0x0000000000000000 data2=0x0000000000000000 message="L2 Flush End":13 + . + . + . + . + + Post-processing scripts can be developed which count the number of + 'read_for_<x>_came_from_snoop' events between two 'sw_trace_event<x>' + events. In the above example, the result will be the number of snoop + hits in the A15 caches while they were being flushed. In addition, + the "PERFORMANCE" strings can be used to determine the cache hit rate. 
+    In this case, each pair of consecutive "PERFORMANCE" strings delimits a
+    window of 200 instructions, so the snoop events logged between them
+    give the number of hits in the last 200 instructions. Repeated
+    iterations can be done where each iteration changes the point in time
+    at which the L2 cache is flushed during a switchover. By monitoring
+    its effect on the cache hit rate, a suitable time can be determined to
+    power down the outbound L2 cache.
+
+E Status of "TUBE" support
+
+    The Real-Time System Model v7.0.1 (RTSM_VE_Cortex_A15x1_A7x1 and
+    RTSM_VE_Cortex_A15x4_A7x4) implements 'tube' interfaces TUB0-3.
diff --git a/linaro/arm-virt-bl/docs/05-FAQ.txt b/linaro/arm-virt-bl/docs/05-FAQ.txt
new file mode 100644
index 0000000..f054a6d
--- /dev/null
+++ b/linaro/arm-virt-bl/docs/05-FAQ.txt
@@ -0,0 +1,20 @@
+Frequently asked questions
+==========================
+
+Q1. What is the per-core context size that is switched between
+    clusters?
+
+A1:
+
+    Per-CPU context:
+
+        CP15 and VFP context: 768 bytes
+        vGIC Virtual CPU interface (payload view) context: 128 bytes
+        vGIC Virtual CPU interface (HYP mode view) context: 280 bytes
+        vGIC Distributor context (SGIs & PPIs): 128 bytes
+        Virt. Ext. Registers: 40 bytes
+
+    Global context:
+
+        vGIC Distributor context (SPIs): 2048 bytes
+        2nd stage translation trap context: 40 bytes
diff --git a/linaro/arm-virt-bl/docs/06-Optional-rootfs-build.txt b/linaro/arm-virt-bl/docs/06-Optional-rootfs-build.txt
new file mode 100644
index 0000000..6dd9d62
--- /dev/null
+++ b/linaro/arm-virt-bl/docs/06-Optional-rootfs-build.txt
@@ -0,0 +1,122 @@
+Optional Root filesystem build and use instructions
+===================================================
+
+A Introduction
+
+    This note describes ways to build Linux user-land
+    filesystems of varying complexity for use with the
+    virtualizer. Note that there are several ways to create
+    filesystems and this note doesn't cover all possibilities.
+
+    The default virtualizer release contains an empty filesystem
+    stub located at:
+
+        arm-virtualizer-v2_2-160212/bootwrapper/payload/fsimg
+
+    A build using this stub doesn't contain a functional
+    filesystem that the Linux kernel image can use. fsimg can be
+    replaced with a suitable filesystem image but with the
+    following constraints:
+
+    1. Compressed or uncompressed cpio archives are supported.
+
+    2. The image size is limited to ~200 MB.
+
+    The size restriction implies that only very 'lean'
+    filesystems such as busybox <http://www.busybox.net/> may be
+    used. While busybox presents a minimal but robust command
+    line environment, quite often a more conventional
+    desktop-like environment with window management on top of an
+    X server is required in order to run web browsers etc.
+
+    In this note, we illustrate a method to use a larger (~2GB) filesystem
+    image that can be used with the ARM FastModels MMC emulation. Note that
+    the MMC emulation only supports images that are just under 2GB in size.
+
+    Note that if the MMC route is used, the bootwrapper/payload/fsimg
+    filesystem image will be suppressed and ignored.
+
+    Locating a root filesystem on the MMC emulation allows the Linux kernel
+    to access and use this filesystem. This is facilitated by indicating the
+    filesystem location to the kernel via the kernel command-line arguments
+    by appending 'root=/dev/mmcblk0' (for a single partition MMC image) to
+    the argument list.
+
+    Note that when using this technique, the fsimg file is ignored.
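The two size limits mentioned in this introduction can be checked before a build. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that "~200 MB" and "just under 2GB" are meant as binary units (MiB and GiB), which the note does not state.

```python
import os

# Assumed interpretation of the limits quoted in the introduction:
# "~200 MB" for the cpio fsimg and "just under 2GB" for the MMC image.
FSIMG_LIMIT_BYTES = 200 * 1024**2
MMC_LIMIT_BYTES = 2 * 1024**3


def fits_fsimg(size_bytes):
    """True if an fsimg of this size is within the assumed ~200 MB limit."""
    return size_bytes <= FSIMG_LIMIT_BYTES


def fits_mmc(size_bytes):
    """True if an MMC image of this size is strictly below 2 GiB."""
    return size_bytes < MMC_LIMIT_BYTES


def image_size(path):
    """Size in bytes of an image file on disk."""
    return os.path.getsize(path)


# A 2040 MiB image stays below the assumed 2 GiB MMC limit,
# but is far too large for the fsimg route.
size = 2040 * 1024**2
print(fits_mmc(size), fits_fsimg(size))
```

With a real image, `fits_mmc(image_size("mmc.img"))` performs the same check on the file produced in section C.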
+
+B Building and installing a Linux kernel
+
+    A suitable Linux kernel image for use with the virtualizer
+    can be built as follows:
+
+    $ tar -jxf arm-virtualizer-v2_2-160212.tar.bz2
+    $ cd arm-virtualizer-v2_2-160212/bootwrapper
+    $ make clean
+    $ pushd /tmp
+    $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/maz/ael-kernel.git ael-kernel.git
+    $ cd ael-kernel.git
+    $ git checkout -b ael-11.06 origin/ael-11.06
+    $ yes | make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- vexpress-new_defconfig
+    $ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- -j4
+    $ popd
+    $ cp $OLDPWD/arch/arm/boot/Image payload/kernel
+
+    Note that using the vexpress-new_defconfig configuration
+    ensures that the kernel is built with MMC support.
+
+C Building a suitable root filesystem
+
+    A suitable root filesystem can be built using Ubuntu Linux's rootstock
+    utility <https://wiki.ubuntu.com/ARM/RootfsFromScratch> as follows:
+
+    $ sudo apt-get install rootstock
+    $ sudo rootstock --fqdn ubuntu --login ubuntu --password ubuntu --imagesize 2040M --seed lxde,gdm --notarball
+    $ mv qemu-armel-*.img mmc.img
+
+    Note that the complete filesystem build will take ~30
+    minutes. On boot, the username and password are both 'ubuntu'.
+
+    The rootstock invocation above will produce a root filesystem containing
+    an LXDE desktop <http://lxde.org/> that has a firefox browser.
+
+D Modifying the kernel command line to support the MMC image.
+
+    The virtualizer build system and the mxscripts that are used for
+    launching the ARM FastModel require modifications to support the MMC
+    image.
+
+    The build system modification is to change the Linux kernel command line
+    arguments to make the kernel aware of the location of the root
+    filesystem. The command line should contain the string
+    'root=/dev/mmcblk0'.
+
+    To make this modification, edit the file bootwrapper/Makefile and change
+    the BOOTARGS specification on line 42 from:
+
+        BOOTARGS=mem=255M console=ttyAMA0,115200 migration_cost=500
+        cachepolicy=writealloc
+
+    to
+
+        BOOTARGS=root=/dev/mmcblk0 mem=255M console=ttyAMA0,115200
+        migration_cost=500 cachepolicy=writealloc
+
+    The ARM FastModel mxscript modification is to get the FastModel to use
+    the mmc.img file created in step C above with the MMC emulation.
+
+    To make this modification, uncomment the 'string mmcimage=' line
+    (line 42) and provide the complete path to the mmc.img file generated in
+    step C above.
+
+E Building the virtualizer
+
+    $ cd bootwrapper
+    $ make clean && make
+
+F Launching the ARM FastModel
+
+    $ modeldebugger -s big-little-MP<x>.mxscript
+
+    .. where x is 1 or 4 as the case may be (MP1 build or MP4
+    build).
+
+G Known limitations
+
+    None.
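The BOOTARGS change from section D can also be applied mechanically rather than by hand. The following is a minimal sketch under stated assumptions: the helper name `add_root_arg` is hypothetical, and it assumes the Makefile holds the assignment on a single line beginning with `BOOTARGS=`, as in the example in section D. It has not been checked against the actual bootwrapper Makefile.

```python
def add_root_arg(makefile_text, root="/dev/mmcblk0"):
    """Prepend root=<device> to the BOOTARGS line of a Makefile's text.

    Assumes a single-line 'BOOTARGS=...' assignment. Lines that already
    contain a root= argument are left untouched, so the edit is idempotent.
    """
    out = []
    for line in makefile_text.splitlines(keepends=True):
        if line.startswith("BOOTARGS=") and "root=" not in line:
            line = "BOOTARGS=root=" + root + " " + line[len("BOOTARGS="):]
        out.append(line)
    return "".join(out)


before = ("BOOTARGS=mem=255M console=ttyAMA0,115200 "
          "migration_cost=500 cachepolicy=writealloc\n")
print(add_root_arg(before), end="")
```

Reading bootwrapper/Makefile, passing its text through this function and writing the result back would reproduce the manual edit. Running it a second time leaves the file unchanged.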