Add linaro directory which contains the bootwrapper

Signed-off-by: John Rigby <john.rigby@linaro.org>
author: John Rigby <john.rigby@linaro.org> 2012-03-27 09:36:17 -0600
committer: John Rigby <john.rigby@linaro.org> 2012-03-27 09:36:17 -0600
commit: 891a674c57a0a1e4a2b871d5b6e874857e977ef5
tree: 4cdc25fb8fe7352086da9a16cc1a9555a2d1f400 /linaro/arm-virt-bl/docs
parent: f0d8629d72fea2d76060d97c2f463e17c67ec844
6 files changed, 1051 insertions, 0 deletions
diff --git a/linaro/arm-virt-bl/docs/01-Usage.txt b/linaro/arm-virt-bl/docs/01-Usage.txt
new file mode 100644
index 0000000..39d3b7f
--- /dev/null
+++ b/linaro/arm-virt-bl/docs/01-Usage.txt
@@ -0,0 +1,90 @@
+Usage
+=====
+
+1.  Requirements/Pre-requisites
+
+    1.  A Linux development environment. This release has been
+        build tested on the following Linux host environments:
+
+        1. Linux Ubuntu 10.10
+        2. Red Hat Enterprise Linux WS release 4 (Nahant Update 4)
+
+        This release is not intended to be used on development
+        environments other than Linux.
+
+    2.  An installation of the ARM RealView Development Suite. This
+        release was built and tested with version 4.1 [Build 514].
+
+    3.  An installation of the Perl scripting language. This release
+        was built and tested with v5.10.1.
+
+    4.  An installation of the GNU coreutils suite
+        <http://www.gnu.org/software/coreutils/>.  This release was
+        built and tested with v8.5.
+
+2.  Build instructions
+
+    Note that this release relies on the 'env' utility, which is
+    a part of the coreutils suite. The 'env' utility is expected
+    to be located at '/bin/env'. If it is at a different
+    location then this must be reflected in the first line of
+    the following file:
+
+    arm-virtualizer-v2_2-160212/bootwrapper/makemap
+
+    Failure to make this modification will result in a build
+    failure.
+
+    To build the software:
+
+    $ tar -jxf arm-virtualizer-v2_2-160212.tar.bz2
+    $ cd arm-virtualizer-v2_2-160212/bootwrapper
+    $ make clean && make
+
+    The resulting file is 'img.axf'.
+
+    This image may be loaded and executed on the model debugger
+    as explained in section 3 below.
+
+    Note that the pre-built stub kernel image is located at:
+
+    arm-virtualizer-v2_2-160212/bootwrapper/payload/kernel
+
+    .. and the placeholder dummy root filesystem image is located
+    at:
+
+    arm-virtualizer-v2_2-160212/bootwrapper/payload/fsimg
+
+    These may be replaced with custom-built images such as a
+    suitably configured Linux kernel image and a root filesystem
+    image.
+
+    Look at docs/03-Linux-kernel-build.txt for instructions on
+    building a suitable Linux kernel.
+
+    Look at docs/06-Optional-rootfs-build.txt for optionally
+    building a complete root filesystem.
+
+3.  Usage
+
+    If the Real-Time System Model v7.0.1 (RTSM_VE_Cortex_A15x1_A7x1
+    and RTSM_VE_Cortex_A15x4_A7x4) is installed, the resulting
+    img.axf file may be loaded, executed and debugged on the
+    model and the associated model debugger.
+
+    This model may be obtained from ARM by separate arrangement.
+
+    Steps to run the software:
+
+    a.    Depending upon whether the MPx1 or MPx4 model is being used,
+          update the big-little-mp<x>.mxscript file (x is 1 or
+          4 as the case may be) with the absolute
+          path to the model and the img.axf file. (Comments in the
+          file indicate where the changes have to be made)
+
+    b.    Invoke the modeldebugger and the script file as follows:
+
+          $ <path to modeldebugger> -s <path to big-little-mp<x>.mxscript>
+
+          The default build simultaneously switches clusters
+          every 12 million cycles (approx.).
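The shebang adjustment described in section 2 of 01-Usage.txt can be sketched as follows. This is only an illustration: it assumes the first line of 'makemap' has the form `#!/bin/env perl` (the actual line is not shown in the text), and the helper names `locate_env` and `fix_shebang` are hypothetical.

```python
import shutil


def locate_env(default: str = "/bin/env") -> str:
    """Find 'env' on PATH, falling back to the location the docs expect."""
    return shutil.which("env") or default


def fix_shebang(first_line: str, env_path: str) -> str:
    """Return first_line (without its newline) with the interpreter path
    replaced by env_path.

    Assumes a shebang of the form '#!<path-to-env> <args>'; any other
    line is returned unchanged.
    """
    if not first_line.startswith("#!"):
        return first_line
    parts = first_line[2:].strip().split(None, 1)
    if not parts:
        return first_line
    rest = parts[1] if len(parts) > 1 else ""
    return ("#!" + env_path + " " + rest).rstrip()


if __name__ == "__main__":
    # Show what the first line of makemap would need to be on this host.
    print(fix_shebang("#!/bin/env perl", locate_env()))
```

Running the script on the build host prints the first line that 'makemap' would need if 'env' is not at '/bin/env'.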
diff --git a/linaro/arm-virt-bl/docs/02-Code-layout.txt b/linaro/arm-virt-bl/docs/02-Code-layout.txt
new file mode 100644
index 0000000..ba1690e
--- /dev/null
+++ b/linaro/arm-virt-bl/docs/02-Code-layout.txt
@@ -0,0 +1,563 @@
+Code layout
+===========
+
+A   Introduction
+
+    The software contained in the 'bootwrapper' directory allows
+    the execution of a software payload e.g. a Linux stack to
+    alternate between two multi-core clusters of ARM Cortex-A15
+    & Cortex-A7 processors connected by a coherent
+    interconnect. To achieve this aim it provides the ability
+    to:
+
+    1.  Save the processor context on one cluster (henceforth
+        called the outbound cluster) and restore it on the other
+        cluster (henceforth called the inbound cluster).
+
+    2.  Hide any software visible microarchitectural differences
+        between the Cortex-A15 & Cortex-A7 processors.
+
+    3.  Use the ARM Virtualization Extensions to perform 1. and 2.
+        in a payload software agnostic manner.
+
+    This software is intended to be executed on the Real-Time
+    System Model v7.0.1 (RTSM_VE_Cortex_A15x1_A7x1 and
+    RTSM_VE_Cortex_A15x4_A7x4).
+
+    In addition to switching the payload software execution
+    between the two clusters, the software also contains support
+    for executing the payload software simultaneously on the two
+    clusters.
+
+    This is called the MP configuration. In its current state,
+    it mainly involves making the payload software believe that
+    the A15 cluster includes the cpus present on the Cortex-A7
+    cluster, i.e. there is one cluster with more cpus than there
+    physically are. [Note that MP support is highly experimental
+    and unstable. It is NOT the focus of this release and is
+    intended for purely informational purposes. The cluster
+    switching mode of operation remains the focus of this
+    release.]
+
+    The Virtualizer software needs initialization prior to being
+    used to perform any of the above functions. The
+    initialization needs to be done before the payload software
+    is executed. Hence, it makes sense to do this from the
+    existing boot firmware being used on the platform. The code
+    in the 'bootwrapper' directory is a bare-minimal bootloader
+    that:
+
+    1.  Sets up the environment for execution of the payload
+        software in the Non-secure world by programming the
+        appropriate coprocessor and memory mapped peripheral
+        registers from the Secure world.
+
+    2.  Invokes the entry point of the Virtualizer software
+        (bl_setup()) which does the necessary initialization.
+
+    3.  Passes control to the payload software in the Non-secure
+        world.
+
+B   Code layout overview
+
+    1.  bootwrapper/
+
+        Apart from containing the bootloader, this directory
+        also contains scatter files to load the bootloader,
+        Virtualizer and the payload software correctly on the
+        target platform as a single ELF file (img.axf).
+
+        The important files here are:
+
+        1.  vectors.S
+
+            1.  Implements the Secure world exception vectors
+                which are loaded to the base of physical memory
+                (0x80000000) at reset.
+
+        2.  boot.S
+
+            1.  Handles a power-on reset.
+
+            2.  Initialises the I-Cache, sets up the stack &
+                passes control to the C handler for performing
+                the rest of the initialization.
+
+        3.  c_start.c
+
+            1.  Picks up from where the start() routine left off
+                in the previous file.
+
+            2.  Programs the exception vector tables for the
+                Secure world.
+
+            3.  Provides Non-secure access to certain
+                coprocessor registers and memory mapped
+                peripherals e.g. access to the cache
+                coherent interconnect registers, coprocessors
+                etc.
+
+            4.  Enables functionality which can be initialised
+                only in the Secure world, e.g. configuration of
+                interrupts as Non-secure.
+
+            5.  Synchronises execution with the secondary cpus
+                (if present) so that any global peripheral is
+                accessed by them only after the primary has
+                initialised it.
+
+            6.  Enters the non-secure HYP mode and initialises
+                the Virtualizer.
+
+            7.  Enters the non-secure SVC mode and jumps to the
+                payload software entry point.
+
+        4.  payload/
+
+            1.  Contains two files 'fsimg' and 'kernel'.
+
+            2.  The 'kernel' is a raw Linux kernel binary image.
+                The instructions to build this Linux image can
+                be found in docs/03-Linux-kernel-build.txt.
+                This image can be replaced with a raw binary
+                image of any other software payload which is
+                desired to be run on this system.
+
+            3.  The 'fsimg' is an empty filesystem stub. If
+                desired, it can be replaced with a suitable
+                filesystem image in a Linux initramfs format. A
+                custom busybox filesystem was used for testing.
+                More complex filesystems may be used if needed
+                but will require the use of MMC emulation with
+                the ARM FastModels.
+                See docs/06-Optional-rootfs-build.txt for
+                details.
+
+        5.  boot.map.template
+
+            1.  Scatter file which combines the payload
+                software, Virtualizer and the bootloader into a
+                single ELF file (img.axf) which can
+                then be loaded on the relevant platform.
+
+        6.  makemap
+
+            1.  Simple Perl script that takes an ELF image of
+                the Virtualizer, parses through the relevant
+                sections & adds those sections to the scatter
+                file so that a consolidated image can be
+                created.
+
+    2.  big-little/common
+
+        This directory mainly deals with setting up of the HYP
+        processor mode and the Virtual GIC. This allows the
+        payload software to run unmodified while either the
+        Switching or the MP mode is active in the background.
+
+        The important files here are:
+
+        1.  hyp_vectors.s
+
+            1.  Implements the HYP mode vector table.
+
+            2.  It contains the entry point "bl_setup()" which
+                is invoked by the bootwrapper to initialise the
+                Virtualizer software.
+
+            3.  The exception vector for interrupts
+                [irq_entry()] is the entry point for all
+                physical interrupts. The exception vector for
+                hypervisor traps [hvc_entry()] is the entry
+                point for all accesses made by the payload
+                software that need to be handled in the HYP
+                mode.
+
+            4.  Also contained is rudimentary support for fault
+                exception handlers [dabt_entry(), iabt_entry() &
+                undef_entry()].
+
+        2.  hyp_setup.c
+
+            1.  Extends the initialization of the Virtualizer
+                software into C code after a cold reset.
+
+            2.  If switching is being done asynchronously then
+                the HYP timer interrupt is set up to periodically
+                (~12 million instructions) trigger a switchover
+                to the other cluster.
+
+            3.  If in MP mode, then CCI snoops are enabled for
+                both the clusters.
+
+        3.  vgic_handle.c
+
+            1.  Extends handling of physical interrupts into C
+                code from irq_entry(). Interrupts are
+                acknowledged (optionally EOI'ed) and queued as
+                virtual interrupts. The HYP timer interrupt is
+                handled differently. When received, it is used as
+                a trigger to initiate the switchover process.
+
+        4.  vgiclib.c
+
+            1.  Implements handling of virtual interrupts once
+                they have been queued up in the vGIC HYP view
+                list registers. It maintains the list registers
+                and also saves and restores the context of the
+                vGIC HYP view interface.
+
+        5.  pagetable_setup.c
+
+            1.  Creates and sets up the HYP mode and 2nd stage
+                translation page tables. Accesses by the payload
+                software to the vGIC physical cpu interface are
+                mapped to the vGIC virtual cpu interface using
+                the 2nd stage translation page tables.
+
+            2.  In the MP configuration, the translation tables
+                are shared by all the cpus in the two clusters.
+                Hence the first cpu in only one of the clusters
+                creates them.
+
+        6.  vgic_setup.c
+
+            1.  Enables virtual interrupts and exceptions,
+                initialises the physical cpu interface and the
+                HYP view interface.
+
+    3.  big-little/lib
+
+        This directory implements common functionality that is
+        used across all the Virtualizer code. This includes:
+
+        1.  Locks which can be used with Strongly Ordered and
+            Device memory.
+
+        2.  Code tracing support on the Fast Models platform
+            through the use of memory mapped TUBE registers &
+            the Generic Trace plugin.
+            Details of this feature can be found in
+            docs/04-Cache-hit-rate-howto.txt.
+
+        3.  Events to synchronise the switching process between
+            the clusters and within the clusters. They are also
+            used to synchronise the setup phase after a cold reset in
+            the MP configuration.
+
+        4.  UART routines to support semihosting of the
+            printf family of functions.
+
+        5.  Cache maintenance, Stack manipulation and Locking
+            routines.
+
+    4.  big-little/include
+
+        1.  This directory contains the headers specific to HYP
+            mode setup, Switching process and common helper
+            routines. Most importantly, context.h contains the
+            data structures which are used to save and restore
+            the processor context.
+
+    5.  big-little/switcher
+
+        This directory implements code to save and restore
+        processor context and to initiate/handle an
+        asynchronous/synchronous switchover request.
+
+        1.  context/
+
+            1.  ns_context.c
+
+                1.  Contains top level routines to save and
+                    restore the Non-secure world context.
+
+                2.  It requests the secure world to save its own
+                    context and bring the inbound cluster out of
+                    reset. It also uses events to synchronise
+                    the switching process between the inbound
+                    and outbound clusters.
+
+            2.  gic.c
+
+                1.  Contains routines to save and restore the
+                    context of the vGIC physical distributor and
+                    cpu interfaces.
+
+            3.  sh_vgic.c
+
+                1.  The two clusters share the interrupt
+                    controller instead of each cluster having
+                    its own. A consequence of this is that there
+                    is no longer a 1 to 1 mapping between cpu
+                    ids and cpu interface ids e.g. on an
+                    MPx1+MPx1 cluster configuration,
+                    cpu0 of the Cortex-A7 cluster would
+                    correspond to cpuinterface1 on the shared
+                    vGIC. This in turn affects routing of
+                    peripheral and software generated
+                    interrupts. This file implements code to
+                    allow use of the shared vGIC correctly
+                    keeping this limitation in mind.
+
+        2.  trigger/
+
+            1.  async_switchover.c
+
+                1.  Contains code to use the HYP timer interrupt
+                    as a trigger to initiate a switchover
+                    asynchronously.
+
+            2.  sync_switchover.c
+
+                1.  Contains code to handle HVC instructions
+                    executed by the payload software:
+
+                    a. to initiate a synchronous switchover
+                       ("HVC #1").
+
+                    b. to find the id of the cluster on which it
+                       is currently executing ("HVC #2").
+
+            3.  handle_switchover.s
+
+                1.  Contains code to start saving the non-secure
+                    world context and request the secure world to
+                    power down the outbound cluster once the
+                    inbound cluster is up and running.
+
+    6.  big-little/virtualisor
+
+        This directory implements code that, using the ARM
+        Virtualization extensions:
+
+        1.  Hides any microarchitectural differences between the
+            Cortex-A15 & Cortex-A7 processors visible to the
+            payload software.
+
+        2.  Provides a different view of the underlying hardware
+            than what really exists e.g. in the switching mode
+            it traps accesses made by the host cluster
+            (Cortex-A7 cluster currently) to the shared vGIC
+            physical distributor interface, so that routing of
+            interrupts can take place correctly. In the MP mode,
+            the L2 control and MPIDR registers are virtualized
+            to tell the payload software that there is one
+            cluster with multiple processors instead of two.
+
+        The ARM Virtualization extensions provide a set of trap
+        registers (HCPTR (Hyp Coprocessor Trap Register), HSTR
+        (Hyp System Trap Register), HDCR (Hyp Debug
+        Configuration Register)) to be able to select what
+        accesses made by the payload software to the coprocessor
+        block will be trapped in the HYP mode.
+
+        Accesses to memory mapped peripherals e.g. shared vGIC
+        can be trapped into the HYP mode by populating
+        appropriate entries in the 2nd stage translation tables.
+        This is how microarchitectural differences between the
+        two processor sets are resolved.
+
+        Whenever a trap into HYP mode is taken, the HSR (Hyp
+        Syndrome Register) contains enough information about the
+        type of trap taken for the software to take appropriate
+        action.
+
+        The Virtualizer design centres around the traps
+        recognized by the HSR. Also, to deal with
+        microarchitectural differences the concept of a HOST
+        cluster is introduced. It is possible for each
+        cpu to find out the system topology using the Kingfisher
+        System Control Block. Once it knows the host cluster id
+        & whether the software is expected to switch execution
+        or run in the MP mode (provided at compile time), the
+        cpu can configure itself accordingly.
+
+        The processor cluster for which the payload software has
+        been built to run on [assumed to be Cortex-A15 for this
+        release] is termed as the TARGET while the cluster on
+        which the differences are expected to crop up is called
+        the HOST (assumed to be Cortex-A7 for this release).
+        The HOST environment variable is used to specify
+        the host cluster. The target cluster is assumed to be
+        the logical complement of the host i.e. cluster ids can
+        only take the values of 0 and 1.
+
+        The HOST processor emulates the TARGET processor by
+        trapping the accesses to differing processor features
+        into the HYP mode. Most of the microarchitectural
+        differences & registers that need to be virtualized are
+        handled in a generic (CPU Independent) layer of
+        code. Additionally, each processor exports functions to
+        setup, handle & optionally save/restore context of each
+        trap that the HSR recognises. These handlers are invoked
+        whenever the software runs
+        on that processor.
+
+        1.  virt_setup.c
+
+            1.  Generic function that initialises the required
+                traps. This is done once each on both the host
+                and target clusters if the trap handler needs
+                to obtain some information about the target
+                cluster to be able to work correctly e.g. the
+                Cortex-A7 processor cluster needs to find out
+                the cache geometry of the Cortex-A15
+                processor cluster to be able to handle cache
+                maintenance operations by set/way correctly. This
+                function further calls any setup function that
+                has been exported by the processor the code is
+                executing on.
+
+        2.  virt_handle.c
+
+            1.  Generic function that extends the hvc_entry()
+                routine to C code. It calls the generic trap
+                handler (if registered) and then any trap
+                handlers exported by the processor on
+                which the trap has been invoked.
+
+        3.  virt_context.c
+
+            1.  Generic function that saves and restores traps
+                on the host cluster & then calls any
+                save/restore function that has been exported by
+                the processor the code is executing on.
+
+        4.  cache_geom.c
+
+            1.  Generic function that detects cache geometries
+                on the host and target clusters & then maps
+                cache maintenance operations by set/way from the
+                target to the host cache.
+
+        5.  mem_trap.c
+
+            1.  Generic function that sets up any memory traps
+                by editing the 2nd stage translation tables.
+
+        6.  vgic_trap_handler.c
+
+            1.  Generic function that handles trapped accesses
+                to the shared vGIC.
+
+    7.  include/
+
+        Header files specific to the Virtualisor code.
+
+    8.  cpus/
+
+        Placeholders for any traps that the Cortex-A7 or A15 processor
+        cluster might want to set up. No traps need to be set up
+        at the moment.
+
+    9.  big-little/secure_world
+
+        Since both Cortex-A7 & Cortex-A15 processors support ARM
+        TrustZone Security Extensions, there is certain context
+        that needs to be set up, saved & restored in the Secure
+        world.
+
+        This context allows access to certain coprocessor and
+        peripheral registers to the Non-secure world. It also
+        configures the shared vGIC for use by the Non-secure
+        world.
+
+        Execution shifts to the Secure world through the SMC
+        instruction which is a part of the ARMv7 ISA.
+
+        1.  monmode_vectors.s
+
+            1.  Implements the monitor mode vector table.  It
+                contains the secure entry point [do_smc()] for
+                the SMC instruction along with rudimentary
+                support for other fault exceptions taken while
+                executing in the secure world.
+
+            2.  Three types of SMC exceptions are expected (type
+                of exception is contained in r0):
+
+                1.  SMC_SEC_INIT
+
+                    Called once after a power-on reset to
+                    initialise the Secure world stacks,
+                    coherency and pagetables, and to configure
+                    some coprocessor and memory mapped peripheral
+                    (Coherent interconnect & shared vGIC)
+                    registers for use of these features by
+                    the Non-secure world.
+
+                2.  SMC_SEC_SAVE
+
+                    Called from ns_context.c to request the
+                    secure world to save its context and bring
+                    the corresponding core in the inbound
+                    cluster out of reset so that it can start
+                    restoring the saved state.
+
+                3.  SMC_SEC_SHUTDOWN
+
+                    Called from handle_switchover.s to request
+                    the secure world to flush the L1 and L2 caches
+                    and power down the outbound cluster.
+
+               Also implemented is a function to handle warm
+               resets on the inbound cluster. Bare-minimal
+               context is initialised while the rest is restored
+               before control is passed to the Non-secure world
+               handler for restoring context [restore_context()]
+               in ns_context.c.
+
+        2.  secure_context.c
+
+            Implements code to save and restore the secure world
+            context
+
+        3.  secure_resets.c
+
+            Implements code to power down the outbound cluster
+            and bring individual cores in the inbound cluster
+            out of reset.
+
+        4.  ve_reset_handler.s
+
+            Base of physical memory in the Versatile Express
+            memory map is at 0x80000000. The processors are
+            brought out of reset at 0x0 which points to Secure
+            RAM/Flash memory. This file implements a small stub
+            function that is placed at 0x0 so that execution
+            jumps to 0x80000000 after a cold reset and to the
+            warm_reset() handler in monmode_vectors.s
+            after a warm reset.
+
+        The secure world code is built into a separate ELF image
+        to maintain its distinction from the Virtualizer code
+        that executes in the Non-secure world.
+
+    10. big-little/bl.scf.template
+
+        1.  Scatter file that is used to build the Non-secure
+            world code in the Virtualizer software. The
+            resultant image is bl.axf.
+
+    11. big-little/bl-sec.scf.template
+
+        1.  Scatter file that is used to build the Secure world
+            code in the Virtualizer software. The resultant
+            image is bl_sec.axf.
+
+    12. acsr/
+
+        The secure world code is built into a separate ELF image
+        to maintain its distinction from the Virtualizer code
+        that executes in the Non-secure world.
+
+        1.  helpers.s
+
+            Helper functions to access the CP15 coprocessor
+            space.
+
+        2.  v7.s
+
+            Contains routines to save and restore ARM processor
+            context.
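The cluster and cpu interface numbering described in 02-Code-layout.txt (the sh_vgic.c entry and the HOST/TARGET discussion under big-little/virtualisor) can be illustrated with a small sketch. It assumes that the Cortex-A15 cpus occupy the lowest cpu interface ids on the shared vGIC and that the Cortex-A7 cpus follow them. That ordering matches the MPx1+MPx1 example in the text but is otherwise an assumption. The function names are hypothetical and are not taken from the Virtualizer sources.

```python
def target_cluster(host_cluster: int) -> int:
    """The target is the logical complement of the host; ids are 0 or 1."""
    if host_cluster not in (0, 1):
        raise ValueError("cluster ids can only take the values 0 and 1")
    return host_cluster ^ 1


def cpu_interface_id(cluster: str, cpu_id: int, cpus_per_cluster: int) -> int:
    """Map (cluster, cpu id) to a cpu interface id on the shared vGIC.

    Assumption: A15 cpus come first, A7 cpus are offset by the A15 count.
    """
    if not 0 <= cpu_id < cpus_per_cluster:
        raise ValueError("cpu id out of range")
    if cluster == "A15":
        return cpu_id
    if cluster == "A7":
        return cpus_per_cluster + cpu_id
    raise ValueError("unknown cluster: " + cluster)


if __name__ == "__main__":
    # MPx1+MPx1: cpu0 of the Cortex-A7 cluster maps to cpu interface 1.
    print(cpu_interface_id("A7", 0, cpus_per_cluster=1))
```

Because cpu ids and cpu interface ids no longer match one to one, any interrupt routing written in terms of cpu ids has to go through a translation of this kind.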
diff --git a/linaro/arm-virt-bl/docs/03-Linux-kernel-build.txt b/linaro/arm-virt-bl/docs/03-Linux-kernel-build.txt
new file mode 100644
index 0000000..876cce3
--- /dev/null
+++ b/linaro/arm-virt-bl/docs/03-Linux-kernel-build.txt
@@ -0,0 +1,58 @@
+Building and installing a Linux kernel
+======================================
+
+A suitable Linux kernel image for use with the virtualizer
+can be built as follows (GCC toolchain used for these steps is:
+CodeSourcery Sourcery G++ Lite 2010.09 v4.5.1)
+
+$ tar -jxf arm-virtualizer-v2_2-160212.tar.bz2
+$ cd arm-virtualizer-v2_2-160212/bootwrapper
+$ make clean
+$ pushd /tmp
+$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git arm-platforms.git
+$ cd arm-platforms.git
+$ git checkout -b ael-11.06 origin/ael-11.06
+$ yes | make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- vexpress-new_defconfig
+$ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- -j4
+$ popd
+$ cp $OLDPWD/arch/arm/boot/Image payload/kernel
+
+The virtualizer can now be built as usual by invoking:
+
+$ make clean && make
+
+.. in the top bootwrapper directory.
+
+This will result in a file called img.axf located at
+arm-virtualizer-v2_2-160212/bootwrapper/img.axf.
+
+To launch the ARM FastModel with the virtualizer, first modify
+arm-virtualizer-v2_2-160212/bootwrapper/big-little-MP<x>.mxscript
+as usual to fill in paths to the model binary and the img.axf files.
+The mxscript file is adequately commented to assist with this.
+
+In case of an MP1 model, we would use the big-little-MP1.mxscript file
+and we would specify the path to the model in a manner similar to:
+
+string model = "/home/working_dir/RTSM_VE_Cortex-A15x1-A7x1";
+
+Similarly, in case of an MP4 model, we would use the big-little-MP4.mxscript
+and we would specify the path to the model in a manner similar to:
+
+string model = "/home/working_dir/models/RTSM_VE_Cortex-A15x4-A7x4";
+
+The path to the img.axf file is specified using the app directive as
+follows:
+
+string app = "arm-virtualizer-v2_2-160212/bootwrapper/img.axf";
+
+The model can then be launched using:
+
+modeldebugger -s arm-virtualizer-v2_2-160212/bootwrapper/big-little-MP<x>.mxscript
+
+Where 'x' is 1 or 4 in the case of an MP1 model run or an
+MP4 model run respectively.
+
+This will result in the Linux kernel console messages appearing in the ARM
+FastModel UART emulation window. The virtualizer will switch execution
+between the two clusters at ~12 million instruction intervals.
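The launch step at the end of 03-Linux-kernel-build.txt can be sketched as a small helper that picks the mxscript for the chosen model and builds the modeldebugger command line. The `-s` option and the `big-little-MP<x>.mxscript` naming are taken from the text above; the helper names and the default debugger name on PATH are assumptions made for illustration.

```python
def mxscript_path(bootwrapper_dir: str, cores: int) -> str:
    """Return the mxscript for an MP1 or MP4 model run."""
    if cores not in (1, 4):
        raise ValueError("only MP1 and MP4 models are described")
    return f"{bootwrapper_dir}/big-little-MP{cores}.mxscript"


def modeldebugger_argv(bootwrapper_dir: str, cores: int,
                       debugger: str = "modeldebugger") -> list:
    """Build the argument list for launching the model debugger."""
    return [debugger, "-s", mxscript_path(bootwrapper_dir, cores)]


if __name__ == "__main__":
    argv = modeldebugger_argv("arm-virtualizer-v2_2-160212/bootwrapper", 4)
    # Print the command instead of running it; the model itself is
    # obtained from ARM separately and may not be installed.
    print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run` once the paths inside the mxscript file have been filled in.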
diff --git a/linaro/arm-virt-bl/docs/04-Cache-hit-rate-howto.txt b/linaro/arm-virt-bl/docs/04-Cache-hit-rate-howto.txt
new file mode 100644
index 0000000..27c83b4
--- /dev/null
+++ b/linaro/arm-virt-bl/docs/04-Cache-hit-rate-howto.txt
@@ -0,0 +1,198 @@
+Cache hit-rate HOWTO
+====================
+
+A   Introduction
+
+    The ARM Fast Models are accompanied by a trace infrastructure
+    referred to as the Model Trace Interface (MTI). The MTI trace
+    provides a mechanism to dynamically register to events from the
+    model. The GenericTrace.so MTI trace plugin provides a number of
+    trace events whose output can be logged in a simple text file.
+    The usage of this plugin is given in Section B.
+
+    In this document we will consider how the GenericTrace.so plugin
+    can be used during a cluster switchover to calculate the number
+    of cache hits in the outbound cluster L2 cache originating from
+    the inbound cluster before the outbound L2 is flushed and the
+    cluster placed in reset.
+
+B   Plugin Usage
+
+    The GenericTrace plugin is loaded using the "--trace-plugin"
+    parameter in the command line to launch the model.
+
+    A list of trace sources provided by the plugin can be listed as
+    follows:
+
+    "RTSM_VE_Cortex-A15x1-A7x1 --trace-plugin GenericTrace.so
+     --parameter TRACE.GenericTrace.trace-sources= "
+
+    A list of parameters supported by the Generic Trace plugin can
+    be listed as follows:
+
+    "RTSM_VE_Cortex-A15x1-A7x1 --trace-plugin GenericTrace.so -l"
+
+    Some of the interesting parameters are:
+
+    TRACE.GenericTrace.trace-file: The trace file to write into. If
+    empty will print to console / STDOUT.
+
+    TRACE.GenericTrace.perf-period: Print performance every N
+    instructions. Since the instruction count and the global counter
+    have the same value on the Fast Models, this parameter provides
+    a good approximation of time.
+
+    TRACE.GenericTrace.flush: If set to true then the trace file will be
+    flushed after every event.
+
+C   Plugin Trace sources
+
+    The GenericTrace plugin provides events which allow each cluster
+    to trace snoop requests originating from a different cluster that
+    hit in its caches. For snoops originating from the Cortex-A7 cluster
+    that hit in the A15 cluster, the event is 'read_for_4_came_from_snoop'
+    & for the opposite case the event is 'read_for_3_came_from_snoop'.
+    The numbers '3' & '4' in the name of the trace sources are the ids
+    of the CCI slave interfaces from where the snoop originated.
+
+    These trace sources are the per-cluster implementation of the
+    event id '0xA' "(Read data last handshake - data returned
+    from the cache rather than from downstream)" of the CCI PMU.
+    Please refer to the "Cache Coherent Interconnect (CCI-400)
+    Architecture Specification" for further details.
+
+    The plugin also provides the ability to trace code execution through
+    a memory mapped "tube" interface. This interface defines a set of
+    registers. When they are written in a particular sequence, and the
+    'sw_trace_event' trace source was selected during model invocation,
+    the register values are printed in the trace file.
+
+    The "tube" interface defines:
+
+    - Three little-endian 64-bit registers of arbitrary data that
+      can be written (and retain their values).
+
+    - A tube-like char register which when written with '\0'
+      will generate an event with the current state of the
+      64-bit registers and with the characters sent to the
+      device with a unique sequence_id.
+
+    All of these registers are banked and write-only. The trace
+    event also outputs the cluster id and the CPU id. ARM
+    FastModels implement 1 to 4 TUBE interfaces. Please refer to
+    Section E for supported interfaces in the current model
+    release. The memory map of these registers can be found in
+    big-little/include/misc.h.
+
+    The 'write_trace' function in big-little/lib/tube.c implements the
+    software sequence to program the tube interface. This function is
+    called at various points in the switchover process. It prints out a
+    message which indicates that an event is about to start or has
+    completed, along with the value of the global counter in one of the
+    64-bit registers. To enable this functionality, the environment
+    variable "TUBE" needs to be defined to TRUE prior to code compilation.
+
+D   Putting it all together
+
+    The list of steps to use the above mentioned functionality is:
+
+    1. Build the Virtualizer code with "TUBE" support. On the
+       tcsh shell, this is done as follows:
+
+       $ setenv TUBE TRUE; make clean && make
+
+    2. Launch the model with the MTI trace plugin support and a
+       selection of the right trace sources using a suitable
+       MXScript file in the 'bootwrapper' directory.
+
+    Once the switchover process starts, the trace file will contain output
+    that looks like this (not including the comments):
+
+    .
+    .
+    .
+    .
+    // Lines beginning with "PERFORMANCE" are a result of the value of the
+    // "TRACE.GenericTrace.perf-period" parameter. This string is printed
+    // in the trace file every <value> instructions (200 in this case).
+    // It indicates the rate at which the model is executing instructions
+    // and the number of instructions executed thus far.
+    PERFORMANCE:   2.8 MIPS (Inst:67216767)
+    .
+    .
+    .
+    // Lines beginning with "sw_trace_event<x>" are a result of enabling
+    // "TUBE" support in the code and selecting the "sw_trace_event" source
+    // while invoking the model. The interpretation of this message is:
+    //
+    // <x>                : indicates the "TUBE" interface number.
+    // sequence_id        : a unique number assigned to each message.
+    // cluster_and_cpu_id : in the format 0x<cluster id><cpu id>. Each
+    //                      id occupies 8 bits.
+    // data0              : first 64-bit register value. Programmed
+    //                      with the value of the global counter.
+    // data1              : second 64-bit register value. Not used.
+    // data2              : third 64-bit register value. Not used.
+    // message            : string written to the TUBE register.
+    sw_trace_event2: sequence_id=0x00000001 cluster_and_cpu_id=0x0000 data0=0x000000000401a3dc data1=0x0000000000000000 data2=0x0000000000000000 message="Secure Coherency Enable Start":30
+    .
+    .
+    .
+    PERFORMANCE:   0.2 MIPS (Inst:67217079)
+    sw_trace_event2: sequence_id=0x00000002 cluster_and_cpu_id=0x0000 data0=0x000000000401a581 data1=0x0000000000000000 data2=0x0000000000000000 message="Secure Coherency Enable End":28
+    PERFORMANCE:   0.9 MIPS (Inst:67217301)
+    PERFORMANCE:   5.8 MIPS (Inst:67217511)
+    .
+    .
+    .
+    // Lines beginning with "read_for_<x>_came_from_snoop" are a result of
+    // enabling the event sources for monitoring the cache hits resulting
+    // from snoops originating from slave interface <x> on the CCI.
+    // The following line indicates that a snoop from the Cortex-A7 cluster
+    // hit in the caches of the A15 cluster. It also prints the cache line
+    // address and whether the access was Secure or Non-secure.
+    read_for_4_came_from_snoop: Bus address=0x000000008ff02440 Is non secure=N
+    read_for_4_came_from_snoop: Bus address=0x000000008ff02440 Is non secure=N
+    read_for_4_came_from_snoop: Bus address=0x000000008ff02240 Is non secure=N
+    read_for_4_came_from_snoop: Bus address=0x000000008ff02240 Is non secure=N
+    read_for_4_came_from_snoop: Bus address=0x000000008ff012c0 Is non secure=N
+    PERFORMANCE:   0.0 MIPS (Inst:135292834)
+    sw_trace_event: sequence_id=0x00000010 cluster_and_cpu_id=0x0000 data0=0x000000000810672e data1=0x0000000000000000 data2=0x0000000000000000 message="L2 Flush Begin":15
+    PERFORMANCE:   5.5 MIPS (Inst:135293056)
+    PERFORMANCE:   7.2 MIPS (Inst:135293374)
+    PERFORMANCE:   7.4 MIPS (Inst:135293587)
+    PERFORMANCE:  12.4 MIPS (Inst:135293800)
+    PERFORMANCE:  10.0 MIPS (Inst:135294118)
+    read_for_4_came_from_snoop: Bus address=0x0000000080054a80 Is non secure=Y
+    read_for_4_came_from_snoop: Bus address=0x0000000080054a80 Is non secure=Y
+    read_for_4_came_from_snoop: Bus address=0x0000000080054ac0 Is non secure=Y
+    read_for_4_came_from_snoop: Bus address=0x0000000080054ac0 Is non secure=Y
+    read_for_4_came_from_snoop: Bus address=0x0000000080074c80 Is non secure=Y
+    PERFORMANCE:   0.5 MIPS (Inst:135294331)
+    .
+    .
+    .
+    .
+    PERFORMANCE:  10.5 MIPS (Inst:135541612)
+    PERFORMANCE:   3.3 MIPS (Inst:135541929)
+    sw_trace_event: sequence_id=0x00000011 cluster_and_cpu_id=0x0000 data0=0x0000000008143442 data1=0x0000000000000000 data2=0x0000000000000000 message="L2 Flush End":13
+    .
+    .
+    .
+    .
+
+    Post-processing scripts can be developed which count the number of
+    'read_for_<x>_came_from_snoop' events between two 'sw_trace_event<x>'
+    events. In the above example, the result will be the number of snoop
+    hits in the A15 caches while they were being flushed. In addition,
+    the "PERFORMANCE" strings can be used to determine the cache hit rate:
+    the snoop events logged between two consecutive "PERFORMANCE" lines
+    are the hits in the last 200 instructions. Repeated iterations can
+    be done where each iteration
+    changes the point of time when the L2 cache is flushed during a
+    switchover. By monitoring its effect on the cache hit rate, a suitable
+    time can be determined to power down the outbound L2 cache.
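+
+    As an illustration of such a post-processing script, the following is
+    a minimal Python sketch (it is not part of this release). It counts
+    the 'read_for_<x>_came_from_snoop' lines found between two
+    'sw_trace_event<x>' lines, which are identified by their message
+    strings. The function name 'count_snoops_between' and the regular
+    expressions are assumptions based only on the sample output shown in
+    this section.
+

```python
import re

# Patterns inferred from the sample trace output in this document (assumption).
SNOOP_RE = re.compile(r"^\s*read_for_(\d+)_came_from_snoop:")
EVENT_RE = re.compile(r'^\s*sw_trace_event\d*:.*\bmessage="([^"]*)"')


def count_snoops_between(lines, start_message, end_message):
    """Count snoop-hit lines between two sw_trace_event messages.

    Returns a dict mapping the CCI interface id <x> (as an int) to the
    number of 'read_for_<x>_came_from_snoop' lines seen after the event
    carrying start_message and before the event carrying end_message.
    """
    counts = {}
    inside = False
    for line in lines:
        event = EVENT_RE.match(line)
        if event:
            if event.group(1) == start_message:
                inside = True
            elif event.group(1) == end_message and inside:
                break
            continue
        snoop = SNOOP_RE.match(line)
        if inside and snoop:
            interface = int(snoop.group(1))
            counts[interface] = counts.get(interface, 0) + 1
    return counts


# A few lines copied from the sample trace above.
SAMPLE = [
    "read_for_4_came_from_snoop: Bus address=0x000000008ff012c0 Is non secure=N",
    'sw_trace_event: sequence_id=0x00000010 cluster_and_cpu_id=0x0000 '
    'data0=0x000000000810672e data1=0x0000000000000000 '
    'data2=0x0000000000000000 message="L2 Flush Begin":15',
    "PERFORMANCE:   5.5 MIPS (Inst:135293056)",
    "read_for_4_came_from_snoop: Bus address=0x0000000080054a80 Is non secure=Y",
    "read_for_4_came_from_snoop: Bus address=0x0000000080054ac0 Is non secure=Y",
    'sw_trace_event: sequence_id=0x00000011 cluster_and_cpu_id=0x0000 '
    'data0=0x0000000008143442 data1=0x0000000000000000 '
    'data2=0x0000000000000000 message="L2 Flush End":13',
]

print(count_snoops_between(SAMPLE, "L2 Flush Begin", "L2 Flush End"))
```

+
+    With the sample lines above the sketch prints {4: 2}. For a real
+    trace, an open trace file can be passed in place of SAMPLE.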
+
+E   Status of "TUBE" support
+
+    The Real-Time System Model v7.0.1 (RTSM_VE_Cortex_A15x1_A7x1 and
+    RTSM_VE_Cortex_A15x4_A7x4) implements 'tube' interfaces TUB0-3.
diff --git a/linaro/arm-virt-bl/docs/05-FAQ.txt b/linaro/arm-virt-bl/docs/05-FAQ.txt
new file mode 100644
index 0000000..f054a6d
--- /dev/null
+++ b/linaro/arm-virt-bl/docs/05-FAQ.txt
@@ -0,0 +1,20 @@
+Frequently asked questions
+==========================
+
+Q1. What is the per-core context size that is switched between
+    clusters?
+
+A1:
+
+    Per-CPU context:
+
+    CP15 and VFP context: 768 bytes
+    vGIC Virtual CPU interface (payload view) context: 128 bytes
+    vGIC Virtual CPU interface (HYP mode view) context: 280 bytes
+    vGIC Distributor context (SGIs & PPIs): 128 bytes
+    Virt. Ext. Registers: 40 bytes
+
+    Global context:
+
+    vGIC Distributor context (SPIs): 2048 bytes
+    2nd stage translation trap context: 40 bytes
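+
+    The figures above can be totalled as follows. This is a minimal
+    Python sketch and not part of the release. It assumes that the
+    per-CPU context is switched once for each CPU and the global context
+    once, which the answer above does not state explicitly. The names
+    used are illustrative.
+

```python
# Context sizes in bytes, copied from the answer above.
PER_CPU_CONTEXT = {
    "CP15 and VFP": 768,
    "vGIC Virtual CPU interface (payload view)": 128,
    "vGIC Virtual CPU interface (HYP mode view)": 280,
    "vGIC Distributor (SGIs & PPIs)": 128,
    "Virt. Ext. Registers": 40,
}
GLOBAL_CONTEXT = {
    "vGIC Distributor (SPIs)": 2048,
    "2nd stage translation trap": 40,
}


def total_context_bytes(num_cpus):
    """Total context size, assuming one per-CPU copy per CPU plus one global copy."""
    per_cpu = sum(PER_CPU_CONTEXT.values())
    return num_cpus * per_cpu + sum(GLOBAL_CONTEXT.values())


print("per-CPU context:", sum(PER_CPU_CONTEXT.values()), "bytes")
print("global context: ", sum(GLOBAL_CONTEXT.values()), "bytes")
for cpus in (1, 4):
    print(cpus, "CPU(s):", total_context_bytes(cpus), "bytes")
```

+
+    Under these assumptions the per-CPU context is 1344 bytes and the
+    global context is 2088 bytes.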
diff --git a/linaro/arm-virt-bl/docs/06-Optional-rootfs-build.txt b/linaro/arm-virt-bl/docs/06-Optional-rootfs-build.txt
new file mode 100644
index 0000000..6dd9d62
--- /dev/null
+++ b/linaro/arm-virt-bl/docs/06-Optional-rootfs-build.txt
@@ -0,0 +1,122 @@
+Optional Root filesystem build and use instructions
+===================================================
+
+A   Introduction
+
+    This note describes ways to build Linux user-land
+    filesystems of varying complexity for use with the
+    virtualizer. Note that there are several ways to create
+    filesystems and this note doesn't cover all possibilities.
+
+    The default virtualizer release contains an empty filesystem
+    stub located at:
+
+    arm-virtualizer-v2_2-160212/bootwrapper/payload/fsimg
+
+    A build using this stub doesn't contain a functional
+    filesystem that the Linux kernel image can use. fsimg can be
+    replaced with a suitable filesystem image but with the
+    following constraints:
+
+    1.  Compressed or uncompressed cpio archives are supported.
+
+    2.  The image size is limited to ~200 MB.
+
+    The size restriction implies that only very 'lean'
+    filesystems such as busybox <http://www.busybox.net/> may be
+    used. While busybox presents a minimal but robust command
+    line environment, quite often a more conventional
+    desktop-like environment with window management on top of an
+    X server is required in order to run web browsers etc.
+
+    In this note, we illustrate a method to use a larger (~2GB) filesystem image
+    that can be used with the ARM FastModels MMC emulation. Note that the MMC
+    emulation only supports images that are just under 2GB in size.
+
+    Note that if the MMC route is used, the bootwrapper/payload/fsimg filesystem
+    image will be suppressed and ignored.
+
+    Locating a root filesystem on the MMC emulation allows the Linux kernel to
+    access and use this filesystem.  This is facilitated by indicating the
+    filesystem location to the kernel via the kernel command-line arguments by
+    appending 'root=/dev/mmcblk0' (for a single partition MMC image) to the
+    argument list.
+
+    Note that when using this technique, the fsimg file is ignored.
+
+B   Building and installing a Linux kernel
+
+    A suitable Linux kernel image for use with the virtualizer
+    can be built as follows:
+
+    $ tar -jxf arm-virtualizer-v2_2-160212.tar.bz2
+    $ cd arm-virtualizer-v2_2-160212/bootwrapper
+    $ make clean
+    $ pushd /tmp
+    $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/maz/ael-kernel.git ael-kernel.git
+    $ cd ael-kernel.git
+    $ git checkout -b ael-11.06 origin/ael-11.06
+    $ yes | make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- vexpress-new_defconfig
+    $ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- -j4
+    $ popd
+    $ cp $OLDPWD/arch/arm/boot/Image payload/kernel
+
+    Note that using the vexpress-new_defconfig configuration
+    ensures that the kernel is built with MMC support.
+
+C   Building a suitable root filesystem
+
+    A suitable root filesystem can be built using Ubuntu Linux's rootstock utility
+    <https://wiki.ubuntu.com/ARM/RootfsFromScratch> as follows:
+
+    $ sudo apt-get install rootstock
+    $ sudo rootstock --fqdn ubuntu --login ubuntu --password ubuntu --imagesize 2040M --seed lxde,gdm --notarball
+    $ mv qemu-armel-*.img mmc.img
+
+    Note that the complete filesystem build will take ~30
+    minutes. On boot, the username and password are both 'ubuntu'.
+
+    The rootstock invocation above will produce a root filesystem containing an
+    LXDE desktop <http://lxde.org/> that has a Firefox browser.
+
+D   Modifying the kernel command line to support the MMC image
+
+    The virtualizer build system and the mxscripts that are used for launching
+    the ARM FastModel require modifications to support the MMC image.
+
+    The build system modification is to change the Linux kernel command line
+    arguments to make the kernel aware of the location of the root filesystem.
+    The command line should contain the string 'root=/dev/mmcblk0'.
+
+    To make this modification, edit the file bootwrapper/Makefile and change the
+    BOOTARGS specification on line 42 from:
+
+    BOOTARGS=mem=255M console=ttyAMA0,115200 migration_cost=500
+    cachepolicy=writealloc
+
+    to
+
+    BOOTARGS=root=/dev/mmcblk0 mem=255M console=ttyAMA0,115200
+    migration_cost=500 cachepolicy=writealloc
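+
+    The same edit can be scripted. The following is a minimal Python
+    sketch and not part of the release. It assumes that the BOOTARGS
+    assignment sits on a single line beginning with 'BOOTARGS=', as
+    shown above. The helper name 'add_root_device' is illustrative.
+

```python
def add_root_device(makefile_text, root_device="/dev/mmcblk0"):
    """Return makefile_text with root=<root_device> prepended to BOOTARGS.

    Lines that already contain a root= argument are left untouched, so
    applying the function twice gives the same result as applying it once.
    """
    prefix = "BOOTARGS="
    out = []
    for line in makefile_text.splitlines(keepends=True):
        if line.startswith(prefix) and "root=" not in line:
            line = prefix + "root=" + root_device + " " + line[len(prefix):]
        out.append(line)
    return "".join(out)


before = ("BOOTARGS=mem=255M console=ttyAMA0,115200 "
          "migration_cost=500 cachepolicy=writealloc\n")
print(add_root_device(before), end="")
```

+
+    To apply it, read bootwrapper/Makefile into a string, pass it through
+    the function and write the result back.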
+
+    The ARM FastModel mxscript modification is to get the FastModel to use the
+    mmc.img file created in step C above with the MMC emulation.
+
+    To make this modification, uncomment the 'string mmcimage=' line (line 42)
+    in the big-little-MP<x>.mxscript file and provide the complete path to the
+    mmc.img file generated in step C above.
+
+E   Building the virtualizer
+
+    $ cd bootwrapper
+    $ make clean && make
+
+F   Launching the ARM FastModel
+
+    $ modeldebugger -s big-little-MP<x>.mxscript
+
+    where <x> is 1 or 4, depending on whether this is an MP1 or
+    an MP4 build.
+
+G   Known limitations
+
+    None.