diff --git a/.circleci/config.yml b/.circleci/config.yml index 7dcbc80957..9c6e9f80d8 100644 --- a/.circleci/config.yml +++ b/.circleci/config.yml @@ -308,6 +308,7 @@ jobs: tools-version: "esp-tools" group-key: "group-accels" project-key: "chipyard-hwacha" + timeout: "30m" chipyard-gemmini-run-tests: executor: main-env steps: diff --git a/docs/Advanced-Concepts/Chip-Communication.rst b/docs/Advanced-Concepts/Chip-Communication.rst index 50c5ac9a1d..6e8d2c0e5e 100644 --- a/docs/Advanced-Concepts/Chip-Communication.rst +++ b/docs/Advanced-Concepts/Chip-Communication.rst @@ -7,7 +7,7 @@ There are two types of DUTs that can be made: `tethered` or `standalone` DUTs. A `tethered` DUT is where a host computer (or just host) must send transactions to the DUT to bringup a program. This differs from a `standalone` DUT that can bringup itself (has its own bootrom, loads programs itself, etc). An example of a tethered DUT is a Chipyard simulation where the host loads the test program into the DUTs memory and signals to the DUT that the program is ready to run. -An example of a standalone DUT is a Chipyard simulation where a program can be loaded from an SDCard by default. +An example of a standalone DUT is a Chipyard simulation where a program can be loaded from an SDCard out of reset. In this section, we mainly describe how to communicate to tethered DUTs. There are two ways the host (otherwise known as the outside world) can communicate with a tethered Chipyard DUT: @@ -45,33 +45,21 @@ Using the Tethered Serial Interface (TSI) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ By default, Chipyard uses the Tethered Serial Interface (TSI) to communicate with the DUT. -TSI protocol is an implementation of HTIF that is used to send commands to the -RISC-V DUT. These TSI commands are simple R/W commands -that are able to probe the DUT's memory space. 
During simulation, the host sends TSI commands to a -simulation stub called ``SimSerial`` (C++ class) that resides in a ``SimSerial`` Verilog module -(both are located in the ``generators/testchipip`` project). This ``SimSerial`` Verilog module then -sends the TSI command recieved by the simulation stub into the DUT which then converts the TSI -command into a TileLink request. This conversion is done by the ``SerialAdapter`` module -(located in the ``generators/testchipip`` project). In simulation, FESVR -resets the DUT, writes into memory the test program, and indicates to the DUT to start the program -through an interrupt (see :ref:`customization/Boot-Process:Chipyard Boot Process`). Using TSI is currently the fastest -mechanism to communicate with the DUT in simulation. - -In the case of a chip tapeout bringup, TSI commands can be sent over a custom communication -medium to communicate with the chip. For example, some Berkeley tapeouts have a FPGA -with a RISC-V soft-core that runs FESVR. The FESVR on the soft-core sends TSI commands -to a TSI-to-TileLink converter living on the FPGA (i.e. ``SerialAdapter``). After the transaction is -converted to TileLink, the ``TLSerdesser`` (located in ``generators/testchipip``) serializes the -transaction and sends it to the chip (this ``TLSerdesser`` is sometimes also referred to as a -serial-link or serdes). Once the serialized transaction is received on the -chip, it is deserialized and masters a bus on the chip. The following image shows this flow: - -.. image:: ../_static/images/chip-bringup.png - -.. note:: - The ``TLSerdesser`` can also be used as a slave (client), so it can sink memory requests from the chip - and connect to off-chip backing memory. Or in other words, ``TLSerdesser`` creates a bi-directional TileLink - interface. +TSI protocol is an implementation of HTIF that is used to send commands to the RISC-V DUT. +These TSI commands are simple R/W commands that are able to access the DUT's memory space. 
+During simulation, the host sends TSI commands to a simulation stub in the test harness called ``SimSerial`` +(C++ class) that resides in a ``SimSerial`` Verilog module (both are located in the ``generators/testchipip`` +project). +This ``SimSerial`` Verilog module then sends the TSI command received by the simulation stub +to an adapter that converts the TSI command into a TileLink request. +This conversion is done by the ``SerialAdapter`` module (located in the ``generators/testchipip`` project). +After the transaction is converted to TileLink, the ``TLSerdesser`` (located in ``generators/testchipip``) serializes the +transaction and sends it to the chip (this ``TLSerdesser`` is sometimes also referred to as a digital serial-link or SerDes). +Once the serialized transaction is received on the chip, it is deserialized and masters a TileLink bus on the chip +which handles the request. +In simulation, FESVR resets the DUT, writes the test program into memory, and indicates to the DUT to start the program +through an interrupt (see :ref:`customization/Boot-Process:Chipyard Boot Process`). +Using TSI is currently the fastest mechanism to communicate with the DUT in simulation (compared to DMI/JTAG) and is also used by FireSim. Using the Debug Module Interface (DMI) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -90,14 +78,14 @@ command into a TileLink request. This conversion is done by the DTM named ``Debu When the DTM receives the program to load, it starts to write the binary byte-wise into memory. This is considerably slower than the TSI protocol communication pipeline (i.e. ``SimSerial``/``SerialAdapter``/TileLink) which directly writes the program binary to memory. -Thus, Chipyard removes the DTM by default in favor of the TSI protocol for DUT communication. Starting the TSI or DMI Simulation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -All default Chipyard configurations use TSI to communicate between the simulation and the simulated SoC/DUT. 
Hence, when running a -software RTL simulation, as is indicated in the :ref:`simulation/Software-RTL-Simulation:Software RTL Simulation` section, you are in-fact using TSI to communicate with the DUT. As a -reminder, to run a software RTL simulation, run: +All default Chipyard configurations use TSI to communicate between the simulation and the simulated SoC/DUT. +Hence, when running a software RTL simulation, as is indicated in the +:ref:`simulation/Software-RTL-Simulation:Software RTL Simulation` section, you are in-fact using TSI to communicate with the DUT. +As a reminder, to run a software RTL simulation, run: .. code-block:: bash @@ -105,11 +93,10 @@ reminder, to run a software RTL simulation, run: # or cd sims/vcs - make CONFIG=LargeBoomConfig run-asm-tests - -FireSim FPGA-accelerated simulations use TSI by default as well. + make CONFIG=RocketConfig run-asm-tests -If you would like to build and simulate a Chipyard configuration with a DTM configured for DMI communication, then you must tie-off the TSI interface, and instantiate the `SimDTM`. Note that we use `WithTiedOffSerial ++ WithSimDebug` instead of `WithTiedOffDebug ++ WithSimSerial`. +If you would like to build and simulate a Chipyard configuration with a DTM configured for DMI communication, +then you must tie-off the serial-link interface, and instantiate the `SimDTM`. .. literalinclude:: ../../generators/chipyard/src/main/scala/config/RocketConfigs.scala :language: scala @@ -129,14 +116,110 @@ Then you can run simulations with the new DMI-enabled top-level and test-harness Using the JTAG Interface ------------------------ -The main way to use JTAG with a Rocket Chip based system is to instantiate the Debug Transfer Module (DTM) -and configure it to use a JTAG interface. The default Chipyard designs instantiate the DTM and configure it -to use JTAG. You may attach OpenOCD and GDB to any of the default JTAG-enabled designs. +Another way to interface with the DUT is to use JTAG. 
+Similar to the :ref:`Advanced-Concepts/Chip-Communication:Using the Debug Module interface (DMI)` section, in order to use the JTAG protocol, +the DUT needs to contain a Debug Transfer Module (DTM) configured to use JTAG instead of DMI. +Once the JTAG port is exposed, the host can communicate over JTAG to the DUT through a simulation stub +called ``SimJTAG`` (C++ class) that resides in a ``SimJTAG`` Verilog module (both reside in the ``generators/rocket-chip`` project). +This simulation stub creates a socket that OpenOCD and GDB can connect to when the simulation is running. +The default Chipyard designs instantiate the DTM configured to use JTAG (i.e. ``RocketConfig``). + +.. note:: + As mentioned, default Chipyard designs are enabled with JTAG. + However, they also use TSI/Serialized-TL with FESVR in case the JTAG interface isn't used. + This allows users to choose how to communicate with the DUT (use TSI or JTAG). Debugging with JTAG -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~ + +Roughly the steps to debug with JTAG in simulation are as follows: + +1. Build a Chipyard JTAG-enabled RTL design. Remember default Chipyard designs are JTAG ready. + +.. code-block:: bash + + cd sims/verilator + # or + cd sims/vcs + + make CONFIG=RocketConfig + +2. Run the simulation with remote bit-bang enabled. Since we hope to load/run the binary using JTAG, + we can pass ``none`` as a binary (prevents FESVR from loading the program). (Adapted from: https://github.com/chipsalliance/rocket-chip#3-launch-the-emulator) + +.. code-block:: bash + + # note: this uses Chipyard make invocation to run the simulation to properly wrap the simulation args + make CONFIG=RocketConfig BINARY=none SIM_FLAGS="+jtag_rbb_enable=1 --rbb-port=9823" run-binary -Please refer to the following resources on how to debug with JTAG. +3. `Follow the instructions here to connect to the simulation using OpenOCD + GDB. `__ + +.. 
note:: + This section was adapted from the instructions in Rocket Chip and riscv-isa-sim. For more information refer + to that documentation: `Rocket Chip GDB Docs `__, + `riscv-isa-sim GDB Docs `__ + +Example Test Chip Bringup Communication +--------------------------------------- + +Intro to Typical Chipyard Test Chip +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Most, if not all, Chipyard configurations are tethered using TSI (over a serial-link) and have access +to external memory through an AXI port (backing AXI memory). +The following image shows the DUT with this set of default signals: + +.. image:: ../_static/images/default-chipyard-config-communication.png + +In this setup, the serial-link is connected to the TSI/FESVR peripherals while the AXI port is connected +to a simulated AXI memory. +However, AXI ports tend to have many signals, and thus wires, associated with them, so instead of creating an AXI port off the DUT, +one can send the memory transactions over the bi-directional serial-link (``TLSerdesser``) so that the main +interface to the DUT is the serial-link (which has comparatively fewer signals than an AXI port). +This new setup (shown below) is a typical Chipyard test chip setup: + +.. image:: ../_static/images/bringup-chipyard-config-communication.png + +Simulation Setup of the Example Test Chip +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To test this type of configuration (TSI/memory transactions over the serial-link), most of the same TSI collateral +would be used. +The main difference is that the TileLink-to-AXI converters and simulated AXI memory reside on the other side of the +serial-link. + +.. image:: ../_static/images/chip-bringup-simulation.png + +.. note:: + Here the simulated AXI memory and the converters can be in a different clock domain in the test harness + than the reference clock of the DUT. + For example, the DUT can be clocked at 3.2GHz while the simulated AXI memory can be clocked at 1GHz. 
+ This is handled in the harness binder that instantiates the TSI collateral, TL-to-AXI converters, + and simulated AXI memory. + See :ref:`Advanced-Concepts/Harness-Clocks:Creating Clocks in the Test Harness` on how to generate a clock + in a harness binder. + +This type of simulation setup is done in the following multi-clock configuration: + +.. literalinclude:: ../../generators/chipyard/src/main/scala/config/RocketConfigs.scala + :language: scala + :start-after: DOC include start: MulticlockAXIOverSerialConfig + :end-before: DOC include end: MulticlockAXIOverSerialConfig + +Bringup Setup of the Example Test Chip after Tapeout +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Assuming this example test chip is taped out and now ready to be tested, we can communicate with the chip using this serial-link. +For example, a common test setup used at Berkeley to evaluate Chipyard-based test-chips includes an FPGA running a RISC-V soft-core that is able to speak to the DUT (over an FMC). +This RISC-V soft-core would serve as the host of the test that will run on the DUT. +To do so, the RISC-V soft-core runs FESVR and sends TSI commands to a ``SerialAdapter`` / ``TLSerdesser`` programmed on the FPGA. +Once the commands are converted to serialized TileLink, they can be sent over some medium to the DUT +(like an FMC cable or a set of wires connecting FPGA outputs to the DUT board). +Similar to simulation, if the chip requests off-chip memory, it can then send the transaction back over the serial-link. +Then the request can be serviced by the FPGA DRAM. +The following image shows this flow: + +.. image:: ../_static/images/chip-bringup.png -* https://github.com/chipsalliance/rocket-chip#-debugging-with-gdb -* https://github.com/riscv/riscv-isa-sim#debugging-with-gdb +In fact, this exact type of bringup setup is what the following section discusses: +:ref:`Prototyping/VCU118:Introduction to the Bringup Platform`. 
diff --git a/docs/Advanced-Concepts/Harness-Clocks.rst b/docs/Advanced-Concepts/Harness-Clocks.rst new file mode 100644 index 0000000000..ef2249749e --- /dev/null +++ b/docs/Advanced-Concepts/Harness-Clocks.rst @@ -0,0 +1,38 @@ +.. _harness-clocks: + +Creating Clocks in the Test Harness +=================================== + +Chipyard currently allows the SoC design (everything under ``ChipTop``) to +have independent clock domains through diplomacy. +This implies that some reference clock enters the ``ChipTop`` and then is divided down into +separate clock domains. +From the perspective of the ``TestHarness`` module, the ``ChipTop`` clock and reset are +provided by a clock and reset called ``buildtopClock`` and ``buildtopReset``. +In the default case, ``buildtopClock`` and ``buildtopReset`` are directly wired to the +clock and reset IOs of the ``TestHarness`` module. +However, the ``TestHarness`` has the ability to generate a standalone clock and reset signal +that is separate from the reference clock/reset of ``ChipTop``. +This allows harness components (including harness binders) to "request" a clock +for a new clock domain. +This is useful for simulating systems in which modules in the harness have independent clock domains +from the DUT. + +Requests for a harness clock are handled by the ``HarnessClockInstantiator`` class in ``generators/chipyard/src/main/scala/TestHarness.scala``. +This class is accessed in harness components by referencing the Rocket Chip parameters key ``p(HarnessClockInstantiatorKey)``. +Then you can request a clock and synchronized reset at a particular frequency by invoking the ``requestClockBundle`` function. +Take the following example: + +.. 
literalinclude:: ../../generators/chipyard/src/main/scala/HarnessBinders.scala + :language: scala + :start-after: DOC include start: HarnessClockInstantiatorEx + :end-before: DOC include end: HarnessClockInstantiatorEx + +Here you can see the ``p(HarnessClockInstantiatorKey)`` is used to request a clock and reset at ``memFreq`` frequency. + +.. note:: + In the case that the reference clock entering ``ChipTop`` is not the overall reference clock of the simulation + (i.e. the clock/reset coming into the ``TestHarness`` module), the ``buildtopClock`` and ``buildtopReset`` can + differ from the implicit ``TestHarness`` clock and reset. For example, if the ``ChipTop`` reference is 500MHz but an + extra harness clock is requested at 1GHz, the ``TestHarness`` implicit clock/reset will be at 1GHz while the ``buildtopClock`` + and ``buildtopReset`` will be at 500MHz. diff --git a/docs/Advanced-Concepts/index.rst b/docs/Advanced-Concepts/index.rst index 12b1271629..b294d11b52 100644 --- a/docs/Advanced-Concepts/index.rst +++ b/docs/Advanced-Concepts/index.rst @@ -14,4 +14,5 @@ They expect you to know about Chisel, Parameters, configs, etc. 
Debugging-BOOM Resources CDEs + Harness-Clocks diff --git a/docs/_static/images/bringup-chipyard-config-communication.png b/docs/_static/images/bringup-chipyard-config-communication.png new file mode 100644 index 0000000000..9da84dcfe4 Binary files /dev/null and b/docs/_static/images/bringup-chipyard-config-communication.png differ diff --git a/docs/_static/images/chip-bringup-simulation.png b/docs/_static/images/chip-bringup-simulation.png new file mode 100644 index 0000000000..007f808b98 Binary files /dev/null and b/docs/_static/images/chip-bringup-simulation.png differ diff --git a/docs/_static/images/chip-bringup.png b/docs/_static/images/chip-bringup.png index 4e8d060271..07c80f0528 100644 Binary files a/docs/_static/images/chip-bringup.png and b/docs/_static/images/chip-bringup.png differ diff --git a/docs/_static/images/chip-communication.png b/docs/_static/images/chip-communication.png index 4492bd8b81..8c0f8a1daf 100644 Binary files a/docs/_static/images/chip-communication.png and b/docs/_static/images/chip-communication.png differ diff --git a/docs/_static/images/default-chipyard-config-communication.png b/docs/_static/images/default-chipyard-config-communication.png new file mode 100644 index 0000000000..9763e1d44f Binary files /dev/null and b/docs/_static/images/default-chipyard-config-communication.png differ diff --git a/fpga/src/main/scala/arty/HarnessBinders.scala b/fpga/src/main/scala/arty/HarnessBinders.scala index ef7b180595..3581ae3d4c 100644 --- a/fpga/src/main/scala/arty/HarnessBinders.scala +++ b/fpga/src/main/scala/arty/HarnessBinders.scala @@ -32,7 +32,7 @@ class WithArtyJTAGHarnessBinder extends OverrideHarnessBinder({ (system: HasPeripheryDebug, th: ArtyFPGATestHarness, ports: Seq[Data]) => { ports.map { case j: JTAGIO => - withClockAndReset(th.harnessClock, th.hReset) { + withClockAndReset(th.buildtopClock, th.hReset) { val io_jtag = Wire(new JTAGPins(() => new BasePin(), false)).suggestName("jtag") JTAGPinsFromPort(io_jtag, j) diff 
--git a/fpga/src/main/scala/arty/TestHarness.scala b/fpga/src/main/scala/arty/TestHarness.scala index 503d2de60a..db7ddc0131 100644 --- a/fpga/src/main/scala/arty/TestHarness.scala +++ b/fpga/src/main/scala/arty/TestHarness.scala @@ -27,8 +27,8 @@ class ArtyFPGATestHarness(override implicit val p: Parameters) extends ArtyShell val dut = Module(lazyDut.module) } - val harnessClock = clock_32MHz - val harnessReset = hReset + val buildtopClock = clock_32MHz + val buildtopReset = hReset val success = false.B val dutReset = dReset diff --git a/fpga/src/main/scala/vcu118/TestHarness.scala b/fpga/src/main/scala/vcu118/TestHarness.scala index 5002817fc9..45afe7f757 100644 --- a/fpga/src/main/scala/vcu118/TestHarness.scala +++ b/fpga/src/main/scala/vcu118/TestHarness.scala @@ -121,13 +121,13 @@ class VCU118FPGATestHarnessImp(_outer: VCU118FPGATestHarness) extends LazyRawMod val hReset = Wire(Reset()) hReset := _outer.dutClock.in.head._1.reset - val harnessClock = _outer.dutClock.in.head._1.clock - val harnessReset = WireInit(hReset) + val buildtopClock = _outer.dutClock.in.head._1.clock + val buildtopReset = WireInit(hReset) val dutReset = hReset.asAsyncReset val success = false.B - childClock := harnessClock - childReset := harnessReset + childClock := buildtopClock + childReset := buildtopReset // harness binders are non-lazy _outer.topDesign match { case d: HasTestHarnessFunctions => diff --git a/generators/chipyard/src/main/scala/ChipTop.scala b/generators/chipyard/src/main/scala/ChipTop.scala index 61a043b6ed..8d05d10e64 100644 --- a/generators/chipyard/src/main/scala/ChipTop.scala +++ b/generators/chipyard/src/main/scala/ChipTop.scala @@ -15,6 +15,9 @@ import barstools.iocell.chisel._ case object BuildSystem extends Field[Parameters => LazyModule]((p: Parameters) => new DigitalTop()(p)) +trait HasReferenceClockFreq { + def refClockFreqMHz: Double +} /** * The base class used for building chips. 
This constructor instantiates a module specified by the BuildSystem parameter, @@ -24,15 +27,16 @@ case object BuildSystem extends Field[Parameters => LazyModule]((p: Parameters) */ class ChipTop(implicit p: Parameters) extends LazyModule with BindingScope - with HasTestHarnessFunctions with HasIOBinders { + with HasTestHarnessFunctions with HasReferenceClockFreq with HasIOBinders { // The system module specified by BuildSystem lazy val lazySystem = LazyModule(p(BuildSystem)(p)).suggestName("system") - // The implicitClockSinkNode provides the implicit clock and reset for the System + // The implicitClockSinkNode provides the implicit clock and reset for the system (connected by clocking scheme) val implicitClockSinkNode = ClockSinkNode(Seq(ClockSinkParameters(name = Some("implicit_clock")))) // Generate Clocks and Reset - p(ClockingSchemeKey)(this) + val mvRefClkFreq = p(ClockingSchemeKey)(this) + def refClockFreqMHz: Double = mvRefClkFreq.getWrappedValue // NOTE: Making this a LazyRawModule is moderately dangerous, as anonymous children // of ChipTop (ex: ClockGroup) do not receive clock or reset. 
diff --git a/generators/chipyard/src/main/scala/Clocks.scala b/generators/chipyard/src/main/scala/Clocks.scala index e0f2fc72e6..b5553347bd 100644 --- a/generators/chipyard/src/main/scala/Clocks.scala +++ b/generators/chipyard/src/main/scala/Clocks.scala @@ -7,7 +7,7 @@ import scala.collection.mutable.{ArrayBuffer} import freechips.rocketchip.prci._ import freechips.rocketchip.subsystem.{BaseSubsystem, SubsystemDriveAsyncClockGroupsKey, InstantiatesTiles} import freechips.rocketchip.config.{Parameters, Field, Config} -import freechips.rocketchip.diplomacy.{OutwardNodeHandle, InModuleBody, LazyModule} +import freechips.rocketchip.diplomacy.{ModuleValue, OutwardNodeHandle, InModuleBody, LazyModule} import freechips.rocketchip.util.{ResetCatchAndSync} import barstools.iocell.chisel._ @@ -39,7 +39,7 @@ object GenerateReset { } -case object ClockingSchemeKey extends Field[ChipTop => Unit](ClockingSchemeGenerators.dividerOnlyClockGenerator) +case object ClockingSchemeKey extends Field[ChipTop => ModuleValue[Double]](ClockingSchemeGenerators.dividerOnlyClockGenerator) /* * This is a Seq of assignment functions, that accept a clock name and return an optional frequency. * Functions that appear later in this seq have higher precedence that earlier ones. 
@@ -60,7 +60,7 @@ class ClockNameContainsAssignment(name: String, fMHz: Double) extends Config((si }) object ClockingSchemeGenerators { - val dividerOnlyClockGenerator: ChipTop => Unit = { chiptop => + val dividerOnlyClockGenerator: ChipTop => ModuleValue[Double] = { chiptop => implicit val p = chiptop.p // Requires existence of undriven asyncClockGroups in subsystem @@ -77,22 +77,25 @@ object ClockingSchemeGenerators { val resetSetterResetProvider = resetSetter.map(_.tileResetProviderNode).getOrElse(ClockGroupEphemeralNode()) val aggregator = LazyModule(new ClockGroupAggregator("allClocks")).node + // provides the implicit clock to the system (chiptop.implicitClockSinkNode := ClockGroup() := aggregator) + // provides the system clock (ex. the bus clocks) (systemAsyncClockGroup :*= ClockGroupNamePrefixer() :*= aggregator) val referenceClockSource = ClockSourceNode(Seq(ClockSourceParameters())) + val dividerOnlyClkGenerator = LazyModule(new DividerOnlyClockGenerator("buildTopClockGenerator")) + // provides all the divided clocks (from the top-level clock) (aggregator := ClockGroupFrequencySpecifier(p(ClockFrequencyAssignersKey), p(DefaultClockFrequencyKey)) := ClockGroupResetSynchronizer() := resetSetterResetProvider - := DividerOnlyClockGenerator() + := dividerOnlyClkGenerator.node := referenceClockSource) - val asyncResetBroadcast = FixedClockBroadcast(None) resetSetter.foreach(_.asyncResetSinkNode := asyncResetBroadcast) val asyncResetSource = ClockSourceNode(Seq(ClockSourceParameters())) @@ -115,8 +118,11 @@ object ClockingSchemeGenerators { } chiptop.harnessFunctions += ((th: HasHarnessSignalReferences) => { - clock_io := th.harnessClock + clock_io := th.buildtopClock Nil }) + + // return the reference frequency + dividerOnlyClkGenerator.module.referenceFreq } } } diff --git a/generators/chipyard/src/main/scala/ConfigFragments.scala b/generators/chipyard/src/main/scala/ConfigFragments.scala index cc245bafb6..a36285ebeb 100644 --- 
a/generators/chipyard/src/main/scala/ConfigFragments.scala +++ b/generators/chipyard/src/main/scala/ConfigFragments.scala @@ -215,7 +215,31 @@ class WithTLBackingMemory extends Config((site, here, up) => { case ExtTLMem => up(ExtMem, site) // enable TL backing memory }) -class WithTileFrequency(fMHz: Double) extends ClockNameContainsAssignment("tile", fMHz) +class WithSerialTLBackingMemory extends Config((site, here, up) => { + case ExtMem => None + case SerialTLKey => up(SerialTLKey, site).map { k => k.copy( + memParams = { + val memPortParams = up(ExtMem, site).get + require(memPortParams.nMemoryChannels == 1) + memPortParams.master + }, + isMemoryDevice = true + )} +}) + +/** + * Mixins to define either a specific tile frequency for a single hart or all harts + * + * @param fMHz Frequency in MHz of the tile or all tiles + * @param hartId Optional hartid to assign the frequency to (if unspecified default to all harts) + */ +class WithTileFrequency(fMHz: Double, hartId: Option[Int] = None) extends ClockNameContainsAssignment({ + hartId match { + case Some(id) => s"tile_$id" + case None => "tile" + } + }, + fMHz) class WithPeripheryBusFrequencyAsDefault extends Config((site, here, up) => { case DefaultClockFrequencyKey => (site(PeripheryBusKey).dtsFrequency.get / (1000 * 1000)).toDouble diff --git a/generators/chipyard/src/main/scala/HarnessBinders.scala b/generators/chipyard/src/main/scala/HarnessBinders.scala index 9159873d97..b87659f431 100644 --- a/generators/chipyard/src/main/scala/HarnessBinders.scala +++ b/generators/chipyard/src/main/scala/HarnessBinders.scala @@ -21,7 +21,7 @@ import barstools.iocell.chisel._ import testchipip._ -import chipyard.HasHarnessSignalReferences +import chipyard.{HasHarnessSignalReferences, HarnessClockInstantiatorKey} import chipyard.iobinders.GetSystemParameters import tracegen.{TraceGenSystemModuleImp} @@ -90,21 +90,21 @@ class WithUARTAdapter extends OverrideHarnessBinder({ class WithSimSPIFlashModel(rdOnly: Boolean = true) 
extends OverrideHarnessBinder({ (system: HasPeripherySPIFlashModuleImp, th: HasHarnessSignalReferences, ports: Seq[SPIChipIO]) => { - SimSPIFlashModel.connect(ports, th.harnessReset, rdOnly)(system.p) + SimSPIFlashModel.connect(ports, th.buildtopReset, rdOnly)(system.p) } }) class WithSimBlockDevice extends OverrideHarnessBinder({ (system: CanHavePeripheryBlockDevice, th: HasHarnessSignalReferences, ports: Seq[ClockedIO[BlockDeviceIO]]) => { implicit val p: Parameters = GetSystemParameters(system) - ports.map { b => SimBlockDevice.connect(b.clock, th.harnessReset.asBool, Some(b.bits)) } + ports.map { b => SimBlockDevice.connect(b.clock, th.buildtopReset.asBool, Some(b.bits)) } } }) class WithBlockDeviceModel extends OverrideHarnessBinder({ (system: CanHavePeripheryBlockDevice, th: HasHarnessSignalReferences, ports: Seq[ClockedIO[BlockDeviceIO]]) => { implicit val p: Parameters = GetSystemParameters(system) - ports.map { b => withClockAndReset(b.clock, th.harnessReset) { BlockDeviceModel.connect(Some(b.bits)) } } + ports.map { b => withClockAndReset(b.clock, th.buildtopReset) { BlockDeviceModel.connect(Some(b.bits)) } } } }) @@ -112,7 +112,7 @@ class WithLoopbackNIC extends OverrideHarnessBinder({ (system: CanHavePeripheryIceNIC, th: HasHarnessSignalReferences, ports: Seq[ClockedIO[NICIOvonly]]) => { implicit val p: Parameters = GetSystemParameters(system) ports.map { n => - withClockAndReset(n.clock, th.harnessReset) { + withClockAndReset(n.clock, th.buildtopReset) { NicLoopback.connect(Some(n.bits), p(NICKey)) } } @@ -122,7 +122,7 @@ class WithLoopbackNIC extends OverrideHarnessBinder({ class WithSimNetwork extends OverrideHarnessBinder({ (system: CanHavePeripheryIceNIC, th: BaseModule with HasHarnessSignalReferences, ports: Seq[ClockedIO[NICIOvonly]]) => { implicit val p: Parameters = GetSystemParameters(system) - ports.map { n => SimNetwork.connect(Some(n.bits), n.clock, th.harnessReset.asBool) } + ports.map { n => SimNetwork.connect(Some(n.bits), n.clock, 
th.buildtopReset.asBool) } } }) @@ -139,6 +139,46 @@ class WithSimAXIMem extends OverrideHarnessBinder({ } }) +class WithSimAXIMemOverSerialTL extends OverrideHarnessBinder({ + (system: CanHavePeripheryTLSerial, th: HasHarnessSignalReferences, ports: Seq[ClockedIO[SerialIO]]) => { + implicit val p = chipyard.iobinders.GetSystemParameters(system) + + p(SerialTLKey).map({ sVal => + require(sVal.axiMemOverSerialTLParams.isDefined) + val axiDomainParams = sVal.axiMemOverSerialTLParams.get + require(sVal.isMemoryDevice) + + val memFreq = axiDomainParams.getMemFrequency(system.asInstanceOf[HasTileLinkLocations]) + + ports.map({ port => +// DOC include start: HarnessClockInstantiatorEx + withClockAndReset(th.buildtopClock, th.buildtopReset) { + val memOverSerialTLClockBundle = p(HarnessClockInstantiatorKey).requestClockBundle("mem_over_serial_tl_clock", memFreq) + val serial_bits = SerialAdapter.asyncQueue(port, th.buildtopClock, th.buildtopReset) + val harnessMultiClockAXIRAM = SerialAdapter.connectHarnessMultiClockAXIRAM( + system.serdesser.get, + serial_bits, + memOverSerialTLClockBundle, + th.buildtopReset) +// DOC include end: HarnessClockInstantiatorEx + val success = SerialAdapter.connectSimSerial(harnessMultiClockAXIRAM.module.io.tsi_ser, th.buildtopClock, th.buildtopReset.asBool) + when (success) { th.success := true.B } + + // connect SimDRAM from the AXI port coming from the harness multi clock axi ram + (harnessMultiClockAXIRAM.mem_axi4 zip harnessMultiClockAXIRAM.memNode.edges.in).map { case (axi_port, edge) => + val memSize = sVal.memParams.size + val lineSize = p(CacheBlockBytes) + val mem = Module(new SimDRAM(memSize, lineSize, BigInt(memFreq.toLong), edge.bundle)).suggestName("simdram") + mem.io.axi <> axi_port.bits + mem.io.clock := axi_port.clock + mem.io.reset := axi_port.reset + } + } + }) + }) + } +}) + class WithBlackBoxSimMem(additionalLatency: Int = 0) extends OverrideHarnessBinder({ (system: CanHaveMasterAXI4MemPort, th: 
HasHarnessSignalReferences, ports: Seq[ClockedAndResetIO[AXI4Bundle]]) => { val p: Parameters = chipyard.iobinders.GetSystemParameters(system) @@ -204,11 +244,11 @@ class WithSimDebug extends OverrideHarnessBinder({ case d: ClockedDMIIO => val dtm_success = WireInit(false.B) when (dtm_success) { th.success := true.B } - val dtm = Module(new SimDTM).connect(th.harnessClock, th.harnessReset.asBool, d, dtm_success) + val dtm = Module(new SimDTM).connect(th.buildtopClock, th.buildtopReset.asBool, d, dtm_success) case j: JTAGIO => val dtm_success = WireInit(false.B) when (dtm_success) { th.success := true.B } - val jtag = Module(new SimJTAG(tickDelay=3)).connect(j, th.harnessClock, th.harnessReset.asBool, ~(th.harnessReset.asBool), dtm_success) + val jtag = Module(new SimJTAG(tickDelay=3)).connect(j, th.buildtopClock, th.buildtopReset.asBool, ~(th.buildtopReset.asBool), dtm_success) } } }) @@ -242,9 +282,9 @@ class WithSerialAdapterTiedOff extends OverrideHarnessBinder({ (system: CanHavePeripheryTLSerial, th: HasHarnessSignalReferences, ports: Seq[ClockedIO[SerialIO]]) => { implicit val p = chipyard.iobinders.GetSystemParameters(system) ports.map({ port => - val bits = SerialAdapter.asyncQueue(port, th.harnessClock, th.harnessReset) - withClockAndReset(th.harnessClock, th.harnessReset) { - val ram = SerialAdapter.connectHarnessRAM(system.serdesser.get, bits, th.harnessReset) + val bits = SerialAdapter.asyncQueue(port, th.buildtopClock, th.buildtopReset) + withClockAndReset(th.buildtopClock, th.buildtopReset) { + val ram = SerialAdapter.connectHarnessRAM(system.serdesser.get, bits, th.buildtopReset) SerialAdapter.tieoff(ram.module.io.tsi_ser) } }) @@ -255,10 +295,10 @@ class WithSimSerial extends OverrideHarnessBinder({ (system: CanHavePeripheryTLSerial, th: HasHarnessSignalReferences, ports: Seq[ClockedIO[SerialIO]]) => { implicit val p = chipyard.iobinders.GetSystemParameters(system) ports.map({ port => - val bits = SerialAdapter.asyncQueue(port, th.harnessClock, 
th.harnessReset) - withClockAndReset(th.harnessClock, th.harnessReset) { - val ram = SerialAdapter.connectHarnessRAM(system.serdesser.get, bits, th.harnessReset) - val success = SerialAdapter.connectSimSerial(ram.module.io.tsi_ser, th.harnessClock, th.harnessReset.asBool) + val bits = SerialAdapter.asyncQueue(port, th.buildtopClock, th.buildtopReset) + withClockAndReset(th.buildtopClock, th.buildtopReset) { + val ram = SerialAdapter.connectHarnessRAM(system.serdesser.get, bits, th.buildtopReset) + val success = SerialAdapter.connectSimSerial(ram.module.io.tsi_ser, th.buildtopClock, th.buildtopReset.asBool) when (success) { th.success := true.B } } }) diff --git a/generators/chipyard/src/main/scala/IOBinders.scala b/generators/chipyard/src/main/scala/IOBinders.scala index ab6ccdf8d2..c55d86e0e3 100644 --- a/generators/chipyard/src/main/scala/IOBinders.scala +++ b/generators/chipyard/src/main/scala/IOBinders.scala @@ -260,7 +260,6 @@ class WithSerialTLIOCells extends OverrideIOBinder({ }).getOrElse((Nil, Nil)) }) - class WithAXI4MemPunchthrough extends OverrideLazyIOBinder({ (system: CanHaveMasterAXI4MemPort) => { implicit val p: Parameters = GetSystemParameters(system) diff --git a/generators/chipyard/src/main/scala/TestHarness.scala b/generators/chipyard/src/main/scala/TestHarness.scala index c638c0813c..ba09e6dcd4 100644 --- a/generators/chipyard/src/main/scala/TestHarness.scala +++ b/generators/chipyard/src/main/scala/TestHarness.scala @@ -1,12 +1,16 @@ package chipyard import chisel3._ -import scala.collection.mutable.{ArrayBuffer} + +import scala.collection.mutable.{ArrayBuffer, LinkedHashMap} import freechips.rocketchip.diplomacy.{LazyModule} import freechips.rocketchip.config.{Field, Parameters} +import freechips.rocketchip.util.{ResetCatchAndSync} +import freechips.rocketchip.prci.{ClockBundle, ClockBundleParameters, ClockSinkParameters, ClockParameters} import chipyard.harness.{ApplyHarnessBinders, HarnessBinders} import chipyard.iobinders.HasIOBinders 
+import chipyard.clocking.{SimplePllConfiguration, ClockDividerN} // ------------------------------- // Chipyard Test Harness @@ -19,26 +23,84 @@ trait HasTestHarnessFunctions { } trait HasHarnessSignalReferences { - def harnessClock: Clock - def harnessReset: Reset + // clock/reset of the chiptop reference clock (can be different than the implicit harness clock/reset) + def buildtopClock: Clock + def buildtopReset: Reset def dutReset: Reset def success: Bool } +class HarnessClockInstantiator { + private val _clockMap: LinkedHashMap[String, (Double, ClockBundle)] = LinkedHashMap.empty + + // request a clock bundle at a particular frequency + def requestClockBundle(name: String, freqRequested: Double): ClockBundle = { + val clockBundle = Wire(new ClockBundle(ClockBundleParameters())) + _clockMap(name) = (freqRequested, clockBundle) + clockBundle + } + + // connect all clock wires specified to a divider only PLL + def instantiateHarnessDividerPLL(refClock: ClockBundle): Unit = { + val sinks = _clockMap.map({ case (name, (freq, bundle)) => + ClockSinkParameters(take=Some(ClockParameters(freqMHz=freq / (1000 * 1000))), name=Some(name)) + }).toSeq + + val pllConfig = new SimplePllConfiguration("harnessDividerOnlyClockGenerator", sinks) + pllConfig.emitSummaries() + + val dividedClocks = LinkedHashMap[Int, Clock]() + def instantiateDivider(div: Int): Clock = { + val divider = Module(new ClockDividerN(div)) + divider.suggestName(s"ClockDivideBy${div}") + divider.io.clk_in := refClock.clock + dividedClocks(div) = divider.io.clk_out + divider.io.clk_out + } + + // connect wires to clock source + for (sinkParams <- sinks) { + // bypass the reference freq. 
(don't create a divider + reset sync) + val (divClock, divReset) = if (sinkParams.take.get.freqMHz != pllConfig.referenceFreqMHz) { + val div = pllConfig.sinkDividerMap(sinkParams) + val divClock = dividedClocks.getOrElse(div, instantiateDivider(div)) + (divClock, ResetCatchAndSync(divClock, refClock.reset.asBool)) + } else { + (refClock.clock, refClock.reset) + } + + _clockMap(sinkParams.name.get)._2.clock := divClock + _clockMap(sinkParams.name.get)._2.reset := divReset + } + } +} + +case object HarnessClockInstantiatorKey extends Field[HarnessClockInstantiator](new HarnessClockInstantiator) + class TestHarness(implicit val p: Parameters) extends Module with HasHarnessSignalReferences { val io = IO(new Bundle { val success = Output(Bool()) }) + val buildtopClock = Wire(Clock()) + val buildtopReset = Wire(Reset()) + val lazyDut = LazyModule(p(BuildTop)(p)).suggestName("chiptop") val dut = Module(lazyDut.module) + io.success := false.B - val harnessClock = clock - val harnessReset = WireInit(reset) - val success = io.success + val freqMHz = lazyDut match { + case d: HasReferenceClockFreq => d.refClockFreqMHz + case _ => p(DefaultClockFrequencyKey) + } + val refClkBundle = p(HarnessClockInstantiatorKey).requestClockBundle("buildtop_reference_clock", freqMHz * (1000 * 1000)) + + buildtopClock := refClkBundle.clock + buildtopReset := WireInit(refClkBundle.reset) + val dutReset = refClkBundle.reset.asAsyncReset - val dutReset = reset.asAsyncReset + val success = io.success lazyDut match { case d: HasTestHarnessFunctions => d.harnessFunctions.foreach(_(this)) @@ -46,5 +108,10 @@ class TestHarness(implicit val p: Parameters) extends Module with HasHarnessSign lazyDut match { case d: HasIOBinders => ApplyHarnessBinders(this, d.lazySystem, d.portMap) } + + val implicitHarnessClockBundle = Wire(new ClockBundle(ClockBundleParameters())) + implicitHarnessClockBundle.clock := clock + implicitHarnessClockBundle.reset := reset + 
p(HarnessClockInstantiatorKey).instantiateHarnessDividerPLL(implicitHarnessClockBundle) } diff --git a/generators/chipyard/src/main/scala/clocking/DividerOnlyClockGenerator.scala b/generators/chipyard/src/main/scala/clocking/DividerOnlyClockGenerator.scala index 2b66619661..b272c80ce3 100644 --- a/generators/chipyard/src/main/scala/clocking/DividerOnlyClockGenerator.scala +++ b/generators/chipyard/src/main/scala/clocking/DividerOnlyClockGenerator.scala @@ -51,7 +51,7 @@ object FrequencyUtils { require(!requestedOutputs.contains(0.0)) val requestedFreqs = requestedOutputs.map(_.freqMHz) val fastestFreq = requestedFreqs.max - require(fastestFreq < maximumAllowableFreqMHz) + require(fastestFreq <= maximumAllowableFreqMHz) val candidateFreqs = Seq.tabulate(Math.ceil(maximumAllowableFreqMHz / fastestFreq).toInt)(i => (i + 1) * fastestFreq) @@ -89,6 +89,7 @@ class SimplePllConfiguration( ElaborationArtefacts.add(s"${name}.freq-summary", summaryString) println(summaryString) } + def referenceSinkParams(): ClockSinkParameters = sinkDividerMap.find(_._2 == 1).get._1 } case class DividerOnlyClockGeneratorNode(pllName: String)(implicit valName: ValName) @@ -144,7 +145,3 @@ class DividerOnlyClockGenerator(pllName: String)(implicit p: Parameters, valName } } } - -object DividerOnlyClockGenerator { - def apply()(implicit p: Parameters, valName: ValName) = LazyModule(new DividerOnlyClockGenerator(valName.name)).node -} diff --git a/generators/chipyard/src/main/scala/config/AbstractConfig.scala b/generators/chipyard/src/main/scala/config/AbstractConfig.scala index a70ae4dfd7..da84bd0501 100644 --- a/generators/chipyard/src/main/scala/config/AbstractConfig.scala +++ b/generators/chipyard/src/main/scala/config/AbstractConfig.scala @@ -54,4 +54,3 @@ class AbstractConfig extends Config( new freechips.rocketchip.subsystem.WithNExtTopInterrupts(0) ++ // no external interrupts new chipyard.WithMulticlockCoherentBusTopology ++ // hierarchical buses including mbus+l2 new 
freechips.rocketchip.system.BaseConfig) // "base" rocketchip system - diff --git a/generators/chipyard/src/main/scala/config/RocketConfigs.scala b/generators/chipyard/src/main/scala/config/RocketConfigs.scala index 7c4de6f5f0..a81c144940 100644 --- a/generators/chipyard/src/main/scala/config/RocketConfigs.scala +++ b/generators/chipyard/src/main/scala/config/RocketConfigs.scala @@ -212,3 +212,26 @@ class LBWIFRocketConfig extends Config( new freechips.rocketchip.subsystem.WithNoMemPort ++ // remove AXI4 backing memory new freechips.rocketchip.subsystem.WithNBigCores(1) ++ new chipyard.config.AbstractConfig) + +// DOC include start: MulticlockAXIOverSerialConfig +class MulticlockAXIOverSerialConfig extends Config( + new chipyard.config.WithSystemBusFrequencyAsDefault ++ + new chipyard.config.WithSystemBusFrequency(250) ++ + new chipyard.config.WithPeripheryBusFrequency(250) ++ + new chipyard.config.WithMemoryBusFrequency(250) ++ + new chipyard.config.WithFrontBusFrequency(50) ++ + new chipyard.config.WithTileFrequency(500, Some(1)) ++ + new chipyard.config.WithTileFrequency(250, Some(0)) ++ + + new chipyard.config.WithFbusToSbusCrossingType(AsynchronousCrossing()) ++ + new testchipip.WithAsynchronousSerialSlaveCrossing ++ + new freechips.rocketchip.subsystem.WithAsynchronousRocketTiles( + AsynchronousCrossing().depth, + AsynchronousCrossing().sourceSync) ++ + + new chipyard.harness.WithSimAXIMemOverSerialTL ++ // add SimDRAM DRAM model for axi4 backing memory over the SerDes link, if axi4 mem is enabled + new chipyard.config.WithSerialTLBackingMemory ++ // remove axi4 mem port in favor of SerialTL memory + + new freechips.rocketchip.subsystem.WithNBigCores(2) ++ + new chipyard.config.AbstractConfig) +// DOC include end: MulticlockAXIOverSerialConfig diff --git a/generators/firechip/src/main/scala/BridgeBinders.scala b/generators/firechip/src/main/scala/BridgeBinders.scala index 95f0bf3b4e..bf1ff6805d 100644 --- 
a/generators/firechip/src/main/scala/BridgeBinders.scala +++ b/generators/firechip/src/main/scala/BridgeBinders.scala @@ -12,6 +12,8 @@ import freechips.rocketchip.devices.debug.{Debug, HasPeripheryDebugModuleImp} import freechips.rocketchip.amba.axi4.{AXI4Bundle} import freechips.rocketchip.subsystem._ import freechips.rocketchip.tile.{RocketTile} +import freechips.rocketchip.prci.{ClockBundle, ClockBundleParameters} +import freechips.rocketchip.util.{ResetCatchAndSync} import sifive.blocks.devices.uart._ import testchipip._ @@ -70,11 +72,11 @@ class WithSerialBridge extends OverrideHarnessBinder({ (system: CanHavePeripheryTLSerial, th: FireSim, ports: Seq[ClockedIO[SerialIO]]) => { ports.map { port => implicit val p = GetSystemParameters(system) - val bits = SerialAdapter.asyncQueue(port, th.harnessClock, th.harnessReset) - val ram = withClockAndReset(th.harnessClock, th.harnessReset) { - SerialAdapter.connectHarnessRAM(system.serdesser.get, bits, th.harnessReset) + val bits = SerialAdapter.asyncQueue(port, th.buildtopClock, th.buildtopReset) + val ram = withClockAndReset(th.buildtopClock, th.buildtopReset) { + SerialAdapter.connectHarnessRAM(system.serdesser.get, bits, th.buildtopReset) } - SerialBridge(th.harnessClock, ram.module.io.tsi_ser, p(ExtMem).map(_ => MainMemoryConsts.globalName)) + SerialBridge(th.buildtopClock, ram.module.io.tsi_ser, p(ExtMem).map(_ => MainMemoryConsts.globalName)) } Nil } @@ -101,7 +103,56 @@ class WithUARTBridge extends OverrideHarnessBinder({ class WithBlockDeviceBridge extends OverrideHarnessBinder({ (system: CanHavePeripheryBlockDevice, th: FireSim, ports: Seq[ClockedIO[BlockDeviceIO]]) => { implicit val p: Parameters = GetSystemParameters(system) - ports.map { b => BlockDevBridge(b.clock, b.bits, th.harnessReset.toBool) } + ports.map { b => BlockDevBridge(b.clock, b.bits, th.buildtopReset.asBool) } + Nil + } +}) + +class WithAXIOverSerialTLCombinedBridges extends OverrideHarnessBinder({ + (system: CanHavePeripheryTLSerial, th: 
FireSim, ports: Seq[ClockedIO[SerialIO]]) => { + implicit val p = GetSystemParameters(system) + + p(SerialTLKey).map({ sVal => + require(sVal.axiMemOverSerialTLParams.isDefined) + val axiDomainParams = sVal.axiMemOverSerialTLParams.get + require(sVal.isMemoryDevice) + + val memFreq = axiDomainParams.getMemFrequency(system.asInstanceOf[HasTileLinkLocations]) + + ports.map({ port => + val axiClock = p(ClockBridgeInstantiatorKey).requestClock("mem_over_serial_tl_clock", memFreq) + val axiClockBundle = Wire(new ClockBundle(ClockBundleParameters())) + axiClockBundle.clock := axiClock + axiClockBundle.reset := ResetCatchAndSync(axiClock, th.buildtopReset.asBool) + + val serial_bits = SerialAdapter.asyncQueue(port, th.buildtopClock, th.buildtopReset) + + val harnessMultiClockAXIRAM = withClockAndReset(th.buildtopClock, th.buildtopReset) { + SerialAdapter.connectHarnessMultiClockAXIRAM( + system.serdesser.get, + serial_bits, + axiClockBundle, + th.buildtopReset) + } + SerialBridge(th.buildtopClock, harnessMultiClockAXIRAM.module.io.tsi_ser, Some(MainMemoryConsts.globalName)) + + // connect SimAxiMem + (harnessMultiClockAXIRAM.mem_axi4 zip harnessMultiClockAXIRAM.memNode.edges.in).map { case (axi4, edge) => + val nastiKey = NastiParameters(axi4.bits.r.bits.data.getWidth, + axi4.bits.ar.bits.addr.getWidth, + axi4.bits.ar.bits.id.getWidth) + system match { + case s: BaseSubsystem => FASEDBridge(axi4.clock, axi4.bits, axi4.reset.asBool, + CompleteConfig(p(firesim.configs.MemModelKey), + nastiKey, + Some(AXI4EdgeSummary(edge)), + Some(MainMemoryConsts.globalName))) + case _ => throw new Exception("Attempting to attach FASED Bridge to misconfigured design") + } + } + }) + }) + Nil } }) @@ -141,7 +192,7 @@ class WithDromajoBridge extends ComposeHarnessBinder({ class WithTraceGenBridge extends OverrideHarnessBinder({ (system: TraceGenSystemModuleImp, th: FireSim, ports: Seq[Bool]) => - ports.map { p => GroundTestBridge(th.harnessClock, p)(system.p) }; Nil + ports.map { p => 
GroundTestBridge(th.buildtopClock, p)(system.p) }; Nil }) class WithFireSimMultiCycleRegfile extends ComposeIOBinder({ diff --git a/generators/firechip/src/main/scala/FireSim.scala b/generators/firechip/src/main/scala/FireSim.scala index ff76597094..9559480763 100644 --- a/generators/firechip/src/main/scala/FireSim.scala +++ b/generators/firechip/src/main/scala/FireSim.scala @@ -2,6 +2,8 @@ package firesim.firesim +import scala.collection.mutable.{LinkedHashMap} + import chisel3._ import chisel3.experimental.{IO} @@ -38,44 +40,113 @@ object NodeIdx { /** * Under FireSim's current multiclock implementation there can be only a * single clock bridge. This requires, therefore, that it be instantiated in - * the harness and reused across all supernode instances. This class attempts to + * the harness and reused across all supernode instances. This class attempts to * memoize its instantiation such that it can be referenced from within a ClockScheme function. */ class ClockBridgeInstantiator { - private var _clockRecord: Option[RecordMap[Clock]] = None + private val _harnessClockMap: LinkedHashMap[String, (Double, Clock)] = LinkedHashMap.empty + + // Assumes that the supernode implementation results in duplicated clocks + // (i.e. only 1 set of clocks is generated for all BuildTop designs) + private val _buildtopClockMap: LinkedHashMap[String, (RationalClock, Clock)] = LinkedHashMap.empty + private var _buildtopRefTuple: Option[(String, Double)] = None + + /** + * Request a clock at a particular frequency + * + * @param name An identifier for the associated clock domain + * + * @param freqRequested Freq. for the domain in Hz + */ + def requestClock(name: String, freqRequested: Double): Clock = { + val clkWire = Wire(new Clock) + _harnessClockMap(name) = (freqRequested, clkWire) + clkWire + } - def getClockRecord: RecordMap[Clock] = _clockRecord.get + /** + * Get a RecordMap of clocks for a set of input RationalClocks + * + * @param allClocks Seq. 
of RationalClocks that want a clock + * + * @param baseClockName Name of domain that the allClocks is rational to + * + * @param baseFreqRequested Freq. for the reference domain in Hz + */ + def requestClockRecordMap(allClocks: Seq[RationalClock], baseClockName: String, baseFreqRequested: Double): RecordMap[Clock] = { + require(!_buildtopRefTuple.isDefined, "Can only request one RecordMap of Clocks") + + val ratClockRecordMapWire = Wire(RecordMap(allClocks.map { c => (c.name, Clock()) }:_*)) + + _buildtopRefTuple = Some((baseClockName, baseFreqRequested)) + for (clock <- allClocks) { + val clkWire = Wire(new Clock) + _buildtopClockMap(clock.name) = (clock, clkWire) + ratClockRecordMapWire(clock.name).get := clkWire + } - def getClockRecordOrInstantiate(allClocks: Seq[RationalClock], baseClockName: String): RecordMap[Clock] = { - if (_clockRecord.isEmpty) { - require(allClocks.exists(_.name == baseClockName), - s"Provided base-clock name, ${baseClockName}, does not match a defined clock. Available clocks:\n " + - allClocks.map(_.name).mkString("\n ")) + ratClockRecordMapWire + } - val baseClock = allClocks.find(_.name == baseClockName).get - val simplified = allClocks.map { c => - c.copy(multiplier = c.multiplier * baseClock.divisor, divisor = c.divisor * baseClock.multiplier) - .simplify - } + /** + * Connect all clocks requested to ClockBridge + */ + def instantiateFireSimClockBridge: Unit = { + require(_buildtopRefTuple.isDefined, "Must have rational clocks to assign to") + require(_buildtopClockMap.exists(_._1 == _buildtopRefTuple.get._1), + s"Provided base-clock name for rational clocks, ${_buildtopRefTuple.get._1}, doesn't match a name within specified rational clocks." 
+ + "Available clocks:\n " + _buildtopClockMap.map(_._1).mkString("\n ")) + + // Simplify the RationalClocks ratio's + val refRatClock = _buildtopClockMap.find(_._1 == _buildtopRefTuple.get._1).get._2._1 + val simpleRatClocks = _buildtopClockMap.map { t => + val ratClock = t._2._1 + ratClock.copy( + multiplier = ratClock.multiplier * refRatClock.divisor, + divisor = ratClock.divisor * refRatClock.multiplier).simplify + } - /** - * Removes clocks that have the same frequency before instantiating the - * clock bridge to avoid unnecessary BUFGCE use. - */ - val distinct = simplified.foldLeft(Seq(RationalClock(baseClockName, 1, 1))) { case (list, candidate) => - if (list.exists { clock => clock.equalFrequency(candidate) }) list else list :+ candidate - } + // Determine all the clock dividers (harness + rational clocks) + // Note: Requires that the BuildTop reference frequency is requested with proper freq. + val refRatClockFreq = _buildtopRefTuple.get._2 + val refRatSinkParams = ClockSinkParameters(take=Some(ClockParameters(freqMHz=refRatClockFreq / (1000 * 1000))),name=Some(_buildtopRefTuple.get._1)) + val harSinkParams = _harnessClockMap.map { case (name, (freq, bundle)) => + ClockSinkParameters(take=Some(ClockParameters(freqMHz=freq / (1000 * 1000))),name=Some(name)) + }.toSeq + val allSinkParams = harSinkParams :+ refRatSinkParams + + // Use PLL config to determine overall div's + val pllConfig = new SimplePllConfiguration("firesimOverallClockBridge", allSinkParams) + pllConfig.emitSummaries + + // Adjust all BuildTop RationalClocks with the div determined by the PLL + val refRatDiv = pllConfig.sinkDividerMap(refRatSinkParams) + val adjRefRatClocks = simpleRatClocks.map { clock => + clock.copy(divisor = clock.divisor * refRatDiv).simplify + } - val clockBridge = Module(new RationalClockBridge(distinct)) - val cbVecTuples = distinct.zip(clockBridge.io.clocks) - val outputWire = Wire(RecordMap(simplified.map { c => (c.name, Clock()) }:_*)) - for (parameter <- 
simplified) { - val (_, cbClockField) = cbVecTuples.find(_._1.equalFrequency(parameter)).get - outputWire(parameter.name).get := cbClockField - } - _clockRecord = Some(outputWire) + // Convert harness clocks to RationalClocks + val harRatClocks = harSinkParams.map { case ClockSinkParameters(_, _, _, _, clkParamsOpt, nameOpt) => + RationalClock(nameOpt.get, 1, pllConfig.referenceFreqMHz.toInt / clkParamsOpt.get.freqMHz.toInt) + } + + val allAdjRatClks = adjRefRatClocks ++ harRatClocks + + // Removes clocks that have the same frequency before instantiating the + // clock bridge to avoid unnecessary BUFGCE use. + val allDistinctRatClocks = allAdjRatClks.foldLeft(Seq(RationalClock(pllConfig.referenceSinkParams.name.get, 1, 1))) { + case (list, candidate) => if (list.exists { clock => clock.equalFrequency(candidate) }) list else list :+ candidate + } + + val clockBridge = Module(new RationalClockBridge(allDistinctRatClocks)) + val cbVecTuples = allDistinctRatClocks.zip(clockBridge.io.clocks) + + // Connect all clocks (harness + BuildTop clocks) + for (clock <- allAdjRatClks) { + val (_, cbClockField) = cbVecTuples.find(_._1.equalFrequency(clock)).get + _buildtopClockMap.get(clock.name).map { case (_, clk) => clk := cbClockField } + _harnessClockMap.get(clock.name).map { case (_, clk) => clk := cbClockField } } - getClockRecord } } @@ -117,29 +188,35 @@ class WithFireSimSimpleClocks extends Config((site, here, up) => { clockBundle.reset := reset } - val pllConfig = new SimplePllConfiguration("FireSim RationalClockBridge", clockGroupEdge.sink.members) + val pllConfig = new SimplePllConfiguration("firesimBuildTopClockGenerator", clockGroupEdge.sink.members) pllConfig.emitSummaries val rationalClockSpecs = for ((sinkP, division) <- pllConfig.sinkDividerMap) yield { RationalClock(sinkP.name.get, 1, division) } chiptop.harnessFunctions += ((th: HasHarnessSignalReferences) => { - reset := th.harnessReset + reset := th.buildtopReset input_clocks := p(ClockBridgeInstantiatorKey) 
- .getClockRecordOrInstantiate(rationalClockSpecs.toSeq, p(FireSimBaseClockNameKey)) + .requestClockRecordMap(rationalClockSpecs.toSeq, p(FireSimBaseClockNameKey), pllConfig.referenceFreqMHz * (1000 * 1000)) Nil }) + + // return the reference frequency + pllConfig.referenceFreqMHz } } }) class FireSim(implicit val p: Parameters) extends RawModule with HasHarnessSignalReferences { freechips.rocketchip.util.property.cover.setPropLib(new midas.passes.FireSimPropertyLibrary()) - val harnessClock = Wire(Clock()) - val harnessReset = WireInit(false.B) - val peekPokeBridge = PeekPokeBridge(harnessClock, harnessReset) + + val buildtopClock = Wire(Clock()) + val buildtopReset = WireInit(false.B) + val peekPokeBridge = PeekPokeBridge(buildtopClock, buildtopReset) def dutReset = { require(false, "dutReset should not be used in Firesim"); false.B } def success = { require(false, "success should not be used in Firesim"); false.B } + var btFreqMHz: Option[Double] = None + // Instantiate multiple instances of the DUT to implement supernode for (i <- 0 until p(NumNodes)) { // It's not a RC bump without some hacks... 
@@ -151,6 +228,12 @@ class FireSim(implicit val p: Parameters) extends RawModule with HasHarnessSigna case AsyncClockGroupsKey => p(AsyncClockGroupsKey).copy }))) val module = Module(lazyModule.module) + + btFreqMHz = Some(lazyModule match { + case d: HasReferenceClockFreq => d.refClockFreqMHz + case _ => p(DefaultClockFrequencyKey) + }) + lazyModule match { case d: HasTestHarnessFunctions => require(d.harnessFunctions.size == 1, "There should only be 1 harness function to connect clock+reset") d.harnessFunctions.foreach(_(this)) @@ -160,5 +243,8 @@ class FireSim(implicit val p: Parameters) extends RawModule with HasHarnessSigna } NodeIdx.increment() } - harnessClock := p(ClockBridgeInstantiatorKey).getClockRecord("implicit_clock").get + + buildtopClock := p(ClockBridgeInstantiatorKey).requestClock("buildtop_reference_clock", btFreqMHz.get * (1000 * 1000)) + + p(ClockBridgeInstantiatorKey).instantiateFireSimClockBridge } diff --git a/generators/firechip/src/main/scala/TargetConfigs.scala b/generators/firechip/src/main/scala/TargetConfigs.scala index 92cd02c3d3..43fea874a9 100644 --- a/generators/firechip/src/main/scala/TargetConfigs.scala +++ b/generators/firechip/src/main/scala/TargetConfigs.scala @@ -59,26 +59,14 @@ class WithNIC extends icenet.WithIceNIC(inBufFlits = 8192, ctrlQueueDepth = 64) class WithNVDLALarge extends nvidia.blocks.dla.WithNVDLA("large") class WithNVDLASmall extends nvidia.blocks.dla.WithNVDLA("small") - -// Tweaks that are generally applied to all firesim configs -class WithFireSimConfigTweaks extends Config( +// Non-frequency tweaks that are generally applied to all firesim configs +class WithFireSimDesignTweaks extends Config( // Required: Bake in the default FASED memory model new WithDefaultMemModel ++ // Required*: Uses FireSim ClockBridge and PeekPokeBridge to drive the system with a single clock/reset new WithFireSimSimpleClocks ++ // Required*: When using FireSim-as-top to provide a correct path to the target bootrom source new 
WithBootROM ++ - // Optional*: Removing this will require adjusting the UART baud rate and - // potential target-software changes to properly capture UART output - new chipyard.config.WithPeripheryBusFrequency(3200.0) ++ - // Optional: These three configs put the DRAM memory system in it's own clock domian. - // Removing the first config will result in the FASED timing model running - // at the pbus freq (above, 3.2 GHz), which is outside the range of valid DDR3 speedgrades. - // 1 GHz matches the FASED default, using some other frequency will require - // runnings the FASED runtime configuration generator to generate faithful DDR3 timing values. - new chipyard.config.WithMemoryBusFrequency(1000.0) ++ - new chipyard.config.WithAsynchrousMemoryBusCrossing ++ - new testchipip.WithAsynchronousSerialSlaveCrossing ++ // Required: Existing FAME-1 transform cannot handle black-box clock gates new WithoutClockGating ++ // Required*: Removes thousands of assertions that would be synthesized (* pending PriorityMux bugfix) @@ -87,10 +75,6 @@ class WithFireSimConfigTweaks extends Config( new chipyard.config.WithTraceIO ++ // Optional: Request 16 GiB of target-DRAM by default (can safely request up to 32 GiB on F1) new freechips.rocketchip.subsystem.WithExtMemSize((1 << 30) * 16L) ++ - // Required: Adds IO to attach SerialBridge. The SerialBridges is responsible - // for signalling simulation termination under simulation success. 
This fragment can - // be removed if you supply an auxiliary bridge that signals simulation termination - new testchipip.WithDefaultSerialTL ++ // Optional: Removing this will require using an initramfs under linux new testchipip.WithBlockDevice ++ // Required*: Scale default baud rate with periphery bus frequency @@ -99,6 +83,27 @@ class WithFireSimConfigTweaks extends Config( new chipyard.config.WithNoDebug ) +// Tweaks to modify target clock frequencies / crossings to firesim defaults +class WithFireSimDefaultFrequencyTweaks extends Config( + // Optional*: Removing this will require adjusting the UART baud rate and + // potential target-software changes to properly capture UART output + new chipyard.config.WithPeripheryBusFrequency(3200.0) ++ + // Optional: These three configs put the DRAM memory system in it's own clock domian. + // Removing the first config will result in the FASED timing model running + // at the pbus freq (above, 3.2 GHz), which is outside the range of valid DDR3 speedgrades. + // 1 GHz matches the FASED default, using some other frequency will require + // runnings the FASED runtime configuration generator to generate faithful DDR3 timing values. + new chipyard.config.WithMemoryBusFrequency(1000.0) ++ + new chipyard.config.WithAsynchrousMemoryBusCrossing ++ + new testchipip.WithAsynchronousSerialSlaveCrossing +) + +// Tweaks that are generally applied to all firesim configs +class WithFireSimConfigTweaks extends Config( + new WithFireSimDefaultFrequencyTweaks ++ + new WithFireSimDesignTweaks +) + /******************************************************************************* * Full TARGET_CONFIG configurations. These set parameters of the target being * simulated. 
@@ -204,6 +209,15 @@ class FireSimMulticlockRocketConfig extends Config( new freechips.rocketchip.subsystem.WithRationalRocketTiles ++ // Add rational crossings between RocketTile and uncore new FireSimRocketConfig) +class FireSimMulticlockAXIOverSerialConfig extends Config( + new WithAXIOverSerialTLCombinedBridges ++ // use combined bridge to connect to axi mem over serial + new WithDefaultFireSimBridges ++ + new testchipip.WithBlockDevice(false) ++ // disable blockdev + new WithDefaultMemModel ++ + new WithFireSimDesignTweaks ++ // don't inherit firesim clocking + new chipyard.MulticlockAXIOverSerialConfig +) + //********************************************************************************** // System with 16 LargeBOOMs that can be simulated with Golden Gate optimizations // - Requires MTModels and MCRams mixins as prefixes to the platform config @@ -215,3 +229,4 @@ class FireSim16LargeBoomConfig extends Config( new WithFireSimConfigTweaks ++ new boom.common.WithNLargeBooms(16) ++ new chipyard.config.AbstractConfig) + diff --git a/generators/testchipip b/generators/testchipip index ca3cc6245c..fd7760e286 160000 --- a/generators/testchipip +++ b/generators/testchipip @@ -1 +1 @@ -Subproject commit ca3cc6245c2edd253bcec67283dbfdbda4d5c3dc +Subproject commit fd7760e2862661bf6277acfeeb42644797e876d0 diff --git a/sims/vcs/Makefile b/sims/vcs/Makefile index 2c44ae6e54..405cda5441 100644 --- a/sims/vcs/Makefile +++ b/sims/vcs/Makefile @@ -66,7 +66,7 @@ $(sim_debug): $(sim_vsrcs) $(sim_common_files) $(dramsim_lib) $(EXTRA_SIM_REQS) ######################################################################################### .PRECIOUS: $(output_dir)/%.vpd %.vpd $(output_dir)/%.vpd: $(output_dir)/% $(sim_debug) - (set -o pipefail && $(sim_debug) $(PERMISSIVE_ON) $(SIM_FLAGS) $(EXTRA_SIM_FLAGS) $(VERBOSE_FLAGS) +vcdplusfile=$@ $(PERMISSIVE_OFF) $< >(spike-dasm > $<.out) | tee $<.log) + (set -o pipefail && $(sim_debug) $(PERMISSIVE_ON) $(SIM_FLAGS) $(EXTRA_SIM_FLAGS) 
$(SEED_FLAG) $(VERBOSE_FLAGS) +vcdplusfile=$@ $(PERMISSIVE_OFF) $< >(spike-dasm > $<.out) | tee $<.log) ######################################################################################### # general cleanup rules