Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Diplomatic multiclock #614

Merged
merged 9 commits into from
Aug 28, 2020
Merged

Diplomatic multiclock #614

merged 9 commits into from
Aug 28, 2020

Conversation

@jerryz123 jerryz123 marked this pull request as draft June 23, 2020 22:24
Copy link
Contributor

@colinschmidt colinschmidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be a reasonable first pass. I can see how this would be used for more complicated clock driving strategies at the top-level.
I agree with the TODOs in your initial comment, we should have a rational clock tile crossing example with the divider built in and an async tile crossing with two clocks driving from off chip.

generators/chipyard/src/main/scala/ChipTop.scala Outdated Show resolved Hide resolved
generators/chipyard/src/main/scala/ChipTop.scala Outdated Show resolved Hide resolved
@jerryz123 jerryz123 changed the title [RFC] Diplomatic Multiclock Initial simple diplomatic clocks Jul 7, 2020
@jerryz123 jerryz123 changed the base branch from rc-retile to simple_configs July 7, 2020 05:03
@jerryz123 jerryz123 force-pushed the diplomatic-clocks branch 2 times, most recently from 05e3385 to ffd3b2d Compare July 8, 2020 00:37
@jerryz123
Copy link
Contributor Author

I cleaned this up substantially. Changes include

  • ChipTop is now LazyModule, as it contains ClockNodes
  • ChipyardClockKey lets you define a function to drive the diplomatic clocks of the system
  • Two SW-simulation demonstrations of ChipyardClockKey, harnessClock just passes through the clock from the TestHarness, and passes it everywhere. harnessMultiClock divides the harnessClock, and uses that for the uncore
  • MultiClockRocketConfig uses the harnessMultiClock setting, but it actually doesn't simulate correctly, because SimSerial and SimDRAM receive the fast TestHarness clock, instead of the divided clock. The fix for this will come later, as it is more involved.
  • ConfigValName is removed, I think this also resolves the naming issues with TestHarness appearing everywhere
  • FireSimMultiClockPOC is removed, as the base FireSim design now supports multiclock through the ChipyardClockKey.
  • FireSimMulticlockRocketConfig demonstrates the old FireSimMulticlockPOC config
  • FireSim now builds ChipTop, instead of DigitalTop, removing a bit of code duplication.

@jerryz123 jerryz123 marked this pull request as ready for review July 8, 2020 04:01
@jerryz123 jerryz123 changed the title Initial simple diplomatic clocks Diplomatic multiclock Jul 8, 2020
@jerryz123 jerryz123 changed the base branch from simple_configs to dev July 10, 2020 00:12
@jerryz123 jerryz123 force-pushed the diplomatic-clocks branch 3 times, most recently from 3b7edfd to f13e864 Compare August 4, 2020 19:14
.circleci/do-rtl-build.sh Show resolved Hide resolved
// The implicit clock and reset for the system is also, by convention, used for all the IOBinders
// TODO: This may not be the right thing to do in all cases
withClockAndReset(implicit_clock, implicit_reset) {
val (_ports, _iocells, _harnessFunctions) = p(IOBinders).values.flatMap(f => f(lSystem) ++ f(lSystem.module)).unzip3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will want to understand how to generate the clock for the test harness side of things.
I don't think we want to have the parts inside the system punch their clock into the test harness.
Either the testharness module should be able to recover the clock from an explicit pin that would exist in a real ASIC (e.g. DDR or serial links with clock recovery), or it should explicitly require some configuration of the test harness component to work correctly (e.g. uart baud rate).

But we can discuss this more later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally agree with the thought here, but i think it gets tricky when we want to model IFs that might not actually be used in a tape out. If we look at memory system AXI4 as a example -- it really has no business going off chip in the first place. It seems to me that it would be OK in this case to add the IO since the IF is not really fit-for-tapeout in the first place.

generators/chipyard/src/main/scala/ChipTop.scala Outdated Show resolved Hide resolved

simpleClockGroupSourceNode.map { n => n.out.unzip._1.map { out: ClockGroupBundle =>
out.member.elements.map { case (name, data) =>
// This is mega hacks, how are you actually supposed to do this?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be the way, it seems hackier than before there was diplomatic clocking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💀

At least everything is contained in one place now.

generators/chipyard/src/main/scala/ConfigFragments.scala Outdated Show resolved Hide resolved

def success = io.success
def harnessReset = this.reset.asBool
ldut.harnessFunctions.foreach(_(this))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the IOBinders depended on being called before the success val is set right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so? IOBinders can either set or read the success signal. Are you envisioning a case where the order of evaluation matters?

@@ -17,6 +17,7 @@ class AbstractConfig extends Config(
new chipyard.config.WithBootROM ++ // use default bootrom
new chipyard.config.WithUART ++ // add a UART
new chipyard.config.WithL2TLBs(1024) ++ // use L2 TLBs
new chipyard.config.WithNoSubsystemDrivenClocks ++ // drive the subsystem diplomatic clocks from ChipTop instead of using implicit clocks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad we an abstract config 👍
I would be tempted to make the default config the divided version, since that is more "realistic".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be cool, if only it worked 😢

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do that once it does work.

Copy link
Contributor

@davidbiancolin davidbiancolin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits. Cleaning up and unifying the firesim/chipyard handling is going to depend on what SiFive suggests.

generators/chipyard/src/main/scala/ChipTop.scala Outdated Show resolved Hide resolved
// The implicit clock and reset for the system is also, by convention, used for all the IOBinders
// TODO: This may not be the right thing to do in all cases
withClockAndReset(implicit_clock, implicit_reset) {
val (_ports, _iocells, _harnessFunctions) = p(IOBinders).values.flatMap(f => f(lSystem) ++ f(lSystem.module)).unzip3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally agree with the thought here, but i think it gets tricky when we want to model IFs that might not actually be used in a tape out. If we look at memory system AXI4 as a example -- it really has no business going off chip in the first place. It seems to me that it would be OK in this case to add the IO since the IF is not really fit-for-tapeout in the first place.

Debug.tieoffDebug(debugPortOpt, resetctrlOpt, Some(psdPort))(system.p)
// tieoffDebug doesn't actually tie everything off :/
debugPortOpt.foreach { d =>
d.clockeddmi.foreach({ cdmi => cdmi.dmi.req.bits := DontCare; cdmi.dmiClock := th.clock })
// TODO: Using harness clock/reset will be incorrect when systemClock =/= harnessClock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the harness is expected to provide the clock here right, so this becomes a debug module configuration problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I blanket applied this comment everywhere harnessClock was referenced in the IOBinders, but this case should actually not be using the harnessClock at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it turns out the way we drive the debug module clocks is pretty wrong, debug.clock should not be punched off-chip to be driven.
We should fix the WithSimDebug and WithTiedOffDebug IOBinders in another PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is probably some confusion between the DMI_DTM signals going off chip (which is only intended as a simulation config) and the JTAG_DTM version which has a pretty clear set of signals that should go off-chip (and is intended to be actually implemented in the real world). Perhaps this confusion is carried through to the IOBinders?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that is accurate. I think the problem is that we are plumbing through the entire DebugIO bundle, instead of just DebugIO.systemjtag or DebugIO.clockeddmi.

DebugIO.clock appears to require a clock which is synchronous to the Tilelink buses (https://github.com/chipsalliance/rocket-chip/blob/3176ed81c2bfa1a24622c511a1e4b0777bc9377c/src/main/scala/devices/debug/Debug.scala#L1775-L1780), so it should not be generated offchip. I feel like we should be calling Debug.connectClockAndReset somewhere within the chip...

generators/chipyard/src/main/scala/TestHarness.scala Outdated Show resolved Hide resolved
@@ -106,8 +106,8 @@ class BoomF1Tests extends FireSimTestSuite("FireSim", "DDR3FRFCFSLLC4MB_FireSimL
class RocketNICF1Tests extends FireSimTestSuite("FireSim", "WithNIC_DDR3FRFCFSLLC4MB_FireSimRocketConfig", "BaseF1Config")
// Multiclock tests
class RocketMulticlockF1Tests extends FireSimTestSuite(
"FireSimMulticlockPOC",
"FireSimQuadRocketMulticlockConfig",
"FireSim",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to require changing the inis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do this when I regenerate AGFIs

@jerryz123 jerryz123 force-pushed the diplomatic-clocks branch 4 times, most recently from 87bbc63 to ce81f02 Compare August 26, 2020 04:39
@jerryz123
Copy link
Contributor Author

@davidbiancolin @colinschmidt
I favor merging this PR as-is (after agfi regeneration), assuming no major changes are still necessary. There are still a number of problems to solve before multi-clock designs can be cleanly enabled. Those should be addressed in follow-up PRs, as they may require extensive changes through sub-projects, and additional rocketchip bumps.

@davidbiancolin
Copy link
Contributor

I'm fine with that -- but you should put a failing require or assert in the divided config.

@colinschmidt
Copy link
Contributor

Yeah. Merging is fine as we have plenty of time before the next release.
Is there a list of things that need to be fixed or is that an unknown unknown? If you know of a list or at least a partial list its probably worth filing a tracking issue.

@jerryz123 jerryz123 mentioned this pull request Aug 26, 2020
7 tasks
Copy link
Contributor

@colinschmidt colinschmidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jerryz123 jerryz123 merged commit ee1ce11 into dev Aug 28, 2020
@alonamid alonamid mentioned this pull request Nov 24, 2020
@jerryz123 jerryz123 deleted the diplomatic-clocks branch December 24, 2020 05:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants