-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Strange input signal behavior #158
Comments
Made a minimal example using one file. I also stumbled upon the sudden appearance of unknown bits. There are comments in the code regarding this. Maybe these problems are somehow related? Or should I open a separate issue? import 'dart:convert';
import 'dart:io';
import 'package:rohd/rohd.dart';
class UTF8Encoder extends Module {
UTF8Encoder(Logic codepoint) {
codepoint = addInput('codepoint', codepoint, width: 21);
final bytes = addOutput('bytes', width: 32);
final count = Logic(name: 'count', width: 2);
final offset = Logic(name: 'offset', width: 8);
final byte = Logic(name: 'byte', width: 8);
final debug1 = Logic(name: 'debug_1', width: 21);
Combinational([
count < 0,
If((codepoint >= 0) & codepoint.lte(0x7F), then: [
bytes < codepoint.zeroExtend(32)
], orElse: [
If((codepoint >= 0x800) & codepoint.lte(0xFFFF), then:[
count < 2,
offset < 0xE0
])
]),
If(~count.eq(0), then: [
byte < ((codepoint >>> (Const(6, width: 5) * count.zeroExtend(5))) +
offset.zeroExtend(21)).slice(7, 0),
// NOTE: Codepoint doesn't work correctly without this.
debug1 < codepoint,
// An attempt to directly substitute "byte" into "bytes" in this case
// leads to the error "Cannot convert invalid LogicValue to int:
// 32'bxxxxxxxx101000001000000011100010", but is it possible for
// unknown bits to appear if "zeroExtend" is executed?
/*
bytes < ((codepoint >>> (Const(6, width: 5) * count.zeroExtend(5))) +
offset.zeroExtend(21)).slice(7, 0).zeroExtend(32),
*/
bytes < byte.zeroExtend(32),
bytes < bytes.withSet(8, Const(0x80, width: 8) |
((codepoint >>> (Const(6, width: 4) * (count - 1).zeroExtend(4)))
.slice(7, 0) & Const(0x3F, width: 8))),
count < count - 1,
If(~count.eq(0), then: [
bytes < bytes.withSet(16, Const(0x80, width: 8) |
((codepoint >>> (Const(6, width: 3) * (count - 1).zeroExtend(3)))
.slice(7, 0) & Const(0x3F, width: 8))),
count < count - 1,
If(~count.eq(0), then: [
bytes < bytes.withSet(24, Const(0x80, width: 8) |
(codepoint.slice(7, 0) & Const(0x3F, width: 8))),
])
])
])
]);
}
Logic get bytes => output('bytes');
}
Future<void> main() async {
// Preparing inputs.
final codepoint = Logic(width: 21);
// Create and build module instance.
final utf8encoder = UTF8Encoder(codepoint);
await utf8encoder.build();
// Generate SystemVerilog code.
File('code.sv').writeAsStringSync(utf8encoder.generateSynth());
// Prepare WaveDumper for simulation tracking.
WaveDumper(utf8encoder);
// Prepare input and output.
final codepoints = '†† †† † †'.runes;
final utf8Bytes = <int>[];
// Run simulation.
SimpleClockGenerator(10);
await Simulator.tick();
await Simulator.tick();
for (final inputCodepoint in codepoints) {
codepoint.inject(inputCodepoint);
await Simulator.tick();
await Simulator.tick();
await Simulator.tick();
for (var i = 0; i < 4; ++i) {
final tempByte = (utf8encoder.bytes.value.toInt() >> (8 * i)) & 0xFF;
if ((tempByte != 0) || (i == 0)) {
utf8Bytes.add(tempByte);
}
}
}
codepoint.inject(0);
await Simulator.tick();
await Simulator.tick();
// Print result.
stdout.writeln(utf8.decode(utf8Bytes));
} |
I'm able to reproduce a failure in your example code, but I'm still trying to figure out what the failure means since I'm not that familiar with the details of UTF8 encoding. It is certainly strange that some extraneous conditional assignments have an effect on the behavior, so I expect something suspicious happening within ROHD here, so thank you for filing this! As a side note, if you use |
Further shortening of the code: import 'dart:io';
import 'package:rohd/rohd.dart';
class ExampleModule extends Module {
ExampleModule(Logic codepoint) {
codepoint = addInput('codepoint', codepoint, width: 21);
final bytes = addOutput('bytes', width: 32);
final count = Logic(name: 'count', width: 2);
Combinational([
If(codepoint.eq(0x2020), then:[
count < 2,
bytes < ((codepoint >>> (Const(6, width: 5) * count.zeroExtend(5))) +
Const(0xE0, width: 21)).slice(7, 0).zeroExtend(32),
count < count - 2,
]),
]);
}
Logic get bytes => output('bytes');
}
Future<void> main() async {
final codepoint = Logic(width: 21);
final exampleModule = ExampleModule(codepoint);
await exampleModule.build();
File('code.sv').writeAsStringSync(exampleModule.generateSynth());
final codepoints = '†† †† † † q†† †'.runes;
for (final inputCodepoint in codepoints) {
codepoint.put(inputCodepoint);
print(exampleModule.bytes.value);
}
} I don't know how to reduce it yet. When Tried to discard all the details regarding UTF-8 encoding. Just watch the console output.
Ignore completely unknown lines - these are My guess is that the Thanks for the advice! But in fact, I used |
I also seem to have missed the fact that the 24 most significant bits are unknown even though the |
Thank you for the shortened example, it's very helpful towards debugging this! |
I think I've root caused a collection of issues in |
Checked - everything works. Great job! |
Describe the bug
The incoming signal behaves incorrectly in a certain case. Adding a useless assignment operator solves the problem.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The "codepoint" input does not require the use of a useless assignment operator to work correctly.
Actual behavior
The "codepoint" input does not work correctly.
Additional details
Additional context
To me, this is some weird issue that occurs with code point U+2020 but does not occur with U+2019 and U+2021. I also want to note that I encountered a problem when testing using this file. Before the appearance of the U+2020 code point, there are many other three-byte sequences in the file, but there are no problems with them.
The text was updated successfully, but these errors were encountered: