NES database lookup #611

yo1dog · 2022-10-31T21:35:04Z

Don't know much about NES in general, but I had a thought: AFAIK, dumping NES carts is difficult because it requires knowing the PRG/CHR sizes beforehand. Hence the need to manually source and input these values from places like http://nes.dnsabr.com. Cart dumper automates this by storing a database of hashes of globally known seekable sections of the PRG ROM. However, this does not always work because there can be conflicts when these sections are identical between different games.

My mind immediately jumps to a B-tree-index-style strategy: progressively scan and hash the PRG and CHR ROMs, using the current match set to extend the seekable area and narrow down the possibilities.

Let's pretend we have a database of the entire NES library which contains hashes for the first n bytes of each PRG ROM:

| Name | PRG ROM size | 16k CRC | 128k CRC | 512k CRC |
|---|---|---|---|---|
| Mario | 16k | `9f115a9e` | - | - |
| Zelda | 128k | `a3b3e36e` | `53b88a7a` | - |
| Contra | 128k | `a3b3e36e` | `b0c8c11e` | - |
| Kirby | 512k | `a3b3e36e` | `2864f7cc` | `4f3ad289` |
| Tetris | 512k | `a3b3e36e` | `2864f7cc` | `d2dce641` |
| Gradius | 512k | `dd611655` | `f5525447` | `707c2b2f` |
| Frogger | 512k | `4bf93c55` | `3b2dc183` | `12c70afd` |
| PacMan | 512k | `4bf93c55` | `3b2dc183` | `8c7869e6` |

Now we are dumping a cart. We know the minimum size is 16k so it is safe to read and hash the first 16k. Doing so produces `a3b3e36e`. This matches Zelda, Contra, Kirby, and Tetris. Of the 4, the smallest size is 128k, so we continue reading and hashing to 128k. Now we produce `2864f7cc` which matches Kirby and Tetris. Both games are 512k, so we continue reading and hashing to 512k and produce `d2dce641` which matches Tetris.
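The walkthrough above can be sketched roughly as follows. This is only an illustration of the idea, not how any existing dumper works: `Entry`, `make_entry`, `identify`, and the `read_prefix` callback are hypothetical names, the ROM contents are synthetic stand-ins (so the CRCs will not match the made-up ones in the table), and only the PRG ROM is considered.

```python
import zlib
from typing import NamedTuple

KIB = 1024

class Entry(NamedTuple):
    name: str
    prg_size: int      # full PRG ROM size in bytes
    prefix_crcs: dict  # prefix length in bytes -> CRC32 (hex) of that prefix

def crc32_hex(data: bytes) -> str:
    return f"{zlib.crc32(data):08x}"

def make_entry(name, prg, lengths):
    """Hash every listed prefix length that fits inside this PRG ROM."""
    crcs = {n: crc32_hex(prg[:n]) for n in lengths if n <= len(prg)}
    return Entry(name, len(prg), crcs)

def identify(read_prefix, database):
    """Narrow the candidate set, reading only as far as is known to be safe.

    read_prefix(n) must return the first n bytes of the cart's PRG ROM.
    Returns the names of all entries that are still possible.
    """
    candidates = list(database)
    read_len = 0
    while candidates:
        # The smallest remaining candidate bounds how far we may safely read.
        safe_len = min(e.prg_size for e in candidates)
        if safe_len == read_len:
            break  # a candidate ends here, so we cannot read any further
        read_len = safe_len
        digest = crc32_hex(read_prefix(read_len))
        candidates = [e for e in candidates
                      if e.prefix_crcs.get(read_len) == digest]
        if len(candidates) == 1:
            break
    return [e.name for e in candidates]

# Synthetic stand-ins that share prefixes the same way the table does.
LENGTHS = [16 * KIB, 128 * KIB, 512 * KIB]
shared16 = b"A" * (16 * KIB)
shared128 = shared16 + b"K" * (112 * KIB)
ROMS = {
    "Mario": b"M" * (16 * KIB),
    "Zelda": shared16 + b"Z" * (112 * KIB),
    "Contra": shared16 + b"C" * (112 * KIB),
    "Kirby": shared128 + b"k" * (384 * KIB),
    "Tetris": shared128 + b"t" * (384 * KIB),
    "Gradius": b"G" * (512 * KIB),
}
DATABASE = [make_entry(name, rom, LENGTHS) for name, rom in ROMS.items()]

cart = ROMS["Tetris"]
print(identify(lambda n: cart[:n], DATABASE))  # ['Tetris']
```

For simplicity the sketch re-hashes from the start on every step; a real dumper would presumably keep a running CRC and only read the new bytes. If more than one name comes back, the cart falls into the ambiguous case where one ROM is entirely contained at the start of another.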

Theoretically this would resolve all ambiguity except for (very rare) cases in which both the PRG ROM and CHR ROM begin with the entirety of another PRG ROM and CHR ROM.

The database could be minimized drastically to only hashes required to resolve conflicts. For example, there is no reason to store the 128k and 512k hashes for Gradius because its 16k hash is unique. Same with the 128k hashes for Frogger and PacMan as they provide no disambiguation. In fact, rather than storing a flat lookup table, you could store an index-tree-like structure instead that contained disambiguation instructions:

Read 16k
├ 9f115a9e: Mario
├ a3b3e36e: Read 128k
│ ├ 53b88a7a: Zelda
│ ├ b0c8c11e: Contra
│ └ 2864f7cc: Read 512k
│   ├ 4f3ad289: Kirby
│   └ d2dce641: Tetris
├ dd611655: Gradius
└ 4bf93c55: Read 512k
  ├ 12c70afd: Frogger
  └ 8c7869e6: PacMan
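As a rough sketch of how such a tree could be derived from the flat table, assuming the example hashes above: `build_tree` and `walk` are hypothetical helpers, sizes are in KiB, and only the PRG ROM is considered. Entries that cannot be separated (one ROM fully contained at the start of another) end up as a list of names.

```python
# (name, PRG size in KiB, {prefix length in KiB: CRC}) taken from the example table
TABLE = [
    ("Mario", 16, {16: "9f115a9e"}),
    ("Zelda", 128, {16: "a3b3e36e", 128: "53b88a7a"}),
    ("Contra", 128, {16: "a3b3e36e", 128: "b0c8c11e"}),
    ("Kirby", 512, {16: "a3b3e36e", 128: "2864f7cc", 512: "4f3ad289"}),
    ("Tetris", 512, {16: "a3b3e36e", 128: "2864f7cc", 512: "d2dce641"}),
    ("Gradius", 512, {16: "dd611655", 128: "f5525447", 512: "707c2b2f"}),
    ("Frogger", 512, {16: "4bf93c55", 128: "3b2dc183", 512: "12c70afd"}),
    ("PacMan", 512, {16: "4bf93c55", 128: "3b2dc183", 512: "8c7869e6"}),
]

def build_tree(entries, already_read=0):
    """Return a game name, a list of inseparable names, or a read instruction."""
    if len(entries) == 1:
        return entries[0][0]  # unique: no further hashes need to be stored
    read_k = min(size for _, size, _ in entries)
    if read_k <= already_read:
        # The smallest candidate is already fully read: cannot disambiguate.
        return sorted(name for name, _, _ in entries)
    groups = {}
    for entry in entries:
        groups.setdefault(entry[2][read_k], []).append(entry)
    return {
        "read": read_k,
        "children": {crc: build_tree(group, read_k)
                     for crc, group in groups.items()},
    }

def walk(tree, crc_of_first_kib):
    """Follow the instructions; crc_of_first_kib(k) hashes the first k KiB."""
    node = tree
    while isinstance(node, dict):
        node = node["children"].get(crc_of_first_kib(node["read"]))
    return node  # a name, a list of names, or None if nothing matched

TREE = build_tree(TABLE)

# A pretend cart whose prefixes hash like Tetris in the table.
fake_cart = {16: "a3b3e36e", 128: "2864f7cc", 512: "d2dce641"}
print(walk(TREE, fake_cart.get))  # Tetris
```

Because each step reads up to the smallest size among the remaining candidates, Frogger and PacMan jump straight from 16k to 512k, and Gradius is resolved from its 16k hash alone, matching the tree above.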

I tested this theory on a headerless no-intro ROM set using NES2.0 DB. It was able to index and distinguish all but 21 of the 3,560 ROMs. The Virtual Console and cassette dumps can be excluded which brings the number down to 9, only 3 of which are "standard" games:

9FFE2F55 PRG:65536 CHR:131072
  ├─ 9FFE2F55 Sky Shark (USA) - PRG:65536 CHR:131072
  └─ 4AF742FA Sky Shark (USA) (Rev 1) - PRG:131072 CHR:131072
E41220D8 PRG:262144 CHR:0
  ├─ E41220D8 Assimilate (USA) (RetroUSB) (Aftermarket) (Homebrew) - PRG:262144 CHR:0
  └─ 7145F667 Assimilate (USA) (RetroUSB) (Aftermarket) (Homebrew) (Alt) - PRG:524288 CHR:0
CD8233EF PRG:16384 CHR:8192
  ├─ 2F55BE88 Lunar Ball (Japan) - PRG:16384 CHR:8192
  ├─ 80CBCACB Golden Game 100-in-1 (Asia) (En) (Pirate) - PRG:1048576 CHR:0
  ├─ 6175B9A0 Golden Game 150-in-1 (Asia) (En) (Pirate) - PRG:2097152 CHR:0
  ├─ 46A1AE7B Golden Game 210-in-1 (Asia) (En) (Pirate) - PRG:2097152 CHR:0
  └─ 4E5668A9 Golden Game 260-in-1 (Asia) (En) (Pirate) - PRG:3145728 CHR:0
  
20F98977 PRG:16384 CHR:16384
  ├─ 20F98977 City Connection (Japan) - PRG:16384 CHR:16384
  └─ D20775DA City Connection (Japan) (Virtual Console, Switch Online) - PRG:32768 CHR:16384
0F05FF0A PRG:32768 CHR:8192
  ├─ 0F05FF0A Seicross (Japan) (Rev 1) - PRG:32768 CHR:8192
  └─ 3413E33B Seicross (Japan) (Virtual Console) - PRG:32768 CHR:16384
E37A39AB PRG:131072 CHR:65536
  ├─ E37A39AB Yoshi's Cookie (Europe) - PRG:131072 CHR:65536
  └─ CAA76927 Yoshi's Cookie (Europe) (Virtual Console) - PRG:131072 CHR:131072
A2623BC1 PRG:131072 CHR:131072
  ├─ A2623BC1 Nantettatte!! Baseball (Japan) - PRG:131072 CHR:131072
  ├─ 6C039D11 Nantettatte!! Baseball + Nantettatte!! Baseball - Ko-Game Cassette - '91 Kaimaku Hen (Japan) - PRG:147456 CHR:131072
  └─ A5275B36 Nantettatte!! Baseball + Nantettatte!! Baseball - Ko-Game Cassette - OB All Star Hen (Japan) - PRG:147456 CHR:131072
ADFAD6B6 PRG:131072 CHR:0
  ├─ ADFAD6B6 Karaoke Studio (Japan) - PRG:131072 CHR:0
  ├─ 4B6EF399 Karaoke Studio Senyou Cassette - Top Hit 20 Vol. 1 (Japan) - PRG:262144 CHR:0
  └─ 50F3E338 Karaoke Studio Senyou Cassette - Top Hit 20 Vol. 2 (Japan) - PRG:262144 CHR:0

All of these are instances in which the original/parent ROM is included in its entirety at the start of the child ROM.

I attached the generated index in JSON. Right now it's just a map of partial CRC32 to full CRC32, but it could instead map to game name, PRG ROM size, mapper, etc.

nesIndex2.json.txt

Thoughts?

yo1dog · 2022-11-01T18:45:57Z

yo1dog
Nov 1, 2022
Author

I realized there was a bug with the way I was hashing the CHR ROM. Here is an updated index which maps to full NES2.0 header data and No-Intro names: nesIndex4.json.txt

sanni · 2022-11-02T21:28:59Z

sanni
Nov 2, 2022
Maintainer

Interesting idea 👍

But I'm not sure it will work on physical cartridges (at least not all of them) because with some carts you have to set the mapper first to read the bytes of the ROM.

2 replies

yo1dog Nov 2, 2022
Author

This is the part I was not sure about. What are the situations in which you cannot start reading the PRG ROM without knowing the mapper? How does the original console handle this?

kamalawala Nov 3, 2022

> This is the part I was not sure about. What are the situations in which you cannot start reading the PRG ROM without knowing the mapper? How does the original console handle this?

Yeah, not overly familiar with the NES but I'd imagine it's like the Game Boy in that it starts at a fixed point in ROM and mapper support needs to be implemented in software (via the readable part of ROM). Though, unlike the NES, Game Boy ROMs already have headers (not sure if the console does much with them other than setting CGB mode, checking SGB compatibility, verifying the header checksum, and verifying the Nintendo logo).
