Object size mismatch error

feng1688qiao · July 13, 2024, 12:16pm

Hello, I encountered the following error message:

panic: L6: 000280.sst: object size mismatch (.config/store/000280.sst): 3899392 (disk) !=  5344592 (MANIFEST)

The file 000280.sst shows a modification date of January 1, 1970, and the program cannot start properly:

# ll .config/store/000280.sst 
-rw-r--r-- 1 root root 3899392 Jan  1  1970 .config/store/000280.sst

Is there a way to ignore this error and let the program start normally?
The full error message is as follows:

{"level":"info","ts":1720867277.8080835,"caller":"rpc/data_worker_ipc_server.go:92","msg":"data worker listening","address":"/ip4/127.0.0.1/tcp/40000"}
panic: L6: 000280: object size mismatch (.config/store/000280.sst): 3899392 (disk) != 5344592 (MANIFEST)

goroutine 1 [running]:
source.quilibrium.com/quilibrium/monorepo/node/store.NewPebbleDB(0xc00032a0c0)
        /opt/ceremonyclient/node/store/pebble.go:18 +0x8d
source.quilibrium.com/quilibrium/monorepo/node/app.NewNode(0xc00061a550, 0xc000668000)
        /opt/ceremonyclient/node/app/wire_gen.go:72 +0x65
main.main()
        /opt/ceremonyclient/node/main.go:426 +0xae5
{"level":"error","ts":1720867289.786702,"caller":"rpc/data_worker_ipc_server.go:125","msg":"parent process not found,

abc · July 14, 2024, 7:37pm

Hello, it looks like a store file got corrupted somehow. Is it possible that you restored from a backup and something went wrong?

feng1688qiao · July 18, 2024, 2:37am

Yes, it is a file recovered from a backup. I don’t know what caused the damage. Is there any way to repair the file?

abc · July 18, 2024, 8:45am

Not sure tbh, I’ve never done this, but maybe you could try to update the MANIFEST file to match the expected size for the sst file and see what happens?

Tyga · July 18, 2024, 9:02pm

Why this happens:

I believe this is caused by one of your MANIFEST files either being outdated or being included in the backup when it should have been removed.

Outdated = your backup was right to keep the MANIFEST file, but it failed to update it to reflect the SST file’s size change.

Including an unneeded MANIFEST = as your database grows, some .sst, .log, MANIFEST and OPTIONS files are removed once their data has been merged into other files (called consolidation; Pebble's own term for this is compaction).

If your backup doesn’t remove these unneeded files or track these file changes, the database software attempts to load them back into the node’s database on initialization.

Order of operations

It goes something like this:

1. The database software consolidates an older SST into a newer one.
2. It updates the MANIFEST/OPTIONS file(s) to reflect this change.
3. Occasionally, MANIFEST/OPTIONS files are also consolidated into newer ones, and the old ones are removed.
4. Your backup saw that it already had the MANIFEST file and either didn’t sync it or didn’t delete it from the backup after it was consolidated.
5. When you restored and ran your node, it read the MANIFEST files and attempted to restore that state. However, because the file sizes changed during consolidation, any outdated or previously removed MANIFEST file no longer matches the files on disk, and the check fails, as it should (either the MANIFEST shouldn’t be there, or the files it references don’t have the expected content).

Other considerations

It could also be that your SST files were not updated in the backups.
It is also possible that your backups change the files as part of the backup process and your restorations do not take this into account.

Solutions:

1. If you still have your previously working node’s store available, compare it with your backed-up store (use ls -al or equivalent to verify that the files are present and their sizes match), then find a way to create a backup that actually works while you still have the working database files. I recommend doing this kind of comparison before completely decommissioning a node, just to verify your backups work as intended.
2. You can attempt to remove older MANIFEST/OPTIONS files to see if that works (always work on a copy, so you can start over).
3. Try to figure out what your backup command is doing and see if there is something you aren’t accounting for when restoring your files from it.
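The comparison suggested in the first point can be scripted instead of eyeballing ls -al output. The following is only a rough sketch, not anything shipped with the node: the function names are made up for illustration, and it assumes the store is a flat directory of files (as .config/store appears to be in the listing above) and that the node is stopped, since a running node keeps changing its files.

```python
import os


def file_sizes(store_dir):
    """Map each regular file name in store_dir to its size in bytes."""
    sizes = {}
    for entry in os.scandir(store_dir):
        if entry.is_file():
            sizes[entry.name] = entry.stat().st_size
    return sizes


def compare_stores(working_dir, backup_dir):
    """Compare two store directories by file name and size.

    Returns a tuple (missing, extra, mismatched):
      missing    - names present in working_dir but absent from backup_dir
      extra      - names present in backup_dir but absent from working_dir
                   (e.g. stale MANIFEST or .sst files the backup never deleted)
      mismatched - (name, working_size, backup_size) for files whose sizes differ
    """
    working = file_sizes(working_dir)
    backup = file_sizes(backup_dir)
    missing = sorted(set(working) - set(backup))
    extra = sorted(set(backup) - set(working))
    mismatched = sorted(
        (name, working[name], backup[name])
        for name in set(working) & set(backup)
        if working[name] != backup[name]
    )
    return missing, extra, mismatched
```

Calling compare_stores(".config/store", "/path/to/backup/store") (the second path is a placeholder) should return three empty lists for a healthy backup. A size check like this only catches the kind of mismatch shown in the panic; it does not prove the file contents are identical.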

In my case, my backups were not syncing files after they were first created in the backup, meaning they were only backed up as they were when the first backup happened and any later changes were ignored; essentially all store updates to files were lost in the backup. Once I realized this by evaluating my backup command, I determined that trying my point #2 would be a waste of time, and #1 was not an option because most of my nodes with faulty backups had already been decommissioned.
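To make the failure mode concrete, here is a minimal sketch of a backup step that avoids both problems described above: it re-copies files that have changed since the last run and deletes files from the backup that no longer exist in the store. This is an illustration under assumptions, not the backup command anyone in this thread actually used: mirror_store is a hypothetical helper, the store is assumed to be a flat directory, and the node is assumed to be stopped while copying so the files are not changing mid-backup.

```python
import os
import shutil


def mirror_store(src_dir, dst_dir):
    """Make dst_dir a flat-file mirror of src_dir.

    Copies files that are new or whose size or modification time differs,
    and deletes files in dst_dir that no longer exist in src_dir.
    Returns (copied, removed) as sorted lists of file names.
    """
    os.makedirs(dst_dir, exist_ok=True)
    src_names = {e.name for e in os.scandir(src_dir) if e.is_file()}
    copied, removed = [], []

    for name in sorted(src_names):
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        s = os.stat(src)
        if os.path.exists(dst):
            d = os.stat(dst)
            # Skip only when both size and whole-second mtime are unchanged.
            if d.st_size == s.st_size and int(d.st_mtime) == int(s.st_mtime):
                continue
        shutil.copy2(src, dst)  # copy2 also carries over the modification time
        copied.append(name)

    # Drop files the database has since removed (old MANIFEST, .sst, .log, ...).
    for entry in list(os.scandir(dst_dir)):
        if entry.is_file() and entry.name not in src_names:
            os.remove(entry.path)
            removed.append(entry.name)

    return copied, sorted(removed)
```

A backup that only ever adds files it has not seen before skips both the re-copy and the delete step, which matches the behaviour described in this post: the backup keeps the first version of every file and never drops the ones the database has since removed.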

feng1688qiao · July 19, 2024, 9:51am

Okay, thank you both
