
Fix Lfs to pass usertests #612

Merged · 13 commits · Mar 13, 2022

Conversation

travis1829 (Collaborator) commented Mar 8, 2022

  • Added basic synchronization. 2c5ffeb
  • Fixed bugs in InodeGuard::bmap_internal. 1acf937
  • Added SegSumEntry::IndirectMap. de1c0d3
  • and more

Now, only the following usertests seem to consistently fail.

  • manywrites
  • bigwrite

We need to implement the segment cleaner to fix this.

travis1829 changed the title from "[WIP] Lfs usertests" to "Fix Lfs to pass usertests" on Mar 8, 2022
travis1829 marked this pull request as ready for review on March 8, 2022, 16:52
travis1829 (Collaborator, Author) commented Mar 9, 2022

EDIT: Now, when tested with rustc 2022-03-09 in release mode, only the following usertests fail; all others seem to consistently succeed.

  • manywrites
  • bigwrite: this one fails only because we don't yet handle cases where we run out of disk blocks
  • createtest
  • iref

Also, note #613

let mut imap = tx.fs.imap(ctx);
assert!(imap.set(self.inum, disk_block_no, &mut segment, ctx));
imap.free(ctx);
Contributor:
Is it valid to free the imap before committing the segment?

Collaborator (Author):
I don't think it will cause any problems, but I changed it anyway, just in case and to make things consistent with Itable::alloc_inode. Note that,

  • if you were talking about deadlocks, I don't think it'll cause any problems, since whenever we want to lock both the Segment and the Imap, we always do it in the order Segment -> Imap (see the sketch after this list).
  • if it's about race conditions, I also think it's fine, since we already wrote to the imap and segment.commit() doesn't modify its content (it just reads the buf and writes it to the disk). If we want to modify the imap's content, we need the segment guard anyway.
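
To make the lock-ordering point concrete, here is a minimal, self-contained sketch of the discipline described above: acquire the Segment before the Imap on every path, and commit the segment before releasing the imap guard. The Lfs, Segment, and Imap types below (modeled with std::sync::Mutex) and their fields are hypothetical stand-ins, not rv6's actual API; only the ordering mirrors the discussion in this thread.

use std::sync::Mutex;

// Hypothetical stand-ins for the lfs segment and inode map.
struct Segment { dirty_blocks: Vec<u32> }
struct Imap { entries: Vec<u32> }

struct Lfs {
    segment: Mutex<Segment>,
    imap: Mutex<Imap>,
}

impl Lfs {
    fn update_inode_mapping(&self, inum: usize, disk_block_no: u32) {
        // Every path that needs both locks takes them in the same order,
        // Segment -> Imap, so two threads can never deadlock on this pair.
        let mut segment = self.segment.lock().unwrap();
        let mut imap = self.imap.lock().unwrap();

        // Record the inode's new on-disk location in the imap and remember
        // that the corresponding block must be written out with the segment.
        imap.entries[inum] = disk_block_no;
        segment.dirty_blocks.push(disk_block_no);

        // Commit the segment while still holding the imap guard, then drop
        // the guard. Dropping the imap before the commit would also be safe
        // (commit only reads the buffers), but holding it is the conservative
        // choice described in this thread.
        commit(&mut segment);
        drop(imap);
    }
}

// Stand-in for writing the segment's dirty blocks to disk.
fn commit(segment: &mut Segment) {
    segment.dirty_blocks.clear();
}

fn main() {
    let lfs = Lfs {
        segment: Mutex::new(Segment { dirty_blocks: Vec::new() }),
        imap: Mutex::new(Imap { entries: vec![0; 16] }),
    };
    lfs.update_inode_mapping(1, 42);
}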

Contributor:

If there is no problem, isn't freeing the imap after committing the segment detrimental to performance in a multi-threaded execution?

Collaborator (Author):

Yes, that's right. Actually, there are a few more places where I decided to be careful (even though it may be unnecessary) rather than aim for better performance. I think we should remove this performance waste later, after we have confirmed that all deadlocks in lfs are gone.

kernel-rs/src/fs/lfs/inode.rs (outdated; resolved)
///
/// # Note
///
/// You should make sure the segment has an empty block before calling this.
Contributor:

Would it be better to check this inside the function?

Collaborator (Author):

Since InodeGuard::{writable_data_block_inner, writable_indirect_block} are only used inside InodeGuard::writable_data_block, and since those two methods must always be called after checking that we have at least two blocks on the segment, I think it's better to just panic instead of doing a runtime check inside those functions.
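
A small sketch of that convention: the public method checks once that the segment has room, and the inner helper simply asserts (panics) rather than handling the impossible case at runtime. The method names follow the ones mentioned above, but the bodies are hypothetical and placed on a made-up Segment type for self-containedness; they do not reflect rv6's real implementation.

// Hypothetical segment type with a counter of blocks still available.
struct Segment {
    remaining_blocks: usize,
}

impl Segment {
    /// Public entry point: establishes the precondition once.
    fn writable_data_block(&mut self) -> u32 {
        // Caller-level check: at least two blocks must be available
        // (one data block plus a possible indirect block).
        assert!(self.remaining_blocks >= 2, "segment has no room");
        self.writable_data_block_inner()
    }

    /// Inner helper: relies on the caller's check instead of re-checking,
    /// and panics if the contract was violated.
    fn writable_data_block_inner(&mut self) -> u32 {
        assert!(self.remaining_blocks >= 1);
        self.remaining_blocks -= 1;
        self.remaining_blocks as u32
    }
}

fn main() {
    let mut seg = Segment { remaining_blocks: 4 };
    let slot = seg.writable_data_block();
    println!("allocated block slot {slot}");
}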

travis1829 (Collaborator, Author) commented:

  • Finally, it seems like almost all deadlock issues and bugs are resolved.
  • Now, all usertests except manywrites, bigwrite, and sharedfd always pass, even on CPUS=3 (checked at least 10 times).
    • manywrites and bigwrite fail because we haven't implemented the segment cleaner yet.
    • sharedfd: this one rarely fails on CPUS=3.
  • Note that I tested this using rustc 2022-03-09 in debug mode (with CPUS=3).

travis1829 (Collaborator, Author) commented Mar 11, 2022

The failure of sharedfd seems to be caused by a disk error. An assertion in VirtioDisk::intr fails.

assert!(!info.inflight[id].status, "Disk::intr status");
Thread 3 hit Breakpoint 1, core::panicking::panic (expr=...)
    at /kaist-cp-home/taewoo.kim/.rustup/toolchains/nightly-2022-03-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs:48
48          panic_fmt(fmt::Arguments::new_v1(&[expr], &[]));
(gdb) bt
#0  core::panicking::panic (expr=...)
    at /kaist-cp-home/taewoo.kim/.rustup/toolchains/nightly-2022-03-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs:48
#1  0x0000000080014112 in rv6_kernel::virtio::virtio_disk::VirtioDisk::intr (self=..., kernel=...)
    at src/virtio/virtio_disk.rs:376
#2  0x000000008001318e in rv6_kernel::trap::<impl rv6_kernel::kernel::KernelRef>::handle_irq (self=..., 
    irq_type=<optimized out>) at src/trap.rs:232
#3  0x0000000080012fee in rv6_kernel::trap::<impl rv6_kernel::kernel::KernelRef>::kernel_trap (self=..., 
    trap_info=<optimized out>) at src/trap.rs:175
#4  0x00000000800301dc in rv6_kernel::trap::kerneltrap::{{closure}} (kref=...) at src/trap.rs:53
#5  rv6_kernel::kernel::kernel_ref::{{closure}} (k=...) at src/kernel.rs:47
#6  rv6_kernel::util::branded::Branded<T>::new (inner=..., f=...) at src/util/branded.rs:235
#7  rv6_kernel::kernel::kernel_ref (f=...) at src/kernel.rs:47
#8  rv6_kernel::trap::kerneltrap (arg=<optimized out>) at src/trap.rs:53
#9  0x0000000080034b24 in kernelvec ()
Backtrace stopped: frame did not save the PC

The error does not seem to happen in Ufs, so it appears to be a problem caused by the Lfs.
EDIT: Fixed in 9df7a1d

jeehoonkang (Member) commented:

bors r+

without actually reviewing this PR...

kaist-cp-bors bot commented Mar 13, 2022

Build succeeded.

kaist-cp-bors bot merged commit 2ebafa1 into kaist-cp:main on Mar 13, 2022