improve folder name for persistent doc tests #69458

Merged
merged 1 commit into from
Mar 31, 2020

Conversation

Luro02
Contributor

@Luro02 Luro02 commented Feb 25, 2020

This fixes #69411 by using the entire path as the folder name, storing already visited paths in a HashMap, and appending a number to the file name for duplicates.

@Luro02
Contributor Author

Luro02 commented Feb 26, 2020

One solution is to pass around a HashSet that contains the doc tests already written.

If a folder already exists on disk but is not yet in the HashSet, it would be overwritten; otherwise the next free number would be appended to the file path, e.g. module_1_file_rs_1.

This would prevent cluttering of the folder (old tests would be overwritten), and it does not involve difficult parsing to find out whether or not the test comes from a proc-macro. It would also prevent future name clashes.
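
A minimal sketch of the HashSet idea described above, assuming the set is passed around by mutable reference. `reserve_folder` and the folder names are hypothetical and not part of rustdoc:

```rust
use std::collections::HashSet;

/// Returns a folder name that has not been handed out yet during this run.
/// Hypothetical helper, not a rustdoc API.
fn reserve_folder(visited: &mut HashSet<String>, base: &str) -> String {
    // First request in this run: use the plain name, so a stale folder
    // left on disk by an earlier run would simply be overwritten.
    if visited.insert(base.to_string()) {
        return base.to_string();
    }
    // Otherwise append the next free number.
    let mut n = 1;
    loop {
        let candidate = format!("{}_{}", base, n);
        if visited.insert(candidate.clone()) {
            return candidate;
        }
        n += 1;
    }
}

fn main() {
    let mut visited = HashSet::new();
    let a = reserve_folder(&mut visited, "module_1_file_rs");
    let b = reserve_folder(&mut visited, "module_1_file_rs");
    println!("{} {}", a, b);
}
```

Because the set only lives for one run, nothing has to be read back from disk to decide which folders may be overwritten.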

@GuillaumeGomez
Member

Just a thought about this: instead of basing the id on the file line, wouldn't it be better to base it on the test number? For example, the third test of a file would become file_3.rs or equivalent. What do you think of this?

Also, cc @rust-lang/rustdoc

@Luro02
Contributor Author

Luro02 commented Feb 28, 2020

@GuillaumeGomez this would work, but where would you get the test number from? Or should this iterate through all folders? (That would require one more syscall for each test, which could decrease runtime performance.)

@rust-highfive
Collaborator

The job mingw-check of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

2020-02-29T11:54:41.0297103Z ========================== Starting Command Output ===========================
2020-02-29T11:54:41.0299556Z [command]/bin/bash --noprofile --norc /home/vsts/work/_temp/d9b6d399-9205-4dba-a479-e6fe5730d964.sh
2020-02-29T11:54:41.0299814Z 
2020-02-29T11:54:41.0303314Z ##[section]Finishing: Disable git automatic line ending conversion
2020-02-29T11:54:41.0322179Z ##[section]Starting: Checkout rust-lang/rust@refs/pull/69458/merge to s
2020-02-29T11:54:41.0326065Z Task         : Get sources
2020-02-29T11:54:41.0326334Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
2020-02-29T11:54:41.0326593Z Version      : 1.0.0
2020-02-29T11:54:41.0326768Z Author       : Microsoft
---
2020-02-29T11:54:44.3080044Z ##[command]git remote add origin https://github.com/rust-lang/rust
2020-02-29T11:54:44.3089590Z ##[command]git config gc.auto 0
2020-02-29T11:54:44.3097458Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2020-02-29T11:54:44.3104142Z ##[command]git config --get-all http.proxy
2020-02-29T11:54:44.3112552Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/69458/merge:refs/remotes/pull/69458/merge
---
2020-02-29T12:05:19.6399420Z     Checking rustdoc v0.0.0 (/checkout/src/librustdoc)
2020-02-29T12:05:20.1646437Z error[E0412]: cannot find type `Path` in this scope
2020-02-29T12:05:20.1647876Z    --> src/librustdoc/test.rs:210:35
2020-02-29T12:05:20.1648799Z     |
2020-02-29T12:05:20.1649848Z 210 |     visited_tests: &Mutex<HashMap<Path, usize>>,
2020-02-29T12:05:20.1652410Z     |
2020-02-29T12:05:20.1653353Z help: possible candidates are found in other modules, you can import them into scope
2020-02-29T12:05:20.1654272Z     |
2020-02-29T12:05:20.1655490Z 1   | use crate::clean::types::Path;
---
2020-02-29T12:05:20.1660897Z 1   | use syntax::ast::Path;
2020-02-29T12:05:20.1661790Z     |
2020-02-29T12:05:20.1662622Z help: you might be missing a type parameter
2020-02-29T12:05:20.1663423Z     |
2020-02-29T12:05:20.1665154Z 194 | fn run_test<Path>(
2020-02-29T12:05:20.1666626Z 
2020-02-29T12:05:22.5324872Z error[E0282]: type annotations needed for `std::sync::Mutex<std::collections::HashMap<K, V>>`
2020-02-29T12:05:22.5326443Z     |
2020-02-29T12:05:22.5327217Z 723 |         let test_number = Mutex::new(HashMap::new());
2020-02-29T12:05:22.5328456Z     |             -----------              ^^^^^^^^^^^^ cannot infer type for type parameter `K`
2020-02-29T12:05:22.5329395Z     |             |
2020-02-29T12:05:22.5330889Z     |             consider giving `test_number` the explicit type `std::sync::Mutex<std::collections::HashMap<K, V>>`, where the type parameter `K` is specified
2020-02-29T12:05:22.6088034Z error: aborting due to 2 previous errors
2020-02-29T12:05:22.6093551Z 
2020-02-29T12:05:22.6106545Z Some errors have detailed explanations: E0282, E0412.
2020-02-29T12:05:22.6107273Z For more information about an error, try `rustc --explain E0282`.
---
2020-02-29T12:05:22.6313097Z   local time: Sat Feb 29 12:05:22 UTC 2020
2020-02-29T12:05:22.7872908Z   network time: Sat, 29 Feb 2020 12:05:22 GMT
2020-02-29T12:05:22.7875187Z == end clock drift check ==
2020-02-29T12:05:23.3938226Z 
2020-02-29T12:05:23.4009270Z ##[error]Bash exited with code '1'.
2020-02-29T12:05:23.4021183Z ##[section]Finishing: Run build
2020-02-29T12:05:23.4070333Z ##[section]Starting: Checkout rust-lang/rust@refs/pull/69458/merge to s
2020-02-29T12:05:23.4074801Z Task         : Get sources
2020-02-29T12:05:23.4075101Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
2020-02-29T12:05:23.4075535Z Version      : 1.0.0
2020-02-29T12:05:23.4075752Z Author       : Microsoft
2020-02-29T12:05:23.4076079Z Help         : [More Information](https://go.microsoft.com/fwlink/?LinkId=798199)
2020-02-29T12:05:23.4076433Z ==============================================================================
2020-02-29T12:05:23.7663477Z Cleaning any cached credential from repository: rust-lang/rust (GitHub)
2020-02-29T12:05:23.7714449Z ##[section]Finishing: Checkout rust-lang/rust@refs/pull/69458/merge to s
2020-02-29T12:05:23.7814591Z Cleaning up task key
2020-02-29T12:05:23.7816166Z Start cleaning up orphan processes.
2020-02-29T12:05:23.8019496Z Terminate orphan process: pid (3901) (python)
2020-02-29T12:05:23.8354898Z ##[section]Finishing: Finalize Job

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)
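
For reference, the two errors in the log above can be reproduced and resolved outside rustdoc. E0412 appears because `Path` is not in scope in `test.rs`; note also that `std::path::Path` is an unsized type, so an owned key such as `PathBuf` or `String` is needed for a `HashMap` in any case. E0282 probably follows from the first error, since the key type can then no longer be inferred from the signature. The sketch below assumes a `PathBuf` key, and `count_visit` is a hypothetical stand-in for the much larger `run_test`:

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::Mutex;

// Hypothetical stand-in for `run_test`; only the `visited_tests`
// parameter mirrors the signature shown in the log.
fn count_visit(visited_tests: &Mutex<HashMap<PathBuf, usize>>, path: PathBuf) -> usize {
    let mut map = visited_tests.lock().unwrap();
    let seen = map.entry(path).or_insert(0);
    *seen += 1;
    *seen
}

fn main() {
    // Spelling out the type leaves nothing for the compiler to infer.
    let test_number: Mutex<HashMap<PathBuf, usize>> = Mutex::new(HashMap::new());
    let first = count_visit(&test_number, PathBuf::from("src/lib.rs"));
    let second = count_visit(&test_number, PathBuf::from("src/lib.rs"));
    println!("{} {}", first, second);
}
```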

@Luro02
Contributor Author

Luro02 commented Feb 29, 2020

Simply reading the folder names and putting the test in the next free folder would not be a viable solution, because old tests would never be overwritten. Therefore, a list of all folders that were already created has to be kept somewhere.

I decided to use a Mutex because I do not know what else could be used that also works in a multi-threaded environment.
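
As an illustration of why the shared map needs a lock, here is a small sketch in which several threads request a number for the same folder name. It assumes the map is shared through an `Arc`; `next_number`, `numbers_from_threads`, and the folder name are hypothetical:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: returns 0 for the first test that asks for
// `folder`, 1 for the second, and so on.
fn next_number(visited: &Mutex<HashMap<String, usize>>, folder: &str) -> usize {
    let mut map = visited.lock().unwrap();
    let number = *map
        .entry(folder.to_string())
        .and_modify(|v| *v += 1)
        .or_insert(0);
    number
}

// Spawns `threads` workers that all request the same folder name and
// returns the numbers they received, sorted.
fn numbers_from_threads(threads: usize) -> Vec<usize> {
    let visited: Arc<Mutex<HashMap<String, usize>>> = Arc::new(Mutex::new(HashMap::new()));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let visited = Arc::clone(&visited);
            thread::spawn(move || next_number(&visited, "src_lib_rs_10"))
        })
        .collect();
    let mut numbers: Vec<usize> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    numbers.sort();
    numbers
}

fn main() {
    println!("{:?}", numbers_from_threads(4));
}
```

Each thread holds the lock while it reads and updates the counter, so no two threads can receive the same number.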

@bors
Contributor

bors commented Mar 1, 2020

☔ The latest upstream changes (presumably #69592) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 1, 2020
@@ -205,6 +207,7 @@ fn run_test(
mut error_codes: Vec<String>,
opts: &TestOptions,
edition: Edition,
visited_tests: &Mutex<HashMap<String, usize>>,
Contributor Author

Should this be changed to visited_tests: &Mutex<HashMap<u64, usize>>?

It is not required to keep the entire folder_name in memory; a hash would suffice.

Comment on lines 249 to 254
visited_tests
.lock()
.unwrap()
.entry(folder_name.clone())
.and_modify(|v| *v += 1)
.or_insert(0)
Contributor Author

This part would then be changed to something like this:

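use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hash the folder name so that only a u64 key is stored instead of the String.
let mut hasher = DefaultHasher::new();
folder_name.hash(&mut hasher);

visited_tests
    .lock()
    .unwrap()
    .entry(hasher.finish())
    .and_modify(|v| *v += 1)
    .or_insert(0)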

This would most likely increase performance, because only 8 bytes have to be stored for each test in the HashMap instead of an arbitrary number of bytes, which would most likely be more than 8. I also think cloning an entire String should be slower than hashing it.

I am not sure if this change is desirable, because it might make the code harder to read.
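
To make the memory argument concrete, here is a rough comparison of the two key types. It is only a sketch: the `String` figure is a lower bound that assumes the usual pointer, length, and capacity layout, and the folder name is hypothetical:

```rust
use std::mem::size_of;

// Bytes stored in the map for a hashed key.
fn hashed_key_bytes() -> usize {
    size_of::<u64>()
}

// Lower bound on the bytes held by an owned `String` key: the inline
// part of the `String` plus the heap buffer for the text itself.
fn string_key_bytes(folder_name: &str) -> usize {
    size_of::<String>() + folder_name.len()
}

fn main() {
    let folder_name = "src_librustdoc_test_rs_210"; // hypothetical folder name
    println!("u64 key:    {} bytes", hashed_key_bytes());
    println!("String key: at least {} bytes", string_key_bytes(folder_name));
}
```

The trade-off is that two different folder names with the same 64-bit hash would share one counter; that should be rare.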

@Luro02
Contributor Author

Luro02 commented Mar 1, 2020

r? @QuietMisdreavus

@kinnison
Contributor

kinnison commented Mar 8, 2020

The naming after a location in the source has always bugged me -- I wonder if there's any hope we could actually name the test after the path to the item being tested, and then perhaps a monotonic test number for that item.

@GuillaumeGomez
Member

Sorry for not responding earlier, completely missed the notification...

That's why I suggested using the position of the test in the file (not in terms of line number). It still doesn't seem to be enough, though, and I'm not sure what I'm missing here...

Any ideas, @ollie27?

@Luro02
Contributor Author

Luro02 commented Mar 14, 2020

What is wrong with the current implementation?

@GuillaumeGomez
Member

I can't put my finger on it. But maybe your implementation is perfectly correct and I'm just imagining things. That's why I asked for other opinions. If they don't find anything, then we can just merge it. Don't worry, we'll move forward, and sorry it is taking so much time!

Member

@ollie27 ollie27 left a comment

Overall I think what this PR is proposing is a nice improvement but there will still be chances for filename collisions.

I don't like removing the line number from the path as that will make it difficult to figure out which test corresponds to which path. Maybe the line number can be included as well as the incremented number like "{name}_{line}_{number}". Possibly only appending the number if it's greater than 0 so in most cases it won't be noticed.
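
The suggested naming scheme could look roughly like this. It is a sketch only; `folder_name` and its parameters are hypothetical and do not reflect what the PR finally implemented:

```rust
// Builds "{name}_{line}" for the first test at a location and
// "{name}_{line}_{number}" for any further tests at the same location.
fn folder_name(name: &str, line: usize, number: usize) -> String {
    if number == 0 {
        format!("{}_{}", name, line)
    } else {
        format!("{}_{}_{}", name, line, number)
    }
}

fn main() {
    println!("{}", folder_name("src_lib_rs", 42, 0));
    println!("{}", folder_name("src_lib_rs", 42, 1));
}
```

With this scheme the common case keeps a short name that still maps back to a source line, and the suffix only shows up when several tests share that line.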

@Luro02
Contributor Author

Luro02 commented Mar 17, 2020

Possibly only appending the number if it's greater than 0 so in most cases it won't be noticed.

I did not implement this, because I think it is better to have a unified naming scheme. It would also increase the code complexity unnecessarily.

@JohnCSimon
Member

Ping from triage:
@Luro02 - can you please post an update to this PR and address ollie27's change requests?

@Luro02
Contributor Author

Luro02 commented Mar 28, 2020

@JohnCSimon

I already addressed the changes requested by @ollie27.
I am just waiting for somebody to finally approve this PR.

@Luro02 Luro02 requested a review from ollie27 March 29, 2020 18:35
Member

@ollie27 ollie27 left a comment

Sorry about the delay.

With the line number included in the hash, this should be good to merge. We could do with some regression tests for --persist-doctests, but I can't see an easy way to write them, so we can leave that as a follow-up.

@Luro02 Luro02 requested a review from ollie27 March 30, 2020 13:59
Member

@GuillaumeGomez GuillaumeGomez left a comment

Looks good to me now. Just waiting for @ollie27's confirmation and then it's good to go! Thanks a lot!

@ollie27
Member

ollie27 commented Mar 30, 2020

Yeah, looks good to me too.

@bors r=GuillaumeGomez,ollie27

@bors
Contributor

bors commented Mar 30, 2020

📌 Commit bc00b16 has been approved by GuillaumeGomez,ollie27

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 30, 2020
Centril added a commit to Centril/rust that referenced this pull request Mar 30, 2020
…ie27

improve folder name for persistent doc tests

This partially fixes rust-lang#69411 by using the entire path as folder name, but I do not know how to deal with the proc-macro problem, where a doc test is forwarded to multiple generated functions, which have the same line for the doc test (origin).

For example

```rust
#[derive(ShortHand)]
pub struct ExtXMedia {
    /// The [`MediaType`] associated with this tag.
    ///
    /// # Example
    ///
 -> /// ``` <- this line is given to `run_test`
    /// # use hls_m3u8::tags::ExtXMedia;
    /// use hls_m3u8::types::MediaType;
    ///
    /// let mut media = ExtXMedia::new(MediaType::Audio, "ag1", "english audio channel");
    ///
    /// media.set_media_type(MediaType::Video);
    ///
    /// assert_eq!(media.media_type(), MediaType::Video);
    /// ```
    ///
    /// # Note
    ///
    /// This attribute is required.
    #[shorthand(enable(copy))]
    media_type: MediaType,

    // the rest of the fields are omitted
}
```

and my proc macro generates

```rust
#[allow(dead_code)]
impl ExtXMedia {
    /// The [`MediaType`] associated with this tag.
    ///
    /// # Example
    ///
    /// ```
    /// # use hls_m3u8::tags::ExtXMedia;
    /// use hls_m3u8::types::MediaType;
    ///
    /// let mut media = ExtXMedia::new(MediaType::Audio, "ag1", "english audio channel");
    ///
    /// media.set_media_type(MediaType::Video);
    ///
    /// assert_eq!(media.media_type(), MediaType::Video);
    /// ```
    ///
    /// # Note
    ///
    /// This attribute is required.
    #[inline(always)]
    #[must_use]
    pub fn media_type(&self) -> MediaType {
        struct _AssertCopy
        where
            MediaType: ::std::marker::Copy;
        self.media_type
    }
    /// The [`MediaType`] associated with this tag.
    ///
    /// # Example
    ///
    /// ```
    /// # use hls_m3u8::tags::ExtXMedia;
    /// use hls_m3u8::types::MediaType;
    ///
    /// let mut media = ExtXMedia::new(MediaType::Audio, "ag1", "english audio channel");
    ///
    /// media.set_media_type(MediaType::Video);
    ///
    /// assert_eq!(media.media_type(), MediaType::Video);
    /// ```
    ///
    /// # Note
    ///
    /// This attribute is required.
    #[inline(always)]
    pub fn set_media_type<VALUE: ::std::convert::Into<MediaType>>(
        &mut self,
        value: VALUE,
    ) -> &mut Self {
        self.media_type = value.into();
        self
    }
}
```

rustdoc then executes both tests with the same line number (the line of the example above the field), so two different tests end up with the same name. We need a way to differentiate between the two tests generated by the proc-macro, so that they do not cause threading issues.
@RalfJung
Member

partially fixes #69411

When this PR lands it will close that issue (because "fixes" is a magic keyword for GitHub). Is that deliberate, given that it is just a partial fix? If not, please edit the PR message to no longer say "fixes" (or "closes").

@Luro02
Contributor Author

Luro02 commented Mar 30, 2020

@RalfJung
It is no longer a partial fix. The issue is resolved by appending a number to all filenames and incrementing it if there is a conflict, so no files should be overwritten.

I updated the post, so it should be fine if the issue is closed with this PR.

@bors
Contributor

bors commented Mar 30, 2020

⌛ Testing commit bc00b16 with merge a46c7116049fcbc2e8f7c72db429d36e14310020...

@bors
Contributor

bors commented Mar 30, 2020

💔 Test failed - checks-azure

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 30, 2020
@Luro02
Contributor Author

Luro02 commented Mar 31, 2020

I updated the fork (git rebase upstream/master)

@GuillaumeGomez
Member

Once the CI is ok, let's approve again then.

@GuillaumeGomez
Member

@bors r=GuillaumeGomez,ollie27

@bors
Contributor

bors commented Mar 31, 2020

📌 Commit 2e40ac7 has been approved by GuillaumeGomez,ollie27

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 31, 2020
@bors
Contributor

bors commented Mar 31, 2020

⌛ Testing commit 2e40ac7 with merge 02bf2b4659cb49caa3f0281f354ab048d3652e88...

@Centril
Contributor

Centril commented Mar 31, 2020

@bors retry yielding

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 31, 2020
Rollup of 7 pull requests

Successful merges:

 - rust-lang#69425 (add fn make_contiguous to VecDeque)
 - rust-lang#69458 (improve folder name for persistent doc tests)
 - rust-lang#70268 (Document ThreadSanitizer in unstable-book)
 - rust-lang#70600 (Ensure there are versions of test code for aarch64 windows)
 - rust-lang#70606 (Clean up E0466 explanation)
 - rust-lang#70614 (remove unnecessary relocation check in const_prop)
 - rust-lang#70623 (Fix broken link in README)

Failed merges:

r? @ghost
@bors bors merged commit 3e31006 into rust-lang:master Mar 31, 2020
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Development

Successfully merging this pull request may close these issues.

random doc tests fail with --persist-doctests