Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: version parsing #240

Merged
merged 20 commits into from
Jul 4, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion crates/rattler_conda_types/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -35,9 +35,14 @@ glob = "0.3.1"

[dev-dependencies]
rand = "0.8.5"
insta = { version = "1.29.0", features = ["yaml", "redactions"] }
insta = { version = "1.29.0", features = ["yaml", "redactions", "toml"] }
rattler_package_streaming = { path = "../rattler_package_streaming", default-features = false, features=["rustls-tls"] }
tempfile = "3.6.0"
rstest = "0.17.0"
assert_matches = "1.5.0"
hex-literal = "0.4.1"
criterion = { version = "0.4", features = ["html_reports"] }

[[bench]]
name = "parse"
harness = false
14 changes: 14 additions & 0 deletions crates/rattler_conda_types/benches/parse.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use rattler_conda_types::Version;

fn criterion_benchmark(c: &mut Criterion) {
c.bench_function("parse simple version", |b| {
b.iter(|| black_box("3.11.4").parse::<Version>())
});
c.bench_function("parse complex version", |b| {
b.iter(|| black_box("1!1.0b2.post345.dev456+3.2.20.rc3").parse::<Version>())
});
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
4 changes: 2 additions & 2 deletions crates/rattler_conda_types/src/conda_lock/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ use crate::match_spec::parse::ParseMatchSpecError;
use crate::MatchSpec;
use crate::{
utils::serde::Ordered, NamelessMatchSpec, NoArchType, PackageRecord, ParsePlatformError,
ParseVersionError, Platform, RepoDataRecord, Version,
ParseVersionError, Platform, RepoDataRecord,
};
use fxhash::FxHashMap;
use rattler_digest::{serde::SerializableHash, Md5Hash, Sha256Hash};
Expand Down Expand Up @@ -350,7 +350,7 @@ impl TryFrom<LockedDependency> for RepoDataRecord {
.map(|(name, matchspec)| MatchSpec::from_nameless(matchspec, Some(name)).to_string())
.collect::<Vec<_>>();

let version = value.version.parse::<Version>()?;
let version = value.version.parse()?;
let md5 = match value.hash {
Md5(md5) => Some(md5),
Md5Sha256(md5, _) => Some(md5),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2134,7 +2134,7 @@ expression: "conda_lock.packages_for_platform(Platform::Osx64).collect::<Vec<_>>
platform: osx-64
dependencies:
__osx: ">=10.12"
certifi: ">=2020.06.20"
certifi: ">=2020.6.20"
contourpy: ">=1.0.1"
cycler: ">=0.10"
fonttools: ">=4.22.0"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2176,7 +2176,7 @@ expression: "conda_lock.packages_for_platform(Platform::OsxArm64).collect::<Vec<
manager: conda
platform: osx-arm64
dependencies:
certifi: ">=2020.06.20"
certifi: ">=2020.6.20"
contourpy: ">=1.0.1"
cycler: ">=0.10"
fonttools: ">=4.22.0"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3106,7 +3106,7 @@ expression: "conda_lock.packages_for_platform(Platform::Linux64).collect::<Vec<_
manager: conda
platform: linux-64
dependencies:
certifi: ">=2020.06.20"
certifi: ">=2020.6.20"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happened here? o.O

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's what I explained in:

As you review this PR you will notice that some version string representations have changed. This is because the string representation is now constructed from the version itself. However, some cases are very hard to reproduce, for instance, the number 06 is represented as 6 in our code. While reviewing, make sure you check these changes and decide if they make sense or not.

contourpy: ">=1.0.1"
cycler: ">=0.10"
fonttools: ">=4.22.0"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3129,7 +3129,7 @@ package:
manager: conda
platform: linux-64
dependencies:
certifi: ">=2020.06.20"
certifi: ">=2020.6.20"
contourpy: ">=1.0.1"
cycler: ">=0.10"
fonttools: ">=4.22.0"
Expand Down Expand Up @@ -5632,7 +5632,7 @@ package:
manager: conda
platform: linux-aarch64
dependencies:
certifi: ">=2020.06.20"
certifi: ">=2020.6.20"
contourpy: ">=1.0.1"
cycler: ">=0.10"
fonttools: ">=4.22.0"
Expand Down Expand Up @@ -8040,7 +8040,7 @@ package:
manager: conda
platform: linux-ppc64le
dependencies:
certifi: ">=2020.06.20"
certifi: ">=2020.6.20"
contourpy: ">=1.0.1"
cycler: ">=0.10"
fonttools: ">=4.22.0"
Expand Down Expand Up @@ -10512,7 +10512,7 @@ package:
platform: osx-64
dependencies:
__osx: ">=10.12"
certifi: ">=2020.06.20"
certifi: ">=2020.6.20"
contourpy: ">=1.0.1"
cycler: ">=0.10"
fonttools: ">=4.22.0"
Expand Down Expand Up @@ -12968,7 +12968,7 @@ package:
manager: conda
platform: osx-arm64
dependencies:
certifi: ">=2020.06.20"
certifi: ">=2020.6.20"
contourpy: ">=1.0.1"
cycler: ">=0.10"
fonttools: ">=4.22.0"
Expand Down
2 changes: 1 addition & 1 deletion crates/rattler_conda_types/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ pub use repo_data::patches::{PackageRecordPatch, PatchInstructions, RepoDataPatc
pub use repo_data::{ChannelInfo, ConvertSubdirError, PackageRecord, RepoData};
pub use repo_data_record::RepoDataRecord;
pub use run_export::RunExportKind;
pub use version::{ParseVersionError, ParseVersionErrorKind, Version};
pub use version::{ParseVersionError, ParseVersionErrorKind, Version, VersionWithSource};
pub use version_spec::VersionSpec;

#[cfg(test)]
Expand Down
8 changes: 5 additions & 3 deletions crates/rattler_conda_types/src/match_spec/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -373,13 +373,15 @@ mod tests {
#[test]
fn test_digest_match() {
let record = PackageRecord {
name: "mamba".to_string(),
version: Version::from_str("1.0").unwrap(),
sha256: parse_digest_from_hex::<Sha256>(
"f44c4bc9c6916ecc0e33137431645b029ade22190c7144eead61446dcbcc6f97",
),
md5: parse_digest_from_hex::<Md5>("dede6252c964db3f3e41c7d30d07f6bf"),
..PackageRecord::default()
..PackageRecord::new(
String::from("mamba"),
Version::from_str("1.0").unwrap(),
String::from(""),
)
};

let spec = MatchSpec::from_str("mamba[version==1.0, sha256=aaac4bc9c6916ecc0e33137431645b029ade22190c7144eead61446dcbcc6f97]").unwrap();
Expand Down
7 changes: 3 additions & 4 deletions crates/rattler_conda_types/src/package/index.rs
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
use std::path::Path;

use super::PackageFile;
use crate::{NoArchType, Version};
use crate::{NoArchType, VersionWithSource};
use serde::{Deserialize, Serialize};
use serde_with::{serde_as, skip_serializing_none, DisplayFromStr, OneOrMany};
use serde_with::{serde_as, skip_serializing_none, OneOrMany};

use rattler_macros::sorted;

Expand Down Expand Up @@ -70,8 +70,7 @@ pub struct IndexJson {
pub track_features: Vec<String>,

/// The version of the package
#[serde_as(as = "DisplayFromStr")]
pub version: Version,
pub version: VersionWithSource,
}

impl PackageFile for IndexJson {
Expand Down
35 changes: 30 additions & 5 deletions crates/rattler_conda_types/src/repo_data/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -12,13 +12,13 @@ use fxhash::{FxHashMap, FxHashSet};

use rattler_digest::{serde::SerializableHash, Md5Hash, Sha256Hash};
use serde::{Deserialize, Serialize};
use serde_with::{serde_as, skip_serializing_none, DisplayFromStr, OneOrMany};
use serde_with::{serde_as, skip_serializing_none, OneOrMany};
use thiserror::Error;

use rattler_macros::sorted;

use crate::package::IndexJson;
use crate::{Channel, NoArchType, Platform, RepoDataRecord, Version};
use crate::{Channel, NoArchType, Platform, RepoDataRecord, VersionWithSource};

/// [`RepoData`] is an index of package binaries available on in a subdirectory of a Conda channel.
// Note: we cannot use the sorted macro here, because the `packages` and `conda_packages` fields are
Expand Down Expand Up @@ -62,7 +62,7 @@ pub struct ChannelInfo {
#[serde_as]
#[skip_serializing_none]
#[sorted]
#[derive(Debug, Deserialize, Serialize, Eq, PartialEq, Ord, PartialOrd, Clone, Hash, Default)]
#[derive(Debug, Deserialize, Serialize, Eq, PartialEq, Ord, PartialOrd, Clone, Hash)]
pub struct PackageRecord {
/// Optionally the architecture the package supports
pub arch: Option<String>,
Expand Down Expand Up @@ -139,8 +139,7 @@ pub struct PackageRecord {
pub track_features: Vec<String>,

/// The version of the package
#[serde_as(as = "DisplayFromStr")]
pub version: Version,
pub version: VersionWithSource,
// Looking at the `PackageRecord` class in the Conda source code a record can also include all
// these fields. However, I have no idea if or how they are used so I left them out.
//pub preferred_env: Option<String>,
Expand Down Expand Up @@ -182,6 +181,32 @@ impl RepoData {
}

impl PackageRecord {
/// A simple helper method that constructs a `PackageRecord` with the bare minimum values.
pub fn new(name: String, version: impl Into<VersionWithSource>, build: String) -> Self {
Self {
arch: None,
build,
build_number: 0,
constrains: vec![],
depends: vec![],
features: None,
legacy_bz2_md5: None,
legacy_bz2_size: None,
license: None,
license_family: None,
md5: None,
name,
noarch: Default::default(),
platform: None,
sha256: None,
size: None,
subdir: Platform::current().to_string(),
timestamp: None,
track_features: vec![],
version: version.into(),
}
}

/// Sorts the records topologically.
///
/// This function is deterministic, meaning that it will return the same result regardless of
Expand Down
120 changes: 120 additions & 0 deletions crates/rattler_conda_types/src/version/flags.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
use std::fmt::{Debug, Formatter};

/// Bitmask indicates if the first component stored in a [`super::Version`] refers to an epoch.
const EPOCH_MASK: u8 = 0b1;
const EPOCH_OFFSET: u8 = 0;

/// Bitmask that indicates what the index is of the first segment that belongs to the local version
/// part. E.g. the part after the '+' sign in `1.2.3+4.5.6`.
const LOCAL_VERSION_MASK: u8 = (1 << 7) - 1;
const LOCAL_VERSION_OFFSET: u8 = 1;

/// Encodes several edge cases in a single byte.
///
/// The first bit is used to indicate whether or not there is an explicit epoch present in the
/// version. If the flag is set it means the first entry in the [`Version::components`] array refers
/// to the epoch instead of to the first component of the first segment.
///
/// The remaining bits are used to encode the index of the first segment that belongs to the local
/// version part instead of to the common part. A value of `0` indicates that there is not local
/// version part.
#[derive(Copy, Clone, Eq, PartialEq, Default)]
#[repr(transparent)]
pub struct Flags(u8);

impl Debug for Flags {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.debug_struct("Flags")
.field("has_epoch", &self.has_epoch())
.field("local_segment_index", &self.local_segment_index())
.finish()
}
}

impl Flags {
/// Sets whether or not the version has an explicit epoch.
#[must_use]
pub fn with_has_epoch(self, has_epoch: bool) -> Self {
let flag = self.0 & !(EPOCH_MASK << EPOCH_OFFSET);
Self(
flag | if has_epoch {
EPOCH_MASK << EPOCH_OFFSET
} else {
0
},
)
}

/// Returns true if this instance indicates that an explicit epoch is present.
pub fn has_epoch(self) -> bool {
(self.0 >> EPOCH_OFFSET) & EPOCH_MASK != 0
}

/// Sets the index where the local segment starts. Returns `None` if the index is too large to
/// be stored.
#[must_use]
pub fn with_local_segment_index(self, index: u8) -> Option<Self> {
if index > LOCAL_VERSION_MASK {
None
} else {
Some(Self(
(self.0 & !(LOCAL_VERSION_MASK << LOCAL_VERSION_OFFSET))
| (index << LOCAL_VERSION_OFFSET),
))
}
}

/// Returns the index of the first segment that belongs to the local part of the version.
pub fn local_segment_index(self) -> u8 {
(self.0 >> LOCAL_VERSION_OFFSET) & LOCAL_VERSION_MASK
}
}

#[cfg(test)]
mod test {
use crate::version::flags::Flags;

#[test]
fn test_epoch() {
assert_eq!(Flags::default().has_epoch(), false);
assert_eq!(Flags::default().with_has_epoch(true).has_epoch(), true);
assert_eq!(
Flags::default()
.with_has_epoch(true)
.with_has_epoch(false)
.has_epoch(),
false
);
}

#[test]
fn test_local_segment_idx() {
assert_eq!(Flags::default().local_segment_index(), 0);
assert_eq!(
Flags::default()
.with_local_segment_index(42)
.unwrap()
.local_segment_index(),
42
);
assert_eq!(
Flags::default()
.with_local_segment_index(127)
.unwrap()
.local_segment_index(),
127
);
assert_eq!(Flags::default().with_local_segment_index(128), None);
}

#[test]
fn test_all_elements() {
let flags = Flags::default()
.with_has_epoch(true)
.with_local_segment_index(101)
.unwrap();

assert!(flags.has_epoch());
assert_eq!(flags.local_segment_index(), 101);
}
}
Loading