Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache implementation #1073

Merged
merged 40 commits into from
Jan 11, 2017
Merged

Cache implementation #1073

merged 40 commits into from
Jan 11, 2017

Conversation

marcelofabri
Copy link
Collaborator

@marcelofabri marcelofabri commented Dec 27, 2016

Will fix #868

TODO:

  • Ignore cache if SwiftLint version has changed
  • Ignore cache if configuration has changed
  • Expose cache path in CLI and configuration
  • Ignore cache flag
  • Keep benchmarks when reading from cache
  • Tests

autoreleasepool {
if options.benchmark {
if let file = linter.file.path,
case let hash = linter.file.contents.hash,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is enough or we should use something more sophisticated 🤔

@SwiftLintBot
Copy link

SwiftLintBot commented Dec 27, 2016

5 Messages
📖 Linting Alamofire with this PR took 2.95s vs 2.95s on master (0% slower)
📖 Linting Aerial with this PR took 0.42s vs 0.41s on master (2% slower)
📖 Linting SourceKitten with this PR took 1.68s vs 1.66s on master (1% slower)
📖 Linting Moya with this PR took 0.45s vs 0.45s on master (0% slower)
📖 Linting realm-cocoa with this PR took 3.4s vs 3.37s on master (0% slower)

Generated by 🚫 danger

@codecov-io
Copy link

codecov-io commented Dec 27, 2016

Current coverage is 82.30% (diff: 61.64%)

Merging #1073 into master will decrease coverage by 0.48%

@@             master      #1073   diff @@
==========================================
  Files           161        164     +3   
  Lines          7909       8117   +208   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           6548       6681   +133   
- Misses         1361       1436    +75   
  Partials          0          0          

Powered by Codecov. Last update 480acd7...c28e308

@jpsim
Copy link
Collaborator

jpsim commented Dec 27, 2016

Do you understand why osscheck is saying that some violations were fixed by this PR?

@marcelofabri
Copy link
Collaborator Author

Do you understand why osscheck is saying that some violations were fixed by this PR?

Not really 🙃

@marcelofabri
Copy link
Collaborator Author

I've seen the log and that's because there was a crash while getting the version number (#1045)

return Version(value: value)
}

return Version(value: "0.15.0")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be reverted or done in a way that make set_version updates it as well

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in caee985

@@ -15,7 +15,7 @@ public enum LinterCacheError: Error {
case differentConfiguration
}

public struct LinterCache {
public class LinterCache {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to change it to be a class to make sure all Linters used the same reference

@marcelofabri
Copy link
Collaborator Author

marcelofabri commented Dec 28, 2016

I think this is ready for review, other than missing changelog entry and README update (and the version workaround)

// Copyright © 2016 Realm. All rights reserved.
//

import Foundation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole file should just be public let version = "0.15.0". There's no point in getting the version from the bundle and then fall back on a string literal in case that fails if the values will always match and it'd be a bug for them not to...

The template idea that's rendered into this file via make is 👌 though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for not reading from the bundle, but I think it'd be better if we kept this namespaced inside a struct so we can avoid using String everywhere. This would also make it easier to implement version pinning if we want.

Copy link
Collaborator

@jpsim jpsim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly cosmetic requests now that a global location is used for the cache

@@ -47,6 +48,8 @@ public struct Configuration: Equatable {
public let rules: [Rule]
public var rootPath: String? // the root path to search for nested configurations
public var configurationPath: String? // if successfully loaded from a path
public var hash: Int?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make Configuration conform to Hashable? http://swiftdoc.org/v3.0/protocol/Hashable/

Would require naming this variable as hashValue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly because I thought it'd be too much work to make Rule conform to Hashable as well. Do you think it's worth the extra complexity?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A Rule is uniquely identified by its identifier, so its hashValue can just be identifier.hashValue and this can be implemented as a protocol extension.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about its configuration?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ok thinking more about it, Hashable implies Equatable, and the protocol semantics require that all user-visible data from two equal types be indistinguishable, so makes sense not to conform to it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, to make sure we're in the same page, do you still think we should make Configuration conform to Hashable?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

@@ -179,6 +185,7 @@ public struct Configuration: Equatable {
}
self.init(dict: dict)!
configurationPath = fullPath
hash = yamlContents.hash
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of hashing the yaml contents, we could hash the properties of this struct, meaning that changes to the configuration file like reordering items or adding comments wouldn't invalidate the cache.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would also allow Configuration to conform to Hashable and for caching to work with configurations created without a YAML source.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forget the bit about conforming to Hashable, but do you think you could adapt this to use the has value from dict at least? Would allow writing comments and reordering YAML contents without throwing away the cache.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in c28e308, but had to bridge to NSDictionary

@@ -78,6 +84,11 @@ public struct Linter {
deprecatedToValidIdentifier[key] = value
}

if let cache = cache, let file = file.path {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let path = file.path would disambiguate from the property

@@ -86,8 +97,34 @@ public struct Linter {
return (violations, ruleTimes)
}

public init(file: File, configuration: Configuration = Configuration()!) {
private func cachedStyleViolations(_ benchmark: Bool = false) -> ([StyleViolation], [(id: String, time: Double)])? {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

benchmark should be a required argument label to conform to Swift 3 API Design Guidelines.

lock.unlock()
}

public func violations(for file: String, hash: Int) -> [StyleViolation]? {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weakly typed argument. Label should be forFile file: String.

private let lock = NSLock()

public init(currentVersion: Version = .current, configurationHash: Int? = nil) {
cache = [String: Any]()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just this?

cache = [
    "version": currentVersion.value,
    "configuration_hash: configurationHash,
    "files": [:]
]

configurationHash: configurationHash)
}

public func cacheFile(_ file: String, violations: [StyleViolation], hash: Int) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't conform to Swift 3 API Design Guidelines, and implementation could be simplified:

public func cache(violations: [StyleViolation], forFile file: String, fileHash: Int) {
    lock.lock()
    var filesCache = (cache["files"] as? [String: Any]) ?? [:]
    filesCache[file] = [
        "violations": violations.map(dictionaryForViolation),
        "hash": fileHash
    ]
    cache["files"] = filesCache
    lock.unlock()
}

Adds an insignificant amount of work inside the lock.

try json.write(to: url, atomically: true, encoding: .utf8)
}

private func dictionaryForViolation(_ violation: StyleViolation) -> [String: Any] {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Swift 3 API Design Guidelines: dictionary(for violation: StyleViolation)

}

extension StyleViolation {
fileprivate static func fromCache(_ cache: [String: Any], file: String) -> StyleViolation? {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Swift 3 API Design Guidelines: from(cache: [String: Any], file: String) -> StyleViolation?

@@ -35,6 +35,7 @@ private enum ConfigurationKey: String {
case swiftlintVersion = "swiftlint_version"
case useNestedConfigs = "use_nested_configs" // deprecated
case warningThreshold = "warning_threshold"
case cachePath = "cache_path"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the rest of this list is sorted

@jpsim
Copy link
Collaborator

jpsim commented Jan 10, 2017

What in the world happened here? https://circleci.com/gh/realm/SwiftLint/82

Building branch
Cloning Alamofire
Linting 5 iterations of Alamofire with branch: 1..2..3..4..5
Cloning Aerial
Linting 5 iterations of Aerial with branch: 1..2..3..4..5
Cloning SourceKitten
Linting 5 iterations of SourceKitten with branch: 1..2..3..4..5
Cloning Moya
Linting 5 iterations of Moya with branch: 1..2..3..4..5
Cloning realm-cocoa
Linting 5 iterations of realm-cocoa with branch: 1..2..3..4..5
Switched to branch 'master'
From github.com:realm/SwiftLint
   1d2b3fd..480acd7  master     -> origin/master
Building master
<unknown>:0: error: build had 1 command failures
error: exit(1): /Applications/Xcode-8.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swift-build-tool -f /Users/distiller/SwiftLint/.build/debug.yaml
Linting 5 iterations of Alamofire with master: 1..2..3..4..5
Linting 5 iterations of Aerial with master: 1..2..3..4..5
Linting 5 iterations of SourceKitten with master: 1..2..3..4..5
Linting 5 iterations of Moya with master: 1..2..3..4..5
Linting 5 iterations of realm-cocoa with master: 1..2..3..4..5

image

@marcelofabri
Copy link
Collaborator Author

What in the world happened here?

I have no idea 😱

@jpsim
Copy link
Collaborator

jpsim commented Jan 10, 2017

Stepping back a bit. Wouldn't we be better off just building the cache at the very end of the lint process, from a single thread, avoiding the need for locking? Having this at the Linter level is nice architecturally, but maybe we should split that up into a new Linter model that coordinates many SingleFileLinters (aka what is Linter today).

@jpsim
Copy link
Collaborator

jpsim commented Jan 10, 2017

LOL. I understand why OSSCheck is reporting massive perf regressions. It somehow can't build master, so it's running this PR thinking that it's master after things have been cached from the first few branch runs. So #1147 is biting us here.

But also, we should disable the cache when running OSSCheck.

@marcelofabri
Copy link
Collaborator Author

Stepping back a bit. Wouldn't we be better off just building the cache at the very end of the lint process, from a single thread, avoiding the need for locking?

Maybe, this implementation started before we had multithread lint. We could measure the lock cost to see if it's worth the refactoring and the "architectural cost".

@jpsim
Copy link
Collaborator

jpsim commented Jan 10, 2017

Maybe, this implementation started before we had multithread lint. We could measure the lock cost to see if it's worth the refactoring and the "architectural cost".

Let's get this in, and look into avoiding locking later.

Awesome work! 🎉

@@ -179,6 +185,7 @@ public struct Configuration: Equatable {
}
self.init(dict: dict)!
configurationPath = fullPath
hash = dict.bridge().hashValue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is returning a value of 3 for Realm. Seems suspicious: https://github.com/realm/realm-cocoa/blob/master/.swiftlint.yml

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should revert to use the String hash then?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, apparently NSDictionary hash just returns the count. I've tested dict.description.hashValue locally and that seems to work surprisingly well.

Copy link
Collaborator

@jpsim jpsim Jan 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maintains the same value if items are re-ordered, and if comments are added/edited.

@@ -269,7 +276,8 @@ private func validKeys(ruleList: RuleList) -> [String] {
.swiftlintVersion,
.useNestedConfigs,
.warningThreshold,
.whitelistRules
.whitelistRules,
.cachePath
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list is also otherwise alphabetized.

return (cachedViolations, ruleTimes)
}

public init(file: File,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this declaration doesn't need to be broken up in multiple lines.

return path.map(URL.init(fileURLWithPath:)) ?? defaultCacheURL(options: options)
}

private func defaultCacheURL(options: LintOptions) -> URL {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much of this should be moved into SwiftLintFramework IMHO, so that the cache is used even if not going through the SwiftLint CLI.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be used, you'd just need to setup by yourself since we depend on LintOptions to resolve the cache path

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if overriding the default. Wouldn't be much different than the path we pass to Configuration: https://github.com/realm/SwiftLint/blob/master/Source/SwiftLintFramework/Models/Configuration.swift#L162

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do that on a later moment? 😬

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

marcelofabri and others added 8 commits January 10, 2017 20:49
* master:
  fix typos in ProjectMock configuration files
  update SourceKitten to 0.17
  build SwiftLint in release mode when running benchmarks
  run Circle CI's Xcode 8.2 image
  don't pass CI if Danger crashes
  Fix triggering example and add couple more
  Use range parameter if exists
  CHANGELOG entries must break at the 80th char
  Fix indentation
  Update CHANGELOG
  Delete extra variable
  Clean up code and improve overall logic
  Add range to avoid whole file lookup
  Clean up regex pattern
  Delete print command
  Fix test case clause
  Ignore tuples
  Append test to allTests array
  Add CHANGELOG entry
  Add UnusedOptionalBinding rule
and group perf improvements at the top of the "Enhancements"
fail the build if it couldn't be built even after retrying
@jpsim jpsim merged commit 4f225a4 into realm:master Jan 11, 2017
@jpsim
Copy link
Collaborator

jpsim commented Jan 11, 2017

🎉 this is gonna be 'uuuge 🎉

@marcelofabri
Copy link
Collaborator Author

Awesomeee 💯

@marcelofabri marcelofabri deleted the cache branch January 11, 2017 00:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Keep track of already linted/autocorrected files.
4 participants