[Bug] Suspiciously slow calls to Sequential.predictSoftly #25

Closed · LostMekka opened this issue Dec 19, 2020 · 7 comments
Labels: bug (Something isn't working) · Milestone: 0.1.1

Comments

@LostMekka (Contributor) commented Dec 19, 2020

Calls to Sequential.predictSoftly start out fairly slow, and each call takes a bit more time than the last one.

My example model is quite small, but the first call already takes about 25 ms on my machine (for comparison: fitting 1,000 random data points for 10 epochs takes 415 ms). After calling predictSoftly 10,000 times, each successive call takes a third of a second.

[chart from timing.csv: per-call prediction time in ms vs. number of calls, rising steadily]

Code to reproduce:

// KotlinDL imports for Sequential, Input, Dense, Adam, Losses and Metrics
// omitted; their package names depend on the library version.
import java.io.File
import kotlin.random.Random
import kotlin.time.Duration
import kotlin.time.ExperimentalTime
import kotlin.time.measureTimedValue

@OptIn(ExperimentalTime::class)
fun main() {
    Sequential.of(
        Input(36),
        Dense(36),
        Dense(36),
        Dense(36),
        Dense(36),
        Dense(36),
        Dense(16),
        Dense(8),
        Dense(3),
    ).use { model ->
        model.compile(
            optimizer = Adam(),
            loss = Losses.SOFT_MAX_CROSS_ENTROPY_WITH_LOGITS,
            metric = Metrics.MSE,
        )
        model.init()
        val features = FloatArray(36) { Random.nextFloat() }
        var predictionCalls = 0
        var predictionTimeOfBatch = Duration.ZERO
        val predictionTimes = mutableListOf<Double>()
        repeat(100_000) {
            val timing = measureTimedValue { model.predictSoftly(features) }
            predictionCalls++
            predictionTimeOfBatch += timing.duration
            predictionTimes += timing.duration.inMilliseconds
            if (predictionCalls % 100 == 0) {
                val csv = predictionTimes
                    .withIndex()
                    .joinToString("\n") { (i, t) -> "${i + 1},$t" }
                File("timing.csv").writeText(csv)
                println("$predictionCalls calls done. (${predictionTimeOfBatch / 100} per call)")
                predictionTimeOfBatch = Duration.ZERO
            }
        }
    }
}
@quickstep24 commented Dec 20, 2020

a.) Sequential.internalPredict does not close the tensors it receives from Session.run.
b.) Dataset.serializeToBuffer can be optimized for the special case where there is exactly one FloatArray: it is faster to wrap that FloatArray in a FloatBuffer than to copy it into a newly allocated buffer. (A sketch of a.) and b.) follows this list.)
c.) There should be a means to call predictSoftly on a batch of inputs. At the moment this is only possible for predict.
d.) Looking at the underlying Java implementation, why are FloatArray values copied to a FloatBuffer at all? It seems to me that FloatArray and Array<FloatArray> are valid inputs for creating a Tensor.
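
A minimal sketch of a.) and b.), assuming the TensorFlow 1.x Java API that KotlinDL wraps (org.tensorflow.Session, org.tensorflow.Tensor); the helper predictOnce and the inputOp/outputOp names are hypothetical, not KotlinDL internals:

import java.nio.FloatBuffer
import org.tensorflow.Session
import org.tensorflow.Tensor

// Hypothetical helper illustrating both fixes; not the library's actual code.
fun predictOnce(
    session: Session,
    features: FloatArray,
    inputOp: String,
    outputOp: String,
): FloatArray =
    // b.) Wrap the caller's FloatArray directly instead of copying it into a
    // newly allocated buffer.
    Tensor.create(longArrayOf(1, features.size.toLong()), FloatBuffer.wrap(features)).use { input ->
        val outputs = session.runner()
            .feed(inputOp, input)
            .fetch(outputOp)
            .run()
        // a.) Close every tensor returned by Session.run; unclosed tensors hold
        // on to native memory and make each successive call slower.
        outputs[0].use { out ->
            val row = Array(1) { FloatArray(out.shape()[1].toInt()) }
            out.copyTo(row)
            row[0]
        }
    }

Tensor implements AutoCloseable, so Kotlin's use blocks give deterministic cleanup of the native memory on both the input and the output side.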

@zaleslaw (Collaborator)

@LostMekka thanks a lot for your performance experiment; it looks great.
Also, it looks like @quickstep24 is right, and his suggestions will be addressed in the next 0.1.1 bug-fix release.

  1. The missed tensor close is a bug, of course.
  2. I like the idea of an optimization for the plain FloatArray case.
  3. Regarding FloatBuffer usage: we need to revisit it (I have some thoughts about small performance improvements related to FloatBuffer usage), measure, and choose the best option here.

@zaleslaw zaleslaw changed the title Suspiciously slow calls to Sequential.predictSoftly [Bug] Suspiciously slow calls to Sequential.predictSoftly Dec 21, 2020
@zaleslaw zaleslaw added the bug Something isn't working label Dec 21, 2020
@zaleslaw zaleslaw added this to the 0.1.1 milestone Dec 21, 2020
@LostMekka (Contributor, Author)

c.)
There should be a means to call predictSoftly on a batch of inputs. At the moment this is only possible for predict.

That would be awesome as well. My current pet project for trying out this library is a board-game AI that very crudely learns through self-play, with the neural net as the search heuristic. The AI plays a semi-random game sequence, and for each pair of successive moves I create a training data point (this needs one predictSoftly call per data point). Building these self-play datasets would probably benefit greatly from batch soft-predicting 😄

Should I add a second issue for this?
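
For illustration only, a batch overload could have roughly this shape; predictSoftlyBatch is hypothetical (it does not exist in the library), and the naive per-row loop is just a placeholder for a real batched Session.run:

// Hypothetical extension; a real implementation would run a single Session.run
// for the whole batch instead of looping over rows.
fun Sequential.predictSoftlyBatch(inputs: Array<FloatArray>): Array<FloatArray> =
    Array(inputs.size) { i -> predictSoftly(inputs[i]) }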

@zaleslaw (Collaborator)

Great idea for a pet project; I hope the library will be helpful. Please create a separate ticket for this case as a feature request.
I will release it in 0.1.1, but that takes time: it will come out in mid-January (not earlier, sorry).

@zaleslaw (Collaborator)

Dear @quickstep24

Looking at the underlying Java implementation, why are FloatArray values copied to a FloatBuffer at all? It seems to me that FloatArray and Array<FloatArray> are valid inputs for creating a Tensor.

The reasons for using it are the following (a sketch of both creation paths follows this list):

  1. There is no Tensor.create method that gives us the ability to pass a plain FloatArray together with a shape.
  2. The shape of the created tensor depends on the input data and can have a different number of dimensions.
  3. Reshaping a plain FloatArray into a multidimensional array would be required before passing it to Tensor.create.
  4. Why not prepare FloatBuffers if the underlying TF Java API requires them?
  5. After the upgrade to the TF 2.x Java API we will have to deal with a totally different system of TensorTypes (I hope the migration is coming soon).
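
A short sketch of the two Tensor-creation paths under discussion, assuming the TF 1.x Java API (org.tensorflow) on the classpath:

import java.nio.FloatBuffer
import org.tensorflow.Tensor

fun main() {
    val flat = FloatArray(36) { it.toFloat() }

    // Tensor.create(Object) infers the shape from the nesting of the array:
    // a flat FloatArray always becomes a 1-D tensor, and any other rank
    // requires building the nested array first.
    Tensor.create(flat).use { t -> println(t.shape().toList()) }   // [36]
    val nested = Array(6) { r -> FloatArray(6) { c -> flat[r * 6 + c] } }
    Tensor.create(nested).use { t -> println(t.shape().toList()) } // [6, 6]

    // Tensor.create(long[], FloatBuffer) takes flat data plus an explicit
    // shape of any rank, which is why going through a FloatBuffer is the
    // generic option when the shape is only known at runtime.
    Tensor.create(longArrayOf(6, 6), FloatBuffer.wrap(flat))
        .use { t -> println(t.shape().toList()) }                  // [6, 6]
}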

@quickstep24

It is probably me misinterpreting the documentation of org.tensorflow.Tensor.create(Object obj, Class<T> type), where the example is given as:

 // Valid: A 3x2 matrix of floats.
 float[][] matrix = new float[3][2];
 Tensor<Float> m = Tensor.create(matrix, Float.class);

I assumed that this factory method would also work on 1-dim float arrays, but I have never tried it.

@zaleslaw (Collaborator)

@quickstep24 let me explain my message. Yes, you can pass a 1-dim FloatArray. But I need to pass a shape (to transform it to 2-D/3-D/4-D), and in that case this method requires, for example, a 4-D float array. That is not useful for me here, because the shape can be unknown in advance and I want more generic code. But I agree that in the future we need to minimize the copying overhead; I will keep such places in mind.
