Bug: performance problem in build-bud step of nodejs and nodejs loopback pipeline #121
Comments
Attach the Nodejs Loopback build-bud log |
@tseelbach - Please help with assessing this issue and assign it to the appropriate owner. |
For |
Looks like there is a fix in the latest buildah to improve performance of the copy step. Will pull that in and see if there is any improvement in this release. |
There is a fix in the latest buildah that reduces the time taken to perform a COPY step within the Dockerfile. However, the appsody/appsody-buildah docker image that the pipelines use cannot move to this later version, because other fixes that are required (to prevent mount issues) have not been implemented there. This means we have to continue using buildah v1.9.0 until the fixes for the mount issue have been implemented in the later code; a newer version of the appsody/appsody-buildah image will then be generated for use by the pipelines. |
This is being debugged. |
Gireesh and Kyle Christianson would need to help with this issue. |
The node pipelines were failing with the latest buildah. Looks like there are some new fuse prereqs for it to work. @aadeshpa could you please look into this when you are back. Maybe first try to see if we get a performance improvement with the java ones and the new buildah. Thanks! |
Per last discussion with Kyles, these were observed:
I would suggest we:
can you share the pipeline log that has timestamps, to start with? |
I tried testing pipelineruns(build-push-pipeline) using old appsody buildah image
cc: @gpunathi@in.ibm.com , @kvijai82 |
thanks @aadeshpa . It would be interesting to see what are the values for loopback, for which this issue was raised originally. |
I tested with nodejs-loopback and below is the step in the logs where it stays stuck and then after 60 min, the pod itself terminates and the pipelinerun fails.
then at step
|
thanks @aadeshpa . I am able to recreate this outside of the pipeline, and have raised a detailed bug report with the buildah community. The problem seems to arise from the fact that we have a huge (~200 MB) node_modules folder belonging to the loopback dependency to be copied to the image. As we are building the dependency inside the |
I ran into this issue while building a 1G container. I used buildah --debug bud . . to verify and saw that the .local files were getting put in the container, so the buildah process would run until the filesystem filled. I put the .local in .dockerignore and was able to buildah bud the image. |
I got past the error to download the
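The two comments above both point at large directories in the build context (a ~200 MB node_modules folder, stray .local files). A minimal sketch for spotting such directories before running a build follows. It is not part of buildah or appsody; the function name `summarize_build_context` is hypothetical, and it does not read .dockerignore, so it reports everything on disk.

```python
import os
from pathlib import Path


def summarize_build_context(context_dir):
    """Return (name, file_count, total_bytes) per top-level entry, largest first.

    Hypothetical helper for illustration only; it ignores .dockerignore.
    """
    rows = []
    for entry in Path(context_dir).iterdir():
        if entry.is_symlink() or entry.is_file():
            # lstat measures a symlink itself rather than its target.
            rows.append((entry.name, 1, entry.lstat().st_size))
            continue
        file_count = 0
        total_bytes = 0
        # os.walk does not descend into symlinked directories by default.
        for root, _dirs, files in os.walk(entry):
            for name in files:
                file_count += 1
                total_bytes += os.lstat(os.path.join(root, name)).st_size
        rows.append((entry.name, file_count, total_bytes))
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Calling `summarize_build_context(".")` in the project directory and printing the first few rows shows which entries dominate the context by size and by file count; those are the candidates for a .dockerignore entry, as in the .local case above.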
|
@aadeshpa - please see comments #121 (comment) and #121 (comment) . To resolve this issue quickly, we need to isolate platforms where it works and where it does not, out of the box or with tweaking, and based on count, the versions where it works etc. we may take a call about whether this approach (fuse bug fix and buildah bug fix) is acceptable or not. Pipeline testing can be complex; and as you and @jdmcclur also have pointed out, there could be unrelated issues hiding somewhere else (OS upgrade issue / Dockerfile syntax issue etc). We want to isolate all of those and diagnose and resolve the |
Looking for an env to test with. Requested a Fyre env on which to try CodeReady Containers and then install Kabanero to test. |
@marikaj123 Below are the environment setup options we tried; the first 3 were unsuccessful. We need RHEL 8.0 to test this fix from @gireeshpunathil, and the Fyre env only provides an OCP 4.2 cluster with RHEL 7.6:
|
I will debug item (2) of the above. |
4 (continued from previous comment). We are working with @stevenschader, who is helping us set up a Fyre env with RHEL 8.0 and CodeReady Containers, which will give us OCP 4 in that container. We will then try to install Kabanero on top of it to test this fix. |
We were able to get a Fyre env with CodeReady Containers installed on top of RHEL 8.1, with OCP 4 running in that CRC container. I was able to run our Kabanero install successfully in this environment, and the first pipeline I ran on it, nodejs-express, was successful. @steven Schader helped me get the Fyre node set up with all of that installed. cc: @smcclem @bsulliv |
@gireeshpunathil : I tested with your test container This means that for the nodejs-express pipeline we get the same yum upgrade error on RHEL 7.6 and RHEL 8.1 with the appsody buildah container that has the fix. Environment OS
nodejs-express pipelinerun log snippet with error
Also i tested for another pipeline
|
@marikaj123 : After discussing with @gireeshpunathil, he said the nodejs-express yum upgrade issue above seems to be a separate issue, which we saw once in the past for one of the versions of Issue : appsody/stacks#629 |
@gireeshpunathil : since we were blocked by above issue to test with your image
So we could verify the test of running just |
@gireeshpunathil - any idea why it works for you on RHEL 8 and not for @aadeshpa? What is different? |
@jdmcclur - one or more of:
We are in the process of:
|
This issue seems to be a side effect of the fuse-overlay fix not being present in the host OS where we were testing, which is a RHEL 8.1 node. Marked it as closed for now. If needed, once we know the correct environment requirements to test this performance bug, we can re-open it in the collections repo (https://github.com/kabanero-io/collections/issues) rather than in the appsody stacks repo (https://github.com/appsody/stacks/issues) where it was created. |
@gireeshpunathil : I got another Fyre env with a single Ubuntu 18.04.3 node, where I installed Docker and ran your standalone container of
Steps Followed
Output : I was able to build the appsody project successfully in 5 to 6 min. This is the first time we could test the fix from @gireeshpunathil, in an Ubuntu 18.04.3 Fyre environment. As a second experiment, I ran a standalone container with the old buildah image that we use in the pipeline. Log snippet:
|
One more datapoint. I drove the pipeline against the appsody nodejs loopback stack which is built on top of Ubuntu. Thought we would avoid the yum upgrade issue in this and get further but unfortunately it also failed pretty early.
|
Need to connect with RH. |
@gireeshpunathil @kvijai82 : attaching the Red Hat ticket we opened. |
One of the replies from above ticket: Message (Associate) Hello, My name is Jake and I am a senior member of the containerization team and I am assisting on this case. With regards to fuse-overlay, it is currently not shipping in RHEL. However, it is scheduled to release with RHEL 7.8 in the RHEL Extras repository. I do not have a shareable ETA for the 7.8 release, however I have added this case to the bug report tracking the release of that package so that we may continue to provide updates on that front. Regards, Jake Hunsaker, RHCA Infrastructure II |
@gireeshpunathil : I tested by following the steps you posted in this issue: containers/buildah#2047
a) copy multiple folders and files after creating them: [root@svtcrc-389745-1 test_copy_multiple_small_folders]# find ./foo | wc -l Results
b) copy a tarball of the multiple-folder structure from point a). Result:
c) copy a single 5 MB file in txt format
d) copy a single tar file of the 5 MB txt file from point c): real 0m2.239s
NOTE: As the result for point a) shows, it took 3 min to copy 2111 folders and the files in them, whereas when I packed them into a tarball in point b), the copy took around 2 seconds. cc: @kvijai82 |
@marikaj123 : We had a call with @kvijai82, @gireeshpunathil and @neeraj.laad@uk.ibm.com. Neeraj said they will try to make some COPY command changes in the Appsody stack Dockerfile to optimize it, and we will have a checkpoint with them on Thursday. Dockerfile : |
We got some help from @neeraj.laad@uk.ibm.com, who gave us an updated appsody nodejs-loopback stack image with updated COPY commands in the Dockerfile. I tried to test it with Result However, when we tried updating the Dockerfile for the nodejs-loopback collections stack, building an image from it, and then tried to do
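The comparison above (many small files versus one tarball of the same tree) can be tried on a plain local filesystem with a minimal sketch like the one below. It is an assumption-laden illustration: it uses Python's `shutil` and `tarfile` rather than a buildah COPY step, the names `make_small_files` and `time_copies` are hypothetical, and absolute timings on a local disk will not match those seen inside a container build.

```python
import shutil
import tarfile
import tempfile
import time
from pathlib import Path


def make_small_files(root, dirs=20, files_per_dir=10, size=64):
    """Create dirs * files_per_dir small files under root; return the file count."""
    for d in range(dirs):
        sub = Path(root) / f"dir{d}"
        sub.mkdir(parents=True)
        for f in range(files_per_dir):
            (sub / f"file{f}.txt").write_bytes(b"x" * size)
    return dirs * files_per_dir


def time_copies(dirs=20, files_per_dir=10):
    """Time copying a tree file by file versus copying one tarball of that tree."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        src = tmp / "foo"
        count = make_small_files(src, dirs, files_per_dir)

        # Copy every directory and file individually.
        start = time.perf_counter()
        shutil.copytree(src, tmp / "copy_tree")
        tree_seconds = time.perf_counter() - start

        # Pack the same tree into one uncompressed tar archive.
        tarball = tmp / "foo.tar"
        with tarfile.open(tarball, "w") as tar:
            tar.add(src, arcname="foo")

        # Copy the single archive file.
        start = time.perf_counter()
        shutil.copy(tarball, tmp / "copy.tar")
        tar_seconds = time.perf_counter() - start

        # Check that the copied archive still holds every file.
        with tarfile.open(tmp / "copy.tar") as tar:
            archived = sum(1 for member in tar.getmembers() if member.isfile())

    return {
        "files": count,
        "archived": archived,
        "tree_seconds": tree_seconds,
        "tar_seconds": tar_seconds,
    }
```

For scale, taking the reported 3 min as 180 s, the copy of 2111 folders and files works out to roughly 85 ms per entry, while the tarball copy took about 2 seconds in total. That is consistent with the cost being per entry rather than per byte.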
|
After the discussion today, @groeges confirmed that the nodejs-loopback collection stack is not going to be used from kabanero 0-6-0, so we are no longer investigating the nodejs-loopback collection. We are shifting gears to investigate whether any changes can be made in the nodejs-express and nodejs collection stack Dockerfiles to improve their build performance, with help from @neeraj.laad@uk.ibm.com and the appsody team. Today I will try to test the modified nodejs-express collection stack image and note the time taken to just run |
Test in standalone container
Result
Result
There seems to be a significant difference with just my COPY command changes: from 12 min to 4 min. |
@steven.groeger @kilnerm: Performance bug fix in Nodejs and Nodejs-express stacks Summary: Then I incorporated those COPY changes into the nodejs-express stack in our Kabanero Collections repository and created my own image, which I used to test the build step and also a Kabanero build-push pipelinerun. The pipelinerun time dropped significantly, to almost half the original run time. Original build-push-pipelinerun without the changes in the nodejs-express collection stack image
build-push-pipelinerun with fixed nodejs-express collection stack image
Below are the pull requests in the Appsody stacks repo from @neeraj.laad |
PR for the nodejs and nodejs-express Dockerfile changes to improve performance: |
The build-bud step of the nodejs-build-deploy-pipeline and nodejs-loopback-build-deploy-pipeline runs takes over 40 minutes to complete.