Update Json with more error checking and parsing #360

Closed
wants to merge 13 commits into from


hecon5
Contributor

@hecon5 hecon5 commented Sep 15, 2022

Since the last PR caused an unexpected performance hit, I'm trying to track down the performance-limiting items.

To do this, I opted to move the JSON code into a class (for now), which makes it easier to set default options.

As of now, I'm getting a minimum parse time of 3 seconds for the giant vcs-index file, even with the original parser, not the 0.51s reported by @joyfullservice. I'm not sure what is going on, because it should be considerably faster, so I'm putting it up here for @joyfullservice to test on his end and see whether his results match mine.

@hecon5
Contributor Author

hecon5 commented Sep 15, 2022

While testing, I decided to fuzz the index you provided, @joyfullservice, and to write a tool for fuzzing future ones.

This lets us build huge JSON files without publicly exposing any real data.

A sample:
image

If you're ok with it, I'd like to put the new testing file in the \testing\ directory.
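
A rough sketch of the kind of fuzzing described above, written in Python rather than VBA and not taken from this PR. It assumes the index is plain JSON; the names `fuzz_string` and `fuzz_json` are hypothetical. The idea is to replace every letter and digit with a random one of the same class while keeping punctuation, nesting, and string lengths, so the fuzzed file should put a similar load on a parser without revealing the original names.

```python
import json
import random
import string


def fuzz_string(s, rng):
    """Swap ASCII letters/digits for random ones of the same class.

    Punctuation (slashes, dots, spaces) and non-ASCII characters are kept,
    so the string length and its separators are unchanged.
    """
    out = []
    for ch in s:
        if ch.isascii() and ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        elif ch.isascii() and ch.isdigit():
            out.append(rng.choice(string.digits))
        else:
            out.append(ch)
    return "".join(out)


def fuzz_json(node, rng):
    """Recursively fuzz keys and string values; leave other types alone.

    Caveat: two fuzzed keys could collide by chance, in which case one
    entry would be lost. A real tool would need to detect and retry.
    """
    if isinstance(node, dict):
        return {fuzz_string(k, rng): fuzz_json(v, rng) for k, v in node.items()}
    if isinstance(node, list):
        return [fuzz_json(v, rng) for v in node]
    if isinstance(node, str):
        return fuzz_string(node, rng)
    return node  # numbers, booleans and null pass through unchanged


if __name__ == "__main__":
    # Made-up sample data, not the real index.
    sample = {"Components": {"modules/basUtil.bas": {"Hash": "a1b2c3", "Size": 10}}}
    print(json.dumps(fuzz_json(sample, random.Random()), indent=2))
```

Seeding `random.Random` with a fixed value would make the fuzzed file reproducible between runs, which may be useful for a test fixture.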

@joyfullservice
Owner

Here is what I am getting after building from your branch:

--------------------------------------------------
                PERFORMANCE REPORTS
--------------------------------------------------
Category                      Count     Seconds
--------------------------------------------------
TestLoadingFileOld            5         2.7879
TestLoadingFileNew            5         2.7376
TestLoadingFileNew2           5         2.7547
--------------------------------------------------
TOTALS:                       15        8.2802
--------------------------------------------------

--------------------------------------------------
Operations                    Count     Seconds
--------------------------------------------------
Read File                     2         0.0102
Parse JSON                    15        8.1622
--------------------------------------------------
Other Operations                        0.1185
--------------------------------------------------

…x without exposing the file names (potentially releasing confidential or private data).
… of this function in my environment; 1-2 seconds worth in ultra large json files.
… string character; stop execution if that's not true. Add chunk size increases for performance tuning.
@hecon5
Contributor Author

hecon5 commented Sep 15, 2022

Interesting; my mind is BLOWN, because it's a solid 3-5 seconds per loop (15-17 seconds total) for me on the same code.

@joyfullservice
Owner

I did a little more testing, and noted that you are doing 5 iterations for each of these tests. The 0.5 seconds I was getting on the performance side was for a single iteration of loading JSON content. Here is a comparison after adding in a couple other tests for side-by-side comparison.

--------------------------------------------------
                PERFORMANCE REPORTS
--------------------------------------------------
Category                      Count     Seconds
--------------------------------------------------
TestLoadingFileOld            5         2.7620
TestLoadingFileNew            5         2.6337
TestLoadingFileNew2           5         2.7546
TestLoadingFileOriginal       5         2.7193
AdamsOriginalTest             5         2.8005
--------------------------------------------------
TOTALS:                       25        13.6701
--------------------------------------------------

--------------------------------------------------
Operations                    Count     Seconds
--------------------------------------------------
Read File                     7         0.0509
Parse JSON                    25        13.4363
--------------------------------------------------
Other Operations                        0.1972
--------------------------------------------------

From this, it appears that there is no appreciable performance difference between the different approaches taken. As to the actual performance on my machine, I am running an older i5 (Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz) and the file is located on my Samsung EVO SSD with Rapid Mode on and no disk encryption. (Although we are probably looking at in-memory processing here.)

Hopefully that gives you a little more comparative data for evaluating the performance enhancements. 😄
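
To make the total-versus-single-iteration point concrete, here is a small Python sketch (not the VBA test harness used in this thread). `time_parse` and `per_iteration` are hypothetical helper names, and `json.loads` merely stands in for the parser under test. It reports both the total and the per-iteration time, which is the distinction behind the 0.5 second and roughly 2.7 second figures above.

```python
import json
import time


def time_parse(text, iterations=5, parse=json.loads):
    """Parse `text` `iterations` times; return (total, per-iteration) seconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        parse(text)
    total = time.perf_counter() - start
    return total, total / iterations


def per_iteration(total_seconds, count):
    """Convert a reported total into an average time per iteration."""
    return total_seconds / count


if __name__ == "__main__":
    # TestLoadingFileOld from the first report: 5 iterations, 2.7879 s total,
    # i.e. about 0.558 s per iteration.
    print(f"{per_iteration(2.7879, 5):.4f} s per iteration")

    total, each = time_parse('{"a": [1, 2, 3]}', iterations=5)
    print(f"total {total:.6f} s, per iteration {each:.6f} s")
```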

@hecon5
Contributor Author

hecon5 commented Sep 15, 2022

Well, after looking into it, I think it may actually be Access's fault on my end.

Intel(R) Xeon(R) E-2276M CPU @ 2.80GHz 2.81 GHz, 32 GB RAM, with a 1 TB SSD.

Access 365, 64-bit, Version 2202. Then I found this thread, which suggests I might have a version susceptible to the slowdown: https://answers.microsoft.com/en-us/msoffice/forum/all/vba-code-is-very-slow-in-office-365-version-of/e9b7bc57-baa3-4c73-b229-47f0a460ccd8

@hecon5
Contributor Author

hecon5 commented Sep 27, 2022

After several days of head banging, I will probably end up reverting all the string buffer tooling. However, I've found that if I turn it into a class, it becomes a lot more useful and can handle automatic property loading (useful in my other project). I'll be cleaning this up shortly.

@hecon5
Contributor Author

hecon5 commented Sep 27, 2022

@joyfullservice: can you pull this in and see if the new test routine still works on your end? I made some tweaks that help (me) a little bit, but before I roll those into the formal PR, I want to ensure I didn't break anything else.

@hecon5
Contributor Author

hecon5 commented Nov 14, 2022

@joyfullservice bump
