Tracing: Adds ITrace as the default tracing implementation for CosmosDiagnostics #2097

Merged
merged 58 commits into master from users/brchon/TracingRemoveCosmosDiagnostics
Feb 8, 2021

Conversation

bchong95

Tracing: Adds ITrace as the default tracing implementation for CosmosDiagnostics

Description

This PR removes CosmosDiagnosticsContext in favor of ITrace, which has an easier-to-read pretty print, along with the other benefits mentioned here:

#1841

Compute will need to update their diagnostics visitors, which should be simple: ITrace is a tree-like structure, so visibility is a first-class concept. ITrace then exposes unstructured data such as QueryMetrics and ClientSideRequestStats through its Data properties. Users can visit the TraceDatum class to reach the derived class and its properties.
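To make the tree-plus-visitor idea concrete, here is a minimal sketch in Python rather than the SDK's C#. Every name in it (`Trace`, `TraceDatum`, `QueryMetricsDatum`, `ClientSideRequestStatsDatum`, `DatumVisitor`, `render`) is hypothetical and chosen for illustration; none of it is the actual ITrace API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class TraceDatum:
    """Hypothetical base class for data attached to a trace node."""

    def accept(self, visitor: DatumVisitor) -> str:
        raise NotImplementedError


@dataclass
class QueryMetricsDatum(TraceDatum):
    retrieved_document_count: int

    def accept(self, visitor: DatumVisitor) -> str:
        # Double dispatch: the datum picks the visitor method for its type.
        return visitor.visit_query_metrics(self)


@dataclass
class ClientSideRequestStatsDatum(TraceDatum):
    contacted_replicas: list[str]

    def accept(self, visitor: DatumVisitor) -> str:
        return visitor.visit_client_side_request_stats(self)


@dataclass
class Trace:
    """Hypothetical trace node: a name, attached data, and child nodes."""

    name: str
    data: dict[str, TraceDatum] = field(default_factory=dict)
    children: list[Trace] = field(default_factory=list)

    def start_child(self, name: str) -> Trace:
        child = Trace(name)
        self.children.append(child)
        return child


class DatumVisitor:
    """Turns each derived datum into one line of text."""

    def visit_query_metrics(self, datum: QueryMetricsDatum) -> str:
        return f"Retrieved Document Count: {datum.retrieved_document_count:,}"

    def visit_client_side_request_stats(
        self, datum: ClientSideRequestStatsDatum
    ) -> str:
        return "Contacted Replicas: " + ", ".join(datum.contacted_replicas)


def render(trace: Trace, visitor: DatumVisitor, depth: int = 0) -> list[str]:
    """Walk the tree depth-first, visiting every datum on every node."""
    indent = "  " * depth
    lines = [f"{indent}{trace.name}"]
    for key, datum in trace.data.items():
        lines.append(f"{indent}  [{key}] {datum.accept(visitor)}")
    for child in trace.children:
        lines.extend(render(child, visitor, depth + 1))
    return lines
```

A consumer that only cares about one datum type would override just that visit method, which is roughly what updating a diagnostics visitor amounts to in this sketch.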

Follow-up work items are to expose ITrace in all our async APIs. Currently we only provide it in our FeedOperations, but eventually point operations will need it as well.

Tests have been updated and new baseline tests have been added, so that we can see what the text output looks like at all times.
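As a rough illustration of what a baseline test does, the sketch below compares freshly rendered text against a stored baseline and reports a diff. It is written in Python for brevity and is an assumption about the general technique, not the SDK's test harness; `diff_against_baseline` is a hypothetical helper.

```python
import difflib


def diff_against_baseline(actual: str, baseline: str) -> list[str]:
    """Return unified-diff lines; an empty list means the output matches."""
    return list(
        difflib.unified_diff(
            baseline.splitlines(),
            actual.splitlines(),
            fromfile="baseline",
            tofile="actual",
            lineterm="",
        )
    )
```

A test would fail whenever the returned list is non-empty, so any change to the text output shows up in review as a baseline change.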

Here is the text output for each of the TraceData:

Point Operation Statistics

[Point Operation Statistics]
Activity ID: 00000000-0000-0000-0000-000000000000
Status Code: OK/WriteForbidden
Response Time: 2020-01-02T03:04:05.0060000
Request Charge: 4
Request URI: http://localhost.com
Session Tokens: RequestSessionToken / ResponseSessionToken

QueryMetrics

[Query Metrics]
Retrieved Document Count                 :           2,000             
Retrieved Document Size                  :       1,125,600 bytes       
Output Document Count                    :           2,000             
Output Document Size                     :       1,125,600 bytes       
Index Utilization                        :          100.00 %           
Total Query Execution Time               :           33.67 milliseconds
  Query Preparation Times
    Query Compilation Time               :            0.06 milliseconds
    Logical Plan Build Time              :            0.02 milliseconds
    Physical Plan Build Time             :            0.10 milliseconds
    Query Optimization Time              :            0.01 milliseconds
  Index Lookup Time                      :            0.36 milliseconds
  Document Load Time                     :            9.58 milliseconds
  Runtime Execution Times
    Query Engine Times                   :           33.55 milliseconds
    System Function Execution Time       :            0.05 milliseconds
    User-defined Function Execution Time :            0.07 milliseconds
  Document Write Time                    :           18.10 milliseconds
Index Utilization Information
  Utilized Single Indexes
    Filter Expression: FilterExpression
    Index Spec: IndexDocumentExpression
    Index Impact Score: IndexImpactScore
    ---
  Potential Single Indexes
    Filter Expression: FilterExpression
    Index Spec: IndexDocumentExpression
    Index Impact Score: IndexImpactScore
    ---
  Utilized Composite Indexes
    Index Spec: 
    Index Impact Score: IndexImpactScore
    ---
  Potential Composite Indexes
    Index Spec: 
    Index Impact Score: IndexImpactScore
    ---

Client Side Request Stats

[Client Side Request Stats]
Start Time: 0001-01-01T00:00:00.0000000
End Time: 9999-12-31T23:59:59.9999999
Contacted Replicas
  http://someuri1.com/: 1
  http://someuri2.com/: 1
Failed to Contact Replicas
  http://someuri1.com/
  http://someuri2.com/
Regions Contacted
  http://someuri1.com/
  http://someuri2.com/
Address Resolution Statistics
┌──────────────────────┬──────────────────────┬────────────────────────────────────┐
│Start Time (utc)      │End Time (utc)        │Endpoint                            │
├──────────────────────┼──────────────────────┼────────────────────────────────────┤
│  1/1/0001 12:00:00 AM│12/31/9999 11:59:59 PM│                http://localhost.com│
│  1/1/0001 12:00:00 AM│12/31/9999 11:59:59 PM│                http://localhost.com│
└──────────────────────┴──────────────────────┴────────────────────────────────────┘
Store Response Statistics
  Start Time: 0001-01-01T00:00:00.0000000
  End Time: 9999-12-31T23:59:59.9999999
  Resource Type: Document
  Operation Type: Query
  Store Result
    Activity Id: 00000000-0000-0000-0000-000000000000
    Store Physical Address: http://storephysicaladdress.com/
    Status Code: 0/Unknown
    Is Valid: True
    LSN Info
      LSN: 1337
      Item LSN: 15
      Global LSN: 1234
      Quorum Acked LSN: 23
      Using LSN: True
    Session Token: 42
    Quorum Info
      Current Replica Set Size: 4
      Current Write Quorum: 3
    Is Client CPU Overloaded: False
    Exception
    Microsoft.Azure.Documents.InternalServerErrorException: Unknown server error occurred when processing this request., Windows/10.0.19042 cosmos-netstandard-sdk/3.15.4

CPU History

[CPU History]
(0001-01-01T00:00:00.0000000 42.000), (0001-01-01T00:00:00.0000000 23.000)

@kirankumarkolli

kirankumarkolli commented Dec 28, 2020

The existing diagnostics were reasonable. We have something similar in Java, and it is working great.
@bchong95 can you please set up a review meeting to go over the goals and the motivation, so that we have a common understanding before this PR gets merged? #Resolved


@kirankumarkolli kirankumarkolli left a comment


Waiting for the goals review meeting follow-up.

@kirankumarkolli

Sync-up with Samer, Fabian, Jake, Tim

  1. Compare older and new diagnostics
  2. Perf validation (existing benchmarks, point operations)

Without serialization
With serialization

sboshra previously approved these changes Feb 6, 2021
@sboshra sboshra merged commit 540070c into master Feb 8, 2021
@sboshra sboshra deleted the users/brchon/TracingRemoveCosmosDiagnostics branch February 8, 2021 17:54
@j82w j82w mentioned this pull request Feb 8, 2021
bchong95 added a commit that referenced this pull request Feb 8, 2021
j82w added a commit that referenced this pull request Apr 9, 2021
…s was not included in exception scenarios (#2375)

The client side request stats were only added to the ITrace in success scenarios. If there was any exception, the stats would not be included. This changes the logic to always add the client side request stats to the ITrace.

The regression was introduced in 3.17.0 with PR #2097
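The shape of that fix can be sketched as follows, in Python and under stated assumptions: `send_with_trace`, the `trace` dictionary, and the `stats` layout are all hypothetical stand-ins, not the SDK's code. The point is only that attaching the stats in a `finally` block covers both the success path and the exception path.

```python
from typing import Any, Callable


def send_with_trace(trace: dict, send: Callable[[dict], Any]) -> Any:
    """Run `send`, always recording the request stats on the trace."""
    stats: dict = {"contacted_replicas": []}
    try:
        # `send` fills in `stats` as it goes and may raise.
        return send(stats)
    finally:
        # Runs on success and on exception, so the stats are never dropped.
        trace["Client Side Request Stats"] = stats
```

Attaching the stats only after a successful return would reproduce the regression described above, since an exception would skip that line.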