
How to improve logging and tracing support in future release? #1061

Closed
jackc opened this issue Aug 7, 2021 · 28 comments
Labels
RFC requesting feedback for proposed change
Milestone

Comments

@jackc
Owner

jackc commented Aug 7, 2021

The original design goal for logging was simply to log what queries are being run and how long they take.

There have been a few requests to support additional logging / tracing points or new ways of configuring what is currently logged (#853, #998). The current logging system works fine for me, but as I'm considering the possibility of a new major release I'd like to open the door to breaking changes to the logging / tracing interface.

As I do not have a first-hand need for this I do not have any opinion on what these changes might be, and I would probably not implement them myself, but I wanted to bring up the possible opportunity for those who may have the interest and ability while there is a window open for possible breaking changes.

@jackc jackc added v5 RFC requesting feedback for proposed change labels Aug 7, 2021
@Tochemey

Tochemey commented Sep 8, 2021

Hello @jackc, thanks for such a feature. That will help a lot.

@HakShak

HakShak commented Sep 13, 2021

@kszafran

kszafran commented Oct 6, 2021

Is anyone working on this? I'm considering switching to the pgx interface to take advantage of the binary protocol, but if it stops us from integrating with e.g. DataDog in the future (DataDog/dd-trace-go#697) that could be a problem.

@jackc jackc added this to the v5 milestone Nov 15, 2021
@jackc jackc removed the v5 label Nov 15, 2021
@Tochemey

@jackc any progress on this?

@jackc
Owner Author

jackc commented Nov 26, 2021

@jackc any progress on this?

I have not heard of anyone working on this.

@obitech

obitech commented Dec 27, 2021

I'm wondering how practical it would be to provide some sort of hooks or middleware chains which could be triggered before or after an Exec is called on the underlying PgConn. Similar to the BeforeAcquire functionality in a Pool. These hooks could just receive a context which is used for span propagation. WDYT @jackc ?

@jackc
Owner Author

jackc commented Dec 27, 2021

I'm wondering how practical it would be to provide some sort of hooks or middleware chains which could be triggered before or after an Exec is called on the underlying PgConn. Similar to the BeforeAcquire functionality in a Pool. These hooks could just receive a context which is used for span propagation. WDYT @jackc ?

I like the idea of hooks, but there would need to be a lot more than before / after pgconn.Exec. There's also pgconn.ExecPrepared and pgconn.ExecParams. Then those methods only send the queries. They don't read the results. There would need to be something after the results are received. And maybe when the first result comes back. Also, there are batch requests.

I could see something like:

type BeforeQueryHook interface {
  BeforeQuery(...)
}

Every hook point could have its own interface. The logger or tracer could implement as many or as few hooks as it wanted to. This would allow adding or even removing hooks in a backwards compatible way.

@obitech

obitech commented Dec 27, 2021

A couple more questions, since I'm not very familiar with the inner workings of pgx:

  • Would there be interfaces for every command like BeforeQueryHook, AfterQueryHook, BeforeQueryRowHook, and so on?
  • I assume those hooks would need to be present for Conns, Pools and Txs?

I could try looking into a PR (if you want), but I would probably need some guidance along the way.

@jackc
Owner Author

jackc commented Dec 27, 2021

Would there be interfaces for every command like BeforeQueryHook, AfterQueryHook, BeforeQueryRowHook, and so on?

It would be a separate interface for each hook. But I don't think there would need to be a separate one for each method. Query, QueryRow, and QueryFunc at least could all use the same hooks. Maybe Exec as well. My guess would be that hooks for send query, first response received, and query complete would be applicable to all queries.

I assume those hooks would need to be present for Conns, Pools and Txs?

I don't think so. Txs and pools ultimately call methods on Conn so *pgx.Conn is probably the only thing that would need to be modified. Potentially pgxpool would have a different set of hooks to track acquire time.

@mrsufgi

mrsufgi commented Feb 16, 2022

Well, I guess hooks would be the best approach, but wrapping the driver is a simpler one - kind of similar to the work they did on https://github.com/uptrace/opentelemetry-go-extra/tree/main/otelsql

Tracing does work if you use stdlib.OpenDB, but I guess everyone would benefit from a native pgx.Conn wrapper instead (for advanced pgx functionality such as CopyFrom).

I do think that this should not be done as part of this repo, as it would be an extension (wrapper) on top of the pgx driver. I know this makes it more "brittle", but it would keep things simpler to maintain (as @jackc's goals are different from "tracing").

As for the tracing engine, I do believe the community should focus on OpenTelemetry (and not OC/DD/Jaeger etc.), as it's the de facto standard for tracing.

As this is a major blocker for my team and me, I will try to find the time to develop this myself.
(We use sqlc and otelsql/OpenTelemetry with the old lib/pq driver.)

My approach would be wrapping pgx.Conn, kind of like what otelsql did.

For everyone else interested in my current setup (sqlx, pgx driver [stdlib] and otelsql):

func (cfg *PostgresConfig) String() string {
	return fmt.Sprintf("host=%s port=%s user=%s password=%s dbname=%s sslmode=disable", cfg.Host, cfg.Port, cfg.User, cfg.Password, cfg.DB)
}

func NewPostgres(ctx context.Context, pgcfg ...PostgresConfig) *sqlx.DB {
	cfg := newPostgresConfigDefault()
	if len(pgcfg) > 0 {
		// TODO: merge pgcfg with default values
		cfg = &pgcfg[0]
	}

	dcfg, err := pgx.ParseConfig(cfg.String())
	if err != nil {
		log.S(ctx).Fatalf("unable to parse postgres connection string: %v", err.Error())
	}

	// Wrap the pgx connector with otelsql so queries are traced.
	db := otelsql.OpenDB(stdlib.GetConnector(*dcfg))

	// sqlx.NewDb never returns an error; it just wraps the *sql.DB.
	return sqlx.NewDb(db, "pgx")
}

(Note that I can't use otelsqlx [uptrace] because it doesn't support OpenDB with an existing connector; this is why I use sqlx.NewDb to create the sqlx connection.)

And in my repositories I do something like:

func NewPgProjectsRepository(conn *sqlx.DB) *PgProjectsRepository {
	r := &PgProjectsRepository{
		conn:   conn,
		db:     db.New(conn),
	}

	return r
}

P.S.

I'm 100% down for replacing sqlx with pgx, but I'm not sure it covers sqlx's functionality for scanning into structs, and I don't know how to convert a pgx.Conn to stdlib to "downgrade".

In one of my attempts I tried using AcquireConn / conn.Raw on stdlib, but it was kind of a dead end since I don't know when to release a connection, and I want to use it like I use a database/sql conn.

@treuherz
Contributor

The crucial thing about any hooks-based solution as I see it is ensuring consistency of the context sent through. OpenTelemetry, and I believe its predecessor frameworks, all work by stuffing the span and correlation information into the ctx. Pgx is already great about making sure a passed-in context is preserved, but for these hooks to be able to correlate calls within the path of a single request, the right context needs to be passed in and out of the hooks in the right order.

If a bit more background on what information is needed is helpful, the OpenTelemetry spec provides a list of keys for traces on database calls. Obviously these would actually be populated by an otel integration library, but hopefully this is a helpful indication of what info the hooks will want to be able to collect, beyond the basics like timings and SQL content.

@duxing

duxing commented May 21, 2022

Any update on this topic?
Similar to many people in this thread, I would love to use otel or ddtrace to trace pgx, but neither is well supported at the moment. I still prefer pgx over pg, but I would like to see improvement on the observability side.

jackc added a commit that referenced this issue Jul 16, 2022
Replaces existing logging support. Package tracelog provides adapter for
old style logging.

#1061
@jackc
Owner Author

jackc commented Jul 16, 2022

I just pushed a big change to the v5-dev branch (78875bb). I figured tracing would show up some time and I really didn't want to have both tracing and logging interfaces. So I've implemented a tracing interface and created an adapter that converts normal loggers to the tracing interface.

There are various interfaces a tracer can implement. e.g.

// QueryTracer traces Query, QueryRow, and Exec.
type QueryTracer interface {
	// TraceQueryStart is called at the beginning of Query, QueryRow, and Exec calls. The returned context is used for the
	// rest of the call and will be passed to TraceQueryEnd.
	TraceQueryStart(ctx context.Context, conn *Conn, data TraceQueryStartData) context.Context

	TraceQueryEnd(ctx context.Context, conn *Conn, data TraceQueryEndData)
}

As I mentioned in my original post, the simple logging that pgx supports in v4 was enough for my needs at the time. The tracelog package adapts old loggers to the new tracing interface. This works well enough for me. But as for tracing per se, I still don't need that. So my design was based on the discussion in this thread and general experience -- not first-hand need or specific experience.

This is the last major change planned for v5. I plan to release the (hopefully) final alpha release within the next few days (maybe even tonight), then a beta release a couple weeks after that, and the general release a few weeks after that.

All that to say: I'd suggest that those who care about this feature try it soon. Once v5 is released, the interface will be frozen for a long time.

@obitech

obitech commented Jul 19, 2022

I've tried this out locally and I have no complaints so far regarding the API. Thank you @jackc!

Anyone who wants to give this a spin, I've created https://github.com/exaring/otelpgx based on the v5.0.0-alpha.5 tag.

@jnst

jnst commented Jul 24, 2022

@jackc
I would like to implement Datadog tracing, but at the same time I need the log output of the newly provided tracelog package.

Copying and pasting code from the tracelog package when implementing the QueryTracer interface is redundant.
Shouldn't ConnConfig.Tracer be a slice type?

Or I think Logger and Tracer should exist separately.

@jackc
Owner Author

jackc commented Jul 30, 2022

@jnst I think the tracer implementation wrapping another tracer (middleware style) makes more sense than a slice of tracers. That gives application code the ability to control which tracers run and in what order.

@jnst

jnst commented Aug 3, 2022

@jackc
The current design has the following problems.

  • Log output is practically always used, so tracelog must be wrapped and implemented.
  • Wrapping and implementing code can be complex and violates the single-responsibility principle.
  • It is even more complicated in cases where three or four tracers are needed.
  • Unit tests are also complicated by the need to use mocks.

The practice of executing operations one at a time over a slice is often seen in server application interceptors.

I am using Datadog in my product and am trying to add pgx support to visualize database processing times in distributed tracing.
I have actually implemented the following code. It seems a bit verbose, and I would appreciate it if you could improve it.

func (t *pgxTracer) TraceQueryStart(ctx context.Context, conn *pgx.Conn, data pgx.TraceQueryStartData) context.Context {
	ctx = t.logTracer.TraceQueryStart(ctx, conn, data)

	var c *pgx.ConnConfig
	if conn != nil {
		c = conn.Config()
	}
	span := t.startSpan(ctx, "query", data.SQL, c)
	if span != nil {
		return tracer.ContextWithSpan(ctx, span)
	}
	return ctx
}

Click here for the entire code.

@obitech

obitech commented Aug 3, 2022

To come to @jackc's defense, I don't see how these are problems, to be honest.

Log output is practically always used, so tracelog must be wrapped and implemented.

This might be true in your case, but loads of people don't need logs.

Wrapping and implementing code can be complex and violates the principle of single responsibility.

Wrapping middlewares is a standard procedure, used in many popular router implementations in Go.

Unit tests are also complicated by the need to use mock.

This is an implementation detail and I don't see how a slice of tracers would simplify this.

Also, you link to a blog post, but that's just one opinion. If you look at a hugely popular post on HTTP services, wrapping is encouraged. And chi, one of the most popular HTTP routers in the Go ecosystem, uses wrapping middleware handlers.

I don't have stakes in this as I'm just a (happy) user of pgx, but I understand why the proposed Tracer interfaces have been chosen: they keep the pgx code simple while still enabling lots of use cases for its users. It should not be pgx's task to make your code less verbose.

@jackc
Owner Author

jackc commented Aug 5, 2022

@jnst

Multiple tracers can be implemented in several ways. I happen to prefer the wrapping style like the HTTP middleware mentioned above. But a slice of tracers can itself be easily implemented as a Tracer. Something like this should do what you want:

type MultiQueryTracer struct {
	Tracers []QueryTracer
}

func (m *MultiQueryTracer) TraceQueryStart(ctx context.Context, conn *Conn, data TraceQueryStartData) context.Context {
	for _, t := range m.Tracers {
		ctx = t.TraceQueryStart(ctx, conn, data)
	}

	return ctx
}

func (m *MultiQueryTracer) TraceQueryEnd(ctx context.Context, conn *Conn, data TraceQueryEndData) {
	for _, t := range m.Tracers {
		t.TraceQueryEnd(ctx, conn, data)
	}
}

@mrkagelui

Is anyone working on a Datadog integration? If not, I'll try to (FYI - DataDog/dd-trace-go#697 (comment)).

@egon12

egon12 commented Oct 22, 2022

Hi @jackc, I'm trying to create a tracer for New Relic right now, and to unit test it.

For this function:

cmd, err := conn.Exec(ctx, query, args...)

there is no problem.

But for:

rows, err := conn.Query(ctx, query, args...)

I can't see where TraceQueryEnd is called.
I tried to find it in the Query function here:

pgx/conn.go

Line 624 in 3e825ec

func (c *Conn) Query(ctx context.Context, sql string, args ...any) (Rows, error) {

but TraceQueryEnd is not in this function.
Is that expected?

@egon12

egon12 commented Oct 24, 2022

After more reading of the code, I found out that TraceQueryEnd is called when we call rows.Close().
Sorry for asking an unnecessary question.
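For anyone unit testing a tracer against this behavior, here is a toy model of the lifecycle (the names are illustrative, not pgx API): the start hook fires when Query is called, but the end hook only fires once the caller closes the rows, so a test must call rows.Close() (or exhaust the rows) before asserting on TraceQueryEnd.

```go
package main

import "fmt"

// Rows mimics the relevant behavior: the end hook runs on Close, at most once.
type Rows struct {
	onClose func()
	closed  bool
}

func (r *Rows) Close() {
	if !r.closed {
		r.closed = true
		r.onClose() // this is the point where the tracer's end hook runs
	}
}

func main() {
	events := []string{"TraceQueryStart"} // fires up front, when Query is called
	rows := &Rows{onClose: func() { events = append(events, "TraceQueryEnd") }}
	// ...iterate over rows here...
	rows.Close() // only now does the end hook fire
	fmt.Println(events) // [TraceQueryStart TraceQueryEnd]
}
```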

@mrkagelui

Hi @jackc, would it be possible to get your opinion (or anyone else's on this thread) on DataDog/dd-trace-go#1537 ?

@mrkagelui

mrkagelui commented Oct 25, 2022

Latest update: since dd-trace-go supports Go versions three releases back (1.17 as of now), that integration has to wait. But could anyone please comment on the API as well as the data collected?

@jackc
Owner Author

jackc commented Oct 28, 2022

@mrkagelui I haven't used DataDog, so my opinion probably isn't super valuable, but I took a quick look and nothing in the pgx parts looked out of the ordinary.

@jackc
Owner Author

jackc commented Oct 28, 2022

I'm going to close this issue since the feature was merged into v5 and v5 has now been released. For any bugs or new features we can create new issues or discussions.

@jackc jackc closed this as completed Oct 28, 2022
@mrkagelui

@mrkagelui I haven't used DataDog, so my opinion probably isn't super valuable, but I took a quick look and nothing in the pgx parts looked out of the ordinary.

Thank you so much! Yes, I wanted to see whether the data collected makes sense and whether it's the correct use of the interface and data.

@zaydek

zaydek commented Oct 29, 2024

If anyone's struggling with this, I threw this together based on a comment in this issue and another (both are linked in the code file). This uses the slog logger interface.

https://gist.github.com/zaydek/91f27cdd35c6240701f81415c3ba7c07
