Update retraining to work with RunMe #110

jlewi · 2024-05-16T23:41:35Z

We need to update our retraining code to work with the data produced by RunMe

Related to Support using Foyle's AI capabilities as a plugin stateful/runme#574

jlewi · 2024-05-26T18:35:08Z

Needed Changes

We need to update the Logs Analyzer to process the RunMe logs.

To Process the RunMe logs there's a couple things we need to do

We need to get the log entries associated with Execute Requests
~~We need to build a map from RunMe ULID to Foyle BlockIDs~~
- ~~We need to parse out the (ULID,ID) mapping from the LogEntry~~
- ~~This requires updating combineEntriesForTrace to store the ULID as part of the trace~~
buildTraces needs to support RunMe Execute Requests
- Option 1: Reuse ExecuteTrace to represent a RunMe execute request
- Option 2: Define a new RunmeExecuteTrace
- I think Option 2 is more consistent with where we are headed. For a RunMe trace different protos are used and we should log those rather than doing some sort of Lossy conversion to Foyle protos. If we just add those protos as fields to ExecuteTrace then it becomes harder to tell what type of trace we have and therefore what fields should be set
- Option 2 is more consistent with introducing a proto which will use a OneOf to combine types that are unique for each type of trace Use Pebble to store traces #91
- For RunMe execute traces we could use the _id field. This is a ULID but that shouldn't matter.
We will need to change buildTraces because right now we iterate over all the traces (Generate and Execute) and add them to the associated block log
- Now we need to do the following
- Iterate over all the Generate Traces
  - Create a mapping from RunMe ULID to BlockID
- Iterate over all the ExecuteTraces
  - Map ULID to BlockID
  - Lookup BlockID in BlockLog map.

Configuration

I think we should define a LearnerConfig and add it to our Config this should allow people to configure additional log locations.

jlewi · 2024-05-27T14:53:41Z

Alternative: Use ULIDs for BlocKIDs

An alternative is to change Foyle to use ULIDs for blockIDs.
I think having a single consistent ID will simplify things and avoid the headaches caused by constantly needing to map back and forth.
Its possible RunMe can work with UUIDs but we can just use ULIDs. IDs are string and we don't much care what they are as long as they are unique.

* RunMe uses ULIDs for cell ids. If we switch to using ULIDs we can be consistent with what RunMe does and avoid having two different types of ids for blocks. This will avoid a lot of headaches. Related to #110

jlewi · 2024-05-27T16:52:30Z

Right now each type of trace is stored in a flat JSONL file

foyle/app/pkg/analyze/analyzer.go

Line 135 in 23a7b1e

genEnc := json.NewEncoder(genFile)

We rely on the filename to know the type of trace.
If we don't fix #91 first; we will have to add another set of JSONL files.

Switch to using Pebble as a Key Value store to store traces and block logs * To support that we create a Trace proto which uses a oneof field to store the fields for the different types of traces * This will make it easier to support using RunMe logs for training #110 Fix #91

jlewi · 2024-05-28T02:18:02Z

With #121 we should be code complete but will need to do some E2E testing to see if its all working.

* Define a combineRunmeEntries function to generate the RunmeTrace * Define a RunMe trace type * Update Analyzer to parse the RunMe logs and construct traces * Update Analayzer to update the BlockLog based on the RunMe trace * Set the executed command Fix #110

jlewi · 2024-05-28T14:34:36Z

So how do we verify everything is working

Use RunMe to generate a completion
Correct the completion
Train
Verify example is added to the training set
Generate a similar completion
Verify the example is used in the prompt

Since the RunMe changes haven't been released yet. We need to use our locally built version of RunMe.

jlewi · 2024-05-28T23:24:57Z

No examples are produced based on the foyle logs.
When I looked at the blocklog for block 01HZ0N1228NJ7PSRYB6WEMH08M there was no exec id.

jlewi · 2024-05-28T23:42:40Z

I didn't update the learning code to include the directory of the RunMe logs

* Fix training on RunMe logs #110 * We need to allow Foyle to be configured with the directory of the RunMe logs * Add LearnerConfig to Foyle config with the logs directory * Fix a bug in analyze preventing BlockLogs from being updated with the executed block when using RunMe * The bug was that when looking for the last execution trace; we weren't considering RunMe traces * Update the RunMe developer guide * Update the docs

jlewi mentioned this issue May 16, 2024

Support using Foyle's AI capabilities as a plugin stateful/runme#574

Closed

3 tasks

jlewi mentioned this issue May 27, 2024

Use ULIDs for block ids #120

Merged

This was referenced May 27, 2024

Support Learning using RunMe logs #121

Merged

Use Pebble and Protos For Traces and BlockLogs #122

Merged

jlewi closed this as completed in #121 May 28, 2024

jlewi reopened this May 28, 2024

jlewi mentioned this issue May 29, 2024

Fix retraining for RunMe #124

Merged

jlewi closed this as completed May 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update retraining to work with RunMe #110

Update retraining to work with RunMe #110

jlewi commented May 16, 2024

jlewi commented May 26, 2024 •

edited

Loading

jlewi commented May 27, 2024

jlewi commented May 27, 2024

jlewi commented May 28, 2024

jlewi commented May 28, 2024

jlewi commented May 28, 2024

jlewi commented May 28, 2024

Update retraining to work with RunMe #110

Update retraining to work with RunMe #110

Comments

jlewi commented May 16, 2024

jlewi commented May 26, 2024 • edited Loading

Needed Changes

Configuration

jlewi commented May 27, 2024

Alternative: Use ULIDs for BlocKIDs

jlewi commented May 27, 2024

jlewi commented May 28, 2024

jlewi commented May 28, 2024

jlewi commented May 28, 2024

jlewi commented May 28, 2024

jlewi commented May 26, 2024 •

edited

Loading