Skip to content

Using SqlMigrator

Adam O'Neil edited this page Mar 9, 2021 · 21 revisions

Use the SqlMigrator<T> class to copy data row-by-row within or across databases. The T generic argument specifies your identity column type. Supported types are int, long, and Guid. SqlMigrator is intended for cloning or deep copy operations over related tables where you need to preserve relationships on copied rows as well as execute transformations on individual records. To get started, create a migrator object like this. (This example assumes a fictional GetConnection method that opens a SqlConnection in your app.)

using (var cn = GetConnection())
{
    var migrator = await SqlMigrator<int>.InitializeAsync(cn);

    // now you're ready to migrate
}

This will create a schema in your database called migrate and a table called KeyMap_int that tracks the identity values of every record the migrator processes. This is how SqlMigrator ensures that it handles every record exactly once, and how it maps the source and generated identity values on each row processed.

There are two main methods to know about: CopySelfAsync and CopyAcrossAsync. CopySelf operates on a single connection, used for cloning data within a database. CopyAcross does the same thing but to a second connection. Both methods call a lower-level implementation method CopyRowsAsync that accepts a DataTable argument of source data.

Let's walk through an example where you need to copy select rows from three related tables: Parent, Child and GrandChild. We start with the top-most table. This is adapted from an existing integration test, re-written slightly to be slightly more concise. This is based on a very simple database structure defined here. Note that the sample data is a little wacky -- it comes from my TestDataGen package. I used a simple incremented numeric suffix concatenated to my already absurd generated names to assure uniqueness.

await migrator.CopySelfAsync(cn, "dbo", "Parent", "Id", 
    "[Id]=@id", new { id = 3 }, 
    onEachRow: (cmd, row) =>
    {
        cmd["Name"] = row["Name"].ToString() + " - copy";
    });

This migrates rows from [dbo].[Parent] WHERE [Id]=3. Since this is our top-most table in the relationship, we are treating this as a single record migration. We're using the onEachRow callback to append the word "copy" to the generated record. The first callback argument cmd is a SqlCmdDictionary from my Dapper.CX library. This is used to build SQL INSERT and UPDATE statements dynamically. In the example above cmd["Name"] refers to the Name column in the resulting INSERT. The row argument of the callback is a DataRow.

If we look in the Key Map table at this point, we should see a mapping of the record we started with whose Id is 3 mapped to a new record Id. In this case the generated row has an Id of 11.

img

Likewise, when we look in the dbo.Parent table itself for these Id values, we see this -- both the source row and the generated row with the new Id value 11:

img

It gets more interesting when we move to the first level of child records. Here, we copy records from the Child table, again using our parent Id of 3. This time we use the mapForeignKeys argument to indicate that the ParentId column will get its values from the NewIds generated from dbo.Parent -- our previous step.

await migrator.CopySelfAsync(cn, "dbo", "Child", "Id", 
    "[ParentId]=@id", new { id = 3 }, 
    mapForeignKeys: new Dictionary<string, string>()
    {
        { "ParentId", "dbo.Parent" }
    });

When we look in the Key Map table, we see this:

img

We can compare the generated data side by side to see that the generated names look exactly like the original, but notice the Ids and more crucially the Parent Ids are different, and agree with the Parent mapping above of Id 3 -> 11.

img

Lastly, let's migrate the GrandChild records. This will work much the same as with the Child migration, but we map foreign keys a little differently, and we have to get a little creative with the criteria argument so that we can keep using our original ParentId of 3:

await migrator.CopySelfAsync(cn, "dbo", "GrandChild", "Id", 
    "[ParentId] IN (SELECT [Id] FROM [dbo].[Child] WHERE [ParentId]=@id)", new { id = 3 },
    mapForeignKeys: new Dictionary<string, string>()
    {
        { "ParentId", "dbo.Child" }
    });

The GrandChild table has a lot more rows, so I won't picture the whole Key Map:

img

We can do a similar side-by-side comparison of source and generated rows to see that the generated rows have the same names, but the right ParentId.

img

The main thing to note here is that the ParentId values in GrandChild are mapped from the generated Child Id values.

What's Next?

In this example, we had a simple database structure with a single foreign key lineage. In a real application, you'd likely have many foreign keys in a single table that all had to be preserved in cloned data, while ensuring no cross-parent leakage. You can map any number of foreign keys within a single CopySelfAsync call. The mapForeignKeys argument is a dictionary, so by definition it can have many keys in it that correspond to columns you need to remap.

Clone this wiki locally