"Mo"
Published on

Adding Database Backup Support to Testcontainers

Authors

Testcontainers is a relatively new library that can help a lot with working with resources like databases by using containerization. From the website:

Testcontainers is an open source library for providing throwaway, lightweight instances of databases, message brokers, web browsers, or just about anything that can run in a Docker container.

Again from the website, this is all it takes to spin up an ephemeral Redis instance:

RedisContainer redisContainer = new RedisBuilder().Build(); await redisContainer.StartAsync();

Even better, it is cross-platform!

Using Testcontainers for unit tests

It seems the primary use case for Testcontainers is for integration tests, such as cases where you’re testing code that has a dependency on an external service like Redis or a database. Here’s a sample copy/pasted from the Testcontainer website:

using Testcontainers.PostgreSql;

namespace Customers.Tests;

public sealed class CustomerServiceTest : IAsyncLifetime
{
    private readonly PostgreSqlContainer _postgres = new PostgreSqlBuilder()
        .WithImage("postgres:15-alpine")
        .Build();

    public Task InitializeAsync()
    {
        return _postgres.StartAsync();
    }

    public Task DisposeAsync()
    {
        return _postgres.DisposeAsync().AsTask();
    }

    [Fact]
    public void ShouldReturnTwoCustomers()
    {
        // Given
        var customerService = new CustomerService(new DbConnectionProvider(_postgres.GetConnectionString()));

        // When
        customerService.Create(new Customer(1, "George"));
        customerService.Create(new Customer(2, "John"));
        var customers = customerService.GetCustomers();

        // Then
        Assert.Equal(2, customers.Count());
    }
}

Some things to note… the PostgreSqlBuilder will be creating a Postgres container for us using the image postgres:15-alpine. The Build method is the step that builds the Docker image for us. Then the StartAsync call starts up the container, with the DisposeAsync stopping it and cleaning up the image resources. Outside of that, this class also provides a GetConnectionString method to get the connection details to the newly running Postgres container.

Sounds pretty easy. And it is! Some caveats that might catch you, though:

  • In the sample code, the CustomerService has the database creation code as well as the seeding code. It is a best practice to have database creation and seeding logic, but real life happens. I’ve frequently worked on projects where the developer machine set up begins with “restore from this database backup.”
  • Another minor thing is that the connection string isn’t database-specific… for example, with SQL Server, there isn’t the Database=MYDATABASE line… which is obvious, because the container doesn’t have the database. But this could cause some issues if you’re using EF Core.

The first minor issues brought up a question for me… could I use Testcontainers with a database backup?

Can Testcontainers be used with a database backup?

TLDR - yes. But it takes a tiny bit of work.

As far as I can tell, you have two options.

  1. Use the normal mcr.microsoft.com/mssql/server:2022 image, but then use SQL commands to restore the backup prior to running your code. The drawback is that the backup step runs every time, but it is an option.
  2. Use a Docker image that already has the backup included as part of it. Then the backup is part of the image

In either case, you’ll have to write a bit of code to support this (either C# or code in your Dockerfile).

My preferred approach at the moment is to do this as part of the Dockerfile.

Here’s a sample Dockerfile that works off of a test database I created as a proof of concept:

FROM mcr.microsoft.com/mssql/server:2022-CU14-ubuntu-22.04 AS setup

# See https://stackoverflow.com/a/69494857/4570

# TODO: perhaps drop this, but it does make it convenient...
ENV ACCEPT_EULA=Y
ENV MSSQL_SA_PASSWORD="Password1234!"

FROM setup AS data
COPY --chmod=0755 Backups/SampleDatabase.bak /
RUN ( /opt/mssql/bin/sqlservr & ) | grep -q "Service Broker manager has started" \
    && /opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P $MSSQL_SA_PASSWORD -C -Q 'RESTORE DATABASE SampleDatabase FROM DISK = "/SampleDatabase.bak" WITH MOVE "SampleDatabase" to "/var/opt/mssql/data/SampleDatabase.mdf", MOVE "SampleDatabase_Log" to "/var/opt/mssql/data/SampleDatabase_log.ldf", NOUNLOAD, STATS = 5' \
    && pkill sqlservr

FROM setup

# Copy the prepopulated data tree, but not the backup file
COPY --from=data /var/opt/mssql /var/opt/mssql

As I in the comments, I found the steps to both start the SQL Server daemon and then to run sqlcmd via https://stackoverflow.com/a/69494857/4570.

The first stage specifies the default Testcontainer SQL image (i.e. mcr.microsoft.com/mssql/server:2022-CU14-ubuntu-22.04, which is the current default). The final stage copies the /var/opt/mssql directory over as that is where the data directory is for SQL Server in the Ubuntu image.

It assumes that the database backup is under the Backups/SampleDatabase.bak location. It also has the chmod=0755 parameter so that the backup file is accessible.

The RESTORE DATABASE command has to have the correct names in place so this will vary depending on your database. If you’re not sure, you can run RESTORE FILELISTONLY FROM DISK = '/path/to/your/database.bak to get the output that will be needed.

Some things that aren’t ideal:

  • Including the MSSQL_SA_PASSWORD… it is better to pass these in via environment variables instead of hard-coding the environment variables
  • The script doesn’t handle failure well yet. For example, if the RESTORE DATABASE command is wrong because the MOVE commands have the wrong names, it will currently fail transparently.

The command to build it is: docker buildx build -t testcontainer-withbackup . This tags it with testcontainer-withbackup which is then used with the WithImage("testcontainer-withbackup") in the .NET code.

Finally, here’s the C# code that shows it working:

using Microsoft.Data.SqlClient;
using Testcontainers.MsSql;
using Xunit;

namespace TestContainerSample.Tests;

public class MsSqlTestContainerTests : IAsyncLifetime
{
    private const string Password = "Password1234!";

    private readonly MsSqlContainer _sqlContainer = new MsSqlBuilder()
        // specifying the custom image here
        .WithImage("testcontainer-withbackup")
        .WithPassword(Password)
        .Build();

    public async Task InitializeAsync()
    {
        await _sqlContainer.StartAsync();
    }

    public Task DisposeAsync() => _sqlContainer.DisposeAsync().AsTask();

    [Fact]
    public async Task TestDatabaseConnection()
    {
        var connectionString = _sqlContainer.GetConnectionString();

        await using var connection = new SqlConnection(connectionString);
        await connection.OpenAsync();

		// NOTE: the `Use SampleDatabase` here...
        await using var command = new SqlCommand("USE SampleDatabase; SELECT COUNT(*) FROM dbo.Stuff", connection);
        var result = await command.ExecuteScalarAsync();

        Assert.NotNull(result);
        Assert.IsType<int>(result);
    }
}

One last thing to note… the connection string doesn’t include the database name. Note above how I’ve got USE SampleDatabase; prior to the SELECT statement. If you’re using EF Core, you might want to do something like this instead:

_sqlContainer.GetConnectionString().Replace("master", "your-databasename-here");

I’ve got this all up in a GitHub repo at https://github.com/drmohundro/spike-mssql-testcontainer-from-backup/ - feel free to check it out. Hope this helps someone!