PostgreSQL Parallel Integration Tests in a Go Application



Integration tests are one of the levels of the testing pyramid. They usually take more time, because nothing in them is replaced with mocks of the real components. To reduce the time such tests take, we can run them in parallel. Here I will talk specifically about such tests for PostgreSQL.



Ideally, each test should be independent, so tests do not affect each other. In other words, each test function has its own state. This is a good sign that parallel tests can be used. To give each test function its own data set, I created a function that, when a test starts, creates a temporary schema, loads data into it, and drops the schema after the test completes. Each created schema has a unique suffix in its name to prevent name conflicts.





Helper functions



Let's start with helper functions for reporting errors in tests. I took Ben Johnson's testing helper functions, which saved me a few lines of code and made the error output clearer and more detailed.
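The helper package itself is not listed in this article. Below is a minimal sketch of what the testingutil package used later might look like, based on Ben Johnson's widely shared snippet (the package path and function names are assumptions taken from how they are called in the tests further down):

package testingutil

import (
	"path/filepath"
	"reflect"
	"runtime"
	"testing"
)

// Ok fails the test immediately if err is not nil, reporting the caller's file and line.
func Ok(tb testing.TB, err error) {
	if err != nil {
		_, file, line, _ := runtime.Caller(1)
		tb.Fatalf("%s:%d: unexpected error: %s", filepath.Base(file), line, err)
	}
}

// Equals fails the test immediately if exp is not deeply equal to act.
func Equals(tb testing.TB, exp, act interface{}) {
	if !reflect.DeepEqual(exp, act) {
		_, file, line, _ := runtime.Caller(1)
		tb.Fatalf("%s:%d: expected %#v, got %#v", filepath.Base(file), line, exp, act)
	}
}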



Test data



To run a database integration test, test data must be provided. The Go test tooling has good support for loading test data from files. First, go build ignores directories named "testdata". Second, when you run go test, the working directory is set to the package directory, which allows you to use a relative path to the testdata folder when loading the test data set.
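As a small, self-contained illustration of that behaviour (the test name and the fixture file example.sql are hypothetical), a test in the package can read a fixture through a plain relative path:

package datalayer

import (
	"os"
	"testing"
)

// TestReadFixture relies on "go test" setting the working directory to the
// package directory, so the relative path into testdata resolves correctly.
func TestReadFixture(t *testing.T) {
	data, err := os.ReadFile("testdata/example.sql") // hypothetical fixture file
	if err != nil {
		t.Fatalf("failed to read fixture: %s", err)
	}
	if len(data) == 0 {
		t.Fatal("fixture is empty")
	}
}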



Creating a database connection for the test



package database

import (
	"database/sql"
	"fmt" // needed for fmt.Sprintf (missing from the original listing)
	"math/rand"
	"strconv"
	"testing"
	"time"

	_ "github.com/lib/pq"
)

const (
	dbPort     = 5439
	dbUser     = "postgres"
	dbPassword = "postgres"
	dbName     = "test"
)

// CreateTestDatabase opens a connection to the test database, creates a
// uniquely named schema for the calling test and returns the connection,
// the schema name and a cleanup function that drops the schema.
func CreateTestDatabase(t *testing.T) (*sql.DB, string, func()) {
	connectionString := fmt.Sprintf("port=%d user=%s password=%s dbname=%s sslmode=disable", dbPort, dbUser, dbPassword, dbName)
	db, dbErr := sql.Open("postgres", connectionString)
	if dbErr != nil {
		t.Fatalf("Fail to create database. %s", dbErr.Error())
	}

	rand.Seed(time.Now().UnixNano())
	schemaName := "test" + strconv.FormatInt(rand.Int63(), 10)
	_, err := db.Exec("CREATE SCHEMA " + schemaName)
	if err != nil {
		t.Fatalf("Fail to create schema. %s", err.Error())
	}

	return db, schemaName, func() {
		_, err := db.Exec("DROP SCHEMA " + schemaName + " CASCADE")
		if err != nil {
			t.Fatalf("Fail to drop database. %s", err.Error())
		}
	}
}







We call CreateTestDatabase to create a connection to the test database and a new data schema for the test. The function returns the database connection, the name of the created schema, and a cleanup function that drops this schema. In a test, it is better to fail the test than to return an error to the caller. (Note: returning a cleanup function is an idea from Mitchell Hashimoto's "Advanced Testing with Go" talk.)



Loading the test data



I used ".sql" files: one sql file contains the data for one table, including creating the table and populating it with data. All sql files are stored in the "testdata" folder. Here is an example sql file.



CREATE TABLE book (
    title character varying(50),
    author character varying(50)
);

INSERT INTO book VALUES
('First Book', 'First Author'),
('Second Book', 'Second Author')
;





And here is the tricky part. Because each test function runs in its own unique schema, we cannot simply execute the queries in these sql files as they are. We must prefix the table names with the schema name so that the tables are created, and the data inserted, in the desired temporary schema. For example, CREATE TABLE book ... has to become CREATE TABLE uniqueschema.book ..., and INSERT INTO book ... has to become INSERT INTO uniqueschema.book .... I used regular expressions to modify the queries before executing them. Here is the test data loading code:



package datalayer

import (
	"bufio"
	"database/sql"
	"fmt"
	"io"
	"os"
	"regexp"
	"testing"

	"github.com/Hendra-Huang/databaseintegrationtest/testingutil" // assertion helpers (see the Helper functions section above)
)

// Regular expressions that capture the table name in the supported SQL statements.
var schemaPrefixRegexps = [...]*regexp.Regexp{
	regexp.MustCompile(`(?i)(^CREATE SEQUENCE\s)(["\w]+)(.*)`),
	regexp.MustCompile(`(?i)(^CREATE TABLE\s)(["\w]+)(\s.+)`),
	regexp.MustCompile(`(?i)(^ALTER TABLE\s)(["\w]+)(\s.+)`),
	regexp.MustCompile(`(?i)(^UPDATE\s)(["\w]+)(\s.+)`),
	regexp.MustCompile(`(?i)(^INSERT INTO\s)(["\w]+)(\s.+)`),
	regexp.MustCompile(`(?i)(^DELETE FROM\s)(["\w]+)(.*)`),
	regexp.MustCompile(`(?i)(.+\sFROM\s)(["\w]+)(.*)`),
	regexp.MustCompile(`(?i)(\sJOIN\s)(["\w]+)(.*)`),
}

// addSchemaPrefix prefixes every table name in the query with the given schema name.
func addSchemaPrefix(schemaName, query string) string {
	prefixedQuery := query
	for _, re := range schemaPrefixRegexps {
		prefixedQuery = re.ReplaceAllString(prefixedQuery, fmt.Sprintf("${1}%s.${2}${3}", schemaName))
	}

	return prefixedQuery
}

// loadTestData reads the named .sql files from the testdata folder and executes
// them in the given schema. The file is read line by line, and the accumulated
// statement is executed whenever a blank line is reached, so statements in the
// sql files must be separated by empty lines.
func loadTestData(t *testing.T, db *sql.DB, schemaName string, testDataNames ...string) {
	for _, testDataName := range testDataNames {
		file, err := os.Open(fmt.Sprintf("./testdata/%s.sql", testDataName))
		testingutil.Ok(t, err)

		reader := bufio.NewReader(file)
		var query string
		for {
			line, err := reader.ReadString('\n')
			if err == io.EOF {
				break
			}
			testingutil.Ok(t, err)

			line = line[:len(line)-1] // strip the trailing newline
			if line == "" {
				query = addSchemaPrefix(schemaName, query)
				_, err := db.Exec(query)
				testingutil.Ok(t, err)
				query = ""
			}
			query += line
		}
		file.Close()
	}
}







Test creation



Before each test starts, a test database schema with a unique name is created, and the cleanup function that drops this schema is deferred. The schema name is then inserted into the queries used by the test. The most important point in this implementation is that the database connection must be configurable, so the connection to the real database can be swapped for the connection to the test database. Add t.Parallel() at the beginning of each test function to tell the test runner to run this test in parallel.
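The data-layer code under test is not listed in this article. To make the idea of a configurable connection concrete, here is a minimal sketch of what InsertBook, GetBooks and the package-level query variables they use might look like (the exact queries and the Book type are my assumptions, not the project's actual code):

package datalayer

import (
	"context"
	"database/sql"
)

// Package-level query variables, so that tests can prefix them with the
// temporary schema name before executing.
var (
	insertBookQuery = "INSERT INTO book (title, author) VALUES ($1, $2)"
	getBooksQuery   = "SELECT title, author FROM book"
)

// Book mirrors a row of the book table.
type Book struct {
	Title  string
	Author string
}

// InsertBook inserts a book using whatever connection the caller provides.
func InsertBook(ctx context.Context, db *sql.DB, title, author string) error {
	_, err := db.ExecContext(ctx, insertBookQuery, title, author)
	return err
}

// GetBooks returns all books using whatever connection the caller provides.
func GetBooks(ctx context.Context, db *sql.DB) ([]Book, error) {
	rows, err := db.QueryContext(ctx, getBooksQuery)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var books []Book
	for rows.Next() {
		var b Book
		if err := rows.Scan(&b.Title, &b.Author); err != nil {
			return nil, err
		}
		books = append(books, b)
	}
	return books, rows.Err()
}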

Below is the full code:



// These tests run only when the "integration" build tag is provided (see build flags).
// +build integration

package datalayer

import (
	"context"
	"testing"

	"github.com/Hendra-Huang/databaseintegrationtest/database"
	"github.com/Hendra-Huang/databaseintegrationtest/testingutil"
)

func TestInsertBook(t *testing.T) {
	t.Parallel()
	db, schemaName, cleanup := database.CreateTestDatabase(t)
	defer cleanup()
	loadTestData(t, db, schemaName, "book") // loads the data from testdata/book.sql

	title := "New title"
	author := "New author"
	// Overriding package-level query variables like this is not good practice,
	// but it is intentional here to keep the focus on the integration test part.
	// The important part is that the database connection has to be configurable.
	insertBookQuery = addSchemaPrefix(schemaName, insertBookQuery) // override the query and add the schema prefix
	err := InsertBook(context.Background(), db, title, author)     // executes insertBookQuery with the provided connection
	testingutil.Ok(t, err)
}

func TestGetBooks(t *testing.T) {
	t.Parallel()
	db, schemaName, cleanup := database.CreateTestDatabase(t)
	defer cleanup()
	loadTestData(t, db, schemaName, "book")

	getBooksQuery = addSchemaPrefix(schemaName, getBooksQuery)
	books, err := GetBooks(context.Background(), db)
	testingutil.Ok(t, err)
	testingutil.Equals(t, 2, len(books))
}







Note: in TestGetBooks I expect the query to return 2 books, because that is how many rows I put into testdata/book.sql, even though there is an insert test above. If the two tests shared a schema, TestGetBooks could fail, because the table would then contain 3 rows: 2 from the test data and 1 from the insert test above. This is the advantage of a separate schema per test: the data is isolated, and therefore the tests are independent of each other.



I posted the example project on GitHub. You can clone it, run the tests and see the result.
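To run the integration tests, the "integration" build tag from the test file header has to be passed to the test tool, for example go test -tags integration ./... (the exact package pattern depends on your project layout). Since the tests run in parallel, it is also worth adding the -race flag to catch data races, and the -parallel flag can be used to tune how many tests run at the same time.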



Conclusion



For my project, this approach reduced test time by 40–50% compared to running the tests sequentially. Another advantage of running tests in parallel is that it can reveal errors that only occur when the application handles several concurrent operations.



Happy testing.



- Picture from medium.com/kongkow-it-medan/parallel-database-integration-test-on-go-application-8706b150ee2e


