Skip to main content

Tests

The Flow web application automatically performs basic tests to validate the configurations of captures and materializations. As your Data Flows grow in breadth and scope, and as requirements change or new contributors get involved, more robust tests are invaluable for ensuring the correctness of your data products.

You can use Flow tests to verify the end-to-end behavior of any modified schemas or derivations included in your Data Flow. At their most basic, you feed example documents into a collection, and then verify that documents coming out of a derived collection meet your test's expectation:

tests:
acmeCo/tests/greetings:
- ingest:
description: Add people to greet.
collection: acmeCo/people
documents:
- { userId: 1, name: "Zelda" }
- { userId: 2, name: "Link" }

- verify:
description: Ensure people were greeted.
collection: acmeCo/greetings
documents:
- { userId: 1, greeting: "Hello Zelda" }
- { userId: 2, greeting: "Hello Link" }

A test is a sequence of one or more steps, each of either an ingest or a verify type.

  • ingest steps add one or more documents to a collection.
  • verify steps make assertions about the current contents of a collection.

All steps must complete successfully in order for a test to pass.

Ingest

ingest steps add documents to a named collection. All documents must validate against the collection's schema, or a catalog build error will be reported.

All documents from a single ingest step are added in one transaction. This means that multiple documents with a common key will be combined prior to their being appended to the collection. Suppose acmeCo/people had key [/id]:

tests:
acmeCo/tests/greetings:
- ingest:
description: Zeldas are combined to one added document.
collection: acmeCo/people
documents:
- { userId: 1, name: "Zelda One" }
- { userId: 1, name: "Zelda Two" }

- verify:
description: Only one Zelda is greeted.
collection: acmeCo/greetings
documents:
- { userId: 1, greeting: "Hello Zelda Two" }

Verify

verify steps assert that the current contents of a collection match the provided document fixtures. Verified documents are fully reduced, with one document for each unique key, ordered under the key's natural order.

You can verify the contents of both derivations and captured collections. Documents given in verify steps do not need to be comprehensive. It is not an error if the actual document has additional locations not present in the document to verify, so long as all matched document locations are equal. Verified documents also do not need to validate against the collection's schema. They do, however, need to include all fields that are part of the collection's key.

tests:
acmeCo/tests/greetings:
- ingest:
collection: acmeCo/people
documents:
- { userId: 1, name: "Zelda" }
- { userId: 2, name: "Link" }
- ingest:
collection: acmeCo/people
documents:
- { userId: 1, name: "Zelda Again" }
- { userId: 3, name: "Pikachu" }

- verify:
collection: acmeCo/greetings
documents:
# greetings are keyed on /userId, and the second greeting is kept.
- { userId: 1, greeting: "Hello Zelda Again" }
# `greeting` is "Hello Link", but is not asserted here.
- { userId: 2 }
- { userId: 3, greeting: "Hello Pikachu" }

Partition selectors

Verify steps may include a partition selector to verify only documents of a specific partition:

tests:
acmeCo/tests/greetings:
- verify:
collection: acmeCo/greetings
description: Verify only documents which greet Nintendo characters.
documents:
- { userId: 1, greeting: "Hello Zelda" }
- { userId: 3, greeting: "Hello Pikachu" }
partitions:
include:
platform: [Nintendo]

Learn more about partition selectors.

Tips

The following tips can aid in testing large or complex derivations.

Testing reductions

Reduction annotations are expressive and powerful, and their use should thus be tested thoroughly. An easy way to test reduction annotations on captured collections is to write a two-step test that ingests multiple documents with the same key and then verifies the result. For example, the following test might be used to verify the behavior of a simple sum reduction:

tests:
acmeCo/tests/sum-reductions:
- ingest:
description: Ingest documents to be summed.
collection: acmeCo/collection
documents:
- {id: 1, value: 5}
- {id: 1, value: 4}
- {id: 1, value: -3}
- verify:
description: Verify value was correctly summed.
collection: acmeCo/collection
documents:
- {id: 1, value: 6}

Reusing common fixtures

When you write a lot of tests, it can be tedious to repeat documents that are used multiple times. YAML supports anchors and references, which you can implement to re-use common documents throughout your tests. One nice pattern is to define anchors for common ingest steps in the first test, which can be re-used by subsequent tests. For example:

tests:
acmeCo/tests/one:
- ingest: &mySetup
collection: acmeCo/collection
documents:
- {id: 1, ...}
- {id: 2, ...}
...
- verify: ...

acmeCo/tests/two:
- ingest: *mySetup
- verify: ...

This allows all the subsequent tests to re-use the documents from the first ingest step without having to duplicate them.