Guide to writing solid tests

Peter Johnson a.k.a. insertcoffee edited this page Jul 2, 2015 · 36 revisions

Test one thing at a time

It's tempting to make a 'super' test which covers many different features.

Unfortunately, this leads to brittle tests: the case can fail for many unrelated reasons, which makes debugging and subsequent refactoring a pain.

Ask yourself "what am I testing here?" and focus on asserting that single behaviour. Try to avoid testing the rest of the system as much as you possibly can.

Creating separate tests for other features lets you file them in the correct directory, and makes it easy to group and run similar cases.

Focus on the quality, succinctness and strength of the test, not the number of assertions it makes. Tests should be flexible enough to keep passing as unrelated changes are made to the system, so that we avoid repetitive refactoring just to keep them passing.

The 'bad' case below illustrates the problem: it is very strict, in that the query must return exactly these 2 records in the top 2 results. If any of the listed properties changes, the test will fail.

We can make it even stricter while also making it less brittle. I am assuming that we are testing that "Yankee Stadium in the Bronx should be the top result for the query 'yankee stadium'", so the 'good' case asserts only the text and requires it to be the first result.

// bad
"in": {
  "input": "yankee stadium"
},
"expected": {
  "properties": [
    {
      "layer": "osmway",
      "name": "Yankee Stadium",
      "alpha3": "USA",
      "admin0": "United States",
      "admin1": "New York",
      "admin1_abbr": "NY",
      "admin2": "Bronx County",
      "local_admin": "Bronx",
      "locality": "New York",
      "neighborhood": "West Concourse",
      "text": "Yankee Stadium, Bronx, NY"
    },
    {
      "layer": "geoname",
      "name": "Yankee Stadium",
      "alpha3": "USA",
      "admin0": "United States",
      "admin1": "New York",
      "admin2": "Bronx",
      "text": "Yankee Stadium, Bronx, NY",
      "admin1_abbr": "NY"
    }
  ],
  "priorityThresh": 2
}

// good
"in": {
  "input": "yankee stadium"
},
"expected": {
  "properties": [ { "text": "Yankee Stadium, Bronx, NY" } ],
  "priorityThresh": 1
}

Avoid ambiguous place names

One of our current pain points is the term 'New York', which refers to a city and a state as well as many other regions and venues. (It also contains the term 'york', which makes things even more confusing.)

At some point in the future we will define some truths about which records are more 'important' and should rank higher. In the meantime, we should avoid assuming that the term 'London' refers to 'London, UK' rather than 'London, Ontario', 'London, Tulare County, CA', and so on.

// bad
"in": {
  "input": "springfield"
},
"expected": {
  "properties": [ { "text": "Springfield, Windsor County, VT" } ],
  "priorityThresh": 1
}

// bad
"in": {
  "input": "new york, ny"
},
"expected": {
  "properties": [ { "text": "New York City, Manhattan, NY" } ],
  "priorityThresh": 5
}

// better
"in": {
  "input": "new york city, ny"
},
"expected": {
  "properties": [ { "text": "New York City, Manhattan, NY" } ],
  "priorityThresh": 2
}

// best
"endpoint": "suggest/coarse",
"in": {
  "lat": 9.91,
  "lon": 78.10,
  "input": "Singarpuram, Madurai, Tamil Nadu"
},
"expected": {
  "properties": [ {
    "text": "Singarpuram, Madurai, Tamil Nadu",
    "layer": "neighborhood"
  } ],
  "priorityThresh": 2
}

Don't assume all users are in the same geography as you

Following on from the above, some terms are very common and there is no 'correct' answer without a geographic bias.

// literally 'main station' (rail)
"in": {
  "input": "hauptbahnhof"
},
"expected": {
  "properties": [ { "text": "Hauptbahnhof, Bremen" } ],
  "priorityThresh": Infinity
}
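
A case like this only becomes meaningful once the query carries a location. As a minimal sketch, the same query could be biased with the `lat` and `lon` input properties used elsewhere in this guide. The coordinates are an approximate location for Bremen's main station, and the `description` text and the `priorityThresh` of 1 are assumptions for illustration, not values taken from our repo.

```json
{
  "description": [
    "Hypothetical case: lat/lon is roughly Bremen Hauptbahnhof, chosen for illustration.",
    "With the focus point set, the local main station is expected to rank first."
  ],
  "in": {
    "input": "hauptbahnhof",
    "lat": 53.083,
    "lon": 8.814
  },
  "expected": {
    "properties": [ { "text": "Hauptbahnhof, Bremen" } ],
    "priorityThresh": 1
  }
}
```

A user querying from another city would need a separate case with its own focus point and expected text.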

It's also worth noting cultural differences, such as whether an address is written as 'number, street' or 'street, number'.


Do not copy->paste the actual to the expected!

It shouldn't have to be mentioned, but I've seen it a fair few times. The tests are a source of truth that the system is working correctly, so each case should be carefully checked before being committed.

One example is a test case I found in our repo for railway stations near the Mapzen NYC office. The top result is the centroid of the BMT Broadway Line, which means we are telling users there is a train station where there isn't one.

Please fact-check that the test case is logically correct before committing it. This can be as simple as looking it up in OpenStreetMap and dropping the GeoJSON into geojson.io for a quick sanity check.
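
For instance, the query point of a test case can be wrapped in a small GeoJSON document and pasted into geojson.io. This is a sketch using the lat/lon of the railway station case shown below; the `name` property is only a label added for illustration. Note that GeoJSON orders coordinates as longitude first, then latitude.

```json
{
  "type": "Feature",
  "properties": {
    "name": "query point of the railway station test case"
  },
  "geometry": {
    "type": "Point",
    "coordinates": [-73.990342, 40.744243]
  }
}
```

Adding the coordinates of the returned record as a second feature makes it easy to see whether the result actually sits on a station.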

// bad
"in": {
  "lat": 40.744243,
  "lon": -73.990342,
  "categories": "transport,transport:station"
},
"expected": {
  "properties": [
    {
      "name": "BMT Broadway Line",
      "alpha3": "USA",
      "admin0": "United States",
      "admin1": "New York",
      "admin1_abbr": "NY",
      "admin2": "New York County",
      "local_admin": "Manhattan",
      "locality": "New York",
      "neighborhood": "Flatiron District",
      "text": "BMT Broadway Line, Manhattan, NY",
      "category": [
        "transport",
        "transport:rail"
      ]
    }
  ]
}
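
One possible way to repair a case like this is to turn the assertion around. The sketch below reuses the `unexpected` block that appears in a later example in this guide to state that the rail line must not be returned for a station query. It assumes the runner accepts a case that has only an `unexpected` block, and the `description` wording is illustrative.

```json
{
  "description": [
    "Hypothetical rewrite: a rail line centroid is not a station,",
    "so it should not be returned for a station category query."
  ],
  "in": {
    "lat": 40.744243,
    "lon": -73.990342,
    "categories": "transport,transport:station"
  },
  "unexpected": {
    "properties": [ { "text": "BMT Broadway Line, Manhattan, NY" } ]
  }
}
```

Asserting on `text` alone also keeps the case independent of the `category` array.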

Categories

A special note on categories: they are likely to change, and the order of elements in the array is not guaranteed. It's best to avoid testing categories unless you have to.


Feedback App

A special note on the feedback app: its test cases should be carefully checked before merging. There are a bunch of test cases which don't really make much sense, except to ensure the system behaves the same as it did on the day of the test.

// what magic is powering this?
"in": {
  "input": "65c dana st"
},
"expected": {
  "properties": [
    {
      "layer": "osmnode",
      "name": "Cambridge St @ Dana St",
      "alpha3": "USA",
      "admin0": "United States",
      "admin1": "Massachusetts",
      "admin1_abbr": "MA",
      "admin2": "Middlesex County",
      "local_admin": "Cambridge",
      "locality": "Cambridge",
      "neighborhood": "Mid Cambridge",
      "text": "Cambridge St @ Dana St, Cambridge, MA"
    }
  ],
  "priorityThresh": 1
}

There are many, many more.


Attach comments and metadata to your tests

You can add arbitrary properties to your test and they will be ignored by the parser. Use them to help others understand your tests, or to remind your future self what you were doing.

// bad
{
  "id": "10",
  "status": "pass",
  "endpoint": "reverse",
  "in": {
    "lat": 51.47945855891035,
    "lon": -3.2018280029296875
  },
  "expected": {
    "properties": [{
      "admin1": "Cardiff"
    }]
  },
  "unexpected": {
    "properties": [{
      "admin1": "Caerdydd#Cardiff"
    }]
  }
}

// good
{
  "name": "address type",
  "priorityThresh": 3,
  "description": [
    "These tests ensure that address records are stored in a layer which is distinct from the POI layers",
    "see: https://github.com/pelias/openstreetmap/issues/29",
    "see: https://github.com/pelias/pelias/issues/60"
  ],
  "tests": [
    {
      "id": "1",
      "user": "missinglink",
      "endpoint": "search",
      "description": [
        "Ensure address records are stored in the 'osmaddress' layer."
      ],
      "in": {
        "input": "102 Fleet Street",
        "lat": 51.53177,
        "lon": -0.06672
      },
      "expected": {
        "properties": [{
          "id": "address-osmnode-1401849738",
          "layer": "osmaddress"
        }]
      }
    },

This is also super important if you want to mark a test as status: fail, for example by linking the relevant issue:

{
  "id": 5,
  "status": "fail",
  "issue": "https://github.com/pelias/api/issues/149",
  "type": "dev",
  "user": "Randy",
  "in": {
    "input": "new york, new york"
  },
  "expected": {
    "properties": [
      "New York City, New York"
    ]
  }
}