What I learned when search failed

TL;DR

In this post I describe what I learned when an application's search feature failed.

My current testing mission is to write an executable specification in Cucumber for all features of my client's web application.

By executable specification I mean the following Ruby gems:


While I was running the Cucumber features for the application's search feature, one of the automated tests failed. After replicating the issue by hand, I realized that the problem was in the application itself, not in my automated check code.

Search had simply stopped working in the integration and production environments. The user interface displayed no error, but the search request returned an HTTP 500, which I discovered using the Network tab in Chrome Developer Tools. I started to investigate how the search feature was implemented. Reading the codebase in the GitHub repository, I realized that the application connects to Elasticsearch (a search engine that stores JSON documents) in order to return the requested data.
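A check like the one done by hand in the Network tab could also be scripted, so that a silent HTTP 500 shows up even when the user interface displays nothing. The following is only a minimal Ruby sketch, not code from the project: the helper names `server_error?` and `http_status` and the example URL are assumptions made for illustration.

```ruby
require "net/http"
require "uri"

# Returns true when the HTTP status code signals a server-side failure (5xx).
# Accepts an Integer or a String, because Net::HTTP reports codes as strings.
def server_error?(status_code)
  (500..599).cover?(status_code.to_i)
end

# Sends a GET request to the given URL and returns the numeric status code.
# The URL comes from the caller; nothing here is specific to the real application.
def http_status(url)
  response = Net::HTTP.get_response(URI.parse(url))
  response.code.to_i
end

# Hypothetical usage inside a Cucumber step definition
# (the step text and the URL are placeholders):
#
#   Then("the search request succeeds") do
#     status = http_status("https://example.com/search?q=test")
#     raise "Search returned HTTP #{status}" if server_error?(status)
#   end
```

Keeping the status classification in its own small method means it can be exercised without any network access, while the request itself stays a thin wrapper around `Net::HTTP.get_response`.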

The application runs on Heroku infrastructure, so I first looked at the Heroku application log:

`heroku logs -t -a app_name`

The log confirmed that the search error was related to Elasticsearch availability.
The application infrastructure also uses Bugsnag. It is very useful, because every application exception is logged in a web application together with the values of all important variables at the time of the exception. Elasticsearch was provided as a Heroku add-on. The Elasticsearch provider has a ticket reporting tool with a very useful feature: when I started entering the ticket summary by pasting the exception text, the tool found that this type of issue had been reported in the past. I also learned that, at that time, the Elasticsearch infrastructure was unavailable because of provider maintenance work.

To conclude: the application I am testing does not have an extensive architecture document, but in an agile environment this is not needed, because the application runs on a cloud platform provider, Heroku. With a few clicks in the Heroku admin interface, I got all the information I needed about the application's search infrastructure. By testing the search feature that failed, I learned about the application's architecture and infrastructure. I also identified an important application risk: the Elasticsearch provider may do maintenance work that lasts for some time and is not announced in advance!

