Stashboard Clone

Notice: This article is maintained for historical purposes. The Bixly Automate/Nebri OS platform that Bixly built, and that this article features, is no longer available. Check out Serverless Providers as an alternative.

What is Stashboard?

For those of you who don’t know, Stashboard is an open source status page that runs on Google App Engine. It can be customized to display the status of any API or SaaS service. A page like this is useful for seeing which pre-defined services (APIs or SaaS) are down, and why or when they have experienced problems. You can check out a live demo here.

Why should we make a clone?

We can see that Stashboard has the potential to be very useful. So why should we worry about making a clone that works with Nebri? The first reason is to show how much faster this is with Nebri! Also, Stashboard has two shortcomings in its current state. While it’s nice to be able to see if a service is down, if we want any action taken in response to a status update, we must manually check the statuses and take that action ourselves. With Nebri, we can set up events that trigger whenever we receive a status indicating that one of our services is down, taking the manual checks out of the equation. We can also go the other way and send notifications when a previously down service has been restarted and is running again.

The other shortcoming is the way statuses are updated: according to Stashboard’s documentation and source code, statuses can only be updated by incoming requests, meaning you have to make an API request to Stashboard to update any status. Since Stashboard is open source, we could clone it and add code to handle these issues. That being said, the code needed to extend Stashboard would be much more extensive than creating a clone in Nebri.

Writing a Stashboard clone

OK! Let’s get started, shall we? If you would like to follow along with the code base that I wrote, check out the GitHub repo! Let’s start by cloning Stashboard’s functionality straight into a Nebri instance, then we’ll talk about the improvements I mentioned earlier. So, let’s break down our problem and list what we need to make this a success.

  • Models to store Services and Statuses (I’ll be using nebrios-models)
  • Some basic util functionality for retrieving and setting information
  • Protected API endpoints for updating statuses, creating new services, and retrieving information
  • Cards for updating statuses, creating new services, and displaying information
    • This one isn’t required if you only want to use this via our API, but I’ll show how to do it anyway. 😀

Well, let’s start at the top of the list.
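As a rough illustration, here is a minimal sketch of what the two models could look like. It uses plain Python dataclasses as a stand-in for nebrios-models, so the class layout, the created timestamp, and the to_dict helpers are my assumptions here; only the status_string and running fields come from the discussion that follows.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ServiceStatus:
    status_string: str  # free-form, human-readable status
    running: bool       # unambiguous up/down flag
    # Hypothetical field: ISO 8601 timestamp of when the status was recorded.
    created: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_dict(self):
        """Return a JSON-serializable view of this status."""
        return {
            "status_string": self.status_string,
            "running": self.running,
            "created": self.created,
        }


@dataclass
class Service:
    name: str
    statuses: list = field(default_factory=list)  # oldest first

    def to_dict(self):
        """Return a JSON-serializable view of the service and its statuses."""
        return {
            "name": self.name,
            "statuses": [status.to_dict() for status in self.statuses],
        }


service = Service("billing")
service.statuses.append(ServiceStatus("Database unreachable", running=False))
print(json.dumps(service.to_dict(), indent=2))
```

Storing the timestamp as a string up front keeps to_dict trivially serializable, at the cost of having to parse it again for any date arithmetic.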

Notice that ServiceStatus has both status_string and running. status_string can be set to anything you like. Personally, I would set it to human-readable statuses that make sense at a glance but can include more information if needed. I also included a boolean, running, so there is less ambiguity when testing whether a service is actually up, instead of guessing from status_string. Now that we have models created, let’s look at the basic functionality that we’ll need for getting and setting attributes. The main thing to remember here is to make everything JSON serializable, which is why I added helper methods to the models. This will bite you in the butt otherwise. So, let’s take a look at the code for retrieving information.
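A minimal sketch of a get_info-style lookup, assuming a hypothetical in-memory SERVICES dictionary in place of persisted models. The return shapes and the error message are illustrative, not taken from the repo; the one behavior that matters is that a name returns one service and no name returns all of them.

```python
import json

# Hypothetical in-memory store standing in for persisted models:
# service name -> list of status dicts, oldest first.
SERVICES = {
    "api": [{"status_string": "All good", "running": True}],
    "billing": [
        {"status_string": "All good", "running": True},
        {"status_string": "Database unreachable", "running": False},
    ],
}


def get_info(service_name=None):
    """Return one service's info, or every service's info if no name is given."""
    if service_name is not None:
        if service_name not in SERVICES:
            return {"error": "unknown service: %s" % service_name}
        return {"name": service_name, "statuses": list(SERVICES[service_name])}
    return {
        "services": [
            {"name": name, "statuses": list(statuses)}
            for name, statuses in SERVICES.items()
        ]
    }


print(json.dumps(get_info("billing"), indent=2))
print(json.dumps(get_info(), indent=2))
```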

This may look a little daunting at first, but I promise it’s not as crazy as it seems. When I wrote get_info, I assumed that it would be used for multiple request types. So, we can send a service name and get all info about that service, or not send a service name, and get info about all services. Now let’s take a look at setting up new services.
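Here is a minimal sketch of a service-creation helper that accepts input in two shapes using nested try statements. The function name, the in-memory SERVICES store, and the error messages are hypothetical; SimpleNamespace simply stands in for whatever object a card submission hands over.

```python
from types import SimpleNamespace

SERVICES = {}  # hypothetical in-memory store: name -> list of statuses


def create_service(data):
    """Create a service from dict-style or attribute-style input."""
    try:
        # External API requests: the payload arrives as a dictionary.
        name = data["name"]
    except (TypeError, KeyError):
        try:
            # Card submissions: the payload arrives as an object with attributes.
            name = data.name
        except AttributeError:
            return {"error": "a service name is required"}
    if name in SERVICES:
        return {"error": "service already exists: %s" % name}
    SERVICES[name] = []
    return {"name": name, "statuses": []}


print(create_service({"name": "api"}))
print(create_service(SimpleNamespace(name="billing")))
```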

This function may be a little confusing due to the nested try statements. I built it this way because it will be used by both external requests and requests made from Nebri cards. We try to extract data from the request in two different ways so we can handle data in either dictionary format or model format. Updating statuses works in much the same way.

Now we have our base utils set up and ready to go. Let’s look at API endpoints next. NOTE: if you aren’t using cards, many of these API endpoints are not needed. Instead of pasting them in one big chunk, let’s break them apart like our util file. First, let’s look at endpoints for getting info. For information on setting up authentication in your Nebri instance, see this GitHub repo.

So, now we have endpoints for getting info about a specific service or all services. Seems pretty basic, right? One thing to note is that for external requests we check both request.POST and request.BODY for our data. The other is that in our form endpoints (which will be used solely by Nebri cards), we check that the request is authenticated. If the request originated in Nebri, it will be authenticated. Otherwise, it may be an external app trying to hit an endpoint that it shouldn’t.
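The two checks described above can be sketched in plain Python. The request object here is a SimpleNamespace stand-in, and the is_authenticated attribute, the function names, and the assumption that BODY carries JSON text are all mine; only the idea of looking in both request.POST and request.BODY comes from the endpoints themselves.

```python
import json
from types import SimpleNamespace


def extract_payload(request):
    """Return request data from POST if present, else from a JSON BODY."""
    post = getattr(request, "POST", None)
    if post:
        return dict(post)
    body = getattr(request, "BODY", None)
    if body:
        # Assumption: a raw body is JSON text; a parsed body is dict-like.
        return json.loads(body) if isinstance(body, (str, bytes)) else dict(body)
    return {}


def get_info_form(request):
    """Card-only endpoint: refuse requests that are not authenticated."""
    if not getattr(request, "is_authenticated", False):
        return {"error": "authentication required"}
    return {"service": extract_payload(request).get("service")}


external = SimpleNamespace(POST={}, BODY='{"service": "api"}', is_authenticated=False)
card = SimpleNamespace(POST={"service": "api"}, BODY=None, is_authenticated=True)
print(extract_payload(external))
print(get_info_form(external))
print(get_info_form(card))
```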

Our endpoints for creating services look very similar to the endpoints for getting information, don’t they? Guess what? So do updating statuses.

You may be asking at this point: why didn’t we put all the functionality in an API endpoint instead of a utils file? Reusability! In Nebri, it’s not considered kosher to import API functions in a rule script. So, we put functionality that will be used by both in a library utils file.

So, now we have all of our base functionality that essentially clones Stashboard’s current functionality. If you aren’t using cards, you can skip to the next section. 🙂 In my example app, I’ve created three cards to assist with displaying information, creating services, and updating statuses.

The above card script is relatively primitive in that it only lists services and their current status. This can be expanded to display all statuses, the most recent four statuses, or really anything you would like. If you scroll back up to get_info, you can see that the function returns all defined services with a list of all statuses. It’s up to you to decide how to display the information you get.

This is the most basic card in this example. Filling out the form and submitting will create a new service for you. From there, you are welcome to update the status of said service via this next card.

So that’s it! That’s all you need to duplicate Stashboard’s current functionality inside your Nebri instance. To compare code, see Stashboard’s GitHub repo. In total, Stashboard uses 18,029 lines of Python code, while our version uses a measly 618 lines including utilized libraries. If we take out libraries and look at just functionality code, Stashboard clocks in at 1,656 lines of code, while we’re currently at 246. For all line counts, I use CLOC. To set up and use the same script as I am, see the top answer in this Stack Overflow question. Now, let’s take a look at our proposed improvements.


Adding event handling

So, we have our basic functionality all done. All rule script code in this section can be found in the examples directory of this repo. Let’s say that any time a service goes down, we want to receive an email stating which service is down and the status. First, I added a field alerted to our ServiceStatus model. This will help with debouncing and will keep us from getting duplicate alerts.

Now let’s create a rule script that listens to running. If running is False and alerted is False, we should alert the user that this service is down.
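A minimal sketch of that rule in plain Python. It is not a real Nebri rule script: the class is a stand-in that mimics a listen/check/action shape, the status is an ordinary dict, and send_email is a hypothetical hook passed in by the caller.

```python
class ServiceDownAlert:
    """Plain-Python stand-in for a rule script that listens to `running`."""

    listens_to = ["running"]

    def __init__(self, status, send_email):
        self.status = status          # dict standing in for a ServiceStatus
        self.send_email = send_email  # hypothetical mail hook, injected

    def check(self):
        # Only fire for the right model, when down, and not yet alerted.
        return (
            self.status.get("kind") == "servicestatus"
            and self.status.get("running") is False
            and self.status.get("alerted") is False
        )

    def action(self):
        self.send_email(
            subject="%s is down" % self.status["service"],
            body="Status: %s" % self.status["status_string"],
        )
        self.status["alerted"] = True  # debounce: never alert twice


def run_rule(rule):
    """Run action() only if check() passes; report whether it fired."""
    if rule.check():
        rule.action()
        return True
    return False


outbox = []
status = {
    "kind": "servicestatus",
    "service": "billing",
    "status_string": "Database unreachable",
    "running": False,
    "alerted": False,
}
rule = ServiceDownAlert(status, lambda subject, body: outbox.append((subject, body)))
print(run_rule(rule), run_rule(rule), len(outbox))  # True False 1
```

The second run_rule call does nothing because action() flipped alerted to True, which is the debouncing behavior the alerted field exists for.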

It’s that simple. This script will run for each service status that is created. Notice I’m checking self.kind == 'servicestatus'. This is a fail-safe to apply this code only to the correct model. If another model with a field called running is ever created and we didn’t include the kind check, those instances would also have this action taken on them. This example is pretty basic, but multiple rule scripts can be added for more complex behavior.

This script will only trigger if the associated service name is Amazon AWS. Let’s look at one more example for when a service comes back online.
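One possible way to detect a recovery, sketched in plain Python: compare the two most recent statuses and treat a down-to-up transition as "back online". The helper name and the list-of-dicts history are hypothetical, and this is only one way such a check could be written.

```python
def came_back_online(statuses):
    """True if the newest status is running and the one before it was not."""
    if len(statuses) < 2:
        return False  # no earlier status to recover from
    previous, latest = statuses[-2], statuses[-1]
    return bool(latest["running"]) and not bool(previous["running"])


history = [
    {"status_string": "All good", "running": True},
    {"status_string": "Database unreachable", "running": False},
    {"status_string": "Restarted", "running": True},
]
print(came_back_online(history))       # True
print(came_back_online(history[:2]))   # False
```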

That’s all there is to event handling. It’s as simple (or complex) as you want to make it! If you’re paying attention to lines of code, we’re at 653 including example scripts at this point.

Adding external monitoring

Here’s the fun part. While Stashboard only handles incoming status updates, we’re going to add a rule script that checks external APIs or SaaS services to see whether they are down. We’ll be using requests for this functionality. To install requests on your Nebri instance, you must SSH in and pip install it. I chose to do this through PyCharm. For a detailed tutorial on using PyCharm with Nebri, see Fernando’s explanation. See Google App Engine documentation for more information on adding libraries to App Engine applications like Stashboard.

So, let’s add a function to the Service model to check whether the service name is a URL. We’ll also add a new field, do_monitor, which triggers rule scripts to wake up.
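A minimal sketch of such a check, again on a plain dataclass rather than a nebrios-models class. Treating only http and https names with a host as URLs is my assumption about what counts as monitorable.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Service:
    name: str
    do_monitor: bool = False  # flipped to True to wake up the monitoring rule

    def is_url(self):
        """Return True if the name parses as an http(s) URL with a host."""
        parsed = urlparse(self.name)
        return parsed.scheme in ("http", "https") and bool(parsed.netloc)


for name in ("https://example.com/health", "Amazon AWS", "example.com"):
    print(name, Service(name).is_url())
```

Note that a bare host such as example.com is not treated as a URL here, since it has no scheme to tell us how to reach it.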

Now that we have the ability to determine whether or not our service is a URL, let’s set up a rule script to check if the server is up.
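Here is a minimal sketch of the availability check itself. To keep it runnable without extra installs, it swaps in the standard library’s urllib.request instead of requests; urlopen follows redirects by default, which plays the role of allowing redirects. The function name, the returned dict shape, and the opener parameter (there so the check can be exercised without a network) are assumptions.

```python
import urllib.error
import urllib.request


def check_service(url, timeout=10, opener=urllib.request.urlopen):
    """Return a status dict for url; a 2xx after redirects means running."""
    try:
        # urlopen follows redirects on its own before returning a response.
        with opener(url, timeout=timeout) as response:
            code = response.status
    except urllib.error.HTTPError as exc:
        code = exc.code  # urllib raises 4xx/5xx responses as HTTPError
    except (urllib.error.URLError, OSError) as exc:
        # DNS failures, refused connections, timeouts and similar.
        return {"running": False, "status_string": "Request failed: %s" % exc}
    return {"running": 200 <= code < 300, "status_string": "HTTP %d" % code}


if __name__ == "__main__":
    # Performs a real network request when run as a script.
    print(check_service("https://example.com"))
```

The resulting dict maps directly onto the running and status_string fields, so it could be stored as a new status for the service being checked.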

Notice we’re allowing redirects in these requests. If a service has moved and redirects, we need to follow to see if the actual service is available. Instead of handling it ourselves, let’s just let requests handle it. So, we’re looking for a status code in the 2xx range for a request to be successful and to indicate that the given service is available.

Alright, we have our rule script set up and listening to do_monitor. Now we need to set up a drip to actually set do_monitor so this thing will run. First, let’s set up another rule script that will be triggered by our drip.

Now let’s set up a drip to trigger this script. Drips can be created by clicking ‘Advanced’ in the left hand sidebar, then selecting ‘Drips’.
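Assuming the drip takes a standard five-field cron expression, a 15-minute interval would typically be written as */15 * * * *. That exact expression is my assumption, not copied from the drip. The small helper below is a hypothetical illustration of how the minute field expands, not part of Nebri.

```python
def expand_minute_field(field):
    """Expand a cron minute field of the form '*', '*/N', or 'M' to minutes."""
    if field == "*":
        return list(range(60))
    if field.startswith("*/"):
        step = int(field[2:])
        return list(range(0, 60, step))
    return [int(field)]


DRIP_SCHEDULE = "*/15 * * * *"  # assumed expression for "every 15 minutes"
minute_field = DRIP_SCHEDULE.split()[0]
print(expand_minute_field(minute_field))  # [0, 15, 30, 45]
```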

So, from our cron syntax, we can see that this drip will run every 15 minutes. The interval that this runs at is completely up to you. Every time this drip is triggered, it sets our stashboard_setup_monitoring KVP, which triggers the rule script that updates do_monitor on all of our services that are URLs. That’s it!

After all this, our Nebri instance not only keeps track of our service statuses, but also alerts us on any condition we can think of to watch, and checks all services that are URLs on a regular basis. All that in 684 lines of code. If you add requests’ code base in, we’re at 12,717 lines of code.

Now what? Well, if you are using cards, this is done. Your functionality is ready to go and use. If you want to use this from an external app, there are Nebri clients created by yours truly available for use.