Services HOT Session: Quality Gates

Welcome to the second APAC Services Hands-on Training session. This session will focus on quality gates.

Please ensure:

api token permissions

Today we will:

Before Lunch

Lunch

After Lunch

Throughout the Day

A quality gate is a definition of quality that a piece of software should meet. If it does not meet the quality criteria, the software should not proceed and should be returned to the developer for a fix.

A quality gate should be a concrete, non-negotiable contract of quality for a given service.

A quality gate can combine many metrics that together make up your quality signature. Your definition of "quality" is not restricted to performance metrics.

Think: In Product vs. Open Source

- Production: In Product (SLO & Error Budget tracking)

- CI/CD: Keptn (DT Supported version soon?)

[Image: Dynatrace vs. Keptn]

[Image: Dynatrace SLO screen]

Today we will be using Keptn to build quality gates but...

⚠️ A Quality Gate does not require Keptn, but Keptn makes defining one easier

✔️ Tip: Discuss the concept and advantages of quality gates with customers. Don't focus on Keptn

✔️ Tip: Quality gates can (and should) encompass metrics from any tools (via Dynatrace?)

✔️ Tip: Quality gates are NOT only performance / availability based

Think of Keptn as an intelligent middleware that receives events from your environment and passes those events to "services" which then react to those events.

Think Events

It helps to think in terms of conceptual events and not specific tooling. Consider the process of running and evaluating a quality gate:

Assuming my service is deployed and I have traffic running against the service, I need to:

The bridge of a ship is the control room. Keptn's bridge is your control room to oversee everything happening inside Keptn.

Access your bridge by going to http://keptn.VMIP.nip.io/bridge

[Image: Keptn's bridge]

Take a 10 minute break and we will get hands-on when we return.

Hint for the practical: all URLs, usernames and passwords are stored in ~/installOutput.txt

Tell Keptn which tool it should use to retrieve Service Level Indicators (SLIs).

Install the dynatrace-sli-service. Remove https:// and any trailing slashes from DT_TENANT.

Set some environment variables:

export DT_API_TOKEN=***
export DT_TENANT=dtmanaged.dynatrace.training/e/***

Check that you've set both of these correctly:

echo $DT_API_TOKEN
echo $DT_TENANT
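If the tenant value was copied straight from a browser, it may still carry the scheme or a trailing slash. The following is a minimal sketch of cleaning it up with POSIX parameter expansion. The helper name `normalize_tenant` and the environment ID `abc123` are made up for illustration and are not part of any Keptn or Dynatrace tooling.

```shell
# Hypothetical helper: strips a leading scheme and any trailing slashes
# from a tenant URL. Runs in a subshell so no variables leak out.
normalize_tenant() (
  tenant="$1"
  tenant="${tenant#https://}"   # drop a leading https://
  tenant="${tenant#http://}"    # or a leading http://
  # drop trailing slashes one at a time until none remain
  while [ "${tenant%/}" != "$tenant" ]; do
    tenant="${tenant%/}"
  done
  printf '%s\n' "$tenant"
)

# abc123 is a placeholder environment ID, not a real one
normalize_tenant "https://dtmanaged.dynatrace.training/e/abc123/"
# prints: dtmanaged.dynatrace.training/e/abc123
```

To apply it to the variable you already exported, something like `export DT_TENANT="$(normalize_tenant "$DT_TENANT")"` should work.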

Now create the secret. Keptn will use these details to authenticate with Dynatrace.

kubectl -n keptn create secret generic dynatrace --from-literal="DT_API_TOKEN=$DT_API_TOKEN" --from-literal="DT_TENANT=$DT_TENANT"

Install the service:

kubectl apply -n keptn -f https://raw.githubusercontent.com/keptn-contrib/dynatrace-sli-service/0.7.1/deploy/service.yaml

Verify that the pod is running in the keptn namespace. Look for the dynatrace-sli-service pod:

kubectl get pods -n keptn
NAME                                     READY   STATUS    RESTARTS   AGE
...                                      ...     ...       ...        ...
dynatrace-sli-service-595564cb65-xpx2j   2/2     Running   0          18s

We need to model our customer system inside Keptn. Keptn has 3 levels of configuration:

Project

The top level grouping. In our case, it makes sense to create one project per customer.

Stage

This corresponds to our logical stages. Our customers have two stages: staging and production.

Service

Typically this models the microservice. Our customers have one service in each environment: the web service.

Create a new file called shipyard.yaml. A shipyard file is the way that Keptn models the stages inside your project.

In our case, we want one stage: staging

stages:
  - name: "staging"

Now create a Keptn project for Customer A and use the shipyard file you defined in the previous step:

keptn create project customer-a --shipyard=shipyard.yaml

You'll see output similar to:

$ keptn create project customer-a --shipyard=shipyard.yaml
...
Starting to create project
ID of Keptn context: ...
Project customer-a created
Stage staging created
Project successfully created

[Image: customer-a project]

Now we create our staging-web service for customer-a:

keptn create service staging-web --project=customer-a

You'll see output similar to:

$ keptn create service staging-web --project=customer-a
Starting to create service
ID of Keptn context: ...
Creating new Keptn service staging-web in stage staging

[Image: customer-a project]

Tell Keptn to use the dynatrace-sli-service to retrieve metrics from Dynatrace for the customer-a project:

keptn configure monitoring dynatrace --project=customer-a

Our quality gate will evaluate a single SLI: the 95th percentile response time of the web service.

First make sure you understand how this metric is pulled out of Dynatrace.

Use the Dynatrace metrics API v2 to pull the 95th percentile figure for the web service in staging for customer-a.

Set the metricSelector to:

builtin:service.response.time:percentile(95)

Set the entitySelector to:

type(SERVICE),tag(customer:customer-a),tag([KUBERNETES]stage:staging)

The response will look similar to this:

{
  "totalCount": 1,
  "nextPageKey": null,
  "result": [{
      "metricId": "builtin:service.response.time:percentile(95)",
      "data": [{
          "dimensions": [ "SERVICE-..."],
          "timestamps": [
            ...
            1606888800000
          ],
          "values": [
            ...
            837.3563350144964,
            921.5641993975936
          ]
        }]
    }]
}
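The same query can also be sent from the command line. The sketch below wraps a `curl` call in a function; it assumes `DT_TENANT` and `DT_API_TOKEN` are still exported from the earlier step and that the token is allowed to read metrics. The function name `query_p95` is made up for illustration.

```shell
# Hypothetical wrapper around the Dynatrace metrics API v2 query endpoint.
# Assumes DT_TENANT (no scheme, no trailing slash) and DT_API_TOKEN are set.
query_p95() {
  # -G sends the url-encoded values as query parameters of a GET request
  curl -sS -G "https://${DT_TENANT}/api/v2/metrics/query" \
    -H "Authorization: Api-Token ${DT_API_TOKEN}" \
    --data-urlencode "metricSelector=builtin:service.response.time:percentile(95)" \
    --data-urlencode "entitySelector=type(SERVICE),tag(customer:customer-a),tag([KUBERNETES]stage:staging)"
}

# Usage (needs network access to your tenant):
#   query_p95
```

Letting `curl` do the URL encoding avoids having to escape the parentheses, brackets and commas in the selectors by hand.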

Validate

Navigate to your customer-a service in staging in Dynatrace and notice that the SERVICE-* ID matches the dimension in the REST API call. This proves that you've pulled the metrics for the correct service.

Store this metric as code so that we can tell Keptn to use it.

Here we can use some special variables:

Create a new file called sli.yaml. Do not modify the content below:

---
spec_version: '1.0'
indicators:
  response_time_p95: "builtin:service.response.time:percentile(95)?scope=type(SERVICE),tag(customer:$PROJECT),tag([KUBERNETES]stage:$STAGE)"
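To see what this query turns into for our project, the placeholder substitution can be imitated locally. This is only an illustration of the idea, assuming the dynatrace-sli-service replaces `$PROJECT` and `$STAGE` with the Keptn project and stage names at evaluation time. `expand_sli` is a hypothetical helper, not a Keptn command.

```shell
# Illustration only: mimics the placeholder substitution with sed.
# $1 = SLI query, $2 = project name, $3 = stage name
expand_sli() {
  # [$] matches a literal dollar sign so the shell and sed leave it alone
  printf '%s\n' "$1" | sed -e "s/[\$]PROJECT/$2/g" -e "s/[\$]STAGE/$3/g"
}

expand_sli 'builtin:service.response.time:percentile(95)?scope=type(SERVICE),tag(customer:$PROJECT),tag([KUBERNETES]stage:$STAGE)' customer-a staging
# prints: builtin:service.response.time:percentile(95)?scope=type(SERVICE),tag(customer:customer-a),tag([KUBERNETES]stage:staging)
```

The expanded scope is the same entity selector you used against the metrics API, which is why the same SLI file can be reused for other projects and stages without changes.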

Add this SLI file to the relevant Keptn project and stage.

Add File to Keptn

keptn add-resource --project=customer-a --stage=staging --service=staging-web --resource=sli.yaml --resourceUri=dynatrace/sli.yaml

You'll see a success message:

$ keptn add-resource --project=customer-a --stage=staging --service=staging-web --resource=sli.yaml --resourceUri=dynatrace/sli.yaml
Adding resource sli.yaml to service staging-web in stage staging in project customer-a
Resource has been uploaded.

So far, we've told Keptn:

We haven't told Keptn:

Create & Upload SLO file

Create a new file called slo.yaml with the following content:

spec_version: '1.0'
comparison:
  compare_with: "single_result"
  include_result_with_score: "pass"
  aggregate_function: avg
objectives:
- sli: response_time_p95
  pass:
  - criteria:
    - "<=+10%"
    - "<200"
  warning:
  - criteria:
    - "<=500"
total_score:
  pass: "90%"
  warning: "50%"
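One way to read this file is as a small decision rule. The sketch below is an approximation, not Keptn's implementation: it assumes that every entry inside one `criteria` list must hold, that `<=+10%` is measured against the previous passing result, and that the warning criteria are only consulted when the pass criteria are not met. `classify_p95` is a hypothetical helper.

```shell
# Sketch only: classify_p95 is a hypothetical helper, not part of Keptn.
# $1 = current p95 value, $2 = previous passing p95 value
classify_p95() {
  awk -v cur="$1" -v prev="$2" 'BEGIN {
    # pass: both criteria in the list must hold
    if (cur <= prev * 1.10 && cur < 200) { print "pass" }
    # warning: checked only when pass was not reached
    else if (cur <= 500) { print "warning" }
    else { print "fail" }
  }'
}

classify_p95 150 140   # prints: pass
classify_p95 190 100   # prints: warning (under 200, but more than 10% slower)
classify_p95 900 140   # prints: fail
```

Because this SLO has a single objective, the overall result follows that one objective: the `total_score` thresholds of 90% and 50% only become interesting once several objectives contribute to the score.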

Add File to Keptn

Add the SLO file to Keptn:

keptn add-resource --project=customer-a --stage=staging --service=staging-web --resource=slo.yaml --resourceUri=slo.yaml

You'll see a success message:

$ keptn add-resource --project=customer-a --stage=staging --service=staging-web --resource=slo.yaml --resourceUri=slo.yaml
Adding resource slo.yaml to service staging-web in stage staging in project customer-a
Resource has been uploaded.

Trigger an evaluation using the keptn command line:

keptn send event start-evaluation --project=customer-a --stage=staging --service=staging-web --timeframe=2m

Refresh Keptn's bridge and notice that the evaluation is successful:

[Image: first Keptn evaluation]

Time to push version 2 of our code to Customer A in staging.

kubectl set image -n customer-a deployment/staging-web front-end=adamgardnerdt/perform-demo-app:v2 --record

Refresh the customer-a staging URL and you should see a green banner.

[Image: customer-a staging v2]

Notice that the page now takes longer to load. Version 2 introduces a delay, and this delay will cause our quality gate to fail.

Wait a few minutes for Dynatrace to receive new data before progressing to the next step.

Request a new quality evaluation from Keptn. This time, it should fail because the page is taking too long to load.

keptn send event start-evaluation --project=customer-a --stage=staging --service=staging-web --timeframe=2m

[Image: failed build]

So far we've relied on the keptn CLI to run evaluations. That's not usually the way things are done. More likely you will want to integrate Keptn into your shell scripts or build pipelines.

For this, we have a few options but first we'll look at the API.

Retrieve the Keptn API key:

kubectl get secret keptn-api-token -n keptn -o jsonpath='{.data.keptn-api-token}' | base64 --decode

For convenience, the demo system saves it for you in ~/installOutput.txt:

cat ~/installOutput.txt

Navigate to the Keptn API page:

http://keptn.VMIP.nip.io/api

Authenticate with your token and experiment with the GET endpoints.

Use the evaluation endpoint to request a new Keptn evaluation. This is the equivalent of this CLI command:

keptn send event start-evaluation --project=customer-a --stage=staging --service=staging-web --timeframe=2m

Your details will be:

project = customer-a
stage = staging
service = staging-web
timeframe = 2m

The minimum payload body is:

{
  "timeframe": "2m"
}

If you have an API utility such as Postman you can also try a POST request to:

http://keptn.VMIP.nip.io/api/v1/project/PROJECTNAME/stage/STAGENAME/service/SERVICENAME/evaluation

Header values:
x-token: YOURKEPTNAPIKEY
Content-Type: application/json
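If you prefer the command line to Postman, the same POST request can be sketched with `curl`. The function name `trigger_evaluation` and the variable names `VMIP` and `KEPTN_API_TOKEN` are assumptions made for this example; set them to your VM's IP and the API key you retrieved earlier.

```shell
# Hypothetical wrapper around the evaluation endpoint described above.
# $1 = project, $2 = stage, $3 = service, $4 = timeframe
# Assumes VMIP and KEPTN_API_TOKEN are set in the environment.
trigger_evaluation() {
  curl -sS -X POST \
    "http://keptn.${VMIP}.nip.io/api/v1/project/$1/stage/$2/service/$3/evaluation" \
    -H "x-token: ${KEPTN_API_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "{\"timeframe\": \"$4\"}"
}

# Usage (needs network access to your Keptn installation):
#   trigger_evaluation customer-a staging staging-web 2m
```

A function like this is easy to drop into a shell script or a pipeline step, which is the integration scenario this section is working towards.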

Record the keptnContext

However you choose to call the Keptn API, you receive a 200 OK response and a payload which contains a value called keptnContext:

{
  "keptnContext": "ee4fb3ac-8a7b-48d2-bc35-a784fb1d4b43",
  "token": "***"
}

Keptn will run the evaluation asynchronously. It may take some time to complete the evaluation so Keptn provides an ID by which you can retrieve your evaluation at a later time. keptnContext is that ID.
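In a script you will usually want to capture that ID rather than copy it by hand. The sketch below pulls it out of the response with `sed`, so it needs no extra tools; `extract_context` is a made-up helper name, and a JSON-aware tool such as `jq` would be more robust if it is available on your system.

```shell
# Hypothetical helper: reads a JSON response on stdin and prints the
# value of keptnContext. Simple pattern match, not a full JSON parser.
extract_context() {
  sed -n 's/.*"keptnContext"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example using the response shown above
printf '%s\n' '{' '  "keptnContext": "ee4fb3ac-8a7b-48d2-bc35-a784fb1d4b43",' '  "token": "***"' '}' | extract_context
# prints: ee4fb3ac-8a7b-48d2-bc35-a784fb1d4b43
```

Piping the API response into the helper and storing the result in a variable lets a later step look up the evaluation by its context.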

Retrieve Evaluation

Using the "Select a definition" dropdown, go to mongodb-datastore and use the GET /event endpoint with your keptnContext to pull all events that share that context.

Notice that you receive multiple events. In fact, using only the Keptn context ID, you get the full "PurePath" of events, which corresponds to what you see in the bridge.

Every event in the chain shares the same Keptn Context. Use the context to grab a complete history of that "chain of events":

[Image: bridge events]
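The same lookup can be scripted. The sketch below assumes the mongodb-datastore is exposed under `/api/mongodb-datastore/event` on your installation (check the path shown on the API page, as it may differ between Keptn versions) and that `VMIP` and `KEPTN_API_TOKEN` are set. `get_events` is a hypothetical helper name.

```shell
# Hypothetical helper: fetches all events that share one keptnContext.
# $1 = keptnContext
# The endpoint path is an assumption; verify it on your Keptn API page.
get_events() {
  curl -sS -G "http://keptn.${VMIP}.nip.io/api/mongodb-datastore/event" \
    -H "x-token: ${KEPTN_API_TOKEN}" \
    --data-urlencode "keptnContext=$1"
}

# Usage (needs network access to your Keptn installation):
#   get_events ee4fb3ac-8a7b-48d2-bc35-a784fb1d4b43
```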

So far we have:

Keptn Core

As you know, Keptn is event based. For the purposes of this session, you can consider Keptn's core to be responsible for receiving and placing events onto a topic in a publisher / subscriber type model.

These events can then be used by Keptn's services.

All possible events are listed here

Keptn Services

Keptn services are additional pieces of functionality (think of them like apps) that listen for one (or more) events, react to those events and (optionally) emit events.

Anyone can create new Keptn services.

Basic Workflow Example

[Image: Keptn services, basic workflow]

Complex Workflow Example

[Image: Keptn services, complex workflow]

For example, when we ask Keptn to start an evaluation, we send the following event: sh.keptn.event.start-evaluation

Keptn services that are configured to listen for that event can then react.

Keptn's service architecture makes it completely flexible in terms of what happens and when.

Out of the Box Services

You have already been using Keptn services - some are installed for you by default. Look again at an evaluation in Keptn's bridge. Notice that there are a number of services already mentioned:

[Image: out of the box Keptn services]

Take a look at what's already installed with:

kubectl get deployments -n keptn

Here are mine:

NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
bridge                     1/1     1            1           47h
dynatrace-sli-service      1/1     1            1           45h
eventbroker-go             1/1     1            1           47h
api-service                1/1     1            1           47h
api-gateway-nginx          1/1     1            1           47h
mongodb                    1/1     1            1           47h
lighthouse-service         1/1     1            1           47h
shipyard-service           1/1     1            1           47h
mongodb-datastore          1/1     1            1           47h
remediation-service        1/1     1            1           47h
configuration-service      1/1     1            1           47h

We have now interacted with Keptn via the CLI and the API. But a more realistic scenario would be:

Discuss: How could we achieve this?

Our developer has decided that they want JIRA tickets for each evaluation. We've looked and found a Keptn service which does just that: the JIRA Service.

💡 You will need a free trial JIRA account to proceed. Sign up here

By now you should know:

Follow the instructions on the JIRA Service readme.

Ask Keptn for an evaluation, either via the API or CLI.

keptn send event start-evaluation --project=customer-a --stage=staging --service=staging-web --timeframe=2m

You can check the progress of the evaluation with:

keptn get event evaluation-done --keptn-context ***

JIRA Results

When your evaluation is completed, refresh the JIRA board and you should see a new ticket in your backlog.

[Image: JIRA ticket]

You've successfully created a quality gate as code and

Do you have any thoughts, ideas, comments or questions?

Here are some ideas to extend your research:

Useful Links: