As a developer advocate, one of the largest challenges I face is how to teach people to use our company’s products. To do this well, you need to create workshops and disposable environments so your students can get their hands on the actual technology. As an IBM employee, I use the IBM Cloud, but it is designed for long-term production usage, not the ephemeral infrastructures that a workshop requires.
We often create systems to work around the limitations. Recently, while updating the deployment process of one such system, I realized I had created a full serverless stack, completely by accident. This blog post details how I accidentally built an automated serverless application and introduces you to the technology I used.
Enabling automation with Schematics
Before describing the serverless application, I am going to pivot and talk about a feature of IBM Cloud that most people don’t know about. It’s called IBM Cloud Schematics, and it’s a gem of our cloud. Here’s a description of the tool:
Automate your IBM Cloud infrastructure, service, and application stack across cloud environments. Oversee all the resulting jobs in one space.
And it’s true! Basically, it’s a wrapper around Terraform and Ansible, so you can store your infrastructure state in IBM Cloud and put real RBAC in front of it. You can leverage the cloud’s Identity and Access Management (IAM) system and built-in permissions. This removes the tedium of dealing with Terraform state files and lets infrastructure teams focus solely on the declaration code.
Why I built this serverless application
This brings me to using this application on our cloud. For workshops and demos, I was told that I had to move away from “classic” clusters and move to virtual private clouds (VPCs). There is a bunch of Terraform code floating around, so I found some and adapted it to create a VPC, connected it to shared object storage, and added all the clusters needed for a workshop into that same VPC. The result is that every workshop now gets its own VPC, giving participants their own IP space and walled garden of resources. This is a huge win for us.
Here’s a look at the flow of how the application interacted with Schematics to create these VPCs:
The request process
Someone enters a GitHub Enterprise issue on a specific repository.
The GitHub Issue validator receives a webhook from GitHub Enterprise and parses the issue for the different options. It also checks that no option exceeds its allowed limit and that the issue itself is formatted correctly. If everything is accepted, the validator tags the issue with scheduled to mark it as ready to be created.
The cron-issue-tracker polls the issues carrying the “scheduled” tag every 15 minutes.
If it’s within 24 hours of the start time, the API calls the grant-cluster-api and requests creation of the grant-cluster application.
It calls either the classic or VPC Code Engine APIs to spin up the required clusters via the /create API endpoint.
If it is a classic request, it calls the AWX backend; if it is a VPC request, it calls the Schematics backend to request the clusters.
When the cron-issue-tracker sees that it is 24 hours past the “end time,” it removes the grant-cluster application and destroys the clusters via the /delete API endpoint.
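The create/delete windows in the flow above boil down to a small scheduling check. Here is a minimal sketch of that logic; the issue dictionary shape (`start_time`, `end_time`, `tags`) and tag handling are my assumptions, not the actual cron-issue-tracker code:

```python
from datetime import datetime, timedelta

def action_for(issue, now):
    # Decide what the tracker should do with an issue on this polling pass.
    start = datetime.strptime(issue["start_time"], "%Y-%m-%d %H:%M")
    end = datetime.strptime(issue["end_time"], "%Y-%m-%d %H:%M")
    if now >= end + timedelta(hours=24):
        return "delete"   # tear everything down via the /delete endpoint
    if "scheduled" in issue["tags"] and now >= start - timedelta(hours=24):
        return "create"   # spin things up via the /create endpoint
    return "wait"         # outside both windows; check again in 15 minutes

issue = {"start_time": "2021-10-02 15:00",
         "end_time": "2021-10-02 18:00",
         "tags": ["scheduled"]}
print(action_for(issue, datetime(2021, 10, 2, 9, 0)))  # prints "create"
```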
I used the vpc-gen2-openshift-request-api, a Flask API that runs a Code Engine job, as the starting point of the serverless application. After handing a bunch of Terraform code to Schematics, the natural next step was to figure out a way to trigger the request via an API. This is where the IBM Code Engine platform comes into play.
If you view the GitHub repo above, you’ll see that our Schematics request is wrapped as a Code Engine job (line 21 in app.py). Because of that, all I had to do was curl a JSON data string to our /create endpoint to kick it off. Now I had the ability to run something like:
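Something along these lines; the URL is a placeholder for the deployed application's route, and the JSON field names are assumptions based on the issue template options rather than the API's documented schema:

```shell
# Placeholder route for the deployed Flask application
APP_URL="https://vpc-gen2-request-api.example.com"

# Hypothetical payload mirroring the GitHub issue options
cat > request.json <<'EOF'
{
  "event_short_name": "openshift-workshop",
  "clusters": 25,
  "cluster_type": "OpenShift",
  "workers": 3,
  "worker_type": "b3c.4x16",
  "region": "us-south"
}
EOF

# POST the payload to the /create endpoint to kick off the Schematics job
curl -s -X POST "$APP_URL/create" \
  -H "Content-Type: application/json" \
  -d @request.json || echo "request not sent (placeholder URL)"
```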
This gave us a working pattern for shipping requests to the API.
The second core part of this project was to validate the GitHub Enterprise issue. With the help of Steve Martinelli, I took an IBM Cloud Functions application he created to parse a standard GitHub issue and pulled out options from it.
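The parsing side of that function can be sketched in a few lines. The option names come from the issue template below; the regex and the Cloud Functions wiring around it are illustrative assumptions:

```python
import re

def parse_issue(body):
    # Pull "key: value" option lines out of a GitHub issue body,
    # tolerating an optional leading bullet character.
    options = {}
    for line in body.splitlines():
        m = re.match(r"\s*[-•*]?\s*([\w ]+):\s*(.+)", line)
        if m:
            options[m.group(1).strip().lower()] = m.group(2).strip()
    return options

body = "• clusters: 25\n• region: us-south\n• start time: 2021-10-02 15:00"
print(parse_issue(body)["clusters"])  # prints "25"
```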
For instance, the request gives you these options to fill out:
• event short name: openshift-workshop
• start time: 2021-10-02 15:00
• end time: 2021-10-02 18:00
• clusters: 25
• cluster type: OpenShift
• workers: 3
• worker type: b3c.4x16
• region: us-south
This Cloud Function receives a webhook from GitHub Enterprise on any creation or edit of the issue and checks it against some parameters I set. For instance, there must be fewer than 75 clusters, and the start and end times must be formatted in a specific way and be within 72 hours of each other. If the issue doesn’t match my parameters, the application comments on the issue and asks the submitter to update it.
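Those checks can be sketched roughly as follows. The 75-cluster and 72-hour limits come from the post itself; the field names and error messages are illustrative assumptions:

```python
from datetime import datetime

def validate(options):
    # Return a list of problems to comment back on the issue;
    # an empty list means the issue can be tagged "scheduled".
    errors = []
    try:
        start = datetime.strptime(options["start time"], "%Y-%m-%d %H:%M")
        end = datetime.strptime(options["end time"], "%Y-%m-%d %H:%M")
    except (KeyError, ValueError):
        return ["start/end time must look like 2021-10-02 15:00"]
    if not 0 <= (end - start).total_seconds() <= 72 * 3600:
        errors.append("start and end times must be within 72 hours of each other")
    if int(options.get("clusters", 0)) >= 75:
        errors.append("please request fewer than 75 clusters")
    return errors
```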
If everything is parsed correctly, the validator adds the tag of scheduled to the issue so our next application can take ownership of it.
As I created this microservice, I realized I had a full serverless application brewing. After some deeper research into Code Engine, I discovered that there was a cron system built into the technology. So, now that I could parse the issues with webhooks, I could take that same framework and create a cron job that checks the start and end times and acts for us. This freed me from having to schedule time for one of us to spin up the required systems. cURLing our vpc-gen2-request-api gave me my clusters at a reasonable time.
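In Code Engine, that cron trigger takes the form of a cron subscription pointed at an app or job. Assuming the current `ibmcloud` CLI syntax (the exact flags may vary by CLI version, and the names here are placeholders), a 15-minute poll might be created like this:

```shell
# Hypothetical cron subscription that fires the tracker every 15 minutes
ibmcloud ce subscription cron create \
  --name issue-tracker-cron \
  --destination cron-issue-tracker \
  --schedule '*/15 * * * *'
```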
I also needed a system to check out the clusters, and that’s where the final microservice came into play.
The grant-cluster-api microservice completed my application puzzle. This microservice is a Code Engine job that spins up a serverless application, with all the required settings parsed from the GitHub issue automatically, 24 hours before the start time, and removes it 24 hours after the end time. It also changes the tags and labels on the issue so the cron-issue-tracker knows what to do when it walks through the repository.
As you can see from the diagram, this application consists of a bunch of small APIs and functions that do the work of a full application. Users have one and only one interface into the stack: the GitHub issue. When everything is set up correctly, the bots do the work for us. There are portions I can extend in the future, but everything traces back to that first Flask application, when I realized that all you have to do is send a JSON blob of data to request exactly what you need.