Releasing Artifact Registry assets across Organizations and Projects with serverless
Guillermo Noriega
Infrastructure Cloud Consultant
Vipul Raja
Infrastructure Cloud Consultant
Have you ever wondered if there is a more automated way to copy Artifact Registry or Container Registry images across different projects and Organizations? In this article we will go over an opinionated process for doing so using serverless components in Google Cloud, deployed with Infrastructure as Code (IaC).
This article assumes knowledge of coding in Python, a basic understanding of running commands in a terminal, and familiarity with the HashiCorp Configuration Language (HCL), i.e. Terraform, for IaC.
In this use case we have at least one container image residing in an Artifact Registry repository that is updated frequently and needs to be propagated to Artifact Registry repositories in external organizations. Although the images are released to external organizations, they should remain private and may not be available for public use.
To clearly articulate how this approach works, let's first cover the individual components of the architecture and then tie them all together.
As discussed earlier, we have two Artifact Registry (AR) repositories in question; for ease going forward, let's call them "Source AR" (the AR where the image is periodically built and updated, the source of truth) and "Target AR" (the AR in a different organization or project where the image needs to be consumed and propagated periodically). The next component in the architecture is Cloud Pub/Sub; we need an Artifact Registry Pub/Sub topic in the source project that automatically captures updates made to the Source AR. When the Artifact Registry API is enabled, Artifact Registry automatically creates this Pub/Sub topic; the topic is called "gcr" and is shared between Artifact Registry and Google Container Registry (if used). Artifact Registry publishes messages to the topic for the following changes (an example payload follows this list):
- Image uploads
- New tags added to images
- Image deletion
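For reference, here is a minimal sketch of what such a notification payload looks like for an image push; the project, repository, and digest values are placeholders, and the exact format is described in the Pub/Sub notifications documentation linked at the end of this article.

```python
# Illustrative sketch of the JSON payload published to the "gcr" topic when an
# image is pushed and tagged; all paths and the digest are placeholders.
notification = {
    "action": "INSERT",  # "INSERT" for uploads and new tags, "DELETE" for deletions
    "digest": "us-docker.pkg.dev/source-project/source-repo/image-x@sha256:<digest>",
    "tag": "us-docker.pkg.dev/source-project/source-repo/image-x:latest",
}
```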
Although the topic is created for us, we will need to create a Pub/Sub subscription to consume the messages from the topic. This brings us to the next component of the architecture, Cloud Run. We will create a Cloud Run service that performs the following (a minimal sketch follows this list):
- Parse the Pub/Sub messages
- Compare the contents of the message to validate whether the change in the Source AR warrants an update to the Target AR
- If the validation conditions are met, copy the latest Docker image to the Target AR
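To make this concrete, here is a minimal sketch of such a handler, assuming a Flask app and illustrative image names and paths; the actual implementation lives in app.py and copy_image.py in the repository linked later in this article.

```python
import base64
import json
import subprocess

from flask import Flask, request

app = Flask(__name__)

# Illustrative values; the real app.py defines its own source, target, and tag variables.
SOURCE_IMAGE = "us-docker.pkg.dev/source-project/source-repo/image-x"
TARGET_IMAGE = "us-docker.pkg.dev/target-project/target-repo/image-x"
WATCHED_TAG = "latest"


@app.route("/", methods=["POST"])
def handle_notification():
    # Eventarc delivers the Pub/Sub message as an HTTP push; the Artifact Registry
    # payload is base64-encoded JSON inside message.data.
    envelope = request.get_json()
    payload = json.loads(base64.b64decode(envelope["message"]["data"]).decode())

    # Only act when the watched image is pushed or re-tagged with the watched tag.
    if payload.get("action") == "INSERT" and payload.get("tag") == f"{SOURCE_IMAGE}:{WATCHED_TAG}":
        # In the actual solution the copy step lives in copy_image.py.
        subprocess.run(
            ["gcrane", "cp", f"{SOURCE_IMAGE}:{WATCHED_TAG}", f"{TARGET_IMAGE}:{WATCHED_TAG}"],
            check=True,
        )
        return ("Image copied to the Target AR", 200)

    # Acknowledge unrelated messages so Pub/Sub does not retry them.
    return ("The source AR updates were not made to the watched image. No image will be updated.", 200)
```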
Now, let's dive into how Cloud Run integrates with the AR Pub/Sub topic. For Cloud Run to be able to read the Pub/Sub messages, we need two additional components: an Eventarc trigger and a Pub/Sub subscription. The Eventarc trigger is critical to the workflow, as it is what triggers the Cloud Run service.
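As a rough idea of how that wiring looks in Terraform, here is a hedged sketch of an Eventarc trigger pointed at the existing gcr topic; the resource, service, and region names are illustrative, and the actual configuration lives in the repository's Terraform code.

```hcl
resource "google_eventarc_trigger" "artifact_release" {
  name     = "artifact-release-trigger" # illustrative name
  location = "us-central1"              # must match the Cloud Run service's region

  matching_criteria {
    attribute = "type"
    value     = "google.cloud.pubsub.topic.v1.messagePublished"
  }

  destination {
    cloud_run_service {
      service = "artifact-release-service" # the Cloud Run service described above (illustrative)
      region  = "us-central1"
    }
  }

  # Reuse the existing "gcr" topic as the trigger's transport instead of a new
  # Eventarc-managed topic.
  transport {
    pubsub {
      topic = "projects/${var.gcp_project}/topics/gcr"
    }
  }

  service_account = var.service_account
}
```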
In addition to the components described above, the following prerequisites need to be met for the entire flow to function correctly.
- The Cloud SDK needs to be installed on your workstation so that you can run gcloud commands.
- The project Service Account (SA) needs "Read" permission on the Source AR.
- The project SA needs "Write" permission on the Target AR.
- VPC-SC requirements on the destination organization (if VPC Service Controls is enabled):
  - Egress permission to the target repository for the SA running the job
  - Ingress permission for the account running the 'make' commands (instructions below) and writing to Artifact Registry or Container Registry
  - Ingress permission to read the Pub/Sub "gcr" topic of the source repository
  - Allow VPC-SC ingress for [project-name]-sa@[project-name].iam.gserviceaccount.com for the Artifact Registry method
  - Allow VPC-SC ingress for [project-name]-sa@[project-name].iam.gserviceaccount.com for the Cloud Run method
Below we walk through the Python code, Dockerfile, and Terraform code, which is all you need to implement this yourself. We recommend opening our GitHub repository, where all of the open source code for this solution lives, while reading the following sections. Here's the link: https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/devops/inter-org-artifacts-release
What we deploy in Cloud Run is a custom Docker container. It consists of the following files:
- app.py: contains the variables for the source and target containers, as well as the execution code that runs in response to the Pub/Sub messages.
- copy_image.py: contains the copy function that app.py leverages to run the gcrane command required to copy images from the Source AR to the Target AR (a sketch follows this list).
- Dockerfile: contains the instructions needed to package gcrane and the requirements needed to build the Cloud Run image.
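As a rough sketch of what the copy step can look like, here is a minimal version of such a helper, assuming gcrane is available on the container's PATH; the function name and arguments are illustrative and the actual copy_image.py in the repository may differ.

```python
import subprocess


def copy_image(source_image: str, target_image: str) -> None:
    """Copy an image reference (name plus tag) from the Source AR to the Target AR.

    gcrane copies the manifest and layers directly between registries, so nothing
    is pulled to local disk inside the Cloud Run container.
    """
    # check=True turns a non-zero gcrane exit code into an exception, so the
    # Cloud Run request fails and the Pub/Sub message can be redelivered.
    subprocess.run(["gcrane", "cp", source_image, target_image], check=True)


if __name__ == "__main__":
    # Example invocation with placeholder image paths.
    copy_image(
        "us-docker.pkg.dev/source-project/source-repo/image-x:latest",
        "us-docker.pkg.dev/target-project/target-repo/image-x:latest",
    )
```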
Now that we have covered all of the individual components associated with this architecture, let's walk through the flow that ties them together.
Let's say your engineering team has built and released a new version of the Docker image "Image X" per their release schedule and added the "latest" tag to it. The new version now sits in the Source AR, and when it is created, a message reflecting that a new version of "Image X" has been added to the Source AR is published to the AR Pub/Sub topic. This automatically causes the Eventarc trigger to invoke the Cloud Run service, which reads the message from the Pub/Sub subscription.
Our Cloud Run service uses the logic in app.py to check whether the action that happened in the Source AR matches the criteria specified (Image X with the tag "latest"). If the action matches and warrants a downstream action, Cloud Run runs copy_image.py to execute the gcrane command that copies the image name and tag from the Source AR to the Target AR.
If the image or tag does not match the criteria specified in app.py (for example, Image Y with the tag "latest"), the Cloud Run service returns an HTTP 200 response with the message "The source AR updates were not made to the [Image X]. No image will be updated." confirming that no action will be taken.
Note: Because the Source AR may contain multiple images and we are only concerned with updating specific images in the Target AR, we have integrated output responses within the Cloud Run service that can be viewed in the Google Cloud logs for troubleshooting and diagnosing issues. This also prevents unwanted publishing of images not pertaining to the desired image(s) in question.
Why did we not go with an alternative approach?
- Versatility: The Source and Target ARs were in different Organizations.
- Compatibility: The artifacts were not in a code/Git repository compatible with solutions like Cloud Build.
- Security: VPC-SC perimeters limit the tools we can leverage while using cloud-native serverless options.
- Immutability: We wanted a solution that could be fully deployed with Infrastructure as Code.
- Scalability and Portability: We wanted to be able to update multiple Artifact Registries in multiple Organizations simultaneously.
- Efficiency and Automation: Avoids a time-based pull method when no resources are being moved, and avoids human interaction to ensure consistency.
- Cloud Native: Alleviates the dependency on third-party tools or solutions like a CI/CD pipeline or a repository outside of the Google Cloud environment.
If the upstream projects where the images come from all reside in the same Google Cloud region or multi-region, a great alternative for solving this problem is virtual repositories.
How do we deploy it with IaC?
- We have provided the Terraform code we used to solve this problem.
- The following variables are used in the code. They need to be declared in a .tfvars file and assigned values based on your specific project (an example follows this list):
  - var.gcp_project
  - var.service_account
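For example, a terraform.tfvars file could look like the following; both values are placeholders to replace with your own project ID and service account.

```hcl
# terraform.tfvars -- illustrative placeholder values
gcp_project     = "my-release-project"
service_account = "my-release-project-sa@my-release-project.iam.gserviceaccount.com"
```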
In conclusion, there are multiple ways to bootstrap a process for releasing artifacts across Organizations. Each method has its pros and cons, and the best one is determined by evaluating the use case at hand. The things to consider are whether the artifacts can reside in a Git repository, whether the target repository is in the same Organization or a child Organization, and whether CI/CD tooling is preferred.
If you have gotten this far, it's likely you have a good use case for this solution. This pattern can also be applied to other, similar use cases. Here are a couple of examples to get you started:
- Copying other types of artifacts from AR repositories, like Kubeflow Pipeline templates (KFP)
- Copying bucket objects behind a VPC-SC perimeter between projects or Orgs
Learn more
- Our solution code can be found here: https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/devops/inter-org-artifacts-release
- gcrane: https://github.com/google/go-containerregistry/blob/main/cmd/gcrane/README.md
- Configuring Pub/Sub GCR notifications: https://cloud.google.com/artifact-registry/docs/configure-notifications