This document describes code lifecycle in Dataform and ways to configure compilation and execution within Dataform.
About code lifecycle in Dataform
The Dataform code lifecycle consists of the following phases:
- Development
- You develop a SQL workflow in a Dataform workspace.
- Compilation
Dataform compiles the SQL workflow code in your workspace to SQL in real time, creating a compilation result of the workspace that you can execute in BigQuery. Dataform uses settings that you defined in your workflow settings file to create the compilation result.
Dataform compilation is hermetic to ensure compilation consistency, meaning that the same code compiles to the same SQL compilation result every time. Dataform compiles your code in a sandbox environment with no internet access. No additional actions, such as calling external APIs, are available during compilation.
- Execution
In a workflow invocation, Dataform executes the workspace compilation result in BigQuery.
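The hermetic property of the compilation phase can be pictured with a toy model. The following Python sketch is not Dataform's compiler: compile_workflow, the ${defaultDataset} placeholder syntax, and the file name are hypothetical, chosen only to show a compilation step that is a pure function of the workspace code and the workflow settings, with no network access.

```python
import hashlib
import json


def compile_workflow(sources: dict, settings: dict) -> dict:
    """Toy compilation: a pure function of source files and settings.

    Hypothetical stand-in, not Dataform's compiler. It substitutes
    ${key} placeholders with setting values and fingerprints the result.
    """
    compiled = {}
    for path in sorted(sources):  # fixed iteration order keeps output stable
        sql = sources[path]
        for key in sorted(settings):
            sql = sql.replace("${" + key + "}", settings[key])
        compiled[path] = sql
    payload = json.dumps(compiled, sort_keys=True).encode("utf-8")
    return {"sql": compiled, "digest": hashlib.sha256(payload).hexdigest()}


# Hypothetical workspace contents and workflow settings.
sources = {"definitions/orders.sql": "SELECT * FROM ${defaultDataset}.raw_orders"}
settings = {"defaultDataset": "analytics"}

first = compile_workflow(sources, settings)
second = compile_workflow(sources, settings)
print(first["digest"] == second["digest"])  # same inputs, same result
```

Because the function reads nothing except its arguments, changing the code or the settings is the only way to change the result, which is the behavior the hermetic sandbox is meant to guarantee.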
To tailor Dataform code lifecycle to your needs, you can configure the compilation result to influence where and how Dataform executes your SQL workflow. Then, you can manually trigger or schedule executions to influence when Dataform executes your whole SQL workflow or its selected elements.
Ways to configure Dataform compilation
By default, Dataform uses settings in the workflow settings file to create compilation results. You can override the default settings with compilation overrides to create custom compilation results. You can then manually trigger execution of a custom compilation result, or schedule executions.
Dataform provides the following options for configuring compilation results:
- Workspace compilation overrides
- You can configure compilation overrides that apply to all workspaces in a repository. You can use workspace compilation overrides to create isolated development environments.
- Release configurations
- You can create release configurations to configure templates for creating compilation results of a Dataform repository. You can then create a workflow configuration to schedule executions of compilation results created in a selected release configuration.
- Dataform API compilation overrides
- You can pass Dataform API requests in the terminal to create and execute a single compilation result with compilation overrides.
Configure workspace compilation overrides
With workspace compilation overrides, you can create compilation overrides for all workspaces in a Dataform repository. You can create one configuration of workspace compilation overrides per repository.
When you manually trigger execution in a workspace in a repository with workspace compilation overrides, Dataform applies these overrides to the compilation result of the workspace.
You can configure the following workspace compilation overrides:
- Google Cloud project in which Dataform executes the contents of the workspace
- Table prefix
- Schema suffix
You can use workspace compilation overrides to create isolated development environments by isolating workspace compilation results in BigQuery with dynamic compilation overrides. Dynamic table prefix and schema suffix compilation overrides contain the ${workspaceName} variable.
When you trigger execution in a workspace, Dataform replaces the ${workspaceName} variable with the name of the current workspace, creating compilation overrides unique to the workspace.
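The substitution of ${workspaceName} can be sketched in Python as follows. This is an illustration under assumptions, not Dataform's implementation: the function names, the override keys, and the workspace names are hypothetical, and the sketch assumes that a table prefix and a schema suffix are joined to the original names with an underscore.

```python
from string import Template


def resolve_overrides(overrides: dict, workspace_name: str) -> dict:
    """Replace ${workspaceName} in every override value."""
    return {
        key: Template(value).safe_substitute(workspaceName=workspace_name)
        for key, value in overrides.items()
    }


def qualified_table(schema: str, table: str, resolved: dict) -> str:
    """Apply resolved overrides to a table name.

    Assumption: the prefix and suffix are attached with an underscore.
    """
    prefix = resolved.get("table_prefix")
    suffix = resolved.get("schema_suffix")
    table_name = f"{prefix}_{table}" if prefix else table
    schema_name = f"{schema}_{suffix}" if suffix else schema
    return f"{schema_name}.{table_name}"


# Hypothetical repository-level workspace compilation overrides.
overrides = {
    "table_prefix": "${workspaceName}",
    "schema_suffix": "${workspaceName}",
}

for workspace in ("alice", "bob"):
    resolved = resolve_overrides(overrides, workspace)
    print(workspace, "->", qualified_table("analytics", "orders", resolved))
```

Each workspace resolves the same repository-level overrides to different values, so executions triggered from two workspaces write to separate tables.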
Keep in mind that you cannot schedule executions of compilation results created with workspace compilation overrides.
Create release configurations
With release configurations, you can configure templates of settings for creating compilation results of repositories.
In a release configuration, you can configure compilation overrides of workflow settings, compilation variables, and the frequency of creating compilation results of your whole repository.
In a release configuration, you can configure the following compilation overrides:
- Google Cloud project
- Table prefix
- Schema suffix
- Value of a compilation variable
You can create multiple release configurations in a Dataform repository, one for each stage of your development lifecycle, creating isolated repository compilation results.
You can then create workflow configurations to schedule executions of compilation results created in a selected release configuration.
You can also manually trigger execution of a compilation result in a selected release configuration.
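As a rough illustration, the settings of a release configuration can be written down as a request body for the Dataform API. The sketch below only builds dictionaries and sends nothing. The field names (gitCommitish, codeCompilationConfig, defaultDatabase, tablePrefix, schemaSuffix, vars, cronSchedule, timeZone) follow the Dataform REST resources as I understand them and should be checked against the API reference; the helper name, project IDs, variable names, and schedules are hypothetical.

```python
def release_config_body(git_commitish, project, schema_suffix=None,
                        table_prefix=None, variables=None,
                        cron=None, time_zone=None):
    """Build a release configuration body (field names are assumptions)."""
    overrides = {"defaultDatabase": project}  # Google Cloud project override
    if schema_suffix:
        overrides["schemaSuffix"] = schema_suffix
    if table_prefix:
        overrides["tablePrefix"] = table_prefix
    if variables:
        overrides["vars"] = dict(variables)  # compilation variable values
    body = {"gitCommitish": git_commitish, "codeCompilationConfig": overrides}
    if cron:
        body["cronSchedule"] = cron  # how often a new result is compiled
    if time_zone:
        body["timeZone"] = time_zone
    return body


# One hypothetical release configuration per lifecycle stage.
staging = release_config_body(
    "main", "my-staging-project",
    schema_suffix="staging", variables={"env": "staging"}, cron="0 * * * *",
)
production = release_config_body(
    "main", "my-prod-project",
    variables={"env": "production"}, cron="0 6 * * *", time_zone="Europe/London",
)
print(staging)
print(production)
```

Keeping one body per stage mirrors the idea of isolated repository compilation results: the same code compiles against a different project, suffix, and variable values in each stage.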
Configure a single compilation result with Dataform API compilation overrides
By passing Dataform API requests in the terminal, you can configure compilation overrides for a single compilation result.
In the compilationResults.create request, you can create a single compilation result of a Dataform workspace or a specified Git commitish.
In the CodeCompilationConfig object of the compilationResults.create request, you can configure compilation overrides for the compilation request.
You can configure the following Dataform API compilation overrides:
- Google Cloud project
- Table prefix
- Schema suffix
- Value of a compilation variable
Keep in mind that Dataform API compilation overrides apply to a single compilation result and a single execution. You cannot use them to schedule Dataform executions.
You can execute a compilation result in the workflowInvocations.create request.
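The two requests can be sketched as plain request descriptions in Python. Nothing is sent over the network here; the sketch only assembles the URL and JSON body that a client would send with an authenticated POST. The API version in the base URL, the field names, and every project, location, repository, and result identifier are assumptions or placeholders to verify against the Dataform API reference.

```python
import json

API_BASE = "https://dataform.googleapis.com/v1beta1"  # assumed API version


def repository_path(project, location, repository):
    """Resource path of a repository (all three values are placeholders)."""
    return f"projects/{project}/locations/{location}/repositories/{repository}"


def compilation_request(repo_path, git_commitish, project_override=None,
                        table_prefix=None, schema_suffix=None, variables=None):
    """Describe a compilationResults.create call with overrides."""
    config = {}
    if project_override:
        config["defaultDatabase"] = project_override
    if table_prefix:
        config["tablePrefix"] = table_prefix
    if schema_suffix:
        config["schemaSuffix"] = schema_suffix
    if variables:
        config["vars"] = dict(variables)
    return {
        "url": f"{API_BASE}/{repo_path}/compilationResults",
        "body": {"gitCommitish": git_commitish, "codeCompilationConfig": config},
    }


def invocation_request(repo_path, compilation_result_name):
    """Describe a workflowInvocations.create call for one compilation result."""
    return {
        "url": f"{API_BASE}/{repo_path}/workflowInvocations",
        "body": {"compilationResult": compilation_result_name},
    }


repo = repository_path("my-project", "us-central1", "my-repo")
compile_call = compilation_request(repo, "main", schema_suffix="adhoc",
                                   variables={"run_date": "2024-01-01"})
# The result name would come from the response to the first call.
invoke_call = invocation_request(repo, f"{repo}/compilationResults/RESULT_ID")
print(json.dumps(compile_call, indent=2))
print(json.dumps(invoke_call, indent=2))
```

The second request references exactly one compilation result, which matches the constraint that these overrides cover a single compilation result and a single execution.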
Ways to configure Dataform execution
Dataform provides the following options for configuring execution:
- Manual execution in a workspace
- You can manually trigger instant execution of a SQL workflow in a Dataform workspace, outside of any schedule. You can execute selected actions in the SQL workflow.
- Workflow configurations
- You can schedule executions of compilation results created in a selected release configuration. You can select SQL workflow actions to execute, and set the frequency and time zone of executions.
Trigger instant execution in a workspace
In a Dataform workspace, you can manually trigger instant execution of the SQL workflow in your workspace, outside of any schedule.
You can manually execute the whole SQL workflow in your workspace or only selected actions in it.
If your repository contains workspace compilation overrides, you can view what compilation overrides Dataform will apply to the workspace compilation result.
Create workflow configurations
With workflow configurations, you can schedule executions of compilation results from a selected release configuration. You can create multiple workflow configurations in a Dataform repository.
In a workflow configuration, you can configure the following execution settings:
- Applied compilation release configuration
- Selection of SQL workflow actions to be executed
- Schedule and time zone of executions
You can select the following SQL workflow actions to be executed:
- All actions
- Selected actions
- Actions with selected tags
Then, during a scheduled execution of your workflow configuration, Dataform deploys your selection of actions from the applied compilation result to BigQuery.
Dataform release configurations and workflow configurations let you configure compilation and schedule executions within Dataform, without the need to rely on additional services.
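The execution settings of a workflow configuration can be sketched as a request body in the same way. The field names (releaseConfig, invocationConfig, includedTargets, includedTags, cronSchedule, timeZone) reflect my understanding of the Dataform REST resources and should be verified against the API reference. The sketch also assumes that an empty action selection means all actions. Helper names, resource names, tags, and the schedule are hypothetical.

```python
def invocation_config(targets=None, tags=None):
    """Select actions to execute; an empty selection is assumed to mean all."""
    config = {}
    if targets:
        # Each target names one action by project (database), schema, and name.
        config["includedTargets"] = [
            {"database": db, "schema": schema, "name": name}
            for db, schema, name in targets
        ]
    if tags:
        config["includedTags"] = list(tags)
    return config


def workflow_config_body(release_config_name, cron, time_zone,
                         targets=None, tags=None):
    """Build a workflow configuration body (field names are assumptions)."""
    return {
        "releaseConfig": release_config_name,
        "invocationConfig": invocation_config(targets, tags),
        "cronSchedule": cron,
        "timeZone": time_zone,
    }


release = ("projects/my-project/locations/us-central1/"
           "repositories/my-repo/releaseConfigs/production")

all_actions = workflow_config_body(release, "0 7 * * *", "Europe/London")
by_tag = workflow_config_body(release, "0 7 * * *", "Europe/London",
                              tags=["daily"])
by_action = workflow_config_body(
    release, "0 7 * * *", "Europe/London",
    targets=[("my-prod-project", "analytics", "orders")],
)
print(all_actions)
print(by_tag)
print(by_action)
```

The three bodies correspond to the three selection options listed above: all actions, actions with selected tags, and selected actions.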
Expiration of lifecycle resources
Dataform stores compilation results and workflow invocations for a specific period of time.
Expiration of workflow invocations
Workflow invocations expire after 90 days, or when you manually delete them.
In a workflow configuration, you can view a list of most recent workflow invocations created by the configuration. When a workflow invocation created by a workflow configuration expires, Dataform removes that workflow invocation from the list of recent invocations.
Expiration of compilation results
Expiration of compilation results depends on the way they are created: in a development workspace, in a release configuration, or by a workflow invocation.
When you develop a SQL workflow in a Dataform workspace, Dataform compiles your code into a compilation result in real time to provide query validation. Compilation results created this way expire after 24 hours.
In a release configuration, the latest compilation result becomes the live compilation result. A new compilation result replaces the current live compilation result. Dataform retains the live compilation result until it is replaced with a new compilation result. A replaced compilation result expires in up to 24 hours.
Dataform removes expired compilation results from the list of past compilation results on the Details page of a release configuration.
Dataform retains compilation results created by workflow invocations for the whole life of the workflow invocation, and for up to 24 hours after the workflow invocation expires or is deleted.
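The retention rules above reduce to simple date arithmetic. The following Python sketch computes the expiration times they imply; the function names and sample timestamps are hypothetical, and for the "up to 24 hours" cases the functions return the latest possible time, not an exact one.

```python
from datetime import datetime, timedelta, timezone

INVOCATION_TTL = timedelta(days=90)    # workflow invocations
COMPILATION_TTL = timedelta(hours=24)  # compilation results


def invocation_expiry(created_at, deleted_at=None):
    """A workflow invocation ends after 90 days or on manual deletion."""
    expiry = created_at + INVOCATION_TTL
    return min(expiry, deleted_at) if deleted_at else expiry


def workspace_result_expiry(created_at):
    """Results compiled during development expire after 24 hours."""
    return created_at + COMPILATION_TTL


def replaced_release_result_latest_expiry(replaced_at):
    """A replaced release result expires within 24 hours of replacement."""
    return replaced_at + COMPILATION_TTL


def invocation_result_latest_expiry(invocation_created_at, deleted_at=None):
    """Kept for the invocation's life, plus at most 24 hours."""
    return invocation_expiry(invocation_created_at, deleted_at) + COMPILATION_TTL


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
deleted = datetime(2024, 1, 10, tzinfo=timezone.utc)

print(invocation_expiry(created))
print(invocation_expiry(created, deleted))
print(invocation_result_latest_expiry(created))
print(workspace_result_expiry(created))
```

The live compilation result of a release configuration has no entry here because it has no fixed lifetime: it is kept until a newer result replaces it.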
What's next
- To learn about best practices for code lifecycle in Dataform, see Managing code lifecycle.
- To learn how to configure Dataform workspace compilation overrides, see Create workspace compilation overrides.
- To learn how to configure a single compilation result with Dataform API compilation overrides, see Configure compilation overrides with the Dataform API.
- To learn how to create Dataform release configurations, see Create a release configuration.
- To learn how to manually trigger execution in a workspace, see Trigger execution.
- To learn how to create workflow configurations, see Schedule executions with workflow configurations.