Databricks CLI — a few examples
The last post described Databricks Jobs. Today I want to show the Databricks CLI through a few examples. Once you have configured the CLI (the instructions are here: Databricks CLI | Databricks on AWS), you can work with Databricks from a command prompt.
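As a reminder, with the legacy (Python-based) Databricks CLI the configuration step is an interactive command that asks for your workspace URL and a personal access token; the exact flow may differ if you use a newer CLI version:

```bash
# Configure the CLI with a personal access token (interactive prompts follow)
databricks configure --token
```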
Here is the first example:
We want to run the job created in the last post: Databricks Workflows/Jobs
A link to a code repository is placed at the end of the article.
To begin, let's check which jobs have been created:
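A minimal sketch of that check, assuming the legacy CLI syntax:

```bash
# List all jobs in the workspace (job IDs and names)
databricks jobs list
```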
Now we know the job name and its ID.
Let's get more detailed info about the job; the response is a JSON document with all the job details.
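Something along these lines, where the job ID is only a placeholder for the ID returned by the previous listing:

```bash
# Get the full job definition and settings as JSON
databricks jobs get --job-id 123
```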
Now it is time to run the job:
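A sketch with the same placeholder job ID:

```bash
# Trigger the job; the response contains the run_id of the new run
databricks jobs run-now --job-id 123
```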
We get a response that contains a run_id. Let's check the status.
The job is running. If we want more details about this run, we can retrieve them as JSON.
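With the legacy CLI both the status and the full run details come from the same call (the run ID below is a placeholder):

```bash
# Get the state and details of a specific run as JSON
databricks runs get --run-id 456
```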
Enough about jobs. The second example shows how to work with workspaces.
There is a folder called ETL_DBR in the workspace, and it contains three notebooks.
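We can confirm that from the command line; the path below assumes the folder sits at the workspace root, so adjust it to your own layout:

```bash
# List the contents of the ETL_DBR folder in the workspace
databricks workspace ls /ETL_DBR
```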
Now we want to copy all the notebooks from ETL_DBR into an ETL_DBR_SECOND folder.
In order to do this (see the sketch after the list):
- Create a new folder ETL_DBR_SECOND in the workspace,
- Export the ETL_DBR folder and all its content to the specified location, in this case a local PC hard drive: ./ETL_DBR_LOCAL_COPY,
- Import the previously exported folder into ETL_DBR_SECOND in the workspace.
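A sketch of these three steps using the legacy CLI syntax; the workspace paths are assumed to be at the workspace root:

```bash
# 1. Create the target folder in the workspace
databricks workspace mkdirs /ETL_DBR_SECOND

# 2. Export the whole ETL_DBR folder to a local directory
databricks workspace export_dir /ETL_DBR ./ETL_DBR_LOCAL_COPY

# 3. Import the local copy into the new workspace folder
databricks workspace import_dir ./ETL_DBR_LOCAL_COPY /ETL_DBR_SECOND
```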
Now we can go to the last example. We are going to work with clusters.
At the beginning it is a good idea to list all the clusters. We can do it in two ways: a brief listing or a more detailed one.
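For example:

```bash
# Brief listing: cluster ID, name, and state
databricks clusters list

# More detailed listing: full cluster metadata as JSON
databricks clusters list --output JSON
```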
Here are three basic commands: start, restart, and terminate a cluster.
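In the legacy CLI, terminating a cluster is done with the delete subcommand (the cluster can be started again later); the cluster ID below is a placeholder:

```bash
# Start a stopped cluster
databricks clusters start --cluster-id 1234-567890-abcde123

# Restart a running cluster
databricks clusters restart --cluster-id 1234-567890-abcde123

# Terminate the cluster
databricks clusters delete --cluster-id 1234-567890-abcde123
```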
And at the end, we will change a cluster's auto-termination time.
The new configuration is saved in a JSON file.
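A sketch of such a file; the edit endpoint expects the full cluster definition, so every field besides autotermination_minutes is a placeholder that should be copied from the existing cluster:

```json
{
  "cluster_id": "1234-567890-abcde123",
  "cluster_name": "etl-cluster",
  "spark_version": "13.3.x-scala2.12",
  "node_type_id": "i3.xlarge",
  "num_workers": 2,
  "autotermination_minutes": 30
}
```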
Let’s use this file through the CLI.
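Assuming the file above is saved as new_cluster_config.json (the name is only a placeholder):

```bash
# Apply the new configuration, including the auto-termination time, to the cluster
databricks clusters edit --json-file new_cluster_config.json
```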
Here is the code: Databricks_CLI_Github