A Guide to the Databricks Command-Line Interface (CLI)


The Databricks Command Line Interface (CLI) is one of the handiest tools to have at your side whenever you are working with Databricks notebooks. It is an easy-to-use interface to the Databricks platform, built on top of the REST API 2.0.
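To see what "built on top of the REST API 2.0" means in practice, here is a rough sketch of the request a DBFS `ls`-style command boils down to. The endpoint path and Bearer-token header follow the documented REST API 2.0 conventions, but the helper name, host, and token below are hypothetical placeholders of my own:

```python
# Sketch of what the CLI does under the hood: a DBFS "list" call against
# the Databricks REST API 2.0. build_dbfs_list_request is a hypothetical
# helper; the /api/2.0/dbfs/list endpoint and Bearer header are the
# documented REST API 2.0 conventions.

def build_dbfs_list_request(host: str, token: str, path: str):
    """Return the URL, headers, and query params for a dbfs/list call."""
    url = f"{host.rstrip('/')}/api/2.0/dbfs/list"
    headers = {"Authorization": f"Bearer {token}"}
    params = {"path": path}
    return url, headers, params

url, headers, params = build_dbfs_list_request(
    "https://adb-1234567890123456.7.azuredatabricks.net",  # placeholder host
    "dapi-example-token",  # placeholder; never hard-code real tokens
    "/my-temp",
)
# A real call would then be: requests.get(url, headers=headers, params=params)
```

The CLI saves you from writing this plumbing yourself for every operation.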

To Get Started With Databricks CLI

Installing the DBCLI

To install the CLI, you need the following prerequisites on your local machine:

  • Python 2 (version 2.7.9 and above) or Python 3 (version 3.6 and above)
  • pip (the Python package installer)
# To install the CLI
pip install databricks-cli

# To update the CLI
pip install databricks-cli --upgrade

# To check the version of your CLI
databricks --version

Setting Up the Authentication

Now, in order to access your resources in the cloud, you need to store your Databricks credentials on your local machine in the file ~/.databrickscfg. You can set them up using the following command:

# This command will prompt you for the host name and token
databricks configure --token

# Output of the above command
Databricks Host (should begin with https://):

Instead of a token, you can also use the username and password of your Databricks account, but this is not recommended by Databricks. CLI versions 0.8.0 and above support the following environment variables:

  • DATABRICKS_HOST
  • DATABRICKS_USERNAME
  • DATABRICKS_PASSWORD
  • DATABRICKS_TOKEN
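For example, token-based authentication can be configured entirely through environment variables. The workspace URL and token below are placeholders, not real credentials:

```shell
# Placeholder values -- substitute your own workspace URL and token
export DATABRICKS_HOST="https://adb-1234567890123456.7.azuredatabricks.net"
export DATABRICKS_TOKEN="dapi-example-token"
```

When these are set, the CLI can authenticate without a ~/.databrickscfg file, which is handy in CI pipelines.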

Following is an example of using the CLI with a named connection profile:

databricks --profile AZURE fs cp -r <src> <destination>
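The AZURE profile used above comes from ~/.databrickscfg, which can hold several named profiles alongside the default one. A minimal sketch of such a file, with placeholder hosts and tokens:

```
[DEFAULT]
host = https://adb-1111111111111111.1.azuredatabricks.net
token = dapi-example-default-token

[AZURE]
host = https://adb-2222222222222222.2.azuredatabricks.net
token = dapi-example-azure-token

[AWS]
host = https://my-workspace.cloud.databricks.com
token = dapi-example-aws-token
```

Commands without --profile use the [DEFAULT] section; --profile AZURE or --profile AWS switches to the matching section.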

Useful Databricks CLI Commands

There are many command groups a user can work with, such as the Cluster Policies CLI, Clusters CLI, Tokens CLI, DBFS CLI, and more, but here I'll talk about the most commonly used commands, which can be really helpful. First up:

DBFS CLI

The Databricks File System (DBFS) CLI is used for performing basic file operations like move, delete, and copy.

However, you need to be careful when using the DBFS CLI for operations involving more than 10,000 files, as these can lead to timeouts.

The alternative in such situations is the file system utility (dbutils.fs), which is available inside Databricks notebooks.

  • Printing the Contents of a File
databricks --profile AWS fs cat dbfs:/my-temp/aTextFile.txt

# Output
This is the DB CLI Blog
  • Copying a File
databricks --profile AZURE fs cp dbfs:/my-temp/aTextFile.txt /home/ravi/myNas
  • Creating a Directory
databricks fs mkdirs dbfs:/tmp/new-dir
  • Deleting a File
databricks fs rm dbfs:/my-temp/aTextFile.txt

Conclusion

Thanks for making it to the end of the blog. If you want to learn more about the Databricks CLI, you can refer to the official documentation via the link mentioned below.

If you like my writing, feel free to check out more of my blogs by clicking here.

Reference

https://docs.databricks.com/getting-started/overview.html


Written by 

Hi community, I am Raviyanshu from Dehradun, a tech enthusiast trying to make something useful with the help of 0s and 1s.