SQL Server 2019 - Connecting to Databricks Cluster using SSMS or Azure Data Studio RRS feed

  • Question

  • Has anyone used SSMS v18.2 or Azure Data Studio to connect to a DataBricks Cluster and so query on DataBricks tables and/or the DataBricks File System (dbfs)?

    Would like to know how you can set this up to show a DataBricks server in connections and use PolyBase to connect to dbfs

    I can connect to ADLS using the PolyBase commands like as follows:

    -- Scoped Credential
        IDENTITY = '<MyId>@<Id2>/oauth2/token',
        SECRET = '<MySecret>';
    -- External Data Source
        TYPE = HADOOP,
        LOCATION = 'adl://',
        CREDENTIAL = myScopedCredential
    Something similar to setup for dbfs?
    What IDENTITY used for Scoped Credential?

    Tuesday, August 20, 2019 1:23 PM

All replies

  • Hi Ernest,

    There are some steps to enable user access via token based authentication for Databricks: Enable Token-based Authentication (link) but, if I understand your question you are attempting to access dbfs with SSMS or Azure Data Studio? Neither client application will allow you to connect directly with Azure Databricks. You can connect an SQL Database or Data Warehouse instance to Databricks and query the dbfs from either client application while connected to Azure SQL Database or Data Warehouse.

    Other methods for connection:

    Connecting to Microsoft SQL Server and Azure SQL Database with the Spark Connector (link)

    Connecting to SQL Databases using JDBC (link)

    Please let me know if I have not understood your question correctly, or if yo have additional questions.



    Wednesday, August 21, 2019 5:56 PM
  • Hi,

    Adding to what Mike has already highlighted, there can be few more solution to it

    1. Mount blob storage to dbfs: Mounting blob storage to DBFS allows you to access objects in blob storage as if they were on the local file system.

    2. You can access DBFS objects with some more options, using the Databricks CLI, DBFS API, Databricks file system utilities (dbutils.fs), Spark APIs, and local file APIs. On your local computer you access DBFS objects using the Databricks CLI or DBFS API. In a Spark cluster you access DBFS objects using Databricks Utilities, Spark APIs, or local file APIs.

    3. If you are only interesting to query from SSMS then move this data to Sql server after step 1 or from other tools (i.e. Azure databricks/ADF)

    4. Write you own custom code in c# or python or any other programming method to connect to dbfs and get the data.


    Thursday, August 22, 2019 4:06 AM