Microsoft SQL Server¶
Connecting to Microsoft SQL Server databases from the command line¶
Microsoft provides the sqlcmd tool for connecting the SQL Server databases from the command line. You need to install it on your Faculty server first:
$ curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
$ curl https://packages.microsoft.com/config/ubuntu/18.04/prod.list | sudo tee /etc/apt/sources.list.d/msprod.list
$ sudo apt-get update
$ sudo apt-get install mssql-tools unixodbc-dev
The last step above will ask you to accept some license agreements. The command
line tool gets installed to /opt/mssql-tools/bin/sqlcmd
.
To connect to a SQL Server database, you need to know:
its server: this is usually a string like
customers.mydomain.com
;its port: the default port for SQL Server databases is 1433;
the name of the database that you want to connect to. If this is a new server, the only database on it is likely to be named
master
;your username and password for the database server (note that this is different to your Faculty username). If you are unsure of these, you should ask your database administrator.
You can then connect to the database with:
$ /opt/mssql-tools/bin/sqlcmd -S SERVER,PORT -U USERNAME -d DATABASE_NAME
You should then be prompted for a password. If the port is the default 1433,
you may omit the ,PORT
part of the server argument.
For instance, I can connect to database demos
running on host
customer-database.example.com
with username admin
:

If you get stuck, run /opt/mssql-tools/bin/sqlcmd -? | less
in a terminal
to view a list of all available options.
Connecting to Microsoft SQL Server databases from Python¶
To connect to Microsoft SQL Server databases from Python, you can use the Python module pymssql. You need to install it on your Faculty server first:
$ conda install pymssql --channel conda-forge
Note
We recommend installing pymssql from the ‘conda-forge’ conda channel rather from pip or the standard conda repositories to ensure you get all the required dependencies.
import pymssql
connection = pymssql.connect(
server='customers.mydomain.com', # host on which the database is running
database='database_name', # name of the database to connect to
user='username', # username to connect with
password='password' # password associated with your username
)
cursor = connection.cursor()
cursor.execute('SELECT * FROM customers')
customers = list(cursor.fetchall())
print('We have {} customers'.format(len(customers))) # This is data science!
connection.close()
Note
We close the connection to allow the database server to reclaim resources. This can be critical in a Jupyter notebook, since the kernel remains alive for a long time.
The pymssql connection object can also be used in a with
statement to
ensure it gets closed automatically:
import pymssql
with pymssql.connect(
server='customers.mydomain.com', database='database_name',
user='username', password='password'
) as connection:
cursor = connection.cursor()
cursor.execute('SELECT * FROM customers')
customers = list(cursor.fetchall())
print('We have {} customers'.format(len(customers)))
You can also use the read_sql()
function in pandas to read the result of a
query directly into a DataFrame:
import pymssql
import pandas
with pymssql.connect(
server='customers.mydomain.com', database='database_name',
user='username', password='password'
) as connection:
df = pandas.read_sql('SELECT * FROM customers', connection)
print(df)
Note
We recommend avoiding pasting database passwords and other connection details in many notebooks in a project. Have a look at Factoring connection details into a package for a recommended strategy for managing database connection details.