Access Amazon Redshift Database from Python

Amazon has definitely made significant gain from the cloud movement in the past decade as more and more company are ditching their own data server in favor of theirs. There is a very good reason to do that. Cheaper, faster and easy access from anywhere.

Now how do we retrieve data in Redshift and do data analysis from Python. It is very simple to do that. The information that you’ll need ahead is: usename, password, url to redshift and port number (default is 5439).

I’ll show you how to connect to Amazon Redshift using psycopg2 library. First install library ‘psycopg2’ using : pip install psycopg2.

Then use the following Python code to define your connections.

def create_conn(*args, **kwargs):

import psycopg2
config = kwargs[‘config’]
try:
con = psycopg2.connect(dbname = config[‘dbname’], host=config[‘host’],
port = config[‘port’], user=config[‘user’],
password=config[‘pwd’])
return con
except Exception as err:
print(err)

keyword= getpass(‘password’)   # type in password  or you can read from a saved json file.

config = {‘dbname’: ‘lake_one’,
‘user’:’username’,
‘pwd’:’keyword’,
‘host’:'[your host url].redshift.amazonaws.com’,
‘port’:5439}

How to use this and do fun stuff:

con = create_conn(config = config)
data = read_sql(“select *  from mydatabase.tablename;”,con, index_col = ‘date’)
print(data.columns)
data.plot(title=’My data’, grid = True,figsize=(15,5))

con.close() # close connection

Simple as that. Now it’s your turn to create fantastic analysis.

Advertisements