Automate tabular financial datatable into vectorized sequential data

A lot of times, we receive time-related data in a table format and we want convert it into a simple data format with one column of datetime and the other as value. See this sample table:1

Now we want to convert this dataset into another format which can be easier to visulize and convert to other data structure like xts or timeSeries object. The converted data will be like:

2

Let’s look at a sample unemployment rate from Depart of labor.

sampleData <- read.csv(‘table_date.csv’)
sampleData

3

Method in R, there are two common ways to do it, first:

tableDataFlat <- as.vector(t(sampleData[1:nrow(sampleData),2:ncol(sampleData)]))
dates <- seq.Date(as.Date(‘2005-01-01’),as.Date(‘2017-12-01′),’month’)
newTS <- data.frame(dates=dates,value=tableDataFlat)
head(newTS)

4

the second way in R:

tableDataFlat <- c(t(as.matrix(sampleData[1:nrow(sampleData),2:ncol(sampleData)])))
newTS <- data.frame(dates=dates,value=tableDataFlat)
head(newTS)

Now we can do visualization and analysis more conveniently.

plot(newTS)

5

Method in Python:

In python, it is even more simple. Flatten the data matrix by using:

import numpy as np
import pandas as pd
df = pd.read_csv(‘table_date.csv’)
data = df.values
data_flat = data.flatten()
dates = pd.date_range(start = ‘2005-01-01’, end = ‘2017-12-01′,freq=’M’)
new_df = pd.Dataframe({date:dates,value:data_flat})
new_df.set_index(date)

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s