Automate tabular financial datatable into vectorized sequential data

A lot of times, we receive time-related data in a table format and we want convert it into a simple data format with one column of datetime and the other as value. See this sample table:1

Now we want to convert this dataset into another format which can be easier to visulize and convert to other data structure like xts or timeSeries object. The converted data will be like:

2

Let’s look at a sample unemployment rate from Depart of labor.

sampleData <- read.csv(‘table_date.csv’)
sampleData

3

Method in R, there are two common ways to do it, first:

tableDataFlat <- as.vector(t(sampleData[1:nrow(sampleData),2:ncol(sampleData)]))
dates <- seq.Date(as.Date(‘2005-01-01’),as.Date(‘2017-12-01′),’month’)
newTS <- data.frame(dates=dates,value=tableDataFlat)
head(newTS)

4

the second way in R:

tableDataFlat <- c(t(as.matrix(sampleData[1:nrow(sampleData),2:ncol(sampleData)])))
newTS <- data.frame(dates=dates,value=tableDataFlat)
head(newTS)

Now we can do visualization and analysis more conveniently.

plot(newTS)

5

Method in Python:

In python, it is even more simple. Flatten the data matrix by using:

import numpy as np
import pandas as pd
df = pd.read_csv(‘table_date.csv’)
data = df.values
data_flat = data.flatten()
dates = pd.date_range(start = ‘2005-01-01’, end = ‘2017-12-01′,freq=’M’)
new_df = pd.Dataframe({date:dates,value:data_flat})
new_df.set_index(date)