Dealing with legacy code that contains ‘xrange’ in Python 2.7

Python 3.x has been around since 2008, but 2.7.x is still around and continues to be used in current development. In machine learning code, one of the most frequently used functions in loops is ‘xrange’. However, ‘xrange’ was replaced by ‘range’ in 3.x. Here is a good practice for writing code that’s compatible with both Python 2 and 3.x:

try:
    xrange
except NameError:
    xrange = range

For Python 2.7 die-hard fans switching to 3.x, you can define ‘xrange’ as follows:

def xrange(x):
    return iter(range(x))
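One caveat worth illustrating: wrapping ‘range’ in ‘iter’ produces a one-shot iterator and accepts only a single argument, while Python 2’s ‘xrange’ also took start, stop, and step and could be looped over more than once. The snippet below is a minimal sketch for Python 3 only; the variadic ‘xrange’ wrapper in it is a hypothetical variant, not something from the original code.

```python
# Sketch for Python 3 only. The variadic wrapper is a hypothetical variant
# that forwards start/stop/step to range() unchanged.
def xrange(*args):
    return range(*args)

# iter(range(...)) is exhausted after a single pass
one_shot = iter(range(3))
first_pass = list(one_shot)
second_pass = list(one_shot)

# a plain range object can be iterated as many times as needed
reusable = xrange(3)
pass_a = list(reusable)
pass_b = list(reusable)

# start/stop/step are forwarded as-is
stepped = list(xrange(1, 10, 3))
```

Here ‘second_pass’ ends up empty while ‘pass_b’ still holds all three values, which matters if the same loop variable is reused.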

Cheers.


Understanding blockchain with simple Python code

Everybody knows Bitcoin by now, but not everyone knows how blockchain technology works. A blockchain is like a distributed ledger: a consensus of replicated, shared, and synchronized digital data that is geographically spread across multiple sites, with no centralized data storage. This is different from both centralized and decentralized storage. See the illustration here:

blockchain

In other words, a blockchain is a public data store in which every new piece of data is stored in a ‘block’ container and appended to an immutable chain of past data. For bitcoin and other coins, the data is a series of transaction records, but the stored data can be anything. Blockchain technology is considered more secure and harder to hack because the computational resources required to rewrite the chain are enormous.

Here, I’ll show some simple Python code to demonstrate how a blockchain works:

The code structure, as shown in Eclipse, looks like this:

Screen Shot 2017-12-10 at 12.34.35 PM

The Python code is shown below:

Screen Shot 2017-12-10 at 12.35.57 PM

Screen Shot 2017-12-10 at 12.35.40 PM

Screen Shot 2017-12-10 at 12.35.49 PM

Let’s look at the blockchain created:

Screen Shot 2017-12-10 at 12.39.43 PM.png

As seen in the code, each block contains the hash of the previous block, which makes the blockchain hard to modify: changing one block changes its hash, so the link stored in the next block no longer matches. In practice, there are other restrictions that make each new block harder to generate. For example, you can require the hash of every new block to start with n zeros; the more leading zeros required, the harder it is to generate a new block. Because the chain is distributed, a new block is accepted as legitimate only when at least 51% of the public storage holders vote it ‘valid’.
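Since the original code is only available as screenshots, here is a minimal, self-contained sketch of the two ideas just described: each block stores the previous block’s hash, and a new block only counts if its hash starts with a given number of zeros. All names (‘block_hash’, ‘mine_block’, ‘build_chain’, ‘is_valid’), the use of SHA-256, and the choice of the block index as a stand-in timestamp are assumptions made for illustration, not the code from the screenshots.

```python
import hashlib
import json

def block_hash(index, timestamp, data, previous_hash, nonce):
    # Hash every field of the block, including the link to the previous block
    payload = json.dumps([index, timestamp, data, previous_hash, nonce])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def mine_block(index, timestamp, data, previous_hash, difficulty=2):
    # Try nonces until the hex digest starts with `difficulty` zeros
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_hash(index, timestamp, data, previous_hash, nonce)
        if digest.startswith(prefix):
            return {"index": index, "timestamp": timestamp, "data": data,
                    "previous_hash": previous_hash, "nonce": nonce,
                    "hash": digest}
        nonce += 1

def build_chain(records, difficulty=2):
    # Assumption: the block index doubles as a timestamp to keep this deterministic
    chain = [mine_block(0, 0, "genesis", "0" * 64, difficulty)]
    for i, record in enumerate(records, start=1):
        chain.append(mine_block(i, i, record, chain[-1]["hash"], difficulty))
    return chain

def is_valid(chain, difficulty=2):
    prefix = "0" * difficulty
    for i, block in enumerate(chain):
        recomputed = block_hash(block["index"], block["timestamp"],
                                block["data"], block["previous_hash"],
                                block["nonce"])
        # The stored hash must match the contents and meet the difficulty
        if recomputed != block["hash"] or not recomputed.startswith(prefix):
            return False
        # Every block after the first must point at its predecessor's hash
        if i > 0 and block["previous_hash"] != chain[i - 1]["hash"]:
            return False
    return True

chain = build_chain(["alice pays bob 1", "bob pays carol 2"])
chain_ok = is_valid(chain)

# Tampering with a stored record breaks validation
chain[1]["data"] = "alice pays bob 100"
tampered_ok = is_valid(chain)
```

Raising ‘difficulty’ by one multiplies the expected number of nonces to try by 16, because the digest is hexadecimal. The voting step among storage holders is not modeled here.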

 

 

How to clear all variables in the Python Spyder workspace

While doing data analysis, we sometimes want to clear everything in the current workspace to get a fresh environment, similar to MATLAB’s ‘clear all’ function. Here is what the function looks like (clear_all.py):

def clear_all():
    """Clears all the variables from the workspace of the Spyder application."""
    gl = globals().copy()
    for var in gl:
        # keep private names, functions, and modules
        if var[0] == '_': continue
        if 'func' in str(globals()[var]): continue
        if 'module' in str(globals()[var]): continue

        del globals()[var]

if __name__ == "__main__":
    clear_all()
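The string checks in ‘clear_all’ would also skip an ordinary variable whose text happens to contain ‘func’ or ‘module’. As a hedged alternative, the sketch below tests the object types instead and works on any dictionary passed in, which makes it easy to try out safely. The name ‘clear_namespace’ and the sample ‘workspace’ dictionary are hypothetical.

```python
import math
import types

def clear_namespace(namespace):
    """Delete user variables from `namespace` in place; return the removed names."""
    removed = []
    for name in list(namespace):  # copy the keys, since we delete while looping
        value = namespace[name]
        if name.startswith("_"):
            continue  # keep private and dunder names
        if callable(value):
            continue  # keep functions and classes
        if isinstance(value, types.ModuleType):
            continue  # keep imported modules
        del namespace[name]
        removed.append(name)
    return removed

# A stand-in workspace; "note" contains the text "func" but is plain data
workspace = {"x": 1, "note": "a func call", "_keep": 2,
             "math": math, "helper": len}
removed = clear_namespace(workspace)
```

To apply it to a real session, one would call ‘clear_namespace(globals())’ from the console.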

Converting local time to UTC and vice versa in Python

When dealing with global time series data, we often encounter data in different time zones. Here I’ll share the Python scripts that I created to address this issue:

  1. Converting from local to UTC

# e.g. local_to_utc(t.timetuple())

import time
import datetime

def local_to_utc(t_tuple):
    # interpret the tuple as local time and convert to seconds since the epoch
    secs = time.mktime(t_tuple)
    utcStruct = time.gmtime(secs)
    return datetime.datetime(*utcStruct[:6])

2. Converting from UTC to local time

# e.g. utc_to_local(t.timetuple())

import time
import calendar
import datetime

def utc_to_local(t_tuple):
    # interpret the tuple as UTC and convert to seconds since the epoch
    secs = calendar.timegm(t_tuple)
    localStruct = time.localtime(secs)
    return datetime.datetime(*localStruct[:6])
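The two functions above depend on the time zone configured on the machine that runs them. As a complementary sketch (a different technique, not the scripts above), Python 3’s timezone-aware ‘datetime’ objects carry their offset with them, so the conversion is explicit. The fixed UTC-05:00 offset and the names ‘LOCAL_TZ’, ‘aware_local_to_utc’, and ‘aware_utc_to_local’ are assumptions for illustration; a fixed offset ignores daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local zone: a fixed UTC-05:00 offset with no daylight saving
LOCAL_TZ = timezone(timedelta(hours=-5))

def aware_local_to_utc(dt_local):
    # dt_local must be timezone-aware (tzinfo set)
    return dt_local.astimezone(timezone.utc)

def aware_utc_to_local(dt_utc, tz=LOCAL_TZ):
    # dt_utc must be timezone-aware (tzinfo set)
    return dt_utc.astimezone(tz)

local_noon = datetime(2017, 12, 10, 12, 0, tzinfo=LOCAL_TZ)
as_utc = aware_local_to_utc(local_noon)
round_trip = aware_utc_to_local(as_utc)
```

With this offset, noon local time maps to 17:00 UTC, and converting back returns the original noon value.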

Pandas: ValueError: If using all scalar values, you must pass an index

Python users know how convenient it is to create a data frame from a dictionary. For example:

df = pd.DataFrame({'Key': ['a', 'b', 'c', 'd'], 'Value': [1, 2, 3, 4]})

It works beautifully when each value is a list or dict with multiple entries. However, you will run into the error “ValueError: If using all scalar values, you must pass an index” when you try to convert the following dictionary to a data frame:

dict_test = {
    'bacon': 'pig',
    'pulled pork': 'pig',
    'pastrami': 'cow',
    'honey ham': 'pig',
    'nova lox': 'salmon'
}

df = pd.DataFrame.from_dict(dict_test)

Why is that?

When pandas creates a data frame from a dictionary, it expects each value to be a list or dict. If you give it scalars, you also need to supply an index. In this example, the values are ‘pig’ instead of [‘pig’].

How to fix it:

  1. Change the data to:

dict_test = {
    'bacon': ['pig'],
    'pulled pork': ['pig'],
    'pastrami': ['cow'],
    'honey ham': ['pig'],
    'nova lox': ['salmon']
}

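2. Get the items from the dictionary as a list (the ‘list’ call is needed in Python 3.x) and pass them to the DataFrame constructor. Note that ‘from_dict’ does not accept ‘columns’ with its default orientation, so the plain constructor is used here.

pd.DataFrame(list(dict_test.items()), columns=['food', 'animal'])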

3. Specify the orientation with ‘index’.

pd.DataFrame.from_dict(dict_test, orient='index')

4. Use the Series constructor instead:

s = pd.Series(dict_test, name='animal')
s.index.name = 'Food'
df = pd.DataFrame(s)

Lorenz Attractor: A demo of the butterfly effect and the computational efficiency of implementing C code in R

The Lorenz attractor (by Clifford Alan Pickover) is an attractor that arises in a simplified system of equations describing the two-dimensional flow of a fluid of uniform depth, with an imposed temperature difference, under gravity, with buoyancy, thermal diffusivity, and kinematic viscosity. The full equations are:

1

where ψ is a stream function, defined such that the velocity components are U = (u, w).

In the early 1960s, Lorenz accidentally discovered the chaotic behavior of this system. One of his chaotic attractors is defined:


2

Lorenz found that periodic solutions grew for Rayleigh numbers larger than the critical value. Furthermore, vastly different results were obtained for very small changes in the initial values, representing one of the earliest discoveries of the so-called butterfly effect.

In the iterative calculation, the (n+1)-th position depends on the n-th position and the four parameters (a, b, c, d). Let’s do a simulation of 10 million iterations with (x0, y0) = (0, 0) and a = -1.24, b = -1.25, c = -1.81, d = -1.91.
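To make the iteration concrete before the timing comparison, here is a small pure-Python sketch of the update rule used by the C++ function in this post: x[n+1] = sin(a·y[n]) + c·cos(a·x[n]) and y[n+1] = sin(b·x[n]) + d·cos(b·y[n]). The function name ‘create_trajectory’ and the short run of 1,000 steps are choices made for illustration, and the parameter values are the ones quoted in the paragraph above.

```python
import math

def create_trajectory(n, x0, y0, a, b, c, d):
    """Return two lists with the first n points of the iteration."""
    xs = [0.0] * n
    ys = [0.0] * n
    xs[0], ys[0] = x0, y0
    for i in range(1, n):
        # each new point depends only on the previous point and (a, b, c, d)
        xs[i] = math.sin(a * ys[i - 1]) + c * math.cos(a * xs[i - 1])
        ys[i] = math.sin(b * xs[i - 1]) + d * math.cos(b * ys[i - 1])
    return xs, ys

# parameter values as quoted in the text
a, b, c, d = -1.24, -1.25, -1.81, -1.91
xs, ys = create_trajectory(1000, 0.0, 0.0, a, b, c, d)
```

Starting from (0, 0), the first step lands on (c, d), and every later point stays inside the box |x| ≤ 1 + |c|, |y| ≤ 1 + |d|, because sine and cosine are bounded by 1.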

R Code:

```{r lorenzor}
require(Rcpp)
require(ggplot2)
require(dplyr)

# define the theme
my.theme = theme(legend.position = 'none',
                 panel.background = element_rect(fill = 'black'),
                 axis.ticks = element_blank(),
                 panel.grid = element_blank(),
                 axis.title = element_blank(),
                 axis.text = element_blank())

# define cpp function
cppFunction('DataFrame createTrajectory(int n, double x0, double y0, double a, double b, double c, double d) {
  // create the columns
  NumericVector x(n);
  NumericVector y(n);
  x[0] = x0;
  y[0] = y0;
  for (int i = 1; i < n; ++i) {
    x[i] = sin(a * y[i-1]) + c * cos(a * x[i-1]);
    y[i] = sin(b * x[i-1]) + d * cos(b * y[i-1]);
  }
  // return a data frame
  return DataFrame::create(_["x"] = x, _["y"] = y);
}')

createTrajectoryR <- function(n, x0, y0, a, b, c, d) {
  # implementation in plain R
  x = rep(0, n + 1)
  y = rep(0, n + 1)
  x[1] = x0
  y[1] = y0
  for (i in seq(2, n + 1)) {
    x[i] <- sin(a * y[i-1]) + c * cos(a * x[i-1])
    # same update as the C++ version: the cosine argument uses b, not d
    y[i] <- sin(b * x[i-1]) + d * cos(b * y[i-1])
  }
  return(data.frame(x = x, y = y))
}

# initial parameters for the dynamic system
a = -1.24
b = -1.25
c = 1.81
d = 1.91

system.time(df_C <- createTrajectory(10000000, 0, 0, a, b, c, d))

system.time(df_R <- createTrajectoryR(10000000, 0, 0, a, b, c, d))

#png("./lorenzor_attractor.png", units = 'px', width = 1600, height = 1600, res = 300)
# plot results from C
ggplot(df_C, aes(x, y)) + geom_point(color = 'white', shape = 46, alpha = 0.1) + my.theme
# plot results from R
ggplot(df_R, aes(x, y)) + geom_point(color = 'white', shape = 46, alpha = 0.1) + my.theme
#dev.off()
```

End of R code


Runtime comparison

3

The R runtime is more than 5 times that of the C code.

What does the Lorenz Attractor system look like?


4

Butterfly effect:

By slightly changing the parameters, you’ll get a vastly different solution.

5

ref: http://mathworld.wolfram.com/LorenzAttractor.html

Having fun with colormap (jet)

One of my favorite colormaps in MATLAB is ‘jet’. How do we duplicate this colormap in other languages?

cm = linspace(Color.HSV(0,1,1), Color.HSV(330,1,1),64)

For HSV, this is just a linspace in H (0 to 330).

colormap_hsv

ReverseHsvColormapExample_01

cm = RGB{Float64}[
    RGB(
        clamp(min(4x - 1.5, -4x + 4.5), 0.0, 1.0),
        clamp(min(4x - 0.5, -4x + 3.5), 0.0, 1.0),
        clamp(min(4x + 0.5, -4x + 2.5), 0.0, 1.0))
    for x in linspace(0.0, 1.0, 64)]
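For readers who want the same piecewise-linear formula outside Julia, the following is a minimal Python translation of the three clamped expressions above. The helper names ‘clamp’, ‘jet’, and ‘jet_colormap’ are hypothetical, and the result is a plain list of (r, g, b) tuples rather than a color type from any plotting library.

```python
def clamp(value, low=0.0, high=1.0):
    # restrict value to the interval [low, high]
    return max(low, min(high, value))

def jet(x):
    """Map x in [0, 1] to an (r, g, b) tuple using the formula above."""
    r = clamp(min(4 * x - 1.5, -4 * x + 4.5))
    g = clamp(min(4 * x - 0.5, -4 * x + 3.5))
    b = clamp(min(4 * x + 0.5, -4 * x + 2.5))
    return (r, g, b)

def jet_colormap(n=64):
    # n evenly spaced samples from 0 to 1, like linspace(0.0, 1.0, n)
    return [jet(i / (n - 1)) for i in range(n)]

cm = jet_colormap(64)
```

The map runs from dark blue (0, 0, 0.5) at x = 0 through a light green (0.5, 1, 0.5) at x = 0.5 to dark red (0.5, 0, 0) at x = 1.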

colormap_jet

ReverseJetColormapExample_01

Ref: http://cresspahl.blogspot.de/2012/03/expanded-control-of-octaves-colormap.html