Data for finance and portfolio optimization

We provide a dataset for portfolio optimization and other finance applications. It covers 11 years, from January 2006 to December 2016, and comprises 52 popular exchange-traded funds (ETFs) plus the US central bank (Federal Reserve) rate of return (here is the list of assets).

The data consists of three data files: tables of prices, volumes, and returns.

For convenience, we provide scripts in Python, Julia, and Matlab that load the data and define

  • an assets array of dimension n

  • a dates array of dimension T

  • a prices matrix of dimension T by n, in dollars

  • a volumes matrix of dimension T by n, in dollars

  • a returns matrix of dimension T by n (dimensionless).

To run these scripts, download the three data files above, and place them in the same directory as the script.
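
As a rough illustration of what such a loading script does, here is a minimal Python sketch. It is not the provided script. It assumes each data file is a CSV table whose first column holds the date and whose header row holds the asset names; the actual file layout may differ. The function name load_table and the in-memory sample (with made-up numbers) are hypothetical and used only for illustration.

```python
import csv
import io

def load_table(source):
    """Read one table: a header row of asset names, then one row per date.

    Assumption: the first column holds the date and all remaining
    columns hold numeric values, one column per asset.
    """
    reader = csv.reader(source)
    header = next(reader)
    assets = header[1:]            # skip the (assumed) date column
    dates, values = [], []
    for row in reader:
        if not row:                # ignore blank lines
            continue
        dates.append(row[0])
        values.append([float(x) for x in row[1:]])
    return assets, dates, values

# Tiny in-memory stand-in for one data file (made-up numbers).
sample = io.StringIO(
    "Date,SPY,USDOLLAR\n"
    "2006-01-03,100.0,1.0\n"
    "2006-01-04,101.0,1.0\n"
)
assets, dates, prices = load_table(sample)
T, n = len(dates), len(assets)     # prices is a T by n list of lists
```

With a real file, the same function can be called on an open file object, for example `with open(path, newline="") as f: load_table(f)`, where `path` points to one of the downloaded files.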

Asset risk and return

The following plot shows the average return and standard deviation, annualized, for each asset in the dataset.
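
The quantities in such a plot can be computed from the returns matrix along the following lines. This is a sketch under stated assumptions, not necessarily the convention used for the plot: it assumes the rows are daily returns, takes 250 trading days per year, and scales the mean by 250 and the standard deviation by the square root of 250. The helper names and the small returns matrix (made-up numbers) are hypothetical.

```python
import math
import statistics

TRADING_DAYS = 250  # assumed number of trading days per year

def column(matrix, j):
    """Return column j of a list-of-rows matrix."""
    return [row[j] for row in matrix]

def annualized_stats(daily_returns):
    """Annualized mean and standard deviation of one asset's daily returns."""
    mean = statistics.fmean(daily_returns) * TRADING_DAYS
    std = statistics.pstdev(daily_returns) * math.sqrt(TRADING_DAYS)
    return mean, std

# Made-up T by n returns matrix (T = 4 days, n = 2 assets).
returns = [
    [0.01, 0.0],
    [-0.01, 0.0],
    [0.02, 0.0],
    [0.0, 0.0],
]
stats = [annualized_stats(column(returns, j)) for j in range(len(returns[0]))]
mean0, std0 = stats[0]
```

Each entry of `stats` is one point (risk, return) in a scatter plot of the assets.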


Cumulative value of sample portfolios

The plot below shows the value over time of sample portfolios (initial value $10,000) for the period spanned by the dataset. SPY is an ETF that tracks the S&P 500 stock market index. The 1/n portfolio puts equal weight on all assets; the plot does not include the transaction costs of rebalancing. Finally, USDOLLAR is the US central bank rate of return.
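
A value curve of this kind can be sketched by compounding per-period returns. The Python sketch below assumes simple (not log) returns and assumes the 1/n portfolio is rebalanced to equal weights every period, so that its return in each period is the average of the asset returns; transaction costs are ignored. The function names and the small returns matrix (made-up numbers) are hypothetical.

```python
def cumulative_value(period_returns, initial=10000.0):
    """Compound a sequence of simple returns, starting from `initial`."""
    values = [initial]
    for r in period_returns:
        values.append(values[-1] * (1.0 + r))
    return values

def equal_weight_returns(returns):
    """Per-period return of a 1/n portfolio rebalanced every period."""
    return [sum(row) / len(row) for row in returns]

# Made-up T by n returns matrix (T = 2 periods, n = 2 assets).
returns = [
    [0.10, 0.00],
    [-0.10, 0.10],
]
single_asset = cumulative_value([row[0] for row in returns])
one_over_n = cumulative_value(equal_weight_returns(returns))
```

Both lists have T + 1 entries: the initial value followed by the portfolio value after each period.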


Experts zone

Here are the Python scripts we used to download the data, in combination with this list of assets. You can modify the dates and list of assets to generate your own data. We used this other script to make the interactive plots.