## Data for finance and portfolio optimization
We provide a dataset for portfolio optimization and other finance applications. It covers 11 years, from January 2006 to December 2016, and comprises 52 popular exchange-traded funds (ETFs) and the US central bank (FED) rate of return (here is the list of assets). The data consists of three data files, tables of daily market returns (computed from the close prices).
For convenience we provide scripts in Python, Julia, and Matlab that load the data and define:

- an `assets` array of dimension `n`
- a `dates` array of dimension `T`
- a `prices` matrix of dimension `T` by `n`, in dollars
- a `volumes` matrix of dimension `T` by `n`, in dollars
- a `returns` matrix of dimension `T` by `n` (non-dimensional).

To run these scripts, download the three data files above and place them in the same directory as the script.

## Asset risk and return

The following plot shows the average return and standard deviation, annualized, for each asset in the dataset.
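
As an illustration of how annualized figures like these can be derived from the daily `returns` matrix, here is a minimal sketch. It assumes 252 trading days per year and simple arithmetic scaling (mean times 252, standard deviation times the square root of 252). The source does not state which convention was used for the plot, and the function names are hypothetical.

```python
import math
import statistics

# Assumed convention: 252 trading days per year (not stated in the source).
TRADING_DAYS_PER_YEAR = 252


def annualized_stats(daily_returns, periods_per_year=TRADING_DAYS_PER_YEAR):
    """Return (annualized mean, annualized std) for one asset's daily returns."""
    mean_daily = statistics.fmean(daily_returns)
    std_daily = statistics.stdev(daily_returns)  # sample standard deviation
    return (
        mean_daily * periods_per_year,
        std_daily * math.sqrt(periods_per_year),
    )


def annualized_stats_by_asset(returns, assets):
    """Compute annualized stats per asset.

    returns: T rows, each holding n daily returns (the T-by-n layout above).
    assets:  n asset names, in column order.
    """
    columns = zip(*returns)  # transpose: one column of T returns per asset
    return {name: annualized_stats(col) for name, col in zip(assets, columns)}
```

Calling `annualized_stats_by_asset(returns, assets)` on the loaded data would give one (return, risk) pair per asset, which is what a risk-return scatter plot needs.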
## Cumulative value of sample portfolios

The plot below shows the value over time of sample portfolios (initial value $10,000) for the period spanned by the dataset. SPY is an ETF that tracks the S&P 500 stock market index. The 1/n portfolio puts equal weight on all assets (the plot does not include the transaction costs due to rebalancing). Finally, USDOLLAR is the US central bank rate of return.
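
The value paths of such portfolios can be sketched from the `returns` matrix as follows. This is an illustrative sketch only, not the script used for the plot. It assumes that `returns` holds simple (not logarithmic) daily returns, that the portfolio is rebalanced to its target weights every day, and that transaction costs are ignored. The function names are hypothetical.

```python
def portfolio_value(returns, weights, initial_value=10_000.0):
    """Value path of a portfolio rebalanced to `weights` every day.

    returns: T rows, each holding n simple daily returns.
    weights: n portfolio weights that sum to 1.
    Transaction costs are ignored (an assumption of this sketch).
    """
    values = []
    value = initial_value
    for row in returns:
        # Portfolio return for the day is the weighted sum of asset returns.
        daily_return = sum(w * r for w, r in zip(weights, row))
        value *= 1.0 + daily_return
        values.append(value)
    return values


def equal_weights(n):
    """Weights of the 1/n portfolio."""
    return [1.0 / n] * n


def single_asset_weights(assets, name):
    """Weights that put everything on one asset, e.g. 'SPY'."""
    return [1.0 if asset == name else 0.0 for asset in assets]
```

For example, `portfolio_value(returns, equal_weights(len(assets)))` gives the 1/n path, and `portfolio_value(returns, single_asset_weights(assets, "SPY"))` gives the path of holding SPY alone.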
## Experts zone

Here are the Python scripts we used to download the data, in combination with this list of assets. You can modify the dates and list of assets to generate your own data. We used this other script to make the interactive plots.