In light of the government shutdown and the everlasting budget crisis, I've been doing some reading about healthcare costs. Some interesting tidbits from the literature: RAND conducted a largescale social experiment where they provided individuals with one of two health insurance plans: 1) a free plan with no outofpocket costs, or 2) a 95% coinsurance rate (with an annual maximum outofpocket cost of $1000). They found that outpatient expenses were ~67% higher on the free plan as compared to the 95% plan, and the difference was attributed to the number of visits rather than the cost of each visit. Thus, cost sharing reduces a significant amount of moral hazard; they estimate that the societal welfare losses incurred from moving to the free plan from the 95% plan was ~$37  60 billion in 1984 as against an expenditure of ~$200 billion on these services for the under65 population. See their report here. On the other hand, it's important to keep in mind that costsharing creates adverse incentives: poor people may avoid routine checkups or tests in order to save on copay, causing their health to deteriorate and possibly incurring higher healthcare costs later. The literature on this topic is large, but the answers are unsatisfactory. To arrive at a reasonable answer, one needs to characterize the value of these extra visits in terms of improved health outcomes for the patient. On a related note, we can also look at cutting costs while providing the same care. For instance, Horrock et al. provide some experimental evidence that nurses may provide better quality and more satisfying care relative to doctors. Their estimates find that the resulting health outcomes are similar. Of course, nurses require less training and are cheaper than doctors, suggesting that savings can be made by moving towards nurseprovided care for routine physician visits.
Hello world! It's been a long time since I updated this old thing. As you may know, I'm spending my summer interning at eBay Search Science and planning my upcoming wedding (T  12 days!). I'm not allowed to talk about my work at eBay, but the experience has exposed me to some wellknown problems in search and rec systems that I can talk about. In general, auction search is a much harder problem than regular search since every item is duplicated like crazy (there are hundreds of thousands of iPhones on sale at eBay at any given time, and each one is slightly different from the next by current bid price, auction ending time, seller rating, etc.) Moreover, items don't last long enough to build a decent click history on which to do recommendations. Item titles vary wildly by seller (recently I saw an item named "flip flops boots sandals pumps" which was actually a pair of wedge boots  I've no idea how any textual search engine could actually infer what the seller was selling) thereby further aggravating the problem. This of course makes adding diversity of search results a tricky matter. The explorationexploitation tradeoff in rec systems is common knowledge these days. There are a ton of papers discussing multiarmed bandit approaches (see this paper by Yash Deshpande and Andrea Montanari for a RL approach). Diversity of search results has important revenue ramifications for the recommendation system (see this paper which looks at the problem from a revenuemaximizing point of view for the firm). However, it has even more important effects on products. For example, a restaurant that gets an initially low rating on Yelp may shut down permanently (although interestingly this turns out to be not such a bad problem on Yelp due to the 'warm start' effect: see here). An initially unpopular item on Amazon may never become popular. A lessaddressed problem in the literature is: how much are we as consumers losing out because of the lack of diversity in search results? As discussed here, recommendation systems can increase individuallevel diversity while decreasing aggregate diversity. That is, a recommender's bias towards popular items can actually worsen search results and lower net social utility. The obvious solution is to of course add a small amount of randomness to the search results. But naively proposing this solution to any search engine will likely get you laughed at (from personal experience). No search company wants to waste valuable impressions that could result in added revenue, and they have no real financial incentive to do so since the adverse effects are directly targeted to the item manufacturers and consumers. Moreover, the amount of added randomness you need to get significant improvements is rather large (I refer you back to any multiarmed bandit paper on this topic). So here's my question to you: how much do item manufacturers and consumers lose due to the search engine's design decisions, and should they be making some financial contributions to incentivize the search engine to add randomness and thereby increase net social utility?
Well here we are, with just two weeks left in the winter quarter. It's hard to believe that it's been almost 3/4 of a year since I started my PhD! I've learned so much over the last few quarters  it's almost painful to think back to when I first started working with Amin over the summer, and couldn't understand what a 'model' was. Reflecting back, the main thing I wanted to get out of my first year were: (1) exploring many different kinds of projects, and (2) finding an advisor. I think (1) has worked out pretty well so far  I've had the chance to play around with some largescale datasets (AERS, Venture Lab), work on something more theoretical (LASSO), do something hacky (zeroshot learning), and I've started looking at more econCS type stuff (mechanism design) for next quarter. When I came here in June, I was awed by all the different kinds of research in machine learning / data mining / convex optimization / statistics / networks  I was confused about what to do and felt like I had to try all these different things out. But now, I'm quite happy with the breadth of things I've explored, and I'm starting to feel like it's time to settle down (just in time too)! As for (2), I think I will likely align with Mohsen (if he hasn't changed his mind since the last time we talked about this). I think I've been quite lucky on this front: I don't have any of the complaints or issues with him that other grad students seem to have with the prototypical advisor. Mohsen spends significant time meeting with me, shows quite a bit of concern about my progress and happiness with my work, listens to my notsogreat ideas and works them out with me, nonjudgmentally explains simple things to me when I don't understand them, and is usually willing to work on anything that I'm interested in. I'll end this post with something less philosophical and more researchrelated. Tibshirani gave a very interesting talk about his latest work with significance estimates for LASSO feature selection. While exciting, this work leaves quite a lot of open questions. First, they can only compute significance for the last feature that is added to the model for any given shrinkage value. Second, their estimates are algorithmdependent (they use LARS) which feels quite unsatisfying. Third, it's not quite clear to me how these results will extend to other algorithms like ElasticNet, although they claim that they do. Apparently this is a big problem in the statistics community today, and Andrea has also had some results along these lines.
I'm happy to say that this has been a fruitful week! Some of the ideas that Mohsen and I discussed for LASSO with pairwise correlations are looking promising (well, at least compared to last week). There's also a recent paper with similar ideas, so I'm hopeful that we're going in the right direction. Meanwhile, our first meeting with Ramesh about healthcare mechanism design research will be this week, so I spent some time this weekend reading about the topic. Apparently this project is still quite young and in stealth mode, so unfortunately, I don't think I can describe it in detail here. In other news, after I explained my frustrations with the AERS data to Andrea, he came up with a brilliant idea for me to try instead  we're going to try applying collaborative filtering to remove biases in the drugADE predictions. I remember being quite unsatisfied with the current algorithm, and I think this is really the right approach. I haven't actually tried it out yet, but hopefully, I'll have some results to share on this front soon! I've also been looking more into inverse and adversarial RL. For the inverse RL front, I think there really is promise in trying to incorporate suboptimal experts into the noregret learning framework. As for adversarial RL, I think I'm going to try my algorithms on a Pacman game!! Ever since I talked to Vivek about his RL Tetris game, I've wanted to build a game myself to test out some strategies. Well, it turns out that Pieter Abbeel runs a reinforcement learning class at Berkeley where he has already written some publiclyavailable base code for testing RL algorithms on Pacman. Lucky me! :)
This was one of those weeks in research where everything I tried seemed to reel back and laugh at me. I hope this spell doesn't continue! First, I thought I had a great idea for improving LASSO by weighting standardized features with correlations with the label. This seemed great in the limit of orthonormal features because the correlation essentially returns the weight. However, after extensive simulations and reading Candes' paper, I have convinced myself that this in fact a bad idea except in certain special regimes since features with small correlations are muted  I won't go into details but I just wanted to let people know in case someone else decides to try this. I also tried to resolve my results from last week with Cami et al. for my AERS data project. Unfortunately, after some tedious data cleaning and experimentation, I'm still quite puzzled as to how they can predict drugADE correlations with an AUROC of 0.9 using just the product of the degree of two nodes. This project has been particularly frustrating since every paper uses different sources of data (some freely available and some not) and then proceeds to select different subsets of these datasets for their particular approach, making it very difficult to compare results. For instance, even though Cami et al. are only performing a very simple univariate logistic regression, they use a payperuse dataset derived from a public dataset, and it's difficult to infer what preprocessing has been done inbetween. After an extensive discussion with Mohsen about my failure with weighted LASSO, we decided to start looking in new directions. Some topics of interest are: 1) Incentive mechanism design for the healthcare project. This is something Mohsen and Ramesh work together on, and I'm all set to hear more about it in a Skype meeting with them next week. Mechanism design is quite the buzz after the Economics Nobel Prize went to Al Roth this year. 2) Adversarial RL. See here. What do you do when your future training set depends on the actions of your adversary? How do you train an algorithm on data that can be gamed? This sort of stuff has applications in ad auctions for instance. To get started, I decided to write a literature survey on the topic for my RL class final project. 3) Robust optimization. I talked to Dan Iancu about a possible project involving dynamic programming and robust optimization. Of course, Mohsen had some ideas on this topic as well. (I am continually amazed by how he has ideas on everything! Someday I hope to spew research ideas on every random research topic as well.) I don't want to go into much more detail about each of these things here, although I'm sure I'll write more about it if and when I work on them. For now, I'm not quite done with LASSO and I have a few other ideas that I want to try. Also, our ICML paper has been submitted finally, although I regret that I didn't spend very much time helping edit the paper in it's final moments. Here's to hoping it gets accepted!
I haven't updated in a while so here's a summary of things I've been thinking about in the last week: 1) Mohsen and I tested Elastic Net, which produces significant improvements over LASSO on correlated data  unfortunately it also outperforms the algorithms we have come up with so far. Elastic Net is basically a hack that interpolates between L1 and L2 regularization, so it's surprising that it does so well. Our next step is to look at the graph of pairwise correlations and to try to incorporate them into our model in a more systematic way. 2) I've been trying to improve my initial results on the AERS data for predicting drugADE interactions, but I've had significantly worse luck than Cami et al. Andrea suggested a few ways to clean up my data, but yesterday I realized that not all of it is being imported correctly into R. In particular, I think R doesn't read missing values correctly despite the existing delimiters. Hopefully, this can be fixed by a simple substitution in the text files using SED before the import. On a side note, I talked to Mohsen about using media hype to predict false reports in AERS data, and he thought it would be an interesting direction to pursue. However, I'm going to wait until I find someone to work with, who can take a lead on parsing articles from some popular medical media site (e.g. MNT). Or in Mohsen's words, I'm going to wait until someone hands me some nice clean data and do a logistic regression on it :p. 3) A large portion of my week was dedicated to brainstorming ideas for my final projects. Osbert and I are doing a survey on inverse RL literature using an online learning framework (see this paper) for our CS 229T project, and I was hoping to get this to tie in with my project for MSandE 338 somehow. However, apparently RL is quite different from inverse RL so here are some new ideas I've been thinking about: a) Using deep learning for value function approximation. If this improves performance, it could be quite useful to several application papers in the literature that use either ADP or RL. b) Using RL for making sequential clinical decisions. See this monster of a paper. Mohsen suggested I also look at decisionmaking under adversarial conditions (see here). There has also been some interest in using RL for solving combinatorial optimization problems, which I thought was pretty cool (see here). c) Solving one of Ben's conjectures from class. I'm thinking of making a list of these to share but I've been too lazy. One interesting one is introducing the mixing time of a Markov network into the convergence analysis for RealTime Dynamic Programming (RTDP). Usually, we just use the time horizon as the time scale, which produces terrible bounds since we would like to converge to the optimum well before we stop caring (i.e. sublinear in the time horizon). Ben's idea to fix this problem is to redo the analysis with the mixing time, which is welldefined and small for most realworld networks. d) For the inverse RL problem, Osbert and I are interested in looking at both the suboptimal expert case as well as the case with multiple experts. There have been some simulation results on the topic from Andrew Ng's group, but nothing theoretical as far as we know. In other news, the ICML deadline is coming up next week, so I should really start helping Richard et al. edit the paper and such. That's all for today, folks!
Hello again! I didn't do very much yesterday (one of the perks of being a grad student is that I can take arbitrary days off if I'm feeling tired), but I buckled down today to prepare for my meeting with Andrea tomorrow. First, I wrote a script to preprocess the FDA datasets and import them into R. Then, I selected the network covariates and performed logistic regression to reproduce the results from this paper. As I was doing this, I realized that there seem to be several problems with their analysis: 1) Their definition for a ADEdrug association is very loose. Any ADE that is reported with a drug even once is considered an association. But some cases report 60 or 70 drugs with a given ADE and there's a strong chance that some ADEdrug pair is simply making a random appearance. Other papers that I've seen only consider pairs that have appeared a minimum number of times (e.g. 10). 2) Doctors indicate whether a certain drug in a report is a 'primary suspect,' 'secondary suspect,' 'interacting,' or 'concomitant.' This information is not used at all. 3) Synthetic correlations in the data have not been removed. See this paper. Finally, all this talk about synthetic correlations caused by doctor biases, makes me want to explore how recent media trends on medicine affect the resulting biases that can be seen in the database. It would be interesting to quantify the effect of media hype on our daily patient lives.
I spent the weekend catching up on CS229T for the first homework assignment. I've been so caught up with other stuff that I hadn't been paying attention since the first lecture. I seem to be in the unfortunate dilemma where the math is not too hard to follow but the concepts and definitions are quite tricky. I ended up finishing the problem set by manipulating algebra and without understanding anything. I think I've come to terms with several key ideas since then, but I still have a lot of trouble understanding the intuition for online learning algorithms (usually I understand the intuition and struggle with the math)... This is quite unfortunate because this really cool inverse reinforcement learning paper uses the online learning framework, and I've been really excited to explore these ideas more on my own.
I finished some more OWLQN simulations and had my weekly meeting with Mohsen today. (On a side note, I finally started using the GSB servers, which are absolutely incredible. The Stanford corn servers are used by the entire Stanford community, but no one seems to use the business school ones so I had free reign to take up 3000% of their fancy multithreaded CPU.) We brainstormed some possible algorithms to capture pairwise correlations in our data, but all our ideas are so far computationally inefficient and somewhat adhoc. Still, we're going to try implementing them to see what we get. Mohsen also brought up some connections to graphical models, and suggested we think about ways to bridge supervised learning to learning in graphical models  this is one of those problems where my gut instinct tells me there must be an easy solution, but upon further thought, I realize things are not so simple. In any case, I hope to explore this idea more. I really should finish the PGMs class that Daphne taught on Coursera...
I attended the MSandE New Directions talk by Asuman Ozdaglar today on financial and economic networks. She looked at how network structure impacted stability, i.e. how does the inteconnectedness of today's economic networks (suppliers, creditors, etc.) aggravate the domino effect of major shocks (financial crisis, bankruptcy, etc.). I was very intrigued by how she took a problem of very practical relevance and approached it with a lot of cool math. Her papers are now in my weekend todo list (which of course never actually gets done). In other news, I finally recompiled the OWLQN package in Linux so that I don't have to use MSDOS, and I'm adding a maxIters condition to the source code so it doesn't go into an infinite loop when it doesn't converge to a solution. I also learned how to use the parallel computing toolbox in MATLAB so I can speed up my simulations. Hopefully, all these muchawaited additions will make my future simulations less nightmareish.
Hello all! This is the beginnings of what I hope will by my public research journal. A quick overview of things I've been looking at in the last quarter / hope to look at this quarter: 1) LASSO project with Mohsen Bayati. I've done several simulations on real and fake datasets to study how feature correlations with the training label affect the performance of LASSO. However, I have mostly been using the popular glmnet package in R, which is sort of a black box. Currently, I'm trying to reproduce some of my results using a simpler and more transparent package called OWLQN, as well as to formalize some of my results into conjectures. More on this to come (hopefully)... 2) Detecting adverse events in healthcare with Andrea Montanari. I'm just getting started on this project with my new rotation advisor. Although he normally works on rather theoretical projects, Andrea graciously agreed to work on something more applied with me this quarter so that I can get a better flavor of different types of projects. The FDA now keeps an adverse events database where doctors can report adverse outcomes for patients taking certain drugs. Of course we can't jump to any conclusions about whether the drug actually caused said outcome since there are many, many biases in these reports  the patient's preexisting conditions, reporting bias, drugdrug interactions, etc. Several papers have looked at applying statistical methods to remove these biases and glean information of drugadverse event associations as well as possible drugdrug interactions. We hope to approach the problem along the same vein, except with (hopefully) a more sophisticated statistical model. 3) Zeroshot learning with Osbert Bastani, Milind Ganjoo, Richard Socher. I got started on this with Osbert for our CS 229 final project. You can read all about it in our (early) draft ICML paper here! 4) Reinforcement learning with ???. I'm looking for a partner in crime in MSandE 338 (Reinforcement learning). I have a few ideas for new algorithms in Qlearning and proving probable convergence  of course, my ideas are often wrong and have to be taken with a huge grain of salt, but I'd love to bounce ideas with someone on the topic! I've also been curious about using deep learning to learn value function approximations, so let me know if you're interested in that as well.
