It was bootstrapping that made me start off on my statistics journey years ago. I have very fond memories of the days when I could understand simple things in statistics without resorting to complicated-looking formulae. A few lines of code were all that was needed. Slowly I became more curious about many things in statistics and that’s how my love affair with stats began. There are two bibles that any newbie to the bootstrap should go over: one by Efron & Tibshirani and the other by Davison & Hinkley. Any other specific topics you can always understand by reading papers. It is always a nice feeling for me to read stuff about bootstrapping. However, reading this book was an extremely unpleasant experience.
In recent years, with the rise of R, many authors have started writing books titled “Introduction to ____ (fill in any statistical technique you want) using R”. With many more people adopting R, these books hope to fill the need of a data analyst who might not be willing to immerse himself/herself in the deep theory behind a technique. The target audience might want some package that can be used to crunch out numbers. Fair enough. Not everyone has the time and inclination to know the details. There are some amazing books that fill this need and do it really well. Sadly, this book is not in that category. It neither explains the key functions for bootstrapping nor explains the code that has been sprinkled through the book. So the R in the title is definitely misleading. Instead of a discussion of the nuances of the various functions based on the authors’ experience, all one gets to see is some spaghetti code. I can’t imagine an author using 15 pages of a book (within a chapter, not even in an appendix) to list various packages that have some kind of bootstrap function. That’s exactly what the authors of this book have done. Insane! The book gives a historical perspective on various developments around bootstrapping techniques, but you can’t learn anything specific from it. It just gives a 10,000 ft. overview of various aspects of bootstrapping. I seriously do not understand why the authors even wrote this book. My only purpose in writing this review is to dissuade others from reading it and wasting their time and money.
Introduction
The bootstrap is one of a number of techniques under the broad umbrella of nonparametric statistics that are commonly called resampling methods. It was the article by Brad Efron in 1979 that started it all. The impact of this important publication can be gauged by the following statement in Davison and Hinkley’s book:
The idea of replacing complicated and often inaccurate approximations to biases, variances and other measures of uncertainty by computer simulation caught the imagination of both theoretical researchers and users of statistical methods
Efron’s motivation was to construct a simple approximation to the jackknife procedure that was initially developed by John Tukey. Permutation methods had been known since the 1930s, but they were impractical beyond small samples. Efron connected bootstrapping to the then available jackknife, delta method, cross validation and permutation tests. He was the first to show that the bootstrap was a real competitor to the jackknife and the delta method for estimating the standard error of an estimator. Through the 1980s and 1990s, there was an explosion of papers on the subject. The bootstrap was being used for confidence intervals, hypothesis testing and more complex problems. In 1983, Efron wrote a remarkable paper showing that the bootstrap worked better than cross validation in classification problems of a certain kind. While these positive developments were happening, by the 1990s there were also papers showing that bootstrap estimates were not consistent in specific settings. The first published example of an inconsistent bootstrap estimate appeared in 1981. By the year 2000, there were quite a few articles showing that bootstrapping could be a great tool for estimating various functions but could also be inconsistent. After this brief history, the chapter goes on to define some basic terms and explain four related methods: jackknife, delta method, cross validation and subsampling. Out of all the packages mentioned in the chapter (the list takes up 15 pages), I think all one needs to tinker with to understand the basic principles are boot and bootstrap.
Estimation
This chapter talks about improving point estimation via bootstrapping. Historically, the bootstrap was first used to estimate the standard error of an estimate and later to adjust for bias. The chapter begins with a simple example where the bootstrap is used to compute the bias of an estimator. Subsequently, a detailed set of examples on using bootstrapping to improve cross validation estimates is given. These examples show that there are many instances where the bootstrapped cross validation technique performs better than other estimators such as CV, .632 and e0. For estimating a location parameter of a random variable from a particular distribution, MLE does a great job and hence one need not really use bootstrapping. However, there are cases where the MLE has no closed form solution. In all such cases, one can just bootstrap away to glory. In the case of linear regression, there are two ways in which bootstrapping can be used. The first method involves residuals: bootstrap the residuals and create a set of new dependent variables, which can then be used to form a bootstrapped sample of regression coefficients. The second method is bootstrapping pairs: sample pairs of dependent and independent variables jointly and compute the regression coefficients. Of the two, the second method is found to be more robust to model misspecification.
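To make the two regression schemes concrete, here is a minimal Python sketch (standard library only) for a simple one-variable regression. It is my own illustration and not code from the book; the synthetic data, the function names (`fit_line`, `residual_bootstrap_slopes`, `pairs_bootstrap_slopes`) and the number of replicates are all assumptions chosen for the example.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def residual_bootstrap_slopes(xs, ys, n_boot, rng):
    """Method 1: keep x fixed, resample residuals, rebuild y, refit."""
    a, b = fit_line(xs, ys)
    fitted = [a + b * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    slopes = []
    for _ in range(n_boot):
        new_ys = [f + rng.choice(residuals) for f in fitted]
        slopes.append(fit_line(xs, new_ys)[1])
    return slopes

def pairs_bootstrap_slopes(xs, ys, n_boot, rng):
    """Method 2: resample (x, y) pairs jointly and refit."""
    pairs = list(zip(xs, ys))
    slopes = []
    while len(slopes) < n_boot:
        sample = rng.choices(pairs, k=len(pairs))
        sx, sy = zip(*sample)
        if len(set(sx)) < 2:  # degenerate resample: slope is undefined
            continue
        slopes.append(fit_line(sx, sy)[1])
    return slopes

# Synthetic data (assumed for illustration): y = 1 + 2x + Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(40)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 0.5) for x in xs]

res_slopes = residual_bootstrap_slopes(xs, ys, 500, rng)
pair_slopes = pairs_bootstrap_slopes(xs, ys, 500, rng)
print("residual bootstrap SE of slope:", statistics.stdev(res_slopes))
print("pairs bootstrap SE of slope:   ", statistics.stdev(pair_slopes))
```

The residual scheme leans on the fitted model being right (the residuals are treated as exchangeable), whereas the pairs scheme only assumes the (x, y) pairs are a random sample, which is the usual intuition for why it tolerates misspecification better.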
Some of the other uses of bootstrapping mentioned in the chapter are:
My crib about this chapter is this: you are introducing data mining techniques like LDA, QDA, bagging etc. in a chapter where the reader is supposed to get an intuition for how bootstrapping can be used to get a point estimate. Who is the target audience for this book? A guy who is already familiar with these data mining techniques would gloss over the stuff as there is nothing new for him. A newbie would be overwhelmed by the material. For a guy who is not a newbie and who is not a data mining person, the content will appear totally random. Extremely poor choice of content for an introductory book.
Confidence Intervals
One of the advantages of generating bootstrapped samples is that they can be used to construct confidence intervals. There are many ways to create confidence intervals. The chapter discusses bootstrap-t, iterated bootstrap, BC, BCa and tilted bootstrap. Again, I don’t expect any newbie to understand these methods clearly after reading this chapter. All the author has managed to do is give a laundry list of methods with a brief description of each. And yes, there is an extensive set of references that makes you feel that you are reading a paper and not a book. If you really want to understand these methods, the bibles mentioned at the beginning are the right sources.
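Since the chapter offers no worked example, here is a small sketch of two of the simpler constructions for the mean of a sample: the plain percentile interval and the bootstrap-t interval. This is my own illustration in Python, not the book's code; the helper names, the skewed synthetic sample and the crude `quantile` rule are assumptions.

```python
import math
import random
import statistics

def quantile(values, q):
    """Crude empirical quantile: the order statistic nearest to position q."""
    s = sorted(values)
    k = min(len(s) - 1, max(0, round(q * (len(s) - 1))))
    return s[k]

def percentile_ci(data, n_boot, level, rng):
    """Percentile interval: quantiles of the bootstrapped means."""
    n = len(data)
    means = [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_boot)]
    alpha = (1 - level) / 2
    return quantile(means, alpha), quantile(means, 1 - alpha)

def bootstrap_t_ci(data, n_boot, level, rng):
    """Bootstrap-t interval: bootstrap the studentized statistic instead."""
    n = len(data)
    theta = statistics.fmean(data)
    se = statistics.stdev(data) / math.sqrt(n)
    t_stats = []
    while len(t_stats) < n_boot:
        sample = rng.choices(data, k=n)
        se_star = statistics.stdev(sample) / math.sqrt(n)
        if se_star == 0:  # all resampled values identical; skip
            continue
        t_stats.append((statistics.fmean(sample) - theta) / se_star)
    alpha = (1 - level) / 2
    # Note the reversal: the upper t quantile sets the lower limit.
    return (theta - quantile(t_stats, 1 - alpha) * se,
            theta - quantile(t_stats, alpha) * se)

# Skewed synthetic sample (assumed for illustration).
rng = random.Random(1)
data = [rng.expovariate(1.0) for _ in range(30)]
sample_mean = statistics.fmean(data)

pct_lo, pct_hi = percentile_ci(data, 2000, 0.95, rng)
t_lo, t_hi = bootstrap_t_ci(data, 2000, 0.95, rng)
print("percentile 95% CI: ", (pct_lo, pct_hi))
print("bootstrap-t 95% CI:", (t_lo, t_hi))
```

On skewed data the two intervals generally differ, with the bootstrap-t interval tending to reach further into the long tail; that difference is the kind of thing the chapter should have shown.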
Hypothesis testing
For simple examples, hypothesis testing can be done based on the confidence intervals obtained via bootstrap samples. There are subtle aspects that one needs to take care of, such as sampling from the pooled data. It is amazing that the author doesn’t even provide some sample code to illustrate this point. The code that is provided samples from the individual samples. Instead, code should have been provided to illustrate sampling from the pooled data. Again, a poor choice in how the content is presented.
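For what it is worth, the pooled version is only a few lines. The sketch below is my own Python illustration of a two-sample bootstrap test for a difference in means, in which both groups are resampled from the pooled data so that the resampling respects the null hypothesis of a common distribution. The function name, the synthetic data and the add-one p-value convention are assumptions.

```python
import random
import statistics

def pooled_bootstrap_pvalue(x, y, n_boot, rng):
    """Two-sided bootstrap p-value for a difference in means under H0."""
    observed = statistics.fmean(x) - statistics.fmean(y)
    pooled = list(x) + list(y)  # under H0 both groups share one distribution
    extreme = 0
    for _ in range(n_boot):
        xs = rng.choices(pooled, k=len(x))
        ys = rng.choices(pooled, k=len(y))
        diff = statistics.fmean(xs) - statistics.fmean(ys)
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one convention keeps the p-value strictly positive.
    return (extreme + 1) / (n_boot + 1)

# Synthetic groups (assumed for illustration).
rng = random.Random(42)
group_a = [rng.gauss(0.0, 1.0) for _ in range(30)]
group_b = [rng.gauss(2.0, 1.0) for _ in range(30)]  # clearly shifted
group_c = [rng.gauss(0.0, 1.0) for _ in range(30)]  # same law as group_a

p_shifted = pooled_bootstrap_pvalue(group_a, group_b, 2000, rng)
p_same = pooled_bootstrap_pvalue(group_a, group_c, 2000, rng)
print("p-value, shifted groups:", p_shifted)
print("p-value, same law:      ", p_same)
```

Resampling each group from itself, as the book's code does, gives the sampling distribution around the observed difference rather than under the null, which is why the pooled step matters for a test.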
Time Series
The chapter gives a laundry list of bootstrap procedures in the context of time series: model based bootstrap, non-overlapping block bootstrap, circular block bootstrap, stationary block bootstrap, tapered block bootstrap, dependent wild bootstrap and sieve bootstrap. Again, a very cursory treatment with references to a whole lot of papers and books. The authors get it completely wrong. In an introductory book, there must be R code and there must be some simple examples to illustrate the point. If all the reader sees is a lot of references to papers and journal articles, he is going to junk this book and move on.
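As an example of how little code a simple illustration needs, here is a sketch of one item from that list, the circular block bootstrap, in Python. It is my own illustration under stated assumptions: the AR(1) series, the block length of 20 and the function name are all made up for the example, and choosing the block length well is a topic in itself.

```python
import random
import statistics

def circular_block_bootstrap(series, block_len, rng):
    """Resample a series by gluing together blocks of consecutive values.

    Blocks start at a uniformly random position and wrap around the end
    of the series, so every observation is equally likely to be picked.
    """
    n = len(series)
    out = []
    while len(out) < n:
        start = rng.randrange(n)
        out.extend(series[(start + j) % n] for j in range(block_len))
    return out[:n]  # trim the last block to the original length

# Synthetic AR(1) series with positive autocorrelation (assumed).
rng = random.Random(2)
series = [0.0]
for _ in range(299):
    series.append(0.7 * series[-1] + rng.gauss(0, 1))

resampled = circular_block_bootstrap(series, 20, rng)
block_means = [
    statistics.fmean(circular_block_bootstrap(series, 20, rng))
    for _ in range(500)
]
iid_means = [
    statistics.fmean(rng.choices(series, k=len(series)))
    for _ in range(500)
]
print("block bootstrap SE of mean:", statistics.stdev(block_means))
print("naive iid bootstrap SE:    ", statistics.stdev(iid_means))
```

Resampling whole blocks keeps the short-range dependence inside each block intact, which the naive observation-by-observation bootstrap destroys; comparing the two standard errors on an autocorrelated series is a quick way to see why the block variants exist.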
Bootstrap variants
The same painful saga continues. The chapter gives a list of techniques – Bayesian bootstrap, Smoothed bootstrap, Parametric bootstrap, Double bootstrap, m-out-of-n bootstrap, and wild bootstrap. There is no code whatsoever to guide the reader. The explanation given to introduce these topics is totally inadequate.
When the bootstrap is inconsistent and how to remedy it
This chapter gives a set of scenarios when the bootstrap procedure can fail
This is the worst book that I have read in recent times. The authors are trying to cash in on the popularity of R. The title of the book is completely misleading. It is neither an introduction to bootstrap methods nor an introduction to R methods for bootstrapping. All it does is give a cursory and inadequate treatment of the bootstrap technique. Do not buy or read this book. Total waste of time and money.
Posted at 11:23 PM in Programming, Statistics
The paper titled “The imprecision of volatility indexes” analyzes VVIX, the vega weighted VIX, an estimate of the 30 day expected volatility. Market participants have always wanted some kind of quantitative measure of volatility. CBOE introduced VIX based on the Black Scholes volatility of ATM options and later changed it to a method that is based on observed option prices. The latter method goes by the name “model free method” in the finance literature, because it uses a replicating portfolio argument for pricing a variance swap. Having said that, it is not completely “model free”; after all, the risk neutral expectation in the variance swap computation assumes a basic GBM with constant volatility for the stock price evolution.
What’s this paper about ?
VIX in its current avatar is plagued with a set of problems. The real world implementation of a theoretical variance swap formula gives rise to truncation and discretization errors, among other types of errors. Some fixes have been proposed, such as natural cubic spline smoothing. The main problem with VIX is that it gives an imprecise estimate exactly when we want it, i.e. in times of panic. There is a set of volatility indexes that are basically tweaks around the older CBOE VXO index. In order to take care of smile and microstructure effects, several improvements to VXO have been suggested. Some of them are vega weighted VIX (VVIX), spread weighted VIX (SVIX), volatility elasticity weighted VIX (EVIX), transaction volume weighted VIX (TVVIX) etc. A thorough analysis of these alternative indexes is done in a paper by Susan Thomas and Rohini Grover.
This paper analyzes VVIX, an index that is a vega weighted average of Black Scholes implied volatility across strikes. The authors use NIFTY option quotes between Feb 2009 and September 2010 for the analysis. On each day of the data, four time instants in the day are taken for the analysis. The paper does not mention the procedure behind selecting the four time instants. I am guessing that the time instants selected are the high liquidity periods, i.e. the start and end of the trading day. Or maybe they were uniformly or randomly distributed during the day.
Anyway, coming back to the procedure. At each time instant, a bootstrap sampling distribution of VVIX is estimated. Based on this distribution, a 95% confidence band is estimated. Using quotes four times a day over 20 months of data gives ~1500 data points, each of which yields a 95% confidence interval for VVIX. The authors create kernel density plots based on these ~1500 data points and report the median width of the CI as 2.92% and the sigma of the estimate as 0.74%. Judging by the median width of the 95% confidence interval, the VVIX estimate is quite imprecise.
The paper also tries to check whether this imprecision is just a manifestation of the liquidity of the underlying asset. A regression analysis is done using impact cost, and the results imply a weak relationship between the width of the CI and impact cost. The last section of the paper uses the CI measure to select among various alternative volatility indexes such as SVIX, TVVIX, VVIX and EVIX, and finds that VVIX is better than the others.
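To get a feel for what such a confidence band looks like at a single time instant, here is a rough Python sketch. It is only my reading of the idea, not the authors' procedure: I assume the index is a vega weighted average of implied volatilities and that the bootstrap resamples the option quotes with replacement. The synthetic smile, the vega profile and all names are hypothetical.

```python
import math
import random

def vega_weighted_iv(quotes):
    """Vega weighted average of implied vols; quotes are (iv, vega) pairs."""
    total_vega = sum(vega for _, vega in quotes)
    return sum(iv * vega for iv, vega in quotes) / total_vega

def bootstrap_ci(quotes, n_boot, level, rng):
    """Percentile CI from resampling the option quotes with replacement."""
    estimates = sorted(
        vega_weighted_iv(rng.choices(quotes, k=len(quotes)))
        for _ in range(n_boot)
    )
    alpha = (1 - level) / 2
    lo = estimates[int(alpha * n_boot)]
    hi = estimates[min(n_boot - 1, int((1 - alpha) * n_boot))]
    return lo, hi

# Hypothetical snapshot: 25 strikes, a noisy smile, bell-shaped vegas.
rng = random.Random(7)
quotes = []
for i in range(25):
    moneyness = (i - 12) / 40           # roughly -0.3 .. +0.3
    iv = 0.25 + 0.6 * moneyness ** 2 + rng.gauss(0, 0.01)
    vega = math.exp(-0.5 * (moneyness / 0.1) ** 2) + 0.01
    quotes.append((iv, vega))

point = vega_weighted_iv(quotes)
ci_lo, ci_hi = bootstrap_ci(quotes, 2000, 0.95, rng)
print("index estimate:", point)
print("95% CI:", (ci_lo, ci_hi), "width:", ci_hi - ci_lo)
```

Repeating this at every time instant and collecting the widths is, as far as I can tell, what produces the distribution of CI widths that the paper summarizes.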
What are the implications of the results from this paper ?
VVIX, one of the alternative volatility indexes, is imprecise in its measurement. This means that on a majority of days, a market participant does not really know whether the true unknown volatility has gone up or gone down. SVIX, which was shown in this paper to be a good estimator of realized volatility, ranks behind VVIX when the CI measure is used as a model selection criterion. This means SVIX is imprecise too. CBOE, NSE and many stock exchanges in the world follow a VIX that is “model free”, and the VIX formula used has its own set of problems and is imprecise.
So, where are we? The alternative volatility indexes are imprecise. The volatility indexes currently being used are plagued with problems. What’s the way out? I guess we will have to wait until someone figures out a better volatility index that takes care of many issues of the real world such as
In any case, I think this is a very interesting research area for a curious mind.
Posted at 09:53 AM in Finance, Statistics
In every elementary statistics textbook on inference you will find the following question
How to draw an inference on a correlation estimate between two variables ?
An inverse hyperbolic tangent transformation is applied to the sample correlation coefficient, and the transformed value is shown to be approximately normally distributed. A confidence interval is computed on the transformed scale and then mapped back to the original scale.
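The textbook recipe is short enough to write down. Below is a minimal Python sketch of the Fisher z interval; it assumes bivariate normal data and uses the usual large-sample standard error 1/sqrt(n - 3) on the transformed scale. The function name and the example values r = 0.6, n = 50 are mine.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, level=0.95):
    """Confidence interval for a correlation via the atanh transform."""
    z = math.atanh(r)                    # transform to the z scale
    se = 1.0 / math.sqrt(n - 3)          # approximate standard error of z
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    # Build the interval on the z scale, then map back with tanh.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

r, n = 0.6, 50
lo, hi = fisher_ci(r, n)
print("95% CI for the correlation:", (lo, hi))
```

Because tanh squeezes the upper end towards 1, the interval on the original scale is asymmetric around r, which is exactly what a naive "r plus or minus two standard errors" interval gets wrong.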
If one is curious, a natural follow up question would be,
Given any random variable X from a specific family of distributions, does there exist a single transformation, Y = g(X), such that Y has, nearly a normal distribution ?
This seemingly straightforward question is not so straightforward once you start thinking seriously about it. Here is a paper titled “Transformation theory: How Normal is a family of distributions?” by Brad Efron, the pioneer of the bootstrap, who answers the question in a masterly fashion.
The paper is 15 pages long and the author tackles this question in extreme detail. One of the surprising findings of the paper is that you can say a lot about the existence of a transformation without actually knowing the transformation. The key mathematical object used throughout the discussion is a diagnostic function that is motivated in terms of a local transformation to normality. The diagnostic function measures how quickly the local transformation to normality changes as the parameter of the distribution changes.
The author defines the following 6 classes of distributions
For each of the above families, the diagnostic function is used to explore various aspects of the family. The paper gives a method to retrieve all the relevant components of the transformation via the diagnostic function. This paper is extremely important for one reason: one can use the bootstrapped distribution to get to all the components that go into the general scaled transformation family (GSTF). Why is it necessary to know the components of the GSTF? Because it is important to take them into consideration while estimating confidence intervals.
Most of the R packages out there have functions that have amazing math behind them. If you ever happen to use the boot.ci function for estimating nonparametric confidence intervals, then there is a fair chance you will like this paper. One of the powerful techniques that boot.ci offers is BCa (the bias-corrected and accelerated bootstrap). It is used to compute better confidence intervals. If you are curious about the math behind BCa, going through this paper would be helpful.
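For the curious, the mechanics of a BCa interval fit in a few lines. The sketch below is my own Python illustration of the standard construction (a bias correction z0 from the bootstrap distribution and an acceleration a from jackknife values); it is not the code inside boot.ci, and the function name, the synthetic data and the simple index rule for picking quantiles are assumptions.

```python
import random
import statistics
from statistics import NormalDist

def bca_interval(data, stat, n_boot, level, rng):
    """Bias-corrected and accelerated (BCa) bootstrap interval for stat."""
    nd = NormalDist()
    n = len(data)
    theta = stat(data)
    boots = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))

    # Bias correction: where does theta sit in the bootstrap distribution?
    below = sum(b < theta for b in boots)
    prop = min(max(below / n_boot, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = nd.inv_cdf(prop)

    # Acceleration: skewness of the leave-one-out (jackknife) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = statistics.fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den > 0 else 0.0

    def adjusted_level(q):
        """Shift the nominal quantile level using z0 and a."""
        zq = nd.inv_cdf(q)
        return nd.cdf(z0 + (z0 + zq) / (1 - a * (z0 + zq)))

    def pick(q):
        return boots[min(n_boot - 1, max(0, int(q * n_boot)))]

    alpha = (1 - level) / 2
    return pick(adjusted_level(alpha)), pick(adjusted_level(1 - alpha))

# Skewed synthetic sample (assumed); the statistic is the sample variance.
rng = random.Random(3)
data = [rng.expovariate(1.0) for _ in range(40)]
bca_lo, bca_hi = bca_interval(data, statistics.variance, 2000, 0.95, rng)
print("sample variance:", statistics.variance(data))
print("BCa 95% CI:", (bca_lo, bca_hi))
```

With z0 = 0 and a = 0 the adjusted levels reduce to the nominal ones and the interval collapses to the plain percentile interval, so the two constants can be read as corrections for median bias and for a standard error that changes with the parameter.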
Posted at 12:12 AM in Statistics
The paper titled, “Extracting Model-Free Volatility from Option Prices: An Examination of the VIX Index”, is a very interesting paper that talks about the problems in the VIX index computation that is currently being used at CBOE. Other stock exchanges throughout the world are also following a similar method for disseminating VIX, called the fear index. One of the most interesting conclusions of the paper is this :
VIX underestimates the true volatility in times of panic, i.e. when we need the fear index the most, it acts as an imprecise gauge
Before 2003, VIX was computed by averaging the ATM volatilities. The volatilities were in turn inferred from the option prices via the Black Scholes model. Post 2003, the VIX computation was changed to reflect a methodology based on the fair value of a variance swap. The valuation of a variance swap is via a replicating argument and hence is not dependent on a specific model. Well, the previous statement is not entirely correct. The stock is assumed to follow a GBM with a constant diffusion parameter, instead of, let’s say, a jump diffusion model. It is “model free” in the sense that it uses the actual option prices quoted in the market to obtain VIX. However, as far as the replicating argument is concerned, a model is used after all to compute the risk neutral expectation of realized volatility. In any case, CBOE had to put in some tweaks in order to use the theoretical fair price of a variance swap in the real world, which has a finite strike range and discrete strike intervals.
Why should there be any error at all ?
The fair value of future variance is based on the replicating argument and it is given by the following formula :
In the real world, continuous strikes are not available nor are strikes available from 0 to infinity. So, there is bound to be some approximation error between the theoretical fair value and its real world implementation. In CBOE implementation the VIX is computed by the following formula :
where F_{0} is the forward index level, T is the option maturity, K_{i} is the strike price of the i^{th} OTM option, K_{0} is the first strike price below F_{0}, Q(T,K_{i}) is the midpoint of the latest available bid and ask prices for the option, r is the risk free rate and Delta K_{i} is the strike price increment. An adjustment at the strike price K_{0} is made by redefining Q(T,K_{0}) as the average price of the call and put options.
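For readers who want the expressions in front of them, the two formulas are commonly written as below. I am reproducing the standard textbook and CBOE white paper forms using the symbols defined above, not copying them from the paper, so treat the exact notation (for instance, splitting the integral at F_{0}, and the symbols P and C for put and call prices) as an assumption.

```latex
% Theoretical fair value of future variance (continuous strikes, 0 to infinity)
\sigma^2_{\text{fair}} = \frac{2 e^{rT}}{T}
  \left[ \int_0^{F_0} \frac{P(T,K)}{K^2}\, dK
       + \int_{F_0}^{\infty} \frac{C(T,K)}{K^2}\, dK \right]

% CBOE implementation (finite, discrete strikes)
\sigma^2_{\text{CBOE}} = \frac{2}{T} \sum_i \frac{\Delta K_i}{K_i^2}\, e^{rT}\, Q(T,K_i)
  - \frac{1}{T} \left( \frac{F_0}{K_0} - 1 \right)^2,
\qquad \text{VIX} = 100 \times \sigma_{\text{CBOE}}
```

The sum over strikes replaces the integral (the source of the discretization error), the sum runs only over the quoted strike range (the source of the truncation error), and the last term corrects for K_{0} not coinciding with F_{0}.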
What are the types of errors that occur in the current VIX avatar ?
The paper mentions four types of approximation errors :
How large are these approximation errors ?
The authors simulate option prices using a price process that has a stochastic volatility component and random jumps. Since the simulation starts off with a known volatility smile, it is easy to check whether VIX is giving the right estimate or not. The authors analyze only the first three of the above approximation errors. Option prices at various strike ranges with different strike price intervals and varying maturities are simulated. CBOE VIX is computed for all the simulated prices and a thorough error analysis is done in the paper. Here are the findings :
How to fix these errors ?
The authors propose a natural cubic spline interpolation of the implied volatility curve within the available finite strike range, and an extrapolation of the smoothed IV curve outside that range based on the slope of the curve at the end points. What’s the advantage of this interpolation? For one, you can use a fine grid to integrate the various terms. Secondly, the extrapolation can reduce the truncation error. With this simple solution, discretization and truncation errors can be significantly reduced. As a robustness check, the authors test the smoothing method on model prices generated by a stochastic volatility model with random jumps. They find that the smoothing method is consistent across all volatility and index levels. The maximum error with the smoothing fix is only about 8 index basis points, whereas the CBOE VIX errors range between +79 and –198 index basis points.
The CBOE procedure for computing VIX leads to a positive discretization error and a negative truncation error. At modest levels of volatility, they appear to mostly offset each other. At lower volatility levels, the discretization error dominates and hence there is an overestimation of true volatility. At higher volatility levels, the truncation error dominates, resulting in an underestimation of true volatility. The simple fix suggested in the paper is a natural cubic spline plus extrapolation that smooths the IV curve and thus reduces both the discretization error and the truncation error.
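The signs of the two errors are easy to reproduce in a toy setting. The Python sketch below is my own illustration and deliberately much simpler than the paper's simulation: it assumes a flat 20% Black volatility, zero interest rate, a forward of 100 and a 30 day maturity, so the true answer is known to be 20%. The strike grids and function names are assumptions.

```python
import math
from statistics import NormalDist

N = NormalDist()

def black_price(forward, strike, vol, t, is_call):
    """Undiscounted Black (1976) option price on a forward (r = 0)."""
    sd = vol * math.sqrt(t)
    d1 = (math.log(forward / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    if is_call:
        return forward * N.cdf(d1) - strike * N.cdf(d2)
    return strike * N.cdf(-d2) - forward * N.cdf(-d1)

def cboe_style_vol(forward, strikes, vol, t):
    """Discrete CBOE-style variance sum over the given strikes (r = 0)."""
    strikes = sorted(strikes)
    k0 = max(k for k in strikes if k <= forward)
    total = 0.0
    for i, k in enumerate(strikes):
        if i == 0:
            dk = strikes[1] - strikes[0]
        elif i == len(strikes) - 1:
            dk = strikes[-1] - strikes[-2]
        else:
            dk = (strikes[i + 1] - strikes[i - 1]) / 2
        if k < k0:
            q = black_price(forward, k, vol, t, is_call=False)  # OTM put
        elif k > k0:
            q = black_price(forward, k, vol, t, is_call=True)   # OTM call
        else:  # at K0 use the average of the call and the put
            q = 0.5 * (black_price(forward, k, vol, t, True)
                       + black_price(forward, k, vol, t, False))
        total += dk / k ** 2 * q
    variance = 2 / t * total - (forward / k0 - 1) ** 2 / t
    return math.sqrt(variance)

true_vol, t, forward = 0.20, 30 / 365, 100.0
wide = cboe_style_vol(forward, range(50, 151), true_vol, t)        # fine, wide grid
coarse = cboe_style_vol(forward, range(50, 151, 10), true_vol, t)  # wide but coarse
narrow = cboe_style_vol(forward, range(95, 106), true_vol, t)      # fine but truncated
print("fine and wide grid:  ", wide)
print("coarse grid:         ", coarse)
print("truncated range:     ", narrow)
```

The coarse grid overshoots the true volatility and the truncated range undershoots it, matching the signs described above. Whether the two roughly cancel in practice depends on the strike spacing and range relative to the volatility level, which is the point the paper makes.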
Posted at 10:33 PM in Finance
It’s truly a great pleasure for me to be at the University of California at Berkeley today. Not quite 50 years ago, when I was an undergraduate studying physics in Cape Town, I began applying to go to the United States for graduate school. It seemed to be the right thing to do if you were serious about your field.
So I applied to three schools: Columbia, because I knew someone in Cape Town who had just gone there, and because it was in New York City; Caltech, because Feynman was there and had recently been awarded the Nobel prize and also published the stylish and insightful Feynman lectures on physics, though I didn’t understand at the time what he had actually accomplished; and Berkeley, because it was in the news for the start of the revolts against arbitrary authority on campus. I read the other day that this year is the 50th anniversary of the Berkeley Free Speech Movement that seems to me to have kicked off the Sixties. For those of you who are too young to remember that, which I suppose is all of you graduating today and maybe much of the faculty too, take a look; it’ll make interesting reading.
So, a few years later I ended up at Columbia, but I’ve always had a soft spot for Berkeley. Twenty years later I had a PhD in physics and had been a physics academic, and then through various serendipities moved from physics to finance. Berkeley played a role in that too. The first paper I ever read in finance was the paper on the binomial model co-authored by Mark Rubinstein with Cox and Ross. Much of what I understand about finance still comes from understanding that paper and trying to extend it in various ways. Similarly but later with other papers by Berkeley authors like Hayne Leland, Mark Garman and Terry Marsh. So, here I am today, in indirect ways someone who was influenced by Berkeley, and I have tried to think about what encouragement and perhaps advice to give you as you set out to enter the world of finance to which I came as a latecomer and an outsider.
______
First I want to remark on how the world has changed.
Before I became a financial engineer, I was a theoretical physicist. I came to financial engineering without the kinds of excellent and comprehensive training that you’ve just received. There was no way to get that kind of training in 1985 when I arrived at Goldman Sachs. I arrived one day in November and was set by my boss to work on building binomial options models for bonds. There were only two text books on options you could buy: Jarrow and Rudd, and Cox and Rubinstein, two classics I still own. I quickly set about reading them. There was no one to teach you and no training program at Goldman in those days. You pretty much had to teach yourself. In many ways that was good. The whole field of quantitative finance, at least in industry, was a kind of amateur heaven in those days. You could spend a short time learning something and then you were ready to start to try to do something new. It’s nice to have lived in an era like that but those days have unfortunately passed. I feel a bit sorry for your generation — you have so much formalism to learn before you feel you can do anything practical and useful, though that isn’t always the case.
Life for financial engineers has changed since those days, as I once wrote. In 1985, I quickly noticed the embarrassment involved in being “quantitative”. Sometimes, talking in a crowded elevator with another “quant,” you might start to say something about the duration or convexity of a financial instrument. Duration was sophisticated stuff in those days, though now it’s introductory for you! If the person you were talking to had been at the firm a little longer than you, he – it was usually a he, unlike a lot of the people I see before me, though, as now, most of us were non-American born – that person would cringe a little, and rapidly try to change the subject. “See the Yankees game last night?” he might ask, the sort of thing a real bond trader might say.
Soon, you began to realize, there was something a little shameful about two consenting adults talking math in a crowded elevator; there was something embarrassing about mentioning programming or Java in the company of bankers. There was something awful about being “outed” as a quant in public. People in the elevator just looked away.
Even in the Nineties, quantitative skills were reluctantly tolerated. Once, a friend and I were talking on the trading floor when one of the convertible traders walked between us, momentarily. Suddenly he grimaced and winced; he clutched his temples with both hands as though a sharp pain had pierced him and exclaimed, “Aaarrgghhhh! The force field! It’s too intense! Let me get out of the way!”
I truly remember that even in the Nineties big shots in business didn’t put their email address on their business cards — that was for geeks — and didn’t put a PhD on their card either. That branded you.
Now, in the aftermath of the financial crisis, after big firms have made lots of money via financial engineering and lost lots of money by carelessness or hubris, after quantitative and algorithmic trading has become the fashion, after many hedge funds call themselves quant, financial engineering and risk management have become hot areas. That’s good for you, and gives you as a financial engineer much more power in the world than we had, but it also adds responsibilities which I’m going to talk about a little later.
______
In this new environment, let me give you some pieces of advice based on my own experience.
1. The first thing I want to talk about is Doing Dirty Work. You’ve learned a lot of stochastic calculus and optimization and stuff like that in the past 18 months. That’s necessary and important. But equally important, if you’re going into the business world, is that the business world runs on dirty work. Don’t scorn or eschew getting your hands dirty. Most of the useful things I’ve done in life involved getting my hands dirty, especially in business. It’s OK to do your own programming, your own figures, your own dirty work. You learn a lot more by doing it yourself than by avoiding it or giving it to someone else to do. Don’t think anything is beneath you because you have an education.
2. The second thing: Be Reliable. When you start working, get to be so good at something that people around you can rely on you for it. I once hired a guy at Goldman to help set up our new Sun Microsystems workstations for derivatives systems in 1990 and when he arrived I started babbling to him in a panic about all the things I had to get done and the things that weren’t working. He turned to me and said: “Don’t worry about that, that’s what I’m here for!” I can’t tell you what a good impression that makes. Remember that the firm is hiring you for what you can do for them, not vice versa.
3. I also want to talk a little about The Nature of Financial Engineering, the characteristics that make it both difficult and interesting.
Science seeks to discover the fundamental principles that describe the world, and is reductive. Engineering is about using those principles, constructively, for a purpose. Mechanical engineering is concerned with building devices based on Newton’s laws, suitably combined with heuristic or empirical rules about more complex forces (friction, for example) that are too difficult to derive from first principles. Electrical engineering is the study of how to create useful electrical devices based on Maxwell’s equations and solid-state physics, combined with similar heuristics. Similarly, bio-engineering is the art of building prosthetics and other biologically active devices based on the principles of biochemistry, physiology and molecular biology.
So what is financial engineering? In a logically consistent world, financial engineering should be based on financial science. Financial engineering would be the study of how to create functional financial devices – convertible bonds, synthetic CDOs, etc. – that perform in desired ways, not just at expiration, but throughout their lifetime.
But what exactly is financial science?
Brownian motion and other idealizations you’ve learned about, while they capture some of the essential features of risk, are not truly accurate descriptions of the characteristic behavior of financial objects. We don’t know the correct stochastic partial differential equations for a stock or its volatility. Maybe we never will, because people’s behavior changes. There are no proven models that work reliably.
Therefore, financial models are often crude but useful analogies with better understood physical systems. We pretend stocks behave like smoke diffusing, or that short term rates satisfy geometric Brownian motion. None of this is true in the same sense that it’s true that planets satisfy Newton’s laws.
There is as yet no truly reliable financial science beneath financial engineering. Financial models attempt to describe the ripples on a vast and ill-understood sea of ephemeral human passions, using variables such as volatility and liquidity that are clever quantitative proxies for complex human behaviors. Such models are roughly reliable only as long as the world doesn’t change too much. When it does, when crowds panic, anything can happen.
That’s the bad part of the story but it’s also the good part of the story. The good part is that financial engineering is really a multidisciplinary field that involves not only science but art too. It involves financial knowledge, business knowledge, mathematics, statistics. It also involves psychology and introspection. Also, very very importantly, computation, because there’s little you can achieve without computation. So think of yourself as working in an interdisciplinary field in which you have to bring to bear many skills to solve practical and theoretical problems.
4. That brings me to the idea of Good Taste and Intuition.
Your job will often involve integrating different aspects of the financial field. To do it well you will need to combine your quantitative knowledge with market knowledge, taste and the intuition that grows from experience. Don’t just be satisfied with getting a number as an answer; be able to explain in words and ideas why, roughly, the number is what it is, why it increases or decreases in certain ways as certain things change. Always try to develop a visceral or physical or economic intuition to support or even precede your results. You’d be surprised how many complicated things can be understood simply if you struggle with them.
The most topical economist of today, Keynes, gave a speech about the intuition of Sir Isaac Newton, the founder of the modern scientific approach, at the tercentenary of his birth in Cambridge, England. Keynes had read some long lost notes of Newton’s, and spoke about Newton’s focus and intuition:
I believe that the clue to his mind is to be found in his unusual powers of continuous concentrated introspection … His peculiar gift was the power of holding continuously in his mind a purely mental problem until he had seen straight through it. I fancy his pre-eminence is due to his muscles of intuition being the strongest and most enduring with which a man has ever been gifted. Anyone who has ever attempted pure scientific or philosophical thought knows how one can hold a problem momentarily in one's mind and apply all one's powers of concentration to piercing through it, and how it will dissolve and escape and you find that what you are surveying is a blank. I believe that Newton could hold a problem in his mind for hours and days and weeks until it surrendered to him its secret. Then being a supreme mathematical technician he could dress it up, how you will, for purposes of exposition, but it was his intuition which was pre-eminently extraordinary - 'so happy in his conjectures', said De Morgan, 'as to seem to know more than he could possibly have any means of proving'. (De Morgan laws of logic)
There is the story of how he informed Halley of one of his most fundamental discoveries of planetary motion. 'Yes,' replied Halley, 'but how do you know that? Have you proved it?' Newton was taken aback - 'Why, I've known it for years', he replied. 'If you'll give me a few days, I'll certainly find you a proof of it' - as in due course he did …
I quote this because I find it inspiring, and I hope you do too. Not only art, but science too, requires intuition.
I want to go on a bit more about good taste. The world of markets, which is the world of people, is hard to fathom. Financial models are merely approximations and analogies. Since no model can accurately describe that world, it’s actually important to try hard but not too hard. You have to know when to stop and you have to know what to leave out of your model and leave to your users’ intuition. That’s why implied volatilities and implied variables and calibration play such a big part in finance. To be useful, one should be ambitious in believing a model can represent the world, but not too ambitious. What works best are simple low-dimensional models with a few essential characteristics. Most real things are too messy for a full theoretical treatment, and so models with good taste that can be calibrated to observable fungible tradable securities prices, are very important and count for much.
5. I want to give you some words of caution too, about the misuse of models.
Finance is a very large and growing part of the economy. It can do good — this is a capitalist world and credit and securitization is the way that new ventures get funded — but most of the people who go into it — most but not all — don’t go into it because they want to do good. In that sense it’s not like medicine or social work. There’s nothing wrong with that.
Modeling is human. Financial modeling is human too. And people need models to understand how to value securities, how to invest. Risk is everywhere. Because of that, because investors want return with the least amount of risk, models are a great sales tool. People buy securities on the basis of models of some kind. Salespeople use models to sell illiquid securities for liquid cash. One therefore has to be careful that one’s models are not misused in an unethical way, and I became very aware of this in my professional life.
Several years ago therefore Paul Wilmott and I wrote The Financial Modelers’ Manifesto, an attempt to provide a Hippocratic oath for financial engineers, part of which I quote:
The Modelers' Hippocratic Oath
~ I will remember that I didn't make the world, and it doesn't satisfy my equations.
~ Though I will use models boldly to estimate value, I will not be overly impressed by mathematics.
~ I will never sacrifice reality for elegance without explaining why I have done so.
~ Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights.
~ I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension.
I think these are good principles, and you will find they are hard to put into practice when your models are used to make money. It’s not easy, but you have to try.
6. Conclusion.
I’ve said a lot, and I want to finish on a positive note about financial engineering, which I’ve been doing for almost 30 years.
I once read a biography of Goethe, Germany’s 18th Century Shakespeare, a great Romantic, and one of the last people to make contributions to both art and science, which is what, in some sense, we are trying to do too.
Scientists often look down on Goethe, and regard Goethe as a poet who strayed beyond his proper place, who shouldn’t have tried to do science. His critics said he mistakenly thought of nature as a work of art, and that he was trying to be qualitative where he should be quantitative.
But, according to the book I read, Goethe was not so naive as to think that nature is a work of art. Rather, he believed that our knowledge and description of nature is a work of art.
That’s how I like to picture what we can do in financial modeling – making a beautiful and truthful description of what we can see. We’re involved in intuiting, inventing or concocting approximate laws and patterns. We can synthesize both art and science in creating understanding. We can use our intuition, our scientific knowledge and our pedagogical skills to paint a picture of how to think qualitatively, and then, within limits, quantitatively, about the world of human affairs, and in so doing, have an impact on how other people think.
Go out there, have a good impact, and have a good time.
Posted at 12:16 AM in Finance | Permalink | Comments (0)
The note titled “More than you ever wanted to know about Volatility Swaps”, written by Demeterfi, Derman, Kamal and Zou, is a fantastic fifty page write up highlighting many aspects of valuing a variance swap and a volatility swap. I love the structure followed in the note. Instead of heading right into the math behind valuation, the paper starts off by giving a superb intuition for why variance swaps are needed and how one can go about pricing a variance swap with nothing more than common sense. In this blog post, I will summarize some of the points from the note.
What’s the need for variance swap or volatility swap?
If an investor wants to take a long volatility or short volatility position, exposure to an option is one of the common ways. However this has a problem as the option position gives the trader exposure to direction of the stock as well as the volatility. What if the trader wants to trade forward volatility? A delta hedged option removes the exposure to the stock direction to a certain extent but not completely. There is a clear need for a trader to directly trade volatility. That’s where variance swaps and volatility swaps come in to prominence. Volatility swaps are forward contracts on annualized volatility. These swaps have several characteristics that make trading attractive. Even though option market participants talk about volatility, it is the variance that has a more fundamental theoretical significance. This is so because the correct way to value a swap is to value the portfolio that replicates it, and the swap that can be replicated most reliably is a variance swap.
What’s the intuition behind pricing a variance swap?
A trader looking for exposure to pure volatility needs a portfolio that is sensitive to changing volatility. If he/she takes exposure through a single option, then as the spot moves away from the strike price of the contract, the option loses its vega and hence no longer responds to changing volatility. OK, so a single option is not enough. Hence the trader has to hold a portfolio of options with different strikes. Does equal exposure to all the strikes make the portfolio's volatility sensitivity immune to stock price movement? Using some visuals the paper makes an intuitive argument that the options must be weighted in inverse proportion to the square of the strike. This fact is also proved mathematically in the appendix. However, one needs to go through the math to appreciate why a set of options weighted in this manner makes the portfolio's sensitivity to variance immune to stock price movement. The authors then show that an exposure to a set of options with varying strikes resembles a log contract, an exotic option that isn’t traded in the market. Having established the connection between variance swap payoff and log contract payoff, the note moves on to the actual math.
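The weighting argument can be checked numerically. What follows is a minimal sketch, not the note's own calculation: it assumes Black-Scholes with zero rates and dividends, a flat 20% volatility, three months to expiry and a dense strip of strikes from 20 to 300, all of which are my illustrative choices. It sums the sensitivity to variance across the strip under equal weights and under 1/K² weights, for three spot levels.

```python
import math

def variance_vega(spot, strike, vol, tau):
    """Black-Scholes sensitivity of an option price to variance (vol**2).

    Assumes zero interest rates and dividends; the value is the same
    for a call and a put.
    """
    d1 = (math.log(spot / strike) + 0.5 * vol**2 * tau) / (vol * math.sqrt(tau))
    pdf = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
    return spot * math.sqrt(tau) * pdf / (2.0 * vol)

def portfolio_variance_vega(spot, weight, vol=0.2, tau=0.25,
                            k_min=20.0, k_max=300.0, dk=0.5):
    """Sum of weight(K) * variance_vega over an evenly spaced strip of strikes."""
    n = int(round((k_max - k_min) / dk))
    total = 0.0
    for i in range(n + 1):
        k = k_min + i * dk
        total += weight(k) * variance_vega(spot, k, vol, tau) * dk
    return total

spots = [80.0, 100.0, 120.0]
equal = [portfolio_variance_vega(s, lambda k: 1.0) for s in spots]
inv_sq = [portfolio_variance_vega(s, lambda k: 1.0 / k**2) for s in spots]

for s, e, w in zip(spots, equal, inv_sq):
    print(f"spot={s:6.1f}  equal-weight={e:10.2f}  1/K^2-weight={w:.6f}")
```

Under these assumptions the equally weighted strip's exposure grows roughly with the square of the spot, while the 1/K² strip's exposure stays flat as the spot moves.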
What’s the math behind obtaining the fair value of a variance swap?
Well, the math is not as daunting as I expected. In just a few steps it is extremely clear that the difference between the SDE that drives a log Forward contract and a SDE of a certain exposure in the forward contract gives a pure exposure to the realized variance. To put it in simple words, a log contract hedged by a static replication involving forward contract gives the trader exposure to pure realized variance devoid of any stock movement contamination.
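For reference, the few steps can be sketched as follows. The notation (S_t for the price, μ for the drift, σ_t for the instantaneous volatility, T for the horizon) is my own shorthand and not necessarily the note's; the second line is just Ito's lemma applied to the log of the price.

```latex
\begin{aligned}
\frac{dS_t}{S_t} &= \mu\,dt + \sigma_t\,dW_t \\
d(\ln S_t) &= \left(\mu - \tfrac{1}{2}\sigma_t^2\right)dt + \sigma_t\,dW_t \\
\frac{dS_t}{S_t} - d(\ln S_t) &= \tfrac{1}{2}\sigma_t^2\,dt \\
\frac{1}{T}\int_0^T \sigma_t^2\,dt &= \frac{2}{T}\left[\int_0^T \frac{dS_t}{S_t} - \ln\frac{S_T}{S_0}\right]
\end{aligned}
```

The first term in the bracket is a continuously rebalanced position of 1/S_t shares, and the second is a short log contract. The drift and the Brownian term cancel in the difference, which is why only the variance survives.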
How to replicate the hedge in the real world ?
Once the expression for the expected value of a variance swap is derived, the rest of the details revolve around applying the fair value of the variance swap to the real world where one has to deal with the following aspects among many:
All the above aspects are discussed at great length. For volatility skew, the note also gives a closed-form formula for the fair value, depending on whether one assumes the skew to vary linearly with strike or with Black-Scholes delta. The final section of the paper deals with pricing a volatility swap from a variance swap. Naïve pricing of a volatility swap from a variance swap will result in incorrect pricing. The note concludes by saying that a healthy variance swaps market is needed to price and value volatility swaps, and that this requires an arbitrage free stochastic evolution of the volatility surface.
This paper is a great paper to learn many aspects of math finance. Pricing, valuation, replicating in the real world, dealing with real world hedging issues, etc. are all discussed in the context of a variance swap.
Posted at 11:37 PM in Finance | Permalink | Comments (0)
The article titled “The Log Contract” is a 20 year old article. It was the first article that made a case for a new instrument to hedge volatility. There is something nice about papers written in the old times. The authors give a healthy intuition about the stuff they are about to explain in the paper, use simple equations that do not require too much “head banging”, and at the end of it the reader pretty much gets the gist of the paper. Such papers are rare nowadays. In today’s world, pick any finance/stats/quant paper: there are at least two dozen heavy references given in the appendix and substantial preparation is needed to understand the key idea of the paper. Very rarely is a paper self-contained. Maybe that’s the way it is supposed to be.
Coming back to this paper…
The author, Anthony Neuberger, shows that a delta hedged option position is not completely risk free. A delta hedged portfolio might make money or lose money based on the difference between the volatility used for hedging and the actual volatility seen over the life of the option. Hence traders need an instrument to hedge pure volatility. An empirical observation made in the paper is that 80% of the hedging error that remains after delta-hedging an option is on account of an incorrect forecast of the volatility over the life of the option. Hence an option writer will need some instrument to hedge the vol. What about a trader who just wants to go long or short volatility? Well, one can go long or short a delta hedged portfolio. However this does not give a complete exposure to the volatility. Hence the need for an instrument that gives a direct exposure to the realized volatility.
This article talks about a specific kind of contract called the log contract. The contract is a futures-style contract that is tied to a conventional futures contract. If the conventional futures settlement price at expiration is F_T, then the settlement price of the log contract is log(F_T). What’s the advantage of having such a log contract? It can be seen that by forming a long short portfolio of the futures and log futures contracts, a trader can get exposure to a payoff that depends only on the realized volatility and not on the hedger’s forecast of volatility.
The greatest advantage of the log contract is its simplicity in providing a pure play volatility trade. The contract keeps its sensitivity to volatility whatever the asset price. This idea was later used in valuing a variance swap, which in turn led to the CBOE VIX index. VIX is basically an interpolation between fair values of near month and mid month variance swaps.
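The long short portfolio can be illustrated with a small simulation. This is a minimal sketch under my own assumptions, not the article's construction: daily rebalancing, no margin interest or transaction costs, Gaussian daily log returns, and a volatility that triples halfway through the year (which the hedger never needs to forecast). The function names are hypothetical.

```python
import math
import random

def simulate_log_returns(n_steps, vols, seed=0):
    """Draw one Gaussian log return per step with per-step volatility vols[i]."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, vols[i]) for i in range(n_steps)]

def hedged_log_contract_pnl(log_returns, f0=100.0):
    """P&L of holding 1/F futures (rebalanced every step) against a short log contract."""
    f = f0
    futures_leg = 0.0
    for r in log_returns:
        f_next = f * math.exp(r)
        futures_leg += (f_next - f) / f   # gain on 1/F futures over this step
        f = f_next
    log_leg = math.log(f / f0)            # change in the log contract's settlement value
    return futures_leg - log_leg

# Volatility regime shifts from 1% to 3% per day halfway through (toy numbers).
vols = [0.01] * 125 + [0.03] * 125
returns = simulate_log_returns(250, vols)
pnl = hedged_log_contract_pnl(returns)
half_realized_var = 0.5 * sum(r * r for r in returns)
print(pnl, half_realized_var)
```

The two printed numbers agree closely: the hedged position pays roughly half the realized variance of the path. The small gap comes from rebalancing daily rather than continuously.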
Posted at 11:07 PM in Finance | Permalink | Comments (0)
The paper titled “Liquidity considerations in estimating implied volatility”, by Susan Thomas and Rohini Grover, is about a new way of constructing a volatility index that weights the implied volatilities of options by the relative spreads at various strikes. The key idea behind the paper is that there is considerable liquidity asymmetry across strikes for the near month and mid month contracts on NIFTY options. This leads the authors to hypothesize a measure based on spread-weighted implied volatilities. The other indices discussed in the paper are VXO, VVIX and EVIX. The obvious question is, “How should these indices be evaluated, given that volatility is unobserved?”
The authors use 10 min squared returns to compute the daily realized volatility and use it as yard stick for comparing the VXO, VVIX, EVIX and the hypothesized SVIX(spread weighted VIX). How’s the performance of these estimators measured ?
Based on the above performance measures, the authors conclude that there is a case for SVIX as a better volatility index as compared to VXO, VVIX and EVIX.
Some of the questions that come to my mind as I finish reading this paper are:
Someday I will get to figuring out the answers for the above questions.
One thing I must say is this: I found the paper extremely well written in terms of the logical sequence of arguments used for making a case for SVIX.
Posted at 01:53 AM in Econometrics, Finance | Permalink | Comments (0)
The paper, “Determining the best forecasting models”, is about testing 55 models that belong to the GARCH family. If we have just one model and a straw model, it is easy to show, using some statistic on the test sample, that the hypothesized model is superior. How do we go about testing a set of competing models? There are many wonderful techniques in the Bayesian world. However this paper is more frequentist in nature. It uses a method called “Model Confidence Set” for deciding the best forecasting models. This is akin to forming a confidence interval for a parameter rather than a point estimate. What’s the advantage of Model Confidence Set ?
What does the MCA procedure involve?
Let’s say you have four models with different specifications. The objective is to obtain the best three model set, the best two model set and the best model. Why does one need sets of varying cardinality? More often than not we have a set of competing models where no particular model outperforms every other model. We might be looking for a set of models that perform more or less equally well. MCA puts a framework around it in the following way:
Coming back to this paper, the author uses the above methodology to test 55 volatility models. The advantage with using this method is that it gives a probability for each model in each model set. Thus one gets a quantitative measure of how each model is stacked against each other in various model sets.
What are the results of applying MCA to 55 models?
I think MCA is an extremely useful method to compare a set of models. Several questions come to my mind, right away :
Posted at 10:33 PM in Econometrics, Finance, Statistics | Permalink | Comments (0)
The paper titled, “A Simple Long Memory Model of Realized Volatility”, is one of the most cited papers in the area of long memory volatility models.
One typically assumes that log prices follow an arithmetic random walk. In this kind of set up, it has been shown in previous research that the integrated volatility of Brownian motion can be approximated to arbitrary precision using the sum of intraday squared returns. In fact this statement applies to a more general class of stochastic processes, finite mean semimartingales (which include Ito processes, pure jump processes and jump diffusion processes). The sum of intraday squared returns, “realized volatility”, is a nonparametric measure that asymptotically converges to the integrated volatility as the sampling frequency increases. The flip side to this utopian scenario is the microstructure noise that one needs to contend with as the time scales become finer. Noise introduces a significant bias in the RV estimate. Hence one has to make a tradeoff between measurement error and bias. In many papers, researchers have used intervals of anywhere between 15 min and 30 min, having observed that this is the shortest return interval at which the resulting volatility is still not biased. Another approach to dealing with microstructure noise is to filter away the noise using the autocovariance structure of k tick aggregated returns. One might have to search a grid to get an optimal k to begin with. In an earlier paper by Corsi, one such filtering method is described.
The author uses 12 years of FX tick data and finds the following four patterns throughout:
Why another volatility model? Is GARCH(1,1), the emblematic volatility model, not enough? Can’t we use some variant of the GARCH family?
The obvious question is what does it take to build a model that has the characteristics of multifractal behavior + long memory+ fat tails ? The strict answer to this comes from physicists who are of the view that only multiplicative processes lead to multifractal processes. Any multiplicative process will be difficult to identify and estimate. What’s the way out? The author takes a view that the long memory and the multiscaling features observed in the data could also be only an apparent behavior generated from a process which is not really long memory or multiscaling. This is what motivated him to come up with an additive model that captures the stylized facts present in the dataset.
What’s the basic idea behind the model introduced in the paper?
The author banks on the “Heterogeneous Market Hypothesis”, which says that market presence increases volatility. In a market with homogeneous agents, a bigger set of agents should make prices converge to some mean price. Reality is something else. The bigger the market presence, the higher the volatility tends to be. One can think of markets populated with three kinds of agents: intraday agents, mid frequency trading agents and low frequency trading agents. This means that a volatility model needs to take into consideration at least three time scales. The HARCH process is one that incorporates this kind of behavior. It belongs to the wide ARCH family but differs from other ARCH-type processes in the unique property of considering squared returns aggregated over different intervals. One stylized fact that HARCH shows is that longer time interval volatility has an influence on shorter time intervals. The paper builds on the HARCH model and introduces a cascade of models across three time scales: daily, weekly and monthly volatility processes. There is one equation for each time scale, and the three equations are related in such a way that monthly volatility feeds into weekly volatility, which in turn feeds into daily volatility.
The economic interpretation is that each volatility component in the cascade corresponds to a market component which forms expectation for the next period volatility based on the observation of the current realized volatility and on the expectation for the longer horizon volatility. The daily volatility equation can be seen as a three factor stochastic volatility model where the factors are directly the past realized volatilities viewed at different frequency.
The author then hypothesizes that the RV computed on the filtered returns on a daily basis can act as a proxy for the LHS in the above equation. This turns the entire equation into a simple time series representation of the cascade model that is easy to estimate. The author terms the above model HAR-RV, the Heterogeneous Autoregressive model for realized volatility. The author simulates paths from the above model and shows that the behavior matches the observed data in the following aspects
The author uses the standard OLS method with Newey-West covariance correction for estimating the model parameters. A comparative study of performance against a set of standard volatility models shows that HAR-RV manages to give a reasonably good out of sample forecasting performance. Simplicity of structure and estimation is definitely an appealing aspect of the HAR-RV model.
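To make the estimation step concrete, here is a minimal sketch of the usual HAR-RV regression: next-day RV regressed on a constant, today's RV, the trailing 5-day average and the trailing 22-day average. The window lengths (5 and 22), the toy coefficients and the simulated series are my assumptions, the helper names are hypothetical, and the Newey-West correction for standard errors is left out since only point estimates are shown. The simulated toy series can in principle dip below zero, unlike real RV.

```python
import random

def rolling_mean(series, window):
    """Trailing mean over `window` observations; None until enough history."""
    out = [None] * len(series)
    running = 0.0
    for i, x in enumerate(series):
        running += x
        if i >= window:
            running -= series[i - window]
        if i >= window - 1:
            out[i] = running / window
    return out

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_har(rv):
    """OLS point estimates [c, b_daily, b_weekly, b_monthly] via the normal equations."""
    weekly = rolling_mean(rv, 5)
    monthly = rolling_mean(rv, 22)
    rows, targets = [], []
    for t in range(21, len(rv) - 1):
        rows.append([1.0, rv[t], weekly[t], monthly[t]])
        targets.append(rv[t + 1])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)

def simulate_har(n, c, b_d, b_w, b_m, noise_sd, seed=1):
    """Toy series generated from the same three-component structure."""
    rng = random.Random(seed)
    rv = [c / (1.0 - b_d - b_w - b_m)] * 22   # start at the unconditional mean
    for _ in range(n):
        weekly = sum(rv[-5:]) / 5.0
        monthly = sum(rv[-22:]) / 22.0
        rv.append(c + b_d * rv[-1] + b_w * weekly + b_m * monthly
                  + rng.gauss(0.0, noise_sd))
    return rv

rv = simulate_har(20000, c=0.1, b_d=0.4, b_w=0.3, b_m=0.2, noise_sd=0.05)
coef = fit_har(rv)
print([round(b, 3) for b in coef])
```

On the simulated series the fit recovers coefficients close to the ones used to generate it. With real data one would replace `rv` with the daily realized volatility series and add robust standard errors.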
Posted at 10:49 AM in Econometrics, Finance | Permalink | Comments (0)
The paper titled, Efficient Estimation of Volatility using High Frequency Data, is about testing a set of volatility estimators using high frequency data. I will attempt to briefly summarize the paper.
For a person working in the financial markets, there is not a day that goes by without hearing the word “volatility”. Yet it is something that is not observed. If you assume that stocks follow some random process like a GBM, then the relevant question to ask is, “How does one estimate the diffusion parameter/process in the model?” One of the principles from classical statistics, minimal sufficient statistics, says that every increment in the price process is needed for estimating the volatility. This means that any discrete sampling implies loss of information.
One of the standard ways to measure daily volatility is via realized volatility: chop the day into tiny intervals, compute the squared return for each interval, add them up over the entire trading day, and you get an estimate of daily volatility. This formula seems straightforward, yet there is something problematic with it. As you increase the sampling frequency, microstructure noise comes into the picture. This noise component is in addition to the usual bid-ask bounce. Bid-ask bounce can be easily removed from the data by taking the midpoint of quotes. The remaining noise component is termed the incoherent price process. Zhou’s paper takes this noise process into account and models the log price process as a combination of Brownian motion and an i.i.d. process. In this kind of setup, Zhou suggests an estimator that is simple enough to implement. Another problem with realized volatility computed from squared returns over finer intervals is that the estimator is strongly biased upwards. To achieve unbiasedness, the lower bound for the time between observations is of the order of 30 min, which means throwing away most of the high frequency data.
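The upward bias is easy to reproduce on simulated data. The sketch below assumes one observation per second over a 6.5 hour day (23,400 ticks), a 1% daily volatility and i.i.d. Gaussian noise of one basis point on the observed log price; all of these numbers are mine. The last function adds twice the first-order autocovariance of returns to the naive sum, a correction in the spirit of Zhou's estimator rather than a faithful reproduction of it.

```python
import math
import random

def simulate_day(n_ticks=23400, daily_vol=0.01, noise_sd=1e-4, seed=42):
    """Return (true log price path, noisy observed path) for one trading day."""
    rng = random.Random(seed)
    step_sd = daily_vol / math.sqrt(n_ticks)
    true_log_price = [0.0]
    for _ in range(n_ticks):
        true_log_price.append(true_log_price[-1] + rng.gauss(0.0, step_sd))
    observed = [p + rng.gauss(0.0, noise_sd) for p in true_log_price]
    return true_log_price, observed

def realized_variance(log_prices, every=1):
    """Sum of squared returns, sampling every `every`-th observation."""
    sampled = log_prices[::every]
    return sum((b - a) ** 2 for a, b in zip(sampled, sampled[1:]))

def autocov_corrected_variance(log_prices):
    """Naive sum of squares plus twice the first-order autocovariance term."""
    r = [b - a for a, b in zip(log_prices, log_prices[1:])]
    return sum(x * x for x in r) + 2.0 * sum(x * y for x, y in zip(r, r[1:]))

true_path, observed = simulate_day()
true_var = realized_variance(true_path)          # noise-free benchmark
naive = realized_variance(observed)              # every tick, noise included
sparse = realized_variance(observed, every=300)  # one observation every 5 minutes
corrected = autocov_corrected_variance(observed)

print(f"benchmark {true_var:.2e}  naive {naive:.2e}  "
      f"sparse {sparse:.2e}  corrected {corrected:.2e}")
```

The naive tick-by-tick estimate comes out several times larger than the benchmark, sparse sampling removes most of the bias at the cost of discarding data, and the autocovariance correction uses every tick while staying close to the benchmark.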
What’s this paper about?
The paper is mainly about computing a set of estimators that include tweaks of Zhou’s estimators in “tick” time rather than “homogenous” time and then comparing the performance of all these estimators on simulated data. What are the estimators considered in the paper?
Data from a constant volatility random walk and from a GARCH(1,1) process are simulated, and each of the above estimators is tested for its forecasting effectiveness. The paper goes into extreme detail in stating the various issues that come up in such a back testing exercise. Extremely valuable for anyone working with HFD.
What are the conclusions of the tests ?
The overall conclusion of the paper is that Zhou’s estimator applied to the filtered time series is the best available estimator. However the authors conclude the paper with a word of caution:
The optimal choice of the volatility estimator is still an open problem !
Posted at 11:04 AM in Econometrics, Finance | Permalink | Comments (0)
The paper titled, “Consistent High-Precision Volatility from High Frequency Data”, talks about the trade off that one needs to make while considering sampling frequency. If you increase sampling frequency, the measurement error goes down but microstructure noise increases. If you decrease sampling frequency, the microstructure noise decreases but the measurement error goes up.
Researchers in the past have suggested the usage of 10min/20min/xmin intervals based on some visual tools that have fancy names such as volatility signature plots. The authors of the paper argue that there is a flaw in using such tools that work on homogenized time series. If such tools are used in tick time, the sampling frequency becomes so low that it discards most of the HFD.
To address this problem differently, the authors suggest a direct way to handle the bias arising out of microstructure noise. For the FX markets, the authors model the price process as a combination of Brownian motion and a random i.i.d. noise process. They recommend a procedure whereby an EWMA operator is used to reduce the microstructure noise, resulting in a filtered time series that can be conveniently used for volatility estimation. The EWMA operator is useful for removing noise in FX data, whereas a more complicated method needs to be adopted for removing noise from equity data.
The basic idea of the paper is to provide a method to filter away the noise at high frequency scale so that the resulting series can be used to get a high precision volatility estimate.
Posted at 02:39 AM in Econometrics, Finance | Permalink | Comments (0)
The paper titled, Regression analysis with many specifications, uses stationary bootstrap method to evaluate a large set of models.
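For readers who have not come across it, the stationary bootstrap (due to Politis and Romano) resamples a time series in blocks of random, geometrically distributed length so that the resampled series stays stationary. A minimal sketch of the index generation is below; the function name and the mean block length of 20 are my own illustrative choices, not something taken from the paper.

```python
import random

def stationary_bootstrap_indices(n, mean_block_length, rng):
    """Indices for one stationary-bootstrap resample of a series of length n.

    At each step a new block starts at a random position with probability
    1/mean_block_length; otherwise the current block continues with the next
    observation, wrapping around at the end of the series.
    """
    p = 1.0 / mean_block_length
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p:
            idx.append(rng.randrange(n))       # start a new block
        else:
            idx.append((idx[-1] + 1) % n)      # continue the current block
    return idx

rng = random.Random(7)
series = [0.01 * i for i in range(1000)]       # placeholder data
indices = stationary_bootstrap_indices(len(series), 20, rng)
resample = [series[i] for i in indices]
```

Each resample keeps short-range dependence intact inside the blocks, which is the reason to prefer it over an i.i.d. bootstrap when the regressors are lagged time series.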
In a typical data mining set up, the problem of choosing the number of covariates can be handled in many ways such as
In each of the above cases, the assumption is that covariates are independent. In a time series setting where regressors are lagged time series, the assumption of independence is obviously weak. Hence these methods might have limited applicability. Having said that, I guess that there are researchers working in ML/DM areas who are trying to refine methods so that the broad range of techniques available in the data mining field could be applied in econometrics. OK, now coming back to the paper,
Let’s say you have a dependent variable and a set of 101 covariates you want to regress it on. You somehow feel that at most three regressors should be enough. At the same time you know that a specific variable, call it X1, out of the 101 variables must always appear in the model. So, the selection boils down to 100 variables. How do you go about selecting the number of regressors to include? Should it be two or three regressors? Given the number of regressors, which regressors should they be?
This paper by PR Hansen is a nice paper that gives the reader a basic understanding of model selection in a univariate time series setting. The methodology followed in the paper is as follows :
Coming back to our problem of model selection, there could be “100 \choose 1” two variable models and “100 \choose 2” three variable models. To begin with we can run the above 6 step procedure for the “100 \choose 1” two variable models and check whether the maximal R squared statistic is significant. Similarly one can run the above 6 step procedure for the “100 \choose 2” three variable models.
A variant of this procedure can be used to select a subset of volatility models that can be considered best performing among a large set of volatility models.
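To get a feel for the size of the search, the candidate specifications can be enumerated directly. A small sketch, with hypothetical variable names X1 to X101 and X1 forced into every model:

```python
import math
from itertools import combinations

# X1 is always included; X2..X101 are the 100 optional regressors (names are illustrative).
candidates = [f"X{i}" for i in range(2, 102)]

two_variable_models = [("X1", c) for c in candidates]
three_variable_models = [("X1",) + pair for pair in combinations(candidates, 2)]

print(len(two_variable_models), math.comb(100, 1))    # 100 100
print(len(three_variable_models), math.comb(100, 2))  # 4950 4950
```

So the maximal R squared statistic is a maximum over 100 specifications in the two variable case and over 4950 specifications in the three variable case, which is why its significance cannot be judged against the usual single-model critical values.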
Posted at 02:27 AM in Econometrics, Statistics | Permalink | Comments (0)
Geeta Ma’am & Kathalaya find a new audience : India Inc
Posted at 11:44 PM in Storytelling | Permalink | Comments (2)
Via efincareers :
Quant traders working in investment banking are not happy. Squeezed by regulations that curb investment banks’ prop-trading activities and by cost-cutting that means that pre-crisis compensation packages have been consigned to history, job satisfaction is at an all-time low, according to industry observers.
Quantitative PhDs who would have usually gravitated towards high-paying roles in the financial sector are looking for alternative career paths, while those already working in banking are seeking to move on.
Where are they going? The obvious answer is to strike it out alone. The fact that 60 quant traders from Barclays’ Quants division are kick-starting their own hedge fund is part of a broader trend of Wall Street’s rocket scientists looking for riches in smaller operations. Mark Standish, the former co-head of capital markets at RBC, and Richard Tavoso, who oversaw global arbitrage and trading at the firm, are also starting Taursa Capital Partners.
“Startup costs for a quant manager or team to go independent are falling very rapidly,” says John Fawcett, founder and CEO of quant community Quantopian. “That decrease is driven by the rapid adoption of lower cost technology services and access to open-source analytics and computing packages. At the same time, quants have less compelling incentives at large banks.”
The Barclays and RBC traders had one thing in common, though – a strong and demonstrable track record, which would prove attractive to potential investors. Fawcett believes that aspiring quants straight out of university will also increasingly look to go it alone, using online portals (like Quantopian) to match quant talent with investors with a desire to gain exposure to these strategies.
“Historically, this group would be the most likely to pursue a standard investment banking career track – but today that talent is looking more and more favourably on Silicon Valley, and more critically at Wall Street,” he says. “They see peers with the same computer science degrees who go into the software/internet space taking big risks, being more entrepreneurial and enjoying a more independent lifestyle as a result. I think the next generation of quants is going to be even more influenced by these.”
This, of course, seems like a big risk for a quant with little or no financial services experience. However, this is not the only alternative career path that quants are taking instead of going into the relative safety of a high-paying banking job.
“A lot of PhDs are taking jobs as quantitative researchers at large tech firms like Google and Amazon, and getting exposure to cool technologies that have yet to reach the financial sector,” says James Kennedy, head of the quant and trading practice at NJF Search. “They’re then moving to hedge funds to help them make the most of big data opportunities, and this switch usually means a lot more money.”
The key to earning big bucks at a hedge fund is retaining control of your intellectual property – owning the algorithms and code is in line with the more entrepreneurial spirit of today’s quants, says Fawcett: “I think we will see quants start to push to retain more ownership over their own intellectual property and performance. At Quantopian we are betting on this trend to drive more demand for hosted backtesting and trading platforms.”
Alternative career options
The cost-cutting at investment banks is forcing quants out of the industry and is acting as a “recruiting sergeant” for high frequency trading firms that are actively hiring, according to Jon Gilbert, head of the technology and quant practice at recruiters Astbury Marsden.
However, this isn’t the limit of career options. Hedge funds that employ high frequency trading strategies are also increasingly looking to hire quant traders to research, design and implement their own medium frequency trading strategies. This is providing an increasing number of employment opportunities for quants looking to leave banking, but who still fear striking out on their own, says Kennedy.
“Traders are particularly drawn to the smaller trading houses where they have greater freedom to develop their own trading strategies and quickly make a name for themselves,” adds Gilbert.
All of this is not to say that the investment banks don’t want to hire quants. The long-term future of quant traders in investment banks is increasingly shaky as firms are forced to retreat from prop trading activities, but banks are hiring in other areas, says Kennedy.
“We’re seeing a lot of roles for quants within the investment banks around model validation, largely as a result of regulations like the Comprehensive Capital Analysis Review (CCAR) and a lot of PhDs are moving into data scientist positions within financial services,” he says. “Ultimately, quants are not going to make the sort of money they made pre-2007, but the jobs in banking are still relatively well-paid.”
Posted at 12:43 AM in Finance | Permalink | Comments (0)