Steven Shreve’s books on Stochastic calculus (Volume I + Volume II) are amazing in terms of breadth. Basic intuition is built in Volume I using a discrete-time binomial asset pricing model. In Volume II, the author introduces all the concepts needed to build a financial model in continuous-time. In this post, I will try to summarize a few points from Volume II.

__Chapter 1: Introduction__

The most important mind shift that one needs to make when moving from the discrete-time case to the continuous-time case is that of an “uncountable outcome space”. This means that an intuitive understanding of probability is not enough; one needs a decent understanding of measure theory. The first and second chapters of the book serve as a crash course in measure theory.

Chapter 1 starts off by discussing two examples where the outcome space is uncountable. These examples show how one can create the event space from the outcome space. Once the event space is created, a measure is attached to it so that one moves from a measurable space to a measure space. In an uncountable outcome space, probabilities are assigned to sets rather than to individual outcomes, so one needs to work with sets. Some examples highlight the need for a firm grasp of set theory: many complicated sets have a probability but cannot be described explicitly, and set theory helps express the event of interest as a combination of simpler sets whose probability is known. The key probability concepts covered in Chapter 1 are random variables, the distribution measure, the CDF, the condition for equivalence of the Riemann and Lebesgue integrals, the Monotone convergence theorem, the Dominated convergence theorem, the Law of the Unconscious Statistician (LOTUS), equivalent measures and change of measure. From a math-fin perspective, the most valuable section is the one on change of measure.

Enough explanation is given so that the reader understands that it is necessary to "separate the process from the measure". A random variable on a given outcome space will have different distributions depending on which measure is applied. This concept is made specific with examples such as converting a normal random variable into a standard normal random variable using a change of measure. A key tool for measure change is the Radon Nikodym derivative. To get an intuition behind this tool, I think it's better to review Volume I and then follow the content from this chapter.
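The normal-to-standard-normal example can be checked numerically. The sketch below is my own illustration, not code from the book: it assumes X is standard normal under P, sets Y = X + theta, and uses Z = exp(-theta X - theta^2/2) as the Radon Nikodym derivative. The function name, the value of `theta`, the sample size and the seed are all arbitrary choices.

```python
import math
import random

def reweighted_moments(theta=1.0, n=200_000, seed=0):
    """Estimate E~[1], E~[Y], E~[Y^2] under the new measure as E[Z * .] under P."""
    rng = random.Random(seed)
    s0 = s1 = s2 = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                       # X ~ N(0, 1) under P
        y = x + theta                                 # Y ~ N(theta, 1) under P
        z = math.exp(-theta * x - 0.5 * theta ** 2)   # assumed Radon Nikodym derivative
        s0 += z
        s1 += z * y
        s2 += z * y * y
    return s0 / n, s1 / n, s2 / n

total_mass, mean_y, second_moment_y = reweighted_moments()
print(total_mass, mean_y, second_moment_y)
```

If the reweighting works, the three estimates should be close to 1, 0 and 1: Z has mean one, and Y has the first two moments of a standard normal under the new measure, even though the samples were drawn under P.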

__Chapter 2: Information and Conditioning__

It is very important to get the intuition right about the concepts such as sigma algebra generated by a variable, filtration, adapted stochastic process and conditional expectation. It is hard to appreciate these objects in an uncountable outcome space without seeing how they behave in a countable or in a simplified outcome space.

The chapter starts off with a three coin toss outcome world and gives the reader a good intuition about the sets of a sigma algebra. By using phrases such as "sets are resolved by the information", the author gives the reader a good idea of what a filtration means. In a typical undergrad setting, one does not need concepts such as sigma algebras, as the outcome space is well known and you are trying to estimate something about a random variable that is defined on the entire outcome space. However, things become murkier in the real world, where you have random variables defined on partial information. The random variables themselves generate sigma algebras and you need to be comfortable working with them. Filtrations are key math objects that appear in defining an adapted stochastic process. The beauty of this chapter, as mentioned earlier, is that these concepts are explained using a three coin toss outcome space. You can clearly see the connections between the various concepts.
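The three coin toss space is small enough to enumerate, so the "resolved by information" idea can be made concrete. The following is a minimal sketch of my own; the binomial stock parameters (`s0=4`, `u=2`, `d=1/2`) and the fair coin are assumptions chosen for illustration. It groups outcomes into the atoms of the sigma algebra after k tosses and computes a conditional expectation as an average over each atom.

```python
from fractions import Fraction
from itertools import product

OMEGA = ["".join(w) for w in product("HT", repeat=3)]  # the 8 outcomes

def atoms(k):
    """Atoms of F_k: outcomes grouped by their first k tosses."""
    groups = {}
    for w in OMEGA:
        groups.setdefault(w[:k], []).append(w)
    return groups

def stock(w, s0=4, u=2, d=Fraction(1, 2)):
    """Hypothetical binomial stock price after the tosses in w."""
    s = Fraction(s0)
    for toss in w:
        s *= u if toss == "H" else d
    return s

def cond_exp(x, k, p=Fraction(1, 2)):
    """E[X | F_k] as a function on OMEGA, constant on each atom of F_k."""
    def prob(w):
        heads = w.count("H")
        return p ** heads * (1 - p) ** (3 - heads)
    result = {}
    for members in atoms(k).values():
        mass = sum(prob(w) for w in members)
        average = sum(x(w) * prob(w) for w in members) / mass
        for w in members:
            result[w] = average
    return result

print(atoms(1))             # two atoms: outcomes starting with H, and with T
print(cond_exp(stock, 2))   # E[S_3 | F_2], constant on each of the four atoms
```

By construction the conditional expectation takes one value per atom, which is exactly the sense in which it "depends only on the information available", and averaging it over any atom reproduces the average of the original variable there.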

The chapter explains the principles of independence, conditional expectation, the Markov property and martingales. I liked the way conditional expectation is explained in the chapter. Personally, I have always found conditional expectation to be the toughest concept in probability theory, maybe because one needs to guess the random variable and there is no well-defined way to go about guessing. All you have is that the conditional expectation should satisfy two properties, one of which is "partial averaging". One must guess the random variable so that it satisfies the "partial averaging" property. This chapter lists all the necessary properties of conditional expectation.

__Chapter 3: Brownian motion__

The chapter starts off with a section on symmetric random walk and lists the properties of the object such as independence of increments, it being a martingale etc. It then uses a scaled version of symmetric random walk to illustrate the concept of quadratic variation. The intention behind introducing scaled random walk is that it converges in distribution to Brownian motion and thus is a nice way to look at a discrete process that can generate Brownian motion.
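A short simulation makes the quadratic variation claim tangible. This is a sketch under my own naming: `scaled_walk` builds the scaled symmetric random walk with steps of size 1/sqrt(n), and `quadratic_variation` sums the squared increments along the path.

```python
import math
import random

def scaled_walk(n, t, seed=0):
    """Path of the scaled symmetric random walk on the grid 0, 1/n, ..., t."""
    rng = random.Random(seed)
    steps = int(n * t)
    path = [0.0]
    for _ in range(steps):
        path.append(path[-1] + rng.choice((-1, 1)) / math.sqrt(n))
    return path

def quadratic_variation(path):
    """Sum of squared increments along the path."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))

print(quadratic_variation(scaled_walk(n=100, t=4)))
```

Every increment is plus or minus 1/sqrt(n), so each squared increment is 1/n and the sum over n*t steps is t (up to floating point rounding) regardless of the seed. The quadratic variation is computed path by path, not as an average over paths.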

To keep continuity with Volume I of the book, where the binomial asset pricing model is dealt with, the chapter uses the limit of a binomial process to illustrate the log normal distribution, the most common assumption for the distribution of stock prices. Given this prelude, Brownian motion is formally defined and properties of univariate and multivariate Brownian motion are given. The fact that Brownian motion has a Gaussian distribution at its core gives the flexibility of defining it in terms of moment generating functions, in terms of a mean and covariance matrix, or in terms of independent increments and their distributions. The section on quadratic variation explores the peculiar behavior of Brownian motion as compared to smoother random processes. Brownian paths are nowhere differentiable and accumulate nonzero quadratic variation. This property is the reason why one must learn Ito's calculus. There is a little section that shows that, by assuming GBM, one can compute the volatility of the asset from a single sample path. Brownian motion is a flexible object because it is a martingale as well as a Markov process. The Markov property is especially useful for computing the expectation of a variety of functions that depend on the Brownian motion.
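The volatility-from-a-sample-path remark can be sketched as follows. This is an illustration under assumed parameters, not the book's code: it simulates one GBM path on a fine grid and recovers sigma from the sum of squared log returns, which approximates sigma^2 times the elapsed time.

```python
import math
import random

def simulate_gbm(s0, mu, sigma, t, n, seed=0):
    """One GBM path on n equal steps over [0, t], using the exact lognormal step."""
    rng = random.Random(seed)
    dt = t / n
    path = [s0]
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        path.append(path[-1] * math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z))
    return path

def realized_volatility(path, t):
    """sqrt(sum of squared log returns / t), an estimate of sigma."""
    log_returns = [math.log(b / a) for a, b in zip(path, path[1:])]
    return math.sqrt(sum(r * r for r in log_returns) / t)

path = simulate_gbm(s0=100.0, mu=0.1, sigma=0.3, t=1.0, n=50_000)
print(realized_volatility(path, 1.0))  # should be close to the assumed sigma of 0.3
```

The drift `mu` barely affects the estimate, because its contribution to each squared log return is of order dt squared.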

One learns the importance of first passage time in a discrete Markov chain setting where it can be used to classify states as null recurrent or positive recurrent state. In the case of Brownian motion, one can guess that it is a null recurrent process as it is nothing but a scaled symmetric random walk, where the latter is a null recurrent chain. This guess is made rigorous by introducing exponential martingale that contains a Brownian motion. One can use optional sampling theorem on this martingale and come up with the result that exponential martingale stopped at first passage time is still a martingale. This fact can be used in a beautiful way to compute the hitting time probability of Brownian motion and the transition density of the Brownian motion. These properties of the first passage time are rederived using reflection principle.
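For reference, the objects in this argument can be written out. The notation is my own shorthand for the standard results: $\tau_m$ denotes the first time the Brownian motion $W$ reaches level $m$, and $\sigma$ and $\alpha$ are positive constants.

$$
Z(t) = \exp\left\{\sigma W(t) - \tfrac{1}{2}\sigma^2 t\right\}, \qquad
\mathbb{E}\left[e^{-\alpha \tau_m}\right] = e^{-|m|\sqrt{2\alpha}} \quad (\alpha > 0).
$$

Letting $\alpha \downarrow 0$ gives $\mathbb{P}(\tau_m < \infty) = 1$, while differentiating with respect to $\alpha$ and then letting $\alpha \downarrow 0$ gives $\mathbb{E}[\tau_m] = \infty$, which is the continuous-time counterpart of null recurrence. The reflection principle gives the distribution directly: for $m > 0$,

$$
\mathbb{P}(\tau_m \le t) = 2\,\mathbb{P}(W(t) \ge m) = \frac{2}{\sqrt{2\pi}} \int_{m/\sqrt{t}}^{\infty} e^{-y^2/2}\,dy .
$$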

An introduction to any stochastic process must not overwhelm the reader, and this chapter gets the balance right. It gives the right amount of math to start working with the process. Thankfully, construction from first principles is left out. If you have worked on continuous-time Markov chains (CTMCs) with a discrete state space, you can appreciate the transition density concepts better. In a CTMC, you assume that the holding times are exponentially distributed with a parameter that depends on the state, and hence one can talk about transition probabilities. Since Brownian motion is a continuous-state, continuous-time process, you talk about a transition density instead of transition probabilities. The other thing that one needs to appreciate is the Markov property of Brownian motion. It is extremely useful for simulating various types of sample paths and computing various aspects of the sample path. For example, for an option that depends on the maximum of Brownian motion, one can use the Markov property to get the conditional density of the maximum of Brownian motion given the value of the Brownian motion at a specific time.

__Chapter 4: Stochastic Calculus__

This chapter introduces many concepts of stochastic calculus such as the Ito integral, Ito processes and Ito's lemma. The Ito integral is first defined for simple integrands, and its main properties such as mean, variance, quadratic variation and the martingale property are explored. Subsequently the Ito integral for general integrands is introduced and the following properties are explored:

- Continuity
- Adaptivity
- Linearity
- Martingale
- Ito's Isometry
- Quadratic variation

The chapter contains a thorough introduction to Ito's lemma for various stochastic processes. Numerous examples are given so that the reader is comfortable applying Ito's lemma to Brownian motion, functions of Brownian motion, Ito processes etc. The Black Scholes PDE is derived assuming that a replicating portfolio exists. This might be a little odd for someone who has not come across the replication argument. Why should there be a hedge? This is dealt with in the chapter on risk neutral pricing. In any case, once you assume a self-financing replicating portfolio, you can equate the stochastic components and the time components of the two SDEs to get the Black Scholes PDE. Levy's characterizations of univariate and multivariate Brownian motion are given. The chapter ends with a section on the Brownian bridge, which is useful for Monte Carlo simulation.

The takeaway from this chapter is that an Ito integral is constructed in three steps. The first step is to find a sequence of simple, non-anticipating (adapted) processes that converges to the integrand in the mean square sense. The second step is to form the discrete stochastic integral of each simple process. The final step is to take the limit of these discrete stochastic integrals to arrive at the Ito integral.
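The construction can be illustrated with the classic integral of W with respect to W. The sketch below is my own example with arbitrary grid size and seed. It freezes the integrand at the left endpoint of each interval, which is what "non-anticipating" means in practice, and compares the resulting sum with the closed form (W(T)^2 - T)/2 given by Ito's lemma.

```python
import math
import random

def ito_sum_w_dw(t=1.0, n=100_000, seed=0):
    """Left-endpoint sum approximating the Ito integral of W dW over [0, t]."""
    rng = random.Random(seed)
    dt = t / n
    w = 0.0
    total = 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))
        total += w * dw   # integrand uses W at the LEFT endpoint only
        w += dw
    return total, w

approx, w_t = ito_sum_w_dw()
print(approx, 0.5 * (w_t ** 2 - 1.0))  # the two numbers should nearly agree
```

The extra -T/2 relative to ordinary calculus comes from the quadratic variation of Brownian motion. Evaluating the integrand at the right endpoint or the midpoint would give a different limit.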

__Chapter 5 – Risk Neutral Pricing__

The section starts with the most important process, the Radon Nikodym derivative variable that is relevant to risk neutral pricing. This is denoted by Z and it plays a key role in changing the measure of a random variable. If you have a normal random variable with a constant mean, using Z, one can change the measure so that the variable is a standard normal under the new measure. From a computational perspective, Radon Nikodym derivative is used to swap between the real world measure and risk neutral measure for calculating the expectations. As far as changing the measure on a stochastic process, you need much more than a simple variable, you need a process to do that job. This is precisely done by manufacturing a Radon Nikodym process by conditioning on the filtration.

The chapter introduces the one-dimensional Girsanov theorem, which is very useful for changing the measure for a Brownian motion. For a process driven by Brownian motion, a change to an equivalent measure can only change the drift component. The volatility of the original process remains the same: volatility determines which price paths are possible, and a measure change does not alter the set of possible paths, only their likelihood. The old and new measures go under the name "equivalent measures". If the process of interest (here the discounted stock price) becomes a martingale under the new measure, the new measure is called an "equivalent martingale measure". Expressions for GBM and discounted GBM are given under the risk neutral measure. The chapter then discusses the Martingale representation theorem, which basically says that if you have two martingales with respect to the same measure, you can manufacture one from the other.
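In symbols, and using notation that I am assuming here ($\Theta$ for the adapted process that drives the measure change, $\alpha$ and $\sigma$ for the drift and volatility of the stock, $r$ for the interest rate), the one-dimensional statement reads:

$$
Z(t) = \exp\left\{-\int_0^t \Theta(u)\,dW(u) - \frac{1}{2}\int_0^t \Theta^2(u)\,du\right\}, \qquad
\widetilde{W}(t) = W(t) + \int_0^t \Theta(u)\,du ,
$$

where $\widetilde{W}$ is a Brownian motion under the measure $\widetilde{\mathbb{P}}$ defined by $d\widetilde{\mathbb{P}} = Z(T)\,d\mathbb{P}$. Choosing the market price of risk $\Theta = (\alpha - r)/\sigma$ turns

$$
dS(t) = \alpha S(t)\,dt + \sigma S(t)\,dW(t) \quad\text{into}\quad dS(t) = r S(t)\,dt + \sigma S(t)\,d\widetilde{W}(t),
$$

so only the drift changes and the volatility term is untouched.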

For some reason, I think this chapter should have had a clear description right at the beginning of the chapter, about the need for understanding measure change, martingale etc. I love the presentation in the book by Baxter and Rennie who give the three step procedure to find the option value, right at the very beginning :

- Find a measure under which the discounted stock price is a martingale. This is where one can use Girsanov's theorem
- Form a martingale process involving the claim of the relevant derivative
- Use Martingale representation theorem to guarantee a self replicating portfolio

The discounted value of a portfolio holding the stock and the money market account is a martingale under the risk neutral measure. Since the discounted stock price is a martingale, you can manufacture a replicating portfolio using the Martingale representation theorem. One way to understand the Martingale representation theorem is

- In the big bad world, there is P measure
- You can use Girsanov's theorem to change to an equivalent measure. One can use the theorem to see to it that the discounted stock price is a martingale
- Once you are in this world where discounted stock price is a martingale, it means that the discounted wealth equation of long stock and long money markets is also a Martingale
- You can manufacture a process from the claim by conditioning on the sigma algebra.
- You have now two martingales, one from the claim process, and one from the wealth equation.
- You can use Martingale representation theorem to manufacture both the above processes from discounted stock price equation. Why? Both processes are martingales and hence you can merely scale and shift the discounted price martingale to manufacture other martingales
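The role of the measure change in the list above can be checked numerically. In this sketch, all parameter values and function names are my own assumptions. The stock is simulated under the real world measure P with drift `alpha`, each discounted payoff is weighted by the Radon Nikodym derivative Z(T), and the result is compared with the Black Scholes closed form, which does not involve `alpha` at all.

```python
import math
import random

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s0, k, r, sigma, t):
    """Black Scholes price of a European call."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)

def call_via_measure_change(s0, k, r, alpha, sigma, t, n=200_000, seed=0):
    """Monte Carlo estimate of E_P[Z(T) * exp(-rT) * (S(T) - K)^+], simulated under P."""
    rng = random.Random(seed)
    theta = (alpha - r) / sigma                 # market price of risk
    total = 0.0
    for _ in range(n):
        w = rng.gauss(0.0, math.sqrt(t))        # W(T) under P
        s_t = s0 * math.exp((alpha - 0.5 * sigma ** 2) * t + sigma * w)
        z = math.exp(-theta * w - 0.5 * theta ** 2 * t)
        total += z * math.exp(-r * t) * max(s_t - k, 0.0)
    return total / n

print(bs_call(100.0, 100.0, 0.05, 0.2, 1.0))
print(call_via_measure_change(100.0, 100.0, 0.05, 0.10, 0.2, 1.0))
```

The two prices should agree up to Monte Carlo error. Changing `alpha` changes the simulated paths and the weights, but not the price they produce together.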

The chapter then uses the multidimensional Girsanov theorem and the multidimensional Martingale representation theorem to state the two fundamental theorems of asset pricing. The first states that if a market model has a risk-neutral probability measure, then it does not admit arbitrage. The second states that a market model with a risk-neutral measure is complete if and only if that measure is unique. The chapter concludes by using the risk neutral framework for valuing options on stocks that pay continuous dividends, stocks that pay discrete dividends, forwards and futures.

__Chapter 6 – Connections with Partial Differential Equations__

This chapter gives the four-step procedure for finding the pricing differential equation and for constructing a hedge for a derivative security. They are

- Determine the variables on which the derivative security price depends. In addition to time t, these are the underlying asset price S(t) and possibly other stochastic processes. We call these stochastic processes the state processes. One must be able to represent the derivative security payoff in terms of the state processes
- Write down a system of SDEs for the state processes. Be sure that, except for the driving Brownian motions, the only random processes appearing on the right hand side of these equations are the processes themselves. This ensures that the vector of state processes is Markov
- The Markov property guarantees that the derivative security price at each time is a function of time and the state processes at that time. The discounted option price is a martingale under the risk neutral measure. Compute the differential of the discounted option price, set the dt term equal to 0, and obtain thereby a PDE.
- The terms multiplying the Brownian motion differentials in the discounted derivative security price differential must be matched by the terms multiplying the Brownian motion differentials in the evolution of the hedging portfolio.
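As a worked instance of the last two steps, take the simplest case of a single state process. This is my own illustration of the procedure, assuming a stock following GBM under the risk neutral measure, $dS(t) = rS(t)\,dt + \sigma S(t)\,d\widetilde{W}(t)$, and a derivative price $c(t, S(t))$. Ito's lemma gives

$$
d\left(e^{-rt} c(t, S(t))\right) = e^{-rt}\left[-rc + c_t + rS c_x + \tfrac{1}{2}\sigma^2 S^2 c_{xx}\right]dt + e^{-rt}\sigma S c_x\,d\widetilde{W}(t).
$$

Setting the $dt$ term to zero yields the Black Scholes PDE

$$
c_t(t,x) + r x\,c_x(t,x) + \tfrac{1}{2}\sigma^2 x^2 c_{xx}(t,x) = r\,c(t,x).
$$

For the hedge, a portfolio holding $\Delta(t)$ shares has discounted value with differential $\Delta(t)\,\sigma e^{-rt} S(t)\,d\widetilde{W}(t)$. Matching this with the $d\widetilde{W}$ term above gives $\Delta(t) = c_x(t, S(t))$.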

__Chapter 7 - Exotic Options__

This chapter contains the pricing for three kinds of exotic options:

- Barrier options: PDE approach and risk neutral approach are described. In the risk neutral expectation evaluation, joint density of maximum of Brownian motion and Brownian motion is used to derive a closed form solution. To make the computations easy, another change of measure is done from the risk neutral measure so that the term with constant drift term is also removed from the context. So, in all there are two change of measures that take place in the evaluation of barrier options, first is the change of measure from real world to risk neutral world, second is the change of measure from a risk neutral world to a world that makes computations even more convenient. PDE pricing is done using stopping times and optional sampling theorem. Out of the two approaches, PDE approach is more appealing to me.
- Lookback options: Floating strike case is analyzed where the lookback option is priced using PDE approach as well as risk neutral approach. In using the PDE approach, there is an extra dY term that is not like dW or dt. This differential gives rise to a new boundary condition. PDE approach looks elegant as compared to several pages of ink wasted on deriving a closed form solution
- Asian options: The PDE approach has a twist here. One has to introduce a new state variable so that the pair of processes consisting of the stock price and the new process constitutes a two-dimensional Markov process. The PDE looks similar to the Black Scholes PDE, but it picks up an extra term in the new variable and comes with new boundary conditions. This option has no closed form solution and hence the risk neutral expectation approach is not explored. Instead a numeraire based approach is given. Numeraire based option valuation is a powerful way of thinking about option valuation. The advantage of learning and understanding this approach is that it can be applied to a larger universe of derivative securities.
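For the Asian option, the augmented state can be written down explicitly. The notation is assumed here for illustration: $Y$ is the running integral of the stock price, $K$ the fixed strike and $v(t, x, y)$ the option price as a function of time, stock price and running integral.

$$
Y(t) = \int_0^t S(u)\,du, \qquad dY(t) = S(t)\,dt, \qquad V(T) = \left(\frac{1}{T}Y(T) - K\right)^{+},
$$

$$
v_t(t,x,y) + r x\,v_x(t,x,y) + x\,v_y(t,x,y) + \tfrac{1}{2}\sigma^2 x^2 v_{xx}(t,x,y) = r\,v(t,x,y).
$$

Because $dY$ has no Brownian component, the new variable contributes only the first-order term $x\,v_y$. The pair $(S, Y)$ is Markov even though $S$ alone is not enough to describe the payoff.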

__Chapter 9 – Change of Numeraire__

This chapter deals with the numeraire, the unit of account in which other assets are denominated. What's the advantage of valuing assets in terms of a numeraire? Firstly, there could be financial considerations where the claim forces one to value it in a different currency, i.e. a different numeraire. More often, the numeraire approach is taken for ease of modeling: a model can be complicated or simple depending on the choice of numeraire. One must keep in mind that the risk neutral measure changes as soon as the assets are accounted for in a different numeraire, so one might have to change the measure again to preserve risk neutrality. One massive advantage is that the change of measure arising from a change of numeraire has a very appealing Radon Nikodym derivative process: it is the discounted numeraire, normalized by its initial value.

This simplifies many computations. The chapter shows applications of three numeraires

- Domestic money market account
- Foreign money market account
- A zero-coupon bond maturing at time T; the associated measure is called the T-forward measure

In the case of the first two numeraires, appropriate measures are computed whereby the relevant quantities become martingales. For example, a stock and a foreign money market account that are valued in terms of the domestic money market numeraire have a domestic risk neutral measure under which the two assets are martingales. It is important to find the martingale measure so that one can use the Martingale representation theorem to create a replicating portfolio. In the same manner, the domestic money market account and the stock valued in terms of the foreign money market numeraire have a foreign risk neutral measure.
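The Radon Nikodym derivative for a change of numeraire can be written compactly. The symbols are assumed here: $N$ is the numeraire price process, $D$ the discount process and $\widetilde{\mathbb{P}}$ the domestic risk neutral measure.

$$
\frac{d\widetilde{\mathbb{P}}^{(N)}}{d\widetilde{\mathbb{P}}} = \frac{D(T)\,N(T)}{N(0)}, \qquad
S^{(N)}(t) = \frac{S(t)}{N(t)} \ \text{ is a martingale under } \widetilde{\mathbb{P}}^{(N)}.
$$

For the zero-coupon bond numeraire $N(t) = B(t, T)$ we have $B(T, T) = 1$, so the derivative reduces to $D(T)/B(0, T)$, which defines the $T$-forward measure.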

Siegel's exchange rate paradox is explained very well using the domestic risk neutral measure and the foreign risk neutral measure. The chapter ends with valuing an option in a random interest rate environment using the forward measure. There is an exercise problem on the quanto option that illustrates the power of the numeraire in valuing options. A quanto option pays off in one currency the price in another currency of an underlying asset, without taking the currency conversion into account. For example, a quanto call on a British asset struck at $25 would pay $5 if the price of the asset upon expiration of the option is £30. To address this problem, you take the asset price, which follows a GBM, and divide by the exchange rate to get the price in the foreign currency. You then show that this price process is a GBM too and use it to value quanto options.
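The quanto payoff is easy to misread, so here is a tiny sketch contrasting it with an ordinary call whose payoff is converted into dollars. The function names and the exchange rate of 1.25 dollars per pound are hypothetical and only serve the illustration.

```python
def quanto_call_payoff(asset_price_gbp, strike):
    """Dollar payoff of a quanto call: the sterling price is read as a dollar amount."""
    return max(asset_price_gbp - strike, 0.0)

def converted_call_payoff(asset_price_gbp, strike_usd, usd_per_gbp):
    """Dollar payoff of an ordinary call on the dollar value of the asset."""
    return max(asset_price_gbp * usd_per_gbp - strike_usd, 0.0)

print(quanto_call_payoff(30.0, 25.0))            # the $5 from the example in the text
print(converted_call_payoff(30.0, 25.0, 1.25))   # depends on the (hypothetical) exchange rate
```

The quanto payoff does not depend on the exchange rate at expiry, which is exactly why its valuation needs the asset's dynamics under the measure attached to the payoff currency.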

__Chapter 11 – Jump Processes__

If one has to introduce jumps into the price process, one of the elementary ways that is analytically tractable is the Poisson process. Some preliminary background is provided for the reader so that Poisson and compound Poisson processes can be incorporated into derivative modeling. The Poisson process is appealing for many reasons, one of them being that it is memoryless. Poisson processes are characterized by exponentially distributed interarrival times, by gamma distributed arrival times, or as a counting process. A homogeneous Poisson process has stationary and independent increments. Typically one comes across all of these properties of Poisson processes in any elementary text on probability. A variant of the basic Poisson process that is relevant from a derivative pricing perspective is the compensated Poisson process. This process is a martingale and, like everywhere else in math fin, martingales are cherished objects for risk neutral valuation.
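A quick simulation shows the compensation at work. In this sketch, the intensity, horizon, path count and seed are arbitrary choices of mine. The process is built from exponential interarrival times and the code checks that N(t) - lambda*t averages to zero, which is a necessary consequence of the martingale property rather than a full test of it.

```python
import random

def poisson_count(lam, t, rng):
    """N(t): number of arrivals by time t, with Exp(lam) interarrival times."""
    count = 0
    clock = rng.expovariate(lam)
    while clock <= t:
        count += 1
        clock += rng.expovariate(lam)
    return count

def mean_compensated(lam=3.0, t=2.0, n_paths=50_000, seed=0):
    """Sample mean of the compensated process N(t) - lam * t."""
    rng = random.Random(seed)
    return sum(poisson_count(lam, t, rng) - lam * t for _ in range(n_paths)) / n_paths

print(mean_compensated())  # should be close to 0
```

Without the `- lam * t` term the sample mean would sit near lambda*t, so the raw counting process drifts upward and cannot be a martingale.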

The Poisson process is too simplistic for financial markets. The most basic variant is the compound Poisson process, which allows for random jump sizes. These random jumps are IID and are independent of the Poisson process. A compensated compound Poisson process is defined and it is shown that this is a martingale. One of the classic ways to look at a compound Poisson process with finitely many possible jump sizes is by using the superposition principle. Over a time interval, one can define several independent Poisson processes, each with a fixed jump size and an intensity proportional to the intensity of the original process (the original intensity times the probability of that jump size). This decomposition of the original compound Poisson process into multiple Poisson processes of fixed jump size is an analytical convenience. Many problems can be solved using this property of splitting and merging Poisson processes.
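The decomposition can be seen in a simulation. This is a sketch with assumed jump sizes, probabilities and intensity. It counts the jumps of each size separately and compares the average counts with lambda * p_m * t, the mean of a Poisson process with the thinned intensity.

```python
import random

def compound_poisson_by_size(lam, sizes, probs, t, rng):
    """Return {jump size: number of jumps of that size by time t} for one path."""
    counts = {y: 0 for y in sizes}
    clock = rng.expovariate(lam)
    while clock <= t:
        y = rng.choices(sizes, weights=probs)[0]   # IID jump size, independent of arrivals
        counts[y] += 1
        clock += rng.expovariate(lam)
    return counts

def average_counts(lam=4.0, sizes=(-1.0, 0.5, 2.0), probs=(0.2, 0.5, 0.3),
                   t=1.0, n_paths=20_000, seed=0):
    """Average number of jumps of each size across many paths."""
    rng = random.Random(seed)
    totals = {y: 0 for y in sizes}
    for _ in range(n_paths):
        for y, c in compound_poisson_by_size(lam, sizes, probs, t, rng).items():
            totals[y] += c
    return {y: totals[y] / n_paths for y in sizes}

averages = average_counts()
print(averages)                                   # compare with 0.8, 2.0 and 1.2
print(sum(y * a for y, a in averages.items()))    # mean of Q(t), compare with 2.6
```

With these assumed inputs the expected counts are 4 times 0.2, 0.5 and 0.3, and the mean of the compound process is the mean jump size 0.65 times lambda*t. Subtracting that quantity is what compensates the process.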

In a pure Black Scholes world the delta hedge position needs to be integrated with respect to dW, the Brownian motion process. In the case of a jump diffusion process, the integrator is a process that has a pure jump component and a Brownian motion component. The chapter deals with processes that have finitely many jumps in any finite time interval. With this new component in the price process, one needs new tools to work with. One of the first techniques that needs to be learnt is the application of Ito's lemma to a process with jumps. Since the jump process and the Brownian motion are independent, Ito's lemma for a jump process looks very similar to Ito's lemma for a function of Brownian motion. The only extra term is the one that captures the jump behavior over an interval. The stochastic integral of a function with respect to a jump process is defined and it has most of the structure of a general Ito integral. There are a few restrictions on the integrand so that the stochastic integral makes sense. The quadratic variation of the process also changes as there is a jump component. Again, since the jump component is independent, the quadratic variation term arising from the Brownian integrator tags along with the quadratic variation of the jump process. Ito's lemma for multiple jump processes is also mentioned.

Perhaps the most challenging section of the chapter is the one on the risk neutral measure. A change of measure for a simple Poisson process affects the intensity of the process. A change of measure for a compound Poisson process affects the intensity and the distribution of the jump sizes. In each case, an equivalent of the Girsanov theorem is stated for changing the measure. Detailed explanations and derivations are given for the change of measure for a homogeneous Poisson process, a compound Poisson process, and a compound Poisson process plus Brownian motion. The associated Radon Nikodym derivatives are also provided. The chapter ends with pricing a call option under a jump diffusion process.
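For reference, the Radon Nikodym derivative processes for the two pure jump cases take the following form. The notation is assumed: $\lambda$ and $\tilde{\lambda}$ are the old and new intensities, and for a compound Poisson process with discrete jump sizes $Y_i$, $p$ and $\tilde{p}$ are the old and new jump size probabilities.

$$
Z(t) = e^{(\lambda - \tilde{\lambda})t}\left(\frac{\tilde{\lambda}}{\lambda}\right)^{N(t)}
\qquad\text{and}\qquad
Z(t) = e^{(\lambda - \tilde{\lambda})t}\prod_{i=1}^{N(t)} \frac{\tilde{\lambda}\,\tilde{p}(Y_i)}{\lambda\,p(Y_i)} .
$$

Under the measure defined by $Z(T)$, the process is again Poisson (respectively compound Poisson), with intensity $\tilde{\lambda}$ and, in the compound case, jump size distribution $\tilde{p}$. Unlike the Brownian case, where only the drift can change, here both the arrival rate and the jump distribution are adjustable.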

This book provides a clear exposition of all the concepts relating to the stochastic calculus that are needed for understanding advanced continuous-time models.