[GSoC-PortA] mapping function
Brian G. Peterson
brian at braverock.com
Sat Jun 29 15:45:11 CEST 2013
Based on side conversations with Ross and Peter, I thought I should talk
a little bit about next steps related to the mapping function.
Apologies for the long email, I want to be complete, and I hope that
some of this can make its way to the documentation.
The purpose of the mapping function is to transform a weights vector
that does not meet all the constraints into a weights vector that does
meet the constraints, if one exists, hopefully with a minimum of
transformation.
In the random portfolios code, we've used a couple of techniques
pioneered by Pat Burns. The philosophical idea is that your optimum
portfolio is most likely to exist at the edges of the feasible space.
At the first R/Finance conference, Pat used the analogy of a mountain
lake, where the lake represents the feasible space. With a combination
of lots of different constraints, the shore of the lake will not be
smooth or regular. The lake (the feasible space) may not take up a
large percentage of the terrain.
If we randomly place rocks anywhere in the terrain, some of them will
land in the lake, inside the feasible space, but most will land outside,
on the slopes of the mountains that surround the lake. The goal should
be to nudge these towards the shores of the lake (our feasible space).
Having exhausted the analogy, let's talk details.
A slightly more rigorous treatment of the problem is given here:
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1680224
It is possible that we can use this method directly for random portfolios
(and that we could add the extra constraint types to DEoptim). If so,
much of the rest of what I'll write here is irrelevant. I strongly
suspect that there will be some constraint types that will still need to
be 'adjusted' via a mapping method like the one laid out below, since a
stochastic solver will hand us a vector that needs to be transformed at
least in part to move into the feasible space. It's also not entirely
clear to me that the methods presented in the paper can satisfy all our
constraint types.
I think our first step should be to test each constraint type, in some
sort of hierarchy, starting with box constraints (almost all solvers
support box constraints, of course), since some of the other
transformations will violate the box constraints, and we'll need to
transform back again.
Each constraint can be evaluated as a logical expression against the
weights vector. You can see code for doing something similar with time
series data in the sigFormula function in quantstrat. It takes advantage
of some base R functionality that can treat an R object (in this case
the weights vector) as an environment or 'frame'. This allows the
columns of the data to be addressed without any major manipulation,
simply by column name (asset name in the weights vector, possibly after
adding names back in).
The code looks something like this:
eval(parse(text=formula), data)
So, 'data' is our weights vector, and 'formula' is an expression that
can be evaluated as a formula by R. Evaluating this formula will give
us TRUE or FALSE to denote whether the weights vector is in compliance
or in violation of that constraint. Then, we'll need to transform the
weight vector, if possible, to comply with that constraint.
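To make this concrete, here is a minimal sketch; the asset names and the
constraint string are made up for illustration, and the named weights
vector is coerced to a list so that eval() can see the names:

w <- c(IBM = 0.25, MSFT = 0.30, AAPL = 0.20, GE = 0.25)

# e.g. "technology assets must not exceed 60% of the portfolio"
constraint <- "MSFT + AAPL <= 0.6"

compliant <- eval(parse(text = constraint), as.list(w))
compliant  # TRUE here; FALSE would flag the vector for transformation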
Specific Cases:
I've implemented this transformation for box constraints in the random
portfolios code. We don't need the evaluation I'll describe next for
box constraints, because each single weight is handled separately.
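A minimal sketch of the per-weight idea (not the actual random portfolios
code; min_box and max_box are illustrative names):

w       <- c(0.45, -0.05, 0.35, 0.25)
min_box <- rep(0, 4)     # lower box bound per asset
max_box <- rep(0.4, 4)   # upper box bound per asset

# clamp each weight into its box, one asset at a time
w_boxed <- pmin(pmax(w, min_box), max_box)
w_boxed  # 0.40 0.00 0.35 0.25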
min_sum and max_sum leverage constraints can be evaluated without the
formula machinery, since the check is simple enough to express directly in
R code. The transformation can be accomplished by rescaling the
entire vector. There's code to do this in both the random portfolios
code and in constrained_objective. It is probably preferable to do the
transformation one weight at a time, as I do in the random portfolios
code, to end closer to the edges of the feasible space, while continuing
to take the box constraints into account.
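As a rough sketch of the whole-vector version (the bounds and weights below
are illustrative), rescale the vector to the nearest bound whenever its sum
falls outside [min_sum, max_sum]:

min_sum <- 0.99
max_sum <- 1.01
w <- c(0.30, 0.25, 0.25, 0.30)   # sums to 1.10, violating max_sum

if (sum(w) > max_sum) w <- w * (max_sum / sum(w))
if (sum(w) < min_sum) w <- w * (min_sum / sum(w))
sum(w)  # now 1.01

Note that rescaling the whole vector can push individual weights back
outside their boxes, which is one more reason the one-weight-at-a-time
approach is preferable.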
Linear (in)equality constraints and group constraints can be evaluated
generically via the formula method I've described above. Then individual
weights can be transformed, taking the direction of the constraint
(<, >, =) into account, along with the box constraints and leverage
constraints.
and so on...
Challenges:
- recovering the transformed vector from an optimization solver that
doesn't directly support a mapping function. I've got some tricks for
this using environments (see the sketch after this list) that we can
revisit after we get the basic methodology working.
- allowing for progressively relaxing constraints when the constraints
are simply too restrictive. Perhaps Doug has some documentation on this
as he's done it in the past, or perhaps we can simply deal with it in
the penalty part of constrained_objective().
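On the first challenge, here is a hedged sketch of the environment idea
(illustrative only, not the actual trick): have the objective wrapper write
the transformed weights into a side environment so they can be recovered
after a solver that only returns its own, untransformed parameter vector.

.store <- new.env()

obj_wrapper <- function(w) {
  w_t <- pmin(pmax(w, 0), 0.4)   # stand-in for the full mapping function
  w_t <- w_t / sum(w_t)          # stand-in leverage normalization
  assign("last_weights", w_t, envir = .store)
  sum(w_t^2)                     # stand-in objective value
}

obj_wrapper(c(0.5, 0.2, 0.2, 0.3))
get("last_weights", envir = .store)  # transformed vector survives the solver call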
Hopefully this was helpful.
Regards,
Brian
--
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock