[GSoC-PortA] mapping function

Doug Martin martinrd at comcast.net
Sun Jul 7 07:04:38 CEST 2013


 

 

 

From: gsoc-porta-bounces at lists.r-forge.r-project.org
[mailto:gsoc-porta-bounces at lists.r-forge.r-project.org] On Behalf Of Ross
Bennett
Sent: Saturday, July 06, 2013 1:27 PM
To: PortfolioAnalytics
Subject: Re: [GSoC-PortA] mapping function

 

Doug,

 

I'm sure Brian will provide more detail, but hopefully this offers some
initial clarification about random portfolios and the logic I am using in
rp_transform().

 

The random portfolio weights are being generated with user-specified
leverage constraints, min_sum and max_sum. One could generate a dollar
neutral portfolio by specifying min_sum=-0.01 and max_sum=1.01

[Doug] I guess you meant min_sum = -0.01 and max_sum = +0.01 for the
dollar-neutral case.

or a leveraged portfolio with min_sum=1.49 and max_sum=1.51. I think this
gives more flexibility than the method proposed by Shaw.
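
For concreteness, the window check on a candidate weights vector amounts to
something like this (an illustrative sketch; the dollar-neutral window uses
the corrected values from the note above):

# Leverage "window" checks on a candidate weights vector (illustrative).
w <- c(0.5, -0.3, 0.4, -0.6)          # sums to 0
sum(w) >= -0.01 && sum(w) <= 0.01     # dollar neutral window: TRUE
sum(w) >= 1.49 && sum(w) <= 1.51      # leveraged window: FALSE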

[Doug] Brian and Peter: I have looked over your nice R/Finance 2010 and
2012 presentations and will eventually play around with the R scripts for
the latter. I also looked again at slides 43-51 in Burns (2009). And I'm
still trying to figure out what algorithm(s) are actually used in the random
portfolio method in PortfolioAnalytics. In both 2010 and 2012 you say
"From a portfolio seed, generate random permutations of weights that meet
your constraints on each asset." However, on slide 43 Burns seems to reject
(at least that specific form of) random permutations, and indeed just using
random permutations of a fixed seed is not appealing except maybe for
permutation tests. On slide 44 Burns proposes a sensible general idea, which
is apparently what you are using, but he gives no specifics of the random
search method. I could, if necessary, figure it out from the "random" choice
code in optimize.portfolio, but it would be a lot more pleasant if you could
point me to, or provide, a precise description of how the algorithm
currently works in PortfolioAnalytics. Then I would have a solid starting
point.

 

A related question: is the reason for using the min_sum and max_sum window
to obtain a reasonable number of random weight vectors that approximately
satisfy the desired equality constraint (and similarly for other
constraints)? I'm guessing that this is not necessary, to varying extents,
with the Shaw method. For example, it seems the dollar-neutral set of
random weights could be obtained by translating the simplex along the
high-dimensional 45-degree line toward the origin, and the leveraged set of
random weights by translating in the opposite direction. Of course you
would still have the problem of generating a huge number of random weight
vectors to minimize the constraint-violation objective.
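
A minimal sketch of this translation idea (illustrative code, not
PortfolioAnalytics internals): sample uniformly on the unit simplex, then
shift every weight by (L - 1)/n along the equal-weight direction so each
sum becomes the target leverage L.

# Uniform samples on the unit simplex via normalized exponentials.
rsimplex <- function(n.samples, n.assets) {
  e <- matrix(-log(runif(n.samples * n.assets)), n.samples, n.assets)
  e / rowSums(e)   # each row sums to 1
}

# Translate along the 45-degree (equal-weight) direction: w + (L - 1)/n.
translate_simplex <- function(w, L) w + (L - 1) / ncol(w)

set.seed(1)
w <- rsimplex(5, 4)
rowSums(translate_simplex(w, 0))     # dollar neutral: all sums are 0
rowSums(translate_simplex(w, 1.5))   # leveraged: all sums are 1.5

Note that the translation can (and for the dollar-neutral case must) push
individual weights negative, so box constraints would still need separate
handling.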

 

I need to spend some more time understanding the Shaw (2011) paper, but it
is unclear to me whether Shaw's method will work if we want to specify
constraints such as min=c(0.05, 0.12, 0.15, 0.05, 0.08) and max=c(0.85,
0.92, 0.73, 0.75, 0.82), or if it only works for box constraints where all
the values of min and max are equal.

[Doug] I'm not sure it even works for the box constraints with non-zero
lower bound - it looks like you can generate random weight vectors with
either a general set of lower bounds on the simplex or a general set of
upper bounds on the simplex but not both.  I'll read that part again
tomorrow to be sure.
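
For reference, the lower-bound construction in Shaw (2011) Section 4.3
amounts to an affine map of the long-only simplex; a minimal sketch (my
reading of the paper, so treat the details as an assumption):

# Shaw-style lower bounds (my reading of Section 4.3): given bounds with
# sum(lower) < 1, map uniform simplex samples u onto the restricted
# simplex via w = lower + (1 - sum(lower)) * u.
rsimplex_lower <- function(n.samples, lower) {
  n.assets <- length(lower)
  e <- matrix(-log(runif(n.samples * n.assets)), n.samples, n.assets)
  u <- e / rowSums(e)                       # uniform on the unit simplex
  sweep(u * (1 - sum(lower)), 2, lower, "+")
}

set.seed(1)
w <- rsimplex_lower(5, lower = c(0.05, 0.12, 0.15, 0.05, 0.08))
rowSums(w)         # all 1: full investment is preserved
apply(w, 2, min)   # each asset stays above its lower bound

The same affine trick does not obviously handle lower and upper bounds
simultaneously, which is exactly the question raised above.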

 

P.S. Peter are you on this group list (I guess you are but added you in "To"
just in case you are not).

 

Thanks,

Ross

On Sat, Jul 6, 2013 at 1:31 PM, Doug Martin <martinrd at comcast.net> wrote:

Brian,

 

I don't really understand what is going on, e.g., why do you need min_sum
and max_sum? I get the impression that the random portfolio weights are
being generated without a full-investment constraint, but with a range of
near-full-investment constraints. Why is that needed? If I understand
Shaw (2010) correctly, there exists a good method of randomly sampling from
the simplex of positive numbers, i.e., long-only and full-investment, in a
manner that promises to work well in higher dimensions: generate several
sets of random samples, ranging from one that is most concentrated around
the equal-weight point 1/n to ones that are increasingly concentrated near
the edges and vertices. In Shaw (2011) Section 4.3 he describes a
modification for handling lower bounds, and at first glance it appears that
it can be modified for upper bounds, hence box constraints, since the
long-only part is already covered. Caveat: I have only quickly skimmed the
first parts of the Shaw papers, so maybe I am missing something.
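
My reading of the sampling family Shaw (2010) describes, as a sketch (the
exact biasing parameterization is an assumption on my part): take
w_i = (log u_i)^p / sum_j (log u_j)^p with p = 2^q, so q = 0 gives uniform
samples on the simplex and larger q concentrates samples near the edges and
vertices.

# Face/edge/vertex-biased simplex sampling (sketch; the parameterization
# is my reading of Shaw 2010, not package code).
rsimplex_fev <- function(n.samples, n.assets, q = 0) {
  p <- 2^q
  t(replicate(n.samples, {
    z <- log(runif(n.assets))^p   # for p = 1 the signs cancel below
    z / sum(z)
  }))
}

set.seed(1)
summary(apply(rsimplex_fev(1000, 10, q = 0), 1, max))  # uniform on the simplex
summary(apply(rsimplex_fev(1000, 10, q = 4), 1, max))  # mass piles onto vertices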

 

Doug

 

 

From: gsoc-porta-bounces at lists.r-forge.r-project.org
[mailto:gsoc-porta-bounces at lists.r-forge.r-project.org] On Behalf Of Ross
Bennett
Sent: Saturday, July 06, 2013 9:34 AM
To: PortfolioAnalytics
Subject: Re: [GSoC-PortA] mapping function

 

All,

 

I have a few thoughts on the hierarchy and how to transform weights that
violate constraints and would appreciate any input or feedback.

 

The process outlined below is what I am thinking of following for the
constraint mapping fn_map() function.

*	Step 1: Test the weights vector for violation of min_sum or max_sum. If
violated, transform the weights vector with rp_transform(), taking into
account both leverage and box constraints. Another option is to normalize
the entire weights vector, like what is done in constrained_objective()...
is one way preferable over the other?
*	Step 2: Test the weights vector for violation of min or max. If
violated, transform the weights vector with rp_transform(), taking into
account both leverage and box constraints. If we can't generate a feasible
portfolio, it is because min or max is too restrictive, so try relaxing min
or max. For example, if min is violated, we could simply relax min by doing
something like min <- min - 0.05 and try this N times to generate a feasible
portfolio. Or we could randomly select an element of min and decrease it
instead of modifying the entire vector.
*	Step 3: Test the weights vector for violation of groups, cLO, or cUP. If
violated, transform the weights vector with rp_transform(), taking into
account leverage, box, and group constraints. If we can't generate a
feasible portfolio, try relaxing cLO or cUP. For example, if cLO is
violated, we could simply relax cLO by doing something like cLO <- cLO -
0.05 and try this N times to generate a feasible portfolio. Or we could
randomly select an element of cLO and decrease it instead of modifying the
entire vector.
*	Step 4: Test the weights vector for violation of max_pos. If violated,
transform the weights vector with rp_transform(), taking into account
leverage, box, group, and position limit constraints.
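
In code form, a rough skeleton of the hierarchy above (illustrative only; a
simple whole-vector normalization stands in for rp_transform(), and only
the min-relaxation branch of Step 2 is spelled out):

# Rough skeleton of the fn_map() hierarchy (illustrative placeholders).
fn_map_sketch <- function(weights, min_sum, max_sum, min, max, N = 20) {
  # placeholder for rp_transform(): scale the sum into the window
  to_window <- function(w) {
    if (sum(w) < min_sum) w <- w * (min_sum / sum(w))
    if (sum(w) > max_sum) w <- w * (max_sum / sum(w))
    w
  }
  weights <- to_window(weights)              # Step 1: leverage
  for (i in seq_len(N)) {                    # Step 2: box, relax min
    if (all(weights >= min & weights <= max)) break
    weights <- to_window(pmin(pmax(weights, min), max))
    min <- min - 0.05                        # relax and retry
  }
  # Steps 3 (groups, cLO/cUP) and 4 (max_pos) would repeat the same
  # test / transform / relax pattern with their own constraints.
  weights
}

fn_map_sketch(c(0.7, 0.5, 0.4), 0.99, 1.01, min = rep(0.05, 3), max = rep(0.5, 3))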

Please advise whether this is consistent with how you see the hierarchy in
the constraint mapping function working.

 

Thanks,

Ross

 

On Tue, Jul 2, 2013 at 10:13 PM, Ross Bennett <rossbennett34 at gmail.com>
wrote:

All,

 

I added an rp_constraint() function that uses logic from
randomize_portfolio() to transform a weights vector element by element to
satisfy (if possible) leverage *and* box constraints.

 

Here is a summary of where I am at with the mapping function.

*	Box Constraints (min/max)

*	rp_transform() takes a weights vector that violates either
min_sum/max_sum leverage constraints *or* min/max box constraints and
returns a weights vector that satisfies leverage *and* box constraints.
*	txfrm_box_constraint() takes a weight vector that violates min/max
box constraints and will set any weight that violates min or max to its min
or max respectively. This is too simplistic and does not take into account
min_sum/max_sum leverage constraints.
*	I think rp_transform() is the better option here... thoughts?

*	Leverage Constraints (min_sum/max_sum)

*	rp_transform() takes a weights vector that violates either
min_sum/max_sum leverage constraints *or* min/max box constraints and
returns a weights vector that satisfies leverage *and* box constraints.
*	txfrm_weight_sum_constraint() takes a weight vector that violates
min_sum/max_sum leverage constraints and normalizes the entire weights
vector to satisfy leverage constraints. This is too simplistic and does not
take into account min/max box constraints.
*	I think rp_transform() is the better option here... thoughts?

*	Group Constraints (groups, cLO, cUP) 

*	txfrm_group_constraint() loops through the groups and checks if cLO
or cUP is violated. If cLO or cUP is violated the weights of the given group
are normalized to equal cLO or cUP, whichever is violated. This will likely
change the sum of the weights vector and violate min_sum/max_sum so we will
have to "re-transform".
*	I think txfrm_group_constraint() is a good first step because it
gets us close to satisfying the group constraints.
*	I'm working on incorporating the group constraints into
rp_transform().
*	I'm not seeing how to use the eval(parse(text=formula), data) code
to evaluate group constraints. Do you have a simple example?

*	Diversification Constraint

*	I'm having a hard time coming up with a straightforward solution to
transform the vector of weights to meet the diversification constraint. One
idea I was working on was to generate N random portfolios and select the
portfolio with the closest diversification value.
*	Note that I define diversification as: diversification = 1 -
sum(weights^2)
*	Would it be better to just handle this *like* an objective and
penalize in constrained_objective()?

*	Turnover Constraint

*	I'm having a hard time coming up with a straightforward solution to
transform the vector of weights to meet the turnover constraint. One idea I
was working on was to generate N random portfolios and select the portfolio
with the closest turnover value.
*	Would it be better to just handle this *like* an objective and
penalize in constrained_objective()?

*	Position Limit Constraint

*	txfrm_position_limit_constraint() sets the nassets - max_pos smallest
weights equal to 0.
*	An issue is that for any min_i > 0, this will violate the min box
constraint and be penalized later. Would it make sense to change min_i to 0
for asset_i that is set equal to 0?
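
To make the group and position limit transformations above concrete, here
are minimal sketches (my own illustrations, not the actual txfrm_*
implementations):

# Group transformation sketch: scale each offending group's weights so
# the group sum equals cLO or cUP (assumes positive group sums).
txfrm_group_sketch <- function(weights, groups, cLO, cUP) {
  for (j in seq_along(groups)) {
    g <- groups[[j]]
    s <- sum(weights[g])
    if (s < cLO[j]) weights[g] <- weights[g] * (cLO[j] / s)
    if (s > cUP[j]) weights[g] <- weights[g] * (cUP[j] / s)
  }
  weights   # sum(weights) may now violate min_sum/max_sum: re-transform
}

# Position limit sketch: zero out the nassets - max_pos smallest weights.
txfrm_poslimit_sketch <- function(weights, max_pos) {
  weights[order(weights)[seq_len(length(weights) - max_pos)]] <- 0
  weights
}

w <- c(0.15, 0.35, 0.30, 0.20)
txfrm_group_sketch(w, groups = list(1:2, 3:4), cLO = c(0.1, 0.4), cUP = c(0.45, 0.6))
txfrm_poslimit_sketch(w, max_pos = 2)   # keeps only the two largest weights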

One last thing: I currently have the mapping function in a loop over each
constraint object, so the weights vector is transformed in the order that
the constraints were added. To honor the hierarchy of how we transform the
weights vector, this should not be in a loop, so that we control the order
of transformation. Is that correct?

 

I look forward to your feedback and comments.

 

Thanks,

Ross

 

On Sun, Jun 30, 2013 at 11:51 AM, Doug Martin <martinrd at comcast.net> wrote:

 

 

 

-----Original Message-----
From: gsoc-porta-bounces at lists.r-forge.r-project.org
[mailto:gsoc-porta-bounces at lists.r-forge.r-project.org] On Behalf Of Brian
G. Peterson

Sent: Saturday, June 29, 2013 6:45 AM
To: PortfolioAnalytics
Subject: [GSoC-PortA] mapping function

 

Based on side conversations with Ross and Peter, I thought I should talk a
little bit about next steps related to the mapping function.

 

Apologies for the long email, I want to be complete, and I hope that some of
this can make its way to the documentation.

 

The purpose of the mapping function is to transform a weights vector that
does not meet all the constraints into a weights vector that does meet the
constraints, if one exists, hopefully with a minimum of transformation.

 

In the random portfolios code, we've used a couple of techniques pioneered
by Pat Burns.  The philosophical idea is that your optimum portfolio is most
likely to exist at the edges of the feasible space.

 

At the first R/Finance conference, Pat used the analogy of a mountain lake,
where the lake represents the feasible space.  With a combination of lots of
different constraints, the shore of the lake will not be smooth or regular.
The lake (the feasible space) may not take up a large percentage of the
terrain.

 

If we randomly place rocks anywhere in the terrain, some of them will land
in the lake, inside the feasible space, but most will land outside, on the
slopes of the mountains that surround the lake.  The goal should be to nudge
these toward the shores of the lake (our feasible space).

 

Having exhausted the analogy, let's talk details.

 

A slightly more rigorous treatment of the problem is given here:

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1680224

 

[Doug] This is the 2010 paper, which I just read much of.  Very nice paper.
I find Burns's papers pretty useless except for pointing out nice
statistical aspects and promoting PortfolioProbe.  For example, in the paper
you sent, he does not say clearly what he is doing in generating the random
portfolios or what he means by out-of-sample.  As you once mentioned, I
guess you got most details through direct conversation with him.

 

Then I found the new Shaw (2011) paper at
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1856476.
It has some nice additional material, e.g., Section 4.3 on lower bounds and
Section 4.4 on bounded shorting, among others.  I still need to finish
reading this version.

 

In general, if you accept random portfolios that violate the constraints,
what is a good way to: (a) accept only those within a certain distance,
with an appropriately defined metric, of the feasible region (you don't
want to consider all infeasible solutions - I guess this is the reason for
min_sum, max_sum, etc.?); and (b) assuming an appropriate metric, does one
want to take the solution nearest to a vertex?

 

 

 

It is possible that we can use this method directly for random portfolios
(and that we could add the extra constraint types to DEoptim).  If so, much
of the rest of what I'll write here is irrelevant.  I strongly suspect that
there will be some constraint types that will still need to be 'adjusted'
via a mapping method like the one laid out below, since a stochastic solver
will hand us a vector that needs to be transformed at least in part to move
into the feasible space.  It's also not entirely clear to me that the
methods presented in the paper can satisfy all our constraint types.

 

 

I think our first step should be to test each constraint type, in some sort
of hierarchy, starting with box constraints (almost all solvers support box
constraints, of course), since some of the other transformations will
violate the box constraints, and we'll need to transform back again.

 

Each constraint can be evaluated as a logical expression against the weights
vector.  You can see code for doing something similar with time series data
in the sigFormula function in quantstrat. It takes advantage of some base R
functionality that can treat an R object (in this case the weights vector)
as an environment or 'frame'. This allows the columns of the data to be
addressed without any major manipulation, simply by column name (asset name
in the weights vector, possibly after adding names back in).

 

The code looks something like this:

eval(parse(text=formula), data)

 

So, 'data' is our weights vector, and 'formula' is an expression that can be
evaluated as a formula by R.  Evaluating this formula will give us TRUE or
FALSE to denote whether the weights vector is in compliance or in violation
of that constraint.  Then, we'll need to transform the weight vector, if
possible, to comply with that constraint.
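
For instance, with a named weights vector coerced to a list so that eval()
can resolve the asset names (a minimal illustration, not package code):

# Evaluating constraint formulas against a named weights vector.
weights <- c(SPY = 0.45, AGG = 0.35, GLD = 0.20)
eval(parse(text = "SPY + AGG <= 0.7"), as.list(weights))  # FALSE: group too large
eval(parse(text = "GLD >= 0.05"), as.list(weights))       # TRUE: box bound satisfied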

 

Specific Cases:

I've implemented this transformation for box constraints in the random
portfolios code.  We don't need the formula evaluation described above for
box constraints, because each single weight is handled separately.

 

min_sum and max_sum leverage constraints can be evaluated without using the
formula, since the formula is simple, and can be expressed in simple R code.
The transformation can be accomplished by transforming the entire vector.
There's code to do this in both the random portfolios code and in
constrained_objective.  It is probably preferable to do the transformation
one weight at a time, as I do in the random portfolios code, to end closer
to the edges of the feasible space, while continuing to take the box
constraints into account.
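
A stripped-down sketch of the one-weight-at-a-time idea (my paraphrase of
the random portfolios logic, not the actual code):

# Nudge randomly chosen weights toward their box bounds until the sum
# falls inside [min_sum, max_sum] (sketch).
nudge_to_leverage <- function(w, min_sum, max_sum, min, max, max_iter = 1000) {
  i <- 0
  while ((sum(w) < min_sum || sum(w) > max_sum) && i < max_iter) {
    k <- sample(length(w), 1)
    if (sum(w) > max_sum)
      w[k] <- w[k] - runif(1) * (w[k] - min[k])   # shrink toward min[k]
    else
      w[k] <- w[k] + runif(1) * (max[k] - w[k])   # grow toward max[k]
    i <- i + 1
  }
  w
}

set.seed(42)
w <- c(0.6, 0.4, 0.3, 0.2)   # sums to 1.5
nudge_to_leverage(w, 0.99, 1.01, min = rep(0, 4), max = rep(1, 4))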

 

Linear (in)equality constraints and group constraints can be evaluated
generically via the formula method I've described above.  Then individual
weights can be transformed taking the sense of the constraint (<, >, =)
into account (along with the box constraints and leverage constraints).

 

and so on...

 

Challenges:

- recovering the transformed vector from an optimization solver that doesn't
directly support a mapping function.  I've got some tricks for this using
environments that we can revisit after we get the basic methodology working.

 

- allowing for progressively relaxing constraints when the constraints are
simply too restrictive.  Perhaps Doug has some documentation on this, as
he's done it in the past, or perhaps we can simply deal with it in the
penalty part of constrained_objective().

 

Hopefully this was helpful.

 

Regards,

 

Brian

 

--

Brian G. Peterson

http://braverock.com/brian/

Ph: 773-459-4973

IM: bgpbraverock

_______________________________________________

GSoC-PortA mailing list

GSoC-PortA at lists.r-forge.r-project.org

http://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/gsoc-porta



 

 



 
