This is probably a linear programming question, and probably easy, but ill ask anywayI have a list of objects (call them OBJ) and two lists of prices for those objects, taken at different time periods. I want to partition the objects into two groups such that the difference of the average of the differences associated with each group is maximizede.g MAX( avgGroup1(P2-P1)- avgGroup2(P2-P1))So we're maximizing the difference between groups of the average price difference amongst the groupsmake sense?
6/30/2009 1:50:21 PM
Does this need to be guaranteed optimal and does the process have to be deterministic?
6/30/2009 2:10:50 PM
^ No, it doesn't have to be either.It does have to be 'near optimal' though. I don't know what the word for it is, but suppose i give it a (reasonable) parameter, it can't be off of the optimal solution by more than that parameter. Ie. it can't have a probability of giving me the worst solutions. I know this is going to be some simplex method bullshit that I don't want to do....
6/30/2009 2:27:42 PM
^ That makes this a lot easier, nearly optimal is something relatively easy to do.Are you familiar with Evolutionary Algorithms (EAs)? Basically you start with an initially (poor) solution and then evolve it until you have a good one. This problem sets itself up well for an EA I believe.This is basic idea of what you would do:1) - Randomly split the list into two groups. Measure the fitness of this solution, which in this case is:avgGroup1(P2-P1)- avgGroup2(P2-P1)2) - Create a new generation of solutions by randomly permuting your initial one. Swapping members from one group with ones from the other or taking (no exchange) a member from one group and putting it in the other. Calculate the relative fitness for one of these new offspring.3) - Allow selected (there are a few different methods of selection you can use) members of the population to be permuted/reproduce to create another generation of solutions.4) Repeat 3 for X generations or until you have solution convergence.This may not be for you, but I believe it would work really well for your problem. You could probably do this in only a few lines of code too.
6/30/2009 2:42:52 PM
I may be misunderstanding what you can choose here, but it sounds fairly simple.Suppose you have N objects total, and that you assign k objects to group 1 and N-k objects to group 2. For any given k, the groups that maximize your objective function can be constructed by rank ordering your objects according to the difference in the prices at each time (let a_1 be the object with the largest price difference, all the way through a_N the object with the smallest difference). Then the solution must be either assigning a_1 through a_k to group 1, or assigning a_(N-k+1) through a_N to group 1 (I'm assuming here that group labeling doesn't matter, so you're actually interested in the maximum absolution value of the difference).If you don't have a constraint on the size of the groups (i.e., partitions don't have to be of equal size), then you next need to choose the value of k that maximizes your objective function. Suppose you initially have optimal groups containing k and N-k objects. Adding another object to group one, resulting in k+1 objects, results in adding an object with a smaller difference in price than the group average with only k objects. Group 2 also loses an object with a larger difference in price than the group 2 average; therefore the difference between Group 1 and Group 2 becomes smaller than it was with only k objects. This holds in general, so the optimal choice of k is 1.
7/8/2009 1:29:23 AM
^ Are you saying that it is always optimal to assign one member to a group by itself and assign all others to the other group?
7/8/2009 10:42:37 AM
^^ Yeah, thanks. Now that I'm thinking about it, it is pretty simple. if the groups don't have to be the same size (they don't) I can simply start with the n-1 largest elements and the smallest. find the difference of avgs and compare that to the n-2 largest and the 2 smallest. if it increases the difference, continue, if not, stop. the act of swapping changing the smallest element is monotone, so I wont ever get pidgin holed into a wrong answer. (I don't think)^ That cant be true, consider 100,100,1,1I guess this problem gets harder when you have arbitrarily many groups and want to maximize the average difference between each group pairwise... or maybe it doesnt.[Edited on July 8, 2009 at 10:54 AM. Reason : .][Edited on July 8, 2009 at 10:58 AM. Reason : .]
7/8/2009 10:53:26 AM
^^Sorry, I was implicitly assuming that the price differences were all distinct. If that's not the case, then you would rank order them as before, and if you are including an object in your group, you should also include any other object with the same price difference. That solves problems caused by sets like {100,100,1,1} as 1985 proposed.At first glance, I think this would only matter if the objects with identical price differences are the maximal or minimal price differences for the set of all objects. But yes, I think the solution should be that one group contains all the objects that have the maximal [or minimal] price difference.This is pure intuitive speculation, but generalizing to more than two groups might work as an iterative solution. Maximize two groups as outlined above; choose a third group as all objects from the second group that are on the opposite end of the ordering from the objects in the first group. In other words, if you choose the first group by taking the objects with the maximal price difference, the third group might be taking all objects with the minimal price difference, leaving the second group to be the rest in the middle. This does depend somewhat on your choice of metric when generalizing; are you thinking about maximizing the average pairwise difference in average price differences? You could also minimax it, or use some other metrics.Going beyond three groups, I would extend that to alternating picking off the highest and lowest price difference objects from what remains ungrouped, until you get to the specified number of groups. Then you just need to test whether your initial group should be the maximal or the minimal price difference objects. Again, just intuition, I haven't thought much about it.This is probably less a linear programming problem, and more about partitions of ordered sets.
7/8/2009 3:53:43 PM
7/8/2009 4:45:10 PM
^ You got it right, that is what I mean and your counter example works.well, almost, youd have to use 101,100,2,1 to statisfy his conditions[Edited on July 8, 2009 at 5:56 PM. Reason : .]
7/8/2009 5:54:33 PM
That's what I thought. Anyway, the most reasonable solution to me still seems to be to use something from the optimization toolbox (like an EA). You can't use linear programming, because this it too dynamic (i.e. you can't come up with a set of deterministic equations beforehand to solve). You can't use any the well known clustering or partitioning algorithms because you really don't care about clustering. If you wanted to group objects into two groups such the variance was minimized you could use these. This also rules out and sort of two dimensional projection and using splines.You can't use any of the function minimization techniques like Gauss-Newton for the same reason you can't use LP.I think that the heuristic that Shadowrunner suggested is probably the easiest to do, though I don't know if it will always yield the best answer, and I don't think it generalizes to higher dimensions, for the same reason that shortest path algorithms don't work by greedily taking the least expensive hop between two nodes. Usually there is a global minimum that can not be found by using pair-wise minimization/maximization. All that being said, I think if you ever have to do this for more than two lists something like an EA is your best bet. They are pretty scalable and robust. If you want any information on these I can point you in the right direction.Also, is this a purely academic example or are you actually using this for something?
7/8/2009 6:10:36 PM
No, my algorithm for {100,100,2,1} would give the solution Group 1 = {100,100} and Group 2 = {2,1}. What I had originally said was
7/8/2009 7:10:22 PM
I think that if you do check the two permutations for each possible size of groups, that should always result in the best answer because it results in the best answer for any given group size. Just wrap the algorithm in a loop for size 1 to size N/2, store the results in an array along with an indication of whether the best solution for a given size was the maximal or minimal group, pull out the array index of the maximal element of the array, and you have your winner.I have a hunch that the best group size is related to the largest price difference between elements in the ordering. For example, in 1985's set of {101, 100, 2, 1}, the largest price difference is between 100 and 2, so that suggests the optimal partition is the one that splits at that point. Any quick counterexamples by inserting new numbers in between 100 and 2?
7/8/2009 7:17:14 PM
^ I had the same hunch, ill try to think up a quick proof/counter exampleP.S. Thanks for the awesome thread guys. I appreciate the input. This is a problem I ran into at work, and while it doesn't really need to be solved for me to continue with work, I found it kind of fun. [Edited on July 9, 2009 at 11:03 AM. Reason : .]
7/9/2009 11:02:31 AM