TWW - Algorithm Question


	User not logged in - login - register

Home

Calendar

Books

School Tool

Photo Gallery

Message Boards

Users

Statistics

Advertise

Site Info

go to bottom

|

|

Message Boards » » Algorithm Question

Page [1]

AntecK7 All American 7756 Posts user info edit post	I have a database of items and sizes. I wan't to group them into groups of X objects with a total size of <=Y but as close to Y as possible Here are some example values (there are actually 2k of them) 109 114 54 25 85 86 71 66 62 52 146 77 117 36 70 48 I want Put them into groups of 60 items with a group total value of 6000. My first though is grab the largest and then the smallest, and keep doing that to the center median value. but ideally i would actually have them kinda averaged out. There is probably some smart/packing algorithm or something that I could use here, but I'm not that savvy anymore.. Any thoughts 2/17/2015 7:30:28 PM
sarijoul All American 14208 Posts user info edit post	So your givens are the number of items in a group and the group total value (and the large set of numbers to pull from)? 2/17/2015 7:50:08 PM
smoothcrim Universal Magnetic! 18986 Posts user info edit post	if you can assume the set of values isn't ordered, as your set is listed, then the most (compute) efficient way to do that would be to grab the next element from the list and add it to the group until the next value would take you over. a couple nested loops should do it. if you wanted to add packing efficiency as well, just have it try list.next(), list.next().next(), and list.next().next().next(), so you bound any run-away searches. should be pretty close to O(n) 2/17/2015 8:29:13 PM
neolithic All American 706 Posts user info edit post	It probably depends on how much you care about getting close to an optimal solution. A solution like the one outlined in ^ would probably be fine if you want a quick solution. I think you would probably do a little better in terms of average distance from your upper bound if you first sorted your item size list. Create a priority queue of size 6000/60 = 100 and start adding the biggest items first to this priority queue. The PQ should be sorted on the basis of total item size for each element in the queue. Add the next biggest item from the list to the top of the PQ (i.e. the element with the smallest total size) and repeat. If an element hits a size of 6000 exactly, remove it from the PQ. If you posted this to http://stackoverflow.com, I bet you would get some answers that would be nearly optimal within an hour. 2/17/2015 8:52:49 PM
moron All American 34954 Posts user info edit post	Visualize your numbers as a distribution. You have 33 buckets to pluck from. To keep things even, you have to drain the buckets by pulling from pairs in the nth bucket and n+33/2 bucket . You'll have to calculate a distribution for your values though to use this (which shouldn't be hard really). 2/17/2015 9:19:10 PM
AntecK7 All American 7756 Posts user info edit post	does need to be optimal for allocation however, compute/caluation optimization isn't really important. I don't want to really brute force it... Also my code has to be in powershell. Right now this is my current plan Since I know that the average Value of the group=100 Create a group, get the first Item from the list add that to the group find a value that has not been assigned to a group such that (Group+New Value) is as close to 100 as possible without putting the sum(group)>6000 Add that value to the group, continue till the group has 60 items in it or the sum of the group>6000 then create a new group, and grab the first free value. Again, probably horrific from a compute standpoint, but I think it will get me near where i want to be. on a simplified set of where say we wanted 5 items per group with a sum of 500, using the data i posted earlier Group 1 would be something like 109,86,114,77,117 Group 2 would be something like 54,146,70,117, et cetra 2/17/2015 9:20:27 PM
LimpyNuts All American 16860 Posts user info edit post	This looks like a variation of the "k-partition problem" which is well-studied. I did a quick googling and found "The Easiest Hard Problem: Number Partitioning" by Stephan Mertens which presents some methods for partitioning a set into 2 subsets. You might try looking in math or computer science journals. 2/18/2015 12:48:36 PM
AntecK7 All American 7756 Posts user info edit post	^ thanks for the tip. That makes sense, Looking at solutions for the "Balanced Partition Problem" which i can recursively call on the set breaking them into 2 sets of about equal size and count, then keep subdividing those sets. [Edited on February 18, 2015 at 3:11 PM. Reason : dd] 2/18/2015 3:04:30 PM
Novicane All American 15423 Posts user info edit post	sounds like a data structures class i've seen this before but my mind isn't functioning right now. 2/18/2015 4:33:25 PM
AntecK7 All American 7756 Posts user info edit post	so i think i got the follwoing powershell code working for the most part $myArray = 1,6,2,4,5,9,1 $ArraySize=$myarray.count $Sum=($myArray\|Measure-Object -Sum).sum $sol=new-object bool[] ([math]::Round($sum/2+1,[System.MidpointRounding]::AwayFromZero)) $sol[0]=$true #for ([int]$i=0;$i -lt $ArraySize;$i++){ foreach($i in $myArray){ for ([int]$j=$sum/2;$j -ge $i;$j--){ ##-Host IValue:$i #Write-Host JValue:$j if ($sol[$j-$i]){ $sol[$j]=$true } #End If }#End J Loop }#End I Loop [int]$HalfSumCloser=([math]::Round($sum/2,[System.MidpointRounding]::AwayFromZero)) For(;($sol[$HalfSumCloser]) -ne $true{ $HalfSumCloser-- } (($sum - $halfsumcloser) - $halfsumcloser) but that tells me if i can do it, not how to actually do it. 2/18/2015 5:21:03 PM
AntecK7 All American 7756 Posts user info edit post	trying to implement this http://people.cs.clemson.edu/~bcdean/dp_practice/dp_4.swf Yea.... i haven't done this type of programming problem in 12 years... if ever. 2/18/2015 6:17:37 PM
AntecK7 All American 7756 Posts user info edit post	$myarray=@() $myarray+=120 for($i=0;$i -lt 10;$i++){ $myarray+=Get-Random -Minimum 80 -Maximum 150 } $best=($myarray\|Measure-Object -Sum).Sum $target=$best/2 write-host Starting work arraysum $best $bests=@() $p=@{} $p[-1]=@{0=@()} for ([int]$i=0;$i -lt $myArray.Length;$i++){ $x=$myArray[$i] $p[$i]=@{} foreach ($j in $p[$i-1].Keys){ $p[$i][$j]=$p[$i-1][$j] $p[$i][$j+$x]=$p[$i-1][$j]+@($x) #$p if([math]::Abs($j+$x-$target) -lt [math]::Abs($best-$target) -and ($p[$i][$j+$x]).count -eq [int]($myArray.count/2)){ $best=$j+$x #write-host Best $best #write-host besti $i $j $x $bests=$p[$i][$j+$x] } } } Write-Host Solution error: ($best-$target) write-host Original Array ($myarray\|Sort-Object) write-Host Group1 (($bests\|Sort-Object) -join ', ') Write-Host group1 sum ($best\|measure-object -Sum).Sum [System.Collections.ArrayList]$group2list=$myArray foreach($value in $bests){ $group2list.Remove($value) } Write-Host write-Host Group2 (($group2list\|Sort-Object) -join ', ') Write-Host group2 sum ($group2list\|measure-object -Sum).Sum [Edited on February 18, 2015 at 11:59 PM. Reason : BUGS!] 2/18/2015 11:51:28 PM
AntecK7 All American 7756 Posts user info edit post	one note on this, on a set of 1000 buckets this thing DRINKS memory Basically re-wrote this python code in ps https://bj0z.wordpress.com/2011/03/07/the-balanced-partition-problem/ [Edited on February 19, 2015 at 12:22 AM. Reason : phython] 2/19/2015 12:20:43 AM
AntecK7 All American 7756 Posts user info edit post	okay so screw those bastards here is how i'm doing it, seems 1000 times faster, although it might not be 100% accurate $myarray=@() for($i=0;$i -lt 1000;$i++){ $myarray+=Get-Random -Minimum 80 -Maximum 150 } [System.Collections.ArrayList]$Set1=@() [System.Collections.ArrayList]$Set2=@() for([int]$i=0;$i -lt $myArray.Count/2;$i++){ $set1.Add($myArray[$i]) } for([int]$i=$myArray.Count/2;$i -lt $myArray.Count;$i++){ $set2.Add($myArray[$i]) } $s1sum=($set1\|Measure-Object -Sum).Sum $s2sum=($set2\|Measure-Object -Sum).Sum $setdifference=$s1sum-$s2sum for($i=0;$i -lt $set1.Count;$i++){ if($setdifference -eq 0){ break} $setdifference $s1item=$set1[$i] for($J=0;$j -lt $set2.Count;$j++){ $s2item=$set2[$j] if([math]::Abs((($s1sum-$s1item+$s2item)-($s2sum-$s2item+$s1item))) -lt [math]::abs($setdifference)){ #write-host Change $set1.Remove($s1item) $set1.Add($s2item) $set2.remove($s2item) $set2.Add($s1item) $s1sum=($set1\|Measure-Object -Sum).Sum $s2sum=($set2\|Measure-Object -Sum).Sum $setdifference=$s1sum-$s2sum $i=-1 break }} } write-host TotalSum ($myarray\|Measure-Object -sum).sum write-host StartArray (($myarray\|Sort-Object) -join ', ') write-host Set1Sum ($set1\|Measure-Object -sum).sum write-host Set1 (($set1\|Sort-Object) -join ', ') write-host Set2Sum ($set2\|Measure-Object -sum).sum write-host Set2 (($set2\|Sort-Object) -join ', ') 2/19/2015 1:47:43 AM
409Sea New Recruit 19 Posts user info edit post	You said you want groups of 60 close to 6000, this implies the mean of all your values is more or less 100. Is this for certain? With a mean reasonably far away from 100 I think you'd end up with a lot of values that you couldn't get to add up to 6000 in groups of 60. 2/20/2015 6:09:21 PM
AntecK7 All American 7756 Posts user info edit post	^just an example, right now values are between 80 and 300, with a mode around 120. Actually trying to load balance file servers to datacenter DFS servers 2/21/2015 12:38:29 AM

Message Boards » Tech Talk » Algorithm Question

Page [1]

go to top

|

|

Admin Options : move topic | lock topic

© 2025 by The Wolf Web - All Rights Reserved.
The material located at this site is not endorsed, sponsored or provided by or on behalf of North Carolina State University.
Powered by CrazyWeb v2.39 - our disclaimer.