Fdtxjr

Question

I've created the following code to try and find the optimum "diet" from a game called Eco. The maximum amount of calories you can have is 3000, as shown with MAXCALORIES.

Is there any way to make this code faster, since the time predicted for this code to compute 3000 calories is well over a few hundred years.

Note: I am trying to find the highest SP (skill points) you get from a diet, the optimum diet. To find this, I must go through every combination of diets and check how many skill points you receive through using it. The order of food does not matter, and I feel this is something that is slowing this program down.

import itertools
import sys
import time

sys.setrecursionlimit(10000000)

#["Name/Carbs/Protein/Fat/Vitamins/Calories"]
available = ['Fiddleheads/3/1/0/3/80', 'Fireweed Shoots/3/0/0/4/150', 'Prickly Pear Fruit/2/1/1/3/190', 'Huckleberries/2/0/0/6/80', 'Rice/7/1/0/0/90', 'Camas Bulb/1/2/5/0/120', 'Beans/1/4/3/0/120', 'Wheat/6/2/0/0/130', 'Crimini Mushrooms/3/3/1/1/200', 'Corn/5/2/0/1/230', 'Beet/3/1/1/3/230', 'Tomato/4/1/0/3/240', 'Raw Fish/0/3/7/0/200', 'Raw Meat/0/7/3/0/250', 'Tallow/0/0/8/0/200', 'Scrap Meat/0/5/5/0/50', 'Prepared Meat/0/4/6/0/600', 'Raw Roast/0/6/5/0/800', 'Raw Sausage/0/4/8/0/500', 'Raw Bacon/0/3/9/0/600', 'Prime Cut/0/9/4/0/600', 'Cereal Germ/5/0/7/3/20', 'Bean Paste/3/5/7/0/40', 'Flour/15/0/0/0/50', 'Sugar/15/0/0/0/50', 'Camas Paste/3/2/10/0/60', 'Cornmeal/9/3/3/0/60', 'Huckleberry Extract/0/0/0/15/60', 'Yeast/0/8/0/7/60', 'Oil/0/0/15/0/120', 'Infused Oil/0/0/12/3/120', 'Simple Syrup/12/0/3/0/400', 'Rice Sludge/10/1/0/2/450', 'Charred Beet/3/0/3/7/470', 'Camas Mash/1/2/9/1/500', 'Campfire Beans/1/9/3/0/500', 'Wilted Fiddleheads/4/1/0/8/500', 'Boiled Shoots/3/0/1/9/510', 'Charred Camas Bulb/2/3/7/1/510', 'Charred Tomato/8/1/0/4/510', 'Charred Corn/8/1/0/4/530', 'Charred Fish/0/9/4/0/550', 'Charred Meat/0/10/10/0/550', 'Wheat Porridge/10/4/0/10/510', 'Charred Sausage/0/11/15/0/500', 'Fried Tomatoes/12/3/9/2/560', 'Bannock/15/3/8/0/600', 'Fiddlehead Salad/6/6/0/14/970', 'Campfire Roast/0/16/12/0/1000', 'Campfire Stew/5/12/9/4/1200', 'Wild Stew/8/5/5/12/1200', 'Fruit Salad/8/2/2/10/900', 'Meat Stock/5/8/9/3/700', 'Vegetable Stock/11/1/2/11/700', 'Camas Bulb Bake/12/7/5/4/400', 'Flatbread/17/8/3/0/500', 'Huckleberry Muffin/10/5/4/11/450', 'Baked Meat/0/13/17/0/600', 'Baked Roast/4/13/8/7/900', 'Huckleberry Pie/9/5/4/16/1300', 'Meat Pie/7/11/11/5/1300', 'Basic Salad/13/6/6/13/800', 'Simmered Meat/6/18/13/5/900', 'Vegetable Medley/9/5/8/20/900', 'Vegetable Soup/12/4/7/19/1200', 'Crispy Bacon/0/18/26/0/600', 'Stuffed Turkey/9/16/12/7/1500']

global AllSP, AllNames
AllSP = []
AllNames = []

def findcombs(totalNames, totalCarbs, totalProtein, totalFat, totalVitamins, totalNutrients, totalCalories, MAXCALORIES):
 doneit = False
 for each in available:
 each = each.split("/")
 name = each[0]
 carbs = float(each[1])
 protein = float(each[2])
 fat = float(each[3])
 vitamins = float(each[4])
 nutrients = carbs+protein+fat+vitamins
 calories = float(each[5])
# print(totalNames, totalCalories, calories, each)
 if sum(totalCalories)+calories <= MAXCALORIES:
 doneit = True
 totalNames2 = totalNames[::]
 totalCarbs2 = totalCarbs[::]
 totalProtein2 = totalProtein[::]
 totalFat2 = totalFat[::]
 totalVitamins2 = totalVitamins[::]
 totalCalories2 = totalCalories[::]
 totalNutrients2 = totalNutrients[::]

 totalNames2.append(name)
 totalCarbs2.append(carbs)
 totalProtein2.append(protein)
 totalFat2.append(fat)
 totalVitamins2.append(vitamins)
 totalCalories2.append(calories)
 totalNutrients2.append(nutrients)
# print(" ", totalNames2, totalCarbs2, totalProtein2, totalFat2, totalVitamins2, totalNutrients2, totalCalories2)
 findcombs(totalNames2, totalCarbs2, totalProtein2, totalFat2, totalVitamins2, totalNutrients2, totalCalories2, MAXCALORIES)
 else:
 #find SP
 try:
 carbs = sum([x * y for x, y in zip(totalCalories, totalCarbs)]) / sum(totalCalories)
 protein = sum([x * y for x, y in zip(totalCalories, totalProtein)]) / sum(totalCalories)
 fat = sum([x * y for x, y in zip(totalCalories, totalFat)]) / sum(totalCalories)
 vitamins = sum([x * y for x, y in zip(totalCalories, totalVitamins)]) / sum(totalCalories)
 balance = (carbs+protein+fat+vitamins)/(2*max([carbs,protein,fat,vitamins]))
 thisSP = sum([x * y for x, y in zip(totalCalories, totalNutrients)]) / sum(totalCalories) * balance + 12
 except:
 thisSP = 0
 #add SP and names to two lists
 AllSP.append(thisSP)
 AllNames.append(totalNames)

def main(MAXCALORIES):
 findcombs([], [], [], [], [], [], [], MAXCALORIES)
 index = AllSP.index(max(AllSP))
 print()
 print(AllSP[index], " ", AllNames[index])

for i in range(100, 3000, 10):
 start = time.time()
 main(i)
 print("Calories:", i, ">>> Time:", time.time()-start)

Edit: On request, here is the formula for calculating the $textSP :$

$$
beginalign
textCarbs & ~=~ fractextamount_1 times textcalories_1 times textcarbs_1 + cdotstextamount_1 times textcalories_1 + cdots \[5px]
textSP & ~=~ fracN_1 C_1 + N_2 C_2C_1 + C_2 times textBalance + textBase Gain
endalign
$$
where:

$N$ is the nutrients of the food (carbs+protein+fat+vitamins);

$C$ is the calories of the food;

$textBase Gain = 12$ (in all cases);

$textBalance = fractextSum Nutrients2 times texthighest nutrition .$

I didn't even know you could set the recursion limit to be so huge... :O Yeah keeping it at 1000 forces you to write safer code btw :) — Mar 17 at 20:11
Good point, when you set it that high it usually means the code is very inefficient! :P @Peilonrayz — Mar 17 at 20:12
Let's try to be more specific about your constraints. You need to select between 1 and n foods so long as the calorie count is smaller than or equal to 3000? This doesn't need recursion if you use Python's built-in itertools.combinations. — Mar 17 at 20:52
@greybeard These values are all for a game called "Eco", not for real life! — Mar 17 at 20:58
OOh, it's the knapsac problem! You're probably better off trying for a "good enough" solution. — Mar 18 at 11:46

score 19 · Accepted Answer · 2019-03-23 03:45:58Z

Readability is #1

Global variables are bad. Don't use them. I have to spend a long while looking at your code to tell what uses them and when. When your code becomes hundreds of lines long this is tedious and unmaintainable.

If you need to use recursion and add to something not in the recursive function use a closure.

You should load available into an object, rather than extract the information from it each and every time you use it.

Using the above you can simplify all your totalNames, totalCarbs into one list.

Rather than using AllSP and AllNames you can add a tuple to one list.

You should put all your code into a main so that you reduce the amount of variables in the global scope. This goes hand in hand with (1).

Rather than copying and pasting the same line multiple times you can create a function.

All this gets the following. Which should be easier for you to increase the performance from:

import itertools
import sys
import time
import collections

sys.setrecursionlimit(10000000)

_Food = collections.namedtuple('Food', 'name carbs protein fat vitamins calories')

class Food(_Food):
 @property
 def nutrients(self):
 return sum(self[1:5])


def read_foods(foods):
 for food in foods:
 name, *other = food.split('/')
 yield Food(name, *[float(v) for v in other])


def tot_avg(food, attr):
 return (
 sum(f.calories * getattr(f, attr) for f in food)
 / sum(f.calories for f in food)
 )


def find_combs(available, MAXCALORIES):
 all_combinations = []
 def inner(total):
 for food in available:
 total_calories = [f.calories for f in total]
 if sum(total_calories) + food.calories <= MAXCALORIES:
 inner(total[:] + [food])
 else:
 try:
 nutrients = [
 tot_avg(total, 'carbs'),
 tot_avg(total, 'protein'),
 tot_avg(total, 'fat'),
 tot_avg(total, 'vitamins')
 ]
 balance = sum(nutrients) / 2 / max(nutrients)
 except ZeroDivisionError:
 continue
 sp = tot_avg(total, 'nutrients') * balance + 12
 all_combinations.append((sp, total))
 inner([])
 return all_combinations


def main(available):
 for MAXCALORIES in range(100, 3000, 10):
 start = time.time()
 all_ = find_combs(available, MAXCALORIES)
 amount, foods = max(all_, key=lambda i: i[0])
 print(amount, ' ', [f.name for f in foods])
 print('Calories:', amount, '>>> Time:', time.time()-start)


if __name__ == '__main__':
 available = ['Fiddleheads/3/1/0/3/80', 'Fireweed Shoots/3/0/0/4/150', 'Prickly Pear Fruit/2/1/1/3/190', 'Huckleberries/2/0/0/6/80', 'Rice/7/1/0/0/90', 'Camas Bulb/1/2/5/0/120', 'Beans/1/4/3/0/120', 'Wheat/6/2/0/0/130', 'Crimini Mushrooms/3/3/1/1/200', 'Corn/5/2/0/1/230', 'Beet/3/1/1/3/230', 'Tomato/4/1/0/3/240', 'Raw Fish/0/3/7/0/200', 'Raw Meat/0/7/3/0/250', 'Tallow/0/0/8/0/200', 'Scrap Meat/0/5/5/0/50', 'Prepared Meat/0/4/6/0/600', 'Raw Roast/0/6/5/0/800', 'Raw Sausage/0/4/8/0/500', 'Raw Bacon/0/3/9/0/600', 'Prime Cut/0/9/4/0/600', 'Cereal Germ/5/0/7/3/20', 'Bean Paste/3/5/7/0/40', 'Flour/15/0/0/0/50', 'Sugar/15/0/0/0/50', 'Camas Paste/3/2/10/0/60', 'Cornmeal/9/3/3/0/60', 'Huckleberry Extract/0/0/0/15/60', 'Yeast/0/8/0/7/60', 'Oil/0/0/15/0/120', 'Infused Oil/0/0/12/3/120', 'Simple Syrup/12/0/3/0/400', 'Rice Sludge/10/1/0/2/450', 'Charred Beet/3/0/3/7/470', 'Camas Mash/1/2/9/1/500', 'Campfire Beans/1/9/3/0/500', 'Wilted Fiddleheads/4/1/0/8/500', 'Boiled Shoots/3/0/1/9/510', 'Charred Camas Bulb/2/3/7/1/510', 'Charred Tomato/8/1/0/4/510', 'Charred Corn/8/1/0/4/530', 'Charred Fish/0/9/4/0/550', 'Charred Meat/0/10/10/0/550', 'Wheat Porridge/10/4/0/10/510', 'Charred Sausage/0/11/15/0/500', 'Fried Tomatoes/12/3/9/2/560', 'Bannock/15/3/8/0/600', 'Fiddlehead Salad/6/6/0/14/970', 'Campfire Roast/0/16/12/0/1000', 'Campfire Stew/5/12/9/4/1200', 'Wild Stew/8/5/5/12/1200', 'Fruit Salad/8/2/2/10/900', 'Meat Stock/5/8/9/3/700', 'Vegetable Stock/11/1/2/11/700', 'Camas Bulb Bake/12/7/5/4/400', 'Flatbread/17/8/3/0/500', 'Huckleberry Muffin/10/5/4/11/450', 'Baked Meat/0/13/17/0/600', 'Baked Roast/4/13/8/7/900', 'Huckleberry Pie/9/5/4/16/1300', 'Meat Pie/7/11/11/5/1300', 'Basic Salad/13/6/6/13/800', 'Simmered Meat/6/18/13/5/900', 'Vegetable Medley/9/5/8/20/900', 'Vegetable Soup/12/4/7/19/1200', 'Crispy Bacon/0/18/26/0/600', 'Stuffed Turkey/9/16/12/7/1500']
 main(list(read_foods(available)))

I want speed and I want it now!

To speed up your program you can return early. Knowing if sum(total_calories) + food.calories <= MAXCALORIES: then you should return if the inverse is true when food is the food with the lowest amount of calories.

def test_early_leaf(available, MAXCALORIES):
 all_combinations = []
 min_calories = min(a.calories for a in available)

 def find_sp(total):
 try:
 nutrients = [
 tot_avg(total, 'carbs'),
 tot_avg(total, 'protein'),
 tot_avg(total, 'fat'),
 tot_avg(total, 'vitamins')
 ]
 balance = sum(nutrients) / 2 / max(nutrients)
 except ZeroDivisionError:
 return None
 return tot_avg(total, 'nutrients') * balance + 12

 def inner(total):
 if sum(f.calories for f in total) + min_calories > MAXCALORIES:
 sp = find_sp(total)
 if sp is not None:
 all_combinations.append((sp, total))
 else:
 for food in available:
 total_calories = [f.calories for f in total]
 if sum(total_calories) + food.calories <= MAXCALORIES:
 inner(total[:] + [food])

 inner([])
 return max(all_combinations, key=lambda i: i[0])

I added another function that performs memoization via an LRU cache with an unbound size. However it seemed to slow the process.

enter image description here

How to optimizing the algorithm

Firstly the equations are:

$$g(f, a) = fracSigma(f_a_i times f_textcalories_i)Sigma(f_textcalories_i)$$
$$n = g(f, textcarbs), g(f, textprotein), g(f, textfat), g(f, textvitimins)$$
$$textSP = g(f, textnutrients) times fracSigma n2max(n) + textBase gain$$

From here we have to find the maximums.

What's the maximum and minimum that $fracSigma n2max(n)$ can be?

$$ fracn + n + n + n2 times n = frac4n2n = 2$$
$$ fracn + 0 + 0 + 02 times n = fracn2n = 0.5$$

This means all we need to do is ensure the calorie average of all the different nutrients are the same. It doesn't matter what value this average is, only that all have the same.

What's the maximum that $g(f, textnutrients)$ can be?

Firstly taking into account:

$$fracSigma(a_i times b_i)Sigma(b_i) = Sigma(a_i times fracb_iSigma(b_i))$$

We know that these are the calorie average of the foods nutritional value. To maximize this you just want the foods with the highest nutritional value.

Lets work through an example lets say we have the following five foods:

a/10/0/0/0/1

b/0/10/0/0/1

c/0/0/10/0/1

d/0/0/0/10/1

e/1/1/1/1/4

What's the way to maximize SP?

Eating 1 e would give you $4 times 2 = 8$.

Eating 4 a would give you $10 times 0.5 = 5$.

Eating 1 a, b, c and d would give you $10 times 2 = 20$.

And so from here we have deduced eating a, b, c and d in ratios of 1:1:1:1 give the most SP.

This means the rough solution is to find the foods that have the same calorie average for their individual nutrients where you select foods with a bias for ones with high total nutrients.

I'm going to leave it a little while until I accept this amazing answer, just in case there are any massive developments. Thanks for your help! — Mar 17 at 23:43
@RulerOfTheWorld It's always good to wait a while before accepting. :) If someone comes along and posts something better than the above I'd encourage you to give them the tick rather than me. I posted my answer halfway through so others can have easier to read code to work from. — Mar 17 at 23:46
If someone posts a better answer later, you can change your accept vote. I think it makes sense to accept once you've read and understood an answer to make sure it's actually good, and think it's comprehensive enough. Having an accepted answer doesn't close a question. — Mar 18 at 11:53

score 8 · Accepted Answer · 2019-03-19 15:40:38Z

Data representation

Your choice of data representation is curious. It's a middle ground between a fully-serialized text format and a fully-deserialized in-memory format (such as nested tuples or dictionaries). I'd offer that it's not as good as either of the above. If you're going for micro-optimization, you need to do "pre-deserialized" literal variable initialization that doesn't require parsing at all. The best option would probably be named tuples or even plain tuples, i.e.

available = (
 ('Fiddleheads', 3, 1, 0, 3, 80),
 # ...
)

But this won't yield any noticeable benefit, and it's not as maintainable as the alternative: just write a CSV file.

main isn't main

You've written a main function that isn't actually top-level code. This is not advisable. Rename it to something else, and put your top-level code in an actual main function, called from global scope with a standard if __name__ == '__main__' check.

list duplication

This:

totalNames[::]

should simply be

list(totalNames)

snake_case

Your names should follow the format total_names, rather than totalNames.

Also, variables in global scope (i.e. AllSP) should be all-caps; and you shouldn't need to declare them global.

Suggested

This doesn't at all tackle the main issue of algorithmic complexity, only Python usage. It isn't a good implementation, it's just to illustrate some stylistic improvements.

Note a few things:

Having a shebang at the top is very important to indicate to the shell and other programmers what's being executed

Use csv

Use tuple unpacking in your loops where possible

Abbreviate the formation of new lists by doing appends inline

Never except:; at a minimum except Exception: although even this should be more specific

Use f-strings where appropriate

Drop inner lists in list comprehensions when you don't need them

foods.csv

name,carbs,protein,fat,vitamins,calories
Fiddleheads,3,1,0,3,80
Fireweed Shoots,3,0,0,4,150
Prickly Pear Fruit,2,1,1,3,190
Huckleberries,2,0,0,6,80
Rice,7,1,0,0,90
Camas Bulb,1,2,5,0,120
Beans,1,4,3,0,120
Wheat,6,2,0,0,130
Crimini Mushrooms,3,3,1,1,200
Corn,5,2,0,1,230
Beet,3,1,1,3,230
Tomato,4,1,0,3,240
Raw Fish,0,3,7,0,200
Raw Meat,0,7,3,0,250
Tallow,0,0,8,0,200
Scrap Meat,0,5,5,0,50
Prepared Meat,0,4,6,0,600
Raw Roast,0,6,5,0,800
Raw Sausage,0,4,8,0,500
Raw Bacon,0,3,9,0,600
Prime Cut,0,9,4,0,600
Cereal Germ,5,0,7,3,20
Bean Paste,3,5,7,0,40
Flour,15,0,0,0,50
Sugar,15,0,0,0,50
Camas Paste,3,2,10,0,60
Cornmeal,9,3,3,0,60
Huckleberry Extract,0,0,0,15,60
Yeast,0,8,0,7,60
Oil,0,0,15,0,120
Infused Oil,0,0,12,3,120
Simple Syrup,12,0,3,0,400
Rice Sludge,10,1,0,2,450
Charred Beet,3,0,3,7,470
Camas Mash,1,2,9,1,500
Campfire Beans,1,9,3,0,500
Wilted Fiddleheads,4,1,0,8,500
Boiled Shoots,3,0,1,9,510
Charred Camas Bulb,2,3,7,1,510
Charred Tomato,8,1,0,4,510
Charred Corn,8,1,0,4,530
Charred Fish,0,9,4,0,550
Charred Meat,0,10,10,0,550
Wheat Porridge,10,4,0,10,510
Charred Sausage,0,11,15,0,500
Fried Tomatoes,12,3,9,2,560
Bannock,15,3,8,0,600
Fiddlehead Salad,6,6,0,14,970
Campfire Roast,0,16,12,0,1000
Campfire Stew,5,12,9,4,1200
Wild Stew,8,5,5,12,1200
Fruit Salad,8,2,2,10,900
Meat Stock,5,8,9,3,700
Vegetable Stock,11,1,2,11,700
Camas Bulb Bake,12,7,5,4,400
Flatbread,17,8,3,0,500
Huckleberry Muffin,10,5,4,11,450
Baked Meat,0,13,17,0,600
Baked Roast,4,13,8,7,900
Huckleberry Pie,9,5,4,16,1300
Meat Pie,7,11,11,5,1300
Basic Salad,13,6,6,13,800
Simmered Meat,6,18,13,5,900
Vegetable Medley,9,5,8,20,900
Vegetable Soup,12,4,7,19,1200
Crispy Bacon,0,18,26,0,600
Stuffed Turkey,9,16,12,7,1500

Python

#!/usr/bin/env python3

import csv
from time import time

ALL_SP = []
ALL_NAMES = []


def read(fn):
 with open('foods.csv') as f:
 reader = csv.reader(f, newline='')
 next(reader) # ignore title
 return tuple(
 (name, float(carbs), float(protein), float(fat), float(vitamins), float(calories))
 for name, carbs, protein, fat, vitamins, calories in reader
 )


AVAILABLE = read('foods.csv')


def find_combs(total_names, total_carbs, total_protein, total_fat, total_vitamins, total_nutrients,
 total_calories, max_calories):
 for name, carbs, protein, fat, vitamins, calories in AVAILABLE:
 nutrients = carbs+protein+fat+vitamins

 if sum(total_calories) + calories <= max_calories:
 find_combs(total_names + [name],
 total_carbs + [carbs],
 total_protein + [protein],
 total_fat + [fat],
 total_vitamins + [vitamins],
 total_nutrients + [nutrients],
 total_calories + [calories],
 max_calories)
 else:
 # find SP
 try:
 carbs = sum(x * y for x, y in zip(total_calories, total_carbs)) / sum(total_calories)
 protein = sum(x * y for x, y in zip(total_calories, total_protein)) / sum(total_calories)
 fat = sum(x * y for x, y in zip(total_calories, total_fat)) / sum(total_calories)
 vitamins = sum(x * y for x, y in zip(total_calories, total_vitamins)) / sum(total_calories)
 balance = (carbs+protein+fat+vitamins)/(2*max(carbs,protein,fat,vitamins))
 thisSP = sum(x * y for x, y in zip(total_calories, total_nutrients)) / sum(total_calories) * balance + 12
 except Exception:
 thisSP = 0

 # add SP and names to two lists
 ALL_SP.append(thisSP)
 ALL_NAMES.append(total_names)


def calc(max_calories):
 find_combs([], [], [], [], [], [], [], max_calories)
 index = ALL_SP.index(max(ALL_SP))
 print()
 print(f'ALL_SP[index]:.2f ALL_NAMES[index]')


def main():
 for i in range(100, 3000, 10):
 start = time()
 calc(i)
 print(f'Calories: i >>> Time: time()-start:.3f')


if __name__ == '__main__':
 main()

I'm going to do some reading and see what you're doing in terms of algorithm and submit a second answer to suggest a saner one.

Wow, thanks a lot! I'll edit my code following your advice now. Let me know if you find a way to optimise the algorithm! — Mar 17 at 20:10
@RulerOfTheWorld Please do not edit the code in your question once reviewing started. — Mar 17 at 21:03
@greybeard Of course, my apologies. I was not editing the code, but explaining the functions behind it in a section marked as an edit at the end of my post, so it wouldn't affect the previous code. — Mar 17 at 21:11

score 3 · Accepted Answer · 2019-03-23 03:51:59Z

I guess that my code needs a review too, because I am by far no good python programmer, but I wanted to share some of my ideas to solve your probelm that do not fit in a comment. So I hope at least the approach is some optimization to your code, if it is not the code itself.

I looked a bit on the function and thought that there must be an easier way to calculate it. So what I do here is:

$$textrmweighted_nutrients = frac(m odot c)^top cdot nm^top cdot c=fracsum_j^M(m_j times c_j) times n_jk_k=1 ldots Msum_j^M(m_j times c_j)$$

with $m$ being the amount of each foods (1 apple, 2 peaches, ... $rightarrow$ [1,2,...]), $M$ being the amount of foods (67 foods available), $c$ the kcals, $n$ the nutrients and $odot$ is element-wise multiplication. The result is a vector that needs to be summed up for the base value. It gets squared as the balance's numerator is the same. For the maximum in the balance, we can simply plug it in, as it is a vector from which a maximum can be chosen. The result looks in principle like this:
$$textrmSP = textrmsum(textrmweighted_nutrients)^2 cdot frac0.5max(textrmweighted_nutrients) + 12$$
Now as I write it, I think it looks even better like this:
$$textrmSP = frac12 cdot fractextrmsum(textrmweighted_nutrients)^2max(textrmweighted_nutrients) + 12$$

What should be done with this function now?

As you did, I wrote a function using itertools and a lot of possible combinations which luckily starts with the high calory foods, which give quite good results from the beginning. But as you found out yourself, you will be very old when/if the code ever finishes.
Therefore, I chose a genetic algorithm to solve the problem as for my untrained eyes, this looked like a nice way. On the other hand I always wanted to use a GA to solve a problem ... :D

#!/usr/bin/env python3
import numpy as np
import itertools as it
from deap import base, creator, tools, algorithms
import random


def generate_function(skill_gain_multiplier=1, base_skill_gain=12):
 # read in the foods
 names, nutrients, calories = give_food()

 # define skill_point function
 def skill_points(amounts):
 numerator = (amounts * calories).dot(nutrients)
 denominator = amounts.dot(calories)
 weighted_nutrients = np.divide(numerator, denominator)
 base_value = np.sum(weighted_nutrients) ** 2
 balance_modifier = (
 0.5 * 1 / np.max(weighted_nutrients) * skill_gain_multiplier
 )
 result = base_value * balance_modifier + base_skill_gain
 return result

 # define calory check function
 def calory_check(amounts):
 calory_count = amounts.dot(calories)
 return calory_count

 return names, skill_points, calories, calory_check


def give_food():
 available = [
 "Fiddleheads/3/1/0/3/80",
 "Fireweed Shoots/3/0/0/4/150",
 "Prickly Pear Fruit/2/1/1/3/190",
 "Huckleberries/2/0/0/6/80",
 "Rice/7/1/0/0/90",
 "Camas Bulb/1/2/5/0/120",
 "Beans/1/4/3/0/120",
 "Wheat/6/2/0/0/130",
 "Crimini Mushrooms/3/3/1/1/200",
 "Corn/5/2/0/1/230",
 "Beet/3/1/1/3/230",
 "Tomato/4/1/0/3/240",
 "Raw Fish/0/3/7/0/200",
 "Raw Meat/0/7/3/0/250",
 "Tallow/0/0/8/0/200",
 "Scrap Meat/0/5/5/0/50",
 "Prepared Meat/0/4/6/0/600",
 "Raw Roast/0/6/5/0/800",
 "Raw Sausage/0/4/8/0/500",
 "Raw Bacon/0/3/9/0/600",
 "Prime Cut/0/9/4/0/600",
 "Cereal Germ/5/0/7/3/20", # test
 "Bean Paste/3/5/7/0/40",
 "Flour/15/0/0/0/50",
 "Sugar/15/0/0/0/50",
 "Camas Paste/3/2/10/0/60",
 "Cornmeal/9/3/3/0/60",
 "Huckleberry Extract/0/0/0/15/60",
 "Yeast/0/8/0/7/60", # test
 "Oil/0/0/15/0/120",
 "Infused Oil/0/0/12/3/120",
 "Simple Syrup/12/0/3/0/400",
 "Rice Sludge/10/1/0/2/450",
 "Charred Beet/3/0/3/7/470",
 "Camas Mash/1/2/9/1/500",
 "Campfire Beans/1/9/3/0/500",
 "Wilted Fiddleheads/4/1/0/8/500",
 "Boiled Shoots/3/0/1/9/510",
 "Charred Camas Bulb/2/3/7/1/510",
 "Charred Tomato/8/1/0/4/510",
 "Charred Corn/8/1/0/4/530",
 "Charred Fish/0/9/4/0/550",
 "Charred Meat/0/10/10/0/550",
 "Wheat Porridge/10/4/0/10/510",
 "Charred Sausage/0/11/15/0/500",
 "Fried Tomatoes/12/3/9/2/560",
 "Bannock/15/3/8/0/600",
 "Fiddlehead Salad/6/6/0/14/970",
 "Campfire Roast/0/16/12/0/1000",
 "Campfire Stew/5/12/9/4/1200",
 "Wild Stew/8/5/5/12/1200",
 "Fruit Salad/8/2/2/10/900",
 "Meat Stock/5/8/9/3/700",
 "Vegetable Stock/11/1/2/11/700",
 "Camas Bulb Bake/12/7/5/4/400",
 "Flatbread/17/8/3/0/500",
 "Huckleberry Muffin/10/5/4/11/450",
 "Baked Meat/0/13/17/0/600",
 "Baked Roast/4/13/8/7/900",
 "Huckleberry Pie/9/5/4/16/1300",
 "Meat Pie/7/11/11/5/1300",
 "Basic Salad/13/6/6/13/800",
 "Simmered Meat/6/18/13/5/900",
 # "Vegetable Medley/9/5/8/20/900", outdated values
 "Vegetable Medley/8/4/7/17/900",
 "Vegetable Soup/12/4/7/19/1200",
 "Crispy Bacon/0/18/26/0/600",
 "Stuffed Turkey/9/16/12/7/1500",
 ]

 all_names = []
 all_nutrients = []
 all_calories = []
 for item in available:
 name, *nutrients, calories = item.split("/")
 all_names.append(name)
 nutrients = [float(x) for x in nutrients]
 all_nutrients.append(nutrients)
 all_calories.append(float(calories))
 return np.array(all_names), np.array(all_nutrients), np.array(all_calories)


def brute_force(names, f, calory_check, cals):
 # create every possible combination
 combinations = it.product(range(2), repeat=len(names))

 best = 0.0
 cnt = 0
 for comb in combinations:
 # calculate value
 comb = np.array(list(comb))
 new = f(comb)
 # if better, replace best
 if new > best and calory_check(comb):
 best = new
 print(
 [x for x in zip(names, comb) if x[1] != 0], new, comb.dot(cals)
 )
 # show current iteration ... of quite a few
 else:
 sys.stdout.write(f"rcnt")
 sys.stdout.flush()
 cnt += 1


# the genetic algorithm is very simply based on the tutorials here:
# https://deap.readthedocs.io/en/master/examples/index.html
def genetic_algorithm(
 fitness_function,
 cal_chk,
 array_size,
 population_size=300,
 max_iterations=250,
):
 creator.create("FitnessMax", base.Fitness, weights=(1.0,))
 creator.create("Individual", np.ndarray, fitness=creator.FitnessMax)

 toolbox = base.Toolbox()

 # Attribute generator
 toolbox.register("attr_bool", random.randint, 0, 1)

 # Structure initializers
 toolbox.register(
 "individual",
 tools.initRepeat,
 creator.Individual,
 toolbox.attr_bool,
 array_size,
 )
 toolbox.register("population", tools.initRepeat, list, toolbox.individual)

 def cxTwoPointCopy(ind1, ind2):
 """Execute a two points crossover with copy on the input individuals. The
 copy is required because the slicing in numpy returns a view of the data,
 which leads to a self overwritting in the swap operation. It prevents
 ::

 >>> import numpy
 >>> a = numpy.array((1,2,3,4))
 >>> b = numpy.array((5.6.7.8))
 >>> a[1:3], b[1:3] = b[1:3], a[1:3]
 >>> print(a)
 [1 6 7 4]
 >>> print(b)
 [5 6 7 8]
 """
 size = len(ind1)
 cxpoint1 = random.randint(1, size)
 cxpoint2 = random.randint(1, size - 1)
 if cxpoint2 >= cxpoint1:
 cxpoint2 += 1
 else: # Swap the two cx points
 cxpoint1, cxpoint2 = cxpoint2, cxpoint1

 ind1[cxpoint1:cxpoint2], ind2[cxpoint1:cxpoint2] = (
 ind2[cxpoint1:cxpoint2].copy(),
 ind1[cxpoint1:cxpoint2].copy(),
 )

 return ind1, ind2

 # cutoff function was needed, as initial guesses were all above 3000 kcal
 # and no solution could be found. with the smooth cutoff function, the results
 # are pushed below 3000 kcal, which is where they belong.
 # not sure if this is smart or just overshot :D
 def cutoff(individual):
 return 0.5 - 0.5 * np.tanh((cal_chk(individual) - 3000) / 5000)

 # return the cutoff value if higher than 3000
 # and the true value if lower
 def evalFit(individual):
 if cal_chk(individual) <= 3000:
 return (fitness_function(individual),)
 else:
 return (cutoff(individual),)

 # toolbox.register("evaluate", evalOneMax)
 toolbox.register("evaluate", evalFit)
 toolbox.register("mate", tools.cxTwoPoint)
 toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
 toolbox.register("select", tools.selTournament, tournsize=3)

 # Creating the population
 def main():
 pop = toolbox.population(n=population_size)
 hof = tools.HallOfFame(5, similar=np.array_equal)
 stats = tools.Statistics(lambda ind: ind.fitness.values)
 stats.register("avg", np.mean)
 stats.register("std", np.std)
 stats.register("min", np.min)
 stats.register("max", np.max)

 pop, log = algorithms.eaSimple(
 pop,
 toolbox,
 cxpb=0.5,
 mutpb=0.5,
 ngen=max_iterations,
 stats=stats,
 halloffame=hof,
 verbose=True,
 )

 return pop, log, hof

 return main


if __name__ == "__main__":
 # generating the functions
 names, f, cals, calory_check = generate_function()

 # not recommended
 # brute_force(names, f, calory_check, cals)

 # probably better
 ga = genetic_algorithm(
 f, calory_check, len(names), max_iterations=500, population_size=500
 )
 pop, log, hof = ga()

 # printing the result
 print("n########n# DONE #n########")
 for star in hof[1:]:
 [print(i, s) for i, s in zip(star, names) if i > 0]
 print(f"which has calory_check(star) kcal")
 print(f"and gives a SP of f(star)n---n")

and the result is something like this:

1 Vegetable Soup
1 Stuffed Turkey
which has 2700.0 kcal
and gives a SP of 87.34734734734735
---

1 Cereal Germ
1 Vegetable Soup
1 Stuffed Turkey
which has 2720.0 kcal
and gives a SP of 87.04413748413035
---

1 Bean Paste
1 Vegetable Soup
1 Stuffed Turkey
which has 2740.0 kcal
and gives a SP of 87.01479581771551
---

1 Flour
1 Vegetable Soup
1 Stuffed Turkey
which has 2750.0 kcal
and gives a SP of 86.9337837837838
---

87.347 is the highest I found so far. Sometime the algorithm gets stuck at a lower value, you may need to play around with the parameters of the GA to get a faster/better/more robust result. But as the code is very fast, maybe just run it multiple times and see which result is the highest.

I think I did, we should go to the chat chat.stackexchange.com/rooms/91219/… — Mar 23 at 1:11
Yeah it does :) What is $m$. Is that $n$ in my answer, but without calling the function? — Mar 23 at 2:33
I had a go at optimizing the code tonight, I think it's a pretty naive implementation. It's just finds the inverse ratio and picks the highest within a range. And I get 90.761@2400 calories, with 1 Stuffed Turkey and 1 Vegetable Medley. I doubted that it was actually 90.761, but it seems legit from the math in my answer and the OPs code. Do you get the same? — Mar 23 at 6:31

Ludisposed 9,08322267 · Accepted Answer · 2019-03-18 16:10:55Z

I see some replies with general tips for optimization, but I don't see anyone recommending a specific approach called memoization. It works wonders just for this kind of problems (results in some finite range around the <1M mark, 3000 is far below the upper limit).

Basically you would do something like this:

Create a sort of array (this one will be struxtured differently depending on whether you just need the value of the result, only one combination of food items or all combinations). Since no food has negative calories, you can only make it 0-3000

Then you do something like this (pseudocode):

for foodItem in foodItems:
 for value in caloriesArray:
 if caloriesArray[value] != 0: #has been reached before, so I can expand on it
 caloriesArray[value]+foodItems[foodItem] = ... #whatever you need, can be just True

There are plenty of sites explaining memoization and I'm not very good at explanations, but if this doesn't help you then I can include a simple example.

Then just find the highest reached value of the array.

I tested your theory, and found it to not really improve performance. (See the graph in my answer) — Mar 23 at 4:03
I will add an update with the code to my answer, but for exclusion/inclusion kind of problems like this, memoization should be vastly superior to anything else when used in the right way. Cannot promise I'll do it today though. — 2 days ago

score 19 · Accepted Answer · 2019-03-23 03:45:58Z

Readability is #1

Global variables are bad. Don't use them. I have to spend a long while looking at your code to tell what uses them and when. When your code becomes hundreds of lines long this is tedious and unmaintainable.

If you need to use recursion and add to something not in the recursive function use a closure.

You should load available into an object, rather than extract the information from it each and every time you use it.

Using the above you can simplify all your totalNames, totalCarbs into one list.

Rather than using AllSP and AllNames you can add a tuple to one list.

You should put all your code into a main so that you reduce the amount of variables in the global scope. This goes hand in hand with (1).

Rather than copying and pasting the same line multiple times you can create a function.

All this gets the following. Which should be easier for you to increase the performance from:

import itertools
import sys
import time
import collections

sys.setrecursionlimit(10000000)

_Food = collections.namedtuple('Food', 'name carbs protein fat vitamins calories')

class Food(_Food):
 @property
 def nutrients(self):
 return sum(self[1:5])


def read_foods(foods):
 for food in foods:
 name, *other = food.split('/')
 yield Food(name, *[float(v) for v in other])


def tot_avg(food, attr):
 return (
 sum(f.calories * getattr(f, attr) for f in food)
 / sum(f.calories for f in food)
 )


def find_combs(available, MAXCALORIES):
 all_combinations = []
 def inner(total):
 for food in available:
 total_calories = [f.calories for f in total]
 if sum(total_calories) + food.calories <= MAXCALORIES:
 inner(total[:] + [food])
 else:
 try:
 nutrients = [
 tot_avg(total, 'carbs'),
 tot_avg(total, 'protein'),
 tot_avg(total, 'fat'),
 tot_avg(total, 'vitamins')
 ]
 balance = sum(nutrients) / 2 / max(nutrients)
 except ZeroDivisionError:
 continue
 sp = tot_avg(total, 'nutrients') * balance + 12
 all_combinations.append((sp, total))
 inner([])
 return all_combinations


def main(available):
 for MAXCALORIES in range(100, 3000, 10):
 start = time.time()
 all_ = find_combs(available, MAXCALORIES)
 amount, foods = max(all_, key=lambda i: i[0])
 print(amount, ' ', [f.name for f in foods])
 print('Calories:', amount, '>>> Time:', time.time()-start)


if __name__ == '__main__':
 available = ['Fiddleheads/3/1/0/3/80', 'Fireweed Shoots/3/0/0/4/150', 'Prickly Pear Fruit/2/1/1/3/190', 'Huckleberries/2/0/0/6/80', 'Rice/7/1/0/0/90', 'Camas Bulb/1/2/5/0/120', 'Beans/1/4/3/0/120', 'Wheat/6/2/0/0/130', 'Crimini Mushrooms/3/3/1/1/200', 'Corn/5/2/0/1/230', 'Beet/3/1/1/3/230', 'Tomato/4/1/0/3/240', 'Raw Fish/0/3/7/0/200', 'Raw Meat/0/7/3/0/250', 'Tallow/0/0/8/0/200', 'Scrap Meat/0/5/5/0/50', 'Prepared Meat/0/4/6/0/600', 'Raw Roast/0/6/5/0/800', 'Raw Sausage/0/4/8/0/500', 'Raw Bacon/0/3/9/0/600', 'Prime Cut/0/9/4/0/600', 'Cereal Germ/5/0/7/3/20', 'Bean Paste/3/5/7/0/40', 'Flour/15/0/0/0/50', 'Sugar/15/0/0/0/50', 'Camas Paste/3/2/10/0/60', 'Cornmeal/9/3/3/0/60', 'Huckleberry Extract/0/0/0/15/60', 'Yeast/0/8/0/7/60', 'Oil/0/0/15/0/120', 'Infused Oil/0/0/12/3/120', 'Simple Syrup/12/0/3/0/400', 'Rice Sludge/10/1/0/2/450', 'Charred Beet/3/0/3/7/470', 'Camas Mash/1/2/9/1/500', 'Campfire Beans/1/9/3/0/500', 'Wilted Fiddleheads/4/1/0/8/500', 'Boiled Shoots/3/0/1/9/510', 'Charred Camas Bulb/2/3/7/1/510', 'Charred Tomato/8/1/0/4/510', 'Charred Corn/8/1/0/4/530', 'Charred Fish/0/9/4/0/550', 'Charred Meat/0/10/10/0/550', 'Wheat Porridge/10/4/0/10/510', 'Charred Sausage/0/11/15/0/500', 'Fried Tomatoes/12/3/9/2/560', 'Bannock/15/3/8/0/600', 'Fiddlehead Salad/6/6/0/14/970', 'Campfire Roast/0/16/12/0/1000', 'Campfire Stew/5/12/9/4/1200', 'Wild Stew/8/5/5/12/1200', 'Fruit Salad/8/2/2/10/900', 'Meat Stock/5/8/9/3/700', 'Vegetable Stock/11/1/2/11/700', 'Camas Bulb Bake/12/7/5/4/400', 'Flatbread/17/8/3/0/500', 'Huckleberry Muffin/10/5/4/11/450', 'Baked Meat/0/13/17/0/600', 'Baked Roast/4/13/8/7/900', 'Huckleberry Pie/9/5/4/16/1300', 'Meat Pie/7/11/11/5/1300', 'Basic Salad/13/6/6/13/800', 'Simmered Meat/6/18/13/5/900', 'Vegetable Medley/9/5/8/20/900', 'Vegetable Soup/12/4/7/19/1200', 'Crispy Bacon/0/18/26/0/600', 'Stuffed Turkey/9/16/12/7/1500']
 main(list(read_foods(available)))

I want speed and I want it now!

To speed up your program you can return early. Knowing if sum(total_calories) + food.calories <= MAXCALORIES: then you should return if the inverse is true when food is the food with the lowest amount of calories.

def test_early_leaf(available, MAXCALORIES):
 all_combinations = []
 min_calories = min(a.calories for a in available)

 def find_sp(total):
 try:
 nutrients = [
 tot_avg(total, 'carbs'),
 tot_avg(total, 'protein'),
 tot_avg(total, 'fat'),
 tot_avg(total, 'vitamins')
 ]
 balance = sum(nutrients) / 2 / max(nutrients)
 except ZeroDivisionError:
 return None
 return tot_avg(total, 'nutrients') * balance + 12

 def inner(total):
 if sum(f.calories for f in total) + min_calories > MAXCALORIES:
 sp = find_sp(total)
 if sp is not None:
 all_combinations.append((sp, total))
 else:
 for food in available:
 total_calories = [f.calories for f in total]
 if sum(total_calories) + food.calories <= MAXCALORIES:
 inner(total[:] + [food])

 inner([])
 return max(all_combinations, key=lambda i: i[0])

I added another function that performs memoization via an LRU cache with an unbound size. However it seemed to slow the process.

enter image description here

How to optimizing the algorithm

Firstly the equations are:

$$g(f, a) = fracSigma(f_a_i times f_textcalories_i)Sigma(f_textcalories_i)$$
$$n = g(f, textcarbs), g(f, textprotein), g(f, textfat), g(f, textvitimins)$$
$$textSP = g(f, textnutrients) times fracSigma n2max(n) + textBase gain$$

From here we have to find the maximums.

What's the maximum and minimum that $fracSigma n2max(n)$ can be?

$$ fracn + n + n + n2 times n = frac4n2n = 2$$
$$ fracn + 0 + 0 + 02 times n = fracn2n = 0.5$$

This means all we need to do is ensure the calorie average of all the different nutrients are the same. It doesn't matter what value this average is, only that all have the same.

What's the maximum that $g(f, textnutrients)$ can be?

Firstly taking into account:

$$fracSigma(a_i times b_i)Sigma(b_i) = Sigma(a_i times fracb_iSigma(b_i))$$

We know that these are the calorie average of the foods nutritional value. To maximize this you just want the foods with the highest nutritional value.

Lets work through an example lets say we have the following five foods:

a/10/0/0/0/1

b/0/10/0/0/1

c/0/0/10/0/1

d/0/0/0/10/1

e/1/1/1/1/4

What's the way to maximize SP?

Eating 1 e would give you $4 times 2 = 8$.

Eating 4 a would give you $10 times 0.5 = 5$.

Eating 1 a, b, c and d would give you $10 times 2 = 20$.

And so from here we have deduced eating a, b, c and d in ratios of 1:1:1:1 give the most SP.

This means the rough solution is to find the foods that have the same calorie average for their individual nutrients where you select foods with a bias for ones with high total nutrients.

I'm going to leave it a little while until I accept this amazing answer, just in case there are any massive developments. Thanks for your help! — Mar 17 at 23:43
@RulerOfTheWorld It's always good to wait a while before accepting. :) If someone comes along and posts something better than the above I'd encourage you to give them the tick rather than me. I posted my answer halfway through so others can have easier to read code to work from. — Mar 17 at 23:46
If someone posts a better answer later, you can change your accept vote. I think it makes sense to accept once you've read and understood an answer to make sure it's actually good, and think it's comprehensive enough. Having an accepted answer doesn't close a question. — Mar 18 at 11:53

score 8 · Accepted Answer · 2019-03-19 15:40:38Z

Data representation

Your choice of data representation is curious. It's a middle ground between a fully-serialized text format and a fully-deserialized in-memory format (such as nested tuples or dictionaries). I'd offer that it's not as good as either of the above. If you're going for micro-optimization, you need to do "pre-deserialized" literal variable initialization that doesn't require parsing at all. The best option would probably be named tuples or even plain tuples, i.e.

available = (
 ('Fiddleheads', 3, 1, 0, 3, 80),
 # ...
)

But this won't yield any noticeable benefit, and it's not as maintainable as the alternative: just write a CSV file.

main isn't main

You've written a main function that isn't actually top-level code. This is not advisable. Rename it to something else, and put your top-level code in an actual main function, called from global scope with a standard if __name__ == '__main__' check.

list duplication

This:

totalNames[::]

should simply be

list(totalNames)

snake_case

Your names should follow the format total_names, rather than totalNames.

Also, variables in global scope (i.e. AllSP) should be all-caps; and you shouldn't need to declare them global.

Suggested

This doesn't at all tackle the main issue of algorithmic complexity, only Python usage. It isn't a good implementation, it's just to illustrate some stylistic improvements.

Note a few things:

Having a shebang at the top is very important to indicate to the shell and other programmers what's being executed

Use csv

Use tuple unpacking in your loops where possible

Abbreviate the formation of new lists by doing appends inline

Never except:; at a minimum except Exception: although even this should be more specific

Use f-strings where appropriate

Drop inner lists in list comprehensions when you don't need them

foods.csv

name,carbs,protein,fat,vitamins,calories
Fiddleheads,3,1,0,3,80
Fireweed Shoots,3,0,0,4,150
Prickly Pear Fruit,2,1,1,3,190
Huckleberries,2,0,0,6,80
Rice,7,1,0,0,90
Camas Bulb,1,2,5,0,120
Beans,1,4,3,0,120
Wheat,6,2,0,0,130
Crimini Mushrooms,3,3,1,1,200
Corn,5,2,0,1,230
Beet,3,1,1,3,230
Tomato,4,1,0,3,240
Raw Fish,0,3,7,0,200
Raw Meat,0,7,3,0,250
Tallow,0,0,8,0,200
Scrap Meat,0,5,5,0,50
Prepared Meat,0,4,6,0,600
Raw Roast,0,6,5,0,800
Raw Sausage,0,4,8,0,500
Raw Bacon,0,3,9,0,600
Prime Cut,0,9,4,0,600
Cereal Germ,5,0,7,3,20
Bean Paste,3,5,7,0,40
Flour,15,0,0,0,50
Sugar,15,0,0,0,50
Camas Paste,3,2,10,0,60
Cornmeal,9,3,3,0,60
Huckleberry Extract,0,0,0,15,60
Yeast,0,8,0,7,60
Oil,0,0,15,0,120
Infused Oil,0,0,12,3,120
Simple Syrup,12,0,3,0,400
Rice Sludge,10,1,0,2,450
Charred Beet,3,0,3,7,470
Camas Mash,1,2,9,1,500
Campfire Beans,1,9,3,0,500
Wilted Fiddleheads,4,1,0,8,500
Boiled Shoots,3,0,1,9,510
Charred Camas Bulb,2,3,7,1,510
Charred Tomato,8,1,0,4,510
Charred Corn,8,1,0,4,530
Charred Fish,0,9,4,0,550
Charred Meat,0,10,10,0,550
Wheat Porridge,10,4,0,10,510
Charred Sausage,0,11,15,0,500
Fried Tomatoes,12,3,9,2,560
Bannock,15,3,8,0,600
Fiddlehead Salad,6,6,0,14,970
Campfire Roast,0,16,12,0,1000
Campfire Stew,5,12,9,4,1200
Wild Stew,8,5,5,12,1200
Fruit Salad,8,2,2,10,900
Meat Stock,5,8,9,3,700
Vegetable Stock,11,1,2,11,700
Camas Bulb Bake,12,7,5,4,400
Flatbread,17,8,3,0,500
Huckleberry Muffin,10,5,4,11,450
Baked Meat,0,13,17,0,600
Baked Roast,4,13,8,7,900
Huckleberry Pie,9,5,4,16,1300
Meat Pie,7,11,11,5,1300
Basic Salad,13,6,6,13,800
Simmered Meat,6,18,13,5,900
Vegetable Medley,9,5,8,20,900
Vegetable Soup,12,4,7,19,1200
Crispy Bacon,0,18,26,0,600
Stuffed Turkey,9,16,12,7,1500

Python

#!/usr/bin/env python3

import csv
from time import time

ALL_SP = []
ALL_NAMES = []


def read(fn):
 with open('foods.csv') as f:
 reader = csv.reader(f, newline='')
 next(reader) # ignore title
 return tuple(
 (name, float(carbs), float(protein), float(fat), float(vitamins), float(calories))
 for name, carbs, protein, fat, vitamins, calories in reader
 )


AVAILABLE = read('foods.csv')


def find_combs(total_names, total_carbs, total_protein, total_fat, total_vitamins, total_nutrients,
 total_calories, max_calories):
 for name, carbs, protein, fat, vitamins, calories in AVAILABLE:
 nutrients = carbs+protein+fat+vitamins

 if sum(total_calories) + calories <= max_calories:
 find_combs(total_names + [name],
 total_carbs + [carbs],
 total_protein + [protein],
 total_fat + [fat],
 total_vitamins + [vitamins],
 total_nutrients + [nutrients],
 total_calories + [calories],
 max_calories)
 else:
 # find SP
 try:
 carbs = sum(x * y for x, y in zip(total_calories, total_carbs)) / sum(total_calories)
 protein = sum(x * y for x, y in zip(total_calories, total_protein)) / sum(total_calories)
 fat = sum(x * y for x, y in zip(total_calories, total_fat)) / sum(total_calories)
 vitamins = sum(x * y for x, y in zip(total_calories, total_vitamins)) / sum(total_calories)
 balance = (carbs+protein+fat+vitamins)/(2*max(carbs,protein,fat,vitamins))
 thisSP = sum(x * y for x, y in zip(total_calories, total_nutrients)) / sum(total_calories) * balance + 12
 except Exception:
 thisSP = 0

 # add SP and names to two lists
 ALL_SP.append(thisSP)
 ALL_NAMES.append(total_names)


def calc(max_calories):
 find_combs([], [], [], [], [], [], [], max_calories)
 index = ALL_SP.index(max(ALL_SP))
 print()
 print(f'ALL_SP[index]:.2f ALL_NAMES[index]')


def main():
 for i in range(100, 3000, 10):
 start = time()
 calc(i)
 print(f'Calories: i >>> Time: time()-start:.3f')


if __name__ == '__main__':
 main()

I'm going to do some reading and see what you're doing in terms of algorithm and submit a second answer to suggest a saner one.

Wow, thanks a lot! I'll edit my code following your advice now. Let me know if you find a way to optimise the algorithm! — Mar 17 at 20:10
@RulerOfTheWorld Please do not edit the code in your question once reviewing started. — Mar 17 at 21:03
@greybeard Of course, my apologies. I was not editing the code, but explaining the functions behind it in a section marked as an edit at the end of my post, so it wouldn't affect the previous code. — Mar 17 at 21:11

score 3 · Accepted Answer · 2019-03-23 03:51:59Z

I guess that my code needs a review too, because I am by far no good python programmer, but I wanted to share some of my ideas to solve your probelm that do not fit in a comment. So I hope at least the approach is some optimization to your code, if it is not the code itself.

I looked a bit on the function and thought that there must be an easier way to calculate it. So what I do here is:

$$textrmweighted_nutrients = frac(m odot c)^top cdot nm^top cdot c=fracsum_j^M(m_j times c_j) times n_jk_k=1 ldots Msum_j^M(m_j times c_j)$$

with $m$ being the amount of each foods (1 apple, 2 peaches, ... $rightarrow$ [1,2,...]), $M$ being the amount of foods (67 foods available), $c$ the kcals, $n$ the nutrients and $odot$ is element-wise multiplication. The result is a vector that needs to be summed up for the base value. It gets squared as the balance's numerator is the same. For the maximum in the balance, we can simply plug it in, as it is a vector from which a maximum can be chosen. The result looks in principle like this:
$$textrmSP = textrmsum(textrmweighted_nutrients)^2 cdot frac0.5max(textrmweighted_nutrients) + 12$$
Now as I write it, I think it looks even better like this:
$$textrmSP = frac12 cdot fractextrmsum(textrmweighted_nutrients)^2max(textrmweighted_nutrients) + 12$$

What should be done with this function now?

As you did, I wrote a function using itertools and a lot of possible combinations which luckily starts with the high calory foods, which give quite good results from the beginning. But as you found out yourself, you will be very old when/if the code ever finishes.
Therefore, I chose a genetic algorithm to solve the problem as for my untrained eyes, this looked like a nice way. On the other hand I always wanted to use a GA to solve a problem ... :D

#!/usr/bin/env python3
import numpy as np
import itertools as it
from deap import base, creator, tools, algorithms
import random


def generate_function(skill_gain_multiplier=1, base_skill_gain=12):
 # read in the foods
 names, nutrients, calories = give_food()

 # define skill_point function
 def skill_points(amounts):
 numerator = (amounts * calories).dot(nutrients)
 denominator = amounts.dot(calories)
 weighted_nutrients = np.divide(numerator, denominator)
 base_value = np.sum(weighted_nutrients) ** 2
 balance_modifier = (
 0.5 * 1 / np.max(weighted_nutrients) * skill_gain_multiplier
 )
 result = base_value * balance_modifier + base_skill_gain
 return result

 # define calory check function
 def calory_check(amounts):
 calory_count = amounts.dot(calories)
 return calory_count

 return names, skill_points, calories, calory_check


def give_food():
 available = [
 "Fiddleheads/3/1/0/3/80",
 "Fireweed Shoots/3/0/0/4/150",
 "Prickly Pear Fruit/2/1/1/3/190",
 "Huckleberries/2/0/0/6/80",
 "Rice/7/1/0/0/90",
 "Camas Bulb/1/2/5/0/120",
 "Beans/1/4/3/0/120",
 "Wheat/6/2/0/0/130",
 "Crimini Mushrooms/3/3/1/1/200",
 "Corn/5/2/0/1/230",
 "Beet/3/1/1/3/230",
 "Tomato/4/1/0/3/240",
 "Raw Fish/0/3/7/0/200",
 "Raw Meat/0/7/3/0/250",
 "Tallow/0/0/8/0/200",
 "Scrap Meat/0/5/5/0/50",
 "Prepared Meat/0/4/6/0/600",
 "Raw Roast/0/6/5/0/800",
 "Raw Sausage/0/4/8/0/500",
 "Raw Bacon/0/3/9/0/600",
 "Prime Cut/0/9/4/0/600",
 "Cereal Germ/5/0/7/3/20", # test
 "Bean Paste/3/5/7/0/40",
 "Flour/15/0/0/0/50",
 "Sugar/15/0/0/0/50",
 "Camas Paste/3/2/10/0/60",
 "Cornmeal/9/3/3/0/60",
 "Huckleberry Extract/0/0/0/15/60",
 "Yeast/0/8/0/7/60", # test
 "Oil/0/0/15/0/120",
 "Infused Oil/0/0/12/3/120",
 "Simple Syrup/12/0/3/0/400",
 "Rice Sludge/10/1/0/2/450",
 "Charred Beet/3/0/3/7/470",
 "Camas Mash/1/2/9/1/500",
 "Campfire Beans/1/9/3/0/500",
 "Wilted Fiddleheads/4/1/0/8/500",
 "Boiled Shoots/3/0/1/9/510",
 "Charred Camas Bulb/2/3/7/1/510",
 "Charred Tomato/8/1/0/4/510",
 "Charred Corn/8/1/0/4/530",
 "Charred Fish/0/9/4/0/550",
 "Charred Meat/0/10/10/0/550",
 "Wheat Porridge/10/4/0/10/510",
 "Charred Sausage/0/11/15/0/500",
 "Fried Tomatoes/12/3/9/2/560",
 "Bannock/15/3/8/0/600",
 "Fiddlehead Salad/6/6/0/14/970",
 "Campfire Roast/0/16/12/0/1000",
 "Campfire Stew/5/12/9/4/1200",
 "Wild Stew/8/5/5/12/1200",
 "Fruit Salad/8/2/2/10/900",
 "Meat Stock/5/8/9/3/700",
 "Vegetable Stock/11/1/2/11/700",
 "Camas Bulb Bake/12/7/5/4/400",
 "Flatbread/17/8/3/0/500",
 "Huckleberry Muffin/10/5/4/11/450",
 "Baked Meat/0/13/17/0/600",
 "Baked Roast/4/13/8/7/900",
 "Huckleberry Pie/9/5/4/16/1300",
 "Meat Pie/7/11/11/5/1300",
 "Basic Salad/13/6/6/13/800",
 "Simmered Meat/6/18/13/5/900",
 # "Vegetable Medley/9/5/8/20/900", outdated values
 "Vegetable Medley/8/4/7/17/900",
 "Vegetable Soup/12/4/7/19/1200",
 "Crispy Bacon/0/18/26/0/600",
 "Stuffed Turkey/9/16/12/7/1500",
 ]

 all_names = []
 all_nutrients = []
 all_calories = []
 for item in available:
 name, *nutrients, calories = item.split("/")
 all_names.append(name)
 nutrients = [float(x) for x in nutrients]
 all_nutrients.append(nutrients)
 all_calories.append(float(calories))
 return np.array(all_names), np.array(all_nutrients), np.array(all_calories)


def brute_force(names, f, calory_check, cals):
 # create every possible combination
 combinations = it.product(range(2), repeat=len(names))

 best = 0.0
 cnt = 0
 for comb in combinations:
 # calculate value
 comb = np.array(list(comb))
 new = f(comb)
 # if better, replace best
 if new > best and calory_check(comb):
 best = new
 print(
 [x for x in zip(names, comb) if x[1] != 0], new, comb.dot(cals)
 )
 # show current iteration ... of quite a few
 else:
 sys.stdout.write(f"rcnt")
 sys.stdout.flush()
 cnt += 1


# the genetic algorithm is very simply based on the tutorials here:
# https://deap.readthedocs.io/en/master/examples/index.html
def genetic_algorithm(
 fitness_function,
 cal_chk,
 array_size,
 population_size=300,
 max_iterations=250,
):
 creator.create("FitnessMax", base.Fitness, weights=(1.0,))
 creator.create("Individual", np.ndarray, fitness=creator.FitnessMax)

 toolbox = base.Toolbox()

 # Attribute generator
 toolbox.register("attr_bool", random.randint, 0, 1)

 # Structure initializers
 toolbox.register(
 "individual",
 tools.initRepeat,
 creator.Individual,
 toolbox.attr_bool,
 array_size,
 )
 toolbox.register("population", tools.initRepeat, list, toolbox.individual)

 def cxTwoPointCopy(ind1, ind2):
 """Execute a two points crossover with copy on the input individuals. The
 copy is required because the slicing in numpy returns a view of the data,
 which leads to a self overwritting in the swap operation. It prevents
 ::

 >>> import numpy
 >>> a = numpy.array((1,2,3,4))
 >>> b = numpy.array((5.6.7.8))
 >>> a[1:3], b[1:3] = b[1:3], a[1:3]
 >>> print(a)
 [1 6 7 4]
 >>> print(b)
 [5 6 7 8]
 """
 size = len(ind1)
 cxpoint1 = random.randint(1, size)
 cxpoint2 = random.randint(1, size - 1)
 if cxpoint2 >= cxpoint1:
 cxpoint2 += 1
 else: # Swap the two cx points
 cxpoint1, cxpoint2 = cxpoint2, cxpoint1

 ind1[cxpoint1:cxpoint2], ind2[cxpoint1:cxpoint2] = (
 ind2[cxpoint1:cxpoint2].copy(),
 ind1[cxpoint1:cxpoint2].copy(),
 )

 return ind1, ind2

 # cutoff function was needed, as initial guesses were all above 3000 kcal
 # and no solution could be found. with the smooth cutoff function, the results
 # are pushed below 3000 kcal, which is where they belong.
 # not sure if this is smart or just overshot :D
 def cutoff(individual):
 return 0.5 - 0.5 * np.tanh((cal_chk(individual) - 3000) / 5000)

 # return the cutoff value if higher than 3000
 # and the true value if lower
 def evalFit(individual):
 if cal_chk(individual) <= 3000:
 return (fitness_function(individual),)
 else:
 return (cutoff(individual),)

 # toolbox.register("evaluate", evalOneMax)
 toolbox.register("evaluate", evalFit)
 toolbox.register("mate", tools.cxTwoPoint)
 toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
 toolbox.register("select", tools.selTournament, tournsize=3)

 # Creating the population
 def main():
 pop = toolbox.population(n=population_size)
 hof = tools.HallOfFame(5, similar=np.array_equal)
 stats = tools.Statistics(lambda ind: ind.fitness.values)
 stats.register("avg", np.mean)
 stats.register("std", np.std)
 stats.register("min", np.min)
 stats.register("max", np.max)

 pop, log = algorithms.eaSimple(
 pop,
 toolbox,
 cxpb=0.5,
 mutpb=0.5,
 ngen=max_iterations,
 stats=stats,
 halloffame=hof,
 verbose=True,
 )

 return pop, log, hof

 return main


if __name__ == "__main__":
 # generating the functions
 names, f, cals, calory_check = generate_function()

 # not recommended
 # brute_force(names, f, calory_check, cals)

 # probably better
 ga = genetic_algorithm(
 f, calory_check, len(names), max_iterations=500, population_size=500
 )
 pop, log, hof = ga()

 # printing the result
 print("n########n# DONE #n########")
 for star in hof[1:]:
 [print(i, s) for i, s in zip(star, names) if i > 0]
 print(f"which has calory_check(star) kcal")
 print(f"and gives a SP of f(star)n---n")

and the result is something like this:

1 Vegetable Soup
1 Stuffed Turkey
which has 2700.0 kcal
and gives a SP of 87.34734734734735
---

1 Cereal Germ
1 Vegetable Soup
1 Stuffed Turkey
which has 2720.0 kcal
and gives a SP of 87.04413748413035
---

1 Bean Paste
1 Vegetable Soup
1 Stuffed Turkey
which has 2740.0 kcal
and gives a SP of 87.01479581771551
---

1 Flour
1 Vegetable Soup
1 Stuffed Turkey
which has 2750.0 kcal
and gives a SP of 86.9337837837838
---

87.347 is the highest I found so far. Sometime the algorithm gets stuck at a lower value, you may need to play around with the parameters of the GA to get a faster/better/more robust result. But as the code is very fast, maybe just run it multiple times and see which result is the highest.

I think I did, we should go to the chat chat.stackexchange.com/rooms/91219/… — Mar 23 at 1:11
Yeah it does :) What is $m$. Is that $n$ in my answer, but without calling the function? — Mar 23 at 2:33
I had a go at optimizing the code tonight, I think it's a pretty naive implementation. It's just finds the inverse ratio and picks the highest within a range. And I get 90.761@2400 calories, with 1 Stuffed Turkey and 1 Vegetable Medley. I doubted that it was actually 90.761, but it seems legit from the math in my answer and the OPs code. Do you get the same? — Mar 23 at 6:31

Ludisposed 9,08322267 · Accepted Answer · 2019-03-18 16:10:55Z

I see some replies with general tips for optimization, but I don't see anyone recommending a specific approach called memoization. It works wonders just for this kind of problems (results in some finite range around the <1M mark, 3000 is far below the upper limit).

Basically you would do something like this:

Create a sort of array (this one will be struxtured differently depending on whether you just need the value of the result, only one combination of food items or all combinations). Since no food has negative calories, you can only make it 0-3000

Then you do something like this (pseudocode):

for foodItem in foodItems:
 for value in caloriesArray:
 if caloriesArray[value] != 0: #has been reached before, so I can expand on it
 caloriesArray[value]+foodItems[foodItem] = ... #whatever you need, can be just True

There are plenty of sites explaining memoization and I'm not very good at explanations, but if this doesn't help you then I can include a simple example.

Then just find the highest reached value of the array.

I tested your theory, and found it to not really improve performance. (See the graph in my answer) — Mar 23 at 4:03
I will add an update with the code to my answer, but for exclusion/inclusion kind of problems like this, memoization should be vastly superior to anything else when used in the right way. Cannot promise I'll do it today though. — 2 days ago

4 Answers 4

Readability is #1

I want speed and I want it now!

How to optimizing the algorithm

Data representation

main isn't main

list duplication

snake_case

Suggested

Your Answer

Sign up or log in

Post as a guest

Post as a guest

4 Answers 4

4 Answers 4

Readability is #1

I want speed and I want it now!

How to optimizing the algorithm

Readability is #1

I want speed and I want it now!

How to optimizing the algorithm

Readability is #1

I want speed and I want it now!

How to optimizing the algorithm

Readability is #1

I want speed and I want it now!

How to optimizing the algorithm

Data representation

main isn't main

list duplication

snake_case

Suggested

Data representation

main isn't main

list duplication

snake_case

Suggested

Data representation

main isn't main

list duplication

snake_case

Suggested

Data representation

main isn't main

list duplication

snake_case

Suggested

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

4 Answers
4

4 Answers
4

4 Answers
4