> I know, I’m using cups and teaspoons and not something more universal like grams. If my little model becomes a smash hit, I will try to add metric system support.
It's not that cups/teaspoons aren't universal that makes them noteworthy, it's that they're volumetric. You should be measuring your dry ingredients by weight. The unit isn't that important, we can convert once you start measuring properly.
It's not about linearity, it's that a "cup" of flour can be two completely different amounts of flour depending on how densely it's packed. Going by weight prevents this sort of measurement error.
Well, it CAN vary quite a lot theoretically, but that’s why there are guidelines on how to measure a cup of flour.
https://www.mashed.com/182198/youve-been-measuring-flour-wro...
Yet, to your point: when I bake (at least when I bake sourdough, which is virtually all I make), I always measure using a scale, and I measure in grams. No reason to go for imprecision when you don’t have to.
But these guidelines for how to measure flour by volume are also important simply for historical reasons: lots of the most respected cookbooks measure by volume, not by weight.
"""
For years, chefs, professional bakers, and cookbook authors alike have urged home cooks to become comfortable with a scale, but habits are hard to change. Your mother and grandmother probably didn’t use scales, and you may even have their measuring cups in your kitchen drawer. But using a scale will change the way you cook and bake for the better in many ways.
And the merits of weighing are not only about accuracy: weighing is also more convenient. Weighing is a much easier and cleaner way to measure peanut butter, molasses, or corn syrup, for example—you simply set the mixing bowl on the scale, tare the scale (set it to zero), and measure the ingredient into the bowl, rather than having to scrape it into and out of a measuring cup. And, of course, there is the additional bonus that more than one ingredient can be measured into that same bowl.
Precision matters more in baking than in savory cooking, which is why Sebastien and Matthew provided exact weights, for optimum results. You’ll see that most of these recipes have what may seem crazily specific weights: 519 grams of flour, for instance, or 234 grams of sugar. This is because we converted these recipes from the larger-scale recipes used by the bakery. (This is another benefit of using weights—all recipes can easily be halved, doubled, or tripled, and so on, and they will work.) Do not be intimidated by these specific amounts—when you use a scale, it’s easy to measure 234 grams. However, when converting those weights to volume, we often had to round them off (despite Sebastien and Matthew’s preference that we not). In a short time it should become readily clear why weighing is the preferable route.
We strongly recommend using digital scales, either a bigger one that weighs to the tenth of a gram, or a basic kitchen scale for larger quantities and a palm scale that weighs smaller quantities.
"""
The same cookbook recommends weighing eggs as well; recipes specify egg whites, yolks, and whole eggs by weight (all ingredients are given in both weight and volume, though weight is recommended).
"""
Likewise, eggs vary slightly in weight from one to another; measuring eggs by weight ensures great accuracy. “Large” eggs are 56 grams/2 ounces by definition, but they vary in weight by 10 or more grams, so calling for eggs by weight, as we do in these recipes, guarantees more consistent results. And weighing allows you to use any size egg you have access to, which is especially helpful if you use farm-raised eggs, which are often not graded by size.
"""
You could either just use the egg as-is, since it's likely close, or adjust the rest of your ingredients based on the weight of the egg. I made a calculator for some of my recipes that lets you put in how much of a specific ingredient you've got and tells you the adjusted amounts for the rest. With that it'd be fairly easy.
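For illustration, here's a minimal sketch of that sort of calculator (the recipe and amounts are invented, in grams):

    # Toy proportional recipe scaler. The base recipe is made up.
    BASE_RECIPE = {"flour": 500, "water": 350, "salt": 10, "egg": 56}

    def scale_recipe(recipe, ingredient, measured_amount):
        # Scale everything by the ratio between what you actually have
        # (e.g. the weight of the egg you cracked) and the base amount.
        factor = measured_amount / recipe[ingredient]
        return {name: round(g * factor, 1) for name, g in recipe.items()}

    # If your egg weighs 63 g instead of the nominal 56 g:
    print(scale_recipe(BASE_RECIPE, "egg", 63))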
> Well, it CAN vary quite a lot theoretically, but that’s why there are guidelines on how to measure a cup of flour. So says my baker mom, and so says the internet.
Where baking is done can undermine volumetric measuring, even with sound, consistent technique. I'm a professional bread baker working at altitude in a low-humidity environment. One cup of flour for me will weigh less than a cup for someone in a high-humidity area. Parents and grandparents passing down volumetric recipes in the past were unlikely to experience inconsistent baking/cooking outcomes, because back in the day succeeding generations were apt to remain in the same geographic area.
Well, you’ve got all kinds of problems baking at altitude; you may need to adjust your recipes in many ways. So it feels to me like this is just one of a dozen issues you’ll have.
But yes, I did not mean to endorse measuring by volume—only that you can avoid much of the problem if you follow proper measuring technique. As I noted in my post, I personally use weights when baking.
As for the historical justification for using volumetric recipes: you’re probably overthinking it if you’re connecting it to geography, and especially if you’re connecting it to altitude. More likely the reason would just be that volumetric measuring was easier and cheaper to reproduce until quite recently. Scales can be expensive, can break, etc. Cups are cheap and durable.
Certainly, the historic justification for volumetric measuring is likely what you say, that scales were not a thing for most anyone to have until recently. My point was just that recipes used across generations in a family weren't back then (compared to nowadays) likely subject to geographic changes because people were much less likely to move far. I've got nuclear family members on both US coasts and I'm in a landlocked state. That kind of intra-family dispersal is much more common today.
That's interesting and I never knew that. I have a stainless steel liquid measuring cup, originally my mom's, from Sweden. Probably dates to the 1960s if not older. It's graduated in ml and pints. But its pints are a few ounces larger than US pints. I wonder if it's some other pint scale? I have always just assumed that they got the conversion factor wrong.
Edit: after reading your link, it appears to me that it is graduated in imperial pints, not US pints.
A bonus is when a recipe is partially translated from metric in YouTube captions.
So you have 500 ml of something, then 1 cup of something, and temperatures in Fahrenheit, and you have to guess whether it's US cups or whatever cups the video's country of origin uses.
An interesting normalization of amounts for baking bread and similar foods is called “baker’s percentage”.
The total weight (or mass) of flour in the recipe is 100%, and all ingredients are measured in terms of that. You do end up with very small percentages for ingredients like yeast, but in general it’s nice because you can easily see for instance how sweet a recipe is based on how much sugar is added as a percentage of the total flour.
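A toy illustration (amounts invented, in grams):

    # Baker's percentage: each ingredient as a percentage of total flour weight.
    recipe = {"flour": 1000, "water": 700, "salt": 20, "yeast": 7, "sugar": 50}

    flour = recipe["flour"]
    for name, grams in recipe.items():
        print(f"{name}: {grams / flour:.1%}")
    # flour: 100.0%, water: 70.0%, salt: 2.0%, yeast: 0.7%, sugar: 5.0%

The 70% water here is the "hydration" bakers talk about, and the 5% sugar is the sort of at-a-glance sweetness signal mentioned above.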
I wrote an Android app for managing recipes with baker's percentages, so I definitely agree: being able to scan over the ratios of water, salt, and sugar tells you a lot about the recipe and what the dough will feel like.
I do wish normal cooking recipes had a similar convention - it's rare that I've ever used exact amounts since the ratio and technique are always more important.
I normally agree, but this is clearly just for fun and for poking around with cloud tools. The author is not making claims about accuracy or anything, so I think this post isn't a good target for HN's stampede of “well, actually...” ML-approach bashing. This is a nice post.
In my mind it stands in contrast to the SMART-stats hard drive failure model post from a few days ago: that was an inappropriate approach AND the person was putting it forward as a valid analysis workflow and conclusion.
This post is obviously just for giggles, and it’s very good.
There's a more practical but related cooking problem that I haven't seen discussed: when extracting recipes from websites, it's tricky to parse the actual ingredients and quantities.
For example, "1/2 cup of diced tomatoes or a can of tomatoes (preferably san marzano)". This sort of freeform text doesn't suit regex very well but also lacks substantial context clues. You'd most likely use named entity recognition which could recognize that "1/2" is a quantity, "cup" is the unit, etc. but I haven't gotten very good results yet.
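For what it's worth, a naive regex baseline (my own toy, not a real solution) shows exactly where it breaks down:

    import re

    # Handles the easy shape "<qty> <unit> of <name>" and little else.
    PATTERN = re.compile(
        r"(?P<qty>\d+(?:/\d+)?)\s+(?P<unit>cups?|tsp|tbsp|cans?)\s+(?:of\s+)?(?P<name>[^,(]+)",
        re.IGNORECASE,
    )

    line = "1/2 cup of diced tomatoes or a can of tomatoes (preferably san marzano)"
    m = PATTERN.search(line)
    print(m.groupdict() if m else None)
    # {'qty': '1/2', 'unit': 'cup', 'name': 'diced tomatoes or a can of tomatoes '}
    # The alternative clause leaks into the name, and "a can" (no numeric
    # quantity) wouldn't match at all - hence NER/CRFs.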
Maybe I'll write up a post when I land on a solution.
Ah yeah, I saw this initially but haven't given CRFs a try yet. I was hoping ML had advanced enough that I could throw newer solutions at the problem. Thanks for linking this, though; I just realized the NYT provided a lot of labelled training data, which I will definitely use.
The next logical step here, it seems to me, is to scrape tens of thousands of recipes and apply this more broadly. There are, alas, a dozen or more dimensions along which recipes will be difficult to sanitize and normalize: weight vs. volume, variations in how similar ingredients are described, and, perhaps worst of all, the actual steps!
But are these challenges insurmountable? Maybe a few would limit you in some respects, but an intrepid coder-cook could certainly push this concept further.
You could also turn this all around and have a “consensus recipe generator.” Type in “pesto genovese” and it gives you a best guess of ingredients and amounts, with links to some recipes roughly matching those consensus amounts, perhaps with notable variations/“nearby” recipe clusters (sun-dried tomato pesto? or just the most popular pine nut substitute?) noted too.
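In its crudest form (invented numbers, pesto-ish ingredients), the consensus step could be little more than per-ingredient medians, with rarely-used ingredients flagged as variations:

    from statistics import median

    # Three hypothetical parsed pesto recipes, amounts in grams.
    recipes = [
        {"basil": 60, "pine nuts": 30, "parmesan": 50, "olive oil": 80},
        {"basil": 50, "pine nuts": 25, "parmesan": 60, "olive oil": 100},
        {"basil": 55, "walnuts": 30, "parmesan": 55, "olive oil": 90},
    ]

    ingredients = {name for r in recipes for name in r}
    for name in sorted(ingredients):
        amounts = [r[name] for r in recipes if name in r]
        print(f"{name}: median {median(amounts)} g, "
              f"in {len(amounts)}/{len(recipes)} recipes")

Ingredients that appear in only a minority of recipes (walnuts here) are exactly the "nearby variation" candidates.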
I noticed something about Adam. Suppose you want to do gradient accumulation. Easy: compute the gradients of the loss with respect to each model parameter and accumulate them over N micro-batches. Then pass the result into Adam, performing a single step. This is standard gradient accumulation; with N = 8 batches of 8 examples, it's equivalent to running a mega-GPU that can process 64 training examples at once rather than a small GPU that can only process 8. It just takes longer to train, since in that case you're performing an Adam update only every 8 steps.
But I was staring at the Adam formula and thought of something. I'm not sure if it makes sense, but there seems to be an "alternate" way to accumulate gradients:
For each training example, compute the gradients, then apply them in a special way. The final step of Adam normally looks like this:
param = param - (lr * m_t / (v_sqrt + epsilon_t))
I propose accumulating the gradients like this:
accum = accum + (lr * m_t / (v_sqrt + epsilon_t))
Then after N training samples, when you want to do the actual variable update:
param = param - accum
accum = 0
The advantage of this approach (if it works at all) is that Adam updates continuously. Every training example would cause Adam's mean and variance estimators to update. (Recall that the whole point of Adam is that it tracks mean and variance for every parameter in the entire model.)
So, in traditional gradient accumulation, those mean and variance slots would only update every N training examples. With this approach, they would update every training example, and then the model params update every N training examples.
It might seem like a small tweak, but Adam's variance stats are crucial; they're what make Adam effective. Updating the variance 8x more frequently might be an advantage.
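A toy NumPy sketch of the variant, for a single parameter tensor. The hyperparameter names are the standard Adam ones, and the bias-correction terms are my assumption, since the pseudocode above doesn't spell them out:

    import numpy as np

    def adam_with_banked_updates(param, grads, N=8, lr=1e-3,
                                 beta1=0.9, beta2=0.999, eps=1e-8):
        m = np.zeros_like(param)      # running mean estimate
        v = np.zeros_like(param)      # running variance estimate
        accum = np.zeros_like(param)  # banked update, applied every N steps
        for t, g in enumerate(grads, start=1):
            # The moment estimates update on EVERY training example...
            m = beta1 * m + (1 - beta1) * g
            v = beta2 * v + (1 - beta2) * g * g
            m_hat = m / (1 - beta1 ** t)
            v_hat = v / (1 - beta2 ** t)
            # ...but the step itself is accumulated rather than applied.
            accum += lr * m_hat / (np.sqrt(v_hat) + eps)
            if t % N == 0:
                param = param - accum  # the actual variable update
                accum = np.zeros_like(param)
        return param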
One project I would like to do myself is a follow-up to YY Ahn et al's Flavor Networks research, but this time with some extra `zest`: using the flavor and ingredient networks, but turbocharging the model with some kind of molecular representation of the flavor compounds, e.g. via dgl-lifesci's graph-based representations of individual molecules, since graph neural networks are all the rage these days. I already made a kind of graph neural network thingy for the AICrowd Learning to Smell competition, which was a similar problem :) https://www.aicrowd.com/challenges/learning-to-smell
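For a flavor of what the molecular side might look like (an untested sketch; the vanillin SMILES is from memory, so double-check it):

    from dgllife.utils import smiles_to_bigraph, CanonicalAtomFeaturizer

    # Turn a flavor compound's SMILES string into a DGL graph with per-atom
    # features, ready to feed into a GNN. Vanillin is just an example.
    vanillin = "COc1cc(C=O)ccc1O"
    g = smiles_to_bigraph(vanillin, node_featurizer=CanonicalAtomFeaturizer())
    print(g.num_nodes(), g.ndata["h"].shape)  # atoms, (atoms, feature_dim)

From there, each compound's graph embedding could replace (or augment) the plain compound nodes in the original flavor network.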
You can tell HN is full of engineers: all discussing the difference between weighing and measuring by volume, the utility of using DL for this... but no discussion of the end product. Well, I made this the other day. I measured (by volume), so I probably didn't do that right, and my oven is crummy, so the temperature was probably off, and I think I left it in a little too long (or I'm at the wrong altitude?)
So the outside was a bit more crispy than I'd expect, but it wasn't burnt. The inside was definitely cakey - not fluffy enough to be called cake, but not solid or hard enough to be a proper cookie (the outside was cookie-like). Quite edible, but it turns out you can actually use too many chocolate chips...
Next step: watch Ben Krasnow's "Cookie Perfection Machine" video[0] on his Applied Science channel.
Next step: integrate the model with the machine. I wasn't able to reproduce this (couldn't find a link to the data in the article, the code to train the model, etc.).
An interesting application would be searching for a feature configuration that produces a specific desired output activation (e.g. a 50/50 split between two of the classes). I don't know if TensorFlow has an automated way to do this; perhaps it could be a fun follow-up project for the author.
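In principle you can get this by freezing the weights and running gradient descent on the input instead. A minimal sketch, where `model` (a trained Keras classifier with softmax output) and the feature count are assumptions:

    import tensorflow as tf

    def invert_model(model, n_features, target=(0.5, 0.5), steps=500, lr=0.05):
        # Optimize the INPUT, not the weights, toward a target distribution.
        x = tf.Variable(tf.random.normal([1, n_features]))
        y_target = tf.constant([list(target)], dtype=tf.float32)
        opt = tf.keras.optimizers.Adam(learning_rate=lr)
        for _ in range(steps):
            with tf.GradientTape() as tape:
                loss = tf.reduce_mean(
                    tf.keras.losses.kl_divergence(y_target, model(x)))
            opt.apply_gradients([(tape.gradient(loss, x), x)])
        return x.numpy()  # a feature vector the model maps to ~the target split

Note that the recovered features may be implausible (e.g. negative cup counts) unless you add constraints.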
Well, we know what happens next. Some ambitious kid, in some random corner of the world, is going to cut and paste it and sell it to some police dept as a "Google" profiler. It will work as badly as the police do, but now they just point at the app to cover their ass, because, you know, Google said so, Judge. Where it doesn't do what the cops want, the kid will add "features" to make sure it does. Give it a year or two and the local PD will be happily promoting it to everyone else, for a decent commission of course. "Whistleblowers" will get taken care of. VCs will step in to scale up ops. The price will be so cheap it will bankrupt most honest competition. Half the world will be using it soon enough, and then of course Palantir will step in and buy them. The billionaire founder will then strut around grooming the next gen, funding further mindlessness. And the same thing will repeat from Medicine to the Military, from Finance to Psychology, all the while a clueless buffoon class of corporate-programmed robots stands up and talks about how they are empowering the masses.
So how do we stop the cycle? Don't empower the masses. Just yet. Don't work for execs pushing such narratives. If you are in a position of authority, sideline them. They don't know what they are doing. Scaling everything is a prime directive for these robots, and they are too dumb and unimaginative to rewrite their own code, let alone decide what is good for the population.