> I know, I’m using cups and teaspoons and not something more universal like grams. If my little model becomes a smash hit, I will try to add metric system support.
It's not that cups/teaspoons aren't universal that makes them noteworthy, it's that they're volumetric. You should be measuring your dry ingredients by weight. The unit isn't that important, we can convert once you start measuring properly.
It's not about linearity, it's that a "cup" of flour can be two completely different amounts of flour depending on how densely it's packed. Going by weight prevents this sort of measurement error.
Well, it CAN vary quite a lot theoretically, but that’s why there are guidelines on how to measure a cup of flour.
https://www.mashed.com/182198/youve-been-measuring-flour-wro...
Yet, to your point: when I bake (at least when I bake sourdough, which is virtually all I make), I always measure using a scale, and I measure in grams. No reason to go for imprecision when you don’t have to.
But these guidelines for how to measure flour by volume are also important simply for historical reasons: lots of the most respected cookbooks measure by volume, not by weight.
"""
For years, chefs, professional bakers, and cookbook authors alike have urged home cooks to become comfortable with a scale, but habits are hard to change. Your mother and grandmother probably didn’t use scales, and you may even have their measuring cups in your kitchen drawer. But using a scale will change the way you cook and bake for the better in many ways.
And the merits of weighing are not only about accuracy: weighing is also more convenient. Weighing is a much easier and cleaner way to measure peanut butter, molasses, or corn syrup, for example—you simply set the mixing bowl on the scale, tare the scale (set it to zero), and measure the ingredient into the bowl, rather than having to scrape it into and out of a measuring cup. And, of course, there is the additional bonus that more than one ingredient can be measured into that same bowl.
Precision matters more in baking than in savory cooking, which is why Sebastien and Matthew provided exact weights, for optimum results. You’ll see that most of these recipes have what may seem crazily specific weights: 519 grams of flour, for instance, or 234 grams of sugar. This is because we converted these recipes from the larger-scale recipes used by the bakery. (This is another benefit of using weights—all recipes can easily be halved, doubled, or tripled, and so on, and they will work.) Do not be intimidated by these specific amounts—when you use a scale, it’s easy to measure 234 grams. However, when converting those weights to volume, we often had to round them off (despite Sebastien and Matthew’s preference that we not). In a short time it should become readily clear why weighing is the preferable route.
We strongly recommend using digital scales, either a bigger one that weighs to the tenth of a gram, or a basic kitchen scale for larger quantities and a palm scale that weighs smaller quantities.
"""
The same cookbook recommends weighing eggs as well; recipes specify egg whites, yolks, and whole eggs by weight (all ingredients are given in both weight and volume, though weight is recommended).
"""
Likewise, eggs vary slightly in weight from one to another; measuring eggs by weight ensures great accuracy. “Large” eggs are 56 grams/2 ounces by definition, but they vary in weight by 10 or more grams, so calling for eggs by weight, as we do in these recipes, guarantees more consistent results. And weighing allows you to use any size egg you have access to, which is especially helpful if you use farm-raised eggs, which are often not graded by size.
"""
You could either just use the egg as-is, since it's likely close, or adjust the rest of your ingredients based on the weight of the egg. I made a calculator for some of my recipes that lets you put in how much of a specific ingredient you've got and tells you the adjusted amounts for the rest. With that it'd be fairly easy.
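For illustration, here's a minimal sketch of that sort of calculator (the recipe and amounts are invented, in grams):

    # Toy proportional recipe scaler. The base recipe is made up.
    BASE_RECIPE = {"flour": 500, "water": 350, "salt": 10, "egg": 56}

    def scale_recipe(recipe, ingredient, measured_amount):
        # Scale everything by the ratio between what you actually have
        # (e.g. the weight of the egg you cracked) and the base amount.
        factor = measured_amount / recipe[ingredient]
        return {name: round(g * factor, 1) for name, g in recipe.items()}

    # If your egg weighs 63 g instead of the nominal 56 g:
    print(scale_recipe(BASE_RECIPE, "egg", 63))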
> Well, it CAN vary quite a lot theoretically, but that’s why there are guidelines on how to measure a cup of flour. So says my baker mom, and so says the internet.
Where baking is done can undermine volumetric measuring, even with sound, consistent technique. I'm a professional bread baker working at altitude in a low-humidity environment. One cup of flour for me will weigh less than a cup for someone in a high-humidity area. Parents and grandparents passing down volumetric recipes in the past were unlikely to experience inconsistent baking/cooking outcomes, because back in the day succeeding generations were apt to remain in the same geographic area.
Well, you’ve got all kinds of problems baking at altitude; you may need to adjust your recipes in many ways. So it feels to me like this is just one of a dozen issues you’ll have.
But yes, I did not mean to endorse measuring by volume—only that you can avoid much of the problem if you follow proper measuring technique. As I noted in my post, I personally use weights when baking.
As for the historical justification for using volumetric recipes: you’re probably overthinking it if you’re connecting it to geography, and especially if you’re connecting it to altitude. More likely the reason would just be that volumetric measuring was easier and cheaper to reproduce until quite recently. Scales can be expensive, can break, etc. Cups are cheap and durable.
Certainly, the historic justification for volumetric measuring is likely what you say, that scales were not a thing for most anyone to have until recently. My point was just that recipes used across generations in a family weren't back then (compared to nowadays) likely subject to geographic changes because people were much less likely to move far. I've got nuclear family members on both US coasts and I'm in a landlocked state. That kind of intra-family dispersal is much more common today.
That's interesting and I never knew that. I have a stainless steel liquid measuring cup, originally my mom's, from Sweden. Probably dates to the 1960s if not older. It's graduated in ml and pints. But its pints are a few ounces larger than US pints. I wonder if it's some other pint scale? I have always just assumed that they got the conversion factor wrong.
Edit: after reading your link, it appears to me that it is graduated in imperial pints, not US pints.
A bonus is when a recipe is partially translated from metric in YouTube captions.
So you have 500 ml of something, then 1 cup of something, and temperatures in Fahrenheit, and you have to guess whether it's US cups or whatever cups the video's country of origin uses.
An interesting normalization of amounts for baking bread and similar foods is called “baker’s percentage”.
The total weight (or mass) of flour in the recipe is 100%, and all ingredients are measured in terms of that. You do end up with very small percentages for ingredients like yeast, but in general it’s nice because you can easily see for instance how sweet a recipe is based on how much sugar is added as a percentage of the total flour.
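A toy illustration (amounts invented, in grams):

    # Baker's percentage: each ingredient as a percentage of total flour weight.
    recipe = {"flour": 1000, "water": 700, "salt": 20, "yeast": 7, "sugar": 50}

    flour = recipe["flour"]
    for name, grams in recipe.items():
        print(f"{name}: {grams / flour:.1%}")
    # flour: 100.0%, water: 70.0%, salt: 2.0%, yeast: 0.7%, sugar: 5.0%

The 70% water here is the "hydration" bakers talk about, and the 5% sugar is the sort of at-a-glance sweetness signal mentioned above.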
I wrote an Android app for managing recipes with baker's percentages, so I definitely agree: being able to scan over the ratios of water, salt, and sugar tells you a lot about the recipe and what the dough will feel like.
I do wish normal cooking recipes had a similar convention - it's rare that I've ever used exact amounts since the ratio and technique are always more important.
I normally agree, but this is clearly just for fun and for poking around with cloud tools. The author is not making claims about accuracy or anything, so I think this post isn't a good target for HN's stampede of “well, actually...” ML-approach bashing. This is a nice post.
In my mind it stands in contrast to the SMART-stats hard drive failure model post from a few days ago: that was an inappropriate approach AND the person was putting it forward as a valid analysis workflow and conclusion.
This post is obviously just for giggles, and it’s very good.
There's a more practical but related cooking problem that I haven't seen discussed: when extracting recipes from websites, it's tricky to parse the actual ingredients and quantities.
For example, "1/2 cup of diced tomatoes or a can of tomatoes (preferably san marzano)". This sort of freeform text doesn't suit regex very well but also lacks substantial context clues. You'd most likely use named entity recognition which could recognize that "1/2" is a quantity, "cup" is the unit, etc. but I haven't gotten very good results yet.
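For what it's worth, a naive regex baseline (my own toy, not a real solution) shows exactly where it breaks down:

    import re

    # Handles the easy shape "<qty> <unit> of <name>" and little else.
    PATTERN = re.compile(
        r"(?P<qty>\d+(?:/\d+)?)\s+(?P<unit>cups?|tsp|tbsp|cans?)\s+(?:of\s+)?(?P<name>[^,(]+)",
        re.IGNORECASE,
    )

    line = "1/2 cup of diced tomatoes or a can of tomatoes (preferably san marzano)"
    m = PATTERN.search(line)
    print(m.groupdict() if m else None)
    # {'qty': '1/2', 'unit': 'cup', 'name': 'diced tomatoes or a can of tomatoes '}
    # The alternative clause leaks into the name, and "a can" (no numeric
    # quantity) wouldn't match at all - hence NER/CRFs.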
Maybe I'll write up a post when I land on a solution.
Ah yeah, I saw this initially but haven't given CRFs a try yet. I was hoping ML had advanced enough that I could throw newer solutions at the problem. Thanks for linking this, though; I just realized the NYT provided a lot of labelled training data, which I will definitely use.
The next logical step here, it seems to me, is to scrape tens of thousands of recipes and apply this more broadly. There are, alas, a dozen or more dimensions along which recipes will be difficult to sanitize and normalize: weight vs. volume, variations in how similar ingredients are described, and, perhaps worst of all, the actual steps!
But are these challenges insurmountable? Maybe a few would limit you in some respects, but an intrepid coder-cook could certainly push this concept further.
You could also turn this all around and have a “consensus recipe generator.” Type in “pesto genovese” and it gives you a best guess of ingredients and amounts, with links to some recipes roughly matching those consensus amounts, perhaps with notable variations/“nearby” recipe clusters (sun-dried tomato pesto? or just the most popular pine nut substitute?) noted too.
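In its crudest form (invented numbers, pesto-ish ingredients), the consensus step could be little more than per-ingredient medians, with rarely-used ingredients flagged as variations:

    from statistics import median

    # Three hypothetical parsed pesto recipes, amounts in grams.
    recipes = [
        {"basil": 60, "pine nuts": 30, "parmesan": 50, "olive oil": 80},
        {"basil": 50, "pine nuts": 25, "parmesan": 60, "olive oil": 100},
        {"basil": 55, "walnuts": 30, "parmesan": 55, "olive oil": 90},
    ]

    ingredients = {name for r in recipes for name in r}
    for name in sorted(ingredients):
        amounts = [r[name] for r in recipes if name in r]
        print(f"{name}: median {median(amounts)} g, "
              f"in {len(amounts)}/{len(recipes)} recipes")

Ingredients that appear in only a minority of recipes (walnuts here) are exactly the "nearby variation" candidates.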
I noticed something about Adam. Suppose you want to do gradient accumulation. Easy: compute the gradients of the loss with respect to each model parameter and accumulate them over N micro-batches. Then pass the result into Adam, performing a single step. This is standard gradient accumulation; with N = 8 batches of 8 examples, it's equivalent to running a mega-GPU that can process 64 training examples at once rather than a small GPU that can only process 8. It just takes longer to train, since in that case you're performing an Adam update only every 8 steps.
But I was staring at the Adam formula and thought of something. I'm not sure if it makes sense, but there seems to be an "alternate" way to accumulate gradients:
For each training example, compute the gradients, then apply them in a special way. The final step of Adam normally looks like this:
param = param - (lr * m_t / (v_sqrt + epsilon_t))
I propose accumulating the gradients like this:
accum = accum + (lr * m_t / (v_sqrt + epsilon_t))
Then after N training samples, when you want to do the actual variable update:
param = param - accum
accum = 0
The advantage of this approach (if it works at all) is that Adam updates continuously. Every training example would cause Adam's mean and variance estimators to update. (Recall that the whole point of Adam is that it tracks mean and variance for every parameter in the entire model.)
So, in traditional gradient accumulation, those mean and variance slots would only update every N training examples. With this approach, they would update every training example, and then the model params update every N training examples.
It might seem like a small tweak, but Adam's variance stats are crucial; they're what make Adam effective. Updating the variance 8x more frequently might be an advantage.
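A toy NumPy sketch of the variant, for a single parameter tensor. The hyperparameter names are the standard Adam ones, and the bias-correction terms are my assumption, since the pseudocode above doesn't spell them out:

    import numpy as np

    def adam_with_banked_updates(param, grads, N=8, lr=1e-3,
                                 beta1=0.9, beta2=0.999, eps=1e-8):
        m = np.zeros_like(param)      # running mean estimate
        v = np.zeros_like(param)      # running variance estimate
        accum = np.zeros_like(param)  # banked update, applied every N steps
        for t, g in enumerate(grads, start=1):
            # The moment estimates update on EVERY training example...
            m = beta1 * m + (1 - beta1) * g
            v = beta2 * v + (1 - beta2) * g * g
            m_hat = m / (1 - beta1 ** t)
            v_hat = v / (1 - beta2 ** t)
            # ...but the step itself is accumulated rather than applied.
            accum += lr * m_hat / (np.sqrt(v_hat) + eps)
            if t % N == 0:
                param = param - accum  # the actual variable update
                accum = np.zeros_like(param)
        return param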
One project I would like to do myself is a follow-up to YY Ahn et al's Flavor Networks research, but this time with some extra `zest`: using the flavor and ingredient networks, but turbocharging the model with some kind of molecular representation of the flavor compounds, e.g. via dgl-lifesci's graph-based representations of individual molecules, since graph neural networks are all the rage these days. I already made a kind of graph neural network thingy for the AICrowd Learning to Smell competition, which was a similar problem :) https://www.aicrowd.com/challenges/learning-to-smell
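For a flavor of what the molecular side might look like (an untested sketch; the vanillin SMILES is from memory, so double-check it):

    from dgllife.utils import smiles_to_bigraph, CanonicalAtomFeaturizer

    # Turn a flavor compound's SMILES string into a DGL graph with per-atom
    # features, ready to feed into a GNN. Vanillin is just an example.
    vanillin = "COc1cc(C=O)ccc1O"
    g = smiles_to_bigraph(vanillin, node_featurizer=CanonicalAtomFeaturizer())
    print(g.num_nodes(), g.ndata["h"].shape)  # atoms, (atoms, feature_dim)

From there, each compound's graph embedding could replace (or augment) the plain compound nodes in the original flavor network.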
You can tell HN is full of engineers: all discussing the difference between weighing and measuring by volume, the utility of using DL for this... but no discussion of the end product. Well, I made this the other day. I measured (by volume), so I probably didn't do that right, and my oven is crummy, so the temperature was probably off, and I think I left it in a little too long (or I'm at the wrong altitude?)
So the outside was a bit more crispy than I'd expect, but it wasn't burnt. The inside was definitely cakey - not fluffy enough to be called cake, but not solid or hard enough to be a proper cookie (the outside was cookie-like). Quite edible, but it turns out you can actually use too many chocolate chips...
Next step: watch Ben Krasnow's "Cookie Perfection Machine" video[0] on his Applied Science channel.
Next step: integrate the model with the machine. I wasn't able to reproduce this (couldn't find a link to the data in the article, the code to train the model, etc.).
An interesting application would be searching for a feature configuration that produces a specific desired output activation (e.g. a 50/50 split between two of the classes). I don't know if TensorFlow has an automated way to do this; perhaps it could be a fun follow-up project for the author.
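In principle you can get this by freezing the weights and running gradient descent on the input instead. A minimal sketch, where `model` (a trained Keras classifier with softmax output) and the feature count are assumptions:

    import tensorflow as tf

    def invert_model(model, n_features, target=(0.5, 0.5), steps=500, lr=0.05):
        # Optimize the INPUT, not the weights, toward a target distribution.
        x = tf.Variable(tf.random.normal([1, n_features]))
        y_target = tf.constant([list(target)], dtype=tf.float32)
        opt = tf.keras.optimizers.Adam(learning_rate=lr)
        for _ in range(steps):
            with tf.GradientTape() as tape:
                loss = tf.reduce_mean(
                    tf.keras.losses.kl_divergence(y_target, model(x)))
            opt.apply_gradients([(tape.gradient(loss, x), x)])
        return x.numpy()  # a feature vector the model maps to ~the target split

Note that the recovered features may be implausible (e.g. negative cup counts) unless you add constraints.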
Well, we know what happens next. Some ambitious kid, in some random corner of the world, is going to cut and paste it and sell it to some police dept as a "Google" profiler. It will work as badly as the police do, but now they just point at the app to cover their ass, because, you know, Google said so, Judge. Where it doesn't do what the cops want, the kid will add "features" to make sure it does. Give it a year or two and the local PD will be happily promoting it to everyone else, for a decent commission of course. "Whistleblowers" will get taken care of. VCs will step in to scale up ops. The price will be so cheap it will bankrupt most honest competition. Half the world will be using it soon enough, and then of course Palantir will step in and buy them. The billionaire founder will then strut around grooming the next gen, funding further mindlessness. And the same thing will repeat from Medicine to the Military, from Finance to Psychology, all the while a clueless buffoon class of corporate-programmed robots stands up and talks about how they are empowering the masses.
So how do we stop the cycle? Don't empower the masses. Just yet. Don't work for execs pushing such narratives. If you are in a position of authority, sideline them. They don't know what they are doing. Scaling everything is a prime directive for these robots, and they are too dumb and unimaginative to rewrite their own code, let alone decide what is good for the population.