Posts

Showing posts from October, 2021

Will It Bayes?

I'm trying out a new catchphrase. "Will It Bayes?" is meant to remind me to check whether a point estimate can be turned into an expectation over a distribution, whether a base rate has been considered, whether there is prior information that didn't make its way into a "data driven" discussion, whether I have multiple hypotheses, etc. I haven't tried it often yet! But I think it should work pretty well. The phrase piggybacks off of "Will It Blend?", a silly series of videos where someone checks whether a possibly nonsensical object (an iPhone, a miniature blender, etc.) will blend when added to a blender. I already randomly think "Will It Blend?", hopefully bootstrapping the noticing. Examples from just the few days it's come up already where the answer was yes, it will Bayes: [REDACTED] for work; guessing which member of a partnership carved which pumpkin; [REDACTED] for being raw; Among Us, when observing some behavior which impli...

How Well Am I Coping?

CW: Suicide I have a metric I use to track how well I'm coping with, y'know, life. I haven't seen anyone else use it, and I think it's probably very particular to me. What I do is any time I find myself thinking about death - my own death, not others' - I briefly reflect, and ask myself how long it's been since I last remember thinking similar thoughts. Then I have a scale on which I place myself. How long would it have to be before I'm surprised it's been so long? For example, two days ago I thought about how none of [this] would be a problem for me if I didn't exist. Then I reflected and realized I had no similar thoughts the previous day, and was briefly pretty happy, because I'm currently on the 1-day part of the scale, and I hadn't had any such thoughts for 1 day. The scale: a few hours - things are bad. Tell someone, get help, etc. eight waking hours - things are not good. Look for interventions. one day - problematic. If you know wh...

Estimating Now What We'll Later Know The Rate Will Be

Here's the setup: a stream of events comes in over time, most from working sources and some from broken sources. We're trying to discriminate between them in real time, accepting only the legitimate events. When we accept an event from a broken source, we only learn that it was broken sometime in the future. Nevertheless, we'd like to estimate the fraction of our accepted events which are from broken sources in as near real time as possible. We have lots of historical data, so overall we have a very good idea of our past broken rates, and also of the distribution of delays before we learn which events were broken. Unfortunately, sometimes we'd like to know the broken rate within a relatively small segment, so we might not have lots of historical data for that specific segment. Let's discuss some ways of performing the estimation and their pros and cons. Past Performance Perfectly Predicts Pfuture Bucket delay times. Find the overall distribution of delay buckets...
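To make the bucketing idea concrete, here's a minimal sketch of a delay-corrected estimator. The delay-bucket CDF and the events below are illustrative, not real historical data: each known-broken event is up-weighted by the inverse of the probability we'd have heard about it by now, a Horvitz-Thompson-style correction.

```python
# P(a broken event has been discovered within <= d days of acceptance),
# estimated from history. Values here are made up for illustration.
DELAY_CDF = {1: 0.30, 3: 0.60, 7: 0.85, 30: 1.00}

def reported_fraction(age_days):
    """Fraction of eventual broken labels we expect to have seen
    for an event that is age_days old."""
    frac = 0.0
    for d, cdf in sorted(DELAY_CDF.items()):
        if age_days >= d:
            frac = cdf
    return frac

def estimate_broken_rate(events):
    """events: list of (age_days, known_broken) for accepted events.
    Each known-broken event is up-weighted by the inverse of the
    probability we'd have heard about it by now."""
    est_broken = 0.0
    for age_days, known_broken in events:
        if known_broken:
            p = reported_fraction(age_days)
            if p > 0:
                est_broken += 1.0 / p
    return est_broken / len(events)

events = [(1, True), (1, False), (3, False), (7, True), (30, False)]
print(round(estimate_broken_rate(events), 3))  # -> 0.902
```

The catch, as the post discusses, is that the correction gets noisy for very recent events (tiny reported fractions blow up the weights) and for small segments with little history behind the CDF.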

Creating an extra character

For Halloween, I'm running a light horror-ish RPG for some friends. We play other RPGs regularly. I had a great little game in mind called ViewScream, designed for online play. Everyone plays a character isolated on a spaceship that has Problems (TM). Each character has two life-threatening emergencies which must be solved lest they die in the end. Each character also has three technobabble solutions to help solve others' emergencies. These may or may not work; the player knows, but the other players don't. Everyone takes turns dialoguing, telling the others about their emergencies, offering up solutions, seeing how they worked (or didn't). But there are 6 of us, and I didn't have a prebuilt scenario that went past 5. Characters are pretty minimal. They have a tiny personality description and a little bit of relationship info, a couple of barebones sentences explaining how they feel or think about another character. In addition they have one secret or twist or scripted mi...

Pumpkins!

Today we carved pumpkins. No, wait. Today we carved pumpkins! We've wanted to carve pumpkins for several years. Each year, we think and say at home "we should carve pumpkins!" This year, instead, at the store, Samantha asked "should we buy pumpkins?" and I said "yep" and she bought pumpkins. Then we realized her hands would almost certainly not let her carve a pumpkin and became sad, but thought okay, she can draw, I will carve. And then a friend emailed, a friend I'd been meaning to invite over for [something] for months and just hadn't, so I said okay, serendipitous. Replied with a pumpkin carving invite. Turns out she really enjoys carving pumpkins and hadn't for some years. The moral of this story is that "just do it and adapt" is usually better than "wait for a situation and time that you are confident is best and only then do it".

Intentionality Aesthetic

"Appearing intentional" covers a multitude of sins. This is not the aesthetic I generally aim for. It's the aesthetic that I aim for as a fallback, when I don't know what I'm doing. It's the aesthetic I aim for when something's already gone wrong. When you've got yourself a jazz solo and you play a wrong note, the plan is to fit that exact same note into something good, maybe 4-5 seconds later. If you have no idea how to set the table, make sure everything's symmetrical. If you don't know what a good way to arrange your bookshelf might be, pick any quality and sort by it - if you aim to please yourself, pick a quality that tickles you; if you aim to please others, pick a quality that is easily discernible at a glance. I know this approach works well. I've used it a lot, I've seen the results. But I don't know its long-term effects on me. Naively, optimizing a bunch of things to lie to the world and say "attention! I definite...

Make A Bad Plan

One of my more lucrative techniques at work I call Make A Bad Plan. This is where you know that something has to happen, but no one really has a good idea of exactly what or exactly how to do it, and also it's a lot easier to go work on less ambiguous projects, so how do you make it happen anyway, when no one, including you, really knows what "it" is? Two common ways this comes up are when you can see that there's an important problem in a system even if you don't know how to solve it, and when you can see that there's an important piece of information that humans know that isn't baked into automated systems. If only it were, then something would be better, who knows what, but clearly humans know and act on this kind of thing, so we should be able to use it, right? Make A Bad Plan calls for you to make a bad plan. Think of a few "something"s, pick one or synthesize one from your (bad) options, write it up as a draft proposal with enough detai...

Running A Moderately Effective Off-The-Cuff Brainstorming Session At Work

"Hey Guy - is this afternoon a good time for brainstorming [a certain project]? I'll send an invitation to everyone shortly. Is it OK for you to run the brainstorming session today?" Yes, it's definitely okay. I have a script I use for off-the-cuff brainstorming sessions. It's very simple and moderately effective, but, importantly, it is very consistently moderately effective. Decide the topic. I will usually pick "the entire project, except with this particular goal or this particular constraint". Today, I chose to focus everyone on our internal beliefs about the analysis and numbers underlying the project, explicitly ignoring the external-facing view on the underlying fundamentals. Write up a short summary of background material. What do participants need to know, minimally, to do some brainstorming? Are we going to be focusing on the entire thing, one portion of the thing, or what? This should be readable and digestible in under 3 minutes. Once everyone is...

[Review] Counterfactual Estimation and Optimization of Click Metrics for Search Engines

Paper Review Counterfactual Estimation and Optimization of Click Metrics for Search Engines - Lihong Li, Shunbao Chen, Jim Kleban, Ankur Gupta Also see Stripe's talk on how they use these concepts in production. Condensed abstract: In this paper, we propose to address [the expense of A/B testing] using causal inference techniques, under the contextual-bandit framework. This approach effectively allows one to run (potentially infinitely) many A/B tests offline from search logs, making it possible to estimate and optimize online metrics quickly and inexpensively. In Brief If you've got a system that makes choices about what to do with items, and you get feedback about how good your choice turned out, you can choose to randomize your choices somewhat in a way that lets you retrospectively analyze how any  decision algorithm would have performed. The important things are to make sure you have a nonzero chance of making every choice on every item, and to make sure you know the ac...
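The logging-plus-reweighting idea can be sketched with inverse propensity scoring, a standard estimator for this setup. The function names and numbers below are illustrative, not the paper's code: log every randomized decision with its known, nonzero probability, then re-weight matched decisions by the inverse of that probability.

```python
import random

def log_randomized_decisions(items, policy_probs, reward_fn, rng):
    """Choose actions with known, nonzero probabilities and log
    (item, action, propensity, reward) for later offline analysis."""
    log = []
    for item in items:
        probs = policy_probs(item)  # dict: action -> probability, all > 0
        action = rng.choices(list(probs), weights=list(probs.values()))[0]
        log.append((item, action, probs[action], reward_fn(item, action)))
    return log

def ips_estimate(log, target_policy):
    """Estimate the average reward target_policy would have earned,
    re-weighting each matched logged decision by 1 / propensity."""
    total = 0.0
    for item, action, propensity, reward in log:
        if target_policy(item) == action:
            total += reward / propensity
    return total / len(log)

# Hand-built log: uniform (0.5) propensities over actions "a" and "b".
log = [("x", "a", 0.5, 1.0), ("y", "b", 0.5, 0.0)]
print(ips_estimate(log, lambda item: "a"))  # -> 1.0
```

The re-weighting is what makes the estimate unbiased for any target policy, which is why the two requirements in the post matter: every choice needs a nonzero chance, and the actual probability used must be recorded.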

Everything's A Skill By Default

Your default assumption when considering [Doing Something] should be that it's a skill which can be learned. Maybe you've spent a lot of time doing it, and you're pretty chill with the whole idea, and you've picked up some nice habits to make it easier. Maybe you've never done it before but it seems easy enough; after all, everyone manages it, right? Or maybe you've never done it before and wow, this is not going to go well, is it. I see three possibly-surprising failure modes that occur specifically because we humans see a thing to do and do not think "that's a skill!". They each happen a lot. Often invisibly, except for the consequences. Accidental Amateur Arrogance. This is where you pretty much never go to the grocery store to supply the household with food, but sit at home and snark about how you would be able to shop so much more quickly and efficiently because you're not the type to get distracted. You'd just be in and out of ...

Smooth Transitions

I'm breaking a drinking habit. It's a pretty innocuous one, for something labeled "drinking habit". Not alcohol, not even lattes. Currently I drink roughly 4-5 "Zevias" per day: a soda sweetened with stevia rather than sugar. I rarely drink anything else; occasionally decaf coffee or sparkling water with cranberry juice, but that's about it. We usually purchase them at a discount, so this runs us 4.5 x $0.70 x 365 ≈ $1,150/yr. I've been playing a lot of Storybook Brawl in the last couple of weeks. The basic concept is that you purchase minions for a small army (max size 7) that fights other people's small armies. Every turn, you get a bit more gold to purchase minions (starting with 2, up to 12), and every three turns, some of the minions available for purchase are substantially stronger. The shops are very random, though; you could have a shop with only early-game minions in it, even at the end of the game! So also always available is the option to ...

External Views on Internal Truth

The way to maintain a truth-seeking culture on your data science team, while respecting the constraints of sales and marketing and presenting to the board and simplicity and so on, is to separate messy, inexplicable truth-oriented analysis from useful views on top of the truth. This is an example of a more general heuristic: any time you're optimizing for "two objectives at once", take a step back and try to figure out how you can optimize for just one objective. A common solution is to split The Thing into multiple parts, each of which can optimize for just one objective, and then combine the results. That's what we'll do here. (Another common solution is to combine your objectives via some sort of parameter, expose the parameter as a "business" lever, refactor until the parameter you're exposing is a good business lever, then optimize for the explicit combination of objectives.) Suppose for example you're a ~Kiva making micro-loans to under-served...

Reflexive Consistency About The Right Thing

Today at work I proposed a sampling scheme for our ML training. Our data is highly imbalanced, more so because we don't know ground truth on most of the examples which are (roughly) at least 2% likely to be the minority class. Our architecture does not allow us to use all the data in training, so we downsample the target=0 examples and, to a lesser extent, the no-ground-truth examples, giving them weights inversely proportional to their sample rate. The proposal: What if we sample our data in proportion to its probability of target=1? And then calibrate to get probabilities from the new model, then resample and train, calibrate, repeat? That would definitely converge. The rebuttal: "I agree that [it] should get convergence, but can you explain why you think it would converge to the correct result?" No, no I cannot, and I'm pretty sure it would not. I didn't suggest re-labeling, just adjusting the sampling scheme. Why would I expect that sending the training algo...
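For concreteness, the existing weights-inversely-proportional-to-sample-rate bookkeeping (the scheme in place before the proposal) might look like this sketch; the sample rate and data shapes are illustrative:

```python
import random

# Sketch of downsampling target=0 examples with inverse weights.
# NEG_SAMPLE_RATE is illustrative, not our real rate.
NEG_SAMPLE_RATE = 0.1  # keep 10% of target=0 examples

def downsample(examples, rng):
    """examples: iterable of (features, target). Keeps every positive,
    a NEG_SAMPLE_RATE fraction of negatives, and attaches a weight of
    1 / sample_rate so weighted statistics still match the population."""
    sampled = []
    for features, target in examples:
        if target == 1:
            sampled.append((features, target, 1.0))
        elif rng.random() < NEG_SAMPLE_RATE:
            sampled.append((features, target, 1.0 / NEG_SAMPLE_RATE))
    return sampled

examples = [(i, 1) for i in range(5)] + [(i, 0) for i in range(100)]
sampled = downsample(examples, random.Random(0))
```

The weights keep weighted counts unbiased: each kept negative stands in for 1/NEG_SAMPLE_RATE of the originals. The proposal would replace the fixed rate with a per-example rate from the model's own probabilities, which is exactly where the "converges, but to what?" objection bites.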

Dishwasher Tasks

I do not enjoy unloading the dishwasher. Every time I do, my brain goes into overdrive trying to figure out how to automate this task. But there isn't really any way. You just have to take each dish out individually and put it away. Okay, you can combine some, sometimes, but for the most part it's a matter of just doing it. Repetitive and not very improvable. This is a "dishwasher task". At my work, an example of this is when we need to clean up data from a new merchant. They've tagged each case with one of 41 different reason codes. Internally we use (say) 8 different labels for our ML algorithm. There's no shortcut here; we just have to look at each of the 41 different codes individually, one-by-one, and decide how to translate them into our own schema. In mathematics, this reminds me strongly of the distaste many people have for case-based proofs. Where all the clever tricks are applied and still we end up with 17 different fundamental cases, each o...

Omelas

If you do not try new things you cannot notice when your existing things are wrong or not-quite-right (aka wrong). It's absolutely okay AND CORRECT to try things that sound ridiculous sometimes. The best long-term results come from a good mix of exploitation AND exploration. Don't get mad at yourself for trying new things, even when they seem silly and in fact fail! Exploration is good! Even when you're pretty sure _that_ instance of exploration was a poor plan, hey, you've got to explore your own sense of when exploration is a poor plan, too. Yes, in the short term, every time your exploration fails to find something new and good, it would have been better to avoid exploration. But you can't know the result of the dice roll before you actually make the roll and look. If you always castigate yourself for the exploration side of your optimization algorithms, you'll use exploration FAR LESS than you should. Now of course that doesn't mean you should just alway...