This gives you a distribution unrelated to active use, puts users in the same bucket (with the same number you’re going to have the same users in the first 10%) and links combinations together.
Often problems are more complex than they seem at first sight and I have found it’s a good approach to think “what am I missing” rather than “lots of people must be making very obviously bad decisions” and reach the latter conclusion only after more work. Usually I’ve missed something.
> This gives you a distribution unrelated to active use
Is that a problem?
> puts users in the same bucket (with the same number you’re going to have the same users in the first 10%) and links combinations together.
Are you trying to do a bunch of separate staged rollouts at the same time?
> Often problems are more complex than they seem at first sight and I have found it’s a good approach to think “what am I missing” rather than “lots of people must be making very obviously bad decisions” and reach the latter conclusion only after more work. Usually I’ve missed something.
They said it gets you 80% of the way there, and that seems fitting with the replies they got.
Yes! Are you totally missing some countries, did you set it to try and hit 20% but it’s actually 60% of traffic, does it happen to include some VIP type users, how does it impact the thing you’re trying to measure while you do this?
Do you give qa user ids in the right ranges to get applied and non applied views?
Even then you’re not able to get specific combinations.
> Are you trying to do a bunch of separate staged rollouts at the same time?
That would be very common yes, otherwise a staged rollout can be a big blocker.
But more than that it means there’s one user who always gets the beta testing of everything for example.
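A rough sketch of the bucketing concern in this thread, assuming numeric user ids and a plain modulo bucket as the naive scheme. The feature names and the `salted_bucket` helper are made up for illustration; hashing the feature name together with the id is one common way to stop the same users landing in every rollout, not necessarily what any particular flag service does.

```python
import hashlib


def naive_bucket(user_id: int) -> int:
    """Bucket 0-99 taken straight from the numeric user id."""
    return user_id % 100


def salted_bucket(user_id: int, feature: str) -> int:
    """Bucket 0-99 from a hash of the feature name plus the user id."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def rollout_users(user_ids, percent, bucket_fn):
    """Users whose bucket falls below the rollout percentage."""
    return {u for u in user_ids if bucket_fn(u) < percent}


user_ids = range(10_000)  # hypothetical user population

# Naive: the bucket ignores the feature, so every 10% rollout hits the same users.
naive_search = rollout_users(user_ids, 10, naive_bucket)
naive_checkout = rollout_users(user_ids, 10, naive_bucket)

# Salted: each feature gets its own, roughly independent 10% slice.
salted_search = rollout_users(user_ids, 10, lambda u: salted_bucket(u, "new-search"))
salted_checkout = rollout_users(user_ids, 10, lambda u: salted_bucket(u, "new-checkout"))

print("naive overlap:", len(naive_search & naive_checkout), "of", len(naive_search))
print("salted overlap:", len(salted_search & salted_checkout), "of", len(salted_search))
```

Neither version says anything about how active the selected users are, so a nominal 10% of ids can still be a very different share of traffic.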
> No developer or SaaS should be needed to make them understand tasks are checked out when you start work and contain instructions and when you are done you move the ticket to DONE.
However jira, trello, linear and basecamp are all SaaS. Then when you create a ticket, what starts the agent? Linear has integrations however that either needs codex/github running things (another SaaS) or you need to do your own agent setup.
This is a replacement for jira/etc for a project, rather than an addon.
A few scattered thoughts: a board with decoration or art of a similar size could be a nice cover. The other option (more building required) would be to see if there’s a way you can fold the monitor down or away when not in use.
I wouldn’t discount the value of moving small tasks away from developers, nor the value of fast cheap prototypes.
Product owners can very quickly get, for many problems, an interactive demo without coding. For lots of problems this can be somewhere from a static html page which shows the interactions to a hacked in feature that lets them actually test if it solves the customer need and try several variations before handing over much more concrete specs of what they want to happen. So much time is lost between getting an idea from someone’s head to code to use to then find out it wasn’t communicated well and then finally that the idea didn’t help anyway and we want it in a different way.
Yes yes I know someone is about to say that now there’s pressure to push the prototype out but that’s an organisational level problem that existed anyway.
And small problems can be much faster to solve as well, or even move away from devs. Often people just need some text changed somewhere or html putting together, or some basic code for analysis. They could understand the logic, but the task of writing it from scratch and how to run things may be too much - now you don’t need to prioritise work for a dev to get some sql written and they can spend their time on the larger more software engineering level problems.
"that’s an organisational level problem that existed anyway"
That's very true to many organizations. One cannot just slap an AI tool on it when you are dealing with fundamental organizational problems in the first place.
"they can spend their time on the larger more software engineering level problems"
For sure, devs still need to focus on the right type of work and maintain the balance. I built a tool to do just that: https://worktypefocus.com/
They can ask, they can do a back and forth and they can write documentation to be used from that point onwards and write it in a common style and structure.
These are language models; being able to talk through something with them and have them extract some information is what they excel at. Given that you’d probably get a halfway decent result with a literal fixed set of questions (an Eliza level docbot), gpt 5.5 is going to nail that as a task.
Can’t say you’re wrong but the last anecdote describes many I’ve had to review for jobs long before LLMs. Fizzbuzz is a classic thing that shockingly many devs genuinely cannot do, even at home.
Yeah, I've interviewed people like this 15 years ago. Degrees and experience mean nothing in this field. The best predictor I found was personal passion projects. Let them get as nerdy as possible, then you will see pretty quickly where their skills are at and what their limits are. And you will immediately filter out people who just studied CS because they heard you can make good money.
Completely agree with this. Leetcode has become such a memorization business for interviews that it’s useless for telling whether someone just memorized a solution or not.
you can absolutely know. they do suspiciously well. you just give harder problems until they can't solve it. how they react/approach a problem that they can't immediately solve _is_ the interview - not the "how many things they solved correctly" part.
That said - I seldom need people to be hardcore algorithm solvers
What I typically did was a variation of fizzbuzz (can the candidate code very basic logic?) and then finding a bug or minor requirements extension in their online screening test/"homework" and asking them to solve that on the spot (did they write the code themselves/can they modify it). It's typically enough; there are diminishing returns to testing programming skills in more depth. For the rest you can discuss domain knowledge, general experience, working style etc.
Maybe. There are certainly people in all fields who are book smart and did well in classes but are useless at actually practicing their field (not to mention people who cheated in school and got away with it and aren't even that), and it is worth filtering them out. But I think it is weird that CS expects good workers to have these passion projects. Do we expect civil engineers to build bridges in their back yard on the weekends? Can't someone just be good at their job and have other interests outside it?
I imagine this is simply not such a problem in other fields. Or do civil engineering schools produce that many clueless graduates? I know other engineering fields don't pay bad, but software is another realm.
I agree, however there are so many interviewers who will still treat that as some softball criteria and insist that unless you "prepare" for an interview by memorizing leetcode you are 100% a faker and liar.
Maybe they themselves are fakers and liars / deeply insecure. I got bumped out of an interview rather rudely once because I blanked and couldn’t answer a trivia question about arrays.
Something that is for sure new is the AI interview cheating tools which listen in on the call and provide answers in an overlay invisible to screen sharing. The only way to deal with it would be either invasive spyware on the applicant's computer or asking them to do the interview face to face.
A relatively low tech solution could be to give them 2 separate conferencing links, ask them to join each one from a different device, and have the secondary device point its camera at the screen of the primary device.
> If they can ship code that matches a spec, why does it matter if they’re using ai or not?
I am perfectly capable of writing specs, and feeding them to 3 separate copies of Claude Code all by myself. Then I task switch between the tmux windows based on voice messages from the pack of Claudes. This workflow is fine for some things, and deeply awful for others.
Basically, if a developer is just going to take my spec and hand it to Claude Code, then they're providing zero value. I could do that myself, and frequently do.
The actual bottleneck is people who can notice, "The god object is crumbling under the weight of managing 6 separate concerns with insufficient abstraction." Or "Claude has created 5 duplicate frameworks for deploying the app on Docker. We need to simplify this down to 1 or we're in hell." I will happily fight to hire people who can do the latter work. But those people can all solve fizzbuzz in their sleep.
People who just "ship code that matches a spec" without understanding the technical details are providing close to zero value right now.
There is an interesting niche for people with deep knowledge of customer workflows who can prompt Claude Code. These people can't build finished products using Claude. But they can iterate rapidly on designs until they find a hit. Which we can then fix using people with deeper engineering knowledge and taste.
But if you're not bringing either deep customer knowledge or actual engineering knowledge, you're not adding much these days.
Tell Claude you want to set up notifications, using "hooks", including "Notification" and "Stop" and anything new they've added. Claude can figure out how to do this for your operating system.
It's not perfect—sometimes a Claude notifies 3 minutes after it stopped doing anything. But it's helpful when I'm running multiple Claudes and also reviewing code elsewhere.
Your brain may feel like someone put it in a blender. Be warned.
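For a sense of what such a hook might run, here is a minimal sketch of a handler script. It assumes the hook command is handed a JSON object on standard input with a `hook_event_name` field and, for notifications, a `message` field; both names are assumptions to check against the current Claude Code hooks documentation, and `describe_event` and `handle` are made-up helper names.

```python
import json
from typing import TextIO


def describe_event(event: dict) -> str:
    """Turn a hook event payload into a one-line notification text."""
    # Field names below are assumptions about the payload, not a confirmed schema.
    name = event.get("hook_event_name", "unknown event")
    if name == "Stop":
        return "Claude has finished responding"
    message = event.get("message")
    return f"{name}: {message}" if message else name


def handle(stream: TextIO) -> str:
    """Read one JSON event from the given stream and describe it."""
    return describe_event(json.load(stream))
```

A real hook would call `handle(sys.stdin)` and pass the result to whatever notifier the operating system provides, which is the part the comment suggests leaving to Claude.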
Fizzbuzz is such an incredibly simple problem that if you can’t do it I struggle to see how you’d be able to complete any task that requires very basic reasoning and very basic coding knowledge. And if an AI system can do those parts, what am I getting for spending tens of thousands of pounds per year by hiring a person who can’t? Wouldn’t I just tag codex on the tickets?
I’m not talking about gotcha level stuff here where the first time it didn’t compile because of a bracket or anything, or even first time wrong. They couldn’t do Fizzbuzz in a language of their choice, at all.
Those that could were always annoyed at having to do such things because how could someone coming for a contract position not be able to do this? Without seeing what a filter it really was.
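For anyone who hasn't seen it, a minimal sketch of the exercise, assuming the usual statement (the thread doesn't spell one out): count up from 1, say "Fizz" for multiples of 3, "Buzz" for multiples of 5, and "FizzBuzz" for multiples of both.

```python
def fizzbuzz(n: int) -> str:
    """Return the FizzBuzz word for n, or n itself as text."""
    if n % 15 == 0:  # divisible by both 3 and 5, so check this first
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


for i in range(1, 16):
    print(fizzbuzz(i))
```

That is the whole thing: a loop, a modulo and an ordering decision about which check goes first.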
I feel the same way about inverting a binary tree, but a lot of people act like it's an arduous request. I am guessing it's because they've never read the description of what inverting a binary tree is, but maybe people are just that bad at recursion.
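For reference, "inverting" here is usually taken to mean mirroring the tree, i.e. swapping the left and right child of every node. A minimal recursive sketch, with a made-up `Node` type:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def invert(node: Optional[Node]) -> Optional[Node]:
    """Mirror the tree in place by swapping children at every node."""
    if node is None:
        return None
    # Invert both subtrees, then swap them.
    node.left, node.right = invert(node.right), invert(node.left)
    return node
```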
Right. For the first many decades of computing, recursion was just always the wrong answer for a production software system. (Feel free to provide a counter-example, but please begin with an explanation of how the size of a call stack frame is determined and how exceeding the base allocation is handled on this platform).
So what tree-traversal/quicksort problems tend to measure is how long it's been since you last did CS class homework problems.
Great. Please explain how the size of a call stack frame is determined and how exceeding the base allocation is handled on the particular platform you're proposing to recurse upon.
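One way to sidestep the call-stack question is to keep the pending nodes on an explicit stack on the heap. A sketch in Python, where the limit is at least easy to name: CPython caps recursion depth (see `sys.getrecursionlimit()`) and raises `RecursionError` past it. The `Node` type is made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def invert_iterative(root: Optional[Node]) -> Optional[Node]:
    """Mirror the tree in place without recursion."""
    stack = [root] if root is not None else []
    while stack:
        node = stack.pop()
        node.left, node.right = node.right, node.left
        # Children still to be processed live in a list, not in call frames.
        if node.left is not None:
            stack.append(node.left)
        if node.right is not None:
            stack.append(node.right)
    return root
```

This handles a degenerate tree tens of thousands of nodes deep, where the recursive version would hit the interpreter's limit.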
> If they can ship code that matches a spec, why does it matter if they’re using ai or not?
The inability to write fizzbuzz strongly implies their inability to understand what they've shipped. Review is some significant portion of the job. Understanding of the product is also part of the job.
Specs are also, in a sense, scaled-down, fuzzy, natural language descriptions of a feature. The fuzziness is a source of bugs, or at least of a mismatch between the actual desired feature and what was written down at spec writing time. As such, just matching a spec is the bare minimum that a good dev should be doing. They should be understanding what the spec is _not_ saying, understanding holes in their implementation, how their implementation enables or hinders the next feature and the next, next feature, etc. I don't think any of that is possible without understanding what was actually implemented.
For the same reason it's important your mechanic can identify which parts of a car are the wheels.
Who cares as long as the car is fixed, right? As long as the mechanic can Chinese-room his way to a working car, why does it matter how much of it he actually understands?
And why hire the mechanic instead of hiring the Chinese room?
First: FizzBuzz is a test to know if you understand the most basic constructs of programming. The kind of thing you learn in the first week of CS101. I forgot what it was, and when I looked at the problem I knew the answer.
More broadly: In the short/medium term, we still need humans who have the skills to understand software largely on their own. We will always need those who understand software engineering and architecture. Perhaps in 25 years LLMs will be so good that learning Python by hand will be like learning assembly today. But not yet.
The field is not ready for new practitioners to be know-nothing Prompt engineers. If we do that, we cut the legs out from under the education pipeline for programming.
It’s about deeply understanding what you’re doing. Like as a kid before you knew how to ride a bike, you could sit on a bike and pedal, but until it “clicked” you couldn’t balance and keep going forward stably. Fizzbuzz tests your ability to reason through a problem that seems simple on its face, but is easy to get wrong and/or overthink.
I can see this perspective, but FizzBuzz is such a low bar, one that so many can pass, that I'd greatly prefer that anyone I hire to ship code that matches a spec can also do this challenge.
It doesn't. It's just a low-end skill filter that got really popular. It could have easily been replaced by other tests like is this word a palindrome.
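The palindrome check mentioned here is about the same size of exercise. A minimal sketch, assuming a plain case-sensitive comparison of characters (`is_palindrome` is just an illustrative name):

```python
def is_palindrome(word: str) -> bool:
    """True if the word reads the same forwards and backwards."""
    left, right = 0, len(word) - 1
    while left < right:
        if word[left] != word[right]:
            return False
        left += 1
        right -= 1
    return True


print(is_palindrome("level"), is_palindrome("lever"))
```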
I wrote the "function to reverse a string" in a job interview once. Then the interviewer reminded me that strrev() had been part of the standard C library since K&R.
I'd been programming in C(++) for ~15 years by then and had never had the occasion to reverse a string. I still wonder whether that makes it a good job interview question, or a terrible one. Some of both probably.
Firing people is problematic. I'd be okay with it if the economy wasn't utter trash. It's way better to do the work upfront and prefer false negatives over false positives.
Even better would be if we had a well-respected credential, so employees and employers can both avoid these long interview loops. I'd much rather get hazed once in a big way than tons of little hazings over a lifetime.
What's the value of a recording inside my house to the police? That requires paying a human to go around recording it?