<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Mykyta Pavlenko]]></title><description><![CDATA[PM with 5+ years in telecom and travel-tech, now building my own AI products. Writing about product management, building from zero, and AI that actually works.]]></description><link>https://letters.mktpavlenko.com</link><image><url>https://substackcdn.com/image/fetch/$s_!eO0F!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffebb19bf-8bc5-45d7-8642-0c7cbb7d4f03_1024x1024.png</url><title>Mykyta Pavlenko</title><link>https://letters.mktpavlenko.com</link></image><generator>Substack</generator><lastBuildDate>Thu, 14 May 2026 09:04:36 GMT</lastBuildDate><atom:link href="https://letters.mktpavlenko.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Mykyta]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[mktpavlenko@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[mktpavlenko@substack.com]]></itunes:email><itunes:name><![CDATA[Mykyta]]></itunes:name></itunes:owner><itunes:author><![CDATA[Mykyta]]></itunes:author><googleplay:owner><![CDATA[mktpavlenko@substack.com]]></googleplay:owner><googleplay:email><![CDATA[mktpavlenko@substack.com]]></googleplay:email><googleplay:author><![CDATA[Mykyta]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Opus 4.7 Is the Best Model I’ve Used. It’s Also the One I Trust Least.]]></title><description><![CDATA[Opus 4.7 beats every benchmark. But Anthropic quietly removed the one thing serious builders needed - predictable control. Here&#8217;s what Default Drift means.]]></description><link>https://letters.mktpavlenko.com/p/opus-47-is-the-best-model-ive-used</link><guid isPermaLink="false">https://letters.mktpavlenko.com/p/opus-47-is-the-best-model-ive-used</guid><dc:creator><![CDATA[Mykyta]]></dc:creator><pubDate>Thu, 16 Apr 2026 16:15:10 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b39e7b34-b8a0-4660-b3af-dd6dc53a324e_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>1. Introduction - I got a smarter model and a bigger bill on the same day</h2><p>Okay so. Today Anthropic shipped Opus 4.7. I ran the same prompt I ran a week ago on 4.6. The answer was better. Cleaner plan, fewer tool errors, one fewer retry.</p><p>Then I opened the billing dashboard.</p><p>My API spend for the last 30 days is up 27% on the same workload. Same users. Same product. Same prompts, give or take. And none of that is because I chose to spend more.</p><p>Here&#8217;s the part that took me a minute to process: the model never asked me how much thinking to do. It decided. And it decided to think more. Sometimes a lot more. I did not notice until the bill showed up.</p><p>This is not a story about Opus 4.7 being bad. On the benchmarks it is a clear step up. +13% on Anthropic&#8217;s 93-task coding set. 70% on CursorBench versus 58% for 4.6. 87.6% on SWE-bench Verified, ahead of both GPT-5.4 and Gemini 3.1 Pro on the metrics that map to actual engineering work. 
The model is real, the gains are real, and for a lot of people it will just feel like a free upgrade.</p><p>This is a story about something else. Something you only see if you run a product on top of Claude.</p><p>Between February 9 and March 3, Anthropic made two changes to how Claude thinks. On February 9, they moved Opus 4.6 to adaptive thinking by default. On March 3, they dropped the default effort level from high to medium. No release notes in your inbox. No migration guide you had to sign off on. The API still works. Your code still compiles. Your users still get responses. The behavior of your product changed anyway.</p><p>Opus 4.7 is the moment this stops being a temporary default and becomes the architecture. The old <code>budget_tokens</code> parameter - the one where you told the model &#8220;you have 10,000 tokens to think, no more&#8221; - is deprecated. Adaptive thinking is now the only supported thinking mode on 4.7. You cannot go back. You can ask for <code>effort: high</code>. You cannot ask for predictability.</p><p>So I want to write about the thing most people are going to miss this week while they trade benchmark screenshots on X. We got a smarter model. And we lost the thermostat.</p><p>I am calling the pattern <strong>Default Drift</strong>. And if you build anything on top of an LLM, I think it is the most important AI story of 2026.</p><div><hr></div><h2>2. Perspective - The thermostat is gone, and nobody voted</h2><p>Here is the way most people will cover the Opus 4.7 release.</p><p>They will put up a table. They will highlight +13% on coding, 94.2% on GPQA Diamond, a new tokenizer, a bigger vision input, <code>claude-opus-4-7</code> in the model string. They will conclude that Opus is still the best agentic coding model on the market and move on. All of that is accurate. None of it is the story.</p><p>The actual story is architectural. Anthropic shipped the first flagship model in Claude&#8217;s history that removes a control knob developers had since extended thinking launched. It is not a feature deprecation. It is a philosophy change.</p><p>Old world - call it &#8220;extended thinking&#8221; - worked like a thermostat. You told Claude: &#8220;think up to 10,000 tokens on this.&#8221; Claude did that. Your latency was bounded. Your cost was bounded. You could tune per endpoint - expensive reasoning for the &#8220;plan&#8221; step, cheap quick answers for the &#8220;clarify&#8221; step. If your billing alert triggered, you dropped the budget. If users complained about speed, you dropped the budget. You were the operator.</p><p>New world - adaptive thinking - works like a smart home. Claude looks at the request, decides it is complex, and spends what it thinks the request deserves. You get an <code>effort</code> knob with three positions - low, medium, high - but those positions are not promises. Claude still decides how much to think inside each position. At the default (high), Claude almost always thinks. At medium, sometimes. At low, rarely. None of those are numbers. They are moods.</p><p>Interleaved thinking is on by default, which means Claude also thinks between tool calls. For a long-running agent that is great. For cost predictability it is a second multiplier you cannot turn off.</p><p>Here is how you feel this in practice.</p><p>In late February, an AMD engineer named Stella Laurenzo ran telemetry across 6,852 Claude Code sessions, 234,760 tool calls, and 17,871 thinking blocks from January through March. 
Her GitHub issue claimed a 67% drop in sustained reasoning compared to before the February changes. Boris Cherny, the Claude Code lead, had to respond publicly on Hacker News within a few hours. Anthropic&#8217;s counter was not &#8220;nothing changed.&#8221; Anthropic&#8217;s counter was &#8220;we changed defaults, we did not downgrade the model.&#8221; Which is technically true and beside the point.</p><p>The point is that for 52 days, every person building on Claude Code was running a different product than they thought they were running. Not because the model got worse. Because the controls Anthropic shipped underneath the model quietly moved.</p><p>Think about that for your product.</p><p>Your product&#8217;s behavior is downstream of Claude&#8217;s behavior. When Claude suddenly spends 2x the thinking tokens on the same user action, three things happen on your side. One, margin compresses. Two, p95 latency shifts and users notice. Three, you have to explain to a support request why &#8220;the AI is acting differently today&#8221; when you did not ship anything.</p><p>The same thing just happened to every serious developer on Claude Code. And 4.7 is not a reversal. It is the consolidation of the new model.</p><p>Here is the aha. The benchmarks are measuring capability. They are not measuring controllability. And in 2026, for anyone building a real product on AI, controllability is where the game is actually played.</p><p>You can write a great model review that says &#8220;4.7 is better than 4.6.&#8221; Every benchmark I looked at supports it. You can also write a true PM review that says &#8220;my product got harder to run this month,&#8221; and have that be true at the same time. Those two things are not in conflict. They are happening on two different axes, and most of the industry only looks at one.</p><p>The common frame is &#8220;new model, better.&#8221; My frame is &#8220;new model, new defaults, new operating contract - read it carefully.&#8221; I am calling this Default Drift because that names the actual threat. Defaults shift under you. You did not sign anything. Your product is running on someone else&#8217;s moving floor.</p><p>Anthropic is not the villain here. Anthropic is optimizing for the median developer - the one who never tuned <code>budget_tokens</code>, who wants the model to &#8220;just work,&#8221; who is building a chat app or a support bot where controllability matters less than raw capability. For that audience, adaptive thinking is a real win.</p><p>The trade-off is this: the better the defaults get for the median user, the more invisible the drift becomes for the operator user. And if you are reading this, you are almost certainly the operator.</p><div><hr></div><h2>3. Gamify - Five things I am doing this week because of Opus 4.7</h2><p>I am not going to tell you to panic. I am not going to tell you to switch to GPT-5.4. I use Claude every day in my work. I am probably going to keep using it. The model is genuinely strong and the Routines feature Anthropic shipped alongside 4.7 is a real upgrade for agentic workflows.</p><p>But I am changing five things this week, and if you build on top of any LLM, you probably should too.</p><h3>Step 1: Log your baselines before the next default moves</h3><p>The biggest mistake I made in February was not having a &#8220;before&#8221; snapshot. When people started complaining about 4.6, I had no way to prove or disprove what they were saying. 
I was arguing from vibes.</p><p>Tonight I am going to do what I should have done in December. I am picking 20 real prompts from my own production traffic - a mix of easy ones, hard ones, agentic flows, and single-shot questions. I am going to run each of them once on Opus 4.7, and I am going to log three numbers per run: total output tokens, thinking tokens, end-to-end latency.</p><p>That is my new baseline. The next time Anthropic tweaks a default, I will know within a day. You cannot defend a position you did not measure.</p><p>Bonus: those 20 prompts become my regression suite. Every release, I rerun them. Every drift, I see it. This is the single highest-leverage change you can make.</p><h3>Step 2: Pin your effort, do not inherit it</h3><p>The default effort on Opus 4.7 is high. That sounds great until you remember the February story, when the default quietly dropped to medium and a lot of builders did not notice for weeks.</p><p>Every production call I make is now going to set <code>effort</code> explicitly. If I need deep reasoning, I write it. If I want fast and shallow, I write it. Nothing is left to inheritance. The two lines of code it adds are cheaper than a surprise outage.</p><p>The meta-point: treat defaults like you treat magic constants in code. Extract them, name them, own them. The moment a default is implicit, it belongs to whoever owns the upstream, and that is not you.</p><h3>Step 3: Build a 10-prompt eval suite and run it weekly</h3><p>Your product depends on the model behaving a certain way. That dependency is not tested. Fix that.</p><p>Pick 10 prompts that represent what your real users do. For each, define what &#8220;right&#8221; looks like - not a strict string match, but a structured rubric. Did it use the right tool? Did it include the key data point? Did it finish without looping? Did thinking tokens stay in the expected band?</p><p>Run this once a week. Manually, if you have to. Script it if you can. The goal is not perfection. The goal is an early-warning system. If next month your rubric score drops 15 points, you know the conversation to have with your users before they have it with you.</p><p>This is the analog to application monitoring from the pre-AI era. You would not run a web app without APM. You should not run an AI product without an eval harness. The drift is already happening. The only question is whether you see it.</p><h3>Step 4: Write a fallback path into your architecture this quarter</h3><p>I do not think GPT-5.4 is a better everyday model than Opus 4.7. But I do think a product that can only run on one upstream is fragile.</p><p>This quarter I am refactoring my AI layer so that every call goes through an internal abstraction that can route to Claude, GPT, or Gemini based on a config flag. That does not mean I am going to swap providers. That means when Anthropic makes its next quiet default change, I can move the 5% of my workload that is most sensitive to the change in one afternoon instead of one sprint.</p><p>This is not multi-cloud maximalism. It is insurance. The premium is a couple of weeks of engineering. The payout is every future default drift and every future outage. For a real product, that math is easy.</p><p>If you are pre-revenue or solo, make the abstraction small. One function. Two providers. 
The point is the optionality, not the elegance.</p><h3>Step 5: Push back in public when something breaks</h3><p>This is the part most builders skip, and it is the most important.</p><p>The February changes got addressed because Stella Laurenzo wrote a GitHub issue with numbers, ran telemetry, forced a public response. Not because thousands of people tweeted vibes. Anthropic reacts to specific, reproducible, measured evidence. If you see behavior change and you have the data, file the issue. Post the chart. Reply to Boris Cherny with your numbers, not your opinion.</p><p>I say this without snark - Anthropic is one of the more responsive labs in the industry. But they cannot read your mind and they cannot prioritize what they cannot see. The eval suite from step 3 is not just for your benefit. It is the evidence base for every future conversation you will have with a model vendor. And every builder who files a real issue raises the floor for every other builder.</p><p>You are not a customer. You are a counterparty. Act like one.</p><div><hr></div><h2>Closing</h2><p>There is a version of this post that reads &#8220;Claude got nerfed, I am mad.&#8221; That is not the post I wanted to write because that post is not true and it is not useful. Opus 4.7 is the strongest model I have used. Anthropic shipped real, measurable improvements, and they are going to ship more.</p><p>The post I wanted to write is the one that nobody else seems to be writing this week.</p><p>When your product runs on top of a model that decides how much to think, you are no longer fully the product manager of your own product. You share the role with whichever team inside your vendor owns the default configuration. That team optimizes for their objective function, not yours. Right now those objectives happen to align most of the time. They will not always. And the places they diverge are exactly the places your users feel it first.</p><p>Default Drift is the name I am giving this because it describes what is actually happening. Defaults drift. Products built on those defaults drift with them. And most of the drift is invisible unless you measure it.</p><p>We are early. Every builder on AI right now is building on infrastructure that is being redesigned underneath us in real time. That is the nature of being early. The people who will win this decade are not the ones with the best prompts. They are the ones who notice when the floor moves, and who built the sensors to notice in time.</p><p>Go measure something this week. Pick the 20 prompts. Log the tokens. Pin the effort. Then do the same thing next week, and the week after. You will be shocked at how much information you have been leaving on the table.</p><p>Opus 4.7 is a great model. It is also a reminder that the contract between you and your upstream is shorter and softer than you thought. That is fine. But read it carefully.</p><p>And bring your own measurements.</p><div><hr></div><p><em>If you run a product on top of Claude, GPT, or Gemini - reply and tell me what your baseline looks like. 
I am collecting examples for a follow-up, and the cleaner the data, the more useful the next piece will be.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://letters.mktpavlenko.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Mykyta Pavlenko! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Who Owns Your AI Memory? Because It Probably Isn’t You.]]></title><description><![CDATA[I spent three years feeding ChatGPT context. Then I realized the memory it built wasn&#8217;t mine. So I took it back - with a vector DB on a Mac Mini.]]></description><link>https://letters.mktpavlenko.com/p/who-owns-your-ai-memory-because-it</link><guid isPermaLink="false">https://letters.mktpavlenko.com/p/who-owns-your-ai-memory-because-it</guid><dc:creator><![CDATA[Mykyta]]></dc:creator><pubDate>Tue, 14 Apr 2026 16:59:01 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b8a54a6a-cb2a-4e65-88a6-f8718591df58_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>1. Introduction - The version of me inside ChatGPT does not exist anymore</h2><p>Okay so. I was using ChatGPT the other day and it gave me advice that would have been great - if I were still the person I was in 2023.</p><p>Same thing happened the week before. And the month before. I have been using ChatGPT since late 2022, and somewhere around the two-year mark I started feeling it: the model was responding to a version of me that no longer exists. Old priorities. Old preferences. Projects I had already killed. Opinions I had reversed.</p><p>I tried to fix it inside ChatGPT. I could not. I could not inspect what it remembered. I could not reliably overwrite it. The memory layer was there, somewhere, but it was the vendor&#8217;s copy of me, not mine.</p><p>Try this for yourself. Ask ChatGPT: &#8220;what do you actually remember about me?&#8221; You will get a tidy summary. Your name, your job, that you like short emails. That is the shallow layer. The &#8220;nice to meet you&#8221; layer. The real memory lives underneath - the decisions, the reasoning, the way you have changed your mind over two years of conversations. And none of that belongs to you. It lives inside somebody else&#8217;s database, behind a chat interface that hands you the summary and hides everything else.</p><p>So I did something a year ago would have sounded crazy: I built my own memory system. Local vector database on a Mac Mini, plugged into Claude, ChatGPT, Claude Desktop, and every MCP-aware tool I use. One brain, many clients. I control what goes in. I control what stays.</p><p>This article is not really about the code. It is about the question underneath it: <strong>who owns the context of your life with AI?</strong> If the answer is &#8220;a company whose retention policy I have not read,&#8221; that is worth sitting with.</p><div><hr></div><h2>2. 
Perspective - The lock-in nobody talks about</h2><p>Here is the thing most people miss.</p><p>At the macro level, the top AI models have converged. Blindfold-test Claude, ChatGPT, and Gemini on a hundred real tasks and most people could not tell which is which. The differences are in the micro - tone, edge cases, specific reasoning quirks. Real differences, yes, but not the reason you keep coming back to one tool every day.</p><p>So what actually locks you in? Not the model. The model is interchangeable. What keeps you in place is <strong>the memory the tool has accumulated about you.</strong> That is the moat. And every major AI vendor is quietly building it higher while everyone argues about benchmark charts.</p><p>People usually only notice this when they try to switch. They open a new tool, realize it does not know them, feel the friction of re-briefing, and go back. The memory did not just help them - it trapped them. And they call it &#8220;preference&#8221; because it feels like their choice.</p><p>The part I want you to think about: <strong>the AI market is not stable.</strong> It is not browsers where Chrome won. It is not search where Google won. We are going to be switching tools. We are going to be running three, four, five AI products in parallel for years - one for coding, one for writing, one for research, some that do not exist yet. If your memory lives inside one of them, every switch costs you context. Every new tool starts from zero.</p><p>Back to ChatGPT. The reason it kept quoting 2023-me at me was not that it was dumb. It was that its memory was append-only from my side. I could add. I could not really curate. When old facts and new facts conflicted, it often went with the older one because that version had more repetitions backing it up. I was the product manager. I had no admin panel.</p><p>So I stopped thinking of memory as a feature of the tool. I started thinking of it as an asset of mine that I had stupidly let the tool hold for me.</p><p>I call the alternative <strong>BYOM - Bring Your Own Memory.</strong></p><p>The analogy is BYOD (bring your own device) and BYOK (bring your own key). In BYOM the vendor brings the model. You bring the memory. The two meet at an open protocol - in my case MCP (Model Context Protocol). The vendor does what vendors are good at: train huge models, run them cheap, ship them fast. You do what only you can do: own the truth about yourself.</p><p>Once memory lives behind a protocol instead of inside a product, everything changes. It becomes portable, inspectable, backup-able, forkable. You can delete entries. You can hand a copy to a new tool and it knows you before you have typed a word. The shift - from &#8220;feature inside a product&#8221; to &#8220;service I own&#8221; - is the whole point of this piece.</p><div><hr></div><h2>3. Gamify - How to build this in a weekend</h2><p>Now the practical part. I am going to show you how the whole thing fits together and give you the actual commands. If you are a developer comfortable with Node and a terminal, you can be running by Sunday evening. About 1800 lines of JavaScript total. Five dependencies. No cloud services.</p><h3>The architecture in one picture</h3><pre><code><code>   Claude Code / Claude Desktop / your custom agents
                      &#9474; (MCP over STDIO)
                      &#9660;
                   mcp.js          &#8592; thin STDIO bridge, spawned/killed freely
                      &#9474; (HTTP on 127.0.0.1:3110)
                      &#9660;
                  server.js        &#8592; persistent Fastify daemon, owns the DB
                      &#9474;
         &#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9532;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;
         &#9660;            &#9660;              &#9660;
       db.js       embed.js      judge.js
       SQLite +    Ollama        (optional)
       sqlite-vec  nomic-embed   auto-tag
       + FTS5                    + rerank
</code></code></pre><p>Two processes, not one. This is the single most important structural decision so I will explain why.</p><p>Claude Code and Claude Desktop spawn MCP servers as child processes over STDIO, and they kill those processes on every restart. If the SQLite database lived inside the STDIO process, every Claude restart would mean re-opening the DB, losing the WAL journal, risking race conditions with other clients. Bad.</p><p>So: a thin STDIO wrapper (<code>mcp.js</code>, ~40 lines) that Claude freely spawns and kills. A persistent HTTP daemon (<code>server.js</code>) that actually owns the database, stays up, and speaks plain HTTP on port 3110. The wrapper proxies calls to the daemon. Both share the same tool schemas from <code>mcp-tools.js</code>.</p><p>Bonus: because the daemon speaks HTTP, anything else can hit it too - custom agents, scripts, a web dashboard, future tools. One memory, many clients.</p><h3>The stack - five dependencies and nothing else</h3><table><thead><tr><th>Component</th><th>Technology</th><th>Why</th></tr></thead><tbody><tr><td>Language</td><td>JavaScript (CommonJS)</td><td>No TypeScript, no build step. <code>node server.js</code> and done.</td></tr><tr><td>HTTP</td><td>Fastify 5</td><td>Fast, minimal, handles JSON well.</td></tr><tr><td>Database</td><td><code>better-sqlite3</code> + <code>sqlite-vec</code> + FTS5</td><td>One file, zero ops, ACID, vectors and text in one transaction.</td></tr><tr><td>Embeddings</td><td><code>nomic-embed-text</code> via Ollama</td><td>768-dim, runs locally on M-series Mac, zero API keys.</td></tr><tr><td>MCP SDK</td><td><code>@modelcontextprotocol/sdk</code></td><td>Official SDK, both STDIO and HTTP transports.</td></tr><tr><td>Validation</td><td>Zod 4</td><td>Schema per MCP tool.</td></tr></tbody></table><p><code>package.json</code> dependencies: 5. That is the whole list.</p><h3>The schema - the fields that actually matter</h3><p>One SQLite file, three real tables plus two virtual. The core is <code>entries</code>:</p><pre><code><code>CREATE TABLE entries (
  id              TEXT PRIMARY KEY,
  parent_id       TEXT,                  -- links a chunk to its parent entry
  kind            TEXT DEFAULT 'document', -- document | chunk | fact | preference | event
  type            TEXT NOT NULL,         -- feedback | user | project | reference
  title           TEXT NOT NULL,
  content         TEXT NOT NULL,
  content_hash    TEXT NOT NULL,         -- SHA-256, for exact dedup
  tags            TEXT,                  -- JSON array
  source_tool     TEXT NOT NULL,         -- claude-code | claude-desktop | ...
  source_reason   TEXT NOT NULL,         -- WHY this was written
  source_session  TEXT,                  -- which session produced it
  confidence      REAL DEFAULT 1.0,      -- 1.0 = human, 0.6 = auto, 0.3 = inferred
  created_at      INTEGER NOT NULL,
  updated_at      INTEGER NOT NULL,
  verified_at     INTEGER,               -- last confirmed still true
  expires_at      INTEGER,               -- when to re-check
  supersedes      TEXT,                  -- ID of entry this replaces
  superseded_by   TEXT,                  -- ID of entry that replaced this
  archived_at     INTEGER,               -- soft delete
  access_count    INTEGER DEFAULT 0      -- popularity signal for ranking
);
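
-- The two virtual tables that sit on top (a minimal sketch based on the
-- description below; the exact options in the real schema may differ):
-- keyword index, porter + unicode61 so Cyrillic works
CREATE VIRTUAL TABLE entries_fts USING fts5(title, content, tokenize = 'porter unicode61');
-- vector index, one 768-dim embedding per chunk
CREATE VIRTUAL TABLE entries_vec USING vec0(embedding float[768]);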
</code></code></pre><blockquote><p>Simplified view. The full schema has ~22 columns plus a few indexes and an <code>extra</code> JSON field for arbitrary metadata. Above is what matters for understanding.</p></blockquote><p>Two virtual tables sit on top: <code>entries_fts</code> (FTS5 for keyword search, with <code>porter unicode61</code> tokenizer so it handles Cyrillic), and <code>entries_vec</code> (sqlite-vec, <code>FLOAT[768]</code> for embeddings).</p><p>Every field above earns its place. <code>parent_id</code> + <code>kind</code> are how chunking works: long content splits into chunks, each chunk is a row with <code>kind: 'chunk'</code> pointing at its parent, and embeddings live on the chunks so search can find the specific part instead of a diluted average of a long document. <code>confidence</code> lets the AI distinguish &#8220;user explicitly said this&#8221; from &#8220;I inferred this.&#8221; <code>verified_at</code> lets a maintenance agent mark old facts as still true. <code>supersedes</code> turns updates into an audit trail instead of destructive edits. <code>access_count</code> is a popularity signal the ranker uses.</p><p><strong>And the part that sold me on SQLite in the first place: backup is one command.</strong> <code>cp memory.db backup.db</code> and you have a complete snapshot of your entire AI memory - text, vectors, audit log, everything, in a single 3.4 MB file. Try doing that with Pinecone. Try exporting your memory out of ChatGPT. This is what &#8220;you own it&#8221; looks like in practice - three-second backup, zero vendor involvement, a file you can git-commit if you want version history. SQLite also runs in WAL mode, which is why Claude Code, Claude Desktop, and my background maintenance agent can all read the database concurrently without stepping on each other.</p><h3>Embeddings: Ollama locally, and the one detail nobody mentions</h3><p>Install Ollama, pull the model, and embeddings run on your laptop forever free:</p><pre><code><code>brew install ollama
ollama pull nomic-embed-text
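
# Optional sanity check (assumes Ollama's default port): one embedding request,
# the response is a JSON object with a 768-float "embedding" array
curl -s http://127.0.0.1:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "hello memory"}'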
</code></code></pre><p>The flow: text -&gt; POST <code>http://127.0.0.1:11434/api/embeddings</code> -&gt; <code>Float32Array(768)</code> -&gt; <strong>L2 normalize</strong> -&gt; write to SQLite.</p><p>The L2 normalization step is the detail that costs two hours of debugging if you skip it. sqlite-vec computes Euclidean distance, but if your vectors are unit-length, Euclidean distance equals cosine similarity (<code>cos_sim = 1 - L2&#178; / 2</code>). Without normalization your results feel vaguely wrong. With it they feel right.</p><pre><code><code>function l2Normalize(vec) {
  let sumSq = 0;
  for (let i = 0; i &lt; vec.length; i++) sumSq += vec[i] * vec[i];
  const norm = Math.sqrt(sumSq) || 1;
  const out = new Float32Array(vec.length);
  for (let i = 0; i &lt; vec.length; i++) out[i] = vec[i] / norm;
  return out;
}
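
// Hypothetical wrapper for the flow described above (helper name and error
// handling are illustrative, not the exact embed.js). Uses Node 18+ global fetch:
// text in, unit-length vector out, ready to write to sqlite-vec.
async function embedText(text) {
  const res = await fetch('http://127.0.0.1:11434/api/embeddings', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'nomic-embed-text', prompt: text }),
  });
  if (!res.ok) throw new Error('Ollama embed failed: ' + res.status);
  const { embedding } = await res.json();           // 768 floats
  return l2Normalize(Float32Array.from(embedding)); // normalize before storing
}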
</code></code></pre><p>Real numbers from my running system: 441 ms per embed, 3 ms for FTS5 search, 3 ms for vector search. The bottleneck is embedding, not search.</p><h3>Search: hybrid, filterable, fast</h3><p>Three modes: keyword only, semantic only, hybrid (default). Hybrid runs FTS5 and vector search in parallel, then fuses the results with <strong>Reciprocal Rank Fusion</strong>:</p><pre><code><code>const RRF_K = 60;
ftsHits.forEach((h, rank) =&gt; {
  scores.set(h.id, (scores.get(h.id) || 0) + 1 / (RRF_K + rank));
});
vecHits.forEach((h, rank) =&gt; {
  scores.set(h.id, (scores.get(h.id) || 0) + 1 / (RRF_K + rank));
});
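
// Final ranking - a sketch of what happens after fusion. The post-fusion boosts
// described below (confidence, access count, recency, expiry) adjust the fused
// score before this sort; `limit` is assumed to come from the search request.
const ranked = [...scores.entries()]
  .sort((a, b) =&gt; b[1] - a[1])   // highest fused score first
  .slice(0, limit)
  .map(([id, score]) =&gt; ({ id, score }));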
</code></code></pre><p>Entries that show up in both lists get a higher final score. Then I layer post-fusion boosts - confidence multiplier, access-count boost, recency bump, expiry decay. A handful of lines each.</p><p>But the part that makes this <strong>actually</strong> beat vendor memory is the filters. Every search call accepts <code>type</code>, <code>tags</code>, <code>source_tool</code>, <code>confidence</code>, date ranges. My top tags right now: <code>personal-brand</code> (54), <code>active</code> (52), <code>substack</code> (40), <code>profile</code> (34). Each one a slice I can narrow into with a single filter.</p><p>The concrete example that made this click for me: working on Temporal.day, I do not need the AI to recite my age and job title. I need it to surface <strong>which pricing models I tried, which I killed, and why</strong>. <code>type: project, tags: ["temporal-day", "decision"]</code> returns the list in milliseconds. Built-in memory cannot do that. The fidelity lives in the tags.</p><h3>Write discipline: the rule that keeps the memory clean</h3><p>Every write hits a validator before it touches the DB. Required fields (<code>type</code>, <code>title</code>, <code>content</code>, <code>source_tool</code>, <code>reason</code>). Min content length 20 chars. Real-character check to block junk. Then two-stage dedup:</p><ol><li><p><strong>Exact</strong>: SHA-256 hash of content. If match, return 409.</p></li><li><p><strong>Fuzzy</strong>: embed the new entry, run <code>vectorSearch(embedding, 1)</code>, and if cosine similarity &gt; 0.85, return a <code>similar_exists</code> warning with the existing ID.</p></li></ol><p>The AI client can either force the write with <code>confirm_duplicate: true</code> or, better, call <code>memory_update</code> on the existing record. This single rule cut my duplicate rate from &#8220;everywhere&#8221; to &#8220;basically never.&#8221;</p><p>Plus rate limiting (60 writes/minute per tool) and an append-only <code>write_log</code> that records every mutation with a 100-char snippet. You always know who wrote what and why.</p><h3>The nine MCP tools the AI actually uses</h3><p>The whole interface is nine tools, registered via <code>@modelcontextprotocol/sdk</code>:</p><table><thead><tr><th>Tool</th><th>What it does</th></tr></thead><tbody><tr><td><code>memory_describe</code></td><td>Self-description: current vocabulary, rules, config. Called first in every session.</td></tr><tr><td><code>memory_search</code></td><td>Hybrid search with filters (type, tags, source_tool, confidence, dates).</td></tr><tr><td><code>memory_get</code></td><td>Fetch one entry by ID.</td></tr><tr><td><code>memory_list</code></td><td>Recent entries, newest first.</td></tr><tr><td><code>memory_write</code></td><td>Create entry. Runs dedup, chunking, optional auto-tag.</td></tr><tr><td><code>memory_update</code></td><td>Update entry. Re-embeds if content changed.</td></tr><tr><td><code>memory_verify</code></td><td>Confirm entry is still accurate (stamps <code>verified_at = now</code>).</td></tr><tr><td><code>memory_delete</code></td><td>Soft-delete (sets <code>archived_at</code>).</td></tr><tr><td><code>memory_supersede</code></td><td>Replace old entry with new, linked via <code>supersedes</code> chain.</td></tr></tbody></table><h3>Self-describing: how the AI learns to use it without you micromanaging</h3><p><code>memory_describe</code> returns the live vocabulary - which <code>type</code> values already exist and how often, which tags are popular, what the validation thresholds are, which filters are available.
Right now mine returns: <code>feedback</code> (31), <code>reference</code> (29), <code>project</code> (26), <code>user</code> (25) as the top types.</p><p>Why it matters: when Claude sees <code>feedback</code> has 31 entries, it uses <code>feedback</code> instead of inventing <code>feedback-notes</code> or <code>user-feedback-misc</code>. The vocabulary converges organically. No hardcoded taxonomy. No drift across tools.</p><h3>The fifteen-line prompt that wires it into Claude</h3><p>The server is half. The other half is <code>~/.claude/CLAUDE.md</code>, which Claude Code reads automatically:</p><pre><code><code>## Memory

Primary memory is `personal-memory` MCP (HTTP :3110).
Always use memory_* MCP tools:
- Read: memory_describe (first call per session), then memory_search
- Write: memory_write with source_tool: "claude-code"
- Update stale facts: memory_update + memory_verify
- Supersede: memory_supersede when an old entry is replaced

### What to save
- Stable facts about user, projects, preferences
- Explicit feedback ("don't do X, prefer Y")
- Reference paths, endpoints, decisions and reasoning

### What NOT to save
- Anything in current code/git/files
- Ephemeral task state
- Conversation summaries
</code></code></pre><p>Without this file, the AI does not know the memory exists. With it, the behavior is automatic on every session. The same prompt works for Claude Desktop, custom agents, anything MCP-aware.</p><h3>The actual setup - copy-paste this</h3><pre><code><code># 1. Install Ollama and the embedding model
brew install ollama
ollama pull nomic-embed-text

# 2. Create the project
mkdir -p ~/.personal-memory/server &amp;&amp; cd ~/.personal-memory/server
npm init -y
npm install better-sqlite3 sqlite-vec fastify @modelcontextprotocol/sdk zod

# 3. Create .env
cat &gt; ../.env &lt;&lt; 'EOF'
OLLAMA_URL=http://127.0.0.1:11434
EMBED_MODEL=nomic-embed-text
PORT=3110
HOST=127.0.0.1
EOF

# 4. Write the code:
#    server.js, mcp.js, mcp-tools.js, db.js, embed.js, search.js, rules.js
#    (architecture above tells you what each does)

# 5. Start the daemon
node server.js

# 6. Sanity check
curl http://127.0.0.1:3110/health
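
# 7. (optional) back up the whole memory - one file, as described earlier;
#    adjust the path to wherever your server.js creates the database
cp memory.db backup.db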
</code></code></pre><p>Then register it in Claude Code (<code>~/.claude/settings.json</code>) and Claude Desktop (<code>~/Library/Application Support/Claude/claude_desktop_config.json</code>):</p><pre><code><code>{
  "mcpServers": {
    "personal-memory": {
      "command": "/path/to/node",
      "args": ["/Users/you/.personal-memory/server/mcp.js"]
    }
  }
}
</code></code></pre><p>Drop the <code>CLAUDE.md</code> snippet from above into <code>~/.claude/CLAUDE.md</code>. Restart Claude. You are running.</p><h3>What it looks like when it works</h3><p>A real <code>memory_search</code> response from my system, query = &#8220;writing style&#8221;, hybrid mode, limit = 2:</p><pre><code><code>{
  "results": [
    {
      "id": "feedback_mnxikwm3_588da6e4b3c1",
      "type": "feedback",
      "title": "Use short dashes, not em dashes",
      "tags": ["writing", "style", "substack"],
      "confidence": 1, "access_count": 84, "score": 0.0391
    },
    {
      "id": "feedback_mntv57h4_c0281c034edb",
      "type": "feedback",
      "title": "Keep Substack Notes short (100-150 words)",
      "tags": ["writing", "substack", "length"],
      "confidence": 1, "access_count": 62, "score": 0.0358
    }
  ],
  "timing": { "fts_ms": 3, "embed_ms": 441, "vec_ms": 3 }
}
</code></code></pre><p>Two entries. Both tagged <code>writing</code> and <code>substack</code>. Together they have been retrieved 146 times - which means 146 times across months of sessions my AI tools already knew how I want to write, and I did not type a single reminder.</p><h3>One more thing: let an AI maintain the memory for you</h3><p>I do not curate entries by hand. A separate workflow - an agent I call OpenClaw - reads the memory periodically, finds stale entries, and either verifies them, updates them, or supersedes them. It uses the same MCP tools every other client does. It writes back with <code>source_tool: "openclaw-agent"</code>. Every maintenance action ends up in the audit log.</p><p>This is the piece that turns a database into something alive. Facts do not need humans to stay fresh. They need another process with the same API access as the humans.</p><div><hr></div><h2>Closing</h2><p>The technical part is the easy part. Hundred entries, 3.4 MB, 3 ms searches, five dependencies, zero API keys, one command to back up. A weekend of work and you are running.</p><p>The harder part is the question I want you to actually sit with.</p><p><strong>Who controls how you show up to AI?</strong></p><p>For most people right now the answer is: the vendor whose product they use most. And people push back on this with &#8220;well, I can always ask it what it knows about me.&#8221; So try it. You will get the shallow layer - your name, your job, a couple of preferences. What you will miss is everything that actually matters. The project decisions. The half-formed opinions. The pattern of how you change your mind. The texture.</p><p>Memory lives in the details, and details only surface when you can slice them - by tag, by project, by time, by confidence. A chat interface flattens. A vector database with filters does not. And no built-in memory is going to close that gap for you, because closing it would mean handing you the tools to leave.</p><p>If you are fine with where your context lives right now, fine. That is a real choice.</p><p>But I do not think most people have actually made the choice. They defaulted into it, because the tool is convenient and the cost of the default is invisible until the day they want to switch.</p><p>We are early. I think each of us is eventually going to carry a digital twin - a persistent, portable memory of who we are that any AI can plug into and understand us immediately. That is where this is going. We are just not taking memory seriously enough yet to see it.</p><p>Worth fixing. Bring your own memory.</p><div><hr></div><p><em>If you want the full source layout, the schema, or the exact setup commands, reply to this post and I will send the detailed build guide.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://letters.mktpavlenko.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Mykyta Pavlenko! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Vibe Coding Won’t Save You. This PM Skill Will.]]></title><description><![CDATA[Vibe coding is everywhere, but most PMs use it wrong. A Senior PM building an AI product shares the one skill that actually matters - and it&#8217;s not prompting.]]></description><link>https://letters.mktpavlenko.com/p/vibe-coding-wont-save-you-this-pm</link><guid isPermaLink="false">https://letters.mktpavlenko.com/p/vibe-coding-wont-save-you-this-pm</guid><dc:creator><![CDATA[Mykyta]]></dc:creator><pubDate>Tue, 07 Apr 2026 17:10:26 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/7805a8ee-f435-4593-a31f-b2f9f9a28550_2752x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Vibe Coding Trap</h2><p>Six months ago, I was a skeptic.</p><p>When the whole vibe coding wave started, I looked at it and thought - this is terrifying. Someone else&#8217;s AI writes your code, you don&#8217;t fully understand what&#8217;s happening under the hood, and if something breaks in production, you&#8217;re the one holding the bag.</p><p>I work in travel-tech. When I think about a bug in our flight booking flow, I don&#8217;t think about a Jira ticket. I think about money. Real money. A single bug in an airline ticket purchase can cost more than a developer&#8217;s monthly salary. So when people started telling me &#8220;just let the AI write it,&#8221; my gut reaction was simple: no way.</p><p>But here&#8217;s the thing about gut reactions. Sometimes they&#8217;re right. And sometimes they&#8217;re just fear dressed up as wisdom.</p><p>94% of product professionals now use AI frequently in their workflows. Nearly half have embedded it deeply into their daily work. The tools got better - fast. Claude Code, Cursor, Copilot - they went from &#8220;interesting toy&#8221; to &#8220;legitimate workflow&#8221; in what felt like weeks, not months.</p><p>And I realized my skepticism wasn&#8217;t protecting me. It was holding me back.</p><p>But - and this is the part nobody talks about - the opposite extreme is just as dangerous. The PMs who think vibe coding will replace their engineering team. The founders who believe they can ship production software by typing prompts into an AI. The product people who confuse &#8220;I can build a prototype&#8221; with &#8220;I can build a product.&#8221;</p><p>Both extremes are wrong. And I learned this the hard way.</p><p>The real skill isn&#8217;t vibe coding. It&#8217;s knowing exactly where AI belongs in your product process - and where it absolutely doesn&#8217;t. I call it the <strong>Precision Placement</strong> approach. And it changed how I build products.</p><div><hr></div><h2>The Problem With How PMs Think About AI Coding</h2><p>Let me paint you a picture of how most product managers approach vibe coding right now.</p><p><strong>Camp 1: The Ignorers.</strong> These are the PMs who still don&#8217;t use AI at all. Not for code, not for docs, not for anything meaningful. If I&#8217;m being direct - this is a red flag. 
If you&#8217;re a PM in 2026 and you haven&#8217;t found a single workflow where AI makes you better, you&#8217;re not being careful. You&#8217;re being left behind.</p><p><strong>Camp 2: The Replacers.</strong> These are the ones reading headlines about &#8220;one-person billion-dollar companies&#8221; and thinking they&#8217;ll fire their dev team by next quarter. They vibe-coded a landing page once and now they believe they can ship enterprise software solo.</p><p><strong>Camp 3: The FOMO Builders.</strong> This is where I found myself. The ones who see the potential, start building everything, creating side projects, spinning up apps, prototyping features - and then one day stop and ask: &#8220;Wait, is this actually moving me forward?&#8221;</p><p>Here&#8217;s the uncomfortable truth I discovered. Vibe coding creates a very specific kind of FOMO. You see what&#8217;s possible. You see people shipping products in a weekend. And something clicks in your brain that says - I should be building MORE. Faster. Everything.</p><p>But that FOMO is a trap. Because as a PM, your job was never to build everything. Your job is to decide what gets built, why, and in what order. Vibe coding doesn&#8217;t change that equation. It just makes the &#8220;building&#8221; part faster. Which means the &#8220;deciding&#8221; part becomes even more critical.</p><p><strong>Here&#8217;s what I mean with a real example.</strong></p><p>I manage a Chrome extension for travel agents - it works across GDS systems like Amadeus, Sabre, and Galileo. For over a year, users kept asking: &#8220;Is this available on mobile? Can we use it on a phone?&#8221;</p><p>We kept saying no. Not because it was a bad idea - it was a perfectly good idea. But it was never P1. It never directly impacted revenue. There were always bigger fires to fight - bugs that affected the core booking flow, features that directly moved the needle on retention, integrations that clients were waiting for.</p><p>That mobile version was the classic P2 backlog ghost. Important but never urgent. The kind of task that lives in your backlog for 18 months and eventually gets archived with a sad little &#8220;won&#8217;t do&#8221; label.</p><p>Then we tried something different. We gave it to AI.</p><p>In three days - not three sprints, three days - we had an 80% functional mobile prototype. Working. Usable. Not a mockup, not a Figma file - actual working software that users could touch and click and test.</p><p>And here&#8217;s the kicker: we spent more time on organizing the repo and setting up the sprint than on the actual development.</p><p>That project would have NEVER happened without vibe coding. Not because it was technically impossible. But because in the traditional prioritization model, it would have never earned its spot at the top of the queue. The ROI calculation would have never justified pulling a developer off P1 work.</p><p><strong>This is the &#8220;aha&#8221; moment.</strong> Vibe coding doesn&#8217;t replace your developers. It gives your P2 and P3 backlog a chance to exist.</p><p>Think about that. How many features are sitting in your backlog right now that are genuinely good ideas but will never see the light of day because you can&#8217;t justify the development time? 
Vibe coding changes that calculation entirely.</p><div><hr></div><h2>Precision Placement: The Skill That Actually Matters</h2><p>So if it&#8217;s not about coding everything with AI, and it&#8217;s not about ignoring AI entirely - what is it?</p><p>I call it <strong>Precision Placement</strong>. It&#8217;s the skill of knowing exactly where AI-generated code belongs in your product - and where it doesn&#8217;t.</p><p>Here&#8217;s how it works in practice.</p><h3>Step 1: Map Your Risk Zones</h3><p>Not all code is created equal. In every product, there&#8217;s a spectrum:</p><p>On one end - features where a bug is annoying but harmless. A settings page that doesn&#8217;t save preferences. A notification that fires twice. Ugly, sure. But nobody loses money.</p><p>On the other end - flows where a single bug costs real money. In my case, that&#8217;s the flight booking engine. A pricing error, a double charge, a failed cancellation - these aren&#8217;t bugs. These are financial incidents.</p><p>Before you vibe-code anything, map this spectrum for your product. Draw a line. Everything on the &#8220;safe&#8221; side is a candidate for AI-assisted development. Everything on the &#8220;dangerous&#8221; side gets human hands, human eyes, human accountability.</p><p>The mobile prototype I mentioned? Safe side. Low risk, high learning potential, zero revenue impact if something breaks. Perfect candidate.</p><p>The core booking flow? That&#8217;s the other side. I get nervous when AI touches that. And I think that nervousness is healthy.</p><h3>Step 2: Redefine Your Backlog Categories</h3><p>We started doing something new in our sprint planning. When we look at a task and estimate that AI can handle 55% or more of the execution, we assign the task differently. AI becomes the &#8220;executor.&#8221; A developer becomes the &#8220;curator&#8221; - someone who reviews, adjusts, and ensures quality.</p><p>This isn&#8217;t about replacing people. It&#8217;s about creating a new role dynamic. The developer isn&#8217;t writing code from scratch for these tasks - they&#8217;re supervising, catching edge cases, ensuring the output meets production standards.</p><p>Think of it like this: you wouldn&#8217;t hand a junior developer a critical production system on day one. But you&#8217;d absolutely let them work on internal tools, experiments, or non-critical features with a senior developer reviewing their work. AI is your extremely fast, extremely tireless junior developer. Treat it accordingly.</p><h3>Step 3: Build to Learn, Not to Ship</h3><p>Here&#8217;s where most PMs get it wrong with vibe coding. They try to ship AI-generated code directly to production. That&#8217;s the wrong frame.</p><p>Use vibe coding to learn. Build a prototype in half an evening to test if an idea even makes sense. Show it to users. Get reactions. Gather data. THEN decide if it&#8217;s worth proper engineering investment.</p><p>I can build a clickable mobile prototype in half an evening now. Not production-ready - but good enough to put in front of a user and ask: &#8220;Would this solve your problem?&#8221; That answer used to cost us two sprints of design and development. Now it costs an evening.</p><p>The prototype isn&#8217;t the product. The prototype is the question. 
And vibe coding lets you ask questions 10x faster.</p><h3>Step 4: Start From Pain, Not From Courses</h3><p>I see PMs signing up for &#8220;AI for Product Managers&#8221; courses from self-proclaimed gurus who teach you to paste a prompt into ChatGPT, get another prompt, paste it into Lovable, and call it a product. That&#8217;s not learning. That&#8217;s following a recipe without understanding cooking.</p><p>Instead - look at your actual day. Where do you waste time? Where do you repeat yourself? Where do you wish you could see something quickly but can&#8217;t justify the development effort?</p><p>Start there. Start from the pain you can see and touch, not from a curriculum someone designed to sell you a $997 course.</p><p>And if you want actual AI education? The courses from Anthropic&#8217;s documentation are some of the best out there. Free. Practical. Built by people who actually develop AI, not people who make content about AI.</p><h3>Step 5: Accept the PM-Dev Tension</h3><p>Here&#8217;s something I haven&#8217;t seen anyone write about - the identity crisis that vibe coding creates for PMs.</p><p>When you can build things yourself, you start blurring the line between product management and development. And that feels exciting at first. You&#8217;re not just deciding what to build - you&#8217;re building it. You&#8217;re a PM AND a developer. A PM-Dev.</p><p>Anthropic actually has an interesting internal principle: if a feature or project takes less than two weeks to launch, the developer IS the PM. They talk to users, coordinate with other teams, make the calls. No separate PM needed.</p><p>This is where things are heading. And it creates a genuine tension. Because when you&#8217;re building, you&#8217;re not doing PM work. You&#8217;re not talking to users, analyzing data, thinking about strategy, prioritizing the bigger picture.</p><p>I feel this tension constantly. With Temporal.day, with side projects, with the FOMO of &#8220;I could build this tonight.&#8221; And the honest answer is - I don&#8217;t have it perfectly figured out. Nobody does.</p><p>But here&#8217;s my rule of thumb: if building this thing teaches me something about my users or my product that I can&#8217;t learn any other way - build it. If I&#8217;m building because it feels productive and exciting but doesn&#8217;t actually move the needle - stop. Close the laptop. Go talk to a customer instead.</p><div><hr></div><h2>The Long Game Nobody&#8217;s Talking About</h2><p>Here&#8217;s my final thought, and it&#8217;s the one I want you to remember.</p><p>Right now, if you compare a PM who uses AI to one who doesn&#8217;t - the AI user wins. Obviously. It&#8217;s a massive productivity multiplier.</p><p>But zoom out. In a year, in two years, everyone will use AI. It&#8217;ll be as unremarkable as using Google Docs or Slack. The tool becomes table stakes.</p><p>And when everyone has the same tool, what differentiates you?</p><p>Your skills. Your judgment. Your understanding of users. Your ability to make the right call when the data is ambiguous and the stakes are high. The things that AI can assist with but never replace.</p><p>I see products flooding the market right now - apps that were clearly vibe-coded in a weekend, full of bugs, solving problems that don&#8217;t exist. The barrier to building dropped. 
But the barrier to building something GOOD - something users trust, something that works reliably, something that solves a real problem - that barrier hasn&#8217;t moved at all.</p><p>In the short term, the person who uses AI best wins. In the long term, the person with the deepest skills who ALSO uses AI wins. There&#8217;s a difference.</p><p>Vibe coding is a tool. A powerful one. But it&#8217;s not a strategy, it&#8217;s not a skill, and it&#8217;s definitely not a replacement for knowing what the hell you&#8217;re building and why.</p><p>Figure out where you want to be on the map. Understand what you want to develop - in your product and in yourself. Then use every tool available - including AI - to get there deliberately. Not frantically. Not because of FOMO. Deliberately.</p><p>That&#8217;s the skill that matters. And no amount of vibe coding will teach it to you.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://letters.mktpavlenko.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Mykyta Pavlenko! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Your Competitors Are Laughing at You. Good.]]></title><description><![CDATA[Why the best products start as jokes - and how to find the market nobody&#8217;s fighting for]]></description><link>https://letters.mktpavlenko.com/p/your-competitors-are-laughing-at</link><guid isPermaLink="false">https://letters.mktpavlenko.com/p/your-competitors-are-laughing-at</guid><dc:creator><![CDATA[Mykyta]]></dc:creator><pubDate>Tue, 31 Mar 2026 14:55:15 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/71030403-de09-4578-8adb-b883cfeb332e_2752x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In 2007, Steve Ballmer stood in front of a camera and laughed.</p><p>&#8220;$500? Fully subsidized with a plan?&#8221; He could barely hold it together. &#8220;That is the most expensive phone in the world. And it doesn&#8217;t appeal to business customers because it doesn&#8217;t have a keyboard.&#8221;</p><p>He was talking about the iPhone.</p><p>Nokia controlled 51% of the global mobile phone market at the time. Microsoft was everywhere. BlackBerry owned the enterprise. Apple had just 5% and a device that most industry experts called a toy.</p><p>Six years later, Nokia lost 90% of its market value. BlackBerry became a cautionary tale. And that &#8220;toy&#8221; redefined how 4 billion people interact with technology.</p><p>Here&#8217;s what I find fascinating about this story. It&#8217;s not that the experts were wrong. It&#8217;s that they were wrong <em>because</em> they were experts. They knew the phone market. They knew what customers wanted. They knew what worked. And all of that knowledge became the exact thing that blinded them.</p><p>Tony Fadell - the guy who built the iPod and co-created the iPhone - writes about this in his book <em>Build</em>. 
His core argument is simple: the first version of your product should be disruptive, not evolutionary. You&#8217;re not supposed to make a better version of what exists. You&#8217;re supposed to make something that changes how people think about the category entirely.</p><p>That idea has been living in my head for the past few weeks. Not as theory - but as a lens for what I see happening in the market right now.</p><p>Because we&#8217;re in a moment where disruption isn&#8217;t just possible. It&#8217;s <em>easier than ever</em>. And almost nobody is doing it right.</p><div><hr></div><h2>The Problem With How Most Founders Think About Products</h2><p>Here&#8217;s the standard playbook in 2026:</p><p>You open ChatGPT. You type &#8220;give me a startup idea&#8221; or &#8220;what&#8217;s a good niche for a SaaS product.&#8221; The AI gives you something that sounds reasonable - habit trackers, finance apps, productivity tools. You think &#8220;great, that makes sense&#8221; and start building.</p><p>This is the most expensive mistake you can make. And I don&#8217;t mean expensive in money. I mean expensive in time - months or years spent in a market where the math doesn&#8217;t work in your favor.</p><p>Let me show you why.</p><p>The habit tracking app market is projected to reach $14.94 billion by 2026. Sounds massive, right? Now look deeper. Over 52% of users quit within the first 30 days. The market is so oversaturated that differentiation is nearly impossible. CPC for paid acquisition keeps climbing. Organic growth? Good luck ranking when there are hundreds of competitors with established SEO and millions in funding.</p><p>And here&#8217;s the trap: if you ask AI &#8220;is the habit tracking market a good opportunity?&#8221; - it will say yes. Because the market IS big. Because the data IS positive at a surface level. AI pulls from existing information, from articles that analyze these markets favorably, from optimistic projections. It doesn&#8217;t know what it&#8217;s like to actually try to acquire users in a space where every keyword costs $3+ per click and retention is garbage.</p><p>This is what I call the &#8220;AI research trap.&#8221; You&#8217;re not getting bad information. You&#8217;re getting <em>incomplete</em> information. And incomplete information in market selection is worse than no information - because it gives you confidence in the wrong direction.</p><p>The same pattern plays out in finance apps, subscription trackers, to-do lists, note-taking tools. Massive markets. Thousands of competitors. Brutal unit economics for anyone without venture capital.</p><p>Meanwhile, there are hundreds of niches where the opportunity is wide open. But nobody looks at them because they seem too small.</p><div><hr></div><h2>The Disruption Nobody Sees Coming</h2><p>Here&#8217;s what most people miss about the current moment.</p><p>We have roughly 4.7 billion smartphone users globally. A huge chunk of them have never meaningfully used AI. Not ChatGPT. Not AI features in their apps. Nothing. When you show these people a product that uses AI to solve a real problem they have - something as simple as automating a workflow they do manually every day - their reaction isn&#8217;t &#8220;oh, another AI tool.&#8221; Their reaction is &#8220;wait, this is possible?&#8221;</p><p>That reaction - that genuine surprise - IS disruption. Not in the Silicon Valley &#8220;we&#8217;re disrupting the $50B market&#8221; sense. In the real sense. 
You&#8217;re fundamentally changing how someone thinks about what&#8217;s possible.</p><p>And this is where it connects back to the iPhone story. Apple didn&#8217;t make a better phone. They made a product so different that competitors literally couldn&#8217;t evaluate it using their existing frameworks. Nokia looked at the iPhone and saw a phone without a keyboard. They missed that it wasn&#8217;t a phone at all - it was a pocket computer that happened to make calls.</p><p>The same thing is happening right now with AI products. Incumbents in most industries look at AI features and see them through the lens of their existing product. &#8220;We&#8217;ll add AI to our dashboard.&#8221; &#8220;We&#8217;ll use AI for better recommendations.&#8221; They&#8217;re building better keyboards while someone is about to remove the keyboard entirely.</p><p>Tony Fadell has this concept in <em>Build</em> that I love: &#8220;The best ideas are painkillers, not vitamins.&#8221; Vitamins are nice to have. You forget to take them and nothing happens. Painkillers solve an immediate, urgent problem. You know instantly if they&#8217;re working.</p><p>Most AI products being built right now are vitamins. They make existing workflows slightly faster or slightly easier. The real opportunity is building painkillers - products that solve problems people didn&#8217;t even know could be solved.</p><div><hr></div><h2>Why &#8220;Too Small&#8221; Is Your Biggest Advantage</h2><p>Now here&#8217;s where my thinking diverges from what most people teach.</p><p>Conventional wisdom says: find a big market. Total addressable market of $1 billion+. Growth rate of 20%+ year over year. That&#8217;s what investors want to hear. That&#8217;s what accelerators teach. That&#8217;s what every startup book recommends.</p><p>But there&#8217;s a math problem with big markets: big markets attract big teams. Big teams have big budgets. Big budgets mean they can outspend you on acquisition, out-hire you on engineering, and out-market you on brand.</p><p>Now flip it. What about a market that&#8217;s $10 million? $50 million? Too small for a venture-backed startup with 20 employees and $3 million in annual burn. Way too small for an enterprise team. Nobody writes TechCrunch articles about $10 million markets.</p><p>But for a solo founder? A market where one competitor has 60%+ market share and only a handful of other players exist? Where the market grows 15-20% year over year? Where the dominant player hasn&#8217;t innovated in years because the market isn&#8217;t big enough to justify their R&amp;D budget?</p><p>That&#8217;s not a bad market. That&#8217;s the <em>perfect</em> market.</p><p>Solo-founded startups now represent 36.3% of all new companies - up from 23.7% in 2019. And 38% of seven-figure businesses are run by solopreneurs. The reason is simple: AI has collapsed the cost of building. What used to require a team of 10 can now be done by one person with the right tools and workflow. Operating margins for AI-powered solo founders hit 60-80%, compared to 10-20% for traditionally staffed businesses.</p><p>This means the &#8220;minimum viable market&#8221; has shrunk dramatically. Markets that were economically unviable five years ago are now goldmines for solo founders. You don&#8217;t need millions of users. You don&#8217;t need venture capital. 
You need a niche where real pain exists, competition is thin, and you can build something genuinely different.</p><div><hr></div><h2>How to Find Your Disruption: A Practical Framework</h2><p>Theory is great. But what do you actually do on Monday morning? Here are five steps.</p><h3>Step 1: Stop Asking AI for Ideas. Start Giving It Parameters.</h3><p>The biggest mistake is treating AI as a creative partner for market selection. It&#8217;s not. It&#8217;s a research assistant - and a good one - but only if you give it the right constraints.</p><p>Don&#8217;t ask: &#8220;What&#8217;s a good SaaS idea?&#8221; Ask: &#8220;Find me markets where one competitor holds 60%+ market share, there are fewer than 10 significant players globally, and the market has grown 15%+ year over year for the last 3 years.&#8221;</p><p>Don&#8217;t ask: &#8220;Is the habit tracking market saturated?&#8221; Ask: &#8220;What is the CPC for the top 20 keywords in the habit tracking space? What&#8217;s the average 30-day retention rate for the top 10 habit tracking apps? How many new habit tracking apps launched in the last 12 months?&#8221;</p><p>The difference is night and day. The first type of question gets you generic, optimistic answers. The second gets you data you can actually make decisions with.</p><h3>Step 2: Use Sensor Tower to See What Nobody Else Sees</h3><p>Most founders do their market research on Product Hunt and Twitter. That&#8217;s like trying to understand the ocean by looking at the waves on the surface.</p><p>Sensor Tower gives you the submarine view. You can see estimated downloads, revenue, user engagement, and retention by country, category, and platform. You can track how specific apps are performing over time. You can see which markets are growing and which are stagnating.</p><p>The key move: look for categories where the top app is making real revenue but hasn&#8217;t updated meaningfully in 6-12 months. Where user reviews mention the same complaints over and over. Where the market is growing but the product experience is stuck in 2020.</p><p>That&#8217;s your opening.</p><h3>Step 3: Talk to Humans (And Watch Their Eyes)</h3><p>This is the test from <em>Build</em> that I think about constantly. When you describe your product idea to someone - not another founder, not a tech person, a normal person - watch their reaction.</p><p>If they say &#8220;oh cool, interesting&#8221; and change the subject - you have a vitamin.</p><p>If they lean in and start asking questions - &#8220;wait, how does that work? Can it do X? What about Y?&#8221; - you have a painkiller. That curiosity, that pull toward wanting to understand more - that&#8217;s the reaction that predicts product-market fit better than any survey or market analysis.</p><p>Don&#8217;t skip this step. Don&#8217;t rationalize your way past lukewarm reactions. The humans will tell you the truth faster than any data set.</p><h3>Step 4: Look Where Nobody&#8217;s Looking</h3><p>The AI boom has created a gold rush mentality. Everyone is building AI-powered productivity tools, AI writing assistants, AI analytics dashboards. These are the habit trackers of 2026 - crowded, commoditized, and brutal for newcomers.</p><p>Meanwhile, entire industries are barely touched by AI. Think about markets where the users are 40+ years old. Where the existing software looks like it was built in 2012. Where &#8220;digital transformation&#8221; is still a buzzword, not a reality. Healthcare admin. Construction management. Local government. Small-scale agriculture. 
Trade services.</p><p>These aren&#8217;t sexy markets. They&#8217;re not going to get you on the front page of Hacker News. But they&#8217;re markets with real pain, low competition, and users who will be genuinely amazed by what AI can do - because they&#8217;ve never seen it applied to their problems before.</p><h3>Step 5: Escape the Mediocrity Trap</h3><p>This is the most dangerous part, and nobody talks about it.</p><p>You build your product. You launch. And the results are... mediocre. Not terrible. Not great. You get some users. Some of them stick around. Revenue trickles in. And your brain starts rationalizing.</p><p>&#8220;I haven&#8217;t run ads yet.&#8221; &#8220;I haven&#8217;t built feature X yet.&#8221; &#8220;People are looking at it - they just haven&#8217;t converted.&#8221; &#8220;I need to give it more time.&#8221;</p><p>This is the mediocrity trap. And it will eat years of your life if you let it.</p><p>Mediocre results are not a sign that you&#8217;re close to product-market fit. They&#8217;re a sign that you&#8217;re probably not. Real product-market fit feels like being pulled forward. Users tell other users. Growth accelerates on its own. People get angry when the product is down.</p><p>If you&#8217;re three months in and constantly explaining to yourself why the results aren&#8217;t better - stop. Go back to Step 1. Rerun your analysis. Find a different market or a different angle.</p><p>The courage to kill a mediocre product is worth more than the persistence to keep a mediocre product alive.</p><div><hr></div><h2>The Bottom Line</h2><p>The best products in history started as jokes. The iPhone was a toy with no keyboard. The Nest thermostat was a &#8220;fancy temperature dial.&#8221; Airbnb was &#8220;who would sleep on a stranger&#8217;s couch?&#8221;</p><p>Competitor laughter isn&#8217;t a warning sign. It&#8217;s a leading indicator. If everyone in your space immediately understands what you&#8217;re building - you&#8217;re probably building something evolutionary, not disruptive. And evolutionary products in crowded markets are a death sentence for solo founders.</p><p>The opportunity right now is enormous. Billions of people who haven&#8217;t used AI for anything meaningful. Hundreds of niches too small for big teams but perfect for one person with the right tools. Markets where the dominant player is asleep at the wheel.</p><p>Don&#8217;t ask AI for ideas. Give it parameters. Use tools like Sensor Tower to see what others can&#8217;t. Talk to humans and watch their eyes. Build where nobody&#8217;s looking. And if the results are mediocre - have the courage to walk away and find a better fight.</p><p>The market is waiting for products that make people lean in and say &#8220;wait - how does that work?&#8221;</p><p>Go build one.</p><div><hr></div><p><em>Inspired by Tony Fadell&#8217;s &#8220;Build: An Unorthodox Guide to Making Things Worth Making&#8221; - one of the best books on product thinking I&#8217;ve read this year.</em></p>]]></content:encoded></item><item><title><![CDATA[PMs Don’t Manage Backlogs Anymore. They Manage AI Agents.]]></title><description><![CDATA[73% of PMs use AI tools daily. 
But most use them wrong. Here&#8217;s the framework for leading AI agents instead of blindly following them&#8202;&#8212;&#8202;from a PM building an AI product.]]></description><link>https://letters.mktpavlenko.com/p/pms-dont-manage-backlogs-anymore</link><guid isPermaLink="false">https://letters.mktpavlenko.com/p/pms-dont-manage-backlogs-anymore</guid><dc:creator><![CDATA[Mykyta]]></dc:creator><pubDate>Tue, 24 Mar 2026 20:49:23 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/938a1f8c-20d3-4e9e-b7bf-c23f7e3b98b0_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Over 73% of product managers now use AI tools in their daily workflow. That&#8217;s nearly double from just two years ago.</p><p>Scroll through LinkedIn for five minutes and you&#8217;ll see it everywhere. &#8220;10x your PM productivity with AI.&#8221; &#8220;How I replaced my entire research process with Claude.&#8221; &#8220;This one prompt writes your PRDs for you.&#8221;</p><p>Here&#8217;s what nobody says out loud: most of them are doing it wrong.</p><p>I know because I&#8217;m a PM who builds an AI-powered calendar app&#8202;&#8212;&#8202;Temporal.day. I use AI agents every single day. Not as a novelty. Not for LinkedIn content. As my actual workflow. And the more I use them, the more I see the same pattern everywhere.</p><p>Product managers in 2026 fall into two camps. Both are losing.</p><p>The first camp treats AI like a magic oracle. They copy-paste every question, every decision, every piece of thinking into ChatGPT or Claude. They let it write their PRDs, run their research, draft their emails. They essentially outsource their brain. And their work reads like it. Flat. Generic. Missing the context that only a human who&#8217;s deep in the problem can provide.</p><p>The second camp refuses to touch it. &#8220;I prefer to do things properly.&#8221; &#8220;AI can&#8217;t understand my users.&#8221; &#8220;It&#8217;s just a fad.&#8221; Meanwhile, the market moves at 10x speed, their competitors ship faster, and they&#8217;re stuck in 2023.</p><p>The winning move? Neither. It&#8217;s something different entirely.</p><p>And it starts with understanding one thing: AI is not your replacement. It&#8217;s not your assistant either. It&#8217;s your junior PM&#8202;&#8212;&#8202;and you need to manage it like one.</p><div><hr></div><h3>The Problem With How Most PMs Think About AI</h3><p>Here&#8217;s the conventional wisdom: AI tools make you faster. Just plug them in, ask your questions, get your answers, move on.</p><p>Sounds logical. It&#8217;s also wrong.</p><p>When you let AI do your thinking, something subtle happens. Your own skill level starts dropping. Not overnight&#8202;&#8212;&#8202;slowly. 
Like a muscle you stop using. You stop forming hypotheses before looking at data. You stop connecting dots between user complaints and product architecture. You stop thinking.</p><p>I see this constantly. PMs who generate beautiful-looking PRDs with AI&#8202;&#8212;&#8202;but can&#8217;t defend a single decision in the document when challenged. Research summaries that sound impressive but miss the one insight that actually matters. Strategy docs that read like they were written by someone who&#8217;s never talked to a customer.</p><p>And here&#8217;s the kicker&#8202;&#8212;&#8202;readers can tell. Studies show AI-generated content receives 43% lower trust ratings. Over 50% of people say they&#8217;d lose respect for a writer who relies on AI. Your stakeholders, your engineers, your users&#8202;&#8212;&#8202;they feel it even if they can&#8217;t articulate why.</p><p>It&#8217;s the same problem as in user research. You can ask users whether a button should be blue or yellow. That&#8217;s fine. But you can&#8217;t ask them how the underlying system should work to solve their problem. That requires deep product thinking. That&#8217;s YOUR job. The moment you outsource that to AI, you&#8217;re building a product based on an algorithm&#8217;s best guess&#8202;&#8212;&#8202;not on real understanding.</p><p>I had a concrete moment that crystallized this for me. I needed to research how to implement document uploading through a browser extension for Temporal.day. The old me would have gone straight to a developer. We&#8217;d sit together, brainstorm solutions, research APIs, evaluate tradeoffs. Days of work.</p><p>Instead, I opened Claude. We did several iterations. I asked targeted questions. I challenged its suggestions. I pushed it in directions based on what I knew about our architecture and users. Within hours, I had three viable implementation approaches to bring to the team.</p><p>Here&#8217;s the important part&#8202;&#8212;&#8202;one of those solutions was something none of us knew was even possible. The AI found a technical path we&#8217;d never considered. But it only found it because I kept steering the conversation with context the AI didn&#8217;t have. My knowledge of our users. Our technical constraints. Our business priorities.</p><p>The AI didn&#8217;t make the decision. I did. But it expanded the solution space dramatically. That&#8217;s the difference.</p><h3>The Shift: From Context Consumer to Context Engineer</h3><p>There&#8217;s a new term floating around in 2026&#8202;&#8212;&#8202;&#8220;context engineering.&#8221; Gartner published a formal definition. Companies are hiring for it. And it perfectly describes what the best PMs actually do with AI.</p><p>Context engineering isn&#8217;t about writing better prompts. It&#8217;s about designing the entire information environment that AI operates in. Your system instructions. Your conversation history. Your persistent knowledge. The documents you feed it. The tools you connect. And&#8202;&#8212;&#8202;critically&#8202;&#8212;&#8202;what you deliberately exclude.</p><p>For a PM, this translates to something practical. Before you ever ask AI a question, you set up the playing field. Who are you? What does your company do? What are your constraints? What decisions have you already made, and why?</p><p>Here&#8217;s my actual process. I don&#8217;t just open Claude and start typing. I talk to it first in a regular chat. I explain the task. Then I ask it to understand me as a person and my business. I let it ask ME questions. From that conversation, it forms an instruction set&#8202;&#8212;&#8202;a project context. And that&#8217;s what I bring into the actual working project.</p>
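<p>To make this concrete, here is a minimal sketch of what that kind of project context can look like if you wire it up in code - written against the Anthropic Python SDK, with every product detail below a placeholder rather than my real setup. The structure is the point: the persistent context lives in the system prompt, and each question runs against it.</p><pre><code># A minimal sketch of "context engineering" for a PM workflow.
# Assumes the Anthropic Python SDK (pip install anthropic) and an
# ANTHROPIC_API_KEY in the environment. Every product detail below
# is a placeholder, not a real setup.
import anthropic

# Persistent project context - written once, reused in every session.
PROJECT_CONTEXT = """
You are a product thinking partner for a solo PM.
Product: an AI calendar that auto-schedules tasks (placeholder).
Users: busy professionals who plan their week in one sitting.
Constraints: one non-technical founder, no ad budget, mobile-first.
Decisions already made: weekly planning is the core loop.
Always ask one clarifying question before recommending anything.
"""

client = anthropic.Anthropic()

def ask(question):
    """Run one question against the persistent project context."""
    response = client.messages.create(
        model="claude-sonnet-4-5",   # placeholder model name
        max_tokens=1024,
        system=PROJECT_CONTEXT,      # the engineered context
        messages=[{"role": "user", "content": question}],
    )
    return response.content[0].text

print(ask("I am choosing between building notifications and improving "
          "onboarding. What am I not seeing?"))
</code></pre><p>You do not need to touch code to get the same effect - custom instructions in a Claude Project do the same job - but seeing it laid out makes the separation obvious: context once, questions many times.</p>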
<p>The result? AI that actually knows what it&#8217;s talking about. Not generic advice you could get from any Google search&#8202;&#8212;&#8202;specific, contextual thinking that&#8217;s grounded in my reality.</p><p>This is what separates PMs who use AI well from PMs who use AI a lot. Volume doesn&#8217;t matter. Context does.</p><div><hr></div><h3>5 Steps to Lead AI Instead of Following It</h3><p>Here&#8217;s how to make this real in your daily work. Not theory&#8202;&#8212;&#8202;actual steps I use while building Temporal.day.</p><h3>Step 1: Build Your AI&#8217;s Understanding Before Asking It Anything</h3><p>Think of it like onboarding a new team member. You wouldn&#8217;t throw a fresh hire into a sprint planning meeting on day one and expect useful input. You&#8217;d give them context first.</p><p>Do the same with AI. Start a conversation where the sole purpose is context transfer. Tell it about your product, your users, your market, your tech stack. Let it ask you questions. The more it understands before you start working, the better every interaction will be afterward.</p><p>What this gives you: AI responses that are specific to your situation instead of generic best practices. The difference is night and day.</p><h3>Step 2: Talk Through Decisions&#8202;&#8212;&#8202;Don&#8217;t Just Ask for Answers</h3><p>Most PMs use AI like a search engine. &#8220;What&#8217;s the best way to prioritize features?&#8221; That gets you a textbook answer.</p><p>Instead, talk through your actual decision. &#8220;I&#8217;m choosing between building notifications and improving onboarding. Here&#8217;s our current activation rate, here&#8217;s what users are saying, here&#8217;s our runway. What am I not seeing?&#8221;</p><p>Use AI as a thinking partner, not an answer machine. Share your reasoning. Explain your constraints. Let it challenge your assumptions. But always remember&#8202;&#8212;&#8202;you make the call. It expands your perspective. You own the decision.</p><p>What this gives you: better decisions, faster. Not because AI decides for you, but because it stress-tests your thinking in real time.</p><h3>Step 3: Run AI in Parallel&#8202;&#8212;&#8202;Not in Series</h3><p>Here&#8217;s something most people miss. The real power of AI agents isn&#8217;t that they answer faster. It&#8217;s that they work while you work.</p><p>My morning routine includes checking Amplitude AND checking my Claude Cowork scheduled tasks. I set up agents that run automatically&#8202;&#8212;&#8202;monitoring competitors, surfacing trends, preparing research briefs. They run in the background while I sleep, while I&#8217;m in meetings, while I&#8217;m doing deep work.</p><p>This is the &#8220;junior PM&#8221; analogy in practice. You don&#8217;t sit next to your junior and watch them work. You give them clear tasks, check their output, and redirect when needed. Same with AI agents.</p><p>What this gives you: your time goes to high-value thinking while AI handles the information gathering. You review and decide. You don&#8217;t wait.</p><h3>Step 4: Start Small With Agents&#8202;&#8212;&#8202;And Obsess Over Security</h3><p>If you&#8217;re new to AI agents, don&#8217;t try to automate your entire workflow on day one. Find one simple problem. Something repetitive. Something low-risk.</p><p>Build a solution with minimal cost&#8202;&#8212;&#8202;even free tiers work. 
Deploy it. Watch it run. Learn how it breaks. Because it will break.</p><p>And here&#8217;s something the LinkedIn &#8220;AI guru&#8221; crowd never mentions&#8202;&#8212;&#8202;security matters enormously. I see cases constantly where admin panels get compromised, user data gets exposed, paid customer lists leak. When you&#8217;re building with AI agents, security isn&#8217;t optional. It&#8217;s your first constraint, not your last thought.</p><p>The best course on building agents? Not some influencer&#8217;s $499 masterclass. Go read Anthropic&#8217;s documentation on how to create agents, how they work, how LLMs function. It&#8217;s free. It&#8217;s from the people who actually build this technology. It gives you the real understanding&#8202;&#8212;&#8202;not just a prompt template.</p><p>What this gives you: real experience with agents without risking your product or your users&#8217; data.</p><h3>Step 5: Protect Your Thinking Muscle</h3><p>This is the most counterintuitive step. As you get better at using AI, deliberately maintain skills it could do for you.</p><p>Write some PRDs manually. Do some research the old way. Make decisions without asking AI first. Not because AI can&#8217;t help&#8202;&#8212;&#8202;but because your judgment is the thing that makes AI useful in the first place. If you let that muscle atrophy, your AI output gets worse too. Garbage in, garbage out&#8202;&#8212;&#8202;and &#8220;garbage&#8221; here means a PM who&#8217;s lost the ability to think critically about the problem.</p><p>Remember&#8202;&#8212;&#8202;the reason people can spot AI-generated content isn&#8217;t that AI writes badly. It&#8217;s that the content lacks the specific, hard-won perspective that only comes from doing the work yourself. Keep doing the work. Then use AI to amplify it.</p><p>What this gives you: the rare combination of speed AND depth. That&#8217;s the real competitive advantage.</p><div><hr></div><h3>The One Thing to Remember</h3><p>Here&#8217;s what it all comes down to.</p><p>Don&#8217;t give AI 100% of your work. Keep it at your level&#8202;&#8212;&#8202;or slightly behind you. You lead. It follows. You include it in your process, you customize it, you teach it your context. But you never hand over the steering wheel completely.</p><p>The PMs who will thrive in 2026 and beyond aren&#8217;t the ones who use AI the most. They&#8217;re the ones who use it the best&#8202;&#8212;&#8202;as a force multiplier for thinking they&#8217;re already doing, not a replacement for thinking they&#8217;ve stopped doing.</p><p>The backlog isn&#8217;t dead. But the PM who only manages a backlog? That role is gone. The new PM manages context, leads agents, and&#8202;&#8212;&#8202;most importantly&#8202;&#8212;&#8202;never stops thinking for themselves.</p><p>Because the moment you outsource your judgment to a machine, you&#8217;re not a product manager anymore. 
You&#8217;re just a very expensive copy-paste operator.</p><p>And your Claude already knows that.</p>]]></content:encoded></item><item><title><![CDATA[I'm Not a Developer. I Built an AI Product in 2 Months. Here's My Entire $150/Month AI Stack.]]></title><description><![CDATA[The exact AI workflow behind building Temporal.day from scratch.]]></description><link>https://letters.mktpavlenko.com/p/ai-workflow-built-product-no-code</link><guid isPermaLink="false">https://letters.mktpavlenko.com/p/ai-workflow-built-product-no-code</guid><dc:creator><![CDATA[Mykyta]]></dc:creator><pubDate>Sun, 22 Mar 2026 08:57:29 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/7e02f197-367b-4902-87b3-25b70843d208_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div><hr></div><h3>Most people use AI to write emails faster or summarize articles.</h3><p>I used it to replace a team I couldn&#8217;t afford&#8202;&#8212;&#8202;and build a product I couldn&#8217;t build alone.</p><p>Let me back up. I&#8217;m a Product Manager. Five years of experience across telecom and travel-tech. I can write a solid PRD, run a user interview, and prioritize a backlog in my sleep. But I can&#8217;t code. Not really. I understand architecture at a high level, I can tell you how I want services to talk to each other, but writing production code? That&#8217;s not me.</p><p>And yet, two months ago, I shipped <a href="https://temporal.day/">Temporal.day</a>&#8202;&#8212;&#8202;an AI-powered calendar that auto-schedules your tasks. A real product. With real users. Built almost entirely with AI tools.</p><p>Not a landing page. Not a Figma prototype. A working product with AI auto-scheduling, natural language input, Google Calendar sync, payments, and a live user base.</p><p>My total AI spend: roughly <strong>$150 per month</strong>.</p><p>A developer alone would cost me $5,000+ per month. Add a designer, a content person, someone to help with distribution&#8202;&#8212;&#8202;you&#8217;re looking at $10K+ easily. And it would take significantly longer to ship.</p><p>I&#8217;m not writing this to brag. I&#8217;m writing this because most &#8220;how I use AI&#8221; articles are useless. They list 10 tools, describe each one in two sentences, and leave you with nothing actionable. You&#8217;ve read that article. I&#8217;ve read that article. It didn&#8217;t change how I work.</p><p>This is different. This is my actual daily workflow&#8202;&#8212;&#8202;tool by tool, hour by hour, decision by decision. 
What I use, why I use it, where AI saves me hours, and where I deliberately don&#8217;t use it at all.</p><p>If you&#8217;re a PM thinking about building your own product, a founder trying to do more with less, or anyone curious about what an AI-first workflow actually looks like in practice&#8202;&#8212;&#8202;this is everything I know.</p><div><hr></div><h3>The $150 Team</h3><p>Here&#8217;s a mental exercise. Imagine you&#8217;re hiring a team to build and launch a product.</p><p>You need a developer to write code. A researcher to dig through competitors, find the right tools, analyze markets. A content person to write tweets, blog posts, and product updates. A distribution assistant to monitor Reddit, find relevant conversations, and spot opportunities. And someone to QA everything, find bugs, and write test cases.</p><p>That&#8217;s five roles. Conservatively $10&#8211;15K per month. Realistically more.</p><p>Or you can do what I did: pay $150/month and build the team out of AI tools.</p><p>Here&#8217;s who&#8217;s on my team.</p><p><strong>Claude&#8202;&#8212;&#8202;my CTO and Head of Content (~80% of my work)</strong></p><p>Claude is the center of everything. I use it for research, strategy, content creation, writing documentation, brainstorming features, analyzing competitors, and building artifacts.</p><p>Why Claude over ChatGPT? Three things. First, it holds context significantly better over long sessions. When I&#8217;m deep in a product strategy conversation that spans 30+ messages, Claude still remembers what we discussed at the beginning. ChatGPT starts drifting.</p><p>Second, it adapts to me. After working with Claude consistently, the responses feel calibrated to how I think and what I need. ChatGPT never quite got there&#8202;&#8212;&#8202;the personalization felt off, and I could never get comfortable with its responses.</p><p>Third, the artifacts. When I need a structured document, a comparison table, a framework&#8202;&#8212;&#8202;Claude&#8217;s artifacts are cleaner and more usable than anything I&#8217;ve gotten from other tools.</p><p>If Claude is one person on my team, it&#8217;s the one I&#8217;d keep if I had to fire everyone else.</p><p><strong>Claude Code&#8202;&#8212;&#8202;my developer</strong></p><p>This is the big one. I&#8217;m not a developer, and Claude Code writes all of Temporal&#8217;s production code.</p><p>My process looks like this: I describe the architecture I want. How services should interact. What the user experience should feel like. What the edge cases are. Claude Code writes the implementation. I test. We iterate.</p><p>Is the code perfect? Probably not by senior engineer standards. But the product works. It ships. It handles real users. And the speed is incomparable&#8202;&#8212;&#8202;Claude Code writes faster than any human developer. Not better, necessarily. But faster. And when you&#8217;re building 0&#8594;1, speed of iteration is everything.</p><p>The alternative was spending months finding a technical co-founder or burning through savings hiring a freelance dev. Claude Code let me go from idea to working product in two months.</p><p><strong>Perplexity&#8202;&#8212;&#8202;my analyst</strong></p><p>Perplexity handles research that Google can&#8217;t.</p><p>Here&#8217;s a real example. I needed a payment processor for Temporal. Sounds simple, right? Google &#8220;best payment processors&#8221; and pick one. Except I&#8217;m a resident of a specific country, which means half the popular services won&#8217;t work for me. 
And the ones Google surfaces are all the same big names&#8202;&#8212;&#8202;Stripe, Paddle, LemonSqueezy. The SEO game is dominated by the biggest players.</p><p>I needed something smaller. Something niche but reliable, with lower commissions, that actually works in my jurisdiction.</p><p>Google was useless. I kept seeing the same top-10 lists recycled across every blog.</p><p>Perplexity found it. It took a few hours of back and forth&#8202;&#8212;&#8202;trying different criteria, evaluating options, checking availability. But it surfaced a service I never would have discovered through traditional search. A smaller provider, popular in certain circles, with better terms for my situation.</p><p>That&#8217;s the pattern: when you need to go beyond the SEO-optimized surface of the internet, Perplexity digs deeper.</p><p><strong>ChatGPT Codex&#8202;&#8212;&#8202;my QA engineer</strong></p><p>I still pay for ChatGPT, but my usage dropped to about 5%. Here&#8217;s why I keep it: Codex.</p><p>Claude Code and ChatGPT Codex have fundamentally different personalities. Claude Code is the fast developer who wants to ship everything now. Codex is the careful reviewer who reads the entire codebase and says, &#8220;Hey, did you notice this bug on line 847?&#8221;</p><p>I use Codex for full project reviews, finding bugs that Claude Code introduced while moving fast, and writing test cases. It&#8217;s a different kind of thinking&#8202;&#8212;&#8202;slower, more thorough, more concerned with the whole picture. Claude Code builds. Codex audits. They complement each other.</p><p>Plus, OpenAI&#8217;s $20 subscription gives access to the API, which I use for my OpenClaw bot. And honestly&#8202;&#8212;&#8202;if I pause my subscription, ChatGPT keeps working for another four weeks while sending polite reminders about the failed payment. So effectively it&#8217;s $20 per two months. I&#8217;m not proud of it. But I&#8217;m being honest.</p><p><strong>OpenClaw&#8202;&#8212;&#8202;my distribution assistant who never sleeps</strong></p><p>This one changed my mornings completely.</p><p>OpenClaw is an open-source AI agent that runs on your machine and connects to messaging apps. I configured mine to work as an automated assistant that operates around the clock.</p><p>Every morning when I wake up at 5&#8211;6 AM, OpenClaw has already prepared my briefing:</p><ul><li><p>Ideas people are discussing online that could inspire new features or products</p></li><li><p>A summary of what&#8217;s new in the AI and productivity space&#8202;&#8212;&#8202;things I might have missed</p></li><li><p>Interesting tweets that accumulated overnight in my niche</p></li><li><p>Reddit posts where people are discussing productivity apps, calendars, or looking for tools like Temporal&#8202;&#8212;&#8202;opportunities for me to engage and mention my product</p></li></ul><p>This used to be an hour of manual scrolling. Now it&#8217;s a curated feed waiting for me before my first coffee.</p><p><strong>The supporting cast</strong></p><p>SuperWhispy converts my voice recordings to text&#8202;&#8212;&#8202;I often brainstorm by talking, then turn those recordings into tweets, notes, or article drafts. AI image generators handle TikTok avatars and visual content. Small tools, but they close gaps that would otherwise require hiring a designer for quick tasks.</p><p><strong>Total cost: ~$150/month. Total roles covered: 5+.</strong></p><div><hr></div><h3>5 AM to 10 AM: What an AI-First Workday Actually Looks Like</h3><p>I wake up between 5 and 6 AM. 
My main job starts later, so mornings are for Temporal.</p><p><strong>First 15 minutes: The OpenClaw briefing.</strong></p><p>I check my phone. OpenClaw has already sent me a Telegram summary: trends, ideas, relevant Reddit threads, interesting tweets in my niche. I scan it, star anything worth acting on, and move to my desk.</p><p><strong>Next 30&#8211;60 minutes: Research and planning in Claude.</strong></p><p>Before I write a single line of code, I think. I open Claude&#8202;&#8212;&#8202;lately through Claude Code on desktop because the research feels more thorough there&#8202;&#8212;&#8202;and we work through whatever problem I&#8217;m tackling.</p><p>Say I&#8217;m building a new feature. The session looks like this:</p><p>First, I research with Claude. What are competitors doing? What are the technical options? What are the tradeoffs? We go back and forth until I have a clear picture.</p><p>Then, we formulate the approach together. Not a formal spec, but clear theses: &#8220;This is what we&#8217;re building. This is how it should work. These are the edge cases.&#8221;</p><p>I write this down as structured notes. This part is critical&#8202;&#8212;&#8202;and I&#8217;ll explain why in the next section.</p><p><strong>Next 2&#8211;3 hours: Building with Claude Code.</strong></p><p>I take those notes and move to Claude Code. Now it&#8217;s execution mode.</p><p>I describe what I want. Claude Code writes it. I test. Something breaks. I describe the issue. Claude Code fixes it. We iterate.</p><p>On a good morning, I ship a complete feature before my day job starts. On a normal morning, I make solid progress on something complex. Either way&#8202;&#8212;&#8202;the product moves forward every single day.</p><p><strong>Throughout the day: Content in the gaps.</strong></p><p>Between meetings at my main job, during lunch, on my commute&#8202;&#8212;&#8202;I use Claude to draft tweets, brainstorm content ideas, or think through Temporal&#8217;s positioning. These micro-sessions add up. Most of my Twitter content is born in 5-minute bursts throughout the day.</p><p><strong>Evenings: If time allows, one more session.</strong></p><p>After work, gym, or English classes&#8202;&#8212;&#8202;if I have energy left, I do another 1&#8211;2 hour session with Claude Code. But mornings are the sacred time.</p><div><hr></div><h3>What AI Can&#8217;t Do (And Where I Refuse to Use It)</h3><p>This is the part most AI articles skip. And it&#8217;s the most important part.</p><p>Because if all you hear is &#8220;AI can do everything,&#8221; you&#8217;ll use it wrong. You&#8217;ll delegate the things that only you should do. And you&#8217;ll end up with a product that technically works but doesn&#8217;t make sense.</p><p>Here&#8217;s where I deliberately keep AI out of my process.</p><p><strong>Product vision is mine. Period.</strong></p><p>I don&#8217;t ask AI to write my product requirements. Ever.</p><p>This might sound counterintuitive. Claude is great at writing documentation. It can generate a PRD in seconds. But here&#8217;s the problem: if AI writes your spec, you stop understanding your own product.</p><p>When I sit down to define how a feature should work, I need to think through every scenario. What happens when a user has 12 meetings and 20 tasks in one day? What should the AI prioritize? What does &#8220;urgent&#8221; mean in the context of someone&#8217;s Tuesday vs. their Friday?</p><p>These aren&#8217;t technical questions. They&#8217;re product questions. 
And the answers come from my 5+ years of experience, my understanding of the user, and my vision for what Temporal should feel like.</p><p>If I outsource this thinking to AI, I become a project manager for a machine. I stop being the product person. And the product becomes generic&#8202;&#8212;&#8202;because AI will always default to the most common patterns, not the most interesting ones.</p><p>So I write the specs. I define the logic. I draw the boundaries. Then, and only then, does AI help me execute.</p><p><strong>Testing is human work.</strong></p><p>AI can write test cases. And I use AI-generated test cases as a starting point. But the actual testing&#8202;&#8212;&#8202;clicking through flows, feeling the friction, noticing that something is technically correct but feels wrong&#8202;&#8212;&#8202;that&#8217;s me.</p><p>You have to use your own product obsessively. Every day. As a real user, not as a builder. The moment you stop testing personally is the moment your product starts drifting from what users actually need.</p><p><strong>Distribution strategy needs a human brain.</strong></p><p>AI handles maybe 10% of my distribution work&#8202;&#8212;&#8202;OpenClaw finding Reddit threads, Claude drafting tweets. But the strategy behind it&#8202;&#8212;&#8202;which conversations to join, what tone to strike, when to mention my product and when to just help someone&#8202;&#8212;&#8202;that requires judgment AI doesn&#8217;t have.</p><p>AI doesn&#8217;t understand context the way a human does. It doesn&#8217;t know that this particular Reddit thread is the wrong place to self-promote, or that this tweet needs to be vulnerable, not polished. Social intelligence is still a human skill.</p><p><strong>The honest limitations.</strong></p><p>AI hallucinates. It confidently tells you things that aren&#8217;t true. I&#8217;ve caught Claude inventing features that don&#8217;t exist in competitors&#8217; products. I&#8217;ve caught Codex suggesting code patterns that would break other parts of the system.</p><p>And context degrades. In long sessions&#8202;&#8212;&#8202;50+ messages&#8202;&#8212;&#8202;AI starts losing the thread. It forgets constraints you mentioned earlier. It contradicts its own recommendations. You have to manage this actively: break complex work into focused sessions, summarize the state regularly, and never blindly trust a response just because it sounds confident.</p><p><strong>The bottom line: AI is a multiplier, not a replacement.</strong></p><p>Here&#8217;s how I think about it. AI multiplies whatever you bring to the table.</p><p>If you have strong product vision, AI multiplies it with speed and execution power. You get a product shipped in 2 months instead of 8.</p><p>But if your vision is zero? Zero multiplied by anything is still zero. You&#8217;ll just get generic output faster.</p><p>The skill isn&#8217;t using AI. 
The skill is knowing what to ask, what to keep for yourself, and when to override the machine.</p><div><hr></div><h3>The Math That Changed My Mind</h3><p>Let me put this in perspective.</p><p>Before AI, my options for building Temporal were: find a technical co-founder (months of searching, equity dilution), hire a freelance developer ($5K+/month, slower iteration, communication overhead), or learn to code myself (6&#8211;12 months before I could build anything real).</p><p>With AI, the math looks like this:</p><ul><li><p><strong>$150/month</strong> in AI tools</p></li><li><p><strong>2 months</strong> from idea to working product</p></li><li><p><strong>0 team members</strong> to manage</p></li><li><p><strong>5 AM to 10 AM</strong> daily&#8202;&#8212;&#8202;my main job stays untouched</p></li></ul><p>This isn&#8217;t theoretical. Temporal.day is live. People use it. It has AI auto-scheduling, natural language task input, Google Calendar sync, and a payment system.</p><p>Am I saying the code is as clean as what a senior engineer would write? No. Am I saying the product is perfect? Absolutely not&#8202;&#8212;&#8202;there&#8217;s plenty to improve.</p><p>But it exists. It works. Users interact with it daily. And it shipped at a fraction of the cost and time that any traditional approach would require.</p><p>That&#8217;s the real insight: AI didn&#8217;t make me a developer. It made development accessible to someone with strong product instincts and no engineering skills. The barrier to building products didn&#8217;t disappear&#8202;&#8212;&#8202;it shifted. It used to be &#8220;can you code?&#8221; Now it&#8217;s &#8220;do you know what to build and why?&#8221;</p><p>That second question? That&#8217;s where 5 years of PM experience actually matters.</p><div><hr></div><h3>Start Here</h3><p>If this resonated and you want to try building your own AI-first workflow, here&#8217;s what I&#8217;d suggest:</p><p>Pick one tool and go deep. Don&#8217;t sign up for 10 things. Start with Claude or ChatGPT and use it for everything for two weeks. Research, writing, planning, analysis. Get a feel for what it&#8217;s good at and where it breaks down.</p><p>Build a real workflow, not a toy demo. Don&#8217;t just &#8220;try AI.&#8221; Apply it to an actual problem you&#8217;re solving at work. A competitive analysis. A project plan. A first draft of something real. The value clicks when the stakes are real.</p><p>Keep a &#8220;human only&#8221; list. Decide upfront which decisions stay with you. For me, that&#8217;s product vision, final testing, and distribution strategy. Your list will be different. But have one. Otherwise AI will slowly take over the thinking you should be doing yourself.</p><p>Start before you&#8217;re ready. Two months ago, I had an idea and zero technical ability. If I&#8217;d waited until I &#8220;knew enough&#8221; to start, I&#8217;d still be waiting. The tools are here. The gap between &#8220;I have an idea&#8221; and &#8220;I have a product&#8221; has never been smaller.</p><p>I&#8217;m building <a href="https://temporal.day/">Temporal.day</a> in public&#8202;&#8212;&#8202;sharing every decision, metric, and mistake along the way. If you&#8217;re on this path too, come say hi <a href="https://twitter.com/mktpavlenko">@mktpavlenko</a>.</p><p>The best time to start building was yesterday. The second best time is this morning at 5 AM with a cup of coffee and Claude open on your screen.</p>]]></content:encoded></item></channel></rss>