Mike Yanagisawa: May 2026

Monday, May 25, 2026

Progress on CRM

I wrote a few weeks ago that I had started to build a CRM tool. I've made pretty good progress, but I've realized that it's less about the CRM itself and more about building the tools that I want to see, with AI/document processing at its core. I've come to remember that I'm opinionated on making the tools as easy to use as possible, but also insistent on clean audit trails and ease of troubleshooting, as well as clean data structures and modular code.

What's done so far:

Core data models -- I've built out the core data models and data tables in Supabase. Users can manually add companies, investment firms, and people to the system.
Document uploads -- Users can upload documents -- which can then create new companies, etc. in the system and link them automatically. I think this is the holy grail -- just upload documents and have it update your internal notes!
Audit trail -- I have a good audit trail mechanism built out, telling you how a company was created (manually vs. automatically), as well as whether individual fields were updated. It may seem like overkill now, but I think it's a must longer-term (and something we can build on). Good foundations!
Authentication -- started but not quite working. Google OAuth login is enabled through Supabase, but something still isn't 100% working.
Designed to be plug-and-play -- I've designed this to be plug-and-play for any investment firm. All you'd need to do is plug in a few of the company's own things (e.g. Supabase information, OpenRouter API key, and a few more things), and you'd be off and running! (This could also be deployed in a Docker container and shipped to multiple customers!)
Data pipelines -- Much of the data gets piped in from somewhere, on some sort of cadence. Examples: (1) some VCs look at a16z's website weekly to add to their sourcing queue, (2) CT Innovations could pull companies from the CT business registry weekly, (3) newsletters delivered to my email might contain start-ups I should check out. I have a simple data structure to schedule these pipelines, which will need to be built on with more use cases.
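A minimal sketch of the audit trail idea from the list above, in Python rather than the actual Supabase tables. Every name here (`AuditEvent`, `create_company`, `update_company`, the `"manual"`/`"automatic"` labels) is a hypothetical stand-in, not the real schema. It records how a company was created and logs one event per field that actually changes.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class AuditEvent:
    """One row of a hypothetical audit log."""
    entity_id: str
    action: str                       # "created" or "updated"
    source: str                       # "manual" (a user) or "automatic" (upload/pipeline)
    field_name: Optional[str] = None  # set only for field-level updates
    old_value: Any = None
    new_value: Any = None
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class Company:
    id: str
    fields: dict


# In a real system this would be a database table; a list keeps the sketch short.
audit_log: list[AuditEvent] = []


def create_company(company_id: str, fields: dict, source: str) -> Company:
    """Create a company and record whether it was created manually or automatically."""
    audit_log.append(AuditEvent(company_id, "created", source))
    return Company(company_id, dict(fields))


def update_company(company: Company, changes: dict, source: str) -> None:
    """Apply changes, logging one event for each field whose value actually changes."""
    for name, new_value in changes.items():
        old_value = company.fields.get(name)
        if old_value != new_value:
            audit_log.append(
                AuditEvent(company.id, "updated", source, name, old_value, new_value)
            )
            company.fields[name] = new_value
```

Calling `create_company("c1", {"name": "Acme"}, "manual")` and then `update_company(...)` with a new field from a document upload would leave two events in `audit_log`: one creation and one field-level update, each tagged with its source.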

What's next:

Vision of email-driven + automated workflows -- From what I've seen, investment workflows and communications are driven largely by email. Therefore, a good CRM should integrate into existing workflows, while having the capacity to augment them. What this looks like: the system tracks email flow, downloads documents (and automatically uploads them to the CRM), and the CRM interprets where we are in the process. A concrete example: if we're doing diligence on a fund, the CRM tool should (1) automatically download the pitch deck and supporting docs, (2) interpret them, (3) interpret and import important dates (e.g. expected closing date or other funds circled), and (4) ensure we follow up (either with a rejection or a request for updates, if it's been a few months since last communication). The system should drive the investor to do this, without the investor having to configure anything additional in the system.
More granular company/investor data -- Next up from the data model side will be adding in things like revenue (for a single company), funds (for investment firms), and other data (portco allocation, percent ownership, cash flows, etc.). The challenge is finding a good, flexible data structure that can account for all the pieces of info you might want to store discretely.
Showing provenance of data -- Data can come from multiple sources, e.g. public/published articles, hearsay from other investors, and source documents from the company itself. The system should be smart enough to track data provenance -- revenue numbers from the company itself are much better than a guesstimate from a published article, which is in turn much better than a rough estimate. This distinction is crucial for contextualizing the system data (and thus for downstream AI processing).
Handling Excel and better text extraction -- Right now, the text extraction (i.e. PDF -> text) is pretty good but not bulletproof, and getting it bulletproof will be important for trusting the numbers. Likewise, we don't handle Excel yet. (Excel documents are universally tricky for LLMs to handle.) Both are fixable but more technical problems; better to get a rough version 1 of the entire system before homing in on these challenging minutiae.
Integration with other vendors -- So far, I have integration with Google OAuth for login. Next could be integrations with other parts of the investment stack (e.g. CRMs, knowledge management tools, financial tools), allowing this tool to orchestrate everything together.
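The provenance ranking described above could be sketched roughly as follows. This is an illustration under assumptions, not the system's actual design: the three tiers mirror the ordering in the text (company documents beat published articles, which beat rough estimates), and the names `Provenance`, `DataPoint`, and `best_value` are made up for the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Provenance(IntEnum):
    """Hypothetical reliability tiers; a higher value means a more trusted source."""
    ROUGH_ESTIMATE = 1
    PUBLISHED_ARTICLE = 2
    COMPANY_DOCUMENT = 3


@dataclass(frozen=True)
class DataPoint:
    company: str
    metric: str            # e.g. "revenue_2025"
    value: float
    provenance: Provenance
    source_note: str       # free-text pointer to where the number came from


def best_value(points: list[DataPoint], company: str, metric: str) -> Optional[DataPoint]:
    """Return the most trusted data point for a company/metric, or None if there is none."""
    candidates = [p for p in points if p.company == company and p.metric == metric]
    return max(candidates, key=lambda p: p.provenance, default=None)
```

Keeping every data point (rather than overwriting with the latest number) means the lower-quality figures stay available as context, while downstream processing can default to the most trusted one.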

What I've learned/observed about AI and sustained human superiority:

Humans still need to make decisions -- I can use AI to help me build the data models and SQL tables, but I (as the human) still need to make decisions on how simple or complex the data models are. LLMs often over-engineer, so it's still up to me to set a good foundation by building a simple but strong core data model. (These data structure tradeoffs are nearly impossible for AI to handle -- it doesn't know exactly what I want to build next, so it doesn't know what foundation to lay!)

A silly example: if we had robots that could build a house, we would still need expert humans to help guide us through the trade-offs (e.g. how many bathrooms, what materials to use, cost vs. quality of material, which type of siding, etc.). It's no different when engineering code.

AI lacks taste -- For some decisions, like on UI layout, the AI adds a bunch of crud which makes the app look/feel distasteful. I'm no designer, but I think on many decisions, I have taste that AI will never quite be able to capture.
Humans know when to quit (and where to be creative) -- My AI coding agent was dogged in trying to fix a particular issue -- it tried to brute force its way through the problem, without success. After about 20 minutes (and a few failed updates), I pushed it to try a different solution. (Technical detail: I asked it to make a synchronous call asynchronous, and update/centralize code to make this work.) The new solution worked well. It's a case where a human (me) knew when to say: "This is taking longer than it should, and I have other good ideas on how to fix this that match my longer-term vision better." This is also a case where knowing how the underlying system works is crucial -- if this were vibe-coded, there'd be no way to know the root cause of the problem, or approaches to fixing it.
AI coding agents are great at execution -- AI coding tools are very good at writing code. The more prescriptive you can be, the better the execution.

Sunday, May 24, 2026

Claude Co-Work vs. In-House Tools

Just a month or two ago, the tech world boasted about how many tokens it burned (i.e. how large its operating expense was). That party didn't last long. The consensus now is that for many jobs, hiring a person is easier/cheaper than hiring an AI agent.

This has been my fear with agents all along: (1) what kind of agent truly needs to run automatically 24/7, and (2) how many tokens would it eat up needlessly? Side note: I got burned with cloud hosting on both Snowflake and Databricks for a personal project -- billed for cloud/GPU capacity that I wasn't using at all -- so I'm more sensitive than most to trusting tech companies with my credit card.

I've also consistently heard great things about Claude Co-Work. I've been tinkering with building my own tools for a while, and I had a moment of panic: is there any use in what I'm building, or am I reproducing things that Claude (and thousands of other smart developers) are already building?

ChatGPT helped me lay out where the tradeoffs are. I put them below, as a reminder to myself that in-house tools -- ones whose workings you understand, that link to the data you want, etc. -- have lasting value. Maybe not as a venture-backable company, but real time-saving workflow value.

Anyways, here are the top dimensions where building your own software really shines over what Claude (or similar) offers.

Dimension	Claude / Chat UI	Custom Pipeline
Time-to-value	Extremely fast	Slower
Repeatability	Weak/moderate	Strong
Flexibility	Limited	Full control
Structured data extraction	Okay	Excellent
Audit trails	Weak	Strong
Multi-stage workflows	Awkward	Natural
Integration with database	Limited	Native
Cost at scale	Can become expensive	Often cheaper at volume
Vendor lock-in	High	Low
Human-in-loop review	Limited	Fully customizable

Tuesday, May 12, 2026

Intuition vs Process

In the investing world, there's a tension between intuition and process.

Some of the "greats" that I've met seem to eschew frameworks or process. The thinking goes: process makes you lazy -- you think better if you start from "first principles" every time. On the face of it, it feels like it should be correct; great investors are great because they've had to teach themselves from the ground up. What makes an investor great is their ability to think and diligence.

It feels like some investment firms (especially smaller ones) are designed with the solo investor in mind. There is no firm standardized process, no standardized sourcing pipeline, no general training. To do so would constrain the investor, who needs to use their gut to make their decisions (so the thinking goes). A framework would stand in the way of intuition.

However ... when you assess funds as an LP, a lot of focus is on process: can we trust this manager to produce alpha, and do they have a replicable process? The way larger investment consultants assess this sits at the other extreme. The investment framework is truly a process: a 200-plus-line Excel spreadsheet of investment criteria, which is then synthesized into a memo. The intuition purists would say: yes, you've checked every box, but you've missed some je ne sais quoi about the investment, something not in your checklist that makes it stand out (or makes it fall apart). There are parallels to Atul Gawande's The Checklist Manifesto, where checklists improve surgery (and flight safety) despite being initially despised by surgeons (and pilots, etc.).

All this to say: the investment firms that will be most successful with AI will be those that translate process into software-codified systems -- i.e. investment in AI will be all about process. One example: right now, sourcing at some firms feels spontaneous -- every investor has their own set of connections, resources, rules, etc. Some of this could/should be codified in agentic workflows; I've started to collect a list of "high quality" sources (e.g. a16z speedrun, etc.). Another example: we have a light investment memo template, but many questions are asked beyond what the template contains. These questions should ultimately be subsections of the template, which can then serve as a yardstick by which to measure investment diligence. This means updating a core investment framework template, to ensure that those questions are answered every time.

On one hand, I hate it -- filling out a 200-question survey (and adding questions to it) seems to take the joy out of reading, learning, and investing. On the other hand, it's a bit embarrassing to miss certain pieces of diligence over and over again. And as a newer investor, it's a bit bewildering to be given tens of documents to read, without a rough mental framework in mind.

Ultimately, I'm coming to believe that "investment experience" might just mean "I've built a really solid investment framework in my head." It means you know where to look first when doing diligence, or what the top questions to ask are, because you can tell which sections of the framework are raising red flags. So why not hand out this framework to earlier-career investors? I think it's ultimately what we'll be training investment LLMs on: a long-overdue reckoning with the fact that investment process is essential.

Sunday, May 10, 2026

Beginning a larger project: creating a CRM

I've been putting off creating a CRM tool, but I think it's about time. Maybe CRM is the wrong word: I need a tool to store info about all companies, investors, people, etc. to be able to build on down the road. The tipping point: I want to create a sustainable sourcing framework that scrapes data from X source weekly. (Again, I am too "lazy" to do this weekly myself -- I'd rather list all the sources I want to pull from, and have it run automatically.)

Part of this project is to see how long it takes to build something like this. I know a lot of folks are building these kinds of tools in Notion, but who wants to build a "Notion database" when you can build a more robust SQL one? I also think this sort of tooling is going to be critical scaffolding to build other AI tools on top of (e.g. document processing).

My full vision makes this a little clearer:

1. Create core data models. Create data models for "organizations" (companies and investors), funds, people, programs (e.g. a16z speedrun, DARPA) and cohorts, as well as all the relationships between them (company-to-person, investor-to-company, program-to-company, etc.). Allow users to create/update these. All built in Supabase (backend), frontend in Streamlit (for now).

2. Create data pipelines. Populate companies from websites. For example, a16z speedrun cohort 6 was recent; the pipeline should be able to pull from their website and pull companies and relationships.

3. Audit trail. See who updated the companies (pipeline? user?)

4. Add authentication, other usability tools. Let users log in via Google, email users when things are updated in the pipeline, make the scheduler work.

5. Add in investment details and company details. Right now we have the basics (basic company demographics, binary on whether an investment was made). Expand out to support a fuller investment structure (e.g. X invested $Y in Z in Series A) as well as more robust company reporting ($X in revenue in 2025, etc.).

6. Add database partition? Find the best way to make the database specific to a single investor, so that you can layer on multiple companies who can't see each other's data? (And try to stay in Streamlit for simplicity?)

7. Add in document processing. This is where it gets investor-specific. Process documents to extract revenue, investment #s, etc.

8. Add in other fun workflows? Allow users to connect their emails so you can see if you've emailed X company? Layer in automated meeting notes?

Hoping to be able to get through #1-3 this weekend; that would be a great proof of concept. The hard parts: database design (I need a well-designed database to withstand all the future additions I'm planning!), knowing how to layer in the additions, knowing which order to tackle these in, and knowing what the benefits/weaknesses of the tools (like Streamlit, SQL, scraping tools, etc.) are. I'm also hoping Google Antigravity is able to suss out what I want it to build from my instructions.
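As a rough illustration of step 2 and the scheduler in step 4, a pipeline schedule could be as small as the sketch below. This is a guess at a shape, not the actual design: `PipelineSchedule`, `is_due`, and `due_pipelines` are hypothetical names, and any source URL passed in would be a placeholder.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PipelineSchedule:
    """A hypothetical schedule row: what to pull, and how often."""
    name: str
    source_url: str                      # placeholder; the real sources live elsewhere
    every: timedelta                     # cadence, e.g. timedelta(weeks=1)
    last_run: Optional[datetime] = None  # None means the pipeline has never run

    def is_due(self, now: datetime) -> bool:
        """A pipeline is due if it has never run or its cadence has elapsed."""
        return self.last_run is None or now - self.last_run >= self.every


def due_pipelines(schedules: list[PipelineSchedule], now: datetime) -> list[PipelineSchedule]:
    """Return the pipelines that should run now, in their original order."""
    return [s for s in schedules if s.is_due(now)]
```

A scheduler job would then call `due_pipelines(all_schedules, datetime.now())` on some interval, run each result, and set its `last_run`. Passing `now` in explicitly keeps the due-check easy to test.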
