I've worked on a few deals in the past month, and I now finally have a lull to think a little more about how to use AI to automate/replicate some of the work I've been doing. I've been thinking a lot about the types of work that I do. Investment memo work falls into a few buckets:
- Information distillation - given a data room of information, distill it down into the main points
- Information search and aggregation - given a topic, try to find as many pieces of info on it and aggregate them together
- Information verification - given a piece of information, verify whether it's correct
- Financial analysis - digging into the historical financials as well as checking the assumptions in the financial projections
- Anomaly detection - given a document or set of data, see if any piece of it is off/wrong
- Interviews with founders, experts, customers, etc. - requires a lot more EQ and conversational skills, but also preparation to tee up the highest impact questions for the investment (and the best fitting questions for the interviewee)
This may all feel a little abstract, but teasing out these modes of investment work is how a developer would approach a broader fix to the problem. And, more importantly: AI is going to be used in a different way for each of these modes of work. I think LLMs are now strong enough that a lot of these problems can be solved through clever infrastructure (i.e. not just throwing everything into a chatbot and hoping it does everything we want).
Some examples of the investment memo process that are particularly time-consuming:
- Industry background - a lot of "information search and aggregation" -- Googling, ChatGPT'ing for trusted sources, listening to podcasts, reading consultant market maps, etc. Requires a lot of exploration and breadth; feels a little like climbing a mountain, and the surrounding landscape becomes clearer bit by bit.
- Competitors - "information search and aggregation" -- Google/ChatGPT for competitors, then look up info on each of them (market niche, traction, last fundraise, etc.)
- Public comps and recent M&A - "information search and aggregation" -- Google/ChatGPT for this info, then look each up (e.g. in Bloomberg/Yahoo Finance for public comps, internet verification for recent M&A)
- Company (e.g. team, product, GTM, moats) - "information distillation" -- taking the whole data room and compressing it into a few pages of info. There's additional critical thinking (e.g. does their GTM actually make sense? are the moats really moats?), but otherwise a lot of depth here.
- Term sheet review - most terms are typically standard (or within reason), so this is both "information distillation" (taking a 30-page legal document and boiling it down to a few key points) and "anomaly detection" (seeing if there are any particularly egregious or atypical terms).
- Traction/Financials - "Financial analysis", but then a lot of critical thinking and gut-checking on top of that.
- Investment benefits and risks - this feels like it should be a lot about "gut" ... but I think an LLM could surface many good benefits/risks (and a human could then review and prioritize them)
I've been working on a verifiable, AI-driven "information distillation" process. Given a data room of information (or a pile of industry reports from McKinsey plus a few internet articles I dug up), can I get the AI to synthesize the info rooted in the files I give it?
Part of me feels a little foolish for doing this: the Googles and Anthropics of the world are already doing this, and they have teams that are much smarter than me! However, I think the tech giants are solving a fundamentally different problem. Their chatbots are generally interested in: "can I answer any question given to me well?" and "if the user uploads documents, can I answer any question they ask about them?" The challenges are at least twofold: (1) the chatbot has to retrieve info about any topic I might throw at it, and (2) it has to be able to answer any question I ask.
My little tool is much simpler: given documents, read the info and file it away neatly into the folders I specify. The process looks something like this:
- Go through each document and extract information relevant to different buckets I give it (e.g. "company background," "team," "moats," etc.)
- After processing all the documents, take one final pass to synthesize the extracted data together
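The two passes above can be sketched in a few dozen lines. This is a minimal illustration, not the actual tool: `call_llm` is a hypothetical stand-in for a real model API call, and the bucket names and prompts are placeholder assumptions; the shape of the pipeline (extract per document, then synthesize across documents) is the point.

```python
# Sketch of the two-pass "information distillation" pipeline.
# `call_llm` is a placeholder for a real model call (an API client
# would go here); everything else is the pipeline shape itself.

BUCKETS = ["company background", "team", "moats"]  # example schema

def call_llm(prompt: str) -> str:
    # Placeholder: echo a truncated prompt so the sketch runs offline.
    return f"[model output for: {prompt[:40]}...]"

def extract(doc_text: str, buckets: list[str]) -> dict[str, str]:
    """Pass 1: pull bucket-relevant facts out of a single document."""
    notes = {}
    for bucket in buckets:
        notes[bucket] = call_llm(
            f"Extract only facts relevant to '{bucket}' from the document "
            f"below, citing the exact sentences used.\n\n{doc_text}"
        )
    return notes

def synthesize(per_doc_notes: list[dict[str, str]],
               buckets: list[str]) -> dict[str, str]:
    """Pass 2: merge the per-document notes into one write-up per bucket."""
    memo = {}
    for bucket in buckets:
        combined = "\n".join(notes[bucket] for notes in per_doc_notes)
        memo[bucket] = call_llm(
            f"Synthesize these '{bucket}' notes into a few paragraphs, "
            f"keeping citations intact:\n\n{combined}"
        )
    return memo
```

Because every extraction is tied to a named bucket and asks for cited sentences, each intermediate note can be checked against its source document before the synthesis pass runs.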
Advantages:
- You can apply this process to any document in the data room (just need to dial in the "folders" you file info into)
- The extracted info and process are verifiable (unlike a typical chatbot answer, which is a black box)
- You can apply this process to many use cases -- just swap out the extraction schema and synthesis prompt
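To make the "swap out the schema" point concrete, here's a hypothetical illustration of how different use cases could just be different bucket lists and synthesis prompts over the same two-pass pipeline; the names and prompt text are made up for the example.

```python
# Hypothetical schemas: same pipeline underneath, different "folders."
SCHEMAS = {
    "data_room": {
        "buckets": ["company background", "team", "moats"],
        "synthesis_prompt": "Write an investment-memo section per bucket.",
    },
    "term_sheet": {
        "buckets": ["economics", "control", "atypical terms"],
        "synthesis_prompt": "Flag anything non-standard and explain why.",
    },
}

def schema_for(use_case: str) -> dict:
    # Look up the bucket list and synthesis prompt for a given use case.
    return SCHEMAS[use_case]
```

Note the "atypical terms" bucket: the anomaly-detection mode of work becomes just another folder to file information into.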
I'll have another post with more details, once I tidy up the development.