Monday, May 25, 2026

Progress on CRM

I wrote a few weeks ago that I had started to build a CRM tool. I've made pretty good progress, but I've realized that it's less about the CRM itself and more about building the tools that I want to see, with AI/document processing at their core. I've been reminded that I'm opinionated about making tools as easy to use as possible, while also insisting on clean audit trails, ease of troubleshooting, clean data structures, and modular code.

What's done so far:

  • Core data models -- I've built out the core data models and data tables in Supabase. Users can manually add companies, investment firms, and people to the system.
  • Document uploads -- Users can upload documents, which can then create new companies, etc. in the system and link them automatically. I think this is the holy grail: just upload documents and have the system update your internal notes!
  • Audit trail -- I have a good audit trail mechanism built out, telling you how a company was created (manually vs. automatically), as well as whether individual fields were updated. It may seem like overkill now, but I think it's a must longer-term (and something we can build on). Good foundations!
  • Authentication -- Started but not quite working. Google OAuth login is enabled through Supabase, but something isn't 100% working yet.
  • Designed to be plug-and-play -- I've designed this to be plug-and-play for any investment firm. All you'd need to do is plug in a few of the firm's own settings (e.g. Supabase information, an OpenRouter API key, and a few more things), and you'd be off and running! (This could also be deployed in a Docker container and shipped to multiple customers!)
  • Data pipelines -- Much of the data gets piped in from somewhere, on some sort of cadence. Examples: (1) some VCs look at a16z's website weekly to add to their sourcing queue, (2) CT Innovations could pull companies from the CT business registry weekly, (3) newsletters delivered to my email might contain start-ups I should check out. I have a simple data structure to schedule these pipelines, which will need to be extended as more use cases come up.
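
As a rough illustration of the pipeline-scheduling idea in the last bullet, here is a minimal Python sketch. It is not the actual schema: the `Pipeline` class, its field names, and the `due_pipelines` helper are all hypothetical, and the daily cadence for newsletters is an assumption (the weekly cadences come from the examples above).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Pipeline:
    """One scheduled data source (hypothetical shape, not the real table)."""
    name: str
    source: str                          # where the data comes from
    cadence: timedelta                   # how often the pipeline should run
    last_run: Optional[datetime] = None  # None means it has never run

    def is_due(self, now: datetime) -> bool:
        # A pipeline that has never run is always due.
        if self.last_run is None:
            return True
        return now - self.last_run >= self.cadence


def due_pipelines(pipelines: List[Pipeline], now: datetime) -> List[Pipeline]:
    """Return the pipelines whose cadence has elapsed as of `now`."""
    return [p for p in pipelines if p.is_due(now)]


now = datetime(2026, 5, 25, 9, 0)
pipelines = [
    # Weekly, last ran 8 days ago: due.
    Pipeline("a16z_portfolio", "a16z website", timedelta(weeks=1),
             last_run=now - timedelta(days=8)),
    # Weekly, last ran 2 days ago: not due yet.
    Pipeline("ct_registry", "CT business registry", timedelta(weeks=1),
             last_run=now - timedelta(days=2)),
    # Assumed daily cadence, never run: due.
    Pipeline("newsletters", "email newsletters", timedelta(days=1)),
]
due_names = [p.name for p in due_pipelines(pipelines, now)]
print(due_names)
```

Keeping the schedule as plain data (name, source, cadence, last run) means a new use case is just another row, which fits the plug-and-play goal above.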

What's next:

  • Vision of email-driven + automated workflows -- From what I've seen, investment workflows and communications are driven largely by email. Therefore, a good CRM should integrate into existing workflows, while having the capacity to augment them. What this looks like: the system tracks email flow, downloads documents (and automatically uploads them to the CRM), and interprets where we are in the process. A concrete example: if we're doing diligence on a fund, the CRM tool should (1) automatically download the pitch deck and supporting docs, (2) interpret them, (3) interpret and import important dates (e.g. expected closing date or other funds circled), and (4) ensure we follow up (either with a rejection or a request for updates, if it's been a few months since last communication). The system should drive the investor to do this, without having to configure anything additional in the system.
  • More granular company/investor data -- Next up from the data model side will be adding in things like revenue (for a single company), funds (for investment firms), and other data (portco allocation, percent ownership, cash flows, etc.). The challenge is finding a good, flexible data structure that can account for all the pieces of info you might want to store discretely.
  • Showing provenance of data -- Data can come from multiple sources, e.g. public/published articles, hearsay from other investors, and source documents from the company itself. The system should be smart enough to track the data provenance -- revenue numbers from the company itself are much better than a guesstimate from a published article, which is in turn much better than a rough estimate. This distinction is crucial for contextualizing the system data (and thus for downstream AI processing).
  • Handling Excel and better text extraction -- Right now, the text extraction (i.e. PDF -> text) is pretty good but not bulletproof, and it will need to be close to bulletproof before we can trust the numbers. Likewise, we don't handle Excel yet. (Excel documents are universally tricky for LLMs to handle.) Both are fixable but more technical problems; better to get a rough version 1 of the entire system before homing in on these challenging minutiae.
  • Integration with other vendors -- So far, I have integration with Google OAuth for login. Perhaps later I'll add integrations with other parts of the investment stack (e.g. CRMs, knowledge management tools, financial tools) so that this tool can orchestrate everything together.
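
The provenance idea from the list above can be sketched as a small data structure. This is a minimal Python sketch under stated assumptions, not the real data model: the `Provenance` enum, the `DataPoint` class, and `best_value` are hypothetical names, the ordering follows the ranking described in the provenance bullet, and the revenue figures are made up purely for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class Provenance(IntEnum):
    """Higher value = more trustworthy (ordering taken from the text above)."""
    ROUGH_ESTIMATE = 1     # a rough estimate
    PUBLISHED_ARTICLE = 2  # a guesstimate from a published article
    COMPANY_DOCUMENT = 3   # source documents from the company itself


@dataclass(frozen=True)
class DataPoint:
    """One stored value plus where it came from (hypothetical shape)."""
    field_name: str
    value: float
    provenance: Provenance
    source_note: str


def best_value(points: List[DataPoint], field_name: str) -> Optional[DataPoint]:
    """Pick the most trustworthy data point for a field, or None if absent."""
    candidates = [p for p in points if p.field_name == field_name]
    return max(candidates, key=lambda p: p.provenance, default=None)


# Illustrative, made-up numbers for a single company's revenue.
points = [
    DataPoint("revenue", 4_000_000, Provenance.PUBLISHED_ARTICLE, "press article"),
    DataPoint("revenue", 5_200_000, Provenance.COMPANY_DOCUMENT, "pitch deck"),
    DataPoint("revenue", 3_000_000, Provenance.ROUGH_ESTIMATE, "rough estimate"),
]
best = best_value(points, "revenue")
print(best.value, best.provenance.name)
```

Storing every data point with its provenance, rather than overwriting a single field, keeps the lower-quality values available for the audit trail while letting downstream AI processing prefer the best source.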

What I've learned/observed about AI and sustained human superiority:

  • Humans still need to make decisions -- I can use AI to help me build the data models and SQL tables, but I (as the human) still need to make decisions on how simple or complex the data models are. LLMs often over-engineer, so it's still up to me to set a good foundation by building a simple but strong core data model. (These data structure tradeoffs are nearly impossible for AI to handle -- it doesn't know exactly what I want to build next, so it doesn't know what foundation to lay!)
    • A silly example: if we had robots that could build a house, we would still need expert humans to help guide us through the trade-offs (e.g. how many bathrooms, what materials to use, cost vs. quality of material, which type of siding, etc. etc.). It's no different when engineering code.
  • AI lacks taste -- For some decisions, like on UI layout, the AI adds a bunch of crud which makes the app look/feel distasteful. I'm no designer, but I think on many decisions, I have taste that AI will never quite be able to capture. 
  • Humans know when to quit (and where to be creative) -- My AI coding agent was dogged in trying to fix a particular issue -- it tried to brute-force its way through the problem, without success. After about 20 minutes (and a few failed updates), I pushed it to try a different solution. (Technical detail: I asked it to make a synchronous call asynchronous, and to update/centralize code to make this work.) The new solution worked well. It's a case where a human (me) knew when to say: "This is taking longer than it should, and I have other good ideas on how to fix this that match my longer-term vision better." This is also a case where knowing how the underlying system works is crucial -- if this were vibe-coded, there'd be no way to know the root cause of the problem, or approaches to fixing it.
  • AI coding agents are great at execution -- AI coding tools are very good at writing code. The more prescriptive you can be, the better the execution.

