Antigravity continues to impress me. I've been so encouraged by how much I can get done that I've kept pushing the system with new updates. The updates I've made:
- A chorus of LLMs: one way to "fix" the deficiencies of LLMs is to query multiple LLM providers (e.g. OpenAI, Gemini, Anthropic), then have another LLM summarize all the answers. The thinking goes: any one of them might hallucinate, but the chances of all of them hallucinating in the same wrong direction are low. (It's similar to how blockchain consensus works!) I created a tool that queries a few LLMs and then uses another to summarize them all -- you can play with this tool with any question you might have.
- PDF Parser: I ultimately want to tackle the problem of processing and summarizing a bunch of files sitting in a folder. One use case: "read" through a term sheet and pull out all the key terms. (I would love to build up to "find me the bad/anomalous parts of the contract," but for now "get me the terms" would suffice!) I built this out, using OpenRouter to read the PDFs (with mistral-ocr, which costs $2 per 1,000 pages). I then added an LLM call that says, "if this looks like a term sheet, then pull out XYZ fields." It's just one use case for now, but it seems to work fairly well. (For testing, I used the Buffer term sheet found here.)
- Competitors and public comps: I applied the "chorus of LLMs" to help find me competitors and public comps. My initial thought was to build this agent the way I would do the task myself: run a Google query, read the pages, then make sense of it. Instead, I tried using LLMs by themselves, hoping that throwing five different LLMs at the question would provide good enough coverage of the internet. It's worked fairly well!
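The chorus pattern above can be sketched in a few lines of Python. Note this is a sketch, not the app's actual code: `providers` and `summarizer` are hypothetical stand-ins for real SDK calls (e.g. the OpenAI, Gemini, and Anthropic clients).

```python
def chorus(question, providers, summarizer):
    """Ask every provider the same question, then have one more model
    condense the full set of answers into a consensus response.

    providers:  dict mapping a provider name to a callable that takes a
                question string and returns an answer string
    summarizer: callable that takes the combined prompt and returns the
                final summary
    """
    # Fan out: collect one answer per provider.
    answers = {name: ask(question) for name, ask in providers.items()}

    # Fan in: hand all answers to a summarizer model. Points where the
    # providers agree are less likely to be a single model's hallucination.
    prompt = "Summarize these answers, noting agreements and conflicts:\n"
    for name, answer in answers.items():
        prompt += f"\n[{name}]: {answer}"
    return summarizer(prompt)
```

In a real deployment, each entry in `providers` would wrap a different vendor's chat-completion call, and `summarizer` would be one more such call.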
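The term-sheet flow is a two-step pipeline: OCR the PDF, then conditionally extract fields. A minimal sketch, where `ocr_pdf` and `call_llm` are hypothetical stand-ins for the OpenRouter mistral-ocr endpoint and a chat model, and the field list is illustrative rather than the app's actual one:

```python
EXTRACTION_PROMPT = """If the document below looks like a term sheet,
extract these fields as JSON: valuation, round size, liquidation
preference, board composition. If it is not a term sheet, reply "skip".

Document:
{text}"""

def extract_terms(pdf_path, ocr_pdf, call_llm):
    """OCR a PDF, then ask an LLM for term-sheet fields (or "skip")."""
    text = ocr_pdf(pdf_path)  # e.g. mistral-ocr via OpenRouter
    return call_llm(EXTRACTION_PROMPT.format(text=text))
```

The "if this looks like a term sheet" guard lives in the prompt itself, so the same call works harmlessly on non-term-sheet files in the folder.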
Lots of other doodads added on -- ability to download the report, etc. And, not too bad for a few hours of work!
So far, the app is a ChatGPT (plus other LLMs) wrapper, but here's the value this app adds above ChatGPT alone:
- The chorus architecture adds a legitimate benefit (and one that would be hard to do without this app!)
- The agent adds a lot of extra context to the LLM query -- e.g. "don't hallucinate, return sources, etc." This is built into the series of LLM queries automatically. (In other words, great "prompt engineering"!)
- Configurability -- I can tweak this to deliver the information to me however I want it! With the death of SaaS, this is the dream -- being able to build tools exactly as you want them!
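The automatic "extra context" step above amounts to wrapping every user question in standing instructions before it reaches any provider. A sketch (the exact rules here are illustrative, not the app's real prompt):

```python
# Standing instructions prepended to every query automatically -- the
# "great prompt engineering" the user never has to retype.
SYSTEM_RULES = (
    "Answer only from information you can verify. "
    "Cite sources for every factual claim. "
    "If you are unsure, say so rather than guessing."
)

def build_prompt(user_question):
    """Wrap a raw question with the standing rules."""
    return f"{SYSTEM_RULES}\n\nQuestion: {user_question}"
```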
Where this app will hit walls:
- Analyzing tables/graphs -- say we upload a board deck. It will struggle with tables, figures, and graphs! (This is a known weakness of LLMs -- as they say, a picture is worth a thousand words.) There may be some tricks with tables and graphs, and that'll get me more into the weeds of LLM/AI capabilities. (A similar need: trying to figure out if a term sheet has been signed or not, which can be important if we have multiple drafts in a folder!)
- Processing very long documents -- dealing with very long documents (e.g. a 10-K) is challenging. This is what things like RAG were built for -- but that adds a-whole-nother dimension of complexity.
- Finding good data -- this is a forever challenge! For example, I typically get my private market data from Pitchbook, but I couldn't get the ChatGPT API to integrate with Pitchbook. Crunchbase (via the chorus of LLMs) seemed to return pretty good information, but having good, reliable data will continue to be a need throughout the AI era.
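A full RAG pipeline also needs embeddings and retrieval, but its first step -- splitting a long document like a 10-K into overlapping, context-window-sized chunks -- is simple to sketch (the chunk sizes here are arbitrary):

```python
def chunk(text, size=2000, overlap=200):
    """Split a long document into overlapping chunks that each fit in a
    model's context window. A RAG pipeline would then embed each chunk
    and retrieve only the relevant ones for a given question."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        # Overlap consecutive chunks so a sentence straddling a boundary
        # appears whole in at least one chunk.
        start += size - overlap
    return chunks
```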