Why Well-Funded Legal AI Tools May Not Deliver Better Results

Do well-funded AI tools always outperform the competition?


By Anna Guo

[Figure: output usefulness ranking of AI legal tools]

A few people shared the Harvey ex-employee Reddit thread with me today. Here are some of my thoughts after reading their comments about Harvey’s product quality and comparisons to general-purpose AI tools like ChatGPT.

First, a caveat: no one person’s account tells the whole story. Their perspective may be biased, and the observations may not be unique to Harvey.

However, some of what was discussed regarding product performance does resonate, and I can add nuance based on our latest contract drafting benchmark research and on conversations with users, buyers, and vendors who help sell large, well-funded enterprise legal solutions.

In our latest benchmarking research on contract drafting, we included one anonymized “big player” (labeled Anon) to see how larger, well-funded teams stack up against newer entrants.

What we’re seeing in the legal tech market:

  • Funding ≠ performance. In our contract drafting benchmark, an anonymized “big player” (Anon) ranked last in usefulness and only mid-pack in reliability (56.7%).
  • Some well-funded vendors focus more on capturing market share than on product improvements, retention, or customer support.
  • Many legal AI vendors don’t rigorously test their own tools, or leave quality control to engineers without legal oversight.
  • Best model ≠ best outcome. Using the “best” base model doesn’t guarantee quality.
  • Several lawyers have shared that their enterprise legal AI tools sometimes perform worse than ChatGPT. In fact, more than 80% of lawyers we surveyed use more than one AI tool for legal work.
  • Some buyers of big enterprise solutions get customized, white-glove service, while others cannot even get an email response.
  • Buying decisions can be driven as much by data security, integrations, and existing contracts as by functionality. So in the B2B context, product innovation and performance may not be a key factor in adoption.

Takeaway for legal teams:

  • Bigger doesn’t mean better. Assess tools on how they perform for your actual use cases, not their brand or funding.
  • Push for transparency. Ask vendors how their tools are tested internally, and by whom. Don’t assume that a vendor using the “best” model delivers the best outcomes. Vendors make many trade-offs in product design that can impact quality.
  • Check where the vendor’s customer focus lies. If you’re not a Fortune 500, will the vendor actually invest resources in you?
  • Finally, remember: one person’s post reflects their experience. To make good procurement decisions, compare multiple perspectives, stress-test on your own data, and look beyond the marketing and gossip.

About the Author


Anna Guo

Founder of legalbenchmarks.ai, with previous experience as in-house counsel at both Google and Alipay.