Large language models are the technology powering applications such as OpenAI’s ChatGPT and Google’s Bard. So called because of their size, a large language model can have hundreds of billions of parameters. These models are typically trained on datasets of the order of trillions of words and can take weeks, if not months, to train. As such, research and development costs can be of the order of tens of millions, if not more. One way of protecting such large investments is to seek patent protection.

The European Patent Office’s (EPO) general approach to the patentability of core AI, however, means that obtaining patent protection for large language models is challenging, though not necessarily impossible. Our experts routinely handle patent applications directed to large language models and core AI and can help you navigate the restrictions imposed by the EPO.

When we speak of core AI, we mean the underlying technology, such as neural networks and large language models, that drives AI-based applications. Under the EPO’s approach, such models and algorithms are classified as mathematical methods. On its own, a mathematical method is not eligible for patent protection. However, a mathematical method can be patented when it produces a technical effect.

The easiest way to satisfy the technical effect requirement is to limit the patent application to one or more technical uses of the mathematical method, such as image classification or speech recognition. This approach, however, is typically not available for large language models, as the EPO does not generally consider the processing of text to be a technical use.

A second, less commonly used path is what the EPO defines as “a specific technical implementation of the method and the mathematical method is particularly adapted for that implementation in that its design is motivated by the internal functioning of the computer system or network” (EPO Guidelines for Examination, G-II, 3.3). Whilst this sounds like a high bar, in practice it is typically not so restrictive. The EPO gives the example of “assigning the execution of data-intensive training steps of a machine-learning algorithm to a graphical processing unit (GPU) and preparatory steps to a standard central processing unit (CPU) to take advantage of the parallel architecture of the computing platform.” Our experience is that where a mathematical method is adapted for parallel processing on a distributed system, the technical effect requirement can be satisfied. This is one route through which developments relating to large language models can sometimes be patented.

Going further, here at Venner Shipley we have also had success in arguing for the existence of a technical effect without needing to limit the application to either a technical use or some form of distributed system. Typically, these cases involve special uses of memory, and whether such an approach applies is likely to depend on the case in hand.

Thus, whilst obtaining patent protection in Europe for large language models can be challenging, our experts at Venner Shipley are well placed to help you obtain the broadest possible protection available.