
This paper examines several widespread assumptions about artificial intelligence, particularly machine learning, that are often taken as factual premises in discussions on the future of patent law in the wake of ‘artificial ingenuity’. The objective is to draw a more realistic and nuanced picture of the human-computer interaction in solving technical problems than the one in which ‘intelligent’ systems autonomously yield inventions. A detailed technical perspective is presented for each assumption, followed by a discussion of pertinent uncertainties for patent law. Overall, it is argued that the implications of machine learning for the core tenets of the patent system appear far less revolutionary than is often posited.

I. Introduction

1. The controversy: A tool or more than a tool?

2. The purpose and scope of the analysis

II. Clarifying the assumptions

1. A human states a problem – an ML system solves it

2. ML systems learn ‘complex rules’ based on ‘simple rules’

3. ML systems are autonomous, non-deterministic and unpredictable

4. ML is a ‘black box’

5. ML is a ‘general-purpose technology’

6. ML is a ‘general-purpose method of invention’

7. One ML algorithm – many inventions

III. Synthesis and research outlook

1. ML as computational techniques of problem-solving

2. Implications for patent law

I. Introduction

1. The controversy: A tool or more than a tool?

The European Commission’s initiative ‘Promote the Scientific Exploration of Computational Creativity’, launched in 2013, defines computational creativity as a ‘burgeoning area of creativity research that brings together academics and practitioners from diverse disciplines, genres and modalities, to explore the potential of computers to be autonomously creative or to collaborate as co-creators with humans’.8 From an engineering perspective, the aspiration is ‘to construct autonomous software artifacts that achieve novel and useful ends that are deserving of the label “creative”’.9

Some researchers share the view that artificial intelligence (AI) is ‘no longer a tool’, not even ‘a very sophisticated tool’.10 Stephen Thaler refers to an artificial neural network (ANN)-based system invented and patented by him in 1997 as the ‘creativity machine’ and claims that it can ‘perform imaginative feats that extend beyond technological invention into the realms of aesthetics and emotions’.11 More recently, Thaler designated the ANN-based system DABUS,12 also previously patented by him,13 as an inventor in two patent applications.14 Thaler is not original in his intention to enhance the status of computational systems. The mathematician Doron Zeilberger, for instance, often credits a computer – which he named Shalosh B. Ekhad – as his co-author.15

In contrast, others have viewed mathematics as ‘a fundamental intellectual tool in computing’, computers as a tool in mathematical problem-solving, and Shalosh B. Ekhad as Doron Zeilberger’s pseudonym.16 Commentators have expressed reservations regarding the ability of AI systems to invent autonomously,17 arguing that the assertion that ‘machines have been autonomously generating patentable results for at least twenty years’18 is ‘simply wrong’;19 that AI techniques are tools in a (human) inventor’s hands;20 and that ‘fully autonomous creation or invention by AI does not exist, and will not exist for the foreseeable future’.21 In the realm of artistic expression it has been argued that ‘[c]omputers do not create art, people using computers create art’;22 that ‘all “computer-generated art” is the result of human invention, software development, tweaking, and other kinds of direct control and authorship’; and that ‘the human is always the mastermind behind the work’.23

Notably, the anthropomorphisation of AI has been criticised within the AI community itself.24 As one professor of computer science once observed, a researcher who employs ‘wishful’ language endowing computers with understanding and intentionality when referring to computer programs and data structures is misleading the general public and, ‘most prominently […] himself’.25 Studies show that the anthropomorphised depiction of AI can affect the human perception of AI systems, particularly the level of trust and the attribution of responsibility in automated driving26 and even the allocation of credit for AI-generated artistic output.27 AI researchers caution that ‘we must be careful’ about anthropomorphising AI, which is ‘math, statistical analysis and pattern-matching’,28 and that the pervasive use of anthropomorphic language ‘inadvertently promotes misleading interpretations of and beliefs about what AI is and what its capacities are’, and can also ‘have epistemological impact on the AI research community itself’.29

2. The purpose and scope of the analysis

The assumption that AI has already replaced30 – or is about to replace31 – human creators and inventors has prompted arguments that the output ‘autonomously’ generated by such systems does not and should not merit IP protection.32 The perception that AI systems are autonomous agents endowed with problem-solving, decision-making and inventive capabilities33 has provoked several reformist proposals concerning patent law, including abolishing the patent system altogether,34 recognising AI as an inventor35 and replacing a ‘POSITA’ (a person of ordinary skill in the art) with a ‘MOSITA’ (a machine of ordinary skill in the art).36 Claims that computers autonomously generate inventions have extended beyond the legal scholarship37 and entered litigation.38

On closer inspection, the polarisation of the discourse on patent law and AI appears to stem from divergent understandings of AI techniques and of state-of-the-art AI research. As for the normative perspective, there can hardly be room for disagreement on the point that, if inventions could indeed be generated by computers autonomously, the rationale for the patent system and its design would have to be fundamentally reconsidered. The question is, precisely, whether the anthropomorphised depictions of AI and its claimed capabilities can be taken for granted as a basis for legal analysis and policy recommendations.

In this article, we take a close look at a set of widespread assumptions about AI – more specifically, machine learning (ML) – that are often taken as factual premises in the discussions on the future of patent law in the wake of ‘artificial ingenuity’. The common denominator is that, if taken at face value, such assumptions call for a radical update of the patent system, if not for questioning its very existence. The objective is to draw a more realistic and nuanced picture of the human-computer interaction in solving (technical) problems than the one where AI ‘spits out’ inventions ‘after a button press’.39 While it is generally recognised that ML is not a ‘magic’ tool,40 several technical aspects require clarification to understand more precisely their implications for patent law.

The discussion focuses on ML techniques41 – in particular, ANNs and genetic programming (GP)/evolutionary algorithms (EAs)42 – given their increasing application in technical problem-solving such as technical constraint-satisfaction and constraint-optimisation problems.43 By an ‘AI/ML system’, this paper refers to the combination of software and hardware necessary to implement an AI/ML method.

As for the legal context, the analysis draws mainly on European patent law, with occasional recourse to other jurisdictions. At the same time, the technical aspects of AI discussed in this paper concern the basic tenets of patent law that can be relevant for any jurisdiction dealing with the implications of AI for patent law and policy. Besides the much-debated issue of the genesis and allocation of inventor’s rights in inventions allegedly ‘generated’ by AI, the scope of the analysis includes substantive patent law questions such as patent eligibility, the sufficiency-of-disclosure requirement and standard, the definition of a skilled person and the fulfilment of the inventive step requirement. More broadly, we reflect on the role of patent protection given the general-purpose-technology nature of ML. Part II discusses each assumption according to a threefold structure: first, an assumption and challenges it provokes for patent law are stated; second, a detailed technical perspective is presented; third, patent law uncertainties are revisited and clarified in light of the technical explanation. Part III situates the question of whether AI is a tool or ‘more’ than a tool within a broader context of research in cognitive science and puts the implications of AI for deontological and economic underpinnings of patent law into perspective.

II. Clarifying the assumptions

1. A human states a problem – an ML system solves it

Narratives about ML might create the impression that merely stating what needs to be achieved – without specifying how it should be done – can be enough for a computer to accomplish a task. For instance, EAs are portrayed as vested with a genie’s capability to solve technical problems – a human only needs to make a wish.44 According to the High-Level Expert Group on Artificial Intelligence (HLEG-AI), AI systems are ‘goal-directed, meaning that they receive the specification of a goal to achieve from a human being and use some techniques to achieve such goal’.45

a) Patent law uncertainties

(i) Inventorship

If a human could merely state a problem that AI systems could solve ‘on their own’, there would be a strong argument that not only the allocation of inventor’s rights but the very need for the patent system would have to be fundamentally reassessed. Some legal scholars argue that the patent system should be eliminated ‘altogether’46 in the age of artificial ingenuity. Others suggest applying rules on employee inventions to ‘AI-generated’ inventions, provided that an ‘electronic personality’ can be attributed to AI.47 Before such policy proposals can be considered, it is necessary to clarify whether ‘making a wish’ can suffice for a problem to be solved by AI.

(ii) Inventive step/non-obviousness

The perception that the most significant input by a human consists in merely supplying a problem, which is then solved by AI, might prompt an analogy with the so-called ‘problem inventions’. Under the EPO approach, the appreciation of an unrecognised technical problem can confer an inventive step, even if the claimed solution is ‘retrospectively trivial and in itself obvious’,48 provided that the problem could not have been ‘readily posed’ by any skilled person.49 To consider the aptness of the analogy, let us first clarify whether the human role in ML, in fact, consists solely in posing a problem.

b) The problem-solving capacity of ML systems from a technical perspective

(i) An algorithm as an explicit and specific problem-solving route

The assumption that a mere statement of a goal suffices for a computer to solve a problem is largely inaccurate – otherwise, we would have had a ready-to-deploy vaccine against SARS-CoV-2 as soon as we realised it was urgently needed.50 While telling a computer ‘what’ to do without ‘how’ has long been an aspiration of research on automated programming, it is considered ‘unrealistic, at least in the foreseeable future’.51

The ‘how’ in ML – and any computer-implemented process in general – is represented by computational operators and configurations that determine the sequence of computational states, through which the given inputs are transformed into the desired output. In computer science, an algorithm is defined as ‘a finite set of rules that gives a sequence of operations for solving a specific type of problem’. It is distinguished by the following features: definiteness (each step is precisely, rigorously, and unambiguously specified); input (a set of data is provided before the procedure starts); output (the procedure generates ‘a nonempty set of results’); finiteness (the procedure consists of a finite number of steps); and effectiveness (‘operations must all be sufficiently basic that they can in principle be done exactly and in a finite length of time by someone using pencil and paper’).52 This definition holds for ML algorithms as they specify a sequence of steps on how a model for a particular problem should be developed. These steps are precise, rigorous and unambiguous and are clearly defined by logical reasoning, conditional structures and loops. While the process might seem to be running ‘by itself’ to an outsider, the designer of an ML system knows what a computer is supposed to do at each step. Neither conventional nor quantum computers can deviate from a given algorithm. Furthermore, it should be noted that ML algorithms can work ‘off the shelf’ only in limited cases and usually require special-purpose adjustment.53

(ii) How are problems specified in ML?

The definition of ML as techniques allowing ‘an AI system to learn how to solve problems that cannot be precisely specified’54 is somewhat confusing. To translate a real-life problem into something that can be processed and solved by a computer, a problem essentially needs to be expressed in an abstract way – by using a formal notation (a mathematical model, functions, logic rules, etc.) – that a computer can decipher and implement. In optimisation problems55 where one seeks to maximise or minimise certain values of interest, a problem is typically defined as an objective function that assigns numerical values to certain outcomes, such as a cost function in ANNs or a fitness function in GP/EA.

What is not ‘precisely specified’ in ML is the solution. To give a simple example, assume that we do not know which number multiplied by 4 produces 16 (a problem). To convey this problem to a computer, we express it as 4*x = 16. One approach would be brute force: testing possible values of x until the value ‘4’ is reached. An ML-based approach would be to provide a relevant56 annotated (labelled) dataset of mathematical operations and build a model that will derive x = 4 from the observed cases. To guide the process of deriving x, one would need to specify a cost function providing a mathematical measure of how close or far-off a trained model is from the actual data. A third approach would be to use an optimisation algorithm to perform a ‘guided search’ by altering the solution based on its proximity to the number 16. No labelled data would be required for that, but one would need to model the error |4*x – 16| as a fitness function to be minimised. Irrespective of which method is used, deriving the ‘non-precisely specified’ output in ML is akin to solving a mathematical equation when the exact solution is not known ex ante.57
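
To make the contrast concrete, the following sketch – purely illustrative and not drawn from any cited system – contrasts the brute-force and the guided-search approaches to the toy equation 4*x = 16, with the error |4*x – 16| serving as the quantity to be minimised:

```python
# Illustrative sketch (not from any cited system): two ways a computer can be
# instructed to find x in 4*x = 16.

def brute_force(target=16, factor=4, candidates=range(-100, 101)):
    """Test candidate values until one satisfies the equation exactly."""
    for x in candidates:
        if factor * x == target:
            return x
    return None

def guided_search(target=16, factor=4, x=0.0, step=0.1, max_iterations=1000):
    """Iteratively adjust x in the direction that shrinks the error |4*x - 16|
    (a minimal stand-in for optimisation-based 'fitness minimisation')."""
    for _ in range(max_iterations):
        error = factor * x - target
        if abs(error) < 1e-9:        # termination criterion chosen by the human
            break
        x -= step * error            # move x so as to reduce the error
    return x

print(brute_force())     # -> 4
print(guided_search())   # -> approximately 4.0
```

In both cases, the route to the ‘unknown’ value is fully determined by the instructions and configurations supplied upfront.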

(iii) (Non-)explicit programming in ML

Arthur Samuel is credited with coining the term ‘machine learning’.58 He is also falsely59 said to have defined ML as a ‘field of study that gives computers the ability to learn without being explicitly programmed’.60 To clarify, ML algorithms themselves are explicitly programmed. As noted above, the general definition of an algorithm as a procedure that is specified precisely, rigorously and unambiguously applies to ML algorithms as well.61

As discussed above, what is not ‘explicitly programmed’ in ML is the output. The optimisation of ML model parameters is governed by the inputs and configurations set up before an algorithm is implemented on a computer. In the case of ANNs, for instance, one needs to provide a relevant pre-processed dataset and configure the hyperparameters62 that, in turn, determine the model parameters (weights and biases). Thus, one could say that a human does not directly (‘explicitly’) program the output. This reflects the core distinction between two types of AI: rule-based63 and learning-based systems. However, with both approaches, human input is decisive for the computational output.64
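
A minimal sketch of this division of labour (illustrative only, using a toy linear model rather than a full ANN): the human fixes the hyperparameters and supplies the labelled data, while the weight and bias are derived numerically from that setup rather than programmed directly:

```python
# Illustrative sketch: human-chosen hyperparameters and data vs machine-derived
# parameters, using a toy linear model trained on examples of y = 2*x.

learning_rate = 0.05                              # hyperparameter set by the human
epochs = 2000                                     # hyperparameter set by the human
data = [(1, 2), (2, 4), (3, 6), (4, 8)]           # labelled training examples

weight, bias = 0.0, 0.0                           # model parameters: derived, not hand-set

for _ in range(epochs):
    for x, y in data:
        prediction = weight * x + bias
        error = prediction - y                    # cost signal for this example
        weight -= learning_rate * error * x       # gradient step on the weight
        bias -= learning_rate * error             # gradient step on the bias

print(round(weight, 3), round(bias, 3))           # approaches 2.0 and 0.0
```

The ‘learned’ values are thus a function of the dataset and the hyperparameters chosen before training begins.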

(iv) ‘High-level’ statements in GP

GP is defined as ‘an automatic method for solving problems’65 that starts with ‘a high-level statement of the requirements of a problem and attempts to produce a computer program that solves the problem’.66 ‘High-level requirements’ refer to multiple, often conflicting optimisation constraints and considerations, including the requirements regarding robustness and stability of a solution.67 Such requirements are expressed as a fitness function that provides ‘the search’s desired direction’.68 Besides, mutation and recombination operators strongly impact the performance of EAs.69 These configurations operationalise the iterative process of search and improvement until the optimal solution is found, or constraints are proved impossible to satisfy, or other termination criteria apply.70 In other words, ‘high-level statements’ do not mean the absence of human guidance on how computation should be implemented.
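
By way of illustration only – the design attributes and the weighting below are invented for the example – conflicting ‘high-level requirements’ can be encoded in a single fitness function that gives the search its direction, alongside a human-defined variation operator:

```python
import random

# Illustrative sketch: conflicting requirements (hypothetical strength and weight
# targets) folded into one fitness function, plus a human-defined mutation operator.

def fitness(design):
    strength_score = design["strength"]        # requirement: maximise strength
    weight_penalty = 2.0 * design["weight"]    # conflicting requirement: minimise weight
    return strength_score - weight_penalty     # the search's 'desired direction'

def mutate(design):
    """Variation operator: perturb one randomly chosen attribute slightly."""
    mutated = dict(design)
    attribute = random.choice(["strength", "weight"])
    mutated[attribute] += random.uniform(-0.1, 0.1)
    return mutated

candidate_a = {"strength": 8.0, "weight": 2.0}
candidate_b = {"strength": 9.0, "weight": 3.5}
print(fitness(candidate_a), fitness(candidate_b))   # 4.0 vs 2.0: candidate_a is fitter
print(fitness(mutate(candidate_a)))                  # fitness of a slightly perturbed design
```

Every ingredient of the ‘automatic’ search – what counts as fitter and how candidates may vary – is decided by the human designer.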

In sum, the assumption that ML systems can solve problems without human guidance does not hold. If a problem is solved through computation, it means that the combination of inputs and instructions, irrespective of the level of granularity, is sufficient for finding the solution. This echoes Herbert Simon’s observation that solving a problem ‘simply means representing it so as to make the solution transparent’.71

c) Patent law uncertainties revisited

(i) Implications for the inventorship

In view of the above, it appears clear that ML systems do not replace human problem-solving skills, and the assumption that an ML system can solve a given problem by itself is unfounded.

As a general principle, a normative analysis should be undertaken if there is genuine uncertainty about the appropriateness of the existing legal order. Hence, the currently applicable definitional criteria for inventorship should serve as a starting point. As the study by Shemtov72 on the concept of inventorship in the context of AI concludes, ‘[c]reative or intelligent conception of the invention, or contribution thereto, is a feature that runs either explicitly or implicitly throughout the definition of inventorship in all of the [examined] jurisdictions’.73 To meet this requirement, the ‘engagement in the conception phase [should go] beyond the provision of abstract ideas on the one hand, and mere execution of those provided by others on the other hand’74 and should be made ‘on an intelligent and creative level rather than financial, material or mere administrative level’.75 The subsequent analysis will rely on this standard as a general guiding principle,76 while it is acknowledged that jurisdictions might differ in fleshing out the concept of the inventor.

Conceptualisation is about abstract thinking, a capability even the ‘most sophisticated AIs in the world today have trouble with’.77 To solve problems through computational modelling, including in the fields of technology and engineering, one needs to grasp a problem and gain ‘an extensive understanding of mathematical structures in order to match them to the problem at hand’.78 One also needs to define accurate assumptions, boundaries and parameters that reflect and characterise inherent properties of a system or process being modelled, etc. Such cognitive activities form part of the conceptualisation phase of computational problem-solving that precedes programming and algorithm implementation.79 As noted above, ML algorithms usually require special-purpose adjustment. Hence, in situations where ML is used in the inventive process, it appears appropriate to refer to the decision-making in applying ML techniques to the technical problem at hand as a proxy for the ‘intelligent engagement’ in invention conception.

Accordingly, as long as the use of problem-solving tools and techniques is not prejudicial to allocating the inventor’s rights to a human, the person making necessary decisions as to how to solve a technical problem by applying ML can and should be deemed an inventor, or a co-inventor (where such process involves joint human effort), provided that the co-inventorship criteria are fulfilled.80 In contrast, the one who develops the basic algorithm of a general-purpose nature – such as backpropagation81 – but is not involved in its application to a specific task should not be regarded as an inventor. Furthermore, even though the inventorship standard applied to humans cannot be directly transposed to human-computer interaction, the underlying principle that the mere implementation of instructions should not suffice for the inventor entitlement can counteract the claim that the computer should be recognised as a standalone inventor.82

As clarified in the preceding section, human input in ML is not confined to giving a problem to a computer. However, a normative doubt might arise about whether such input should be deemed sufficient relative to the computer’s contribution, often depicted as overshadowing the human role in terms of complexity and significance. Section II.2 will address the assumption underlying such doubt in detail.

(ii) Implications for the inventive step requirement

The inaptness of the analogy with ‘problem inventions’

Given that it is the appreciation of a problem – not how it is solved – that can confer an inventive step in the case of ‘problem inventions’,83 the use of ML techniques in developing an invention (problem-solving) would not be a relevant factor for assessing obviousness. While one could argue that identifying a problem and solving it should not be appraised equally when assessing inventive step, this is not an AI-specific issue and, therefore, it is not analysed here.

Knowledge and skills in ML should be factored into the definition of a skilled person where relevant

To fulfil the inventive step requirement, an invention must represent an achievement that lies beyond the reach of a hypothetical practitioner with ‘average’ skills. The examination is fact-specific, and methodologies differ among national patent offices. In the EPO’s practice, the crux of the assessment is whether a skilled person, having regard to the state of the art, would (i.e. not just could) have arrived at ‘something falling within the terms of the claims, and thus achieving what the invention achieves’.84

As clarified, ML systems do not solve problems ‘on their own’ but can be applied as computational problem-solving techniques. Accordingly, skills and knowledge in ML should be factored into the definition of a skilled person85 when assessing the obviousness of inventions that could or would86 have been developed by applying ML. The gradual integration of ML techniques into a skilled person’s arsenal of tools is inevitable, given the broad applicability of ML techniques on the one hand and the expectation that the skilled person is ‘involved in constant development in the relevant technical field’87 on the other hand. Given that ML (as computational problem-solving techniques based on optimisation and modelling) is broadly applicable across technological and engineering fields,88 technical experts in such fields would often cooperate with ML researchers and data scientists when developing or improving technical solutions. Such situations could fall within the ambit of the current EPO methodology, as it allows the definition of a skilled person to be extended to an interdisciplinary team of skilled practitioners.89

Accordingly, in instances where a technical problem underlying the claimed invention could have been solved through ML, inventive step should be assessed through the lens of an interdisciplinary team equipped, inter alia, with knowledge and skills in ML, in addition to knowledge and skills in the field of technology to which the problem underlying an invention pertains.90 For that, the assessment would need to establish that a skilled person from the field of a technical problem91 would be prompted ‘to seek a solution’ and ‘look for suggestions’92 in the field of ML. The combination of knowledge – even from remote technical fields93 – in the definition of a skilled person is not excluded, which can accommodate the diversity of ML applications. Notably, not only can a skilled person search for ‘clues’ in a different technical field, but they can also transfer a technology, provided that such transfer involves ‘routine experimental work’.94 Conversely, an inventive step can be acknowledged where technology transfer requires efforts beyond routine work, such as scientific research.95 To summarise, the EPO assessment methodology, in principle, allows skills in ML to be integrated into the definition of an interdisciplinary team of skilled persons. Accordingly, the patent examiner would need to establish objectively, first, whether the skilled person would be motivated to look for suggestions in the field of ML and, second, whether the ‘routine’ application of ML would allow the skilled person to arrive at ‘something falling within’ the claims of the invention at issue.

2. ML systems learn ‘complex rules’ based on ‘simple rules’

In his book Genie in the Machine, Robert Plotkin observes that some AI techniques can ‘discover complex rules and patterns […] given only an abstract problem definition and simple rules for generating and evaluating possible solutions to the problem’.96 For instance, providing an airflow equation to an AI system would suffice to produce a car frame.97 Silver et al. report that they used ‘the simplest possible search algorithm’ to develop AlphaGo Zero, a program which eventually achieved ‘superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo’.98 They conclude that ‘it is possible to train [a reinforcement learning (RL) system] to superhuman level, without human examples or guidance, given no knowledge of the domain beyond basic rules’.99

a) Patent law uncertainties

(i) Inventorship

The accounts presented above can create an impression that remarkable achievements can be accomplished with surprisingly negligible intellectual effort. Besides, the term ‘learning’ can create an impression that computers gradually become independent.100 Even though it was already clarified under Assumption 1 that ML researchers do not merely state a problem but essentially configure a problem-solving route to the solution, one could still doubt whether human input should be viewed as sufficient to merit the inventor entitlement.101

(ii) Inventive step/non-obviousness

The alleged simplicity of human input in ML can bear on the assessment of obviousness of an invention. If only trivial knowledge and skills on the human’s part could suffice to solve a technical problem through ML,102 one would question whether the purpose of the inquiry into inventive step – to discern the achievements beyond those requiring average knowledge and skills – is still relevant, or whether any invention actually or potentially103 resulting from the application of ML should prima facie be deemed obvious.104

Before these legal implications can be examined, let us clarify what exactly ‘learned rules’ mean in the ML context and how the human’s and the computer’s contributions can be compared in terms of simplicity or complexity, if at all.

b) The simplicity of human input vs complexity of ML outcome from a technical perspective

(i) The term ‘learning’

Whether computers can ‘learn’ depends on how one defines ‘learning’. As a cognitive phenomenon, learning comprises ‘inductive knowledge acquisition, behavior generation, and intelligence aggregation’.105 The extent to which human and machine learning processes are comparable can hardly be ascertained, given that human learning itself is not fully understood. As for the analogy between ML and human learning, current ML approaches based on ‘statistical induction over massive training sets’ are considered to ‘radically differ’ from human-like learning.106 In this regard, the pervasive use of anthropomorphic language, ‘exacerbated by the ML literature itself’,107 has been criticised for ‘inadvertently promot[ing] misleading interpretations of and beliefs about what AI is and what its capacities are’ among the general public.108

In computer science, a ‘learning automaton’ refers to a system that can be trained ‘to associate with each context the particular action which maximises payoff’.109 If by learning, one means that a system can improve its performance over time based on an in-built feedback mechanism, one can say that computers can learn, but they do so within the pre-specified rules and boundaries. However, a more appropriate terminology in the case of ANNs would be ‘inferring’ a numeric function or ‘fitting’ data into a model.110

(ii) ‘Learned rules’

There is no specific technical definition of a ‘rule’ in ML. The views cited above111 assume a distinction between the ‘rules’ determining the ML process and the ‘learned rules’ as ML output. The former can refer to instructions on how to build a model from the input data; the latter can be understood as an immediate output of ML (a numeric model derived from the input data based on the ‘given rules’). Such a model can be regarded as ‘learned rules’ if it determines how a prediction is generated in supervised ML. In RL, what is ‘learnt’ (calculated) is a ‘policy’112 which determines which action to take.

What is important is that the calculation of the numeric output through ML is guided113 by the pre-programmed instructions and setup configurations. In this regard, the portrayals of ML techniques as ‘self-learning’114 or ‘self-teaching’ networks115 are not accurate. ML systems are products of software engineering. For instance, even though the developers of AlphaGo did not impart to the system ready-made ‘rules’ as to when to make which move,116 they provided inputs, including data and computational operators (mathematical functions),117 that determined how the moves maximising the long-term reward were computed.

Furthermore, if the ‘learned rules’ refer to the correlations inferred from the training data, it would not be accurate to refer to such correlations as ‘rules’, as long as the causes behind the revealed regularities are not exposed or as long as it remains unclear why the inferred correlations may or may not work when applied to new data.118 Consider the recent achievement of AlphaFold credited with ‘solving’ the protein folding problem in molecular biology. The problem of protein folding consists in understanding the rules governing the relationship between the structure of a protein and its amino-acid sequence. Despite what the media headlines touted,119 AlphaFold has not ‘cracked’ the protein folding problem – it revealed ‘nothing about the mechanism of folding, but just predicts the structure using standard machine learning […] being trained on the 170,000 or so known structures in the Protein Data Base’.120 Meanwhile, the ‘why’ behind protein folding remains ‘a black box’.121

(iii) The alleged simplicity of given rules vs complexity of ‘learned’ rules

Robert Plotkin argues that computers do not always ‘operate according to the adage “garbage in, garbage out”’,122 and AI systems can ‘follow [given] simple rules to discover complex rules’.123 In his book, he refers to GP as an ‘artificial invention technology’.124

Solving a problem through GP consists of two phases, namely preparatory steps and executional steps (the implementation of an algorithm).125 As defined by Koza et al., preparatory steps are ‘problem-specific and domain-specific steps that are performed by the human user prior to launching a run of the problem-solving method’.126 In contrast, executional steps are ‘automatically executed during a run of the problem-solving method’.127

Furthermore, Koza et al. introduce the concept of an ‘artificial-to-intelligence ratio’ (an ‘AI ratio’), where ‘I’ refers to the knowledge provided by a human expert in a particular field, and ‘A’ indicates the value added by genetic programming.128 Some problems can require ‘a small amount’ of ‘I’,129 while others can involve a ‘non-trivial amount’ of ‘I’.130 Where ‘A’ is high and ‘I’ is low, GP is considered to have delivered a ‘human-competitive’ result.131 As problems differ in complexity, they can require varying levels of knowledge and skills in GP.

However, issue can be taken with both the concept of an ‘artificial-to-intelligence ratio’ and the notion of ‘human-competitive’ results. First, to measure an ‘artificial-to-intelligence ratio’, one needs to draw a line between the output of running a GP algorithm and ‘intelligence that is supplied by the human applying the method to a particular problem’.132 The preparatory steps in GP include defining ‘the set of terminals […], the set of primitive functions for each branch of the to-be-evolved computer program, the fitness measure (for explicitly or implicitly measuring the fitness of candidate individuals in the population), certain parameters for controlling the run, a termination criterion and method for designating the result of the run’.133 As iterative search techniques,134 genetic algorithms evolve a candidate population according to the given search operators until the fitness function is satisfied or other termination criteria apply.135 In this view, ‘the problem-specific preparatory steps’ done by a human are determinative of the outcome of implementing an algorithm on a computer. If so, it appears pointless to split up the two phases to define which contributes more to the solution.
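
To illustrate this split (a toy sketch with an invented numeric target, not GP proper), everything marked as a preparatory step below is fixed by the human before the run, while the loop merely executes what has been configured:

```python
import random

# Preparatory steps (human): fitness measure, variation operators,
# run-control parameters and termination criteria are all fixed up front.
random.seed(1)
TARGET = 42                                   # stand-in for the problem requirements

def fitness(candidate):
    return -abs(candidate - TARGET)           # fitness measure (higher is better)

def mutate(candidate):
    return candidate + random.uniform(-1, 1)  # mutation operator

def recombine(a, b):
    return (a + b) / 2                        # recombination operator

POPULATION_SIZE, SURVIVORS, MAX_GENERATIONS = 20, 10, 200   # run-control parameters

# Executional steps (automatic): the run unfolds strictly as configured above.
population = [random.uniform(0, 100) for _ in range(POPULATION_SIZE)]
for generation in range(MAX_GENERATIONS):
    population.sort(key=fitness, reverse=True)
    best = population[0]
    if abs(best - TARGET) < 0.01:             # termination criterion
        break
    parents = population[:SURVIVORS]
    children = [mutate(recombine(random.choice(parents), random.choice(parents)))
                for _ in range(POPULATION_SIZE - SURVIVORS)]
    population = parents + children

print(round(best, 2))                          # designated result: best of run, close to 42
```

The executional phase cannot produce anything other than what the preparatory configuration makes reachable, which underscores why separating the two phases for the purpose of crediting contributions is of limited use.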

Second, even if one wanted to compare a human’s versus a computer’s contributions in terms of their ‘complexity’, what would be a suitable objective measure? For instance, should the complexity of a mathematical function, which can be both the input and the output of ML, be measured by its length, the amount of energy consumed by the human brain as opposed to computational power, or how many operations are necessary to derive it? One could say that it is equally simple for a computer to execute instructions, irrespective of whether it calculates y = 2x or y = 10x^3 + 4x - 2x^2 + 777x^21. For a human, devising computational instructions can vary in complexity depending on the problem to be solved and the level of expertise.

In GP, the design of a well-suited problem-specific fitness function and the selection of the problem representation and genetic operators are not trivial tasks. In the case of ANNs, even if a standard algorithm can be applied ‘off the shelf’, hyperparameters136 need to be attuned by a researcher, and this might or might not be straightforward depending on the problem at hand.137 Setting the right hyperparameters for an ANN and defining a value function in RL are decisive for the computational outcome. Furthermore, the selection of input data, which predetermines the ‘learned’ correlations, should not be downplayed. The selection and preparation of training datasets require expertise and a keen understanding of the problem at hand. Both AlphaGo and AlphaFold involved a team of highly skilled engineers and researchers; in both cases, a computer alone could not have achieved the results without complex design decisions on the part of humans.138

In sum, if instructions provided by a human are decisive for the computational outcome, they are probably not ‘garbage’139 but rather illustrate the adage that genius consists in making complex ideas simple.140

(iv) The tendency to put a spotlight on an algorithm

As noted by Pat Langley, research papers on automated discovery ‘typically give the algorithm center stage, but they pay little attention to the developer’s efforts to modulate the algorithm’s behavior for given inputs’.141 Such a tendency can be observed both in mass media and in research papers. As one commentator points out, while RL – the approach applied in training AlphaGo – might be viewed as ‘one of the closest things that looks anything like [Artificial General Intelligence], beautiful demos of learned agents hide all the blood, sweat, and tears that go into creating them’.142 For research papers, the tendency to bring an ML system to the forefront is, in a way, natural and unavoidable. A paper is expected to report results and therefore presents an algorithm and the results achieved with it as they are, not the process that has led to the system design. Usually, the researcher would not describe how much thinking and effort was devoted to creating an algorithm before reaching the version that works, or how many attempts failed before a model was tuned correctly.

c) Patent law uncertainties revisited

(i) Inventorship

At the outset, it is worth noting that the ‘simplicity’ of arriving at an invention, de lege lata, has not been prejudicial to the genesis or allocation of the inventor’s rights.143 It might simply be impossible to objectively assess in terms of complexity the mental process of conceiving an idea – ‘the formation in the mind of the inventor, of [the] idea of [an] invention’.144 The argument de lege ferenda that inventor’s rights should not be allocated to a human applying ‘self-learning’ AI systems appears to be somewhat fairness-based, given the seeming triviality of the human input relative to the alleged complexity of a computer’s contribution. However, as clarified in the preceding section, ML output is the function of human decision-making in applying ML. Such decision-making can merit the inventor entitlement under both the criterion of an intelligent conception of an invention145 and fairness considerations.

(ii) Inventive step/non-obviousness

When confronted with allegations regarding the simplicity of human input in solving tasks through ML, one needs to be aware of the hindsight problem. As a cognitive phenomenon, the hindsight bias refers to a ‘tendency to judge events to be more predictable, knowable, and certain in hindsight than in foresight’.146 In patent law, an invention might, at first sight, appear obvious because, once ‘a new idea has been formulated, it can often be shown theoretically how it might be arrived at, starting from something known, by a series of apparently easy steps’.147 The EPO methodology stipulates that patent examiners ‘must be wary of ex post facto analysis [and] seek to make a “real-life” assessment of [the] relevant factors’.148 The principle of avoiding the foreknowledge of an invention is embedded within the ‘problem-solution approach’,149 which instructs examiners to start the analysis from the closest prior art reference and formulate the technical problem addressed by an invention so ‘as not to anticipate the solution’.150

Given that the ‘simplicity’ of human input in ML cannot be alleged in general, the obviousness of inventions which could or would have been developed with the aid of ML can only be assessed on a case-by-case basis. As will be discussed later, solving problems through ML can involve considerable decision-making.151 Hence, the mere existence of ML methods applicable to technical problems, by itself, does not render obsolete the purpose of inventive step assessment to discern whether practitioners from the relevant technical fields with average knowledge and skills could or would have arrived at the claimed invention.

3. ML systems are autonomous, non-deterministic and unpredictable

Some legal scholars assume that AI is ‘capable of defining or modifying decision-making rules autonomously’152 and can ‘determine for themselves the means of completing their goals’.153 AI systems are often depicted as performing randomly. For example, an ANN makes ‘a random guess’ regarding the output;154 the training of AlphaGo Zero started reportedly from ‘completely random behaviour and continued without human intervention’.155 One legal scholar submits that AI refers to the ‘so-called non-deterministic algorithms: computer programs whose function and output are not exclusively determined by human creators’.156 A corollary of such assumptions is the allegedly ‘surprising’ effect of the output of AI/ML systems.

While the perceived autonomy of AI systems might have crucial implications for various regulatory frameworks, such as personal data protection, consumer protection, transparency and product liability, it poses distinct issues for patent law. In either case, legal challenges are triggered by the assumption that AI-based systems have the capacity to ‘decide’ and that humans have limited or no control over such decision-making.

a) Patent law uncertainties

(i) Inventorship in relation to a patentable outcome of the allegedly autonomous or non-deterministic computational process

The purported autonomy of ML systems can cast doubt on whether a human applying AI to a problem should be credited for finding a solution. For instance, Thaler describes his patented invention – ‘Device for the autonomous generation of useful information’ – as a device that ‘allows for the totally autonomous generation of new concepts, designs, music, processes, discovery, and problem-solving using recent developments in the area of artificial neural network (ANN) technology’.157 Relying on Thaler’s depiction of DABUS, one judge has recently held that he is ‘simply recognising the reality by according artificial intelligence the label of “inventor”’.158

It is worth noting here that the (perceived) randomness159 of a process leading to a patentable invention, de lege lata, has not been a material factor for the genesis or allocation of the inventor’s rights. Consider, for instance, serendipitous solutions160 or ‘a-ha’ moments (which are, nevertheless, regarded as being preceded by the orientation, preparation and incubation stages of creative thought161). Uncertainty de lege ferenda can arise if one assumes that the task of finding a solution to a problem ‘is handed over to the non-deterministic evolution of the algorithm’,162 and that ‘the nexus between the human contribution to the search process and the ultimate finding of a solution is severed during the AI’s evolution and transformation’.163 The view that AI systems can yield solutions that ‘cannot be predicted by programmers, operators, or any other entities involved’164 led some legal scholars to argue that the allocation of patent rights to a human might no longer be justified.165 Accounts crediting ML systems with making ‘brilliant moves’ not understood by humans166 and delivering ‘flummoxing’167 results that ‘[a]lmost no human pro would’ve thought of it’168 might create an impression of a computer’s ‘cognitive superiority’ and a dissipating causal link between the human input and computational output.

In other words, the contention is that ‘non-deterministic’ algorithms imply the lack of causality between human input and computational outcome. If determinism is defined as a ‘doctrine that all phenomena are causally determined by prior events’,169 even switching on a computer can be viewed as a ‘prior event’ that causes computation. However, if a problem is then solved by the computer ‘on its own’, one would doubt whether allocating the inventor’s rights to a human is justified. Even though it was already clarified under Assumptions 1 and 2 that ML systems do not solve problems ‘by themselves’, a technical perspective on a computer’s ‘autonomy’ and the relationship between determinism, randomisation and human guidance in ML would be welcome.

(ii) The assessment of inventive step in situations where the outcome of a randomised ML process is claimed within an invention

Expectations of success of a skilled person

Even if human decision-making in ML cannot be excluded, the mixed contribution by allegedly ‘autonomous’ or ‘non-deterministic’ AI systems and humans in the process of solving a technical problem raises the question of how such ‘autonomy’ or ‘non-determinism’ should be factored into the assessment of inventive step. Under the EPO’s approach, for an invention to be deemed obvious, it is sufficient to establish that a skilled person would have arrived at the invention following the prior art teaching with a reasonable expectation of success.170 Notably, where a technique involves randomness (e.g. mutagenesis) and where chance can play ‘a key role in the achievement of success, as no form of control can be exerted over the [course of] events’,171 it is not considered ‘appropriate to attempt to evaluate the expectation of success’.172 Accordingly, the assessment of inventive step in situations where the output of an ML method involving randomisation is claimed within an invention raises a two-fold question: (i) whether a human can exercise control over achieving such output; and (ii) if so, how the use of randomisation can affect a skilled person’s expectations of success in solving a problem through ML (provided that the relevant skilled person is versed in ML173).

A surprising effect of an invention

Furthermore, in some jurisdictions, a ‘surprising’ or ‘unexpected’ effect of an invention over the prior art can be considered as an indication of an inventive step (non-obviousness). The Patent Cooperation Treaty Examination Guidelines instruct that, in situations where the claimed invention holds a considerable technical value attributed to one or more of the claimed technical features and provides an advantage over the prior art ‘which is new and surprising […], the examiner should be hesitant in raising a negative determination that such a claim lacks inventive step’.174 Under the EPO approach, an unexpected effect over the prior art might confirm an inventive step.175

The developers of a NASA antenna report that the antenna ‘evolved’ by applying an evolutionary algorithm had ‘an unusual organic-looking structure, one that expert antenna designers would likely not produce’.176 As pictured by Plotkin, to produce a car frame, one ‘can just provide an airflow equation’ to an AI system.177 Later, when assessing that car frame under the non-obviousness standard, a patent examiner would determine, ‘quite correctly, that [the] car frame would not have been obvious to an automotive engineer of ordinary skill [because] the frame has a shape that is surprising to automotive engineers and violates the principles [they] learned in engineering school [which] makes it a classic case of a nonobvious design’.178 The contention is, thus, that ‘surprising’ technical solutions resulting from the application of ML can be deemed inventive (non-obvious) under the existing legal standard, even where they, in fact, resulted from the application of a routine technique and ‘merely by applying ordinary skill’.179

(iii) The fulfilment of the sufficiency-of-disclosure requirement in situations where an ML method involving randomisation is claimed as an invention

The requirement of sufficiency of disclosure means that an invention needs to be disclosed in a patent application in a manner reproducible by a skilled person without undue burden.180 At issue is whether randomisation in ML poses a challenge to fulfilling this requirement in situations where an ML-based method is claimed as an invention.181 An analogy with ‘non-deterministic’ biotechnological inventions182 has been drawn in this regard. In particular, it has been hypothesised that ML-based techniques face a similar challenge regarding the reproducibility and plausibility of an invention as in the case of biological materials whose ‘variability is unavoidable’.183 The European patent system, for instance, provides for a deposit system for such inventions.184 Before the necessity of a similar instrument can be contemplated for inventions claiming ML techniques, it should be clarified how the use of randomisation in ML can affect the reproducibility of ML methods.

Before the above-outlined legal implications can be examined further, let us clarify in what sense ML output might be ‘unpredictable’, if at all.

b) A technical perspective on autonomy, (un)predictability and (non-)determinism of ML systems

(i) The term ‘autonomy’ is used mistakenly in the sense of ‘automation’

In everyday discourse and technical literature on AI, the terms ‘autonomy’ and ‘automation’ are sometimes used interchangeably.185 However, there is a significant conceptual and technical difference.

Autonomy implies self-governance and self-determination,186 the existence and the ability to exercise free will concerning one’s own decision-making and behaviour.187 To what extent humans and computers can be comparable in this regard has been a contested issue and perhaps cannot be stated conclusively.188 While no universally accepted definition of autonomy exists,189 philosophers distinguish between two aspects: autonomy as ‘the negative condition of freedom from the external constraints’190 and ‘the positive condition of a self-determined will’.191 One could say that neither humans nor computers are free from absolute constraints, such as gravity. However, as far as man-made rules are concerned, humans have in many situations at least a theoretical possibility not to comply as a manifestation of self-determination (whether or not non-compliance is a rational choice is a different question). In contrast, computers cannot ‘decide’ to violate human-imposed constraints – in other words, constraints in programming are inherently inviolable. In computer science, an ‘automaton’ means a machine or mechanism that either reacts to a ‘predetermined set of rules or adapts to the environmental dynamics in which it operates’ to accomplish a goal.192 In either case, it is bound by the pre-programmed instructions.193 The interaction with the real world perceived through sensors might cause the designer or user to lose control over an AI system, but the lack of control would be attributable to the environment’s unpredictability and not to a computer’s capacity for ‘self-determination’.

Automation means that a task can be carried out without direct human intervention during its implementation.194 For that, however, the computational process needs to be conceptualised and configured by a human in the first place. AI systems can automate operations such as data processing or mathematical optimisation to the extent that such tasks can be modelled as computational processes executed without direct human intervention during the implementation phase.195 The conceptualisation phase includes identifying a problem to be solved through computation, problem abstraction (i.e. reducing a problem to the elements and relations necessary for understanding and solving it) and its formal representation (e.g. mathematical expressions). While computer programming is becoming increasingly automated, conceptualisation is carried out by humans, and it is difficult to see how computers could substitute for human intention, aspiration, insight and decision-making in the conception phase.196

In popular media, one comes across anthropomorphic depictions of AlphaGo, such as:

drawing on all its other training with millions of moves generated by games with itself, AlphaGo came to view Move 37 in a different way. It came to realize that, although no professional would play it, the move would likely prove quite successful. It discovered this for itself […] through its own process of introspection and analysis. […] AlphaGo learned to discover new strategies for itself, by playing millions of games between its neural networks, against themselves, and gradually improving. […] In other words, AlphaGo knew this was not a move that a professional Go player would make.197

While it is understandable that popular media outlets use language that appeals to a broad audience, such a depiction of AlphaGo can be confusing for the lay audience. The use of anthropomorphic language – ‘came to view’, ‘came to realise’, ‘discovered for itself’ – approximates the AI performance to cognitive processes and creates an impression of an independent agency. At the same time, researchers point out that vesting computers with ‘self-determination’, ‘independence’, ‘freedom’ and the ability to choose a course of action ‘for own reasons’198 is misleading199 and ‘potentially dangerous’.200 In technical terms, the system’s ‘realisation’ and ‘discovery’ consist in calculating the probability that a certain move will be successful, based on the given reward and value functions.

ML systems might come across as ‘non-deterministic’ or ‘random due to a lack of understanding of cause-and-effect relationships and a lack of resources for controlling sources of variability’.201 Such a perception of ML can illustrate the notion of ‘deterministic chaos’ where deterministic laws govern seemingly random systems.202 In reality, the term autonomy used in the context of computers ‘does not mean that machines are free in the choices that they make [because] the conditions for deciding on how to proceed are carefully set by human actors’.203

(ii) A ‘decision’ as the choice of an option from a set of possibilities

If AI systems do not have autonomy in decision-making in the sense of ‘self-determination’ free from the human-imposed constraints, what does a ‘decision’ mean in the ML context?

There is no particular technical definition of a ‘decision’ or ‘decision-making’ in ML. While the HLEG-AI defines AI as systems that can ‘reason’ and ‘decide’,204 it notes that ‘AI researchers use mostly the notion of rationality [to refer] to the ability to choose the best action to take in order to achieve a certain goal, given certain criteria to be optimized and the available resources’.205 Thus, in general terms, a ‘decision’ can then be understood as the choice of the most suitable option from a set of possibilities, given the instructions.206

(iii) ‘Decision-making’ of AI systems is a function of the decision-making of their designers

In a policy document, one comes across statements that ‘AI-based products can act autonomously by perceiving their environment and without following a set of pre-determined […] instructions, [whereby] their behaviour is constrained by the goal they are given and other relevant design choices made by their developers’.207 ‘Without following a set of pre-determined instructions’ means in this context that the human does not provide explicit instructions, e.g. ‘in situation A, perform action B’, but a system calculates which action is more optimal based on the given mathematical functions and data. One could say that the inferred ‘rules’ or ‘decisions’ as to which action to take are not provided explicitly and directly by the human user. For instance, instead of being programmed to ‘turn left if there is a wall’, a system’s performance is based on probability estimation, e.g. based on the training data, the probability of succeeding is 35 per cent if turning right and 65 per cent if turning left. Such probabilities result from complex mathematical calculations involving numerous features and variables guided by the objective function. While an ML model performs in the way it was trained to perform, its performance is not predefined ex ante explicitly and directly by the designers; instead, it is numerically derived from training data through mathematical optimisation, a process set up purposefully and methodically by humans.
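
As a rough illustration of the contrast (the sensor labels and probabilities below are invented for the example), an explicit rule spells the action out directly, whereas a trained policy merely selects the action with the highest estimated probability of success:

```python
# Illustrative contrast between an explicit, hand-written rule and a 'decision'
# derived from estimated probabilities (all values are invented for the example).

def explicit_rule(sensor_reading):
    """Rule-based system: the action is spelled out directly by the programmer."""
    if sensor_reading == "wall_ahead":
        return "turn_left"
    return "go_straight"

def learned_policy(sensor_reading):
    """Learning-based system: the action with the highest estimated probability
    of success is chosen; such probabilities would be derived from training data."""
    estimated_success = {
        "wall_ahead": {"turn_left": 0.65, "turn_right": 0.35},
        "clear_road": {"go_straight": 0.90, "turn_left": 0.10},
    }
    options = estimated_success[sensor_reading]
    return max(options, key=options.get)

print(explicit_rule("wall_ahead"))   # turn_left (directly programmed)
print(learned_policy("wall_ahead"))  # turn_left (highest estimated probability)
```

In neither case does the system ‘choose’ anything beyond what the programmed rule or the derived probability estimates dictate.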

Thus, contrary to the assumption that ML-based systems can have ‘freedom to decide which path to take to achieve the given goal’,208 there is no decision made by a computer in the sense of ‘freely choosing’ computational steps. A ‘decision’ as to whether this weight should be smaller or bigger than it was in the previous iteration depends on the optimisation algorithm.209 A ‘decision’ in the model optimisation is, thus, nothing more than purely mathematical calculations based on the given functions. In GP and EAs, what is ‘decided’ is which candidate solution will survive to the next generation or be mated for recombination or mutation. Such choices are made based on the selection operators and the fitness values of the population individuals (solution candidates).

A computer does not have a choice whether to perform computations or not, or whether to deviate from the given instructions. Where a random number generator (RNG) is used, there is no freedom for a computer to select a ‘random’ number other than what was ‘seeded’ by a human.210 Neither do quantum computers have the freedom to decide; they can only respond to signals from the external environment (and one might further ponder to what extent the environment is deterministic). In this regard, the depiction of ML techniques as ‘self-learning’ or ‘self-teaching’ networks211 is inaccurate.212

The bottom line is that the ‘decision-making’ of AI systems is a product of the decision-making of their designers. Consider an example where one tries to understand why a self-driving car caused an accident. As one commentator explains, the true answer will be not because a car ‘decided’ to do so, but because ‘it applied a transparent and deterministic computation using the values of its parameters, given its current input, and this determined its actions’, whereby ‘those particular parameters are the result of the model that was chosen, the data it was trained on, and the details of the learning algorithm that was used’.213

(iv) ML output is predictable

If ‘predictable’ means that something happens in a way that one expects,214 ML algorithms and models cannot be viewed as delivering ‘unpredictable’ outcomes. First of all, the task for which an algorithm is deployed or a model is developed is known upfront. Second, the process of deriving the output is also known and transparent. At the heart of ML is the process of optimising an objective function and, even if the exact numeric values optimising the objective function might be unknown ex ante, it is certain that such values will be derived based on input data according to an algorithm and predefined configurations, such as model hyperparameters in ANNs, or mutation and recombination operators in GP. Given that computers cannot violate instructions,215 there is nothing fundamentally unpredictable about the computational output. The application of randomisation does not make the outcome unpredictable. In ANNs, a good training setup would yield quite similar models, no matter how initial weights are randomised. In GP, the output can be more sensitive to randomisation,216 but it is ‘expected’ because it is known how the optima are to be found within the given search space.

If predictability is understood as ‘the state of knowing what something is like’,217 one could say that where ANNs or EAs are applied in technical design and engineering, the exact visual representation of a technical design, once the numeric values are converted into design attributes, might not be imaginable before implementing an algorithm and might perhaps cause a surprising effect.218 However, one would argue that such visual representation might be equally surprising if humans had completed the same computations on paper. To emphasise, everything that a computer does – be it inferring a function from the input data according to an objective function or evolving the fittest solution given the optimisation parameters – can be done by a human capable of performing mathematical calculations without a computer, except that it would take a prohibitively long time. If humans can perform the same computational operations and arrive at the same numeric values, one cannot say that such values are ‘unpredictable’.

(v) (Non-)deterministic algorithms in computer science

In computer science, an algorithm is deterministic if, given the same input, the output of implementing an algorithm is the same. A truly non-deterministic computational process would be one where choices among possible successive computational states are executed subject to no predefined conditions. As researchers point out, ‘in any random process the presence of deterministic computational structures can never be ruled out’ and ‘complete randomness is an information theory abstraction only’.219

Two methods are currently used to introduce randomness in computation – physical and computational.220 Physical methods use naturally occurring sources of entropy (such as nuclear decay or atmospheric noise);221 computational methods use an algorithm that generates a stream of numbers from an initial ‘seed’222 serving as an input to a mathematical function.223 Generators of the latter type are also known as ‘pseudo-random’ number generators, since the ‘seed’ is chosen by a human, while a mathematical function controls the process of generating numbers. This is the state-of-the-art technique most commonly applied in ML today. Both physical and computational methods are deterministic in the sense that the same input would produce the same outcome (which, in the case of natural sources of entropy, might be difficult or impossible to ensure in practice). In neither case can a computer ‘decide’ by itself to choose a different number.
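
The determinism of computational (pseudo-random) methods can be illustrated with a minimal linear congruential generator coded in Python (the constants are standard textbook values; the sketch is illustrative only):

    def lcg(seed, n, a=1664525, c=1013904223, m=2**32):
        """Generate n pseudo-random numbers in [0, 1) from an initial seed."""
        numbers, x = [], seed
        for _ in range(n):
            x = (a * x + c) % m   # the next 'random' number is fully determined
            numbers.append(x / m)
        return numbers

    print(lcg(seed=42, n=3))  # the same seed always yields the same sequence
    print(lcg(seed=43, n=3))  # a different seed yields a different, equally determined sequence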

The key reason for applying randomisation is that the route to the solution (optimum) can be shorter with different starting points.224 Therefore, when implementing an algorithm, researchers usually try different initial conditions because the search result can depend on the starting point.225 Very complicated problems (‘NP-hard problems’) cannot be solved with current computing power in a reasonable time. Non-deterministic algorithms do not guarantee finding the optimum, but they can provide a near-optimum solution within a suitable period. To emphasise, there is nothing extravagant about what an RNG does – programmers could select random numbers manually (by rolling dice, for example), but that would be overwhelming and inefficient, given how many random numbers might be required.

(vi) Randomisation and reproducibility

The output of implementing the same algorithm will differ if an RNG produces different numbers, which can occur if, for instance, an algorithm was executed at two different points in time and the timestamp served as the ‘seed’. However, if the function and the seed are identical, then the sequence of generated random numbers will be the same. Consider the example of initialising the weights of an ANN coded in Python as w = numpy.random.randn(m, n), where m and n denote the dimensions of the weight matrix. Even though the code does not specify the exact numbers, the computer will invoke a mathematical function from the library that initialises the weights in response to this instruction. On the outside, this may look like a ‘random’ process, but from the perspective of the library’s author or the designer of the random number generator, the process is neither random nor unexplainable. Libraries get modified and updated, which can be one of the factors of model irreproducibility. However, if the same library version is used and other conditions are observed, the same outcome can be obtained. RNGs are also broadly applied in GP, particularly in the initialisation of the population, the recombination of solutions, selection and mutation.226 All these instances are deterministic and reproducible as long as the RNG is known and the initial seed is fixed.
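
The point can be demonstrated with a minimal sketch using the NumPy library mentioned above (the matrix dimensions are hypothetical):

    import numpy

    numpy.random.seed(0)                  # fix the seed of the RNG
    w_first = numpy.random.randn(3, 2)    # initialise a 3x2 weight matrix

    numpy.random.seed(0)                  # reset to the same seed
    w_second = numpy.random.randn(3, 2)

    print(numpy.array_equal(w_first, w_second))  # True: the 'random' initialisation is reproducible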

Conditions of reproducibility of ML models are specific to the algorithm and technique applied. If one keeps due records of all relevant information – the algorithm’s code, the RNG used and its seed, etc. – the output of ML methods is completely reproducible. As for the training data, it might be sufficient to describe the training dataset (size, selection criteria, etc.) to build another model with performance (in terms of accuracy, precision, etc.) comparable to that of the original model, without providing the actual training dataset used to train the original model.

(vii) Randomisation, human control and researchers’ expectations of success

ML algorithms, such as backpropagation and EAs, are based on optimisation through iterative search guided by an objective function. In other words, if the optimisation time were unlimited, the optimum could be found irrespective of the particular random numbers – even random search would find the optimum if it had infinite time, provided such optimum or optima exist within the given search space. Even though the training of AlphaGo included randomisation, success was determined by how the computational process was set up by researchers, including the value and reward functions, according to which the probability of successful moves was estimated.

The relationship between randomisation and researchers’ expectations of success is not straightforward. Randomisation does not guarantee that the optimum will be found, no matter what. Applying the wrong ML method to a problem will not lead anywhere; an algorithm can get ‘stuck’ in a local optimum.227 Randomisation is a part of ML, but it requires expert knowledge of ML and understanding of the problem at hand to configure a computational process.228 In general, one can be more confident if a technique was successfully applied in the past to a problem with similar mathematical properties. Whether the use of randomisation can increase a researcher’s confidence in solving a problem is highly individual to the respective application, given that success usually depends on multiple factors.

c) Patent law uncertainties revisited

(i) Inventorship

In view of the foregoing, the contention that ‘neither programming nor data training will directly effectuate or shape the outcome of the search’229 does not hold. Quite the opposite: it is precisely the inputs and the overall computational set-up that ‘shape’ the outcome of ML. Given that randomisation does not indicate the absence of human guidance and decision-making in ML, the assumption that inventing is ‘handed over’ to non-deterministic ML systems appears unfounded.

Neither does there seem to be a reason why predictability of ML output should be a material factor for allocating inventor’s rights de lege ferenda in situations where application of ML might lead to a patentable invention. First, this criterion would be contrary to the very notion of invention as a solution to a problem230 – a goal that is not ‘immediately attainable’.231 Second, predictability of results and human guidance are not linearly related: even a carefully set up process, such as a scientific experiment conducted according to a study protocol, can produce results that might not be envisaged by researchers ex ante.232 The unpredictability of ML output could cast a normative doubt as to whether the inventor’s rights should be allocated to a human if it could indicate the computer’s ability to generate ‘innovative outcomes independently, rather than merely by following digital orders’.233 As clarified in this and the preceding sections, such contention is unfounded as far as current ML techniques are concerned. The fact that the precise numeric values are unknown before implementing an ML algorithm is not more relevant for the allocation of inventor’s rights than in cases where other mathematical optimisation methods or model-based techniques are applied in technical problem-solving.

(ii) The assessment of expectations for success as a factor of inventive step/non-obviousness

Expectations of success of a skilled person

In contrast to biological processes where, ‘as in a lottery game, the expectation of success always ranged irrationally from nil to high’,234 the use of randomisation in ML does not appear to exclude the possibility of assessing a skilled person’s expectations of success. AI systems are about configuring multiple elements in view of the pursued objective – once the elements are configured correctly, one would have reasonable expectations of success. Randomisation can affect the duration and the computational route but not the existence of optimum or optima within the given search space. Thus, expectations of success would depend on the ability of a skilled person to configure the ‘right’ computational setup to find them. Accordingly, in situations where the claimed invention could have been developed by applying ML, it needs to be assessed whether a skilled person235 following the prior art teaching would have applied an ML method in that particular way that would lead to the claimed invention. As will be discussed later, there can be substantial room for decision-making preceding the implementation of an algorithm, which might pose a challenge regarding how the obviousness of such decisions can be objectively assessed.236

A surprising effect of an invention

Whether an effect is surprising or expected needs to be assessed vis-à-vis the relevant prior art on a case-by-case basis. Therefore, it cannot be alleged across the board that ML techniques generate inventive (non-obvious) solutions because they are ‘surprising’ in the abstract. Under the EPO approach, a surprising effect in the case of chemical inventions can be demonstrated by way of a comparative test with the structurally closest prior art to the subject matter of the invention:237 the greater the structural difference, ‘the less unexpected are any differences in their effects’.238

An unexpected effect alone does not automatically establish the presence of an inventive step – other factors need to be considered, such as whether it would be obvious for the skilled person to combine the prior art teachings in expectation of an advantageous effect.239 An invention would lack an inventive step not only where the results are ‘clearly predictable’ but also where a skilled person would follow the teaching of the prior art with a reasonable expectation of success.240 For instance, where an invention claims numeric parameters of a technical device obtained by applying a state-of-the-art optimisation method, and where indications in the prior art suggest that ‘favourable results might be obtained by the method of calculation applied’, the effect of the claimed solution would not be considered ‘surprising’.241 For that, the applied method needs to be disclosed in a patent application or identified otherwise in the course of the patent examination.

While the result of applying known equations242 can be inferred relatively straightforwardly, ML techniques require multiple elements to be configured to accomplish a task. Even if it might be known from the prior art that a technical problem at issue could have been solved through ML, the assessment of inventive step would need to examine in light of individual circumstances whether ML would have been applied by an average skilled person in such a way that it would lead to the claimed invention.243

(iii) Implications for the sufficiency-of-disclosure requirement

As clarified, ML methods incorporating randomisation are, in principle, reproducible as long as their application is properly documented. In this view, the analogy between ML output and non-deterministic biological inventions is inapt and the reproducibility of ML models can, in principle, be ensured without a deposit system.

Assessment of sufficiency of disclosure is based on a patent application as a whole. Even if two ANN models trained with the same algorithm and hyperparameters might slightly differ due to the initial randomisation of weights, such variations within the claimed process would not be material for fulfilling the disclosure requirement, as long as ‘the claimed process reliably leads to the desired products’.244 Accordingly, for inventions comprising inter alia245 an ML model – e.g. where the functioning of a technical system is enabled through predictions generated by a trained ANN model246 – the exact numeric model would not need to be reproduced, provided that the disclosed process allows the skilled person to implement the claimed training process.247

4. ML is a ‘black box’

AI systems248 are often called ‘black boxes’ – and even a ‘magical black box of code’249 – that can generate output without providing ‘any information about what exactly makes them arrive at their predictions’.250

a) Patent law uncertainties

(i) Inventorship where an invention results from an ‘unexplainable’ or ‘unpredictable’ process

It is worth noting that, de lege lata, explainability of the inventive process has never been relevant for inventorship. Requiring it might be unnecessarily burdensome and, in some situations, unfeasible to fulfil and enforce. From a de lege ferenda perspective, the lack of explainability of AI performance could call into question the allocation of inventor’s rights to a human if it could indicate an autonomous, ‘supernatural agency’251 of computers.

As explained under Assumptions 1-3, to solve a problem through ML, the computational process needs to be thoughtfully staged by those developing and applying ML techniques. To settle any remaining doubts, it appears pertinent to clarify why some ML models are considered ‘black boxes’, given that the process of building them is known and understood.

(ii) The assessment of inventive step where an invention results from an ‘unexplainable’ ML process

The ‘black-box’ characterisation of ML raises the question of whether the likelihood that a skilled person applying ML would have arrived at the claimed invention can be objectively assessed if the workings of ML techniques cannot be explained.

(iii) The sufficiency-of-disclosure requirement in situations where an ‘unexplainable’ ML method is claimed within an invention

It has been argued that ‘the “black-box” nature of certain AI systems […] may make it challenging to provide a sufficiently clear and complete disclosure for the invention to be carried out by a [person skilled in the art]’.252 In this view, it needs to be clarified how the allegedly limited explainability of ML systems interacts with their reproducibility.

b) A technical perspective on explainability of ML

(i) What exactly is unexplainable about ML models?

There is no single specific definition of ‘explainability’253 of ML and, hence, the lack thereof. A ‘black box’ computational model is defined as one ‘for which the inputs and outputs are visible to the user, but its internal workings are not’.254 In the case of ML models, however, ‘visibility’ of internal workings is usually possible as the mathematical computations forming a model can be saved as a data file and examined. Moreover, commentators point out that, contrary to a popular view depicting ANNs ‘as black boxes which mystically determine complex patterns in data[, …] neural network designers typically perform extensive knowledge engineering’.255 While this quote dates back a couple of decades, contemporary AI systems are not mystical either – they are complex. The issue of explainability concerns the meaning of input-output correlations revealed by an algorithm. What needs to be explained are the factors behind the specific weighting of features, e.g. why feature x in the input data correlates with an output/target y. When a ‘black-box’ ML model is defined as a situation ‘where it is not possible to trace back to the reason for certain decisions’,256 it means that factors (‘reasons’) determining the weighting of features that underlie a prediction (‘decision’257) are not well understood. As trite as it might sound, statistical correlations do not denote causation.258 While model features are found mathematically, the semantics behind the numbers are not revealed unless one ‘translates’ an ANN into a meaningful narrative.

The depictions of ML techniques being ‘very successful from the accuracy point of view [but] very opaque in terms of understanding how they make decisions’259 create an impression that AI systems ‘know’ the reasons but keep them secret. However, the truth is that there is no ‘ready-made’ knowledge within ANNs – an ANN remains a collection of data points unless one constructs a sensible narrative. Constructing such a narrative requires a skilled researcher with the relevant knowledge, the willingness to treat an ML model as a ‘glass box’ rather than a ‘black box’,260 and access to the necessary information about the model design. The AlphaFold example mentioned earlier261 highlights the importance of interpreting the numeric output of ML – while the results revealed by AlphaFold can certainly advance scientific understanding, the factors determining protein folding remain ‘a black box’.262

Portrayals of AlphaGo making ‘a brilliant move’ that ‘no human could understand’263 are typical examples of mass media’s attempt to sensationalise. AlphaGo’s ‘brilliant’ move might not have been understood so far, but that does not mean it is fundamentally incomprehensible, because, in essence, it is based on mathematical optimisation. AlphaGo was described as the software that ‘modifies itself, [while] its analyses cannot be understood by humans even as it outstrips human performance’.264 Such portrayal is not quite accurate – everything AlphaGo does is doable by a human, albeit with a significant difference in the speed of performing computations. The concept of ‘understood’ might refer here to the ability to generalise the steps of AlphaGo into rules that explain why the moves optimised through RL ultimately led to the successful performance.265 Given the complexity of computations, it can take years for a human to understand them. Yet, it is not impossible.

The key difference between humans and AI systems such as AlphaGo is that, while a human can contemplate many steps in advance in a board game, a computer can calculate several times more steps beforehand and choose the one with the highest likelihood of success in the long run.266 A human would not have considered that landmark move, because the benefit of it comes many moves afterwards, and it is quite difficult for a human to calculate that many steps in advance. However, without the time limitation, and maybe with a piece of paper for note-taking, a human could, in principle, have arrived at that ‘brilliant move’ as well.

(ii) What causes computational complexity of ANNs?

Complexity is caused by the interaction of numerous (millions of) units (neurons) within the multi-layer non-linear structure of ANNs. Just as one can be puzzled when looking under the bonnet of a car, one might not readily make sense of the internal states of ANNs, which consist of vast quantities of numbers. Yet, both a car and an ANN model are products of deliberate and well thought-through engineering. The process of implementing a model is purely mathematical – ANNs are ‘just a set of matrix operations and finding derivatives’.267 For instance, there can be tens of thousands of input features for training a model for cat/dog image classification, while an ANN can contain tens of millions of weight and bias parameters derived when input features are passed through its layers, each containing hundreds of neurons.268
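
The scale involved can be illustrated with simple arithmetic; the layer sizes below are hypothetical but of the order of magnitude mentioned above:

    # A hypothetical fully connected network for cat/dog classification:
    # 30,000 input features, two hidden layers of 500 neurons, 2 output classes.
    layer_sizes = [30_000, 500, 500, 2]

    parameters = sum(
        n_in * n_out + n_out   # weights plus bias terms for each layer
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )
    print(parameters)  # 15,252,002 individually learned numeric values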

(iii) Explainability and causality

Computational complexity does not denote the lack of causality between human input and decision-making and the performance of an ML model. How a model is built is a transparent process. The answer to why an ML model was built and performs in a certain way ‘lies in the combination of the assumptions [made by researchers], the data it was trained on, and various decisions made about how to learn the parameters, including the randomness in the initialization’.269 In other words, from the perspective of designers of ML systems and data scientists, there is nothing fundamentally unexplainable about how correlations between the data points are found.270

(iv) Model explainability and reproducibility

The limited explainability of ML models does not affect their reproducibility – an ‘unexplainable’ model can be consistently reproduced. For that, one needs to obtain a detailed description of the applied method, the type of model and algorithm used, exact values of the hyperparameters (where they are required for a model), the seed for an RNG, the number of epochs, evaluation metrics, computer configurations (software, operating system, hardware), and so on. Ultimately one needs to know all decisions made to develop the first model, otherwise model reproduction might be difficult and unreliable.
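
As an illustration, such decisions can be recorded in a simple machine-readable form; the following Python sketch uses hypothetical values and is not intended as a complete reproducibility protocol:

    import json, platform, random

    run_record = {
        "algorithm": "backpropagation (stochastic gradient descent)",
        "model": "feed-forward ANN, layer sizes [30000, 500, 500, 2]",
        "hyperparameters": {"learning_rate": 0.01, "batch_size": 64, "epochs": 30},
        "rng": "Python 'random' module (Mersenne Twister)",
        "seed": 42,
        "evaluation_metric": "classification accuracy",
        "environment": {"python": platform.python_version(), "os": platform.platform()},
    }

    random.seed(run_record["seed"])  # the fixed seed makes the randomised steps repeatable
    with open("run_record.json", "w") as f:
        json.dump(run_record, f, indent=2)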

c) Patent law uncertainties revisited

(i) Implications for inventorship

Given that a ‘black box’ notion in the context of ML does not denote the absence of a causal link between the human input, the way computation is performed and ML outcome, there appears to be no reason to deny the human contribution to the conception and development of an invention, in situations where an underlying technical problem was solved by applying ML.

(ii) Implications for the assessment of inventive step

As clarified, the ‘black box’ characterisation of ML models does not denote the absence of deliberation and careful decision-making on the part of researchers designing a model and interpreting computational output. Accordingly, in situations where a technical problem could have been solved through ML, it needs to be assessed whether a skilled person would have designed a model and interpreted the computational outcome in the way leading to the claimed invention.

(iii) Implications for the sufficiency-of-disclosure requirement

Given that ML models can be consistently reproduced even if they might not be readily understood, the limited explainability of ML models does not present an insurmountable challenge for the sufficiency-of-disclosure requirement. Where an invention comprises inter alia an ML model, the earlier discussed conditions271 would need to be ensured to fulfil the disclosure requirement.

5. ML is a ‘general-purpose technology’

AI has been characterised as the ‘next general purpose technology’ (GPT),272 given its broad applicability and the potential to enable innovation across economic sectors.273 As a GPT, AI holds a promise to drive ‘sweeping transformative processes’274 and to generate ‘a wave of complementary innovations in a wide and ever expanding range of applications sectors’.275 As an ‘enabling technology’,276 it can open up new opportunities rather than offer final products and solutions.

a) The importance of access to inputs for ML from an innovation policy perspective

From an innovation policy perspective, a significant implication of the GPT nature of ML is that access to inputs for building ML models – which in turn form the basis for developing sector- and case-specific applications – plays a paramount role in realising the potential of ML as an enabling technology across the economic sectors. In this regard, exclusive IP rights – to an extent applicable to inputs for developing ML models and models themselves – and the way such rights are exercised, can have a sizable impact on the innovation processes enabled through ML.

In the context of enabling technologies (e.g. research tools), the question of whether patents facilitate or hinder innovation has been subject to a long-standing debate.277 The facilitating effect is associated with the role of patent rights as innovation incentives. The risk of a hampering effect has been hypothesised in the context of follow-on innovation, where a patented technology can be used as an input in knowledge creation. By definition, enabling technologies are broadly applicable innovation inputs.278 Consequently, where usage rights in such technologies cannot be allocated efficiently, the suboptimal realisation of the innovative potential is posited to entail a welfare loss. Furthermore, AI techniques such as ML are peculiar in that they consist of multiple elements, which sets a pre-condition for rights fragmentation and high transaction costs of their convergence.279

Against this backdrop, one might contemplate that exclusive IP rights in AI techniques might have a ‘stifling effect’ on AI-enabled innovation280 if the usage rights cannot be efficiently allocated. Accordingly, two key issues need to be clarified: (i) which constitutive elements of ML can function as ‘general-purpose’ building blocks; and (ii) whether such generic components might be subject to patent protection.281

b) A technical perspective on ML as a GPT

The characterisation of ML as a GPT is accurate in that ML, as a variety of computational model-based techniques, is broadly applicable to problem-solving across technological and engineering fields.282 At the heart of ML is optimisation.283 The ubiquitous284 nature of optimisation explains why the scope of ML applications is expansive.

The so-called ‘no free lunch theorem’ in ML285 postulates that there is no true ‘general-purpose’ ML method because no single algorithm can solve all ML problems better than any other algorithm. Some methods can be applied across a specific subset of problems sharing certain similar properties (e.g. a specific domain application, text and image recognition).

Some basic techniques and principles applied in ML can be viewed as ‘generic’. Examples include core algorithms such as gradient descent and backpropagation of errors that constitute the basis for the families of algorithms used in ML; basic mathematical tools, such as the derivation of a function; and methods of pre-processing training data, such as the methods for dimensionality reduction or outlier removal. Various software used to pre-process the training data is usually generic. It should be mentioned that the notion ‘generic’ does not imply that a method is guaranteed to work under any conditions; for instance, numerous dimensionality reduction methods exist, but only one might work for a given dataset and purpose. In GP and EAs, the crossover and mutation operators can be considered generic as they can be applied to a specific encoding of the solutions, regardless of the task to be solved. Such generic techniques are usually in the public domain.
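
The generic character of such core techniques can be illustrated with gradient descent applied to a deliberately simple objective function (a sketch for illustration only; real ML objectives involve millions of parameters rather than one):

    # Minimise f(w) = (w - 3)^2, whose derivative is f'(w) = 2 * (w - 3).
    w = 0.0               # starting point (could equally be randomised)
    learning_rate = 0.1

    for _ in range(100):
        gradient = 2 * (w - 3)
        w = w - learning_rate * gradient   # step against the gradient

    print(round(w, 4))  # approximately 3.0, the value minimising the objective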

For ANNs, two key components are training data and algorithms. Most algorithms are publicly available and shared as part of software libraries.286 In contrast, access to training data, especially held by private companies, can be tricky. Collaboration with private companies often involves contractual restrictions regarding sharing datasets used for developing ML models. Without sharing datasets, however, the research community might not be able to reproduce results.

ML models are usually developed to perform a narrow task and their generalisability depends on multiple factors. Under the so-called ‘transfer learning’ approach, an ANN model initially trained for one task might be re-used as a foundation for developing another model for a related but distinct task. Whether or not transfer learning can be a relevant and efficient solution is individual to an ML method and its particular application. As far as access is concerned, companies tend to hold ML models privately, either entirely or partially, while academics tend to make their work accessible.

ML is characterised by high modularity, as its application to a specific problem requires configuring multiple elements and combining various methods. If certain ‘ingredients’ are not available, they can usually be substituted with the available alternatives to achieve comparable results.287 Substitutability of training data is, however, most difficult due to the limited access.

c) Implications for patent policy and law revisited

As discussed, the core ‘building blocks’ of ML, except for training data, tend to be disclosed in academic publications,288 which can alleviate concerns regarding their accessibility. Nevertheless, given the characterisation of ML as a broadly applicable enabling technique, the patentability of ML systems and their constitutive elements de lege lata merits a closer look, especially as regards the scope of protection.

Under European patent law, the essential elements of AI – mathematical methods289 and computer programs290 – fall within the categories of ‘non-inventions’ for the lack of technical character.291 Besides, core AI applications based on predictive analytics or related to knowledge representation, reasoning, planning and scheduling292 might qualify as business methods and, hence, come within the scope of the excluded subject matter as well.293 The exclusion applies ‘to the extent to which a European patent application or European patent relates to such subject-matter or activities as such’.294 While such approach appears, in essence, balanced and reasonable, some uncertainty persists between the clear-cut cases,295 and the question of under what conditions computational methods should be considered ‘technical’ remains open-ended.296

In principle, ML algorithms and models can constitute only part of a patentable invention. According to the EPO Examination Guidelines, implementing a mathematical method on a computer would be sufficient to confer a technical character on the subject matter as a whole297 and, thus, overcome the exclusion. An analysis of whether the reference to implementation on a computer should suffice to overcome the exclusion under Art. 52(2)(a) of the EPC goes beyond the scope of this paper. The point worth highlighting is that the patent system’s goal of affording protection only to technical subject matter and the policy concern regarding the scope of patent rights in enabling technologies298 are related yet distinct issues. Adding the reference to a generic computer as a technical device to an independent patent claim directed to a ‘core’ ML method, which is otherwise excluded as a mathematical method,299 would not limit the scope of exclusive rights. In contrast, consider the distinction between patents for ‘core’ and ‘applied’ AI techniques: the former can claim a broadly applicable method of ML;300 the latter can be confined to its particular application in a technical use-case. The difference in terms of the scope of protection is palpable.

In the context of European patent law, this issue is reminiscent of the debate on absolute vs purpose-bound patent protection for chemical substances (in particular, DNA sequences), which remains unsettled both from a de lege lata301 and a de lege ferenda302 perspective. In the US, the connection between the excluded subject matter and the scope of patent protection is reflected in the ‘building blocks’ doctrine applied by the US Supreme Court. By excluding ‘abstract ideas’, the Court in Mayo303 and Alice304 intended to keep innovation inputs – ‘the “buildin[g] block[s]” of human ingenuity’305 – outside protection by exclusive rights. In this regard, the exclusion can be deemed instrumental for preserving ‘abstract intellectual concepts […] not patentable, as they are the basic tools of scientific and technological work’.306

While discoveries, scientific theories and mathematical methods are excluded from patentability for the lack of technical character, they also happen to be broadly applicable knowledge inputs akin to research tools necessary to create further knowledge. For the reasons better explained by economists,307 granting exclusive IP rights in such inputs might not be an optimal innovation policy from a welfare perspective.308 In this view, the present regime where an independent patent claim can be directed to an ML method as such, not limited to an application in a specific technical use-case, might need to be re-considered from a de lege ferenda perspective.309

Furthermore, the question of access to training data, which is often subject to factual control and trade secret protection, merits closer analysis. Overall, further research needs to examine how access measures should be optimally designed to remedy a market failure of data-sharing in the context of ML-enabled innovation.

6. ML is a ‘general-purpose method of invention’

Besides being characterised as GPT, AI – particularly DL – has been viewed as ‘a general-purpose method of invention’.310

a) Uncertainty regarding the assessment of inventive step

From a patent law perspective, the main implication of ML being characterised as a ‘method of invention’ is that the use of ML, as a computational problem-solving method, should be factored in the definition of a relevant skilled person311 when assessing the obviousness of inventions that could or would have been developed by applying ML.312 The skilled person is ‘free to employ the best means already available for his purposes […] provided this involves a choice from a multiplicity of possibilities’.313 Otherwise, the lack of alternatives may ‘create a “one-way-street” situation leading to predictable advantages which remain obvious in spite of the existence of some unexpected “bonus” effect’.314 Besides, where a skilled person would apply routine trial and error, the outcome is deemed obvious/lacking an inventive step.315 ML techniques involve trial and error316 and iterative search.317 According to Abbott, ‘as research is augmented and then automated by machines, the average worker will routinely generate patentable output’,318 whereby a low obviousness standard would entail a danger of the stifling effect on innovation.319 Furthermore, the attribute ‘general-purpose’ might suggest that ML algorithms can be applied ‘out of the box’ to different tasks.

As clarified under Assumption 2, the seeming simplicity of human input vis-à-vis computational complexity is not a reliable indicator of the obviousness of an invention. To clarify whether the application of ML techniques represents a ‘one-way street’ or involves only ‘routine’ trial and error, let us take a closer look at the decision-making involved in ML.

b) A technical perspective on ML as a ‘general-purpose method of invention’

(i) ML as a special-purpose technique

Contrary to the ‘general-purpose’ attribute, ML applications need to be ‘surgically altered or purpose-built’.320 This requires ‘lots of preparation by human researchers or engineers, special-purpose coding, special-purpose sets of training data, and a custom learning structure for each new problem domain’.321 ML models are only ‘as good as the assumptions that they were created with and the data that was used to train them’.322 As there is no ‘one-size-fits-all’ ML technique, choosing a method that can work better for a particular problem involves expert knowledge and insight. Most ML algorithms can work ‘out of the box’ – more precisely, ‘out of the library’323 – only for a rather constrained set of relatively simple problems. In some cases – such as image or speech recognition – only minimum adjustment might be necessary, while in other cases almost every aspect of an ML technique might need to be configured or fine-tuned.

(ii) Which decisions are crucial for the successful application of ML?

The saying ‘garbage in, garbage out’ holds in ML, as the whole preparation is more important than implementing an algorithm. Decisions during the preparation phase require a good grasp of a problem and adequate knowledge and experience. Which decisions specifically might be crucial for the successful application of ML is individual to a particular ML technique and a use case.

In ANNs, one needs to configure hyperparameters324 that would lead to the best performance of an algorithm and the most suitable model for a given problem. Usually, there is no prior knowledge about which hyperparameters are right for a particular task – as one commentator puts it: ‘Typically, the hyper-parameter exploration process is painstakingly manual, given that the search space is vast and evaluation of each configuration can be expensive’.325 Other aspects that involve decision-making include choosing training datasets, encoders, RNGs, etc. While simpler models tend to ‘underfit’, more complex ones tend to ‘overfit’.326 As suitable models are usually located somewhere in the middle of those two extremes, researchers need to try different models and compare them through cross-validation to find an optimal model for a particular problem.
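
A minimal sketch of such hyperparameter exploration, using the scikit-learn library and a synthetic dataset standing in for prepared training data (illustrative assumptions, not a recipe from the cited sources), could look as follows:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPClassifier

    # A hypothetical, synthetically generated classification dataset.
    X, y = make_classification(n_samples=200, n_features=20, random_state=0)

    # Candidate hyperparameter configurations to be compared via cross-validation.
    param_grid = {"hidden_layer_sizes": [(10,), (50,), (100,)], "alpha": [1e-4, 1e-2]}

    search = GridSearchCV(MLPClassifier(max_iter=1000, random_state=0), param_grid, cv=5)
    search.fit(X, y)
    print(search.best_params_)  # the configuration that generalised best across the folds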

In GP/EAs, one almost always needs to develop specific encodings for a problem representation and design a unique fitness function (a decisive part) and particular operators (recombination and mutation) that can work with that encoding.327 In the case of decision trees, configuration choices concern the depth of a tree, the number of trees, the size of each leaf, etc.

All relevant factors need to be considered when choosing, adjusting or designing an objective function. Fitness functions in GP or EAs are highly specific to the application, and their design usually requires a great deal of expertise. While boilerplate functions most commonly applied with ANNs exist, human influence manifests in the decision-making regarding the hyperparameters and data preparation. Where an existing objective function is used, it might need to be adjusted to a specific problem.

In sum, for a problem to be solved successfully through ML, one should have a good grasp of computational model optimisation, the state-of-the-art algorithms and the application domain, be able to understand and model a problem, design an objective function, specify the right success metric, select and prepare328 input data relevant for the objective, and make sense of the results. As with any software, the designer might not need to build all functionalities and constitutive elements of ML from scratch, but the results achieved by individual users applying ML can vary significantly depending on the purpose and skills. The importance of interpreting ML output should not be downplayed. At a basic level, the immediate output of ML is coded in binary digits. At a higher level, it can be represented as yes/no, numbers, options, words, etc. In any case, such output cannot be equated with ready-made knowledge or a problem solution.

(iii) ‘Average’ knowledge and skills in ML

ML methods are characterised by modularisation. Like a LEGO construction set, they involve multiple building blocks that can be configured in various ways depending on the objective. As with any engineering work, computational modelling requires intuition and insight to make the right decisions and choices, which come with experience. Setting up the configurations mentioned above is a true indicator of knowledge, skills and expertise in ML. Given the large variety of ML techniques, it is hard to define in the abstract what level of skills in ML can generally be considered ‘average’. This problem is compounded even more when it comes to a team of researchers, given that collaboration is quite common in ML.

Whether an average practitioner could accomplish the ‘right’ decision-making can only be considered vis-à-vis a particular problem solved through ML. For instance, while AlphaGo was developed based on state-of-the-art ML methods, the combination of techniques and configurations determined AlphaGo’s success.329 Would any person with average skills and knowledge in ML be able to come up with such configurations? In the authors’ view, if a Master’s degree in computer science is considered a measure of ‘average’ knowledge and skills in ML, this is unlikely to be enough to design and implement AlphaGo. In addition, non-human factors, such as access to data and hardware (i.e. computational resources to implement training in parallel within a reasonable time), should be taken into account when considering what can be achieved with ML techniques.

c) Implications for the assessment of inventive step revisited

Given that there can be considerable room for decision-making in applying even known ML methods, an invention developed through ML cannot be considered a mere ‘bonus’ obtained in a ‘one-way-street’ situation.330 Besides, it cannot be alleged across the board that ML can solve any problem with no or only trivial adjustment. Otherwise, we would not have incurable diseases, given that AI techniques have been applied in drug discovery and development for decades.331

Even though the implementation of an iterative search-based algorithm332 can be viewed as routine trial and error, it would be erroneous to reduce the entire problem-solving to the implementation phase.333 Rather, it should be considered in light of individual circumstances of a case – against the relevant prior art – whether an average skilled person would have come up with a particular set-up of an ML system that would lead to the claimed invention. The question of whether and how such decision-making can be ‘reverse-engineered’ from the final result and objectively assessed by patent examiners calls for a more detailed inquiry.

7. One ML algorithm – many inventions

a) The anticipation of technological singularity

Some legal scholars foresee ‘creative singularity in which computers overtake human inventors as the primary source of new discoveries’334 and assume that one AI algorithm can generate multiple inventions – which could be closely related or unconnected335 – each time it is implemented.336 The view that AI can yield a ‘surprisingly large number of inventions’337 echoes the notion of technological singularity. Besides, patent infringement would practically be excluded if alternative technical solutions could be ‘invented around’ effortlessly. The assumed prolificacy of AI algorithms calls into question the need for behavioural and economic incentives for innovation and the role of the patent system in this regard.

As discussed earlier, numeric output in ML can vary due to randomisation when one algorithm is implemented multiple times.338 However, what matters is whether such varying output can constitute distinct solutions substitutable in their quality. For instance, would multiple runs of the same evolutionary algorithm used by the NASA researchers339 generate disparate space antenna designs with comparable fitness and robustness in satisfying technical constraints?

b) A technical perspective on an algorithm’s prolificacy

(i) The variability of the output depends on a technique

The variability of the numeric output in ML depends on a particular method. In ANNs, the chances are high that, irrespective of the initial randomisation of weights, a good training setup would yield very similar models. Thus, a robust algorithm would generate a model that would perform with a comparable level of accuracy340 even if, due to the randomisation, the weights optimising the cost function can be slightly different each time the training algorithm is applied to the same training data. For instance, the second AlphaGo may not make every move exactly the same way as the first one, but the overall performance would most likely be the same.

In cases where GP/EAs are applied in technical design and engineering, the output variability would depend on different factors, including the number of iterations run by an algorithm, how much time is given to the algorithm to find the optimum, and the mutation rate. Where the RNG’s influence is enough to escape local optima,341 multiple executions of the same EA would generate the same results. Alternatively, if the influence of the RNG is insufficient or the search time is limited, multiple executions of an algorithm can generate different solutions. The two main reasons why an algorithm might generate different solutions are that the seed makes the algorithm ‘get stuck’ in local optima, or the time of algorithm execution is insufficient to reach the optimum.

In research papers, authors usually report the averages/median and a corresponding spread measure (standard deviation/interquartile range). Since the same algorithm is typically implemented many times with different seeds, researchers draw several samples from the algorithm’s performance distribution. The spread of that distribution is measured as the standard deviation of that sample. The smaller the spread, the less sensitive an algorithm is to the choice of the random numbers, and the more robust an algorithm is. Where randomisation is applied to various aspects of ML, it cannot be guaranteed that a model would perform exactly like the one in the previous independent run of an algorithm, but the average performance is a good indicator of what results can be expected.
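
The reporting convention described above can be illustrated with a short Python sketch (the run results are hypothetical):

    import statistics

    # Hypothetical best fitness values from ten runs of the same EA with different seeds.
    results = [0.91, 0.93, 0.90, 0.92, 0.94, 0.91, 0.93, 0.92, 0.90, 0.93]

    mean = statistics.mean(results)
    spread = statistics.stdev(results)
    print(f"{mean:.3f} +/- {spread:.3f}")  # a small spread indicates a robust algorithm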

(ii) The output substitutability

To answer whether the same genetic algorithm can generate multiple distinct solutions with comparable fitness, we first need to define the terms ‘global’ and ‘local’ optima in optimisation problems. Two essential components of an optimisation problem are a search space of potential solutions and an objective function that evaluates the quality of each candidate solution.342 Local optima are the extrema343 that minimise or maximise the objective function for a given region of the search space, while a global optimum is the extremum (minimum or maximum) of the objective function for the entire search space.344 Multiple global and local optima can exist in a given search space,345 depending on how the search space and the objective function are defined. Whether a local optimum can be a satisfactory solution can hinge on the individual needs in a particular case – in many cases, the ‘near-best solution’ can suffice.
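
The distinction can be illustrated with a deliberately simple one-dimensional objective and a greedy local search started from different points (a sketch with an invented function, not an example from the cited literature):

    import random

    def f(x):
        """An objective with a local minimum near x = 2 and the global minimum near x = -2."""
        return x**4 - 8 * x**2 + x

    def hill_climb(x, step=0.01, iterations=10_000):
        """Greedy local search: move to a neighbouring point only if it is better."""
        for _ in range(iterations):
            candidate = x + random.uniform(-step, step)
            if f(candidate) < f(x):
                x = candidate
        return x

    random.seed(0)
    for start in (-3.0, 3.0):
        x = hill_climb(start)
        print(f"start={start:+.1f} -> x={x:+.2f}, f(x)={f(x):.2f}")
    # Different starting points end up in different optima of the same search space.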

As noted earlier, in ANNs, different initial weights can result in slightly different models, but the chances are high that such models will perform with a comparable level of accuracy. In GP and EAs, different initial conditions can lead to different local or global optima. As far as the NASA antenna is concerned, the researchers did not perform statistical sampling across multiple runs of the algorithms, as the goal was to design antennas that would comply with the specifications of the NASA mission.346 Theoretically, the two algorithms applied by the NASA researchers could have identified more solutions comparable in terms of technical parameter optimisation. Such a possibility would depend on the algorithm used, the configurations and the modelling of the fitness functions. However, even if more solutions were identified, one could not say that an algorithm ‘invented’ multiple inventions. What happens is that an algorithm finds, through the iterative search, the optima that already exist within a search space.

c) Implications for patent law

Even though some optimisation techniques can identify several optima within a search space, the assumption that an algorithm can yield a ‘large number of inventions’ appears to be an exaggeration. Quite to the contrary, in some cases – for instance, in the field of molecular design – ML models can produce ‘to a large degree invalid’ output.347

Computational model-based techniques can make problem-solving more efficient compared with brute force computation, random screening or blunt trial and error. However, this does not necessarily mean that the application of ML techniques has a sizable cost-saving effect.348 To date, the relationship between patents, AI and innovation remains underexplored. At the same time, without a detailed understanding of ML-based business models, one cannot argue that the investment amortisation function of patent law becomes redundant.

III. Synthesis and research outlook

1. ML as computational techniques of problem-solving

a) ML systems are, technically, tools

The question of whether AI in its most advanced forms is a tool or ‘more than a tool’ obviously depends on the definition of a ‘tool’. Etymologically and in common parlance, tools are means that assist in performing an activity.349 All tools are considered to share one attribute – they require human guidance.350 As long as AI systems are not self-organising systems capable of performing automated tasks without pre-programmed instructions or disobeying such instructions, they are, technically, tools. More precisely, techniques commonly referred to as AI are computational model-based methods that can be applied in technical optimisation use cases.

Importantly, AI systems can replace humans in implementing computation not because it is unfeasible for a human to perform the same computational operations but because it is rather impractical. Humans are slower in computational tasks – some computations may take years for a human, while a computer can perform them in seconds. Automation is inevitable. Just as it is more practical to use cranes and excavators to move stones and build constructions, today we use search engines because it is more practical than gathering physical copies and searching them manually.351 However, when computers are used to perform search and data processing tasks, it is the implementation of computation that is automated, not human thinking, cognition or problem-solving. Even where certain problem-solving segments – such as search, optimisation and modelling – can be automated because they can be expressed as computational tasks, their implementation on a computer should not be equated with the automation of problem-solving as such.

b) Looking beyond the hype

Popular depictions of AI are often saturated with hyped perceptions and might vest in AI systems qualities and capabilities that still need to be proved by research in AI. For instance, in the case of AI translation, some assume that GPT3 can write poetry; others argue that ‘despite the teasing of mainstream press headlines to the contrary, GPT-3 doesn’t signal the beginning of the end for humanity’.352 Even though machine translation has undergone significant developments, it has still not become a mundane task. Even seemingly factual information, such as sports or stock market reports, might contain multi-word expressions353 and figurative language354 – i.e. rhetorical and stylistic markers – that the best NLP and machine translation systems currently available cannot always deal with.355

The perception of computers’ superiority in problem-solving might be prompted by the widely publicised victories of AI systems in games, which, according to some commentators, ‘left humans in its dust’356 and ‘demonstrate the futility of competition between humankind and machine’.357 Others, however, have not viewed the capacity of Deep Blue to beat a human at chess as ‘more portentous than that a tractor can beat a human at ploughing’358 and consider AlphaGo’s accomplishments as illustrating Level 1 of AI where ‘engineered intelligence’ can achieve ‘technically defined goals’ but is not capable of human-like reasoning.359

What the victory of AlphaGo can tell us about human intelligence is that a computer can efficiently perform calculations that would take a lifetime for a human without a computer. It also tells us that a human can be intelligent enough to model those processes: in essence a group of humans came up with a way to train AlphaGo by using certain rules and methods. However, it does not tell us what human intelligence is and how it is developed, e.g. whether the human brain computes some loss function and builds a numeric model when learning to play chess. One cannot deduce superiority because a computer can outperform a human in playing a specific game.

c) Comparing ‘apples to apples’

AI can augment human capacity to solve problems rather than replace a human. Even though races between humans and horses360 can be entertaining, the two are obviously incomparable, while a horse’s outperformance cannot be seriously taken as a sign of the apocalypse of the human race. If we wanted to compare ‘apples to apples’, the competition would need to be arranged between ML teams developing systems playing chess, Go, etc. The same can be said about the ‘Humies’ competition held annually as part of the Genetic and Evolutionary Computation Conference.361 The concept ‘human-competitive performance’ is deceptive, as long as a human orchestrates the implementation of GP/EAs through task-specific preparation. It would be more sensible to compare the results achieved by humans using GP vis-à-vis those achieved by humans not applying GP or applying other techniques.

d) A matter of conceptual perspective?

The question of whether AI might be ‘more than a tool’ pertains to the central inquiry in the philosophy of AI and cognitive science, namely: where algorithms can express mental states and processes, do computers executing such algorithms merely simulate mental states and processes, or do they exhibit them?362 Research in the field of human-computer interaction and cognitive science produced several approaches illustrating how computer-mediated cognitive processes can be conceptualised. These include the theories of ‘extended mind’363 and ‘distributed cognition’;364 and concepts such as ‘blended cognition’,365 ‘cognitive systems’,366 ‘cognitive artifacts’,367 ‘instruments of mind’,368 ‘things that make us smart’,369 and ‘tools for thought’.370 While an in-depth discussion of these frameworks goes beyond the scope of this paper, the diversity of these perspectives suggests that the questions of whether and where to draw a boundary between cognitive agents and cognitive artefacts depends on the frame of reference and can even be viewed as a matter of ‘one’s belief system’.371

What is clear is that AI systems are based on computational modelling, which is by definition an approximation and simplification of the phenomena that are being modelled (e.g. cognitive functions and processes).372 Research in AI is concerned with a ‘computational understanding of what is commonly called intelligent behavior, and with the creation of artefacts that exhibit such behavior’.373 Thus, cognitive functions can be reproduced in computational artefacts to the extent to which they are comprehended by their (human) designers and engineers.374 Currently, there are significant gaps in our understanding of human cognition, including questions such as ‘what is the nature of consciousness’, ‘what is the neurological basis of creativity’,375 ‘how exactly biological processes interact with cognitive phenomena’,376 and ‘whether cognitive processes are necessarily computational in nature’.377 Success in computational modelling of cognitive processes will depend on bridging those gaps.

In general, views on when ‘high-level machine intelligence’ (‘Strong AI’) can be achieved diverge substantially within the ML community.378 Some researchers assume that ‘all human thought, or at least intelligent thought, can be reduced to computation, and [hence, computers can] exhibit human-like intelligence, and eventually intelligence even superior to that of humans’.379 Others argue that ‘not all intelligent human thought and its consequent behavior can be reduced to simple computation, or even logic’.380

As far as the application of ML in technical problem-solving is concerned, a 2003 article on technical design emphasised that ‘computers are used only as an aid’ in technical design and engineering, which ‘involves extensive decision-making and subjective evaluation, activities that are aided greatly by using computers, but that are generally carried out by computers under human direction’.381 In the authors’ view, this holds good for contemporary ML systems, while predicting future AI developments is akin to gazing into a crystal ball.

2. Implications for patent law

Whether AI is a tool or more than a tool can be a relevant factor for the definition of the inventor and the allocation of the inventor’s rights from a deontological perspective. In light of the technical explanations provided in this paper, hardly any uncertainty persists in situations where ML techniques are applied in problem-solving, either regarding the fulfilment of the requirement of intellectual engagement in the conception of an invention de lege lata,382 or regarding the appropriateness of such a requirement de lege ferenda. Like a sword that never kills by itself but is a tool in the killer’s hand,383 computational modelling and the computers executing such models do not invent by themselves but are powerful problem-solving tools.384 As emphasised throughout this paper, the successful performance of ML is a function of the knowledge, skills and expertise of the humans who configure, modulate and apply ML techniques.

As far as the economic justification of patents is concerned, it is not prima facie evident that the application of ML in technical problem-solving renders innovation activity so effortless and low-cost that the investment amortisation function of patent rights becomes irrelevant.385 More economic insight into the role of patent protection for AI-incorporating or AI-induced inventions is needed, as statistical data showing the increasing386 number of AI-related patents do not, by themselves, support any conclusion on the causal relationship between patents and innovation.

On balance, the implications of ML for the patent system in its core tenets appear, on closer examination, less revolutionary387 than is often posited. Some of the identified issues – in particular, how ML techniques should be factored into the definition of the skilled person, the patentability of purpose-unbound ML methods and, more broadly, the role of AI-related patents in innovation388 – need to be examined in greater depth. While this paper has mainly drawn on European patent law and the patent examination practice of the European Patent Office, the analysis concerns the core aspects of the patent system and, thus, our findings can inform policy and legal discussions on patent law and AI in other jurisdictions.

ACKNOWLEDGEMENTS

The authors thank Preston Richard for his review and valuable comments on the draft. The feedback of two anonymous reviewers and the editorial assistance of GRUR International are much appreciated. Any errors in the legal analysis remain solely those of Daria Kim.

Abbreviations

     
  • AI – artificial intelligence

  • ANN – artificial neural network

  • EPC – European Patent Convention

  • GPT – general purpose technology

  • GPT-3 – Generative Pre-trained Transformer 3

  • AI HLEG – High-Level Expert Group on Artificial Intelligence

  • RNG – random number generator

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.