When It’s Your Data But Another’s Stack, Who Owns The Trained AI Model?

Cloud-based machine learning algorithms, made available as a service, have opened up the world of artificial intelligence to companies without the resources to develop their own AI models in-house. Tech companies that provide these services promise to help a company extract insights from its unique customer, employee, product, business process, and other data, and to use those insights to improve decisions, recommendations, and predictions, all without an army of data scientists and full stack developers. Simply open an account, provide data to the service’s algorithms, train and test an algorithm, and then incorporate the final model into the company’s toolbox.

While it seems reasonable to assume a company owns a model it develops with its own data, even one based on an algorithm residing on another’s platform, not every service in the industry treats ownership that way. Why this matters is simple: a company’s model (characterized in part by model parameters, network architecture, and architecture-specific hyperparameters associated with the model) may provide the company with an advantage over competitors. For instance, the company may have unique and proprietary data that its competitors do not have. If a company wants to extract the most value from its data, it should take steps to protect not only its valuable data but also the models created based on that data.
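To make concrete what is at stake, here is a minimal Python sketch of the three pieces that characterize a trained model and of exporting them so the company holds its own copy. Everything in it is an assumption for illustration: the `TrainedModel` class, its field names, the sample values, and the `export_model` and `import_model` helpers are hypothetical and do not correspond to any particular cloud service’s API.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class TrainedModel:
    """Hypothetical container for what characterizes a trained model."""
    architecture: List[int]             # e.g. number of units in each layer
    hyperparameters: Dict[str, float]   # settings chosen before training
    parameters: List[List[float]]       # weights learned from the company's data


def export_model(model: TrainedModel) -> str:
    """Serialize the model to JSON so a copy can be kept off the platform."""
    return json.dumps(asdict(model), sort_keys=True)


def import_model(payload: str) -> TrainedModel:
    """Rebuild the model from a previously exported JSON string."""
    return TrainedModel(**json.loads(payload))


# Illustrative values only; a real model would have far more parameters.
model = TrainedModel(
    architecture=[3, 2, 1],
    hyperparameters={"learning_rate": 0.01, "dropout": 0.5},
    parameters=[[0.12, -0.40, 0.33], [0.05, 0.91]],
)
payload = export_model(model)
restored = import_model(payload)
```

Whether a service lets a customer export these pieces at all, and who holds rights in them afterward, is exactly what the applicable agreements decide.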

How does a company know that it has not given away rights to the data it uploads to another’s cloud server, and that it owns the models it creates from that data? Conversely, how can a company confirm the cloud-based machine learning service has not reserved any rights to the model and data for its own use? The answer, of course, is likely embedded in the multiple terms of service, privacy policies, and user license agreements that apply to the use of the service. If important provisions are missing, vague, or otherwise unfavorable, a company may want to look at alternative cloud-based platforms.

Consider the following example. Suppose a company wants to develop an AI model to improve an internal production process, one the company has enhanced over the years and that gives it a competitive advantage over others. Maybe its unique data set derives from a trade secret process or reflects expertise that its competitors could not easily replicate. With data in hand, the company enters into an agreement with a cloud-based machine learning service, uploads its data, and builds a unique model from the service’s many AI technologies, such as natural language processing (NLP), computer vision classifiers, and supervised learning tools. Once the best algorithms are selected, the data is used to train them and a model is created. The model can then be used in the company’s operations to improve efficiency and cut costs.

Now let us assume the cloud service provider’s terms of service (TOS) state something like the following hypothetical:

“This agreement does not impliedly or otherwise grant either party any rights in or to the other’s content, or in or to any of the other’s trade secrets or rights under intellectual property laws. The parties acknowledge and agree that Company owns all of its existing and future intellectual property and other rights in and concerning its data, the applications or models Company creates using the services, and Company’s project information provided as part of using the service, and Service owns all of its existing and future intellectual property and other rights in and to the services and software downloaded by Company to access the services. Service will not access or use Company’s data, except as necessary to provide the services to Company.”

These terms would appear to generally protect certain of the company’s rights and interests in its data and any models created using that data. The terms further indicate the machine learning service will not use the company’s data, or the model trained using the data, except to provide the service. That last part, the exception, needs careful attention, because a provider can define the services it performs very broadly.

Now consider the following additional hypothetical TOS:

“Company acknowledges that Service may access Company’s data submitted to the service for the purpose of developing and improving the service, and any other of Service’s current, future, similar, or related services, and Company agrees to grant Service, its licensees, affiliates, assigns, and agents an irrevocable, perpetual right and permission to use Company’s data, because without those rights and permission Service cannot provide or offer the services to Company.”

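The company may not be comfortable agreeing to those terms, which grant an irrevocable, perpetual right to use its data to the service and to the service’s licensees, affiliates, assigns, and agents, unless the terms are superseded by other, more favorable terms in another applicable agreement related to using the cloud-based service.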

So while AI may be “the new electricity” powering large portions of the tech sector today, data is an important commodity all on its own, and so are the models behind an AI company’s products. Don’t forget to review the fine print before uploading company data to a cloud-based machine learning service.

Evaluating and Valuing an AI Business: Don’t Forget the IP

After record-breaking funding and deals involving artificial intelligence startups in 2017, it may be tempting to invest in the next AI business or business idea without a close look beyond a company’s data, products, user base, and talent. Indeed, big tech companies seem willing to acquire, and investors seem happy to invest in, AI startups even before the founders have built anything. Defensible business valuations, however, involve many more factors, all of which need careful consideration when planning a new AI business or investing in one. One factor that should never be overlooked is a company’s actual or potential intellectual property rights underpinning its products.

Last year, Andrew Ng (of Coursera and Stanford; formerly Baidu and Google Brain) spoke about a Data-Product-Users model for evaluating whether an AI business is “defensible.” In this model, data holds a prominent position because information extracted from data drives development of products, which involve algorithms and networks trained using the data. Products in turn attract users who engage with the products and generate even more data.

While an AI startup’s data, and its ability to accumulate data, will remain a key valuation factor for investors, excellent products and product ideas are crucial for long-term data generation and growth. Thus, for an AI business to be defensible in today’s hot AI market, its products, more than its data, need to be defensible. One way to accomplish that is through patents.

It can be a challenge, though, to obtain patents for certain AI technologies. That’s partly because application stack developers and network architects rely on open source software and in-licensed third-party hardware tools with known utilities. Publicly disclosing information about products too early, and publishing novel solutions to problems encountered in their development, including descriptions of algorithms and networks and their performance and accuracy, can also hinder a company’s ability to protect product-specific IP rights around the world. US federal court decisions and US Patent and Trademark Office proceedings can also be obstacles to obtaining and defending software-related patents (as discussed here). Even so, seeking patents (as well as carefully conceived brands and associated trademarks for products) is one of the best options for demonstrating to potential investors that a company’s products or product ideas are defensible and can survive in a competitive market.

Patents, of course, are important not just for AI startups but also for established tech companies that acquire startups. IBM, for example, reportedly obtained or acquired about 1,400 patents in artificial intelligence in 2017. Amazon, Cisco, Google, and Microsoft were also among the top companies receiving machine learning patents in 2017 (as discussed here).

Patents may never generate direct revenues for an AI business the way a company’s products can (unless the company can find willing licensees for its patents). But protecting the IP aspects of a product’s core technology can pay dividends in other ways, and thus adds value. So when brainstorming ideas for your company’s next AI product or considering possible investment targets involving AI technologies, don’t forget to consider whether the idea or investment opportunity has any IP associated with the AI.