Decentralized and Collaborative AI: How Microsoft Research is Using Blockchains to Build More Transparent Machine Learning Models
Recently, AI researchers from Microsoft open sourced the Decentralized & Collaborative AI on Blockchain project that enables the implementation of decentralized machine learning models based on blockchain technologies.
The biggest challenge of the next decade of artificial intelligence(AI) is going to be based on determining whether data and intelligence remains a privilege of a handful of large technology companies based in a few countries or it can be democratized to the rest of the world. The centralized nature of machine learning and AI applications foments a “rich get richer” dynamic in which only the companies with access to high quality datasets and data science talent are able to take advantage of AI opportunities. The field of decentralized AI is one of the leading trends that is looking to address this challenge. Although still impractical for many real world implementations, the decentralized AI space have been steadily gaining traction within the AI community. Recently, AI researchers from Microsoft open sourced the Decentralized & Collaborative AI on Blockchain project that enables the implementation of decentralized machine learning models based on blockchain technologies.
From training to optimization, every single step in the lifecycle of machine learning models can be improved with certain degrees of decentralization. Let’s take the example of a simple prediction model that is designed to forecast sales of a given product. In the traditional centralized approach, we need to implicitly trust a group of data scientists to select the right neural network architecture, build the correct datasets, train the model efficiently, tune the hyperparameters in order to optimize performance and a dozen of other tasks. After all that, we can’t really be sure the model is performing optimally. This problem gets even worse once we start introducing new versions of the model as its nearly impossible to correlate specific changes with a particular performance. Decentralized AI methods look to simplify this problem by enabling transparent accountability and organic collaboration across all stages of the machine learning lifecycle.
The raise in popularity and maturity of blockchain technologies has been an important catalyzer for decentralized AI architectures. The immutability and distributed consensus models of blockchain technologies intrinsically introduce a level of trust and enables collaborative dynamics in machine learning applications. The Microsoft Research team leveraged some of the native capabilities of blockchain technologies to enable different levels of decentralization in machine learning models.
Microsoft Decentralized and Collaborative AI
Decentralized & Collaborative AI on Blockchain(DCAI) is a framework to host and train machine learning models on a blockchain infrastructure. The current version of DCAI is contrained to the Ethereum blockchain and leverages smart contracts as the main encapsulation mechanism for machine learning programs. Conceptually, smart contracts are immutable programs that contain business logic that executes on a blockchain runtime. In the case of the DCAI framework, smart contracts are used to enable decentralized training mechanisms in machine learning models.
From a functional standpoint, DCAI structures the process of adding data/training to a machine learning model based on three main components:
- The Incentive Mechanism: This component is designed to encourage the contribution of high quality data. The Incentive Mechanisms is responsible for validating the transaction, for instance, in some cases a “stake” or deposit is required.
- The DataHandler: This component stores data and meta-data on the blockchain. This ensures that it is accessible for all future uses, not limited to this smart contract.
- The Model: This component encapsulates a specific machine learning model which is updated according to predefined training algorithms.
One of the fundamental challenges of decentralized AI application is to rely on the correct incentive mechanisms to encourage different parties to contribute new datasets or train machine learning learnings. In the current version, DCAI relies on two main incentive models:
- Gamified: Using this incentive mechanism, data contributors can earn points and badges when other contributors validate their contributions. This proposal relies solely on the willingness of contributors to collaborate for a common good — the betterment of the model.
- Prediction Market–Based: In this model, Contributors get rewarded if their contribution improves the performance of the model when evaluated using a specific test set.
The following animation illustrates the incentive mechanism in a sentiment classification model for IMDB reviews. The participants that contribute high quality datasets are able to profit based on the performance of the model while the parties whose contributions were not effective loose their funds.
From the programming model perspective, DCAI abstracts the training of machine learning models via smart contracts that look like the following:
DCAI is still in experimental stages but already introduces important benefits to AI models:
- Accountability: DCAI maintains immutable records of the datasets and the performance of the models in the Ethereum blockchain.
- Data Reusability: DCAI Data Handlers records training datasets into the Ethereum blockchain for future use.
- Collaboration: DCAI’s Invective Mechanisms creates a feasible model for collaboration on the training of machine learning models.
Other Interesting Decentralized AI Initiatives
DCAI is not the only relevant initiative in the decentralized AI space. While decentralized AI remains a nascent and still experimental market, there are several initiatives that have gained some notoriety beyond the research phase:
- SingularityNet: SingularityNet is, arguably, the most ambitious company in the decentralized AI space. Famous for powering the popular Sophia robot, SingularityNet is looking to introduce decentralization across all aspects of the AI lifecycle. Technically, SingularityNet is a platform that enables the implementation and consumption of AI services in a decentralized model. Built on the Ethereum blockchain, SingularityNet provides a model in which different participants in the network are incentivized for the implementation or usage of AI services.
- Ocean Protocol: The Ocean Protocol provides a decentralized network of data providers and consumers that enables the implementation and usage of AI applications. Ocean exposes many of the common infrastructure elements of AI applications such as storage, compute or algorithms via atokenized service layer that abstract the core building blocks of decentralized AI programs. Ocean Protocol recently announced a token sale via the exclusive token launch platform CoinList.
- Erasure: Created by the innovative hedge fund Numerai, Erasure provides a decentralized protocol for creating and running predictive models. The goal of Erasure is to provide a decentralized marketplace in which data scientists can upload predictions based on available data, stake those predictions using crypto tokens and earn rewards based on the performance of the prediction. The fact that Erasure is powering the data science interactions at Numerai shows up in the simplicity and practical applicability of the protocol.
- OpenMined: OpenMined is one of the most active projects in the decentralized AI market. More than a platform, OpenMined is an ecosystem of tools and frameworks for the implementation of decentralized AI applications. OpenMined has been able to build a very active developer community and provide seamless integration with mainstream machine learning technology stacks.
Microsoft is one of the market leaders in both machine learning and blockchain technologies. With Microsoft’s backing, initiatives such as DCAI have an opportunity to bring more transparent, accountable machine learning models to a large number of users.
Jesus RodriguezCEO of IntoTheBlock, Chief Scientist at Invector Labs, Founder of The Sequence AI Newsletter, Guest Lecturer at Columbia, Investor, Author, AI Fanatic.