Forest AI came from the question: "What if the 175 terawatt-hours going to Bitcoin mining were used for AI innovation?"

Abstract

Forest is an AI-focused innovation engine that builds on the positive lessons of Bitcoin mining while avoiding the tokenomics pitfalls that allow oligarchies of large token holders to control networks to their own benefit, as in Bittensor.

In Forest, all services are available for purchase by real-world clients on day one. This allows us to move away from corruption-prone stake-weighted voting and instead distribute tokens based on verified blockchain data on who attracts the most new customers to the network.

Furthermore, we ensure fair competition by using cheat-proof decentralized AI benchmarking in the reward function and by slashing collateral for attempted cheating.

To ensure long-term economic viability, the network treasury earns a commission on revenue generated by participants.

Network Actors

Customers: Purchase AI services and influence token rewards. The protocols and providers with the most customers get the most token emissions.

Providers: Compete to offer the best AI services within a protocol and earn tokens based on performance.

Validators: Evaluate providers' performance and ensure cross-compatibility.

Protocols: Standardize competition over an AI task. Define API standards for providers and quality evaluation criteria for validators.

Protocol Admin: Defines the protocol's goal and hyperparameters.

Root Contract: Smart contract that creates and distributes tokens to protocols based on their customer revenue.

Token flow

Providers must stake the token as collateral in each protocol they want to compete in and earn rewards from.

Validators must stake the token as collateral in each protocol they want to score and earn rewards from.

The stake ensures that validators take their task seriously and gives the DAO something to slash in defense against network attacks.

Token Emissions

Each epoch 1000 tokens are generated
Each Protocol gets tokens proportional to its share of Sales Fees
Each Provider in a protocol gets tokens proportionally to AI performance score

The example above is simplified for illustrative purposes. Protocol administrators can define exactly how tokens are distributed among all participants: Validators, Providers and Admins. For exact details see the /contracts directory.
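The simplified emission rule can be sketched in a few lines of Python. This is only an illustration of the two proportional splits described above, not the contract logic in /contracts; the function names and the example protocols, providers and numbers are hypothetical.

```python
from typing import Dict

EPOCH_EMISSION = 1000.0  # tokens per epoch, taken from the simplified example


def split_proportionally(total: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Split `total` across keys in proportion to their non-negative weights."""
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        # Nothing to weight by: nobody receives tokens in this sketch.
        return {key: 0.0 for key in weights}
    return {key: total * weight / weight_sum for key, weight in weights.items()}


def epoch_emissions(
    sales_fees: Dict[str, float],
    scores: Dict[str, Dict[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Return tokens per provider, grouped by protocol, for one epoch."""
    # Step 1: each protocol gets a share proportional to its sales fees.
    per_protocol = split_proportionally(EPOCH_EMISSION, sales_fees)
    # Step 2: inside a protocol, providers share proportionally to their score.
    return {
        protocol: split_proportionally(per_protocol[protocol], scores.get(protocol, {}))
        for protocol in sales_fees
    }


# Hypothetical inputs: two protocols and three providers.
example = epoch_emissions(
    sales_fees={"translation": 300.0, "codegen": 100.0},
    scores={
        "translation": {"alice": 0.75, "bob": 0.25},
        "codegen": {"carol": 1.0},
    },
)
```

In this example the translation protocol earns three quarters of the sales fees, so it receives 750 of the 1000 tokens, which its two providers then split according to their scores.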

Validation

Every epoch, inside each protocol, the participating validators score the performance of all participating providers. The scoring process is as follows.

Validators generate new test data locally and keep it secret
At a random time within the epoch, validators purchase services from each provider and send test prompts
Providers immediately reply with a signed response to each prompt
Validators score each response relative to the other participating providers
During the Voting Window, validators hash their score vector and submit it on-chain
When the Reveal Window opens, validators reveal the raw scores for each of the tests they have performed

Provider Registration Process

%%{init: {'flowchart': {'nodeSpacing': 60, 'rankSpacing': 50}}}%%
flowchart LR
    subgraph Blockchain[Blockchain]
        direction TB
        A[Provider registers with a protocol]:::color2
        subgraph PaymentGroup[ ]
            direction LR
            A1[Pays Protocol Fee]:::color2 
            A2[Submits Collateral for Stake]:::color2
        end
        A --> A1
        A --> A2
    end

    A1 --> C
    A2 --> C

    subgraph Process[Service Validation Process off Chain]
        direction TB
        C[Testing]:::colorCDE 
        D[Engage as Pseudonymous Customers to Validate]:::colorCDE
        E[Assess the Performance]:::colorCDE
        H[Randomized Monthly Reassessment]:::color8
        C --> D
        D --> E
        H --> E 
    end

    subgraph Outcome[Validation Outcome on Chain]
        direction TB
        F[Provider Penalized and Deregistered]:::colorFG 
        G[Provider Successfully Verified & Authorized]:::colorFG 
    end

    E -->|Scores Below Threshold| F
    E -->|Scores Satisfactory| G

    H --> G

    linkStyle default curve: linear;
    
    classDef color2 fill:#FFABAB,stroke:#000,stroke-width:2px;
    classDef colorCDE fill:#D5AAFF,stroke:#000,stroke-width:2px;
    classDef colorFG fill:#B9FBC0,stroke:#000,stroke-width:2px;
    classDef color8 fill:#FFCBCB,stroke:#000,stroke-width:2px;
    
    style PaymentGroup fill:none,stroke:none,stroke-width:0px

Attacks

Sybil Attack

Providers may try to cheat the token reward mechanism by creating many fake accounts and buying from themselves, cycling the money. Multiple robust anti-sybil mechanisms are used to deter these types of attacks. However, a sybil attack like this is not immediately harmful to the network, for the following reasons:

A) Every purchase has a significant network fee that goes to the Treasury
B) The network's total value locked (TVL) increases because customers must prepurchase at least a full month and providers cannot immediately withdraw
C) The customer revenue brought in by one provider rewards not only that provider but also other providers on the same protocol with a high performance score
D) Token rewards have a lockup period

To benefit from a sybil attack, an attacker would need to dedicate a significant amount of capital to it (minimum staking requirements, fees and pre-payments), be capable of running a high-scoring AI model in their protocol, and accept the risk that their rewards and stake can be slashed during the lockup period if they are found to have run a sybil farm.

Sybil detection methods may include:

FIP21 Shannon entropy over purchase vector
FIP24 Off-chain verifiable identifiers
FIP27 Proof of B2B transaction and new KYB onboarding
FIP28 chain analysis clique detection
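As an illustration of the first item, the Shannon entropy of a purchase vector can be computed as below. This is a minimal sketch: the text here does not define what FIP21's purchase vector contains or which entropy values count as suspicious, so the input (purchase amounts per customer for one provider) and the function name are assumptions.

```python
import math
from typing import Sequence


def shannon_entropy(purchases: Sequence[float]) -> float:
    """Shannon entropy (in bits) of a vector of non-negative purchase amounts.

    The amounts are normalized into a probability distribution first.
    Assumption: each entry is the total spent by one customer with one provider.
    """
    total = sum(purchases)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for amount in purchases:
        if amount > 0:
            share = amount / total
            entropy -= share * math.log2(share)
    return entropy


# Revenue spread evenly over four customers versus revenue from a single customer.
evenly_spread = shannon_entropy([1, 1, 1, 1])
single_buyer = shannon_entropy([10])
```

Entropy is highest when revenue is spread evenly across customers and zero when it all comes from one account. How that signal is thresholded or combined with the other methods is left to the FIPs themselves.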

Validator Vote Copying

Validators are an important part of the decentralized system and hence get rewarded in emissions. Some might try to game the system to get emissions without actually doing the work of evaluating results (such as creating new test data), by simply waiting for other validators to do this work and then copying their vote vector.
We mitigate this issue with commit-reveal: all validators must commit a hash of their vote vector before the commit deadline. Then, once the reveal window opens, all validators publish the original full vote vector, which must match their hash. It is computationally infeasible to reverse the hashed vector, so the only way to get validation rewards is by actually doing the work.
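The commit-reveal flow can be sketched as follows. This is an illustration under stated assumptions, not the on-chain implementation: SHA-256 stands in for whatever hash the contracts use, and the random salt and the validator identifier inside the hashed payload are additions of this sketch (they prevent guessing low-entropy vectors and reusing another validator's commitment). All names are hypothetical.

```python
import hashlib
import json
import secrets
from typing import Dict


def commit(validator_id: str, scores: Dict[str, float], salt: str) -> str:
    """Return the hash a validator would submit during the voting window."""
    # sort_keys gives a canonical encoding, so the same vector always hashes the same.
    payload = json.dumps(
        {"validator": validator_id, "scores": scores, "salt": salt},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verify_reveal(commitment: str, validator_id: str, scores: Dict[str, float], salt: str) -> bool:
    """Check that the revealed vector and salt reproduce the earlier commitment."""
    return commit(validator_id, scores, salt) == commitment


# Voting window: the validator keeps the scores and salt private and publishes the hash.
salt = secrets.token_hex(16)
scores = {"provider-a": 0.82, "provider-b": 0.40}
commitment = commit("validator-1", scores, salt)

# Reveal window: anyone can recompute the hash from the revealed values.
honest_reveal_ok = verify_reveal(commitment, "validator-1", scores, salt)
copied_reveal_ok = verify_reveal(commitment, "validator-2", scores, salt)
```

A validator who copies the commitment without knowing the underlying vector and salt cannot produce a matching reveal, and binding the validator identifier into the payload means a copied commitment fails verification for anyone else.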

Validator-Provider Collusion

Robust, trustworthy performance scores are fundamental to our contribution to the AI industry. However, an actor or group controlling both a validator and a provider may attempt to extract unfair token rewards by scoring themselves unfairly. In prior AI crypto projects this kind of self-dealing was rampant and made it unattractive for fair players to compete.

We mitigate this issue by requiring all validators to keep a publicly auditable log of every score they have given for every prompt to every provider. If unfair scoring of a provider is suspected, any token holder can request an audit and a DAO vote to slash the validator's collateral and simultaneously render the validator's votes ineffective.
The rules on how a validator should score providers are defined by the protocol administrator, and hence so are the auditing and slashing rules. The vote may be a vote by a panel of experts over Q, a full public token holder snapshot or a combination of both.

Due to the decentralized nature of our AI performance evaluation, it is easily visible when one validator unfairly favors one provider, because their scores are a statistical outlier compared with the remaining validators. Knowing that naive collusion is easily visible will stop most people from taking the next step, as they know it would take significant effort to hide the collusion and avoid getting their stake slashed. Furthermore, the unfairly elevated score will not significantly influence customers, who can easily compare providers themselves. A mismatch between the provider with the best score and the provider with the most customers might itself prompt an investigation.
Even if a large validator carries out a sophisticated attack, disadvantaged providers will notice it and trigger an audit.
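One simple way to surface such outliers is to compare each validator's score for a provider against the median of all validators. The sketch below uses the median absolute deviation (MAD); this particular statistic, the threshold of 3 and all names are assumptions made for illustration, since the actual auditing rules are defined per protocol by its administrator.

```python
from statistics import median
from typing import Dict, List


def flag_outlier_validators(scores_by_validator: Dict[str, float], threshold: float = 3.0) -> List[str]:
    """Return validators whose score for one provider deviates strongly from the median.

    A score is flagged when its distance from the median exceeds
    `threshold` times the median absolute deviation (MAD).
    """
    if not scores_by_validator:
        return []
    values = list(scores_by_validator.values())
    center = median(values)
    mad = median(abs(value - center) for value in values)
    if mad == 0:
        # Most validators agree exactly: any differing score stands out.
        return [name for name, value in scores_by_validator.items() if value != center]
    return [
        name
        for name, value in scores_by_validator.items()
        if abs(value - center) / mad > threshold
    ]


# Hypothetical scores given to one provider by five validators.
suspects = flag_outlier_validators(
    {"v1": 0.70, "v2": 0.72, "v3": 0.69, "v4": 0.71, "v5": 0.98}
)
```

A flag like this is only a trigger for the audit and DAO vote described above, not proof of collusion; an honest validator with unusually hard test data could also stand out.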

Motivation

Addressing Industry Wide Overfitting

Public benchmark test data sets make AI model performance comparable. But this creates an incentive, for closed source models in particular, to game the benchmarks by creating heuristics for them or by overfitting their training data to include solutions to the known test sets.

For open source models, Dynabench attempts to solve the problem of overfitting on test datasets with a community of humans intentionally creating new test data designed to be hard for models. But Dynabench only works with open source models. Additionally, Dynabench has not seen significant adoption even after being managed by MLCommons. We believe this lack of traction is due to a lack of incentives for evaluators or AI model owners to participate. Forest Protocols aims to properly incentivize both AI model owners and those who evaluate them, for sustainable long-term adoption.

Centralized private test data evaluation is another approach that has been attempted to resolve the problem of AI companies gaming benchmark results. One currently active private evaluator is the SEAL LLM Leaderboards by Scale.ai. Private test sets are a fundamental part of the strategy at Forest Protocols, but a single centralized evaluator must be trusted not to be paid off to favor one AI model company over another. Forest Protocols enhances resilience by requiring all protocols to have multiple independent validators, each of which has economic collateral that can be slashed if a public audit of their votes and test data reveals that they were clearly biased towards one model.
Current private validators like SEAL could become part of the Forest Protocols network if they are willing to put collateral behind the trust in their fair evaluations.

Forest AI combines the approaches of SEAL and Dynabench adding corruption resistance and a funding mechanism for the continuous creation of new private test data by multiple independent parties.

Fair Competition Brings Innovation

Capitalism incentivizes technology companies to play nice on the surface to attract customers while covertly being anti-competitive with tactics like technology lock-in. Customers know what's happening and they hate it. The issue can be even worse with vendors one has worked with for many years: because the vendor knows the customer would need to spend millions in development costs to migrate to another solution, it can continually raise the price significantly above the average market price.

Forest AI provides an incentive system that rewards companies that do not partake in this anti-competitive tactic. Validators score providers not only for performance but also for cross-compatibility. In most protocols this is a side effect of needing to be compatible with the API being tested: a provider that is not compatible immediately receives a zero score. If a customer wants to try out a new provider within the same protocol, they can keep using the same code and simply change connection strings, potentially saving millions in development cost.

Fair competition also naturally brings prices down. If the largest providers can't lock in their customers, they need to keep prices reasonable, and smaller providers will need to bring prices down even further to compete. Our network even enables protocol administrators to directly link the token rewards to performance score per dollar.

Deterministic innovation funding

Since the majority of mining rewards go to providers who are typically AI startups one could consider the network a form of startup funding.
The network invests in innovative AI startups and can collect a return on this investment from the cumulative future customer fees and increased demand for the network token. Token rewards are directly linked to new customer revenue coming into the network, and validators further direct the rewards to the most innovative providers with the best AI.

Traditional venture capital (VC) funding entrusts a lump sum of money to a group of people who have gone through an extensive but subjective vetting process. The overhead of traditional funding decision-making is huge, not only on the VC side but also on the side of the startup. Founders must dedicate huge amounts of time to outreach, networking events and relationship building to get invited to a VC meeting, and they may need hundreds of such meetings over the course of months, as each has <1% probability of success. And once a funding decision has been made, it is followed by even more overhead from legal contracting.

The Forest AI smart contracts can make funding decisions deterministically leaning on the self-motivated signals coming from customers and validators. Funding requires no political connections or networking. An AI PhD student living in a smaller country without a significant VC network can simply register their model to a protocol where they know it will win the highest scores, and they will immediately get funding to further its development.

Mining rewards that follow the market

Blockchain mining rewards have been rigid historically. Forest AI is introducing a novel system of diverse mining rewards that adjust with the market.

1st Gen: Bitcoin mining is just a competition for who can randomly guess the right numbers to hash together, and is hence seen as a significant waste of energy.

2nd Gen: Projects like Livepeer, Filecoin or Render Network give mining rewards for workloads that have real-world utility, but they define that utility narrowly with rigid smart contracts, making it hard for the networks to adapt to a changing market.

3rd Gen: Forest AI’s mining rewards are dynamically adjusted to reward mining workloads that have the most utility to the market. Any workload that can be framed as an AI Agent can be proposed and mined in a permissionless manner.

Use Cases

Machine Translation [Text to Text]:

A text to text problem with multiple correct answers such as translating from English to Korean. Validators will create specialized test datasets with one English prompt and 3-5 correct reference translations. When validators get translated answers from each of the participating miners they will score them based on how close they are to the reference translations using the BLEU scoring method.
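To make the scoring step concrete, here is a minimal sentence-level BLEU sketch with multiple references. It uses whitespace tokenization and no smoothing, both simplifying assumptions (Korean in particular would need a proper tokenizer, and a production validator would more likely rely on an established BLEU implementation). The function names and example sentences are illustrative only.

```python
import math
from collections import Counter
from typing import List, Sequence


def ngram_counts(tokens: Sequence[str], n: int) -> Counter:
    """Count all n-grams of length n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate: str, references: List[str], max_n: int = 4) -> float:
    """Unsmoothed BLEU for one candidate against several reference translations."""
    cand = candidate.split()
    refs = [reference.split() for reference in references]
    if not cand or not refs:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        total = sum(cand_counts.values())
        # For each n-gram, the highest count seen in any single reference.
        max_ref_counts: Counter = Counter()
        for ref in refs:
            for gram, count in ngram_counts(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        # Clipped matches: a candidate n-gram cannot be credited more often
        # than it appears in the most generous reference.
        clipped = sum(min(count, max_ref_counts[gram]) for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precision_sum += math.log(clipped / total) / max_n
    # Brevity penalty against the reference length closest to the candidate length.
    ref_len = min((len(ref) for ref in refs), key=lambda length: (abs(length - len(cand)), length))
    brevity_penalty = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return brevity_penalty * math.exp(log_precision_sum)


references = ["the cat sat on the mat", "a cat was sitting on the mat"]
exact = sentence_bleu("the cat sat on the mat", references)
close = sentence_bleu("the cat sat on the rug", references)
unrelated = sentence_bleu("completely different words here", references)
```

An exact match with any reference scores 1.0, a near match scores between 0 and 1, and an answer sharing no words with the references scores 0. Validators would then rank miners by these scores relative to each other.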

Python Code Generation [Text To Text]:

This protocol would be the continually evaluated alternative to current academic standards like MBPP[1][2], APPS or HumanEval [1][2]. The current standards have known questions, so closed source models in particular can pretrain on this test data and overfit, making the models seem superior when they have only overfit this specific test set.
Validators in a Text To Code protocol would independently generate their own coding challenges, with test cases for checking correctness (in Python, for instance). The text prompts are sent to the miners (aka AI model providers), who must return functioning code that is then executed by the validator to check the correctness and CPU efficiency of the solution. While validators might

Image Generation [Text to Image]:

While image generation is one of the most common everyday uses of AI, it is also one of the hardest to define a scoring standard for, because what counts as the most accurate or most beautiful image is subjective. Here it is important to note that validators have two scoring mechanisms. The first is a boolean vote on whether the miner (aka AI model provider) complies with the basic requirements, such as responding promptly with an image, abiding by the defined API and producing an image that, on a very basic level, can be considered to depict most of the text prompt. The second is a scalar value designed as a relative score against the other answers provided. In the case of text to image, it is easy for users to test multiple providers themselves: since queries are standardized across all providers, a user can send the same prompt to all of them (aided by the user interface) and, after a few prompts, either keep only their favorite providers for the remaining prompts or let the validator ranking guide their choice by default. As previously stated, the protocol values users' purchasing decisions above validator votes, and in the case of text to image the wisdom of the crowd alleviates the ambiguity of evaluation.

Future Event Prediction [Text to Boolean]:

Future event prediction has utility in many industries, such as insurance, finance and energy markets. Each of these specialized prediction markets could become its own protocol. Real-world events are particularly well suited to decentralized validation, as there is little ambiguity in scoring results once the event has happened or the deadline by which it should have happened has passed. Each validator independently defines events to be predicted, such as: will the price of bitcoin be above $100,000 within the next 7 days, will OPEC announce a reduction in oil production within the next 30 days, or will the US Federal Reserve increase interest rates next month. All events have a boolean outcome within a specific date, and each miner announces their predictions to each validator, who can then score them with accuracy 0 or 1 and average over all event questions.
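The scoring rule described here amounts to a simple average of per-event hits. The sketch below assumes that an event a miner did not answer counts as 0; that choice, the function name and the example events are assumptions, not something the text specifies.

```python
from typing import Dict


def prediction_accuracy(predictions: Dict[str, bool], outcomes: Dict[str, bool]) -> float:
    """Average 0/1 accuracy of one miner over all resolved events of a validator."""
    if not outcomes:
        return 0.0
    # A missing prediction returns None, which never equals True or False.
    correct = sum(
        1 for event, outcome in outcomes.items() if predictions.get(event) == outcome
    )
    return correct / len(outcomes)


# Hypothetical resolved events and one miner's predictions.
outcomes = {
    "btc_above_100k_in_7d": True,
    "opec_cut_in_30d": False,
    "fed_hike_next_month": True,
    "fourth_event": False,
}
miner_predictions = {
    "btc_above_100k_in_7d": True,
    "opec_cut_in_30d": True,
    "fed_hike_next_month": True,
}
accuracy = prediction_accuracy(miner_predictions, outcomes)
```

Here the miner gets two of four events right, with the unanswered fourth event counted as a miss.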

Moat

Strong Network Effects

Every provider adds to the network effect of getting more customers into one platform and bringing more services into one platform. New providers will always prefer to integrate with the platform that has the most users. And customers prefer the platform with the largest offering.

Providers already share a vested interest with other providers, as they all hold the network token, which goes up in price when the network grows. Additionally, Forest AI fosters the network effect by directly rewarding providers if their customers also make purchases with other protocols in the network (FIP21, FIP22).