As AI applications spread across industries, accurately assessing model performance and building user trust have become pressing issues. Traditional evaluations often rely on centralized mechanisms that struggle to cover diverse scenarios and fail to reflect real user preferences. Meanwhile, model “hallucinations” remain common, and users choosing between models often find themselves stuck in information silos.
Against this backdrop, Yupp, a new platform, is trying to reshape how AI models are discovered, compared, and used through its crowdsourcing model and incentive mechanism, aiming to bring a paradigm shift to AI evaluation. This article examines Yupp’s core mechanisms, technical highlights, team background, and potential impact on the AI ecosystem.
Yupp focuses on the long-standing evaluation challenges in AI and aims to build a “trustless” AI feedback market, in which diverse user feedback circulates freely under the protection of blockchain and crypto-economic incentives, forming a scalable, fair, and transparent model evaluation layer. By rewarding the contribution of high-quality human-annotated data, Yupp can promptly capture users’ real needs and preferences across different scenarios and help AI developers improve model performance iteratively.
The project was founded in June 2024 by Pankaj Gupta (Co-founder and CEO) and Gilad Mishne (Co-founder and Head of AI), with Chief Scientist Jimmy Lin (Professor at the University of Waterloo) also part of the core team. The three worked together at Twitter in 2010, where they built and optimized large-scale recommendation and search systems, and later gained extensive experience at Google and Coinbase.
Yupp’s vision of decentralized, transparent data value addresses AI vendors’ need for both credible evaluation and user participation. Together with the experience of its core team, this has earned the project recognition from well-known figures in the tech industry and top venture capitalists.
Last week, Yupp announced the completion of a $33 million seed round led by a16z partner Chris Dixon. Other investors include Google Chief Scientist Jeff Dean, Twitter co-founder Biz Stone, Pinterest co-founder Evan Sharp, Perplexity CEO Aravind Srinivas, and Stanford University’s Dan Boneh, Chris Re, Nick McKeown, and Balaji Prabhakar, among 45 well-known angels and corporate executives, as well as Coinbase Ventures.
As a decentralized AI evaluation platform, Yupp follows the philosophy of “Every AI for everyone,” letting users easily discover, compare, and use the latest AI models. Instead of the usual single response, Yupp returns answers from two (or even more) models simultaneously for each prompt, forming an “AI parliament.” This design gives users more choice and also helps expose potential “hallucinations,” so that users can make better-informed decisions by comparing answers. As Yupp CEO Pankaj Gupta stated, side-by-side outputs are particularly useful for users worried about generation errors, because they can cross-verify the results.
The platform now supports over 500 AI models, covering the fields of text and image generation, including well-known models such as ChatGPT, Claude, Gemini, DeepSeek, Grok, Llama, and many emerging models. To further optimize the experience, Yupp has also launched the “QuickTake” feature, which can distill lengthy replies into a concise tweet.
In addition, Yupp places a high priority on user privacy: all chat records are private by default unless the user actively makes them public; even when shared publicly, no personal information is disclosed. Users can control the content and scope of sharing at any time.
Yupp offers free access in exchange for user feedback and meters model usage through the “Yupp Points” system. New users receive 5000 points upon registration and can earn more by scoring model responses, selecting preferences, and explaining their reasons. The higher the quality of the feedback, the greater the reward, so that users can keep using high-end models such as Claude Opus 4 or OpenAI o3 for free. The platform promises that points will only increase and that all current models can be tried for free.
After each question, users receive two model responses and can earn a “digital scratch card” by giving feedback, which awards between 0 and 250 Yupp points. Every 1000 points can be exchanged for 1 dollar, with a maximum withdrawal of 10 dollars per day and 50 dollars per month. Points can be redeemed in more than 20 currencies, including dollars and euros, through partners including Stripe, PayPal, and Coinbase. The platform also integrates stablecoins on Base (an Ethereum L2) and Solana to give users worldwide instant, fee-free rewards.
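The redemption rules above reduce to simple arithmetic. The following is a minimal sketch, not Yupp’s actual implementation: the function name `cashable_usd` and its parameters are hypothetical, and it assumes the daily and monthly caps simply limit whatever the point balance is worth.

```python
POINTS_PER_DOLLAR = 1000  # from the article: 1000 points = 1 dollar
DAILY_CAP_USD = 10.0      # from the article: at most 10 dollars per day
MONTHLY_CAP_USD = 50.0    # from the article: at most 50 dollars per month


def cashable_usd(points: int,
                 withdrawn_today_usd: float = 0.0,
                 withdrawn_this_month_usd: float = 0.0) -> float:
    """Return how many dollars could be withdrawn right now.

    Assumption: the amount is the point balance converted to dollars,
    limited by whatever room is left under the daily and monthly caps.
    """
    balance_usd = points / POINTS_PER_DOLLAR
    daily_room = max(DAILY_CAP_USD - withdrawn_today_usd, 0.0)
    monthly_room = max(MONTHLY_CAP_USD - withdrawn_this_month_usd, 0.0)
    return min(balance_usd, daily_room, monthly_room)
```

Under these assumptions, the 5000-point sign-up bonus is worth 5 dollars, and a user who has already withdrawn 45 dollars in a month can take out at most 5 more, however many points they hold.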
As Pankaj Gupta said, the high-quality feedback generated by users is far more valuable for AI companies’ model fine-tuning and reinforcement learning than the rewards themselves. Although users’ monthly earnings may only be equivalent to a few cups of coffee, this paid annotation data is crucial for AI iteration.
To encourage more people to participate, Yupp also set up a referral reward: the referrer receives 5000 points and the referred person receives 1000 points. Currently, newly registered users receive 5000 points, and referred users receive an additional 2500 points.
To address the existing issues of insufficient transparency in rankings, lack of fairness, and uneven access to evaluation data, Yupp has launched a beta version of the AI ranking and the “Yupp VIBE (Vibe Intelligence Benchmark) Score” rating system. This system aggregates preference data generated by global users in natural interactions, aiming to provide robust and reliable evaluation results.
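The article does not describe how the VIBE Score is computed. As a rough illustration of how pairwise preferences can be aggregated into a ranking, here is a minimal Elo-style sketch; the function `elo_ratings`, the `k` factor, and the starting rating are assumptions chosen for illustration, not Yupp’s method.

```python
from collections import defaultdict


def elo_ratings(votes, k=32.0, base=1000.0):
    """Aggregate (winner, loser) preference votes into Elo-style ratings.

    This is a generic technique, not the VIBE Score formula.
    """
    ratings = defaultdict(lambda: base)
    for winner, loser in votes:
        # Probability that the winner was expected to win, given current ratings.
        expected_win = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        # An unexpected win moves the ratings more than an expected one.
        delta = k * (1.0 - expected_win)
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)
```

Each vote moves the same number of points from the loser to the winner, so the total rating across models stays constant. Sequential Elo depends on the order of the votes; a production system would more likely fit a batch model to the full vote set.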
Yupp’s evaluation principles include:
The platform not only collects binary preferences but also encourages users to point out the advantages and disadvantages of replies (such as “to the point”, “fast speed”, “good style”, etc.), and conducts cluster analysis based on users’ age, education, occupation, and other information to show the preference differences among different groups.
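A simple way to picture the group-level breakdown described above is to compute each model’s win rate separately per user segment. The sketch below is a plain segmentation rather than a true clustering algorithm, and the record layout `(group, winner, loser)` and the function name `win_rates_by_group` are assumptions made for illustration.

```python
from collections import defaultdict


def win_rates_by_group(records):
    """Compute per-group win rates from (group, winner, loser) records.

    Returns a dict mapping (group, model) to the share of that group's
    comparisons involving the model that the model won.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for group, winner, loser in records:
        wins[(group, winner)] += 1
        games[(group, winner)] += 1
        games[(group, loser)] += 1
    return {key: wins.get(key, 0) / games[key] for key in games}
```

Comparing the resulting rates across groups would show, for example, whether one occupation prefers a model that another tends to reject.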
On the technical side, Yupp is exploring the use of blockchain, cryptographic primitives, and zero-knowledge proofs to keep the evaluation process fair, transparent, and verifiable. The platform has also partnered with professional AI data providers to calibrate raters through archival verification and multi-layer quality checks, in order to filter out malicious data.
The recently updated leaderboard shows the VIBE scores of models such as GPT‑4.5 Preview, Claude Opus 4, and Claude Sonnet 4, along with their win rates, dislike rates, speed, latency, context window, and cost metrics.
Yupp officially launched on June 13, 2025, after six months of internal testing. Since its launch, the product has been continuously iterating:
Yupp’s mission is to “empower humanity to shape the future of AI.” Pankaj Gupta believes that the development of AI requires the participation and contribution of everyone. Through multi-perspective AI responses and user feedback, Yupp not only helps users make better decisions but also provides a continuous driving force for AI evolution.
It is worth mentioning that one of Yupp’s main competitors is the open AI model evaluation platform LMArena (website: https://lmarena.ai/). The website is very popular among AI professionals, but the platform is still exploring commercialization and does not use blockchain technology to offer direct material rewards or point-based incentives for user participation.
Overall, Yupp has opened up a new path for AI assessment with its crowdsourcing model, incentive mechanism, and evaluation system driven by real user preferences. It offers users a free and diverse AI experience while converting their feedback into high-value training data that supports the continuous improvement of models. With an experienced team and top-tier capital behind it, Yupp is expected to play a key role in the future AI ecosystem and to pursue the vision of “AI for everyone, shaped by everyone.”
However, Yupp has only just launched. How to keep data quality high, resist cheating as large numbers of users join, and balance commercialization with user incentives all remain open questions for its future development.