Robust Aggregation 101



When protocols encounter tasks that cannot be solved on-chain (oracle tasks) and require a decentralized solution, it's crucial to distinguish between two fundamentally different purposes of decentralization:

  1. Decentralization for Better Results - When there's no single "right" answer, decentralizing the methodology itself improves outcomes. This applies to scenarios where different approaches provide complementary insights or where aggregating multiple viewpoints improves accuracy.

  2. Decentralization for Security - When the task is well-defined but requires protection against manipulation or failure. This applies to operations requiring high reliability and fault tolerance, where a single point of failure must be avoided.

The distinction between these purposes is fundamental: the first seeks to improve quality through diverse methodologies, while the second ensures integrity through distributed execution of a single, well-defined process.

This document analyzes how to aggregate multiple data reports into a single reliable estimate. We review aggregation methods and their properties, focusing on resistance to errors and manipulation, to find an optimal strategy balancing accuracy and efficiency. We place special emphasis on the weighted median since it is the most common aggregation method for price feed oracles.

Problem Definition

We first consider a simple setting where data originates from a single source.

Let $p_t$ be the relevant quantity at time $t$, e.g., the BTC/USD price. Notice that $p_t$ is unknown. Instead, we can access an estimate $r_t$ from a known and agreed-upon source, for instance, Interactive Brokers. A set of $N$ fetchers fetch $r_t^1, r_t^2, \ldots, r_t^N$ for us, where $r_t^i$ denotes the quantity reported by fetcher $i$. The aggregate $S_t$ is the number we forward as our estimate of $r_t$ (and essentially $p_t$).

The question is how to aggregate $r_t^1, r_t^2, \ldots, r_t^N$ into one number, $S_t$, representing the price according to that source at time $t$.

Considerations

  1. Time-variance. Since time is continuous, fetchers access the source at slightly different times. We don't expect the time differences to be significant; in particular, they should not exceed one second, which is the rate of our blockchain I/O.

  2. Simplicity. It is crucial both for runtime considerations and for explaining to users how our mechanism works (KISS/Occam's razor).

  3. (Honest) mistakes. Although we have incentive mechanisms to punish/reward fetchers' behavior, mistakes such as downtime or latency are unavoidable. They can happen even when fetchers are honest and thus should be accounted for.

  4. Malicious behavior. Our solution should be as robust as possible to attacks. Namely, it should minimize the consequences of an attack and facilitate punishing attackers.

To quantify the last point, the Breakdown Point of an aggregate is the minimal ratio of reports controlled by a malicious actor that allows it to achieve an arbitrary deviation from $r_t$.
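To make the breakdown point concrete, here is a toy Python snippet (ours, with made-up numbers, not eOracle code) showing that a single corrupted report can drag the mean arbitrarily far while the median stays inside the honest range:

```python
# Toy illustration: one corrupted report out of five.
honest = [100.0, 100.2, 99.9, 100.1]            # honest BTC/USD-style reports
attacker = 1_000_000.0                          # arbitrarily large malicious report
reports = honest + [attacker]

mean = sum(reports) / len(reports)              # dragged far outside the honest range
median = sorted(reports)[len(reports) // 2]     # stays inside [99.9, 100.2]

print(f"mean   = {mean:,.1f}")   # ~200,080.0 -> the mean's breakdown point is 0
print(f"median = {median}")      # 100.1      -> the median tolerates < 50% bad reports
```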

Possible solutions

We review below several options for the aggregation method. All of them are special cases of minimizing an objective function. While there are infinitely many such functions, our analysis focuses only on the median, the average, and their trimmed and weighted counterparts. A short code sketch of these estimators appears after the list.

  1. Simple Average (Mean)

    • Pros: Easy to calculate; treats all reports equally; good for consistent data without outliers.

    • Cons: Can be skewed by outliers: its breakdown point is zero.

  2. Weighted Average

    • Pros: Accounts for the varying significance of each report (e.g., based on stake); more accurate if reports are not equally reliable.

    • Cons: More complex to calculate; can still be skewed by outliers.

  3. Median

    • Pros: Less affected by outliers than the mean; simple to understand and calculate.

    • Cons: Ignores the magnitude of unusually large or small reports, even when they carry genuine information.

  4. Mode (most common value)

    • Pros: Represents the most frequently occurring price; useful in markets with standard pricing.

    • Cons: Poorly defined when prices vary widely or when no price repeats.

  5. Trimmed Mean

    • Pros: Excludes outliers by trimming a specified percentage of the highest and lowest values before averaging; balances the influence of outliers.

    • Cons: Arbitrariness in deciding what percentage to trim; could exclude relevant data.

  6. Quantile-based Aggregation

    • Pros: Can focus on a specific part of the distribution (e.g., median is the 50% quantile); useful for risk management strategies.

    • Cons: Not representative of the entire data set; can be as sensitive to outliers as the mean.
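For reference, here is a minimal Python sketch of the simplest of these estimators. It is an illustration only, not the eOracle implementation; the function names and the default trim ratio are our own choices.

```python
import statistics

def mean(reports: list[float]) -> float:
    """Simple average; breakdown point 0."""
    return sum(reports) / len(reports)

def median(reports: list[float]) -> float:
    """Middle value of the sorted reports; breakdown point 50%."""
    return statistics.median(reports)

def trimmed_mean(reports: list[float], trim_ratio: float = 0.1) -> float:
    """Drop the lowest and highest `trim_ratio` fraction of reports, then average."""
    xs = sorted(reports)
    k = int(len(xs) * trim_ratio)
    kept = xs[k:len(xs) - k] if k > 0 else xs
    return sum(kept) / len(kept)

def weighted_average(reports: list[float], weights: list[float]) -> float:
    """Average in which each report is scaled by its weight (e.g., stake)."""
    return sum(r * w for r, w in zip(reports, weights)) / sum(weights)
```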

Weighted median

The weighted median enjoys the robustness of the median together with the weighted average's ability to account for different significance levels. Its breakdown point is 50% of the total weight; below that, an adversary can only move the result within the range of correctly reported values (as we prove below). The weighted median allows us to incorporate the stake of the different fetchers.

Mathematically, let $w^i$ be the (positive) stake of fetcher $i$, and assume that $\sum_{i=1}^{N} w^i = 1$. Also, for simplicity, assume that $r_t^1, r_t^2, \ldots, r_t^N$ are sorted. The weighted median is an element $r_t^k$ such that

$$\sum_{i=1}^{k-1} w^i \leq \frac{1}{2} \quad \text{and} \quad \sum_{i=k+1}^{N} w^i \leq \frac{1}{2}.$$

Therefore, our aggregate is $A_t = r_t^k$ for such a $k$.

To demonstrate the robustness of the weighted median, we present the following theorem. It proves that as long as the majority of the stake belongs to honest fetchers, the aggregate always lies between the honest reports; namely, an attacker with a minority of the weight (stake) cannot shift the aggregate too much.

Theorem: Let $H \subset [N]$ be the set of honest fetchers, with $\sum_{i \in H} w^i > \frac{1}{2}$, and let $M \subset [N]$ be the set of malicious fetchers, with $\sum_{i \in M} w^i < \frac{1}{2}$ and $H \cup M = [N]$. Then the weighted median aggregate $A_t$ always satisfies

$$A_t \in \left[\min_{i \in H} r_t^i,\ \max_{i \in H} r_t^i\right].$$

Recall that in the reasonable case we do not expect high variance among (honest) reports; thus, the interval $[\min_{i \in H} r_t^i, \max_{i \in H} r_t^i]$ will be small. This ensures that our aggregate is robust to manipulation.
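The Python sketch below implements the weighted median exactly as defined above: it walks the reports in sorted order and returns the first one whose cumulative stake reaches half of the total. It is an illustration, not eOracle's production code.

```python
def weighted_median(reports: list[float], stakes: list[float]) -> float:
    """Return the report r^k whose cumulative stake first reaches half of the
    total stake, matching the definition above."""
    total = sum(stakes)
    pairs = sorted(zip(reports, stakes))        # sort reports, carrying their stakes
    acc = 0.0
    for report, stake in pairs:
        acc += stake
        if acc >= total / 2:
            return report
    return pairs[-1][0]                         # defensive fallback; unreachable for positive stakes

# With 60% of the stake on honest reports near 100, a 40% attacker cannot move
# the aggregate outside the honest range, as the theorem guarantees:
print(weighted_median([99.9, 100.1, 1_000_000.0], [0.3, 0.3, 0.4]))  # -> 100.1
```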

Averaging Over Time

Recall that prices are susceptible to noise and volatility. Therefore, financial applications often average prices over time. Well-known methods include Moving Average, Exponential Smoothing, Time-weighted Average Price (TWAP), and Volume-weighted Average Price (VWAP).

Our current service does not implement such time averages; we leave our customers the flexibility of computing them on their end.
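For consumers who do want a time average, a minimal client-side TWAP sketch might look as follows. This is our example, not part of the service; it assumes the consumer stores (timestamp, price) observations and that each price holds until the next observation.

```python
def twap(observations: list[tuple[float, float]]) -> float:
    """Time-weighted average price over (timestamp, price) observations.
    Each price is assumed to hold until the next timestamp; the last
    observation only closes the window."""
    if len(observations) < 2:
        return observations[0][1]
    obs = sorted(observations)
    weighted_sum = 0.0
    for (t0, p0), (t1, _) in zip(obs, obs[1:]):
        weighted_sum += p0 * (t1 - t0)
    return weighted_sum / (obs[-1][0] - obs[0][0])

# Price held at 100 for 60s, then at 102 for 30s -> TWAP is about 100.67
print(twap([(0, 100.0), (60, 102.0), (90, 101.0)]))
```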

Multi-Source Aggregation

We are now facing a similar problem, but each quantity is now given by a source (and not a fetcher). We have a set of $N$ sources $1, \ldots, N$, where each source $i$ reports a price $S_t^i$. Alongside the prices $S_t^1, S_t^2, \ldots, S_t^N$ we have an additional weight $w^i$ per source. Weights capture our confidence in that source, its volume, its liquidity, etc. The weight-adjusted price is given by

$$A_t = \frac{\sum_{i=1}^{N} w^i S_t^i}{\sum_{i=1}^{N} w^i}.$$

There are several ways by which we can set the weights; a code sketch follows the list below.

  1. Volume-Weighted Average Price (VWAP):

    • Description: VWAP is computed by dividing the total dollar value traded by the total trading volume over the period. In our case, it amounts to weighting each source's rate by its trading volume, giving more influence to sources with higher volumes.

    • Advantages: Reflects more liquidity and is a common benchmark used by traders. It gives a fair reflection of market conditions over the given period.

    • Disadvantages: More susceptible to volume spikes, which can distort the average price.

  2. Liquidity-Adjusted Weighting:

    • Description: Here, the rate from each source is weighted based on its liquidity. This method requires a clear definition and measurement of liquidity, which can include factors like bid-ask spread, market depth, and the speed of price recovery after a trade.

    • Advantages: Provides a more realistic view of the market by acknowledging that more liquid markets better reflect the true market price.

    • Disadvantages: Liquidity can be harder to measure accurately and may vary quickly, making it challenging to maintain an accurate aggregate price in real-time.
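As promised above, here is a minimal sketch of the weight-adjusted price $A_t$ with per-source trading volumes as the weights. The numbers are hypothetical; liquidity scores could be plugged in the same way.

```python
def weight_adjusted_price(prices: list[float], weights: list[float]) -> float:
    """Weighted average across sources, implementing A_t = sum(w_i * S_i) / sum(w_i)."""
    return sum(p * w for p, w in zip(prices, weights)) / sum(weights)

# Hypothetical per-source prices and 24h volumes (VWAP-style weighting):
prices = [100.05, 99.98, 100.20]
volumes = [1_200.0, 800.0, 150.0]
print(weight_adjusted_price(prices, volumes))   # volume-heavy sources dominate
```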

Summary

This document explores single-source price aggregation in oracle systems, covering both result improvement and security aspects of decentralization. It analyzes various aggregation methods, focusing on weighted median for its manipulation resistance and stake-based weighting capabilities, while also examining time-based averaging and multi-source aggregation approaches.

The document highlights breakdown points in aggregation methods, demonstrates weighted median's security when honest fetchers hold majority stake, and evaluates trade-offs between different aggregation strategies.
