How do blockchains get data?

Blockchains are inherently self-contained. The entire deterministic model of a traditional blockchain hinges on the fact that during transaction execution (which updates the “state” of the system) a blockchain cannot perform any logic that depends on sources external to the blockchain. Any external data in the system must, at some point, have come in as the input to a transaction that was recorded in a block.

On the Ethereum blockchain, for example, developers can deploy smart contracts that perform logic given some inputs. The executing logic in a smart contract cannot do anything outside of the blockchain. It cannot reach out and hit a web service on the internet. The only way to get data into a smart contract is to pass it in with a transaction. The only way to update blockchain state is to trigger that state change by sending a new transaction into the system.

Consider what would happen if a smart contract were allowed to hit an API endpoint to retrieve some data that was used in the smart contract’s execution. If the contract were deployed today into a new block, the API endpoint might return

{ "foo": "bar" }

But then tomorrow, the API operator changes the endpoint response to return

{ "foo": "baz" }

A month from now, someone newly syncing the Ethereum blockchain executes the block containing our smart contract, and the API returns a different response than it did a month earlier. The state of the newly synced blockchain will differ from the state of a blockchain that was synced last month.

This is no longer a deterministic blockchain. My blockchain looks different from your blockchain after the same sync, with the same blocks.

Said another way: given the full set of blocks, a node must be able to recreate the final state of the blockchain from scratch, with no internet connection.
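The replay requirement can be illustrated with a toy model. This is a sketch, not Ethereum code: applyTransaction, replay, and the block shape are hypothetical names made up for the illustration. State is computed purely from transaction inputs, so two nodes replaying the same blocks always agree.

```javascript
// Toy model: state is a plain object, and each transaction carries all the
// data it needs in its `input` field. Nothing is read from outside.
const applyTransaction = (state, tx) => {
  // Pure function: the same state and tx always yield the same new state.
  return { ...state, [tx.key]: tx.input };
};

// Rebuild the final state from scratch by replaying every block in order.
const replay = (blocks) =>
  blocks.reduce(
    (state, block) => block.transactions.reduce(applyTransaction, state),
    {}
  );

const blocks = [
  { transactions: [{ key: 'foo', input: 'bar' }] },
  { transactions: [{ key: 'foo', input: 'baz' }] },
];

// Two independent "nodes" replaying the same blocks reach the same state.
const nodeA = replay(blocks);
const nodeB = replay(blocks);
console.log(nodeA, nodeB);
```

If applyTransaction fetched its value from an API instead of reading tx.input, the result of replay would depend on when it was run, which is exactly the problem described above.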

So what does that mean for a developer who wants to create and deploy a smart contract to be used for auditability purposes in their existing app? Oracles, developer. You need to make an oracle.

What is an oracle?

An oracle is a service that provides “trusted” data to a smart contract through transactions. “Trusted” is in quotes because trust is a personal matter: two entities might not trust the same data to the same degree, given some specific implementation of an oracle.

Oracles are typically web services that implement some blockchain-specific functionality, such as hashing and signing data, or creating and submitting new transactions to the network.
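To make "hashing and signing" concrete, here is a minimal sketch using only Node's built-in crypto module. It is an illustration under stated assumptions, not Ethereum's exact scheme: Ethereum hashes with Keccak-256 and uses recoverable signatures, while this example uses SHA-256 and a throwaway secp256k1 key pair generated on the spot.

```javascript
const crypto = require('crypto');

// Hypothetical oracle key pair, generated on the fly for this example.
const { privateKey, publicKey } = crypto.generateKeyPairSync('ec', {
  namedCurve: 'secp256k1',
});

// The data the oracle vouches for, serialized to bytes.
const payload = Buffer.from(JSON.stringify({ temperature: '74.75' }));

// Hash the payload (SHA-256 here; Ethereum itself uses Keccak-256).
const digest = crypto.createHash('sha256').update(payload).digest('hex');

// Sign the payload; crypto.sign hashes it with SHA-256 internally.
const signature = crypto.sign('sha256', payload, privateKey);

// Anyone holding the public key can check that the data came from the oracle.
const isValid = crypto.verify('sha256', payload, publicKey, signature);

// A modified payload no longer matches the signature.
const isTampered = crypto.verify(
  'sha256',
  Buffer.from(JSON.stringify({ temperature: '99.99' })),
  publicKey,
  signature
);

console.log(digest.length, isValid, isTampered);
```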

Let’s see a simple example!

We’ll create three services to implement a simple circular “oracle” workflow.

At the bottom, on the blockchain, will be a smart contract with a whitelisted address as a contract parameter. This smart contract will implement a function called updateWeather that only responds to transactions from the whitelisted address. The function accepts weather data as input parameters and echoes the data out in an “event”, which is an Ethereum-specific concept. Events can be thought of much like stdout logging in traditional software development. Events emitted from a smart contract can be subscribed to asynchronously in a JavaScript application.

Living on the web will be two Node.js processes. One of them is the “oracle”. It runs in a loop that retrieves weather data from an open weather API, then submits the weather data to the smart contract for historical auditability purposes.

The other Node.js process simply subscribes to weather events emitted from the smart contract, and console.logs the results. The events, as described above, are emitted every time the updateWeather function is successfully executed.

Figure: simple data flow from the web-service-based oracle, to the smart contract, to another server logging events.


The following code has been greatly simplified for ease of understanding. It has been stripped of proper error handling, and is in no way suitable for production environments.

Smart Contract

The contract exposes one public oracleAddress, which gets set via an input parameter in the constructor.

contract WeatherOracle {
  // Only this address may call updateWeather.
  address public oracleAddress;

  constructor (address _oracleAddress) public {
    oracleAddress = _oracleAddress;
  }
  // ...

Next we’ll define an Event, which will be emitted during a successful transaction on the updateWeather function. For the sake of simplicity, the event will just emit a single string, the temperature.

event WeatherUpdate (string temperature);

And finally the updateWeather function. It has public visibility, which means it can be called from an outside transaction.

function updateWeather (string temperature) public {
  // Reject any caller other than the whitelisted oracle.
  require(msg.sender == oracleAddress);
  emit WeatherUpdate (temperature);
}

Notice the require statement. Execution will only continue past this line if msg.sender (the address that sent the transaction) is equal to the whitelisted oracleAddress.
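The access check can be tried out without deploying anything. The sketch below is a plain JavaScript stand-in for the contract, not Solidity and not the web3 API: WeatherOracleModel, the explicit sender argument, and the addresses are all made up for illustration.

```javascript
const { EventEmitter } = require('events');

// Plain JavaScript stand-in for the WeatherOracle contract.
class WeatherOracleModel extends EventEmitter {
  constructor(oracleAddress) {
    super();
    this.oracleAddress = oracleAddress;
  }

  // `sender` plays the role of msg.sender.
  updateWeather(sender, temperature) {
    // Mirrors require(msg.sender == oracleAddress): abort if it does not hold.
    if (sender !== this.oracleAddress) {
      throw new Error('sender is not the whitelisted oracle');
    }
    // Mirrors emit WeatherUpdate(temperature).
    this.emit('WeatherUpdate', { temperature });
  }
}

const ORACLE = '0xOracle'; // made-up addresses
const STRANGER = '0xStranger';

const model = new WeatherOracleModel(ORACLE);
const seen = [];
model.on('WeatherUpdate', (event) => seen.push(event));

model.updateWeather(ORACLE, '74.75'); // accepted, event emitted

let rejected = false;
try {
  model.updateWeather(STRANGER, '0'); // rejected, no event
} catch (err) {
  rejected = true;
}

console.log(seen, rejected);
```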

That’s it!

Oracle Service

Our oracle is a simple Node.js service. It uses the request library to call an external weather API, parses the response, crafts and submits a transaction to the deployed smart contract, then waits to do it all again.

Hit the API endpoint (which is stored in an environment variable) to kickstart the workflow.

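const options = { uri: process.env.WEATHER_URL, json: true };
const start = () => {
  request(options).then(parseData).then(updateWeather).then(restart).catch(console.error); // assumes a promise-returning request
};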

Parse the response.

const parseData = (body) => {
  return new Promise((resolve, reject) => {
    // body is the parsed JSON response; the temperature lives at main.temp.
    const temperature = body.main.temp.toString();
    resolve({ temperature });
  });
};
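The parsing step trusts that the response always contains main.temp. A slightly more defensive variant might look like the sketch below. The names extractTemperature and parseDataSafely are hypothetical, and the assumption that the API returns the temperature as a number is mine, not something the weather API is confirmed to guarantee.

```javascript
// Pull the temperature out of a parsed API response, or throw if the
// response does not have the expected { main: { temp } } shape.
const extractTemperature = (body) => {
  const temp = body && body.main ? body.main.temp : undefined;
  if (typeof temp !== 'number' || Number.isNaN(temp)) {
    throw new Error('response has no numeric main.temp field');
  }
  return temp.toString();
};

// Promise-returning wrapper, so it can sit in a .then() chain.
// An error thrown inside the executor rejects the promise.
const parseDataSafely = (body) =>
  new Promise((resolve) => resolve({ temperature: extractTemperature(body) }));

console.log(extractTemperature({ main: { temp: 74.75 } }));
```

Rejecting here means a malformed API response never turns into a transaction, so no gas is spent recording bad data.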

Craft an Ethereum transaction that calls the updateWeather function on the deployed smart contract. Note that account() is an asynchronous function that loads an Ethereum account from configs defined elsewhere, and contract is a JavaScript object that represents the location and interface of the deployed WeatherOracle smart contract. These smart-contract-specific functions are brought to you by the web3 npm package :)

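const updateWeather = ({ temperature }) => {
  return new Promise((resolve, reject) => {
    account().then(account => {
      // Reject on error, otherwise resolve with the transaction result.
      contract.updateWeather(temperature, { from: account }, (err, res) => (err ? reject(err) : resolve(res)));
    }).catch(reject);
  });
};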

Finally we just restart the process after a timeout, based on an environment config. The wait function will resolve after the given timeout.

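// TIMEOUT is an assumed name for the environment config; wait() resolves after the given timeout.
const restart = () => wait(process.env.TIMEOUT).then(start);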

That’s it! The above code implements a simple service that fetches data from an API and feeds it into a smart contract.
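To see the whole loop run end to end without a weather API or an Ethereum node, the steps can be wired together with stubs. This is a sketch under stated assumptions: runOnce, runLoop, and the injected fetchWeather and submit functions are hypothetical names, the transaction ids are fake, and the loop is bounded to a fixed number of rounds where the real service runs indefinitely.

```javascript
// Resolve after `ms` milliseconds.
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// One pass of the oracle workflow. Both dependencies are injected so the
// example runs without a network connection or an Ethereum node.
const runOnce = async ({ fetchWeather, submit }) => {
  const body = await fetchWeather();
  const temperature = body.main.temp.toString();
  return submit({ temperature });
};

// Repeat the workflow `rounds` times, pausing between passes.
const runLoop = async (deps, { rounds, delayMs }) => {
  const results = [];
  for (let i = 0; i < rounds; i += 1) {
    results.push(await runOnce(deps));
    if (i < rounds - 1) await wait(delayMs);
  }
  return results;
};

// Stubs standing in for the weather API and the contract call.
const submitted = [];
const deps = {
  fetchWeather: async () => ({ main: { temp: 74.75 } }),
  submit: async (data) => {
    submitted.push(data);
    return `tx-${submitted.length}`; // fake transaction id
  },
};

const done = runLoop(deps, { rounds: 2, delayMs: 10 }).then((ids) => {
  console.log(ids, submitted);
  return ids;
});
```

Injecting the two dependencies keeps the workflow logic testable on its own; swapping the stubs for a real HTTP call and a real contract call would not change runOnce or runLoop.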

Did you catch the secret sauce?

When crafting the Ethereum transaction, we said it was { from: account }. This account object is a JavaScript object that holds the full account (read: private key) that signs the transaction, and that account must hold some ETH to pay the gas for the transaction.

Defined as an environment variable on the service is a private key, which is used to instantiate the account object. This private key MUST be the key behind the public address used to instantiate the WeatherOracle smart contract, because of the require line in the smart contract’s updateWeather function.

If any other address creates a transaction that calls updateWeather on the contract, the transaction will fail and the event won’t be emitted.

Speaking of emitting events, let’s make sure those work.

Event Consumer

This is yet another simple Node.js service. Again, contract is a JavaScript object that represents the location and interface of the deployed WeatherOracle smart contract. Calling the WeatherUpdate event by name and passing in a callback is all you need for asynchronous event listening.

const consume = () => {
  contract.WeatherUpdate((error, result) => {
    console.log("BLOCK NUMBER: ");
    console.log("  " + result.blockNumber);
    console.log("WEATHER DATA: ");
    console.log(result.args); // the event's arguments, e.g. { temperature: '74.75' }
  });
};

As this service runs it will periodically output data to stdout, as valid transactions get mined into blocks.

BLOCK NUMBER:
  3424586
WEATHER DATA:
{ temperature: '74.75' }
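The consumer above ignores the error argument of its callback. A sketch of a handler that checks it first is shown here; formatEvent and onWeatherUpdate are hypothetical names, and the { blockNumber, args } result shape is assumed from the sample output rather than taken from the web3 documentation.

```javascript
// Turn an event result into the lines the consumer prints.
const formatEvent = (result) => [
  'BLOCK NUMBER: ',
  '  ' + result.blockNumber,
  'WEATHER DATA: ',
  JSON.stringify(result.args),
];

// Node-style callback, usable wherever an (error, result) handler is expected.
const onWeatherUpdate = (error, result) => {
  if (error) {
    console.error('event subscription failed:', error.message);
    return;
  }
  formatEvent(result).forEach((line) => console.log(line));
};

// Simulated event, using the values from the sample output.
onWeatherUpdate(null, { blockNumber: 3424586, args: { temperature: '74.75' } });
```

Keeping the formatting in a pure function makes it easy to check in isolation, separately from the subscription itself.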

And there you have it.


If you’d rather just get the full projects to see the code in action, find them on GitHub.