Measuring AI Freedom
By Evan Ellis
The debate over human freedom and free will has raged for all of recorded history. At its heart is the split between Determinism and Libertarianism: the notions that either only one course of events is possible, or that many are possible and the future is a product of our “will”, whatever that may be. How we answer this question shapes our notions of moral responsibility and achievement: if the future is predetermined, how can we hold anyone accountable for their actions?
We can draw many similarities between the human mind and computer “agents” in Reinforcement Learning (RL). As RL agents become more and more capable, it is both informative and necessary to develop notions of freedom from their perspective. If robots take their place in society, how do we measure their freedom? This is AI Ethics for General Intelligence.
Freedom as Entropy
Our approach begins with the methods first proposed in “Free Will Belief as a Consequence of Model-Based Reinforcement Learning” by Erik M. Rehn in late 2021. Rehn distinguishes between two types of freedom:
Physical Freedom: Freedom over the physical world. This is bound by natural laws and, outside of quantum mechanics, is deterministic. Since both humans and agents are bound by natural laws, physical freedom is controversial, so I won’t be diving into it today.
Value Freedom: The more useful of the two, Value Freedom measures how unpredictable an agent is. In a universe full of deterministic processes, Value Freedom sets people apart from natural forces. Since agents select actions stochastically (as in the Boltzmann rational model), we consider them unpredictable, and that unpredictability can be quantified to give a Value Freedom. Rehn focuses the paper on measuring this quantity: it is a powerful metric that matches our common-sense understanding of freedom. Consider the following scenario:
Shiv and Ashwin are getting tacos, and Ashwin recommends the Al Pastor. Shiv likes all tacos, and so the value of getting any specific one is roughly equivalent to the value of any other. He is free to choose, either accepting Ashwin’s suggestion or ignoring it. Either way, Shiv is happy.
In this example, Shiv’s choice is highly unpredictable—even to Shiv. He has little preference for any taco over any other, so he has complete free will. The universe does not determine which taco Shiv will pick—Shiv does. Shiv, in this case, has a high value freedom.
In the second example, Ashwin turns the tables:
Shiv and Ashwin are getting tacos, and Ashwin tells Shiv to order the Al Pastor, or he will kidnap Shiv’s firstborn. Shiv likes all tacos, but he expects to like his firstborn far more, and so the value of getting the Al Pastor is many orders of magnitude larger than the value of any other taco. Shiv is being manipulated by Ashwin, and he has only one clear choice of taco.
In this example, Shiv’s choice is highly predictable. He has little control over his order because Ashwin has biased his preference towards the Al Pastor. In our common-sense understanding of free will, Shiv’s decision is not free. Something other than Shiv decided his order.
In Reinforcement Learning, we can train our agents to predict Q-values, which are the expected reward of taking a certain action in a certain state. In the first example, each action/state combination has roughly the same Q-value. In the second, the action “Order the Al Pastor” has a disproportionately higher Q-Value.
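To make this concrete, here is a small, hypothetical Python sketch of the Q-values in the two scenarios (the taco names and numbers are invented purely for illustration; they are not from Rehn’s paper):

```python
import numpy as np

# Hypothetical Q-values for the taco-ordering state.
actions = ["al_pastor", "carnitas", "baja", "barbacoa"]

# Scenario 1: Shiv likes all tacos roughly equally.
q_scenario_1 = np.array([1.05, 1.00, 1.02, 0.98])

# Scenario 2: Ashwin's threat makes one action dominate.
q_scenario_2 = np.array([100.0, 1.00, 1.02, 0.98])
```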
Using the Q-value model, we can derive the probability of taking an action in a given state using the Boltzmann Rational Model, which is more commonly known as Softmax. It assumes that an agent is probabilistic in nature, but is more likely to prefer high-reward actions:

$$P(a_i) = \frac{e^{Q(s, a_i)}}{\sum_j e^{Q(s, a_j)}}$$

where P(a_i) is the probability of taking action i in state s.
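As a rough illustration, here is a sketch of the softmax turning Q-values into action probabilities, continuing the hypothetical numbers above (the helper name boltzmann_probs is my own):

```python
def boltzmann_probs(q_values: np.ndarray) -> np.ndarray:
    """Boltzmann-rational (softmax) action distribution over Q-values."""
    # Subtract the max for numerical stability; it does not change the result.
    exp_q = np.exp(q_values - q_values.max())
    return exp_q / exp_q.sum()

print(boltzmann_probs(q_scenario_1))  # roughly uniform: each taco gets ~0.25
print(boltzmann_probs(q_scenario_2))  # almost all probability mass on "al_pastor"
```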
Rehn defines the Value Freedom of an agent in a certain state as the “information entropy of the action selection distribution.” In plain English, the Value Freedom is how surprised we will be when an agent takes an action. In the second example, we can determine beforehand that Shiv will pick the Al Pastor, so it will be of no surprise when he does so. This is a low Value Freedom. In the first example, Shiv is just as likely to follow Ashwin’s recommendation as he is to pick anything else, so we can expect to be more surprised by his choice.
In mathematical terms, the Value Freedom is:

$$F_{\text{value}}(s) = -\sum_i P(a_i) \log P(a_i)$$
If you’re interested in understanding this equation, I suggest The First Principle of Information Theory, a gentle introduction to Information Theory.
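Sticking with the hypothetical numbers from the sketches above, the entropy calculation looks like this (measured in bits here, though any logarithm base works):

```python
def value_freedom(action_probs: np.ndarray) -> float:
    """Shannon entropy (in bits) of the action-selection distribution."""
    p = action_probs[action_probs > 0]  # treat 0 * log(0) as 0
    return float(-np.sum(p * np.log2(p)))

print(value_freedom(boltzmann_probs(q_scenario_1)))  # near log2(4) = 2 bits: very free
print(value_freedom(boltzmann_probs(q_scenario_2)))  # near 0 bits: one forced choice
```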
Causality and memory play important roles in our understanding of free will. Consider Case 1 where Shiv had the freedom to choose any taco. After Shiv has made his pick, however, he is inclined to stick with it. He now has a strong preference for his pick, whereas he didn’t have one before. The reason for this is subtle and very human: we prefer the decision we have made, even if we were uncertain about making it. Shiv’s decision was still a product of his free will, even though his value freedom has decreased now that he prefers his pick.
This leads to the main point of Rehn’s argument: free will, whether it exists in the greater sense or not, is essential for learning from cause and effect. It is a Reinforcement Learning tool that our minds use to learn from our successes and mistakes. Causality is why we put criminals in jail and attribute successes to the successful: it is an integral component of learning from the past.
However, our perception of Value Freedom is grounded in our learned Q-values, which are often inaccurate. We may think many actions are equally good when, in reality, there is only one clear choice. Consider the following scenario:
Shiv and Ashwin have just ordered tacos. Shiv liked all tacos the same, so he thinks his choice of the Baja is freely made. Ashwin later kidnaps Shiv’s firstborn because Shiv didn’t pick the Al Pastor.
This raises a point that Rehn’s equation does not capture: Shiv believes he had the freedom to place any order, but, unbeknownst to him, he could only order the Al Pastor. Shiv was not free to choose his taco. His Q-values were inaccurate because he lacked complete information.
A New Model
In our equation for Value Freedom, we need to incorporate a measure of accuracy: how accurately the agent approximates its Q-values. We create a new value, the “Knowledge” parameter κ, as an inverse KL divergence of our learned action probabilities, ζ, from the true action probability distribution, φ:

$$\kappa = -D_{\text{KL}}(\zeta \parallel \varphi) = -\sum_i \zeta(a_i) \log \frac{\zeta(a_i)}{\varphi(a_i)}$$
This equation encodes the general behavior we need for the knowledge parameter κ: an ignorant agent with a poor understanding of which choices maximize its reward has less freedom, because the KL divergence is high. However, the KL divergence is unbounded, so we rewrite κ using the sigmoid function, modified to be high when the divergence is low (I removed the - in front of the KL divergence):

$$\kappa = \frac{1}{1 + e^{D_{\text{KL}}(\zeta \parallel \varphi)}}$$
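A minimal sketch of this knowledge parameter, following the reconstruction above (the distributions passed in are assumed to be valid probability vectors with no zero entries):

```python
def kl_divergence(zeta: np.ndarray, phi: np.ndarray) -> float:
    """KL divergence D_KL(zeta || phi) of the learned action probabilities from the true ones."""
    return float(np.sum(zeta * np.log(zeta / phi)))

def knowledge(zeta: np.ndarray, phi: np.ndarray) -> float:
    """Knowledge parameter: a sigmoid-squashed, inverted KL divergence, in (0, 0.5]."""
    return 1.0 / (1.0 + np.exp(kl_divergence(zeta, phi)))
```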
We can now rewrite our Value Freedom using the new knowledge parameter κ, scaling the action-selection entropy by the agent’s knowledge:

$$F_{\text{value}}(s) = \kappa \cdot \left( -\sum_i P(a_i) \log P(a_i) \right)$$
Value Freedom decreases if the agent misunderstands its choices, because its knowledge parameter decreases. In the third example, Shiv’s KL divergence from Q_TRUE was high: he believed each action was equally good, but in reality there was only one clear choice, the Al Pastor. By adding the knowledge parameter κ, we incorporate “freedom of outcome,” not just “freedom of choice.” Many choices may be possible, but some will lead to the same outcome. κ adjusts for this, as well as for the agent’s understanding of its actions.
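Putting the pieces together for the third scenario, again with made-up numbers: Shiv’s learned Q-values say every taco is equally good, while the true Q-values (which account for Ashwin’s threat) overwhelmingly favor the Al Pastor:

```python
# Shiv's learned Q-values (he believes all tacos are equal) vs. the true ones.
q_learned = np.array([1.0, 1.0, 1.0, 1.0])
q_true = np.array([100.0, 1.0, 1.0, 1.0])

zeta = boltzmann_probs(q_learned)  # the distribution Shiv thinks he is choosing from
phi = boltzmann_probs(q_true)      # the distribution a fully informed Shiv would have

kappa = knowledge(zeta, phi)
adjusted_freedom = kappa * value_freedom(zeta)

print(kappa)             # close to 0: Shiv's picture of his options is badly wrong
print(adjusted_freedom)  # far below the naive 2 bits: he is less free than he feels
```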
Measuring Understanding
When we measure Value Freedom using the new model, our choice of Q_TRUE will have a big impact on the metric. But what is Q_TRUE even saying?
Q_TRUE is the distribution of true rewards for each action an agent can take in each state. Actions that have good outcomes have high rewards, and actions that have bad outcomes have low rewards. How we measure good and bad outcomes is the domain of morality and religion. Somehow, in our attempt to measure freedom, we have designed an equation which relies on the distinct fields of morality and religion.
For example, if we set Q_TRUE from the Catholic point of view, then you are more free if you choose actions that a perfect Catholic would: getting baptised, going to Mass and confession, and following the Pope. People who don’t do these things have a lower knowledge (κ) factor and are less free.
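As a toy illustration of how much this choice matters, the same agent can score very different κ values under two hypothetical reference value systems (reusing the helper functions sketched earlier; all numbers are invented):

```python
# The same agent's learned action distribution over three actions ...
zeta = boltzmann_probs(np.array([2.0, 1.0, 1.0]))

# ... measured against two hypothetical choices of Q_TRUE.
phi_system_a = boltzmann_probs(np.array([2.0, 1.0, 1.0]))  # agrees with the agent
phi_system_b = boltzmann_probs(np.array([0.0, 3.0, 1.0]))  # ranks the actions differently

print(knowledge(zeta, phi_system_a))  # 0.5, the maximum: the agent "knows" this value system
print(knowledge(zeta, phi_system_b))  # noticeably lower: under this Q_TRUE the agent is less free
```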
Education plays a major role in Value Freedom because it changes the knowledge factor κ. Someone who cannot accurately estimate which actions are good is not as free as someone who can. This gives new meaning to the importance of education: societies that pride themselves on being free must have a good education system, because without it they cannot be free.
Conclusion
Modeling our core freedoms mathematically is informative both for ourselves and for the Reinforcement Learning agents we train. Future societies may be able to create intelligence so smart that it deserves liberties of its own. How we measure those liberties will play a major role in how we understand the world around us.