Let’s go over a few things about freeciv!

Setting up the foundation right is the key to great development

I used to write on Medium before this; these posts are ports.

As I discussed in my previous post, I have a keen interest in setting up a freeciv learning environment in Python. But many people felt that the details were either scattered or that too many ideas were crammed into it. Over the next few posts I want to clarify any issues and lay the groundwork for the project.

Setting up an open source project is easy these days: all you have to do is sign up for a GitHub account and boom, you are ready! The hard part is running and maintaining the codebase, and even harder is making people care enough to contribute to your project. I want to focus on the latter here. Through these posts I want to clarify my ideas and discuss the issues raised. This time around I want to go over these three points:

  • Action Spaces: Why the discrete action space of freeciv is important
  • Game Structure: How the vastness of the game is similar to the real world
  • Multiplayer Collaborative Idea: How freeciv supports different kinds of interaction and what is so special about it

    And in each of those points I will also put sub-points to make understanding easier.

    I am not asking the very hard questions here, like why care about anything. But why care about this project?

    1. Action Spaces

    Every action in our universe falls into one of two types of action space: continuous or discrete. In layman’s terms, discrete spaces are where a hard decision has to be taken, and continuous spaces are where a distribution of values has to be generated.

  • Most real-life applications that involve hardware (robotics, self-driving cars, etc.) require a distribution of values to be generated. The movement vectors used to operate them are sets of force values and directions. This domain is still in its early stages, but its applications will nonetheless be felt far and wide. It will continue to improve over time, though at a slower rate, because the simulators are just not good enough (ironically, an agent trained in a robot simulator rarely gives the same results on a real-life robot). Our algorithms also aren’t sample efficient, and training on real-life robots would mean extremely long training durations, making it infeasible.

    A self-driving car operates in a continuous action space

  • Discrete action spaces cover a different kind of task: every application where hard decisions are to be made has a discrete space. Video games are a great example of this; they need one specific action to be taken at any given moment. So far most of the research that we have seen has been in this domain: DQN, which revolutionised the idea of deep RL, takes actions in a discrete domain. Medicine, where we require a roll-out of actions over time, is an example of a real-world application in discrete spaces. Other applications include optimising video quality in streaming services affected by poor network conditions, and notification systems on social networking sites that notify you of only the important stuff.

    Montezuma’s Revenge is an Atari game with a discrete action space. It also happens to be a notoriously difficult game for AI to master due to its sparse reward structure.

    We do not need a single agent that is good at everything. In fact, the future will be filled with ensemble architectures where multiple agents, each good at a particular thing, collaborate to achieve goals far beyond what humans can; this is what we mean by super-intelligence. Freeciv, being a video game, operates in discrete spaces. This is one of the reasons for choosing it: it will be one of the most challenging learning environments. There is still a possibility of using continuous actions in it, though. More details on this later!
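    To make the discrete/continuous distinction from this section concrete, here is a minimal sketch using only the Python standard library. The action names, scores and control values are made up for illustration and are not taken from freeciv or from any RL library; the point is only that a discrete policy ends in one hard choice, while a continuous policy emits a vector of real values.

```python
import math
import random


def sample_discrete(logits, rng):
    """Pick exactly one action index from unnormalised scores (softmax sampling)."""
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def sample_continuous(means, std, rng):
    """Draw a real-valued action vector, e.g. (steering, throttle), from a Gaussian."""
    return [rng.gauss(m, std) for m in means]


rng = random.Random(0)

# Hypothetical unit actions: the agent must commit to exactly one of them.
unit_actions = ["move", "fortify", "build_city"]
choice = unit_actions[sample_discrete([2.0, 0.5, 0.1], rng)]

# Hypothetical car controls: the agent outputs a value for every dimension.
controls = sample_continuous([0.0, 0.8], 0.1, rng)

print(choice, controls)
```

    In the discrete case the output is a single index into a fixed list; in the continuous case every dimension gets its own sampled value.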

    2. Game Structure

    Saying freeciv is vast is an understatement. There are so many things you can do that it’s hard to remember them all, and the action taken at any given moment can affect the results at a later stage of the game.

  • Each unit has a dynamic number of moves, i.e. every unit has different actions that it can take, and each city can build new units and items. In the image below we can see there are 12 actions for a single unit. This challenges the AI to optimise according to the unit it is controlling at that given moment. You could have a single complicated network run this, but I think better results can be obtained by using an ensemble.

    The game at turn 314 is already this advanced, though the game in ‘default’ mode can run to 5000 turns. The small circles at the bottom show the actions that can be taken for any unit. 12 actions can be taken for this particular unit alone, in the same range as the Atari games (at most 18 actions)! This is an amazing example of the vastness of this game.

  • Other than the actions, a look at the technology tree gives you a sense of the variety of decisions that have to be taken and their consequences. In order to progress through the game, the agent needs to obtain new and different technologies. Each technology is a node in a big tree; to get one, you need to complete the ones before it. This challenges the AI to do structured planning over long time horizons. This is very different from current state-of-the-art environments, where a single action is chosen by looking at the current frame.

    A sample tech-tree in freeciv. As you can see on the right, to get Electricity, you need to have Metallurgy and Magnetism. This complex intermixing of possibilities makes life tricky for our small agent.

  • I also want to talk for a moment about the cities. There is a clear correlation between the quality of your cities and your power. Cities lie at the heart of this game; improving them improves your odds of success. Take a look at the sample city below; this is one big city. Each tile in the ‘city-radius’ has values associated with it, like food and gold. The better the quality of the tiles, the better your cities and in turn your civilisation.

    A sample city, more details about the cities here

  • Due to the fog of war (and unclear worklists and resources), the position and status of the enemy and the overall map are not fully known, so risky investments (such as explorer teams) are required to gain information.
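    Two of the points above, per-unit action sets and tech-tree prerequisites, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the unit types, action names and scores are hypothetical, and of the tech-tree edges only "Electricity needs Metallurgy and Magnetism" comes from this post; the other prerequisites are placeholders, not checked against any ruleset.

```python
# Hypothetical global action list and per-unit legal actions (not from a ruleset).
ALL_ACTIONS = ["move", "fortify", "sentry", "build_city", "build_road", "attack"]
UNIT_ACTIONS = {
    "settlers": {"move", "build_city", "build_road"},
    "warriors": {"move", "fortify", "sentry", "attack"},
}


def best_legal_action(unit_type, scores):
    """Return the highest-scoring action among those this unit type can take."""
    legal = UNIT_ACTIONS[unit_type]
    candidates = [(s, a) for a, s in zip(ALL_ACTIONS, scores) if a in legal]
    return max(candidates, key=lambda pair: pair[0])[1]


# Only the Electricity entry is from the post; the rest is illustrative.
TECH_REQUIRES = {
    "Electricity": ["Metallurgy", "Magnetism"],
    "Metallurgy": ["Bronze Working"],
    "Magnetism": ["Iron Working"],
    "Iron Working": ["Bronze Working"],
    "Bronze Working": [],
}


def research_order(goal, known=()):
    """List the techs still needed to reach `goal`, prerequisites first."""
    order, seen = [], set(known)

    def visit(tech):
        if tech in seen:
            return
        seen.add(tech)
        for req in TECH_REQUIRES[tech]:
            visit(req)
        order.append(tech)  # appended only after all its prerequisites

    visit(goal)
    return order


# "fortify" scores highest overall, but settlers cannot fortify in this sketch.
print(best_legal_action("settlers", [0.1, 0.9, 0.2, 0.5, 0.3, 0.8]))  # build_city
print(research_order("Electricity", known=("Bronze Working",)))
# ['Metallurgy', 'Iron Working', 'Magnetism', 'Electricity']
```

    The first function is the usual masking idea: score every action, then choose only among the ones the current unit is allowed to take. The second is a depth-first walk of the prerequisite graph, which is the kind of long-horizon plan an agent has to form before it can reach a late technology.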

    3. Multiplayer Collaborative Idea

    Freeciv can be played by a large number of players: up to 126 in freeciv-desktop and up to 500 in freeciv-web, to be specific. There are rules according to which the game is played; these are called rulesets. In the classic/default ruleset there are three ways to win:

  • As in other games of conquest and expansion, you are declared the winner by default once the last city and unit of every other civilization is destroyed.
  • Once technological progress has brought you into the space age, you may launch a spacecraft destined for Alpha Centauri; the first civilization whose craft reaches the system wins.
  • In the absence of other means to determine victory, the game will end after 5000 turns if no spacecraft have yet been launched. The surviving civilisations are then rated, and the one with the highest score is the winner.

    Due to the open source nature of the project, the rulesets can be modified to the researchers’ needs. We intend to make doing this a breeze so researchers can spend most of their time developing their models and don’t have to worry about the boring stuff. This also allows for extreme customisation that can make this environment even better.

    Look at the minimap at the bottom left. The opportunity to see how various AIs will compete with each other is one thing that really excites me about this!

  • Diplomacy is a huge part of this game and the interactions that take place can be very important. It is in the players’ favour to be open and trade to boost their nations’ economies, even if later in the game they become enemies and fight each other. All of this can of course be changed by changing the structure of the game in the configuration files *.fcfg. This kind of collaboration can yield great results in ensemble AI systems, where many small AI systems combine to make the whole system work.

    With the large number of players that can play this game, there is an opportunity to study how collaborative systems would work. With recent developments in the area of competitive self-play and collaborative play, it will be interesting to see how the world pans out when a large number of AIs are left alone in an earth-like world. Maybe we will understand how smart intelligence emerges out of this mess.

    Cheers!