R Marries NetLogo: Introduction to the RNetLogo Package

The RNetLogo package delivers an interface to embed the agent-based modeling platform NetLogo into the R environment with headless (no graphical user interface) or interactive GUI mode. It provides functions to load models, execute commands, push values, and to get values from NetLogo reporters. Such a seamless integration of a widely used agent-based modeling platform with a well-known statistical computing and graphics environment opens up various possibilities. For example, it enables the modeler to design simulation experiments, store simulation results, and analyze simulation output in a more systematic way. It can therefore help close the gaps in agent-based modeling regarding standards of description and analysis. After a short overview of the agent-based modeling approach and the software used here, the paper delivers a step-by-step introduction to the usage of the RNetLogo package by examples.


Agent- and individual-based modeling
Agent-based models (ABMs) or individual-based models (IBMs), as they are called in ecology and biology, are simulation models that explicitly represent individual agents, which can be, for example, humans, institutions, or organisms with their traits and behavior (Grimm and Railsback 2005; Gilbert 2008; Thiele, Kurth, and Grimm 2011). A key characteristic of this modeling approach is that simulation results emerge from the more or less complex interactions among the agents. Therefore, such models are useful when local interactions on the micro level are essential for the description of patterns on the macro level.
The origins of the ABM approach go back to the late 1970s (e.g., Hewitt 1976) with the development of so-called multi-agent systems (MASs) in computer science as a part of the distributed artificial intelligence (DAI) research area (Green, Hurst, Nangle, Cunningham, Somers, and Evans 1997; Sycara 1998). Their wider use in computer science began only in the 1990s (Luck, McBurney, and Preist 2003; Wooldridge 2002; Weiss 1999). Definitions of the term MAS and of what an agent is can be found, for example, in Wooldridge (2002) and Jennings (2000). Examples for the use of MASs with intelligent agents in the field of computer science include computer games, computer networks, robotics for manufacturing, and traffic-control systems (for examples, see Oliveira 1999; Luck et al. 2003; Shen, Hao, Yoon, and Norrie 2006; Moonen 2009).
With the increasing importance of questions about coordination and cooperation within MASs, connections to the social sciences arose (Conte, Gilbert, and Sichman 1998), and the field of agent-based social simulation (ABSS), that is, an agent-based modeling approach as part of computational sociology, became a 'counter-concept' to the classical top-down system dynamics and microsimulation approaches (Gilbert 1999; Squazzoni 2010). ABSS is mainly used for theory testing and development (Macy and Willer 2002; Conte 2006) and applied to simulations of differentiation, diffusion, and emergence of social order in social systems (for examples, see listings in Macy and Willer 2002; Squazzoni 2010) as well as to questions about demographic behavior (Billari and Prskawetz 2003). The most famous models in the social sciences are Schelling's segregation model (Schelling 1969) and the Sugarscape model of Epstein and Axtell (1996).
Strongly related to the development of ABMs in the social sciences is the establishment of the ABM approach in economics, which is called agent-based computational economics (ACE) and is related to the field of cognitive and evolutionary economics. The aims of ACE can be divided into four categories: empirical understanding, normative understanding, qualitative insight and theory generation, and methodological advancement (for details, see Tesfatsion 2006). It was applied, for example, to the reproduction of the classical cobweb theorem (e.g., Arifovic 1994), to model financial/stock markets (see LeBaron 2000, for a review) as well as to the simulation of industry and labor dynamics (e.g., Leombruni and Richiardi 2004).
In contrast to ABSS and ACE, the agent-based modeling approach has a slightly longer tradition in ecology (Grimm and Railsback 2005). The development of so-called individual-based models is less closely related to the developments of MASs, because ecologists became aware early of the restrictions in classical population models (differential equation models) and looked for alternatives. Over the last three to four decades hundreds of IBMs were developed in ecology (DeAngelis and Mooij 2005). For reviews see, for example, Grimm (1999) and DeAngelis and Mooij (2005).

Links to statistics
Links to statistics can be found in agent-based modeling along nearly all stages of the modeling cycle. Often, models are developed on the basis of empirical/field data. This gives the first link to statistics as data are analyzed with statistical methods to derive patterns, fit regression models and so on to construct and parameterize the rules and to prepare input as well as validation data.
Often, agent-based model rules depend on statistical methods applied during a simulation run. In very simple cases, for example, animal reproduction could depend on the sum of the food intake in a certain period, but it is also possible for agent behaviors to be based on correlation, regression, network, or point pattern analysis, etc.
The third link comes into play when the model is formulated and implemented and some parameters of the model are unknown. Then, methods of inverse modeling with different sampling schemes, Bayesian calibration, genetic algorithms and so on can be used to obtain feasible parameter values.
In the next stage, the model application, methods like uncertainty and sensitivity analysis provide important tools to gain an understanding of the system's behavior and functioning, i.e., to open the black box of complexity.
The last link to statistics is the further analysis of the model output using descriptive as well as inferential statistics. Depending on the type of model, this can include correlation analysis, hypothesis testing, network analysis, spatial statistics, time series analysis, survival analysis etc.
The focus in this article is on those parts where statistical methods are applied in combination with the model runs.

NetLogo
Wilensky's NetLogo (Wilensky 1999) is an agent-based modeling tool developed and maintained since 1999 by the Center for Connected Learning and Computer-Based Modeling at Northwestern University, Illinois. It is an open-source software platform programmed in Java and Scala and especially designed for the development of agent-based simulation models. It comes with an integrated development and simulation environment. It provides many predefined methods (so-called primitives and reporters) for behavioral rules of the agents. Because it has a Logo-like syntax and standard agent types (turtles, patches, links), in combination with a built-in GUI, it is very easy to learn. Due to its simplicity and relatively large user community, it is becoming the standard platform for communicating and implementing ABMs that previously has been lacking.
For an introduction to NetLogo see its documentation (Wilensky 2013). An introduction into agent-based modeling using NetLogo can be found, for example, in Railsback and Grimm (2012) or Wilensky and Rand (2014).

R
R (R Core Team 2014a) is a well-known and established language and open source environment for statistical computing and graphics with many user-contributed packages.

Note on this article
This work is a mixture of scientific article and tutorial for a scientific tool; writing styles differ between these two elements, but section headings indicate what element each section contains.

Introducing RNetLogo
RNetLogo (Thiele 2014) is an R package that links R and NetLogo; i.e., any NetLogo model can be run and controlled from R and simulation results can be transferred back to R for statistical analyses. This is desirable as NetLogo's support of systematic design, performance, and analysis of simulation experiments is limited. In general, much more could be learned from ABMs if they were embedded in a rigorous framework for designing simulation experiments (Oh, Sanchez, Lucas, Wan, and Nissen 2009), storing simulation results in a systematic way, and using statistical toolboxes for analyzing these results. RNetLogo can be used to bridge this gap since R (together with the enormous number of packages) delivers such tools. Such a seamless integration was already the scope of the NetLogo-Mathematica Link (Bakshy and Wilensky 2007a), which was designed to make use of Mathematica's functionality for "advanced import capabilities, statistical functions, data visualization, and document creation. With NetLogo-Mathematica Link, you can run all of these tools side-by-side with NetLogo" (Bakshy and Wilensky 2007b). RNetLogo offers such a framework for two freely available open source programs with fast-growing communities. RNetLogo itself is open-source software published under the GNU GPL license.
RNetLogo consists of two parts: R code and Java code (Figure 1). The R code is responsible for offering the R functions, for connecting to Java, and for doing data transformations, while the Java code communicates with NetLogo.
To connect the R part of RNetLogo to the Java part the rJava package for R (Urbanek 2010) is used. The rJava package offers the ability to create objects, call methods and access class members of Java objects through the Java Native Interface (JNI, Oracle 2013) from C. The Java part of the RNetLogo package connects to the Java Controlling API of NetLogo. This API allows controlling NetLogo from Java (and Scala) code (for details, see Tisue 2012).
When NetLogo code is given to an RNetLogo function, i.e., to the R part of RNetLogo, it is submitted through rJava to the Java part of RNetLogo, and from there to NetLogo's Controlling API and thence to NetLogo. In case of reporters, i.e., primitives with return values, the return value is collected by the Java part of RNetLogo, transformed from Java to R by rJava and sent through the R part of RNetLogo to R.
The functions that handle NetLogo code, like NLCommand or NLReport, expect it as a string. Some other functions, e.g., NLGetAgentSet, construct such strings internally from the different function arguments in the R part of RNetLogo. This string is then sent to the Java part of RNetLogo and from there it is evaluated through NetLogo's Controlling API.
Figure 1: RNetLogo consists of two parts: an R and a Java part. The R part adds the RNetLogo functions to R and uses rJava to connect the Java part. The Java part connects to NetLogo via the Controlling API of NetLogo.
When the submitted NetLogo code is not valid, NetLogo throws an exception of type 'LogoException' or 'CompilerException' containing the corresponding error message. This exception is rethrown by the Java part of RNetLogo, handled by rJava, and finally retrieved by the R part of RNetLogo and printed to R's command line. Runtime errors in NetLogo, like 'java.lang.OutOfMemoryError', are reported in the same manner: a message is printed to R's command line. However, errors that crash the JVM can cause crashes in rJava, which can affect the R session as well.
Some functions of RNetLogo, like NLDoCommand or NLDoReportWhile, require further control flow handling, i.e., loops and condition checks, which are done by the Java part of RNetLogo. The methods command and report of class org.nlogo.workspace.Controllable of NetLogo's Controlling API are used as interfaces to NetLogo. Everything else is done by the R and the Java part of RNetLogo.

What else?
If only the integration of R calculations into NetLogo (i.e., the other way around) is of interest, a look at the R-Extension to NetLogo at http://r-ext.sourceforge.net/ can be useful. If we want to use the R-Extension within a NetLogo model controlled by RNetLogo, we should use the Rserve-Extension instead (available at http://rserve-ext.sourceforge.net/), because loading the R-Extension will crash as it is not possible to load the JRI library when rJava is active. The following sections provide an introduction to the usage of RNetLogo; however, there are some pitfalls described in Section 5 that one should be aware of before starting one's own projects.

Loading NetLogo
To use the RNetLogo package for the first time in an R session we have to load the package, like any other package, with

R> library("RNetLogo")
When loading RNetLogo it will automatically try to load rJava. If this runs without any error we are ready to start NetLogo (if not, see Section 3.1). To do so, we have to know where NetLogo is installed. What we need is the path to the folder that contains the NetLogo.jar file. On Windows machines this could be C:/Program Files/NetLogo 5.0.5/. Here, we assume that the R working directory (see, e.g., function setwd()) is set to the path where NetLogo is installed. Now, we have to decide whether we want to run NetLogo in the background without seeing the graphical user interface (GUI) and control NetLogo completely from R or if we want to see and use the NetLogo GUI. In the latter case, we can use NetLogo as if it had been started independently, i.e., we can load models, change the source code, click on buttons, see the NetLogo View, inspect agents, and so on, but also have control over NetLogo from R. The disadvantage of starting NetLogo with GUI is that we cannot run multiple instances of NetLogo in one R session. This is only possible in the so-called headless mode, i.e., running NetLogo without GUI (see Section 3.6 for details). Linux and Mac users should read the details section of the NLStart manual page (by typing help(NLStart)).
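The requirement that the path points to the folder containing NetLogo.jar can be checked in plain R before NetLogo is started. The following is only a minimal sketch: the helper name has.netlogo.jar is hypothetical and not part of RNetLogo, and it assumes the jar file is named NetLogo.jar as stated above.

```r
# Hypothetical helper (not part of RNetLogo): TRUE if 'path' contains
# a file named NetLogo.jar.
has.netlogo.jar <- function(path) {
  file.exists(file.path(path, "NetLogo.jar"))
}

# Usage: check the working directory before trying to start NetLogo
nl.path <- getwd()
if (!has.netlogo.jar(nl.path)) {
  message("NetLogo.jar not found in ", nl.path)
}
```

Such a check only confirms that the file is present; it does not verify the NetLogo version.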
Because NetLogo's Controlling API changes with the NetLogo version, we have to use an extra parameter nl.version to start RNetLogo for NetLogo version 4 (nl.version = 4 for NetLogo 4.1.x, nl.version = 40 for NetLogo 4.0.x). The default value of nl.version is 5, which means that we do not have to submit this parameter when using NetLogo 5.0.x. Since NetLogo 5.0.x operates much faster on lists than older versions it is highly recommended to use it here (see also the RNetLogo package vignette "Performance Notes and Tests").
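The mapping between NetLogo versions and nl.version values can be written down as a small lookup. This is a sketch under the assumption that only the three version families named above are relevant; the helper nl.version.for is hypothetical and not part of RNetLogo.

```r
# Hypothetical helper (not part of RNetLogo): map a NetLogo version string
# to the nl.version value described in the text.
nl.version.for <- function(netlogo.version) {
  if (startsWith(netlogo.version, "5.0")) return(5)   # default value
  if (startsWith(netlogo.version, "4.1")) return(4)
  if (startsWith(netlogo.version, "4.0")) return(40)
  stop("No nl.version known for NetLogo ", netlogo.version)
}

nl.version.for("4.1.3")   # returns 4
```

Stopping on unknown versions, instead of silently falling back to the default, makes a mismatch visible before NetLogo is started.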
To keep it simple and comprehensible we start NetLogo with GUI by typing:

R> nl.path <- getwd()
R> NLStart(nl.path)
If everything goes right, a NetLogo window will be opened. We can use the NetLogo window as if it had been started independently, with the exception that we cannot close the window through clicking. On Windows, NetLogo appears in the same program group at the taskbar as R. If possible, arrange the R and NetLogo windows so that we have them side by side (Figure 2), and can see what is happening in NetLogo when we submit the following code.

Loading a model
We can now open a NetLogo model by just clicking on "File -> Open..." or choosing one of the sample models by clicking on "File -> Models Library". But to learn to control NetLogo from R as when starting NetLogo in headless mode, we type in R:

R> model.path <- file.path("models", "Sample Models", "Earth Science",
+    "Fire.nlogo")
R> NLLoadModel(file.path(nl.path, model.path))
The Forest Fire model (Wilensky 1997a) should be loaded. This model simulates a fire spreading through a forest. The expansion of the fire depends on the density of the forest. The forest is defined as a tree density value of the patches, while the fire is represented by turtles. If we want, we can now change the initial tree density by using the slider on the interface tab and run the simulation by clicking on the setup button first and then on the go button. In the next section, we will do the same by controlling NetLogo from R.

Principles of controlling a model
In a first step, we will change the density value, i.e., the position of the density slider, by submitting the following statement in R:

R> NLCommand("set density 77")
The slider goes immediately to the position of 77 percent. We can now execute the setup procedure to initialize the simulation. We just submit in R:

R> NLCommand("setup")
And again, the command is executed immediately. The tick counter is reset to 0, the View is green and first fire turtles are found on the left side of the View. Please notice that the NLCommand function does not press the setup button, but calls the setup procedure. In the Forest Fire example it makes no difference as the setup button also just calls the setup procedure, but it is possible to add more code to a button than just calling a procedure. But we can copy and paste such code into the NLCommand function as well.
We now want to run one tick by executing the go procedure. This is nothing new; we just submit in R:

R> NLCommand("go")
We see that the tick counter was incremented by one and the red line of the fire turtles on the left of the View extended to the next patch.
As we have seen, the NLCommand function can be used to execute any command which could be typed into NetLogo's command center. We can, for example, print a message into NetLogo's command center with the following statement:

R> NLCommand("print \"Hello NetLogo, I called you from R.\"")
The backslashes in front of the quotation marks are used to "mask" the quotation marks; otherwise R would think that the command string ends after the print and would be confused. Furthermore, it is possible to submit more than one command at once and in combination with R variables. We can change the density slider and execute setup and go with one NLCommand call like this:

R> density.in.r <- 88
R> NLCommand("set density ", density.in.r, "setup", "go")
In most cases, we do not want to execute a go procedure only a single time but for, say, ten times (ticks). With the RNetLogo package we can do this with:

R> NLDoCommand(10, "go")
Now we have run the simulation for eleven ticks and may want to have this information in R. Therefore, we execute:

R> NLReport("ticks")
[1] 11
As you might expect, we can save this value in an R variable by typing:

R> ticks <- NLReport("ticks")
R> print(ticks)
[1] 11
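The masking of quotation marks and the combination of command fragments with R variables can be inspected in plain R, without a running NetLogo instance. The sketch below assumes that the fragments are joined with single spaces, which is what paste() does by default; how NLCommand joins its arguments internally is not shown here.

```r
# A command string with masked (escaped) quotation marks
cmd <- "print \"Hello NetLogo, I called you from R.\""
cat(cmd, "\n")   # cat() shows the string without the backslashes

# Combining command fragments with an R variable.
# Assumption: fragments are joined with single spaces.
density.in.r <- 88
combined <- paste("set density", density.in.r, "setup", "go")
cat(combined, "\n")
```

Printing a command string with cat() before submitting it is a simple way to see exactly what NetLogo would receive.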
This was already the basic functionality of the RNetLogo package. In the following section we mostly modify and/or extend this basic functionality.
NetLogo users should note that there is no "forever button". To run a simulation for several ticks we can use one of the loop functions (NLDoCommand, NLDoCommandWhile, NLDoReport, NLDoReportWhile) or write a custom procedure in NetLogo that runs the go procedure the desired number of times when called once by R.
To quit a NetLogo session, i.e., to close a NetLogo instance, we have to use the NLQuit function. If we used the standard GUI mode without assigning the NetLogo instance to an R variable, we can write:

R> NLQuit()
Otherwise, we have to specify which NetLogo instance we want to close by specifying the R variable storing it. Please note that there is currently no way to close the GUI mode completely. That is why we cannot run NLStart again in the same R session when NetLogo was started with its GUI.

Advanced controlling functions
In Section 3.4, we used the NLDoCommand function to run the simulation for ten ticks. Here, we will run the model for ten ticks as well, but we will collect the percentage of burned trees after every tick automatically:

R> NLCommand("setup")
R> burned <- NLDoReport(10, "go", "(burned-trees / initial-trees) * 100")
R> print(unlist(burned))
This code ran the simulation for ten ticks and wrote the result of the given reporter (the result of the calculation of the percentage of burned trees) after every tick into the R list burned.
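Since the result is an R list with one entry per tick, unlist turns it into a numeric vector that can be used directly in further calculations. The following sketch uses placeholder values instead of real model output, so it runs without NetLogo.

```r
# Placeholder values standing in for the list returned after three ticks;
# they are illustrative only and not real model output.
burned <- list(0.4, 0.9, 1.5)

burned.vec <- unlist(burned)   # numeric vector with one value per tick
is.numeric(burned.vec)
diff(burned.vec)               # change of the value from tick to tick
```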
If we want to run the simulation until the fire has died out, i.e., until no fire turtles are left, and know the percentage of burned trees in every tick, we can execute:

R> NLCommand("setup")
R> burned <- NLDoReportWhile("any? turtles", "go",
+    c("ticks", "(burned-trees / initial-trees) * 100"),
+    as.data.frame = TRUE, df.col.names = c("tick", "percent burned"))
R> plot(burned, type = "s")
The first argument of the function takes a NetLogo reporter. Here, the go procedure will be executed while there are turtles in the simulation, i.e., any? turtles reports true. Moreover, we have used not just one reporter (third argument) but a vector of two reporters; one returning the current simulation time (tick) and a second with the percentage of burned trees. Furthermore, we have defined that our output should be saved as a data frame instead of a list and we have given the names of the columns of the data frame by using a vector of strings in correspondence with the reporters. At the end, the R variable burned is of type data.frame and contains two columns; one with the tick number and a second with the corresponding percentage of burned trees. By using the standard plot function, we graph the percentage of burned trees over time (Figure 3).
We start by loading the required packages and get the patches or, more precisely, the colors and coordinates of the patches:

R> library("sp")
R> library("gstat")
R> patches <- NLGetPatches(c("pxcor", "pycor", "pcolor"), "patches")
Next, we convert the patches data.frame to a 'SpatialPointsDataFrame' and then use this 'SpatialPointsDataFrame' to create a 'SpatialPixelsDataFrame':

R> coordinates(patches) <- ~ pxcor + pycor
R> gridded(patches) <- TRUE
Now, we convert pcolor to a factor, define the colors for the plot and create it (not shown here, similar to Figure 6):

R> patches$pcolor <- factor(patches$pcolor)
R> col <- c("black", "white")
R> spplot(patches, "pcolor", col.regions = col, xlab = "x", ylab = "y")
We see that it is possible to get the whole NetLogo View. As we can see in its manual page, we can save the result of NLGetPatches into a list, matrix or, like here, into a data frame. Furthermore, we can reduce the patches to a subset, e.g., all patches that fulfill a condition, as we have done in the NLGetAgentSet example.
There are two other functions that operate the other way around. With NLSetPatches and NLSetPatchSet we can push an R matrix/data frame into the NetLogo patches. NLSetPatches function works only if we fill all patches, i.e., if we use a matrix which has the dimension of the NetLogo World. For filling just a subset of patches we can use the NLSetPatchSet function.
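The condition that the matrix must cover the whole NetLogo World can be illustrated in plain R. The sketch below assumes a small world of 5 by 4 patches, chosen only for illustration, and assumes that rows correspond to the world height; it recolors all black cells (0) to red (15) with replace.

```r
# Assumed world size for illustration (not taken from a specific model)
world.width <- 5
world.height <- 4

# A matrix covering the whole world: one cell per patch, all black (0)
m <- matrix(0, nrow = world.height, ncol = world.width)
m[2, 3] <- 55                      # one patch with a different color value

# Recolor: every 0 (black) becomes 15 (red), other values stay untouched
m.red <- replace(m, m == 0, 15)

# A full-world matrix has exactly one cell per patch
length(m.red) == world.width * world.height
```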
The following example shows the usage of the NLSetPatches function. We reuse the patches.matrix variable from NLGetPatches, change the values from 0 (black) to 15 (red) and use this new matrix as input for the NetLogo patch variable pcolor (Figure 7):

R> my.matrix <- replace(patches.matrix, patches.matrix == 0, 15)
R> NLSetPatches("pcolor", my.matrix)
Another function, NLGetGraph, makes it possible to get a NetLogo network built by NetLogo links into an igraph network. This function requires the R package igraph (Csárdi and Nepusz 2006). As an example, we can use the Small World model from NetLogo's Models Library. We build the NetLogo link network and transform it into an igraph network and finally plot it.
We start by loading as well as setting up the model and get the graph from NetLogo:

R> model.path <- file.path("models", "Sample Models", "Networks",
+    "Small Worlds.nlogo")
R> NLLoadModel(file.path(nl.path, model.path))
R> NLCommand("setup", "rewire-all")
R> my.network <- NLGetGraph()
Figure 7: A screenshot while NLSetPatches is executing. The color of the NetLogo patches on the right hand side is changed gradually from black to red.
There are two further functions, which are not presented here in detail. The first one is the NLSourceFromString function, which enables us to create or append model source code from strings in R. A usage example is given in the code sample folder (No. 16) of the RNetLogo package. Another helper function to send a data frame into NetLogo lists is NLDfToList. The column names of the data frame have to be equivalent to the names of the lists in the NetLogo model. The code sample folder (No. 9) includes a usage example.

Headless mode/Multiple NetLogo instances
As mentioned above, it is possible to start NetLogo in the background (headless mode) without a GUI. For this, we have to execute the NLStart function with a second argument. This will fail if we do not open a new R session (after using RNetLogo in GUI mode) because, as mentioned above, we cannot start several NetLogo sessions if we have already started one in GUI mode.
The NLStart function will save the NetLogo object reference in an internal variable in the local environment .rnetlogo. If we want to work with more than one NetLogo model/instance at once, we can specify an identifier (as a string) for the NetLogo instance in the third argument of NLStart.
We start with the creation of three NetLogo instances (maybe beside the one with the default identifier which is _nl. All functions presented until now take, as their last (optional) argument (nl.obj), a string which identifies a specific NetLogo instance created with NLStart. Therefore, we can specify which instance we want to use. When working in headless mode, the first thing to do is always to load a model. Executing a command or reporter without loading a model in headless mode will result in an error. Therefore, we load a model into all instances:

R> model.path <- file.path("models", "Sample Models", "Earth Science",
+    "Fire.nlogo")
R> NLLoadModel (

Application examples
The following examples are (partly) inspired by the examples presented for the NetLogo-Mathematica Link (see Bakshy and Wilensky 2007b). These are all one-directional examples (from NetLogo to R), but the package opens up the possibility of letting NetLogo and R interact and send back results from R (e.g., statistical analysis) to NetLogo and let the model react to them. Even manipulation of the model source by using the NLSourceFromString function is possible. This opens up the possibility to generate NetLogo code from R dynamically.

Exploratory analysis
A simple parameter sensitivity experiment illustrates exploratory analysis with RNetLogo, even though NetLogo has a very powerful built-in tool, BehaviorSpace (Wilensky 2012), for this simple kind of experiment. Here, we will use the Forest Fire model (Wilensky 1997a) from NetLogo's Models Library and explore the effect of the density of trees in the forest on the percentage of burned trees as described in Bakshy and Wilensky (2007b).
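The structure of such a sensitivity experiment can be sketched independently of NetLogo: a function that performs one model run for a given density is applied to a sequence of density values, with repetitions. In the sketch below, run.sweep and toy.sim are hypothetical names; toy.sim is a deliberately trivial placeholder and not the Forest Fire model. In a real experiment it would be replaced by a function that sets the density, runs the model, and reports the percentage of burned trees.

```r
# Run 'sim' for every density value, 'reps' times each, and return a
# data frame with one row per density and the mean result.
run.sweep <- function(sim, densities, reps = 1) {
  means <- sapply(densities, function(d) mean(replicate(reps, sim(d))))
  data.frame(density = densities, mean.result = means)
}

# Placeholder stand-in for a model run; NOT the Forest Fire model.
toy.sim <- function(density) density / 2

res <- run.sweep(toy.sim, densities = seq(10, 100, by = 10), reps = 3)
head(res)
```

Passing the simulation as a function argument keeps the experimental design (which values, how many repetitions) separate from the code that talks to the model.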

Database connection
There are R packages available to connect R to all common database management systems, e.g., RMySQL. In the following example we use the RSQLite package (James 2011), which provides a connection to SQLite databases (Hipp 2012), because this is a very easy-to-use database in a single file. It does not need a separate database server and is, therefore, ideal for agent-based modeling studies, where no large database management systems (DBMS) are used. The database can store the results of different simulation experiments in different tables together with metadata in one file. This makes it very easy to share simulation results. There are small and easy-to-use GUI programs available to browse and edit SQLite databases; see, for example, the SQLite Database Browser (Piacentini 2012).
Finally, we delete/clear the query and close the connection to the database:

Analytical comparison
The example application of Bakshy and Wilensky (2007b) compares results of an agent-based model of gas particles to velocity distributions found by analytical treatments of ideal gases. To reproduce this, we use the Free Gas model (Wilensky 1997b) of the GasLab model family from NetLogo's Models Library. In this model, gas particles move and collide with each other without external constraints. Bakshy and Wilensky (2007b) compared this model's results to the classical Maxwell-Boltzmann distribution. R itself is not a symbolic mathematical software but there are packages available which let us integrate such software. Here, we use the Ryacas package (Goedman, Grothendieck, Højsgaard, and Pinkus 2010) which is an interface to the open-source Yacas Computer Algebra System (Pinkus, Winitzki, and Niesen 2007).
We start with the agent-based model simulation. Because this model is based on random numbers we run repeated simulations.
We start by loading the Ryacas package:

R> library("Ryacas")
We can install Yacas, if currently not installed (only for Windows -see Ryacas/Yacas documentation for other systems) with:

R> yacasInstall()
Next, we get the mean energy from the NetLogo simulation and define the function B and register it in Yacas:

R> energy.mean <- NLReport("mean [energy] of particles")
R> B <- function(v, m = 1, k = 1)
+    v * exp((-m * v^2) / (2 * k * energy.mean))
R> yacas(B)
Then, we define the integral of function B from 0 to infinity and register the integral expression in Yacas:

R> normalizer.yacas <- yacas(N(B.integr))
R> normalizer <- Eval(normalizer.yacas)
R> print(normalizer$value)
[1] 50
In a further step, we calculate the theoretical probability values of particle speeds using Equation 1. We do this from 0 to the maximum speed observed in the NetLogo simulation.
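The value of the normalizer can be cross-checked numerically in base R, without Yacas. The sketch below assumes a mean energy of 50, a value chosen here only because it is consistent with the printed normalizer for m = 1 and k = 1; with a real simulation, energy.mean would come from NetLogo as shown above.

```r
# Assumption: mean energy of 50 (consistent with the printed normalizer
# for m = 1 and k = 1); not taken from a real simulation run.
energy.mean <- 50

B <- function(v, m = 1, k = 1)
  v * exp((-m * v^2) / (2 * k * energy.mean))

# Numerical integral of B from 0 to infinity
num <- integrate(B, lower = 0, upper = Inf)

# Closed form: the integral of v * exp(-a * v^2) over [0, Inf) is 1 / (2 * a);
# with a = m / (2 * k * energy.mean) this gives k * energy.mean / m.
closed.form <- 1 * energy.mean / 1

all.equal(num$value, closed.form, tolerance = 1e-4)
```

The closed form shows why, for m = k = 1, the normalizer equals the mean energy itself.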

Advanced plotting functionalities
R and its packages deliver a wide variety of plotting capabilities. As an example, we present a three-dimensional plot in combination with a contour map. We use the "Urban Suite - Sprawl Effect" model (Felsen and Wilensky 2007) from NetLogo's Models Library. This model simulates the growth of cities and urban sprawl. Seekers (agents) look for patches with high attractiveness and also increase the attractiveness of the patch they stay on. Therefore, the attractiveness of the patches is a state variable of the model, which can be plotted in R.

Time sliding visualization
As agent-based models are often very complex, more than three dimensions could be relevant for their analysis. With the RNetLogo package it is possible to save the output of a simulation in R for every tick and then click through, or animate, the time series of these outputs, for example a combination of the model's View and distributions of state variables. As a prototype, we write a function to implement a timeslider to plot turtles. This function can be extended to visualize a panel of multiple plots by tick. With a slider we can browse through the simulation steps. To give an example, we use the Virus model (Wilensky 1998) from NetLogo's Models Library to visualize the spatial distribution of infected and immune agents as well as boxplots of the time period of infection and the age in one plot panel. We first load the required package rpanel (Bowman, Crawford, Alexander, and Bowman 2007) and define a helper function to set the plot colors for the logical variables (sick, immune) of the turtles:

R> library("rpanel")
R> color.func <- function (
Next, we define the main function containing the slider and what to do if we move the slider. The input is a list containing data frames for every tick. When the slider is moved, we send the current position of the slider (i.e., the requested tick) to the plotting function, extract the corresponding data frame from the timedata list and draw a panel of four plots using this data frame.

Endless loops
If we use the functions NLDoCommandWhile and NLDoReportWhile, we should double check our while-condition. Are we sure that the condition will be met some time? To prevent endless loops, these functions take an argument max.minutes with a default value of 10. This means that the execution of these functions will be interrupted if it takes longer than the submitted number of minutes. If we are sure that we do not submit something that will trigger an endless loop, we can switch off this functionality by using a value of 0 for the max.minutes argument. This will speed up the operation because the time checking operation will not be applied.
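The idea behind max.minutes can be sketched in plain R as a while loop with a time guard. This is an illustration of the principle only, not the implementation used by RNetLogo; the helper do.while.timed and its arguments are hypothetical.

```r
# Hypothetical helper (not RNetLogo code): repeat 'body' while 'condition'
# returns TRUE, but stop with an error once 'max.minutes' have elapsed.
# max.minutes = 0 disables the time check.
do.while.timed <- function(condition, body, max.minutes = 10) {
  start <- Sys.time()
  iterations <- 0
  while (condition()) {
    elapsed <- as.numeric(difftime(Sys.time(), start, units = "mins"))
    if (max.minutes > 0 && elapsed > max.minutes) {
      stop("maximum time exceeded")
    }
    body()
    iterations <- iterations + 1
  }
  iterations
}

# Usage: a counter that reaches its stop condition after five steps
counter <- 0
n <- do.while.timed(function() counter < 5,
                    function() counter <<- counter + 1)
```

With a condition that never becomes false, the loop would end with an error after the time limit instead of running forever.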

Data type
The general mapping of NetLogo data types to R data types in RNetLogo is given in Table 2.
We should think about the data types we are trying to combine. For example, an R vector takes values of just one data type (e.g., string, numeric/double or logical/boolean) unlike a NetLogo list, which can contain different data types. Here are some examples.
First, we get a NetLogo list of numbers:

R> NLReport("(list 24 23 22)")
Second, we get a NetLogo list of strings:
R> NLReport("(list \"foo1\" \"foo2\" \"foo3\")")
Third, we try to get a NetLogo list of combined numbers and a string:
R> NLReport("(list 24 \"foo\" 22)")
The first two calls of NLReport will run as expected but the last call will throw an error, because NLReport tries to transform a NetLogo list into an R vector, which will fail due to the mixed data types. This is particularly relevant for the columns of data frames.
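The restriction on the R side can be seen without NetLogo. The following minimal sketch uses only base R and the values from the examples above; the variable names are chosen for illustration. It shows that an atomic vector silently coerces mixed values to a single type, whereas an R list keeps the type of each element.

```r
# An atomic vector holds exactly one type: mixing numbers and a string
# coerces every element to character.
mixed.vector <- c(24, "foo", 22)
vector.class <- class(mixed.vector)

# A list keeps the type of each element, as a NetLogo list does.
mixed.list <- list(24, "foo", 22)
list.classes <- sapply(mixed.list, class)

print(vector.class)
print(list.classes)
```

This is why a mixed NetLogo list has no lossless counterpart as an R vector.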

Data structure
Since RNetLogo does not restrict how NetLogo reporters are combined, it is very flexible but makes it necessary to think very carefully about the data structure that will be returned. How a NetLogo value is transformed in general is already defined in Table 2.
But this becomes more complex for iteration functions like NLDoReport where the return values of one iteration are combined with the results of another iteration, especially when requesting the result as a data frame instead of a list.
For example, it makes a difference in the returned data structure whether we request two values as a NetLogo list or as two single reporters in a vector (Table 3). Requesting the values as a NetLogo list returns a top-level list that contains, for each requested iteration, a vector of two values. Requesting two single reporters returns them in a list that is itself an entry of the top-level list, which results in a nested list structure. Neither solution is wrong or preferred; it just depends on what we want to do with the result.
Requesting the result of NLDoReport as a data frame converts the top-level list to a data frame in such a way that the top-level list entries become columns of the data frame and each iteration is represented by a row. This becomes problematic when nested NetLogo lists are requested (Table 4). In such a case, the nested NetLogo lists are transformed into R lists and the resulting data frame contains lists in its columns. Such a data structure is a valid, but uncommon, data frame, and some functions, like write.table, can operate only on a data frame that contains just simple objects in its columns. To make a data frame with nested lists fit for functions like write.table, we have to use the I(x) function for the affected columns to treat them 'as is' (see help(I) for details, e.g., my.df$col1 <- I(my.df$col1)).
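To make the structure concrete, the following sketch builds such a data frame by hand in base R. The values and the column name tick are made up for illustration and do not come from a NetLogo run; only the I() call corresponds to the step described above.

```r
# Build a small data frame with one ordinary column and one list column,
# mimicking (with made-up values) one nested NetLogo list per iteration.
my.df <- data.frame(tick = 1:3)
my.df$col1 <- list(c(1, 2), c(3, 4), c(5, 6))

# The column holds R lists, not simple values.
was.list <- is.list(my.df$col1)

# Mark the list column to be treated 'as is'.
my.df$col1 <- I(my.df$col1)
col.class <- class(my.df$col1)

print(col.class)
```

The individual entries remain accessible as before, for example via my.df$col1[[2]].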
Furthermore, using an agentset in an NLDoReport iteration with a data frame return value can become problematic. As long as the number of members of the agentset does not change, it can be requested without problems in a data frame. The data frame contains one column for each agent and one row for each iteration. If the number of agents changes during the iterations, the resulting data frame is not correct, as it contains entries that do not exist. The number of columns equals the maximum number of agents over all iterations. For those iterations that contain fewer agents, the remaining columns are filled with values repeated from the first columns, as the following example shows:
R> res <- NLDoReport(3, "go", "[who] of turtles", as.data.frame = TRUE)
R> str(res)
'data.frame': 3 obs. of 7 variables:
 $ X1: num 2 4 0
 $ X2: num 0 2 6
 $ X3: num 3 0 4
 $ X4: num 1 3 1
 $ X5: num 2 1 5
 $ X6: num 0 4 3
 $ X7: num 3 2 2
The first iteration contains four turtles, the second five, and the third seven turtles. The returned data frame therefore contains seven columns. Entries in the surplus columns of the first and the second row (i.e., iteration) are repeated from the first columns. Fortunately, R warns us that the lengths of the vectors differ. When we cannot be sure that the number of return values is the same in every iteration, we should use the default list data structure instead of the data frame return structure. Furthermore, if we want to request an agentset, we should rather use the NLGetAgentSet function in an R loop, as shown in Section 4.5, because it returns the requested values in sorted order: agents by their who number and patches from upper left to lower right.
These examples illustrate that it is necessary to think about the data structure that is required for further analyses and which function can process such a data structure.
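When the number of return values varies across iterations, one possible way to obtain a rectangular structure from the default list result is to pad shorter iterations with NA explicitly. The following sketch is not an RNetLogo feature; it uses base R only and re-enters by hand the who numbers from the agentset example above (four, five, and seven turtles).

```r
# Per-iteration results of unequal length, entered by hand from the
# agentset example (four, five and seven turtles).
res <- list(c(2, 0, 3, 1),
            c(4, 2, 0, 3, 1),
            c(0, 6, 4, 1, 5, 3, 2))

max.len <- max(lengths(res))

# Pad each iteration with NA up to the longest one; vapply returns one
# column per iteration, so transpose to get one row per iteration.
padded <- t(vapply(res,
                   function(v) c(v, rep(NA_real_, max.len - length(v))),
                   numeric(max.len)))
res.df <- as.data.frame(padded)

print(res.df)
```

In contrast to silent recycling, the missing agents are now visible as NA entries instead of repeated values.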

Working directory
We should avoid changing the working directory of R manually, because NetLogo needs the working directory to point to its installation path. As the R working directory and the Java working directory depend on each other, changing the R working directory can result in unexpected behavior of NetLogo. Therefore, we should use absolute paths for I/O processes in R instead of calling setwd(...). Note that the RNetLogo package changes the working directory automatically when loading NetLogo and changes it back to the former working directory when the last active NetLogo instance is closed with NLQuit.
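A minimal sketch of file I/O with absolute paths, using base R only, could look as follows. The output location, the file name, and the data values are assumptions made for illustration; tempdir() merely stands in for a project folder.

```r
# Hypothetical output location; tempdir() stands in for a project folder.
out.dir  <- normalizePath(tempdir())
out.file <- file.path(out.dir, "simulation_results.csv")

# Made-up results, only to have something to write.
results <- data.frame(tick = 1:3, infected = c(10, 14, 9))
write.csv(results, file = out.file, row.names = FALSE)

# Reading back also uses the absolute path, so no setwd() is needed.
reloaded <- read.csv(out.file)
```

Because the full path is stored in out.file, the code works regardless of the current working directory.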

Discussion
This article gave a theoretical and practical introduction to the RNetLogo package. After studying the examples, the reader should be well prepared to start their own projects based on RNetLogo. Since R offers so many interesting packages, including connections to many other programs, this link opens up a wide range of possibilities for both R users and NetLogo users.
Note that there are code samples for all functions in the example folder (RNetLogo/examples/code_samples) of the RNetLogo package. Furthermore, there are some example applications in the example folder, similar to those presented here.
As presented, the RNetLogo package successfully links the statistical computing environment R with the agent-based modeling platform NetLogo. Thereby it brings together the world of statistics and data analysis with the world of agent-based modeling. From the viewpoint of an R user, it opens up access to a rule-based modeling language and environment. Therefore, (nearly) all types of agent-based and system-dynamics models can be easily embedded into R. NetLogo's Models Library gives a good impression of the kinds of models that can be built: from deterministic to stochastic, from non-spatial to spatial, from 2D to 3D, and from cellular automata through network models and artificial neural networks to L-systems and many more.
Bringing simulation models to R is not entirely new. On the one hand, there are other modeling environments, like Repast (North, Collier, and Vos 2006), that offer the possibility to send data to R. However, the ability to control simulation experiments from R is new for such modeling tools. NetLogo was selected because it is very easy to learn, very well designed, and much better documented than other ABM platforms. It has a very active user community and seems to be the most appropriate basis for all kinds of modelers, from beginners to professionals and from ecology through the social sciences to informatics. On the other hand, there are packages available to build simulation models directly in R, like simecol (Petzoldt and Rinke 2007). simecol in particular is fast, very flexible, and a good choice compared with implementations in pure R, but it does not provide the specific support for efficient model development and simulation that agent-based modeling environments like NetLogo and Repast do.
Some first use cases of RNetLogo have been presented in this article. Besides the advanced visualization possibilities and connections to other software, an important application area is the design and analysis of simulation experiments in a systematic, less ad hoc way. R provides the functions needed to apply design of experiments (DoE) principles. With RNetLogo, the technical connection between all kinds of DoE and ABM is available.
There are already ready-to-use solutions for model analysis/DoE techniques available for agent-based modeling, like BehaviorSearch (Stonedahl and Wilensky 2013), MEME (Iványi, Gulyás, Bocsi, Szemes, and Mészáros 2007), and openMOLE (Reuillon, Chuffart, Leclaire, Faure, Dumoulin, and Hill 2010), but they are less flexible and adaptable than R. For a given task, several R packages are often available; if not, writing our own functions is flexible and fast, especially because many scientists already know R from its application for data analysis. Since RNetLogo does not restrict the user to predefined analysis functions, it offers great flexibility. However, RNetLogo can check the submitted NetLogo code strings only at runtime. This is a disadvantage, although the NetLogo code strings are typically simple and the lack of automated checking encourages well-designed analysis. Nevertheless, RNetLogo requires the user to understand the data types and structures of both NetLogo and R.
RNetLogo pushes the documentation and therefore the reproducibility of agent-based modeling studies, a key feature of science, to a new level. Using RNetLogo in conjunction with tools like Sweave (Leisch 2002), odfWeave (Kuhn, Weston, Coulter, Lenon, and Otles 2010) or SWord (Baier 2009) will contribute to replicability and reproducibility of agent-based simulation studies by automatic and self-documented report generation. For example, Sweave can embed R code in a LaTeX text document. When compiling the Sweave document, the R code is evaluated and the results (not only numeric but also images) can be embedded automatically in the LaTeX document. The RNetLogo package opens up the possibility to embed not only results of R, but also the result of a NetLogo simulation. We can create a self-documented report with NetLogo simulations and R analytics (with or without source code). For an example see the Sweave code of this article.
As models become more complex, their computational requirements increase as well. Many of these requirements are compensated by increasing computational power, but the use of modern model development and analysis techniques for stochastic models, like Bayesian calibration methods, makes a large number of repeated simulations necessary. Using RNetLogo involves, of course, an overhead when converting model results from NetLogo to R and vice versa, but there are already techniques available to spread such repetitions across multiple cores and computer clusters (see the RNetLogo package vignette "Parallel Processing with RNetLogo").
To sum up, I expect that this contribution will make agent-based modeling with NetLogo more popular and easier in the R community and will support the methodological developments towards rigorous model development, testing and analysis in the ABM community.