Nature is full of distributed systems whose units interact using simple local rules: insect swarms, neural networks, developing cells, bird flocks, and even human populations. The collective effect of these simple local interactions is often complex global behavior, and understanding how this happens is a major goal of biology.

A standard approach to advancing this goal is to apply the most
common tools from science: differential equations and statistical analysis.
For many of the distributed systems described above, however, these tools
seem to fall short. With this in mind, there has been growing interest in
recent years in applying *algorithmic methods*, developed primarily
in computer science, to better describe and analyze distributed
behavior in nature.

This topic has many names, depending on who you ask, including *natural
algorithms*, *biological algorithms*, and *biological
distributed algorithms*.
The core approach is to describe the local rules of a
natural distributed system as an algorithm, and then study the behavior or
fundamental capabilities of the
resulting system using existing analytical tools from computer science.

These tools allow us to ask and answer deep questions about these systems that cannot be so easily probed by differential equations, statistical analysis, or numerical simulation. For example: identifying the number of bits of memory needed to break symmetry among cells communicating with chemical signaling (the answer turns out to be the logarithm of the inverse of the error bound), or determining the computational power of a soup of reacting chemicals (under the right conditions, these soups can compute anything a general-purpose computer can).

A secondary motivation for studying biological algorithms is that they might generate new ideas and approaches for designing electronic computational systems that feature some of the nice properties observed in nature.

This course is a doctoral seminar that will review and discuss the rapidly evolving literature from the study of biological algorithms. Students will come away with a cutting-edge understanding of the main topics, results, and techniques in the field, as well as an understanding of the most promising open problems.

The course is a reading course. We will discuss two or three papers per class meeting. For each paper, we will cover the problems studied and their motivation, as well as the solutions proposed and any results. We will selectively dive into technical details as a group in places where I feel it is tractable and illuminating.

Most of the papers we read will come from the computer science perspective, though we will occasionally read papers from biology, physics, and related fields, where relevant.

A week before each class meeting, I will email a set of discussion questions for each of the papers to be covered in that meeting. This email will also include guidance on which sections of the readings to focus on and which can be skimmed or skipped. Students must email me responses to these questions before the class meeting begins. The goal of these questions is both to ensure you do the reading and to guide your attention to the areas most fruitful for discussion.

In addition to preparing readings, each student will be assigned one class meeting toward the end of the semester in which to prepare and lead a discussion on a topic of their choosing within biological algorithms. This will be your opportunity to practice surveying this literature and extracting interesting trends and directions on your own.

Grading in a doctoral seminar is pass/fail. Passing requires three things: (1) good answers to most of the reading response questions; (2) participation in discussion; and (3) a respectable effort in organizing your assigned class meeting.

Though there isn't a formal research requirement for this seminar, I encourage interested students to use it to identify promising research topics to pursue beyond the scope of the course. I am happy to discuss research collaborations. In addition, a fantastic NSF-funded workshop on this topic will be held this summer in Washington D.C., providing a perfect venue to present or discuss early-stage results and learn about cutting-edge developments.

I will post the readings for each class meeting in the schedule below. Note that I'll be filling in and modifying the details of this schedule on the fly as the semester progresses, so please check often for changes.

As of now, the tentative list of topics I hope to cover with these readings includes:

- Social insect computation
- Beeping networks
- Population protocols
- Chemical reaction networks
- Cellular computation
- Neurons as general purpose computing units
- Noisy computation
- Averaging protocols
- The brain as computer

A paper published in the journal *Science* in 2011
claimed that cells in fly embryos might be solving a well-known problem
in computer science as part of their development. The paper studied the
communication capabilities of these cells and proposed an algorithm that
the cells might be executing to solve the problem. A series of follow-up
papers within the computer science community grappled with the same model
and problem, highlighting tensions between the computer scientist's instinct
to identify the best possible solution, and nature's instinct to find
something
simple that works. We are starting with this topic as it provides a good
entry into the general approaches used in the study of biological algorithms,
and the tensions they create.

*Because this is the first
class, I do not expect the students to have completed this reading in advance.*

- A Biological Solution to a Fundamental Distributed Computing Problem.
- Beeping a Maximal Independent Set.
- Feedback from nature: an optimal distributed algorithm for maximal independent set selection.

A famous 1990 paper by Mirollo and Strogatz (cited over 1800 times), appearing in an applied mathematics journal, opens unexpectedly: “Fireflies provide one of the most spectacular examples of synchronization in nature.” The authors wanted to explore how simple local rules could lead to such impressive global displays.

The three papers we’ll read this week follow this thread of investigation, moving from the classical systems biology approach of Mirollo and Strogatz, to a more algorithmic but still bio-inspired approach applied by a group of Harvard computer scientists, to, finally, a purely computer science approach applied by a pair of optimization-minded, MIT-based theoreticians.

These readings provide a good example of the power gained when shifting
one’s conception of a natural phenomenon from a system of equations to a
distributed system of interacting algorithms. They also, however, provide
another example of the tension between a biologist’s interest in
understanding nature, and a computer scientist’s interest in identifying
the *optimal* solution to a problem for a given set of
constraints.
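To make this shift concrete, here is a minimal Python sketch of pulse-coupled oscillators in the spirit of the Mirollo and Strogatz model. It is not their exact formulation: instead of their concave state curve *f*, it uses a phase jump that grows with phase, which plays the same role in driving absorption. All names and parameters here are illustrative.

```python
import random

def simulate(n=10, eps=0.1, max_firings=10000, seed=1):
    """Event-driven sketch of pulse-coupled oscillators.

    Each oscillator has a phase in [0, 1) that advances at unit rate.
    On reaching 1 it "flashes" and resets to 0; every flash bumps the
    other phases upward by an amount that grows with phase (a stand-in
    for the concavity Mirollo and Strogatz need for convergence).
    Oscillators pushed to threshold flash in the same instant, in a
    chain reaction. Returns the number of flash events until all n
    oscillators flash together, or None if that never happens.
    """
    rng = random.Random(seed)
    phases = [rng.random() for _ in range(n)]
    for event in range(max_firings):
        # Advance time until the leading oscillator reaches threshold.
        dt = 1.0 - max(phases)
        phases = [p + dt for p in phases]
        pending = [i for i, p in enumerate(phases) if p >= 1.0]
        fired = set()
        while pending:
            f = pending.pop()
            fired.add(f)
            for i in range(n):
                if i not in fired and i not in pending:
                    # Phase jump grows with phase: bigger kick near threshold.
                    phases[i] = min(1.0, phases[i] + eps * (1.0 + phases[i]))
                    if phases[i] >= 1.0:
                        pending.append(i)
        for i in fired:
            phases[i] = 0.0
        if len(fired) == n:
            return event  # full synchrony: everyone flashed at once
    return None
```

Once two oscillators are absorbed into a flashing cluster they stay synchronized forever, and clusters pulse the rest more strongly, which is the mechanism behind the global displays the readings analyze.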

**Readings**

- Synchronization of Pulse-Coupled Biological Oscillators
- DESYNC: Self-Organizing Desynchronization and TDMA on Wireless Sensor Networks
- Deploying Wireless Networks with Beeps

**Reading Instructions**

For the Mirollo paper, read through Section 2.2 and understand the results proved. You do not have to understand the techniques. For the Degesys paper, read through Section 3. You can skim the discussions of how the algorithms are applied to wireless networks. Also look at the results proved in Section 4. You do not need to know the details of how they are proved, but glance at the techniques used. Finally, for the Cornejo paper, read through the algorithm description in Section 4. Also take a look at the algorithms and results proved in the sections that follow. You do not need to look deeply at the proof details.

**Reading Questions**

- Summarize at a high level the different analytical approaches taken in each of the three papers (e.g., the areas of math used).
- Cornejo and Kuhn are unhappy with the model used in the Degesys paper (they actually cite another paper that builds on the Degesys paper and uses the same model). Why?
- Does the complaint from the above question invalidate the use of the Degesys model for studying biological phenomena? Why or why not?
- Rate these papers in terms of their relevance to understanding biological systems like firefly flashing, and explain your ratings.

During the first two weeks of this seminar we have studied systems made up of components that communicate with simple signals (e.g., protein expression, beeps and/or flashes). The pair of papers assigned for this week apply techniques from computer science to understand the computational power of these bio-inspired systems.

These papers highlight how concepts of complexity and computability from theoretical computer science can apply to biological systems, and identify interesting trade-offs between properties such as logic complexity, synchrony, and signal diversity in determining the power of a system to solve computational problems.
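As a warm-up for thinking in these models, here is a Python sketch of a folklore symmetry-breaking idea for single-hop beeping networks. To be clear, this is not Algorithm 1 from the Gilbert paper; it is just an illustration of the style of simple local rule these papers analyze, with names and parameters of my own choosing.

```python
import random

def leader_election(n, rounds=64, seed=0):
    """Folklore symmetry breaking in a single-hop beeping network.

    In each synchronous round, every still-active node beeps with
    probability 1/2; an active node that stays silent but hears a
    beep drops out. The active set roughly halves per round, so a
    single survivor remains with good probability after O(log n)
    rounds. Returns the set of surviving node ids.
    """
    rng = random.Random(seed)
    active = set(range(n))
    for _ in range(rounds):
        if len(active) <= 1:
            break
        beepers = {i for i in active if rng.random() < 0.5}
        if beepers:  # silent active nodes hear a beep and drop out
            active = beepers
    return active
```

Note the trade-offs already visible in this toy: the rule uses almost no state per node, but it needs round synchrony and only succeeds with high probability, not certainty.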

**Readings**

**Reading Instructions**

For the Gilbert paper on the computational power of beeps, understand the model definition and its motivation. Understand the definition of the leader election problem. Understand the basic operation of Algorithm 1 in Section 3.2. For the sub-sections that follow in Section 3 (3.3, 3.4, 3.5), understand the main theorem statements in each section and get a rough sense of the algorithmic strategies that combine with Algorithm 1 to obtain the results. Finally, understand what is being claimed by the main theorem statement in Section 4. There is no need to dive into proof details in this paper, though you might find it useful to get a general sense of some of the techniques.

For the Emek paper on stone age computing, understand the model (which is a little tricky in its details) and its motivations. You should understand the problems solved in Sections 3 - 6, but it is sufficient to just get a general sense of how they solve them. There is no need to dive into the proof details.

**Reading Questions**

- A major difference between the models is their assumptions about timing: one is synchronous, the other asynchronous. What would happen to the leader election strategies in the Gilbert paper if the assumptions on round synchrony were loosened?
- The Gilbert paper proved that you need a non-constant number of states to solve leader election with non-constant error probability. The Emek paper solves the MIS problem (which in a single hop network is the same as leader election) with probability 1 in the limit using machines that have only a constant number of states. What explains this apparent contradiction?
- The Gilbert paper assumes a *one-many* principle in receiving information, whereas the Emek paper assumes a *one-two-many* principle. Define what these mean and argue one of the following two points: (1) from a computational perspective they are essentially the same; or (2) there are non-trivial differences in computational power between these two.

Earlier in the 20th century, breakthrough biology research on ants focused
on observing colony behavior and decoding ant biology (this is the period
in which E.O. Wilson identified chemical communication mechanisms).
Later in the century, attention turned toward understanding *how*
ants could create complex societies through simple local rules.
This systems biology question is perfectly suited to the application of
an algorithmic approach. This week, we will study a sampling of the resulting papers.

Notice how the papers grapple with the right amount of complexity to use in describing the ants: too little power and nothing interesting is possible; too much and the ants can compute anything. What's the right approach?

**Readings**

- Deborah Gordon's TED Talk: The emergent genius of ant colonies
- Collaborative Search on the Plane without Communication
- Trade-offs Between Selection Complexity and Performance when Searching the Plane without Communication
- Task Allocation with Ant Colonies
- Of Robot Ants and Elephants: A Computational Comparison

**Reading Instructions**

Watch the Gordon video (or read the transcript). It provides a good introduction to some key ant colony behavior and how biologists study them (Gordon is a leader of a more algorithmic approach to the topic). For the four technical papers, your goal should be to understand: (1) how they model the ants; (2) the problems they have the ants solving; (3) the algorithmic strategies used to solve the problems; (4) the bounds and impossibility results established. You do not have to dive deep into the technical details of the proofs, but do get a sense of the type of techniques used. For the Ants and Elephants paper, in particular, focus mainly on the Elephant Gun algorithm, as that's the result we will mainly discuss.

**Reading Questions**

- In the Feinerman paper, they prove that no uniform search algorithm is O(log(n))-competitive. What does that mean? What is the implication for understanding real ant colonies?
- The Lenzen paper introduces a metric called selection complexity. Why is it useful for studying biological algorithms?
- Come up with another definition of selection complexity that captures the same general goals.
- Which of the result(s) in these papers do you find to be most relevant to biology, and why?

One of the most impressive displays of distributed intelligence in nature is the ability of insect swarms to construct complex structures (think: wasp nests and termite mounds). It is widely understood that no single insect has the cognitive capabilities to understand all aspects of these constructions, so they represent a classical example of an emergent result of simple rules.

This week's readings include two biology papers on particular distributed construction strategies, and a computer science paper that encodes similar rules into a robot swarm. Our goal is to understand the types of distributed algorithms deployed in these scenarios, and to probe what role (if any) algorithmic analysis might play in increasing our understanding of these behaviors.

**Readings**

- Decentralized Control of Construction Behavior in Paper Wasps
- Coordination in Distributed Building
- Distributed Construction by Mobile Robots with Enhanced Building Blocks

**Reading Instructions**

For the biology papers, the key is to get a feel for the types of local rules these insects seem to be deploying, along with the techniques used to identify and validate these rules. For the computer science paper, get a sense of the types of algorithms they are deploying on the swarm and what is required to make them function.

**Reading Questions**

- How do the biologists determine which rules they believe the insects they study actually deploy?
- How do the computer scientists know their algorithms will work?
- What role do you think algorithmic analysis might have in understanding distributed building strategies?

One of the most successful existing intersections between the study of
distributed
algorithms and natural systems is the topic of *population protocols.*
The population protocol model assumes a collection of finite state machines
that interact in a pairwise fashion as selected by some scheduler.
These networks can *compute* in the sense that we can treat the multiset of
initial states
as an input, and ask the network to stabilize to a configuration where the
multiset of stable states is the corresponding output.

Because this model is precisely defined, computer scientists could
probe its exact computational power. It turns out, moreover, that many other
systems (including some from the natural world, such as *chemical reaction
networks*)
can be simulated by this model, so these results provide insight into the
power of those systems as well. This week we will learn about this model,
its computational power, and the types of strategies used to solve problems
in this setting.
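As a concrete (and deliberately tiny) example of the model, here is a Python sketch of a population protocol running under a uniform random scheduler. The protocol is the simple one-way epidemic that stably computes the OR predicate; the function names and parameters are mine, not from the papers.

```python
import random

def step(a, b):
    """Transition function: a 1 "infects" its interaction partner,
    so 1s spread through the population like an epidemic."""
    out = a | b
    return out, out

def run(inputs, interactions=100000, seed=0):
    """Run the protocol under a uniform random scheduler.

    States double as outputs: the population stably computes the OR
    predicate, stabilizing to all-1 iff some agent's input was 1.
    Note that agents cannot detect their own convergence, so we
    simply run a fixed number of pairwise interactions.
    """
    rng = random.Random(seed)
    states = list(inputs)
    for _ in range(interactions):
        i, j = rng.sample(range(len(states)), 2)  # scheduler's pick
        states[i], states[j] = step(states[i], states[j])
    return states
```

This sketch treats the interaction as symmetric; the full model distinguishes an initiator and a responder, which matters for richer transition functions than this one.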

**Readings**

- Computation in Networks of Passively Mobile Finite-State Sensors
- Stably Computable Predicates are Semilinear
- A Simple Population Protocol for Fast Robust Approximate Majority

**Reading Instructions**

Start with the first paper in the list, as it was the original paper on the topic. It is worth reading the whole main body of the paper, which avoids a lot of technical detail and summarizes much about the model, its motivation, and what you can do with it. Pay particular attention to the strategy in Section 4.2: we will see it again.

Then proceed with the *Stably Computable Predicates* paper. This
paper was important as it precisely bounded the power of the population
protocol model (under certain assumptions). Read the paper through Section 3
to understand *what* they are proving. You can skip all the technical
details
of the proof that then follow.

Finally, for the approximate majority paper, learn the problem they solve and
their simple solution. Pay attention to the discussion in the introduction
about
what makes a *good* population protocol solution. You do not need
to follow the proof, but maybe glance at it to get a sense of the type
of analysis they are deploying.

**Reading Questions**

- The first reading shows how to simulate a bounded-space TM using population protocols. The second paper shows population protocols can only compute semilinear predicates. Bounded-space TMs can compute things that are not semilinear predicates. What explains this apparent contradiction?
- Describe the transition function for a population protocol that has only two states (*leader* and *not leader*), and that guarantees that if all agents start in the *leader* state, then the system will eventually converge to a configuration with exactly one agent in the *leader* state.

The behaviors of chemical reactants in a well-mixed solution have
been closely studied for over a century.
In more recent years, however,
an interdisciplinary collection of scientists has been studying
how these *chemical reaction networks* can be made to compute solutions
to problems.

Part of the interest in this work is understanding the
fundamental power of natural systems. Another part is figuring
out how to design biological computing devices.
This work has strong intersections with computer science (including with the
study of *population protocols*, which we reviewed last week in this
seminar).
This week we try to understand some of these connections.
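To ground the model, here is a minimal Gillespie-style stochastic simulation sketch in Python for a toy CRN. It follows the general SCRN setup from the readings (discrete molecule counts, reactions firing at rates proportional to their propensities) but is my own simplified illustration, not code from the papers; it assumes unit volume and at most one molecule of each reactant species per reaction.

```python
import random

def gillespie(counts, reactions, t_end, seed=0):
    """Minimal Gillespie (stochastic simulation) sketch for a CRN.

    counts: dict mapping species name -> molecule count.
    reactions: list of (reactants, products, rate) triples, with
    reactants/products given as dicts of stoichiometries. A reaction's
    propensity is its rate times the product of its reactant counts.
    Simulates until t_end or until no reaction can fire.
    """
    rng = random.Random(seed)
    t = 0.0
    while t < t_end:
        # Compute each reaction's propensity in the current state.
        props = []
        for reactants, _, rate in reactions:
            a = rate
            for sp in reactants:
                a *= counts[sp]  # assumes stoichiometry 1 per species
            props.append(a)
        total = sum(props)
        if total == 0:
            break  # terminal configuration: nothing can fire
        t += rng.expovariate(total)  # time until the next reaction
        r = rng.random() * total     # choose which reaction fires
        for (reactants, products, _), a in zip(reactions, props):
            if r < a:
                for sp, k in reactants.items():
                    counts[sp] -= k
                for sp, k in products.items():
                    counts[sp] += k
                break
            r -= a
    return counts

# Toy network X + Y -> 2Y: a Y "converts" an X on collision,
# so the system stabilizes with no X remaining.
final = gillespie({"X": 30, "Y": 1},
                  [({"X": 1, "Y": 1}, {"Y": 2}, 1.0)],
                  t_end=100.0)
```

The `X + Y -> 2Y` example is the CRN analogue of the epidemic population protocol: pairwise molecular collisions play the role of scheduler-selected agent interactions.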

**Readings**

- Computation with Finite Stochastic Chemical Reaction Networks
- Deterministic Function Computation with Chemical Reaction Networks
- Speed Faults in Computation by Chemical Reaction Networks

**Reading Instructions**

Start with the *Computation with Finite Stochastic Chemical Reaction
Networks* paper.
Read Sections 1 - 3. Pay careful attention to the model of SCRN's described in
Section 2. You can skim Section 3 at a level of detail that still allows
you to understand the basic ideas behind the counter machine
simulation.

For the *Deterministic Function* paper, it is sufficient to read
enough to understand what this paper does differently from
the paper above, and from what was shown in the population protocol
papers read last week.

Finally, for the *Speed Faults* paper, read the introduction, which
captures
a lot of information. Then read enough to understand what they
mean by a *speed fault* and what is meant by the main theorems.

**Reading Questions**

- The *Finite Stochastic* paper simulates Turing-universal computation with chemical reaction networks. How is the notion of a TM tape handled in a system made up only of molecules?
- Many of these papers cite an equivalence between what can be reliably computed with a CRN and what can be computed stably with a population protocol. Why is such an equivalence true? How would you show it?

This week we will continue our discussion of chemical reaction networks. A big part of our focus will be to complete our discussion of the three papers from last week (one class is not enough to cover all three). We will then tackle the below paper, which applies the CRN model to help understand a phenomenon from real biological systems (as opposed to the main focus of many CRN papers, which is to develop computational systems that can be implemented in a synthetic biology context).

**Readings**

**Reading Instructions**

Try to understand what phenomenon this paper studies and the role CRN's play in this investigation. It is unnecessary to follow all the technical details.

**Reading Questions**

- What is *bet hedging* in the context of this paper?
- What role do CRNs play in this paper's attempt to understand this phenomenon?

One of the most obvious intersections between biology and computation is the information processing capability of neurons. This intersection is studied in computational neuroscience by first defining a biologically plausible neuron model, then studying the capabilities of neural networks of this type from a theoretical/numerical perspective, and finally using these insights to generate testable hypotheses about brain behavior.

**Readings**

- Networks of Spiking Neurons
- Efficient Simulation of Finite Automata by Neural Nets
- Turing Computability with Neural Nets

**Reading Instructions**

Start by reading Section 1 of the "Networks of Spiking Neurons" paper: this will provide the background regarding the three main types of neuron models used in these types of studies. Next, move on to the "Efficient Simulation of Finite Automata" paper. Read the introduction for a summary of the results and their motivation, then read the definition of Mealy Machines from Section 3 and the definition of Threshold Machines (their term for the neuron model they study) from Section 4. If you want, you can try to get a rough sense of their strategy for simulating Mealy Machines with neurons. Next read the "Turing Computability with Neural Nets" paper, which is a short summary of a longer technical report. The core ideas to come away with from this paper are the neuron model used, and the basic strategy used to simulate a TM with these neurons. Finally, return to the "Spiking Neurons" paper and skim over the four theorem statements proved, to get a sense for the power of this model versus some of the other models considered.
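To make the first-generation model concrete, here is a tiny Python sketch of threshold (McCulloch-Pitts style) units simulating a finite automaton. The parity machine below is my own toy example, not the Mealy-machine construction from the paper.

```python
def threshold(weights, inputs, theta):
    """First-generation unit: binary output that fires iff the
    weighted input sum reaches the threshold theta."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

def parity_network(bits):
    """Two recurrent threshold gates simulating the 2-state parity
    automaton (state = parity of the 1s seen so far).

    next_state = state XOR bit, built from threshold gates: first an
    AND gate h, then a gate firing iff state + bit - 2*h >= 1.
    """
    state = 0
    for x in bits:
        h = threshold([1, 1], [state, x], 2)             # AND(state, x)
        state = threshold([1, 1, -2], [state, x, h], 1)  # XOR via h
    return state

print(parity_network([1, 0, 1, 1]))  # odd number of 1s -> 1
```

The point of the sketch is that the automaton's state lives in the feedback loop, which is the general trick behind simulating machines with networks of such units.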

**Reading Questions**

- The "Spiking Neuron" paper defined three generations of neuron models. Which generation of models is used by the "Finite Automata" paper and which generation is used by the "Turing Machine" paper?
- The "Turing Machine" paper does not directly simulate a TM, but instead simulates another computational model that is known to have the same power as a TM. What is this model?
- The authors of the "Efficient Simulation" paper note that their results might be "disturbing." Why?

In this week's discussion we will tackle two new papers about computation with neurons. The first presents some interesting new theoretical results on the computational power of these objects in the presence of failures, while the second is a good example of a cutting-edge application paper making use of artificial neural networks. Reading these back to back should provide some interesting insight into the potential connections, and large gaps, between the theoretical and practical engagement with this topic.

**Readings**

Earlier in the seminar we studied results that simulated Turing machines using neural networks. A lot of effort was expended to encode memory using only the properties of the neurons. In some sense, this is silly: we know the brain already has well-developed mechanisms for storing and retrieving information, and hoping that an artificial neural network will efficiently converge to solutions that leverage fragile TM simulation strategies is too optimistic. The neural Turing machine model studied today directly models an information storage and retrieval capability (while maintaining the differentiability needed for learning), with the goal of making it easier for learning algorithms to achieve powerful artificial neural networks. These machines provide the foundation for many of DeepMind's recent breakthroughs, and in the context of this course, help highlight the gap between theoretical computational power and practically exploitable computational power.

**Readings**

- Neural Turing Machines
- Hybrid computing using a neural network with dynamic external memory (web link)

Earlier in the course, we studied several population protocols. In this session, we will study trade-offs between the time complexity of a population protocol and the space complexity of its agents. In the first paper, we will see an interesting lower bound on the time complexity of population protocols with a certain number of states. We will also see protocols with O(log^2 n) states that can solve leader election and majority in logarithmic time. In the second and third papers, we will see that with different assumptions we can get better results.

**Readings**

- Time-Space Trade-offs in Molecular Computation
- A Simple Population Protocol for Fast Robust Approximate Majority
- Minimizing Message Size in Stochastic Communication Patterns