Nuvem

What is the cloud?

Virtualized resources on tap

Scaling out of the box

Distributed, multi-vendor computing

Reproducible configurations

Reproducible science

A new application development and deployment paradigm

Grab all the code!

git.io/cloudEC

Address your tweets to @jjmerelo + #ppsn18

Why add cloud to evolutionary algorithms?

✓ It's new!

Well...

✓ No sunk cost!

✓ It scales!

➡ It changes the algorithmic paradigm

♻ Let Nature be your guide

JavaScript = its native language

Let's do JavaScript!

Menu → developer → console

firefox console

Say hello to these nice folks!

console.log('¡Hola, chavales!')

Or the much more annoying

alert('¿Qué pasa, coleguis?');

This is an object. That, too.

console.log('Buenos días'.length)

Arrays are objects, and the other way round

console.log(['Buenos días','Buenas tardes','Buenas noches'].pop())

Chromosomes and fitness

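var chromosome = '1001100110011';
// Cache of fitness values, keyed by chromosome
var fitness_of = {};
// Count-ones fitness; `|| []` guards against chromosomes with no 1s
fitness_of[chromosome] = (chromosome.match(/1/g) || []).length;
// Royal road: one point per 4-bit block that is all 0s or all 1s
var rr = function (chromosome) {
  var fitness = 0;
  for (var i = 0; i < chromosome.length; i += 4) {
    var ones = (chromosome.substr(i, 4).match(/1/g) || []).length;
    fitness += (ones == 0 || ones == 4) ? 1 : 0;
  }
  return fitness;
};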
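A minimal sketch of how a block-based fitness function like `rr` pairs with a mutation operator. The names `royalRoad`, `blockSize` and `mutate` are made up for this example; the scoring rule follows the snippet above (a block scores when it is all 0s or all 1s).

```javascript
// Royal-road style fitness: one point for every full block that is all 0s
// or all 1s. `royalRoad` and `blockSize` are illustrative names.
function royalRoad(chromosome, blockSize = 4) {
  let fitness = 0;
  for (let i = 0; i < chromosome.length; i += blockSize) {
    const block = chromosome.slice(i, i + blockSize);
    const ones = (block.match(/1/g) || []).length;
    if (ones === 0 || ones === blockSize) {
      fitness += 1;
    }
  }
  return fitness;
}

// Flip one randomly chosen bit and return the result as a new string.
// `random` can be swapped for a deterministic function when testing.
function mutate(chromosome, random = Math.random) {
  const position = Math.floor(random() * chromosome.length);
  const flipped = chromosome[position] === '1' ? '0' : '1';
  return chromosome.slice(0, position) + flipped + chromosome.slice(position + 1);
}

const parent = '1111000010101111';
const child = mutate(parent);
console.log(parent, royalRoad(parent));
console.log(child, royalRoad(child));
```

Strings are immutable in JavaScript, so every operator returns a new chromosome instead of changing the old one in place.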

JavaScript is:

Standard, (reasonably) fast and

Everywhere

Yes, also in your PS4

(Almost) forget about loops

function do_ea() {
 eo.generation();
 generation_count++;
 if( (eo.fitness_of[eo.population[0]] < traps*conf.fitness.b )
  && ( generation_count*conf.population_size < conf.max_evaluations)) {
  setTimeout(do_ea, 5);
 } else {
  console.log( "Finished ", log );
 }
}
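The same timer-driven pattern, sketched without any library so it can be pasted into a console. This is a simple (1+1) EA on count-ones, not the NodEO algorithm; `countOnes`, `step`, `evolve` and the `schedule` parameter are names assumed for this example.

```javascript
// Count of 1s: a stand-in fitness function for this sketch.
function countOnes(chromosome) {
  return (chromosome.match(/1/g) || []).length;
}

// One step of a (1+1) EA: flip one bit, keep the child if it is not worse.
function step(state, random = Math.random) {
  const position = Math.floor(random() * state.best.length);
  const bit = state.best[position] === '1' ? '0' : '1';
  const child = state.best.slice(0, position) + bit + state.best.slice(position + 1);
  state.evaluations++;
  if (countOnes(child) >= countOnes(state.best)) {
    state.best = child;
  }
  return state;
}

// No blocking loop: each step schedules the next one, so the page (or the
// Node.js event loop) stays responsive between steps.
function evolve(state, maxEvaluations, onDone, schedule = (fn) => setTimeout(fn, 0)) {
  step(state);
  const solved = countOnes(state.best) === state.best.length;
  if (!solved && state.evaluations < maxEvaluations) {
    schedule(() => evolve(state, maxEvaluations, onDone, schedule));
  } else {
    onDone(state);
  }
}

evolve({ best: '0000000000000000', evaluations: 0 }, 500, (final) => {
  console.log('Finished', final);
});
```

Passing the scheduler as a parameter is a design choice for this sketch: a timer in the browser, a plain function call when testing.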

A whole algorithm in a browser

The browser is the new operating system

And why not in the server too?

Node.js is an asynchronous JavaScript runtime.

NodEO is an EA library.

var eo = new nodeo.Nodeo( { population_size: population_size,
			    chromosome_size: chromosome_size,
			    fitness_func: utils.max_ones } );
do {
    eo.generation();
    console.log( eo.population[0] );
} while ( eo.fitness_of[eo.population[0]] < chromosome_size );

Cloud is about reproducible infrastructure

Let's containerize

var hiff = new HIFF.HIFF();
var eo = new nodeo.Nodeo( { population_size: conf.population_size,
			    chromosome_size: chromosome_size,
			    fitness_func: hiff } );
logger.info( { start: process.hrtime() } );
evolve(generation_count, eo, logger, conf, check );

A container does one thing

if ( typeof process.env.PAPERTRAIL_PORT !== 'undefined'
     && typeof process.env.PAPERTRAIL_HOST !== 'undefined' ) {
    logger.add(winston.transports.Papertrail,
               {
                   host: process.env.PAPERTRAIL_HOST,
                   port: process.env.PAPERTRAIL_PORT
               }
              );
}
var check = function( eo, logger, conf,  generation_count ) {
    if ( (eo.fitness_of[eo.population[0]] < conf.fitness_max ) 
          && (generation_count*conf.population_size < conf.max_evaluations )) {
	logger.info( { "chromosome": eo.population[0],
		       "fitness" : eo.fitness_of[eo.population[0]]} );
	evolve( generation_count, eo, logger, conf, check);
    } else {
	logger.info( {end: { 
	    time: process.hrtime(),
	    generation: total_generations,
	    best : { chromosome : eo.population[0],
		     fitness : eo.fitness_of[eo.population[0]]}}} );
	conf.output = conf.output_preffix+".json";
	process.exit();
    }
};
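The container takes its configuration from the environment, which keeps the image itself generic. A minimal sketch of that idea as a testable function; `papertrailOptions` is a hypothetical helper and `logs.example.com` is a placeholder host, neither comes from the original code.

```javascript
// Build remote-logging options from environment variables.
// Returns null when either variable is missing, so the caller can fall back
// to local logging. `papertrailOptions` is an illustrative name.
function papertrailOptions(env = process.env) {
  const host = env.PAPERTRAIL_HOST;
  const port = env.PAPERTRAIL_PORT;
  if (host === undefined || port === undefined) {
    return null;
  }
  // Environment variables are always strings; convert the port explicitly.
  const portNumber = Number.parseInt(port, 10);
  if (Number.isNaN(portNumber)) {
    throw new Error(`PAPERTRAIL_PORT is not a number: ${port}`);
  }
  return { host, port: portNumber };
}

// Placeholder values, standing in for what `docker run -e` would provide.
const exampleEnv = { PAPERTRAIL_HOST: 'logs.example.com', PAPERTRAIL_PORT: '7777' };
const options = papertrailOptions(exampleEnv);
console.log(options === null
  ? 'Logging locally'
  : `Logging to ${options.host}:${options.port}`);
```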

Describe infrastructure: package.json

{
  "name": "hiffeitor",
  "scripts": {
    "test": "mocha",
    "start": "./callback-ea-HIFF.js"
  },
  "dependencies": {
    "nodeo": "^0.2.1",
    "winston": "^2.2.0",
    "winston-logstash": "^0.2.11",
    "winston-papertrail": "^1.0.2"
  },
  "devDependencies": {
    "flightplan": "^0.6.14"
  }
}

Introducing docker

Lightweight virtualization

Portable infrastructure

Using docker

docker pull jjmerelo/cloudy-ga

Containerizing through Dockerfile

FROM node:alpine 
MAINTAINER JJ Merelo "jjmerelo@gmail.com"
RUN echo "Building a docker environment for NodEO"

#Download basic stuff
RUN apk update && apk upgrade && apk add python make g++ 
		  
RUN mkdir app
ADD https://github.com/JJ/cloudy-ga/raw/master/app/callback-ea-HIFF.js app
ADD https://github.com/JJ/cloudy-ga/raw/master/app/package.json app
ADD https://github.com/JJ/cloudy-ga/raw/master/app/hiff.json app
WORKDIR /app
RUN npm i
RUN chmod +x callback-ea-HIFF.js
CMD npm start

Bring your own container

sudo docker build --no-cache -t jjmerelo/cloudy-ga:0.0.1 .

... and run it

sudo docker run -t \
     -e "PAPERTRAIL_PORT=7777" \
     -e "PAPERTRAIL_HOST=logs77.papertrailapp.com" \
     jjmerelo/cloudy-ga:0.0.1

Logging matters

Papertrail

Use CoreOS

Ready to run on Azure or anywhere

It's not programming as usual

Reactive programming

Algorithm + stream = application in the cloud

Decoupled processing and data structures

Before

do {
    eo.generation();
} while ( eo.fitness_of[eo.population[0]] < chromosome_size ); 

Decoupling

var random_chromosome = function() {
    return utils.random( chromosome_size );
};
var population = new Population();
population.initialize( population_size, random_chromosome); 
var eo = new fluxeo( this_fitness,
		     new Tournament( tournament_size,
		        population_size-2 ),
                     check);

Algorithm on population

eo.algorithm( population, function ( population ) {
    logger.info( {
	end: { time: process.hrtime(),
	    generation: total_generations,
	    best : { chromosome : population.best,
		fitness : population.fitness(population.best)  }
	}
    });
});
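To make the decoupling concrete, here is a minimal sketch in which the population is a plain data structure and the variation operator is a pure function that knows nothing about storage. This `Population` class, its `insert` method and `flipBit` are assumptions for illustration; they do not reproduce the API used in the snippets above.

```javascript
const countOnes = (chromosome) => (chromosome.match(/1/g) || []).length;

// The data structure: holds chromosomes and knows how to rank them.
class Population {
  constructor(fitness) {
    this.fitness = fitness;
    this.members = [];
  }

  initialize(size, generator) {
    this.members = Array.from({ length: size }, () => generator());
    return this;
  }

  // Sort a copy by descending fitness and return the first element.
  best() {
    return [...this.members].sort((a, b) => this.fitness(b) - this.fitness(a))[0];
  }

  // Replace the least fit member if the newcomer is strictly better.
  insert(chromosome) {
    let worst = 0;
    for (let i = 1; i < this.members.length; i++) {
      if (this.fitness(this.members[i]) < this.fitness(this.members[worst])) {
        worst = i;
      }
    }
    if (this.fitness(chromosome) > this.fitness(this.members[worst])) {
      this.members[worst] = chromosome;
    }
  }
}

// The processing: a pure function from parent to child.
function flipBit(chromosome, position) {
  const bit = chromosome[position] === '1' ? '0' : '1';
  return chromosome.slice(0, position) + bit + chromosome.slice(position + 1);
}

const population = new Population(countOnes).initialize(4, () => '0000');
population.insert(flipBit(population.best(), 0));
console.log(population.best()); // prints "1000"
```

Because the operator only maps a chromosome to a chromosome, the population could just as well live in another process or behind a network API.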

Running in the cloud

Infrastructure as a service

Create instance

Starting Azure instance

Set up with Ansible

- hosts: "{{target}}"
  tasks:
    - name: install prerequisites
      shell: apt-get update -y && apt-get upgrade -y
    - name: install packages
      apt: pkg={{ item }}
      with_items:
        - git
        - npm
    - name: Create profile
      copy: content="export PAPERTRAIL_PORT={{PAPERTRAIL_PORT}}"
            dest=/home/cloudy/.profile

Run the playbook

ansible-playbook git.playbook.yml \
        -e "target=azuredeb" \
        -u ubuntu \
        -i ./hosts.txt -vvvv

Running ansible

Ready to run ✓

Running in Azure

But there's something missing here

Deploying to the cloud

Let's use FlightPlan

plan.target('azure', {
  host: 'cloudy-ga.cloudapp.net',
  username: 'azureuser',
  agent: process.env.SSH_AUTH_SOCK
});
// Local
plan.local(function(local) {
    local.echo('Plan local: push changes');
    local.exec('git push');
});

... And after setup

plan.remote(function(remote) {
    remote.log('Pull');
    remote.with('cd cloudy-ga',function() {
	remote.exec('git pull');
	remote.exec('cd app;npm install .');
    });
    remote.with('cd /home/azureuser/cloudy-ga/app',function() {
	remote.exec('npm start');
    });
});

IaaS providers have free tiers

But it generally is pay-as-you-go

Great if you do small amounts of computation

Browsers communicate using HTTP methods

PUT, GET, POST, DELETE

Ajax, a standard browser-server communication framework

HTTP requests from a standard object.

Asynchronous!

There's freemium PaaS

Heroku, OpenShift and Google App Engine (appspot.com)

OpenShift screen capture

Pool-based evolutionary algorithms: not so canonical any more

Detaching population from operations

Reactive programming.

pool schema

Three good things about pool-based EAs

1. Self-organizing clients

2. Fully asynchronous

3. Persistent population

Island models can be used too

The cloudy server

app.put('/experiment/:expid/one/:chromosome/:fitness/:uuid',
  function(req, res){
    // stuff here
    logger.info("put", { chromosome: req.params.chromosome,
                         fitness: parseInt(req.params.fitness, 10),
                         IP: client_ip,
                         worker_uuid: req.params.uuid } );
    res.send( { length : Object.keys(chromosomes).length });
  });
app.get('/random', function(req, res){
    var keys = Object.keys(chromosomes);
    var one = keys[ Math.floor(keys.length*Math.random())];
    res.send( { 'chromosome': one } );
    logger.info('get');
});
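Stripped of HTTP, the pool behind those two routes is a small data structure. A minimal in-memory sketch, assuming a `Pool` class with `put` and `random` methods (names chosen for this example); the return shapes mirror what the routes above send back.

```javascript
// In-memory chromosome pool. `Pool` is an illustrative name; the real
// server keeps its state in a `chromosomes` object.
class Pool {
  constructor() {
    this.chromosomes = new Map(); // chromosome -> fitness
  }

  // Store (or overwrite) a chromosome; report the pool size, like the PUT route.
  put(chromosome, fitness) {
    this.chromosomes.set(chromosome, fitness);
    return { length: this.chromosomes.size };
  }

  // Return a uniformly random chromosome, like GET /random.
  // An empty pool yields an object with no `chromosome` key.
  random(rng = Math.random) {
    const keys = [...this.chromosomes.keys()];
    if (keys.length === 0) {
      return {};
    }
    return { chromosome: keys[Math.floor(keys.length * rng())] };
  }
}

const pool = new Pool();
pool.put('1100', 2);
pool.put('1110', 3);
console.log(pool.random());
```

Clients only ever see `put` and `random`, so they never need to know about each other; that is what lets them join and leave at any time.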

Check out

✓ Asynchronous

✓ Uses Logger

Changes in the client: draw from pool

rest.get( conf.url + 'random' ).on('complete', function( data ) {
		if ( data.chromosome ) {
		    population.addAsLast( data.chromosome );
		}
});

Put into pool

var this_request = conf.url
    + 'experiment/0/one/' + population.best() + "/"
    + population.fitness(population.best()) + "/"
    + UUID;
rest.put( this_request ).on("complete", function( result, response ) {
    if ( response.statusCode == 410 ) {
        finished = true;
        experiment_id = result.current_id;
    }
});

papertrail log
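The client side of the PUT can be split into two small pure functions: one that builds the URL and one that reacts to the status code. This is a sketch under assumptions: `putUrl` and `handlePutResponse` are hypothetical helpers, and treating 410 as "experiment finished" follows the client snippet above.

```javascript
// Build the PUT path used by the pool server, escaping every segment.
function putUrl(baseUrl, experimentId, chromosome, fitness, uuid) {
  const segments = ['experiment', experimentId, 'one', chromosome, fitness, uuid]
    .map((segment) => encodeURIComponent(String(segment)));
  // Tolerate base URLs with or without a trailing slash.
  return baseUrl.replace(/\/+$/, '') + '/' + segments.join('/');
}

// React to the server's answer: HTTP 410 (Gone) marks the experiment as over
// and carries the id of the next one. Returns a new state object.
function handlePutResponse(statusCode, body, state) {
  if (statusCode === 410) {
    return { ...state, finished: true, experimentId: body.current_id };
  }
  return state;
}

console.log(putUrl('http://localhost:3000/', 0, '1011', 3, 'worker-1'));
console.log(handlePutResponse(410, { current_id: 1 }, { finished: false, experimentId: 0 }));
```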

Logs glue everything together

Go serverless

package main

import (
	"encoding/json"
	"fmt"
	"os"
)

type Individual struct {
	Chromosome string
}

func main() {
	// Read one individual as JSON from standard input
	p := &Individual{Chromosome: ""}
	json.NewDecoder(os.Stdin).Decode(p)
	// Count-ones fitness
	count_ones := 0
	for i := 0; i < len(p.Chromosome); i++ {
		if p.Chromosome[i] == '1' {
			count_ones++
		}
	}
	fmt.Printf("{ eval: %d }", count_ones)
}

Client Log
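For comparison, the same stateless evaluation sketched in JavaScript: JSON in, JSON out, no state kept between calls. `evaluate` is a name chosen for this example, and unlike the Go snippet it quotes the key so the output is valid JSON.

```javascript
// Stateless fitness evaluation: takes a JSON string describing one
// individual, returns a JSON string with its count-ones fitness.
function evaluate(input) {
  const individual = JSON.parse(input);
  // Treat a missing or non-string chromosome as empty.
  const chromosome = typeof individual.Chromosome === 'string'
    ? individual.Chromosome
    : '';
  let ones = 0;
  for (const gene of chromosome) {
    if (gene === '1') {
      ones++;
    }
  }
  return JSON.stringify({ eval: ones });
}

console.log(evaluate('{"Chromosome":"10110"}')); // prints {"eval":3}
```

Since the function keeps nothing between calls, the platform is free to run as many copies as there are pending evaluations.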

Combine with asynchronous queue

KafkEO scheme

Visit us at the poster session!

Vagrant for orchestration

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/xenial64"
  config.vm.provision "shell", inline: <<-SHELL
     apt-get update
     apt-get upgrade -y
  SHELL
  config.vm.provision "main", type: "ansible" do |ansible|
    ansible.extra_vars = { target: "all" }
    ansible.playbook = "playbook.yml"
  end
  # and the rest...
end

All together

✓ Get servers ➡ PaaS, Loggers

✓ Create/provision boxes ➡ Vagrant/Ansible

✓ Deploy/run ➡ FlightPlan (or serverless)

Take this home

  1. Cloud is the new (grid|cluster)
  2. There is (almost) a free lunch
  3. Reactive programming
  4. I ❤ logs

Questions?

Code: git.io/cloudEC

Tweet out (or follow) @jjmerelo

Credits