Using snow (and snowfall) with AWS for parallel processing in R

Following up on my earlier, similar SO question, I tried using snow/snowfall on AWS for parallel computing.

What I did was:

  • In the sfInit() call, I passed the public DNS name to the socketHosts parameter, like so: sfInit(parallel=TRUE, socketHosts=list(""))
  • The error returned was Permission denied (publickey)
  • I then followed the instructions (I presume correctly!) in the 'Passwordless Secure Shell (SSH) login' section
  • I simply appended (with cat) the contents of the .pem file that I created on AWS to ~/.ssh/authorized_keys on the AWS instance I want to connect to from my master AWS instance, and did the same on the master AWS instance
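
One thing worth checking in the steps above: if the .pem file is the private key of an EC2 key pair, it is not what ~/.ssh/authorized_keys expects, since that file holds public key lines. Below is a minimal sketch in R of deriving the public line and appending it. The helper names public_key_from_pem and append_authorized_key are made up for illustration, and the sketch assumes the ssh-keygen binary is on the PATH and that the key has no passphrase. This may not be the only cause of the Permission denied (publickey) error.

```r
# Hypothetical helper: derive the OpenSSH public key line from a private key.
# Assumes `ssh-keygen` is on the PATH and the key is not passphrase-protected.
public_key_from_pem <- function(pem_path) {
  stopifnot(file.exists(pem_path))
  out <- system2("ssh-keygen", c("-y", "-f", shQuote(pem_path)), stdout = TRUE)
  out[1]
}

# Hypothetical helper: append a public key line to an authorized_keys file,
# skipping it if the exact line is already present.
append_authorized_key <- function(pub_line, auth_file) {
  existing <- if (file.exists(auth_file)) {
    readLines(auth_file, warn = FALSE)
  } else {
    character(0)
  }
  if (!(pub_line %in% existing)) {
    cat(pub_line, "\n", sep = "", file = auth_file, append = TRUE)
  }
  # Keep the file private to its owner (read/write for the owner only).
  Sys.chmod(auth_file, mode = "0600", use_umask = FALSE)
  invisible(auth_file)
}

# Example usage (paths are placeholders), run on the node being logged in to:
# pub <- public_key_from_pem("~/mykey.pem")
# append_authorized_key(pub, "~/.ssh/authorized_keys")
```

The public line has to end up in authorized_keys on each worker node, while the matching private key stays on the master, where ssh can find it.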

Is there anything I am missing? I would be very grateful if users could share their experiences of using snow on AWS.

Thank you very much for your suggestions.

UPDATE: Here is the solution I found to my specific problem:

  • I used StarCluster to set up my AWS cluster
  • Installed the snowfall package on all the nodes of the cluster
  • From the master node, issued the following commands:
  • hostslist <- list("","")
  • sfInit(parallel=TRUE, cpus=2, type="SOCK", socketHosts=hostslist)
  • l <- sfLapply(1:2, function(x) system("ifconfig", intern=TRUE))
  • lapply(l, function(x) x[2])
  • sfStop()
  • The IP information in the output confirmed that the AWS nodes were being utilized
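
The same check can be sketched without parsing ifconfig output, by asking each worker for its host name. This is only an illustration: worker_hostnames is a made-up helper, and the "localhost" entries in the usage line are placeholders so the sketch can be tried on a single machine with snowfall installed. On a real cluster they would be replaced by the node names.

```r
library(snowfall)

# Hypothetical helper: start a socket cluster on the given hosts, ask every
# worker for its host name, and shut the cluster down again.
worker_hostnames <- function(hosts) {
  sfInit(parallel = TRUE, cpus = length(hosts), type = "SOCK",
         socketHosts = hosts)
  on.exit(sfStop(), add = TRUE)
  # sfClusterCall runs the function once on each worker.
  unlist(sfClusterCall(function() Sys.info()[["nodename"]]))
}

# Placeholder hosts; substitute the cluster's own node names.
nodes <- worker_hostnames(list("localhost", "localhost"))
print(nodes)
```

If the printed names differ from the master's, the work really is being sent to the other nodes.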

Answers 1

  • I believe @Anatoliy is correct: you're using an X.509 certificate. For the precise steps for adding SSH keys, see the "Types of credentials" section of the EC2 Starters Guide.

    To upload your own SSH keys, take a look at this page from Alestic.

    It is a little confusing at first, but you'll want to keep clear which are your access keys, which are your certificates, and which are your key pairs; the key pairs may appear in text files marked DSA or RSA.
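
    As a rough aid for telling these files apart, here is a small R sketch that looks at the first non-empty line of a file. classify_credential is a made-up name, and the check only recognises the common PEM headers and OpenSSH public key prefixes, so treat an "unknown" result as "look more closely" rather than as a verdict.

```r
# Hypothetical helper: guess what kind of credential a text file holds,
# based only on its first non-empty line.
classify_credential <- function(lines) {
  trimmed <- trimws(lines)
  first <- trimmed[nzchar(trimmed)][1]
  if (is.na(first)) return("empty")
  if (grepl("^-----BEGIN CERTIFICATE-----", first)) {
    return("X.509 certificate")
  }
  if (grepl("^-----BEGIN (RSA |DSA |EC |OPENSSH |ENCRYPTED )?PRIVATE KEY-----",
            first)) {
    return("private key")
  }
  if (grepl("^(ssh-rsa|ssh-dss|ssh-ed25519|ecdsa-sha2-[^ ]+) ", first)) {
    return("OpenSSH public key")
  }
  "unknown"
}

# Example usage (the path is a placeholder):
# classify_credential(readLines("~/mykey.pem", warn = FALSE))
```

    Only a line of the "OpenSSH public key" kind belongs in ~/.ssh/authorized_keys.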
