Deterministically creating and tagging EC2 instances
I am creating 3 EC2 instances, and subsequently iterating and tagging each of them. Sometimes the tag request fails, although the instance later appears to be running.
Could this be a timing issue? Should I wait a few seconds after creating the instance before tagging it? Is there a deterministic way to wait for it to start?
Answers 1
Update 20140512
AWS has since added more detailed documentation on Troubleshooting API Request Errors, including a section addressing Eventual Consistency, which basically confirms the analysis in my initial answer below:
Please note: Most AWS SDKs now apply these suggestions automatically and offer options to adjust the default retry policy or even plug in a custom implementation. See Error Retries and Exponential Backoff in AWS for guidance on how to implement it yourself, if need be.
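To illustrate the retry idea for the tagging case from the question, here is a minimal sketch in plain Python of retrying with exponential backoff and jitter. It is not the code any AWS SDK uses internally. The names `retry_with_backoff` and `NotFoundYet` are hypothetical, and `operation` is assumed to be a callable that wraps whatever tagging call your SDK provides and raises a "not found" style error while the new instance is not yet visible.

```python
import random
import time


class NotFoundYet(Exception):
    """Hypothetical stand-in for an SDK's 'instance not found' error."""


def retry_with_backoff(operation, retryable=(NotFoundYet,), max_attempts=5,
                       base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts calls have failed."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # The upper bound doubles with every failed attempt, capped at
            # max_delay; the random draw ("full jitter") spreads out retries.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
```

In the scenario from the question, `operation` would be something like `lambda: tag_instance(instance_id)`, with `tag_instance` being your own (here hypothetical) wrapper around the tag request. Only errors listed in `retryable` are retried, so genuine failures such as bad credentials still fail immediately.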
Update 20130719
The eventually consistent design of the AWS API is increasingly encountered by various large-scale AWS users, who naturally need to look deeper and work around it accordingly; see for example the following articles:
Initial Answer
As already commented by @datasage, the AWS APIs apparently need to be treated as eventually consistent in general. This is certainly unexpected when first encountered, but in hindsight not too surprising for a large-scale service: it is an engineering and operational tradeoff in light of the CAP theorem.
See also my comment on Alex Ciminian's question Implementing idempotency for AWS Spot Instance Requests, where he discusses his test results regarding similar consistency issues:
For details on the mentioned cases you might want to look into Frequent polling of AWS API causes throttle limit, where I summarize our analysis and our approach to improving the handling via the available but limited retry/backoff functionality within the AWS SDK for Java. The solution is far from ideal, but it seems to considerably improve things for the time being.
On a similar note, the redesigned AWS SDK for PHP 2 addresses the problem with dedicated “Waiter” objects, which poll a resource until it reaches a desired state; see the Waiters section of the Quick Start for details.
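The waiter idea itself is independent of any particular SDK. The following is a rough sketch in plain Python, not the PHP SDK's implementation: `wait_until` is a hypothetical helper, and `get_state` is assumed to be a callable you supply that asks the API for the instance state and returns `None` while the instance is not yet visible.

```python
import time


def wait_until(get_state, desired, delay=5.0, max_attempts=40,
               sleep=time.sleep):
    """Poll get_state() until it returns `desired`.

    Returns the number of polls that were needed and raises TimeoutError
    if the state is still not reached after max_attempts polls.
    """
    for attempt in range(max_attempts):
        if get_state() == desired:
            return attempt + 1
        if attempt < max_attempts - 1:
            sleep(delay)  # fixed pause between polls; no sleep after the last
    raise TimeoutError(f"resource did not reach state {desired!r}")
```

For the question above, this would mean calling something like `wait_until(lambda: current_state(instance_id), "running")` before tagging, where `current_state` is your own (hypothetical) wrapper around the describe call. A fixed delay keeps the sketch short; combining it with a growing delay reduces the risk of running into API throttling.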