How to look for updated rows when using AWS Glue?
I'm trying to use Glue for ETL on data I'm moving from RDS to Redshift.
As far as I am aware, Glue job bookmarks only look for new rows using the specified primary key and do not track updated rows.
However, the data I am working with has rows that are updated frequently, so I am looking for a way to pick those up as well. I'm fairly new to PySpark, so if this can be done in PySpark I'd highly appreciate some guidance or a pointer in the right direction. If there's a solution outside of Spark, I'd love to hear it as well.
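To make the goal concrete, here is a minimal sketch of the behaviour I'm after, written in plain Python rather than Glue or PySpark. It assumes the source table has a last-modified timestamp column, which I'm calling `updated_at` here, and a primary key `id`; both names, the helper functions, and the sample rows are hypothetical and only for illustration.

```python
from datetime import datetime


def rows_changed_since(rows, last_run, ts_column="updated_at"):
    """Return rows modified strictly after the previous run (new or updated)."""
    return [row for row in rows if row[ts_column] > last_run]


def upsert(target, changed_rows, key="id"):
    """Merge changed rows into a copy of target, keyed by primary key.

    Existing keys are overwritten (update); unseen keys are added (insert).
    """
    merged = dict(target)
    for row in changed_rows:
        merged[row[key]] = row
    return merged


# Hypothetical source rows as they look in RDS right now.
source_rows = [
    {"id": 1, "value": "a", "updated_at": datetime(2024, 1, 1)},   # unchanged
    {"id": 2, "value": "b2", "updated_at": datetime(2024, 1, 3)},  # updated
    {"id": 3, "value": "c", "updated_at": datetime(2024, 1, 4)},   # new
]

# Hypothetical state of the target (Redshift) after the previous run.
target = {
    1: {"id": 1, "value": "a", "updated_at": datetime(2024, 1, 1)},
    2: {"id": 2, "value": "b", "updated_at": datetime(2024, 1, 1)},
}

last_run = datetime(2024, 1, 2)
changed = rows_changed_since(source_rows, last_run)
result = upsert(target, changed)
```

With this sample data, `changed` should contain rows 2 and 3, and `result` should hold row 2 with its new value alongside rows 1 and 3. What I don't know is how to express the same "changed since last run, then upsert" logic in a Glue job, or whether bookmarks can help with it at all.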