well spotted! in a normal setting, you would typically want to match by ID, like this: "MERGE INTO fruits USING raw_fruits ON fruits.id = raw_fruits.id WHEN MATCHED THEN UPDATE SET * WHEN NOT MATCHED THEN INSERT *".
however, because this is a synthetic dataset, the IDs are sometimes the same for different fruits :D so I had to match on the fruit name to make the merge logic work as intended
python is only used to operationalize it all -- it's easier to work with Glue through awswrangler, but you're right, this is not strictly needed 👍
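
to show why matching on ID goes wrong here, a minimal pure-Python sketch of the upsert logic (this is not the actual pipeline; the `merge` helper and the sample rows are made up for illustration):

```python
def merge(target, source, key):
    """Upsert source rows into target, matching on `key`.

    Mimics MERGE ... WHEN MATCHED THEN UPDATE / WHEN NOT MATCHED THEN INSERT.
    """
    merged = {row[key]: dict(row) for row in target}
    for row in source:
        # Overwrites the existing row if the key matches, inserts otherwise.
        merged[row[key]] = dict(row)
    return list(merged.values())


# Hypothetical rows: id 1 is reused for a different fruit, as in the synthetic data.
fruits = [{"id": 1, "name": "apple", "price": 1.0}]
raw_fruits = [
    {"id": 1, "name": "banana", "price": 0.5},
    {"id": 2, "name": "cherry", "price": 3.0},
]

by_id = merge(fruits, raw_fruits, key="id")
by_name = merge(fruits, raw_fruits, key="name")

print(sorted(r["name"] for r in by_id))    # ['banana', 'cherry']
print(sorted(r["name"] for r in by_name))  # ['apple', 'banana', 'cherry']
```

matching on `id` silently replaces apple with banana, while matching on `name` keeps all three fruits.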