Manual Database Update

Updating production database directly must be the last resort. Using support API, bulk uploads are better option. But if you are unlucky enough that you have to do database update directly then...

  • For everyone's sanity the SQL should be present in https://github.com/avniproject/data-fixes repository.

  • Ensure that the database updates SQL is checked in another environment and end to end testing has been done after the update.

  • Ensure that code review has been done for the SQLs.

  • For transaction data last modified date time should be modified. Increment like the following (any data reported by database like number of rows updated can be put into comment when the queries are run).

  • update individual set 
    	last_modified_date_time = current_timestamp + (random() * 5000 * (interval '1 millisecond'))
    where foo = bar;
    
  • If the number of rows updated exceed 10000 records then we should ideally go for downtime.

  • In transaction tables a manual update history is maintained for troubleshooting. This column should not be updated directly rather use the
    append_manual_update_history
    function to update. This function maintain the previous history records and appends the new value, maintaining date of update.

  • update program_enrolment 
    	set manual_update_history = append_manual_update_history(program_enrolment.manual_update_history, 'avni-server#676')
     where foo = bar;
    
  • Use data-fixes Github project for SQL scripts.

  • If ETL is enabled for the org for which the updates are being run, disable it before executing the updates and re-enable it after the updates are committed to avoid issues with the ETL tables not being updated.

  • In the SQL script follow the following pattern

  • set role org;
    
    begin transaction; -- so that you can rollback if something is goes wrong
    
    -- Before queries
    select count(*) from individual where observations->>'dfds' > 2; -- example. enter the data that you see here in test environment in comment
    
    -- DML statements
    -- insert/update etc
    
    -- After queries
    select count(*) from individual where observations->>'dfds' > 2; -- example. enter the data that you see here in test environment in comment
    
    
    -- rollback;
    commit;
    
  • If your SQL script is likely to update a lot of rows in the database and the users may also be updating the data at the same time then the data update should be done in production by bringing down the server first. Note that this approach is not valid for API based update. How to handle the concurrent update, when updating via API - has to be figured out.

    • For smaller number of rows but with concurrency Select for update or higher transaction isolation level can be used.