forked from KvasirSecurity/Kvasir
De-duping Tables
grutz@jingojango.net edited this page Oct 10, 2013 · 1 revision
Occasionally an import misbehaves, usually because of a bug or a quirk in the web2py scheduler, and leaves duplicate records in the database. Here's how to clean them out.
NOTE: This has only been validated on PostgreSQL. Other databases may or may not work.
What constitutes a duplicate record? Let's take the t_host_os_refs table as an example. This table includes the following fields:
id
f_certainty
f_class
f_family
f_hosts_id
f_os_id
In rare cases, multiple records may appear in the database where f_hosts_id, f_os_id and f_certainty are all the same. In these cases we want to keep only the record with the lowest id and purge the rest.
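To do this, run dedupe.py through web2py, replacing appname with the name of your application. Everything after -A is passed to the script: each -f names a field that must match for records to count as duplicates, and -d names the table.
./web2py.py -S appname -M -R applications/appname/private/dedupe.py -A -f f_hosts_id -f f_os_id -f f_certainty -d t_host_os_refs
The same approach works for other tables, for example t_service_vulns:
./web2py.py -S appname -M -R applications/appname/private/dedupe.py -A -f f_services_id -f f_vulndata_id -f f_status -f f_exploited -d t_service_vulns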
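The rule described above (group records by the chosen fields, keep the lowest id in each group, purge the rest) can be sketched in plain Python. This is only an illustration of the rule, not the code in dedupe.py. The function name find_duplicate_ids and the sample rows are hypothetical, and the sketch works on in-memory dictionaries rather than a real database.

```python
from collections import defaultdict


def find_duplicate_ids(rows, key_fields):
    """Return the ids to purge: every row except the lowest id in each key group.

    Hypothetical helper for illustration; it does not touch a database.
    """
    groups = defaultdict(list)
    for row in rows:
        # Rows count as duplicates when all key fields have equal values.
        key = tuple(row[field] for field in key_fields)
        groups[key].append(row["id"])

    to_purge = []
    for ids in groups.values():
        ids.sort()
        to_purge.extend(ids[1:])  # ids[0] is the lowest id and is kept
    return sorted(to_purge)


# Made-up t_host_os_refs rows; only the fields used for matching are shown.
rows = [
    {"id": 1, "f_hosts_id": 10, "f_os_id": 3, "f_certainty": 0.9},
    {"id": 2, "f_hosts_id": 10, "f_os_id": 3, "f_certainty": 0.9},  # duplicate of id 1
    {"id": 3, "f_hosts_id": 10, "f_os_id": 4, "f_certainty": 0.9},
    {"id": 4, "f_hosts_id": 10, "f_os_id": 3, "f_certainty": 0.9},  # duplicate of id 1
    {"id": 5, "f_hosts_id": 11, "f_os_id": 3, "f_certainty": 0.5},
]

purge = find_duplicate_ids(rows, ["f_hosts_id", "f_os_id", "f_certainty"])
print(purge)  # [2, 4]
```

Note that the choice of fields matters: matching on fewer fields treats more records as duplicates, so it is worth checking the list of ids before purging anything.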