Why the filesystem scanner may perform very slow when scanning folder with long names?

Problem description:

When an object is scanned, the filesystem scanner checks in the MC database if the object was already scanned by a previous scan. This check is done based on the file path (id_on_source_column). This column is indexed so checking the file existence shouldn't take more than a couple of milliseconds. But this is not always simple with Oracle optimizer. If you have for example 100.000 files located under a root path whose length is greater than 32 characters (Ex: D:\migration-center\path longer than 32 characters...) Oracle optimizer may decide NOT to use the index but to use full table scan. This is mainly because the selectivity for a query is calculated based on the index histogram, but in the histogram Oracle keeps only the first 32 characters of each value. In this case the optimizer thinks there are more than 100.000 rows in the table that meet the path criteria and decide to perform a full table scan. More tehnical details can be found in this nice article: http://hemantoracledba.blogspot.ro/2009/08/histograms-on-larger-columns.html

Solution:

Run the following commands in SQLPLUS connected as an user that have execute privileges on "SYS.dbms_stats" package. User SYSTEM or SYS should work but FMEMC may not always work. This will force Oracle to always use the index when checking if an object was already scanned.

BEGIN
dbms_stats.gather_table_stats( 'FMEMC', 'SOURCE_OBJECTS', method_opt=>'for columns size 1 ID_IN_SOURCE_SYSTEM' );
END;
/

 

In order to make this fix permanent, the following commands need to be executed:

BEGIN
dbms_stats.set_table_prefs('FMEMC', 'SOURCE_OBJECTS','METHOD_OPT', 'FOR ALL COLUMNS SIZE AUTO, FOR COLUMNS SIZE 1 ID_IN_SOURCE_SYSTEM');
END;
/
 

Note: The solution works only with Oracle 11g. 

 

Have more questions? Submit a request

0 Comments

Please sign in to leave a comment.
Powered by Zendesk