Problem: Do you know what your SharePoint Search Crawler is doing?
Solution: If you want to monitor what the SharePoint crawler is doing, you can watch the Threads Accessing Network, Filtering Threads and Idle Threads counters in the OSS Search Gatherer category in Performance Monitor during a content crawl.
In my development VM, the number of filtering threads stays at around 20 and the number of idle threads at around 12 for the duration of the crawl, which means that an average of 8 threads (20 minus 12) are being used by the gatherer process.
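If you log the counters to a CSV file instead of watching them live, the subtraction above can be scripted. The following is a minimal Python sketch, assuming a PDH-style CSV log (the layout a tool such as typeperf writes: a timestamp column followed by one column per counter). The function name busy_threads, the machine name DEVVM and the timestamps are made up for illustration; the two sample rows simply repeat the approximate values quoted above and are not a real capture.

```python
import csv
import io
from statistics import mean


def busy_threads(pdh_csv_text):
    """Return the average (filtering - idle) thread count across all samples."""
    reader = csv.reader(io.StringIO(pdh_csv_text))
    header = next(reader)
    # Find each counter column by the end of its path; column 0 is the timestamp.
    filtering_col = next(
        i for i, name in enumerate(header) if name.endswith("Filtering Threads")
    )
    idle_col = next(
        i for i, name in enumerate(header) if name.endswith("Idle Threads")
    )
    # One difference per sample row; blank rows are skipped.
    diffs = [
        float(row[filtering_col]) - float(row[idle_col]) for row in reader if row
    ]
    return mean(diffs)


# Illustrative rows only (hypothetical machine name and timestamps).
SAMPLE = r'''"(PDH-CSV 4.0)","\\DEVVM\OSS Search Gatherer\Filtering Threads","\\DEVVM\OSS Search Gatherer\Idle Threads"
"01/01/2011 10:00:00.000","20.000000","12.000000"
"01/01/2011 10:00:01.000","20.000000","12.000000"
'''

if __name__ == "__main__":
    print(busy_threads(SAMPLE))
```

Running the same function on a log captured after adding a crawler impact rule gives a quick before-and-after comparison without reading the numbers off a chart.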
In SharePoint 2010, you can control the content crawl rate at the search service application level by using Crawler Impact Rules. By default, the number of simultaneous requests changes dynamically based on the server hardware and utilization. Crawler impact rules are often used to throttle the request rate for external websites. You can manage crawler impact rules in Central Administration > Search Service Application > Crawler Impact Rules.
Next, let’s create a new crawler impact rule to limit the number of simultaneous requests to 2 using the * site name wildcard to apply the rule to all sites.
If we keep an eye on the performance counters this time, we can see that the difference between the number of filtering and idle threads during a crawl now equals 2 (11 filtering threads and 9 idle threads in my test).
The performance counters confirm that our new crawler impact rule is working. Be careful when you delete a crawler impact rule though! I’ve seen a number of times, in different SharePoint farms, that SharePoint remembers and keeps using the last deleted crawler impact rule. I verified this by monitoring the performance counters: the deleted crawler impact rule remains in effect until the SharePoint Server Search 14 service is restarted (or a new rule is added that overrides the deleted rule's settings). So remember to restart the SharePoint Server Search 14 Windows service after deleting a crawler impact rule!
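If you want to script that restart, here is a minimal Python sketch that wraps the standard Windows net stop and net start commands. It assumes that the short service name behind the "SharePoint Server Search 14" display name is OSearch14; please confirm that in the Services console on your own server before relying on it. The helper names restart_commands and restart_search_service are hypothetical.

```python
import subprocess

# Assumption: "OSearch14" is the short name of the service displayed as
# "SharePoint Server Search 14". Verify it on your server first.
SEARCH_SERVICE = "OSearch14"


def restart_commands(service=SEARCH_SERVICE):
    """Build the stop/start command pair for a Windows service."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_search_service(service=SEARCH_SERVICE, run=subprocess.run):
    """Stop and then start the service; raises if either command fails."""
    for command in restart_commands(service):
        # check=True raises CalledProcessError on a non-zero exit code.
        run(command, check=True)


if __name__ == "__main__":
    # Needs an elevated prompt on the server that hosts the search service.
    restart_search_service()
```

The run parameter is only there so the function can be exercised without touching a real service. After the restart, watch the performance counters during the next crawl to check that the deleted rule is no longer throttling requests.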