Today I learned something astonishing about cache implementation.
What I thought before: when I have a cache with 20,000 entries and it fills up, I would define two marks; let’s call them “high watermark” and “low watermark”. When the number of entries reaches the high watermark, I start deleting the least recently used entries until the count drops to the low watermark. Furthermore, let’s call this process “garbage collection”.
The values for the high and low watermark should be well chosen. So maybe I will choose a value of 19,500 entries for the high watermark; that should leave enough time to finish garbage collection before the cache fills up completely. And I will definitely choose a low watermark other than 0, because emptying the whole cache throws away regularly used entries as well, and everything gets slow.
So I would choose a value that is neither too low nor too high for the low watermark. Not too low, because I would like a big part of the cache to remain usable after garbage collection. Not too high, because I do not want to run garbage collection every 15 seconds, and I also do not want to check every 0.001 seconds whether garbage collection should run. I would name the process of finding good values for the high and low watermark “tuning”.
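The scheme above can be sketched in a few lines of Python. This is my own illustration of the idea, not anyone's actual implementation; the class name, the watermark values, and the choice of a least-recently-used policy are all assumptions for the sake of the example:

```python
from collections import OrderedDict

class WatermarkCache:
    """Cache that evicts least recently used entries down to a low
    watermark once the high watermark is reached."""

    def __init__(self, high=19_500, low=15_000):
        assert 0 < low < high
        self.high = high              # start garbage collection at this size
        self.low = low                # stop evicting at this size
        self.entries = OrderedDict()  # least recently used entries come first

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as recently used
            return self.entries[key]
        return None

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) >= self.high:
            self._garbage_collect()

    def _garbage_collect(self):
        # Delete least recently used entries until we are back at the
        # low watermark -- the whole cache is never emptied completely.
        while len(self.entries) > self.low:
            self.entries.popitem(last=False)
```

Because the low watermark is well above zero, frequently used entries survive each collection and keep serving hits afterwards.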
This is how I would build a cache. Maybe I’m wrong. That is possible. And that seems to be what Check Point is thinking.
Today I stumbled over the cache table for Check Point’s URL Filtering (urlf_cache_tbl). This table seems to be quite funny.
First, you cannot clear this table by issuing
fw tab -t urlf_cache_tbl -x
as you would with other kernel tables. There is a dedicated article (sk64280: How to clear URL Filtering kernel cache?) in the SecureKnowledge database for that.
Second, you will not see any peak value for this table when you issue
fw tab -t urlf_cache_tbl -s
as you do for other kernel tables.
So, how would you recognise the table reaching its default limit of 20,000 entries? That leads us to:
Third, you will spot a cache table overrun when you see the table reach nearly 20,000 entries and, some moments later, drop to roughly 0 entries. That is what Check Point states in sk90422: How to modify URL Filtering cache size?.
- High watermark is 20,000.
- Low watermark is 0.
- No peak value for table usage will be recorded.
- Garbage collection means emptying the complete table.
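Given those four points, the only way to spot an overrun from the outside is to poll the table size regularly and watch for a sudden drop from near the limit to almost nothing. A rough sketch of that check follows; the sampling itself (e.g. parsing repeated `fw tab -t urlf_cache_tbl -s` runs) is left out, and the 95%/5% thresholds are arbitrary assumptions of mine:

```python
def detect_overruns(samples, limit=20_000):
    """Return the indices in `samples` (periodic polls of the table's
    entry count) where the count fell from near the table limit to
    almost nothing, i.e. the table was emptied completely."""
    overruns = []
    for i in range(1, len(samples)):
        was_nearly_full = samples[i - 1] >= 0.95 * limit
        is_nearly_empty = samples[i] <= 0.05 * limit
        if was_nearly_full and is_nearly_empty:
            overruns.append(i)
    return overruns
```

If the table recorded a peak value like other kernel tables do, a single look after the fact would answer the same question without any polling.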
Check Point admits that this behaviour may lead to high load on specific processes and even to timeouts and failures. They then advise increasing the cache table size. But because no peak value is recorded for this table, they make detecting cache table overruns more difficult than necessary.
That does not seem like a good implementation to me. Or am I missing something important?