Age | Commit message (Collapse) | Author |
|
There were a number of places in the res_pjsip stack that were getting
all endpoints or all aors, and then filtering them locally.
A good example is pjsip_options which, on startup, retrieves all
endpoints, then the aors for those endpoints, then tests the aors to see
if the qualify_frequency is > 0. One issue was that it never did
anything with the endpoints other than retrieve the aors so we probably
could have skipped a step and just retrieved all aors. But nevermind.
This worked reasonably well with local config files but with a realtime
backend and thousands of objects, this was a nightmare. The issue
really boiled down to the fact that while realtime supports predicates
that are passed to the database engine, the non-realtime sorcery
backends didn't.
They do now.
The realtime engines have a scheme for doing simple comparisons. They
take in an ast_variable (or list) for matching, and the name of each
variable can contain an operator. For instance, a name of
"qualify_frequency >" and a value of "0" would create a SQL predicate
that looks like "where qualify_frequency > '0'". If there's no operator
after the name, the engines add an '=' so a simple name of
"qualify_frequency" and a value of "10" would return exact matches.
The non-realtime backends decide whether to include an object in a
result set by calling ast_sorcery_changeset_create on every object in
the internal container. However, ast_sorcery_changeset_create only does
exact string matches though so a name of "qualify_frequency >" and a
value of "0" returns nothing because the literal "qualify_frequency >"
doesn't match any name in the objset set.
So, the real task was to create a generic string matcher that can take a
left value, operator and a right value and perform the match. To that
end, strings.c has a new ast_strings_match(left, operator, right)
function. Left and right are the strings to operate on and the operator
can be a string containing any of the following: = (or NULL or ""), !=,
>, >=, <, <=, like or regex. If the operator is like or regex, the
right string should be a %-pattern or a regex expression. If both left
and right can be converted to float, then a numeric comparison is
performed, otherwise a string comparison is performed.
To use this new function on ast_variables, 2 new functions were added to
config.c. One that compares 2 ast_variables, and one that compares 2
ast_variable lists. The former is useful when you want to compare 2
ast_variables that happen to be in a list but don't want to traverse the
list. The latter will traverse the right list and return true if all
the variables in it match the left list.
Now, the backends' fields_cmp functions call ast_variable_lists_match
instead of ast_sorcery_changeset_create and they can now process the
same syntax as the realtime engines. The realtime backend just passes
the variable list unaltered to the engine. The only gotcha is that
there's no common realtime engine support for regex so that's been noted
in the api docs for ast_sorcery_retrieve_by_fields.
Only one more change to sorcery was done... A new config flag
"allow_unqualified_fetch" was added to reg_sorcery_realtime.
"no": ignore fetches if no predicate fields were supplied.
"error": same as no but emit an error. (good for testing)
"yes": allow (the default);
"warn": allow but emit a warning. (good for testing)
Now on to res_pjsip...
pjsip_options was modified to retrieve aors with qualify_frequency > 0
rather than all endpoints then all aors. Not only was this a big
improvement in realtime retrieval but even for config files there's an
improvement because we're not going through endpoints anymore.
res_pjsip_mwi was modified to retieve only endpoints with something in
the mailboxes field instead of all endpoints then testing mailboxes.
res_pjsip_registrar_expire was completely refactored. It was retrieving
all contacts then setting up scheduler entries to check for expiration.
Now, it's a single thread (like keepalive) that periodically retrieves
only contacts whose expiration time is < now and deletes them. A new
contact_expiration_check_interval was added to global with a default of
30 seconds.
Ross Beer reports that with this patch, his Asterisk startup time dropped
from around an hour to under 30 seconds.
There are still objects that can't be filtered at the database like
identifies, transports, and registrations. These are not going to be
anywhere near as numerous as endpoints, aors, auths, contacts however.
Back to allow_unqualified_fetch. If this is set to yes and you have a
very large number of objects in the database, the pjsip CLI commands
will attempt to retrive ALL of them if not qualified with a LIKE.
Worse, if you type "pjsip show endpoint <tab>" guess what's going to
happen? :) Having a cache helps but all the objects will have to be
retrieved at least once to fill the cache. Setting
allow_unqualified_fetch=no prevents the mass retrieve and should be used
on endpoints, auths, aors, and contacts. It should NOT be used for
identifies, registrations and transports since these MUST be
retrieved in bulk.
Example sorcery.conf:
[res_pjsip]
endpoint=config,pjsip.conf,criteria=type=endpoint
endpoint=realtime,ps_endpoints,allow_unqualified_fetch=error
ASTERISK-25826 #close
Reported-by: Ross Beer
Tested-by: Ross Beer
Change-Id: Id2691e447db90892890036e663aaf907b2dc1c67
|
|
A few of the CLI commands weren't checking for enough arguments
and were SEGVing.
Change-Id: Ie6494132ad2fe54b4f014bcdc112a37c36a9b413
|
|
This change introduces the configuration option 'full_backend_cache'
which changes the cache to be a full mirror of the backend instead
of a per-object cache. This allows all sorcery retrieval operations
to be carried out against it and is useful for object types which
are used in a "retrieve all" or "retrieve some" pattern.
ASTERISK-25625 #close
Change-Id: Ie2993487e9c19de563413ad5561c7403b48caab5
|
|
Change-Id: If83d63cf11cbc6df9b15251848b01feb570ade49
|
|
A deadlock can happen when a sorcery object is being expired from the
memory cache when at the same time another object is being placed into the
memory cache. There are a couple other variations on this theme that
could cause the deadlock. Basically if an object is being expired from
the sorcery memory cache at the same time as another thread tries to
update the next object expiration timer the deadlock can happen.
* Add a deadlock avoidance loop in expire_objects_from_cache() to check if
someone is trying to remove the scheduler callback from the scheduler.
ASTERISK-25441 #close
Change-Id: Iec7b0bdb81a72b39477727b1535b2539ad0cf4dc
|
|
Make sorcery_memory_cache_close() call remove_all_from_cache() instead of
partially inlining it.
ASTERISK-25441
Change-Id: I1aa6cb425b1a4307096f3f914d17af8ec179a74c
|
|
Basically you should shutdown in the opposite order of how you setup since
later setup pieces likely depend on earlier setup pieces. e.g.,
Registering your external API with the rest of the system should be the
last thing setup and the first thing unregistered during shutdown.
Change-Id: I5715765b723100c8d3c2642e9e72cc7ad5ad115e
|
|
Change-Id: I8cd32dffbb4f33bb0c39518d6e4c991e73573160
|
|
Change-Id: Ibca6574dc3c213b29cc93486e01ccd51f5caa46c
|
|
In Jenkins there is currently a sporadic test failure of a
variable number of sorcery memory cache unit tests. I have not
been able to reproduce this on the build agents themselves or
on my development machine.
My working theory is that the stale unit test is causing a
sorcery instance to persist longer than expected, causing subsequent
tests to fail when setting up and initializing the next
sorcery instance.
To see if this is the case this change moves the stale unit test
to execute last so no subsequent unit tests can have issues
initializing their sorcery instance.
Change-Id: Ifd6550a949613be774b75fa5db12c02110f82c4a
|
|
Analyzing the code shows that the unit test summary and description
strings should not end with a new-line character. Where these strings are
used in the code a new-line is provided for output.
Change-Id: I2f4f37988ec363c8d1c5077a2fc8ca841c5cd30c
|
|
To prevent confusion I am removing the prefetch option until such
time as it is implemented. All other functionality, however, has
been implemented.
ASTERISK-25067
Change-Id: I9ce6aa3e5c6c5bc3c5baa8ff90fa036d73939895
|
|
memory cache."
|
|
This change implements the expire_on_reload option for memory caches.
If enabled and a reload is performed all objects within the cache
will be expired and the cache emptied.
ASTERISK-25067
Reported by: Matt Jordan
Change-Id: Id46aa1957d660556700e689e195eed57c989b85e
|
|
This change adds a CLI command which can perform memory cache thrashing as well
as unit tests which perform thrashing under the following configurations:
1. Low number of unique objects that go stale after 1 second
2. Low number of unique objects that expire after 1 second
3. Low number of unique objects which are constantly updated
4. Large number of unique objects which exceed a defined cache size
5. Large number of unique objects which exceed a defined cache size
that also expire and go stale rapidly
6. Large number of unique objects which expire and go stale rapidly
7. Large number of unique objects
For all of the above there are a large number of threads constantly
attempting to retrieve random objects and each test runs for a few
seconds.
ASTERISK-25067
Reported by: Matt Jordan
Change-Id: I8c8ceff977332c80ed4a31f10d694d48552b2f78
|
|
This change adds a testsuite event for when a refresh occurs.
This is useful as it provides a guaranteed mechanism of knowing when
it has occurred instead of waiting an arbitrary amount of time.
ASTERISK-25067
Reported by: Matt Jordan
Change-Id: Iaa6b8d2d6bab7f99ee08e1c8908b8272a8987e65
|
|
This change adds the following CLI commands and AMI actions:
sorcery memory cache show
sorcery memory cache dump
sorcery memory cache expire
sorcery memory cache stale
SorceryMemoryCacheExpire
SorceryMemoryCacheExpireObject
SorceryMemoryCacheStale
SorceryMemoryCacheStaleObject
These allow both examination and manipulation of sorcery memory
caches from external sources.
Cached objects can be explicitly expired from a cache or marked
as stale. If expired they are immediately removed. If marked as
stale they will be background refreshed when next retrieved.
ASTERISK-25067
Reported by Matt Jordan
Change-Id: I68e03cfd8c34b5e07f4b6ee4fd93a3f4a00a3d9e
|
|
This change introduces a check of object_lifetime_stale when retrieving
cached objects. If the amount of time the object has been in the cache
exceeds the lifetime, then a task is scheduled to update the cached
object based on an object retrieved from other sorcery wizards instead.
To prevent the cached object from being retrieved during a refresh,
thread-local storage is used to mark the thread as being a stale object
update. This results in the cache returning no object, leading to
sorcery querying other wizards for the object instead.
A test has been added for stale objects as well. This test ensures that
stale objects are retrieved the same as freshly-cached objects. The test
also ensures that after an object is stale, changes in the backend are
reflected in the cache, to include if the object has been deleted from
the backend.
ASTERISK-25067
Reported by Matt Jordan
Change-Id: I9bd7c049adf6939bfe2899f393c2bfbbf412d217
|
|
This makes the "object_lifetime_maximum" option operational.
On the addition of an object to an empty memory cache a scheduled
task is created which, when invoked, expires objects from the cache
which have exceeded their lifetime. If more objects have been added
the remaining life of the oldest object is used to schedule the
next invocation of the scheduled task.
If the oldest object is removed from the cache before it can be
expired automatically the scheduled task is cancelled, if possible,
and the lifetime of the next oldest is used to schedule the task.
If during these two operations no additional objects exist in the
cache then no task is scheduled.
An additional unit test has been added which verifies this
functionality.
ASTERISK-25067
Reported by: Matt Jordan
Change-Id: I87409674674a508e7717ee20739ca15cec6ba7b6
|
|
This makes the "maximum_objects" option operational.
A heap has been added alongside the hash table in the cache. When
objects are added to the cache, they are also added to the heap.
Similarly, when objects are removed from the cache, they are removed
from the heap.
The heap's use comes into play when an item is to be added to a "full"
cache. When the cache is full, the oldest item is removed from the
cache, using the heap to determine the oldest item.
A unit test has been added that verifies that the maximum_objects option
works as expected and that the oldest object is removed from the cache
when an object beyond the maximum is added.
ASTERISK-25067 #close
Reported by Matt Jordan
Change-Id: I490658830e9c4cbf0b3051e4cdc4913cf9f1b73a
|
|
This change adds a basic res_sorcery_memory_cache module which implements
configuration option parsing, configuration file parsing for threading,
sorcery interface implementation, and unit tests.
Objects can be added, updated, deleted, and retrieved from the memory
cache. Automatic expiration and stale handling will be added in the
future.
Note that unit tests exist within the module itself in case the
threading done as a result of expiration results in asynchronous
actions (which it likely will). Providing access and a notification
mechanism for an external test module would be complicated and
not worth it.
ASTERISK-25067 #close
Reported by: Matt Jordan
Change-Id: Id8a6a357ef5a83d466f81eee56a67d13eeb118b9
|